With multi- and many-core-based systems, performance increases on the microprocessor side will continue according to Moore's Law, at least in the near future. However, as the number of cores and the complexity of on-chip memory hierarchies increase, performance limitations due to slow memory access are expected to worsen, making it hard for users to fully exploit the theoretically available performance. In addition, the increasingly sophisticated design of compute clusters, based on the use of accelerator components (GPGPUs by AMD and NVIDIA, Intel Xeon Phi, integrated GPUs, etc.), adds further challenges to the efficient programming of many-core-based HPC and high-end embedded systems.
Therefore, compute- and data-intensive tasks can only benefit from the hardware's full potential if both processor and architecture features are taken into account at all stages, from the early algorithmic design, via appropriate programming models, to the final implementation.
The APPMM Workshop topics of interest include (but are not limited to) the following:
Novel programming models and associated frameworks, or extensions of existing programming models, to ease offloading and parallelization of computation to multi- and many-core processors.
Compiler, runtime, and parallelization approaches to optimally exploit specific features of heterogeneous hardware (e.g., hierarchical communication layout, NUMA, scratchpad memory, accelerators) and to optimize performance, energy consumption, and other relevant metrics in many-core operation.
Architecture-assisted software design. Novel architectural concepts to boost software execution and to overcome scalability issues of current multi- and many-core systems. Hardware-assisted runtime environment services (e.g., synchronization, custom memory hierarchies).
Concepts for exploiting emerging vector extensions of instruction sets.
Software engineering, code optimization, and code generation strategies for parallel systems with multi- and many-core processors.
Tools for performance and memory behavior analysis (including cache simulation) for parallel systems with multi- and many-core processors.
Performance modeling and performance engineering approaches for multi-threaded and multi-process applications.
Application parallelization use cases, benchmarking and benchmark suites. Hardware-aware, compute- and memory-intensive simulations of real-world problems in computational science and engineering (for example, from applications in electrical, mechanical, civil, or medical engineering).
Many-core-aware approaches for large-scale parallel simulations in both algorithm design and implementation, including scalability studies.