"An optimized task-based runtime system for resource-constrained parallel accelerators," 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, 2016.
Manycore accelerators have recently proven to be a promising solution for increasingly powerful and energy-efficient computing systems. This raises the need for parallel programming models capable of effectively leveraging hundreds to thousands of processors. Task-based parallelism has the potential to provide such capabilities, offering flexible support for fine-grained and irregular parallelism. However, efficiently supporting this programming paradigm on resource-constrained parallel accelerators is a challenging task. In this paper, we present an optimized implementation of the OpenMP tasking model for embedded parallel accelerators, discussing the key design solutions that guarantee a small memory footprint and minimize performance overheads. We validate our design by comparing it to several state-of-the-art tasking implementations, using the most representative parallelization patterns. The experimental results confirm that our solution achieves near-ideal speedups for tasks as small as 5K cycles.
"Response-Time Analysis of DAG Tasks under Fixed Priority Scheduling with Limited Preemptions," Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, March 14-18, 2016.
Limited preemptive (LP) scheduling has been demonstrated to effectively improve schedulability compared with both the fully preemptive (FP) and fully non-preemptive (FNP) paradigms. On the one hand, LP reduces the preemption-related overheads of FP; on the other hand, it restricts the blocking effects of FNP. However, LP has been applied to multi-core scenarios only when completely sequential task systems are considered. This paper extends the current state-of-the-art response-time analysis for global fixed priority scheduling with fixed preemption points by deriving a new response-time analysis for DAG-based task sets.