Predicting multicore timing interference
How much is my application exposed to multicore timing interference? Can I get early figures on the worst-case impact of multicore execution on its timing behavior? And, finally, can I exploit such information to amend or guide the design of the hardware/software system configuration?
The Task Contention Model (TCM) aims to provide a positive answer to all three questions.
Multicore timing interference arises when multiple cores send requests to the same hardware resource in parallel. In the most common scenario, such a resource cannot serve multiple requests at the same time, so the cores contend for the right to use it and some requests are necessarily delayed. The maximum cumulative delay suffered by the requests issued by a task determines the impact of multicore timing interference on that task.
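The notion of a maximum cumulative delay can be illustrated with a minimal sketch. It assumes a single shared resource with round-robin arbitration and a fixed worst-case service time, so that each request waits for at most one request from every other core. The arbitration policy, the function name and the figures are illustrative assumptions, not the TCM's actual model.

```python
def worst_case_contention_delay(n_requests: int, n_cores: int, service_cycles: int) -> int:
    """Upper bound (in cycles) on the cumulative delay suffered by a task.

    Assumption (hypothetical, for illustration only): round-robin arbitration,
    so each request of the task waits for at most one request from each of
    the other (n_cores - 1) cores, each taking at most `service_cycles`.
    """
    if n_requests < 0 or n_cores < 1 or service_cycles < 0:
        raise ValueError("arguments must be non-negative, with at least one core")
    return n_requests * (n_cores - 1) * service_cycles


if __name__ == "__main__":
    # Hypothetical figures: 1000 requests, 4 cores, 10 cycles per request.
    print(worst_case_contention_delay(1000, 4, 10))
```

On a single core the bound collapses to zero, which matches the intuition that contention needs at least one other requester.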
The TCM provides an analytical (as opposed to empirical) upper bound on the worst-case impact on the execution time of a given task when it executes in a multicore system rather than a single-core one. The bounds computed by the TCM can be used in the early design stages of the software development life cycle to steer and optimize the design and configuration of the final system, with a view to reducing (or even avoiding) multicore timing interference.
The baseline reference for the TCM is the worst-case execution time bound in isolation of the task (or process or runnable) under analysis. The TCM computes the worst-case effect of contention, which ultimately serves as a delta factor to inflate the single-core execution time budgets to account for any possible multicore timing interference. The bounds computed by the TCM are, by construction, conservative with respect to all possible execution scenarios under a specific hardware and software configuration. The latter aspect is particularly relevant because the inflating factor, or delta, is inherently platform and configuration dependent: actual figures for the same task may vary widely across deployment scenarios. The TCM builds on the following elements:
A specific platform comes with its own set of shared resources and interference channels, and typically provides different configuration options that may positively or negatively affect the amount of interference in the system. A clear example is support for partitioning a shared cache. Building on an extensive analysis and characterization effort and long-standing hardware expertise, the TCM identifies and models the sources of multicore timing interference. Timing characterization of hardware resources is conducted using architectural micro-benchmarks. The hardware platform is a rather static element in the TCM once the target and configuration are selected.
Software architecture and execution model
The execution model enforced by the software architecture, in combination with the scheduling mechanisms implemented by the underlying real-time operating system (RTOS) and/or hypervisor, determines the multicore execution scenarios. The potential timing interference incurred by a task clearly depends on the degree of segregation (mostly in time, but also in space) guaranteed by the architectural paradigm: the amount of interference in a statically scheduled, fully partitioned ARINC 653 system with space and time partitioning cannot be compared with the interference potentially arising in a fully preemptive, globally scheduled system. The TCM models the execution model of the system under analysis, building on the architectural paradigm and the RTOS/hypervisor specification.
Because timing interference arises from simultaneous requests to the same resource, the share of interference suffered by a task depends on two application-specific aspects. On the one hand, interference is a function of the actual usage of each shared resource in the system: it depends on both the number of accesses performed by the task under analysis and the number of accesses performed by the tasks running on the other cores. On the other hand, interference can only arise between tasks that execute in parallel. Both aspects are expected as inputs to the TCM: the former is determined by the application profile as captured by hardware event monitors (HEMs), and the latter is defined by the specific task set and its properties (e.g., scheduling attributes such as priority or activation pattern).
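The two application-specific inputs described above can be combined in a small sketch. It assumes a statically scheduled system in which each task has a known execution window, and a deliberately simple rule: a contender can delay the task under analysis at most once per access of its own, so each pair contributes min(own accesses, contender accesses) times the worst-case latency of the resource. The class, the field names, the window-based overlap test and all figures are hypothetical; the TCM's actual analytical model is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    name: str
    core: int
    accesses: dict   # resource name -> access count (e.g. read from HEMs)
    window: tuple    # (start, end) of the task's execution window


def may_overlap(a: TaskProfile, b: TaskProfile) -> bool:
    """Tasks can interfere only if they run on different cores at the same time."""
    return a.core != b.core and a.window[0] < b.window[1] and b.window[0] < a.window[1]


def contention_bound(task: TaskProfile, others: list, latency: dict) -> int:
    """Conservative delay bound (cycles) under the assumptions stated above."""
    total = 0
    for other in others:
        if not may_overlap(task, other):
            continue  # no parallel execution, hence no interference
        for resource, n_task in task.accesses.items():
            n_other = other.accesses.get(resource, 0)
            # Each conflicting access pair costs at most one worst-case latency.
            total += min(n_task, n_other) * latency[resource]
    return total


# Hypothetical task set and latencies, for illustration only.
latency = {"bus": 10, "l2": 20}
task_a = TaskProfile("A", core=0, accesses={"bus": 1000, "l2": 200}, window=(0, 100))
task_b = TaskProfile("B", core=1, accesses={"bus": 400}, window=(50, 150))
task_c = TaskProfile("C", core=1, accesses={"bus": 5000}, window=(200, 300))
task_d = TaskProfile("D", core=0, accesses={"bus": 9999}, window=(0, 100))

bound_a = contention_bound(task_a, [task_b, task_c, task_d], latency)
```

Only task B contributes here: C never runs in parallel with A, and D shares A's core. A priority-driven or globally scheduled system would need a different overlap test derived from its scheduling attributes.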
The TCM computes early bounds on the worst-case impact of multicore timing interference on the execution time of each task or process in the system under analysis. The underlying analytical model builds on conservative assumptions to cover those aspects that can be accurately determined only in very late development stages, close to the final product. The TCM results allow early and fast evaluation of different hardware and software configuration options: the produced bounds can be fed back to a configuration tool or optimization framework as part of a timing interference avoidance or reduction strategy.
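As a rough illustration of how such bounds might be fed back into a configuration decision, the sketch below inflates a task's isolation budget with the contention delta obtained under each candidate configuration and ranks the candidates by the resulting budget. The function names, configuration labels and figures are hypothetical, and a real optimization framework would weigh many more criteria.

```python
def inflated_budget(wcet_isolation: int, contention_delta: int) -> int:
    """Multicore budget = single-core bound in isolation + contention delta."""
    return wcet_isolation + contention_delta


def rank_configurations(wcet_isolation: int, deltas_by_config: dict) -> list:
    """Return (configuration, budget) pairs, smallest inflated budget first."""
    budgets = {
        name: inflated_budget(wcet_isolation, delta)
        for name, delta in deltas_by_config.items()
    }
    return sorted(budgets.items(), key=lambda item: item[1])


# Hypothetical deltas (cycles) for two cache configurations.
ranking = rank_configurations(
    wcet_isolation=100_000,
    deltas_by_config={"shared_cache": 30_000, "partitioned_cache": 4_000},
)
best_config, best_budget = ranking[0]
```

Because the delta is platform and configuration dependent, the same task can end up with very different budgets across candidates, which is the signal a configuration tool would act on.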