
Overview

This glossary presents key terminology relevant when using LiME and interpreting its results. The following content is grouped into four main categories:

  1. OS abstractions used by the Linux kernel,
  2. analytical abstractions used in LiME's real-time modeling,
  3. a table connecting the Linux and real-time modeling abstractions, and
  4. an explanation (and interpretation) of LiME's output that characterizes the timing behaviour of the targeted threads.

The material presented here is based on the paper. Readers interested in further details, formal definitions, and empirical results are encouraged to consult the original publication.

Linux Preliminaries

Thread

A thread represents an ongoing computation defined by a register context and a call stack. Threads are the basic units of execution scheduled by the operating system. At any point in time, each thread in the system is either

  • runnable (i.e., either currently executing or ready to execute) or
  • blocked (i.e., in some state in which the thread’s computation cannot proceed until some thread-external event takes place).

Process

A process is a container for one or more threads that share system-level resources, including memory address space, file descriptors, and security capabilities. A process defines the resource context within which its threads execute.

🧩 Note: Linux documentation often refers to both threads and processes as “tasks.” To avoid ambiguity, LiME uses “thread” exclusively for the OS abstraction and reserves “task” for real-time model entities.

System Call

A system call is a controlled entry point through which a user-space thread requests a service from the operating system kernel. System calls mediate access to privileged operations such as file I/O, memory management, inter-process communication, and scheduling.

🧩 In LiME, the file ${TASK_ID}.events.json contains a chronologically ordered partial trace of system calls, each annotated with its corresponding timestamp. The trace is only partial because LiME monitors only potentially blocking system calls, not all system calls.
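As a sketch of how such a trace might be consumed, the snippet below loads an event list and orders it by timestamp. The field names "ts" and "syscall" are illustrative assumptions, not LiME's actual schema; consult a real ${TASK_ID}.events.json file for the exact layout.

```python
import json

def load_events(path):
    """Load an event trace from a JSON file and return it sorted by timestamp.

    Assumes each event is an object with (hypothetical) "ts" and "syscall"
    fields; adapt to the actual ${TASK_ID}.events.json schema.
    """
    with open(path) as f:
        events = json.load(f)
    return sorted(events, key=lambda e: e["ts"])

# Example with a hand-written trace (timestamps in arbitrary time units):
sample = [
    {"ts": 12, "syscall": "clock_nanosleep"},
    {"ts": 2, "syscall": "read"},
    {"ts": 7, "syscall": "futex"},
]
ordered = sorted(sample, key=lambda e: e["ts"])
print([e["syscall"] for e in ordered])  # → ['read', 'futex', 'clock_nanosleep']
```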

Thread Activity

Fundamentally, at any point in time, a thread can be involved in one of three types of activity:

  1. User-space computation — the thread computes in user space using only the processor and memory already mapped into its address space.
  2. Non-blocking kernel interaction — the thread invokes system calls (or triggers exceptions) that result in kernel-space computation without blocking the thread (e.g., getpid()).
  3. Blocking system interaction — the thread invokes potentially blocking system calls (or triggers exceptions) that change its state to blocked, preventing it from being scheduled until some external condition is satisfied.

Only category (3) affects the thread's readiness to run. While activities in categories (1) and (2) influence the thread's execution time, they do not affect whether the thread is ready to run.
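The distinction can be observed directly from user space; in this minimal sketch, os.getpid() stands in for a non-blocking kernel interaction, while time.sleep() (implemented via clock_nanosleep on Linux) is a potentially blocking one that suspends the thread:

```python
import os
import time

# Category (2): non-blocking kernel interaction — returns immediately,
# the thread stays runnable throughout.
t0 = time.monotonic()
pid = os.getpid()
non_blocking_elapsed = time.monotonic() - t0

# Category (3): potentially blocking interaction — the thread is blocked
# (not runnable) until the timer expires.
t0 = time.monotonic()
time.sleep(0.05)  # clock_nanosleep under the hood on Linux
blocking_elapsed = time.monotonic() - t0

print(non_blocking_elapsed < blocking_elapsed)  # → True
```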


Real-time System Preliminaries

Task and Task Model

A task is a mathematical abstraction of a thread for the purpose of real-time analysis. A task model formally characterizes the task’s timing behavior, including its arrival patterns, execution demands, and other temporal properties. These models are derived from traced thread activity and can be used to support schedulability analysis, system validation, and other downstream analyses.

🧩 In LiME, a task corresponds to a thread with specific properties (e.g., scheduling policy). Each task is assigned a unique TASK_ID, which combines the process ID (PID) with an identifier. When a thread’s properties change, the identifier is incremented, resulting in a new task with a different TASK_ID. Task information is recorded in ${TASK_ID}.infos.json.

Job

A job is a single invocation or instance of a task, corresponding to one discrete activation of the associated thread. Each job has an arrival (or release) time, a start time, a finish time, and an observed execution time.

🧩 In LiME, jobs are inferred from the observed trace and form the fundamental units of analysis for execution profiling and arrival-pattern characterization.

Separator

In LiME, each blocking system call (also referred to as a separator) is interpreted as a distinct potential job boundary.

‼️ Important: LiME outputs a task model for each identified separator.


Mapping Real-Time and Linux Concepts

This section summarizes how abstract real-time concepts correspond to concrete Linux-level observations used by LiME.

| Real-Time | Linux | Explanation |
| --- | --- | --- |
| Task | Thread | In Linux, a thread is the smallest sequence of instructions that can be independently scheduled, while a task models the timing behavior of a thread. |
| Job | An instance of a thread activation between consecutive blocking calls | A job is a task instance which corresponds to a thread (re)activation, bounded by blocking system calls. |
| Separator | Blocking system call | A separator is a blocking system call used as a delimiter to infer job boundaries from the trace. |
| Release (of a job) | Thread unblocking / system call return | Both are interpreted as the time a thread becomes runnable after a blocking operation. |
| Execution Time (of a job) | CPU time spent running the thread between two blocking calls | Measured based on intervals when the thread is actively executing. |


LiME Output: Characterising Timing Behaviour

LiME Pipeline Overview

Given an arbitrary black-box workload (e.g., a thread implementing periodic activations with clock_nanosleep, as shown in the figure above), LiME performs the following steps:

  1. Observes system activity by using eBPF to monitor key scheduling events and system calls.

  2. Reconstructs the timeline for each target thread (task) based on the collected trace.

  3. Identifies job boundaries by using its built-in understanding of Linux system call semantics (separators).

  4. Characterizes timing behaviour by deriving classic task models.

Ultimately, the timing behaviour is characterized by the task models contained in the ${TASK_ID}.models.json file, which are explained in detail below.
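The separator-based job inference at the heart of steps 3 and 4 can be sketched as follows. The event representation here is a simplified assumption for illustration, not LiME's internal format:

```python
# Split a thread's timeline into jobs at blocking system calls (separators).
# Each event is a (timestamp, kind) pair; kind "block" marks a separator.
def split_into_jobs(events):
    jobs, current = [], []
    for ts, kind in events:
        if kind == "block":
            if current:
                jobs.append(current)  # close the job at the separator
            current = []
        else:
            current.append(ts)  # non-blocking activity belongs to the current job
    if current:
        jobs.append(current)
    return jobs

timeline = [(0, "run"), (3, "run"), (5, "block"),
            (10, "run"), (12, "block"), (20, "run")]
print(split_into_jobs(timeline))  # → [[0, 3], [10], [20]]
```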

Characterization of Arrival Behavior

We begin by defining and describing the models that formally characterize thread arrival behaviour and patterns.

Sporadic Arrival Curve

A sporadic arrival curve models task arrivals (i.e., thread reactivations) with a lower bound $T$ on the time between any two consecutive job releases. It captures the notion that no two jobs of the same task can be released closer than $T$ time units apart. The sporadic task model is especially well-suited for representing infrequently triggered tasks, such as those activated by external events or interrupts.

Example with Explanatory Interactive Plot

Consider the following LiME output in ${TASK_ID}.models.json.

{ 
  "model": "sporadic",
  "mit": 5
}

The "mit": 5 field specifies the minimum inter-arrival time (MIT) of $T = 5$ time units between consecutive job releases. This is further exemplified and explained in the figure below.

💡Informal explanation (hover):
Hover over the plot to see the explanation.

See formal explanation

Let:
  • $r_j$ be the release time of the task’s $j$-th job,
  • $T$ be the task’s minimum inter-arrival time,
  • $\Delta \geq 0$ be the interval length under consideration.

Then, the number of job releases within any time window of length $\Delta$ is bounded by: $$ \forall t \in \mathbb{R}_+, \quad \big| \{ j \ : \ r_j \in [t, \ t + \Delta) \} \big| \leq \left\lceil \frac{\Delta}{T} \right\rceil $$
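Under these definitions, the MIT reported by a sporadic model is simply the smallest gap between consecutive observed releases; a minimal sketch:

```python
def observed_mit(releases):
    """Minimum inter-arrival time in a chronologically sorted list of
    release times (at least two releases assumed)."""
    return min(b - a for a, b in zip(releases, releases[1:]))

# Hypothetical release times consistent with the example ("mit": 5):
releases = [0, 5, 11, 16, 23]
T = observed_mit(releases)
print(T)  # → 5
# Any window [t, t + delta) then contains at most ceil(delta / T) releases.
```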

Upper and Lower Arrival Curves

Upper and lower arrival curves provide bounds on the number of thread activations (releases) that can occur within any time interval of length $\Delta$ in the observed trace.

Example with Explanatory Interactive Plot

💡Informal explanation (hover):
Hover over the plot to see the explanation for upper and lower arrival curves.

The above figure corresponds to the following LiME output in ${TASK_ID}.models.json.

"arrival_models": [
  {
    "model": "arrival_curve",
    "dmins": [3, 7, 20],
    "dmaxs": [4, 10, 20]
  }
]

The upper arrival curve is defined by "dmins", which specifies durations that upper-bound the number of thread activations: the $i$-th entry (counting from zero) of "dmins" is a duration such that any window of length at most $\textrm{dmins}[i]$ contains at most $i + 1$ activations. For $\Delta = 0$, the number of jobs is 0, and for $0 < \Delta \leq \textrm{dmins}[0]$ it is at most 1.

Similarly for the lower arrival curve, the "dmaxs" field specifies durations that lower-bound the number of thread activations: the $i$-th entry (counting from zero) of "dmaxs" is a duration such that any window of length at least $\textrm{dmaxs}[i]$ contains at least $i + 1$ activations. For $0 \leq \Delta < \textrm{dmaxs}[0]$, no activations are guaranteed.

See formal explanation

Let:
  • $r_j$ be the release time of the task’s $j$-th job,
  • $\Delta \geq 0$ be the interval length under consideration,
  • $\alpha_i^+(\Delta)$ be the upper arrival curve of task $\tau_i$,
  • $\alpha_i^-(\Delta)$ be the lower arrival curve of task $\tau_i$.

Then, the number of job releases of task $\tau_i$ in any time window of length $\Delta$ is bounded as follows:

$$ \forall t \in \mathbb{R}_+, \quad \alpha_i^-(\Delta) \leq \big| \{ j \ : \ r_j \in [t,\ t + \Delta) \} \big| \leq \alpha_i^+(\Delta) $$

That is, for every time interval of duration $\Delta$, the number of jobs released by task $\tau_i$ lies between $\alpha_i^-(\Delta)$ and $\alpha_i^+(\Delta)$.

🧩 The upper arrival curve is, by definition, no more pessimistic than the sporadic arrival curve. It provides a potentially tighter (i.e., less conservative) upper bound on the number of job releases over time.
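Under the zero-indexed interpretation described above, the upper arrival curve implied by "dmins" can be evaluated as a step function; a sketch, noting that beyond the last measured duration the trace provides no further bound:

```python
def upper_arrival_curve(delta, dmins):
    """Maximum number of activations in any window of length delta,
    derived from the "dmins" durations (interpretation as in the text).
    Beyond dmins[-1] the trace gives no bound; the value returned there
    is only the last measured bound plus one."""
    if delta <= 0:
        return 0
    n = 1
    for i, d in enumerate(dmins):
        if delta > d:
            n = i + 2  # windows longer than dmins[i] may hold i + 2 activations
    return n

dmins = [3, 7, 20]  # from the example output
print(upper_arrival_curve(2, dmins))   # → 1
print(upper_arrival_curve(5, dmins))   # → 2
print(upper_arrival_curve(10, dmins))  # → 3
```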

Periodic Arrival Curve

The jitter-aware periodic model defines regularly spaced thread activations, where each activation (release) is allowed to deviate within a bounded time window. That is, activations occur periodically with spacing $T$ (period), starting from $\Phi$ (offset), but each may be delayed by up to $J$ (jitter) time units.

Example with Explanatory Interactive Plot

Consider that LiME outputs the following in ${TASK_ID}.models.json.

{
  "model": "periodic",
  "on": "release",
  "period": 2,
  "offset": 10,
  "max_jitter": 1
}

Hover over the plot below to see this data interpreted for each job index.

💡Informal explanation (hover):
Hover over the plot to see the explanation for upper and lower bounds on release times.

In the plot above:

  • The green circles mark the earliest possible release times.
  • The red circles mark the latest possible release times.
  • The difference between the two circles corresponds to the jitter value (J = 1), and all actual releases must lie between these bounds.

In more detail, this data suggests that the first thread activation (release) occurs no earlier than the specified offset, i.e., at offset = 10 time units, and no later than offset + max_jitter = 11 time units. Subsequent activations follow with a spacing of period = 2 time units, each subject to a jitter of at most 1. That is, the j-th activation occurs within the interval $[\Phi + (j - 1) \cdot T \ , \ \Phi + (j - 1) \cdot T + J]$ with $\Phi = 10$, $T = 2$ and $J = 1$.

Note that the "on" field denotes if the exact intended thread activation times are observable, in which case its value is set to arrival. If the intended activation time is unobservable, and can only observe activation times that may be delayed due to various system events, the value is release.

See formal explanation

Let:

  • $r_j$ be the release time of the task’s $j$-th job,
  • $\Phi$ be the offset (first release),
  • $T$ be the period,
  • $J$ be the maximum jitter.

In a jitter-aware periodic model, the $j$-th job of a task is released within a bounded time window:

$$ \forall j \in \mathbb{N}, \ r_j \in [\Phi + (j - 1) \cdot T \ , \ \Phi + (j - 1) \cdot T + J] $$
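The release window of the $j$-th job follows directly from these parameters; a minimal sketch using the example values above:

```python
def release_window(j, offset, period, max_jitter):
    """Earliest and latest possible release time of the j-th job (j >= 1)
    in a jitter-aware periodic model."""
    earliest = offset + (j - 1) * period
    return (earliest, earliest + max_jitter)

# Values from the example: offset = 10, period = 2, max_jitter = 1.
print(release_window(1, 10, 2, 1))  # → (10, 11)
print(release_window(3, 10, 2, 1))  # → (14, 15)
```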

Characterization of Execution Behavior

Finally, we define and describe the models that formally characterize thread execution behaviour and patterns.

Worst-Case Execution Time (WCET)

The worst-case execution time (WCET) upper-bounds the execution time of any job.

Example with Explanatory Interactive Plot

Consider that LiME outputs the following in ${TASK_ID}.models.json.

{
  "wcet_n": [
        100,
        ...
      ],
}

The first element of "wcet_n" is the WCET, specifying that the maximum observed execution time of any identified job is 100 time units.

See formal explanation

Let:
  • $c_j$ be the observed execution time of the $j$-th observed job of the targeted task (thread),
  • $J$ be the number of observed jobs,
  • $C$ be the WCET.

Then, $c_j \leq C$ for all $1 \leq j \leq J$.

In general, it rarely occurs that several consecutive jobs simultaneously experience their worst-case execution times. For this reason, LiME also reports the following, more general execution model.

Multi-job WCET

The multi-job worst-case execution time (WCET-n) bounds the maximum cumulative execution time of any $n$ consecutive observed jobs. Thus, $\text{WCET}(1)$ reduces to the above-defined WCET bound.

Example with Explanatory Interactive Plot

Consider that LiME outputs the following in ${TASK_ID}.models.json.

{
  "wcet_n": [
        100,
        180,
        240,
        280,
        300
      ],
}

Hover over the plot below to see this data interpreted for each value of $n$.

💡Informal explanation (hover):
Hover over the plot to see how many jobs can execute within a given cumulative execution time.

See formal explanation

Let:

  • $c_j$ be the observed execution time of the $j$-th job of the target task (thread),
  • $J$ be the total number of observed jobs,
  • $\text{WCET}(k)$ be the maximum cumulative execution time of any $k$ consecutive jobs.

Then, for all $k \in \{1, 2, \dots, J\}$ and for all valid windows of $k$ consecutive jobs, i.e., $\forall j \in \{1, \dots, J - k + 1\}$, we have:

$$ \sum_{i = j}^{j + k - 1} c_i \leq \text{WCET}(k) $$

That is, the total execution time of any $k$ consecutive jobs is upper bounded by $\text{WCET}(k)$.
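Conversely, given a trace of observed per-job execution times, the "wcet_n" values can be computed with a sliding window; a sketch (the execution times below are hypothetical, chosen to reproduce the example output):

```python
def wcet_n(exec_times):
    """wcet_n[k-1] = maximum cumulative execution time of any k
    consecutive jobs in the observed sequence."""
    J = len(exec_times)
    result = []
    for k in range(1, J + 1):
        result.append(max(sum(exec_times[j:j + k]) for j in range(J - k + 1)))
    return result

print(wcet_n([100, 80, 60, 40, 20]))  # → [100, 180, 240, 280, 300]
```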

Self-suspension

Many threads in Linux exhibit self-suspension, that is, they may voluntarily pause their execution due to I/O, synchronization, or blocking system calls. Because self-suspensions affect timing behaviour and ultimately schedulability, they are explicitly modeled.

LiME reports two complementary models of self-suspension: a simple dynamic bound and a more precise segmented characterization.

Dynamic Self-suspension

The dynamic self-suspension model captures the maximum observed amount of time a job spent self-suspending.

Example with Explanatory Interactive Plot

Consider that LiME outputs the following in ${TASK_ID}.models.json.

{
  "dynamic_self_suspension": 40
}

💡Informal explanation:

This indicates that every observed job may suspend for at most 40 time units in total, regardless of the number of suspensions it experienced.

See formal explanation

Let:

  • $s_{j,k}$ be the duration of the $k$-th suspension of the $j$-th observed job,
  • $J$ be the number of observed jobs,
  • $m_j$ be the number of suspensions in job $j$,
  • $S$ be the dynamic self-suspension bound.

Then, for $1 \leq j \leq J, \quad \sum_{k=1}^{m_j} s_{j,k} \leq S$
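Given per-job lists of suspension durations, the dynamic bound is just the maximum per-job total; a minimal sketch with hypothetical data:

```python
def dynamic_self_suspension(jobs_suspensions):
    """jobs_suspensions[j] lists the suspension durations of job j.
    Returns the smallest S such that every observed job suspends for
    at most S time units in total."""
    return max(sum(s) for s in jobs_suspensions)

# Three observed jobs with their individual suspension durations:
print(dynamic_self_suspension([[10, 5], [40], [15, 15, 5]]))  # → 40
```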

Segmented Self-suspension

This model captures both the number of suspension segments and the upper bound on the duration of each segment. It is useful to capture the alternations between computation and suspension phases.

Example with Explanatory Plot

Consider that outputs the following in ${TASK_ID}.models.json.

"segmented_self_suspensions": {
        "1": [[10,20]],
        "2": [[15,30],[25,50]],
        "3": [[10,20],[15,22],[8,14]],
}

The following plot visualizes the above results, specifying the total number of suspension-execution pairs identified in the observed jobs. The data primarily suggests that there were jobs that experienced

  • only one suspension ($y = \text{1 segment}$),
  • two suspensions ($y = \text{2 segments}$) and
  • three suspensions ($y = \text{3 segments}$).

💡Informal explanation:

In the 2 segments row, consider the second grey rectangle, 25 time units long. One interpretation is:

  • For any job experiencing 2 self-suspensions, its second suspension segment lasts no more than 25 time units.

Similarly, in the 3 segments row, consider the first black rectangle, 20 units long. One interpretation is:

  • For any job experiencing 3 self-suspensions, its first execution segment lasts no more than 20 time units.
See formal explanation

Let:

  • $m_j$ be the number of suspension segments in the $j$-th observed job,
  • $(s_{j,k}, c_{j,k})$ be the duration of the $k$-th suspension and execution pair of job $j$,
  • $\vec{S}^{(\ell)} = \langle (S_{\ell,0}, C_{\ell,0}), \dots, (S_{\ell,m}, C_{\ell,m}) \rangle$ be a segment vector for a job with $m$ suspensions, from a predefined set of vectors (a "bag of segments").

Then, for each observed job $j$, there exists an admissible vector $\vec{S}^{(\ell)}$ such that:

  • the number of segments matches: $|\vec{S}^{(\ell)}| = m_j + 1$, and
  • the following bounds are satisfied:

$$ \text{for} \ 0 \leq k \leq m_j, \quad s_{j,k} \leq S_{\ell,k} \quad \text{and} \quad c_{j,k} \leq C_{\ell,k} $$

This captures both the number of segments and the maximum duration for each segment, providing upper bounds on both suspension and execution segments.
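A per-job check against one admissible segment vector can be sketched as follows; pairs are (suspension bound, execution bound), matching the [s, c] layout of the example output, and the job data is hypothetical:

```python
def matches_vector(job_pairs, bound_pairs):
    """Check whether a job's observed (suspension, execution) pairs are
    covered by a segment vector of (S, C) upper bounds with the same
    number of segments."""
    if len(job_pairs) != len(bound_pairs):
        return False
    return all(s <= S and c <= C
               for (s, c), (S, C) in zip(job_pairs, bound_pairs))

# A job with two suspension segments, checked against the "2" vector
# [[15, 30], [25, 50]] from the example output:
job = [(12, 28), (20, 45)]
bounds = [(15, 30), (25, 50)]
print(matches_vector(job, bounds))  # → True
```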