Realtime Preemption

From: eLinux.org

1 Description
2 Resources
3 Downloads
- 3.1 Patch
- 3.2 Utility programs
4 How To Use
- 4.1 Configuration variables
5 How to validate
6 Related projects
7 Sample Results
8 Status
9 Future Work/Action Items
- 9.1 people who expressed interest

Description

Overview

Realtime Preemption is (as of this writing, 12/21/2004) a patch that
aims to improve the realtime performance of the Linux kernel.

Recent patches from Ingo Molnar include a large number of technologies
for improving preemption, and for debugging preemption issues, in the
Linux kernel.

An overview of the technologies is as follows:

- voluntary preempt = a set of voluntary preemption points for the
  kernel, to improve normal scheduling latency (These changes
  basically
- BKL change to semaphore
- latency tracer

Voluntary Preempt

Overview:

- If it is enabled at compile time, it can be turned off at runtime
  with the kernel command-line option “voluntary-preemption=0” or
  “voluntary-preemption=off”.
- Creates a new function, might_resched(), which is called by
  might_sleep().
  - might_resched() calls cond_resched() if voluntary preemption is on.
  - Adds might_sleep() calls in several places.
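
The call chain described above can be modeled in a few lines of
user-space C. This is only a sketch: the names might_sleep(),
might_resched() and cond_resched() come from the text, but the function
bodies, the voluntary_preemption switch, the need_resched_flag stand-in
and the reschedule_count counter are assumptions made for illustration,
not the patch's actual code.

```c
#include <stdbool.h>

/* Assumed model state; none of these variables are taken from the patch. */
bool voluntary_preemption = true; /* models voluntary-preemption=0/off */
bool need_resched_flag = false;   /* stand-in for "a reschedule is pending" */
int reschedule_count = 0;         /* how many times the model yielded */

/* Stand-in for cond_resched(): yield only if a reschedule is pending. */
int cond_resched(void)
{
    if (!need_resched_flag)
        return 0;
    need_resched_flag = false;
    reschedule_count++;
    return 1;
}

/* Becomes a preemption point only while voluntary preemption is on. */
void might_resched(void)
{
    if (voluntary_preemption)
        cond_resched();
}

/* Every place that calls might_sleep() gains a preemption point. */
void might_sleep(void)
{
    /* any debugging checks a real might_sleep() performs are omitted */
    might_resched();
}
```

In this model, setting need_resched_flag and calling might_sleep()
increments reschedule_count once; after clearing voluntary_preemption
the same call leaves the counter unchanged.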

Conversion of Spinlocks to Mutexes

According to Ingo Molnar, its primary author, “the big change in this
release is the addition of PREEMPT_REALTIME, which is a new
implementation of a fully preemptible kernel model”.

For a brief description of the overall technology, see:
http://kerneltrap.org/node/3995?PHPSESSID=4bc02ae16e5a27308031f3cd664fd574

Briefly, the technology makes spinlocks and rwlocks preemptible by
default.

- The patch auto-detects at compile time the type of lock to use
  for a spinlock (mutex or original raw_spinlock).
- It uses a feature of gcc to manage this (reducing patch size).
- It uses native Linux semaphores for preemption.
- It converts rwlocks to rw-semaphores.
- Apparently, about 90 locks are targeted for NON-conversion to
  preemptibility (that is, they are preserved as RAW_SPINLOCKS).
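
This page does not say which gcc feature the patch relies on. As an
illustration only, the sketch below assumes it is something like the
__builtin_types_compatible_p builtin, which lets a single spin_lock()
macro choose an implementation from the static type of its argument, so
call sites do not have to be edited. The struct layouts, the TYPE_EQUAL
helper macro, the raw_lock_impl() and mutex_lock_impl() functions and
the two call counters are hypothetical stand-ins, not kernel code.

```c
/* Hypothetical stand-ins for the two lock types (not the kernel layouts). */
typedef struct { int locked; } raw_spinlock_t; /* stays a real spinlock */
typedef struct { int held; } spinlock_t;       /* mutex-backed, preemptible */

int raw_lock_calls;   /* how often the raw path was taken */
int mutex_lock_calls; /* how often the mutex path was taken */

void raw_lock_impl(raw_spinlock_t *lock)
{
    lock->locked = 1;
    raw_lock_calls++;
}

void mutex_lock_impl(spinlock_t *lock)
{
    lock->held = 1;
    mutex_lock_calls++;
}

/* 1 if 'lock' has type 'type *'; gcc evaluates this at compile time. */
#define TYPE_EQUAL(lock, type) \
    __builtin_types_compatible_p(__typeof__(lock), type *)

/*
 * One macro for both lock types. The condition is a compile-time
 * constant, so the compiler can discard the branch that does not apply.
 * The casts only keep the discarded branch type-correct.
 */
#define spin_lock(lock)                                  \
    do {                                                 \
        if (TYPE_EQUAL((lock), raw_spinlock_t))          \
            raw_lock_impl((raw_spinlock_t *)(lock));     \
        else                                             \
            mutex_lock_impl((spinlock_t *)(lock));       \
    } while (0)
```

With this arrangement, keeping a lock non-preemptible is a one-line
change to its declaration (spinlock_t to raw_spinlock_t). That is one
plausible reading of how the approach reduces patch size.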

Ingo mentioned at one time that this was about 20% of the locks in his
kernel configuration, implying that there were about 450 spinlocks
present in the kernel in his configuration.

Ingo said this about how well this works on uniprocessor (UP) systems
versus SMP systems:

...and no matter how well UP works, to fix SMP one has to 'cover' all the
necessary locks first before fixing it, which (drastic) increase in raw
locks invalidates most of the UP efforts of getting rid of raw locks.
That's why i decided to go for SMP primarily - didnt see much point in
going for UP.

Normally, in UP the spinlocks are compiled away. When PREEMPT is turned
on (without the new patch) these spinlocks are turned into markers for
non-preemptible regions. When RT-PREEMPT is used,

people working on/interested in this stuff

- Ingo Molnar, Red Hat, voluntary preemption, Ingo real-time preemption
- Sven Dietrich, Monta Vista, MV real-time preemption
- Daniel Walker, Monta Vista, priority inheritance??
- John Cooper, Time Sys, ???
- Tim Bird, Sony, port to 2.6.10-native, port to PPC
- Scott Wood, Time Sys, IRQ threading??

- Bill Huey, Lynux Works??, mmlinux

miscellaneous comments

Comments regarding the scheduling of RT tasks

Ingo said (in this message):

note that my -RT patchset includes scheduler changes that implement
“global RT scheduling” on SMP systems. Give it a go, it’s at:

  http://redhat.com/~mingo/realtime-preempt/

you have to enable CONFIG_PREEMPT_RT to active this feature. I’ve
designed this code to not hurt non-RT scheduling, and i’ve optimized
performance for the ‘lightly loaded case’ (which is the most common to
occur on mainline-using systems).

A very short description of the design: there’s a global ‘RT overload
counter’ - which is zero and causes no overhead if there is at most 1 RT
task in every runqueue. (i.e. at most 2 RT tasks on a 2-way system, at
most 4 RT tasks on a 4-way system, etc.) If the system gets into ‘RT
overload’ mode (e.g. the third RT task gets activated on a 2-way box),
then the scheduler starts to balance the RT tasks agressively. Also,
whenever an RT task is preempted on a CPU, or is woken up but cannot
preempt a higher-prio RT task on a given CPU, then it’s ‘pushed’ to
other CPUs if possible. This design avoids global locking (it avoids a
global runqueue), which simplifies things immensely. (I first tried a
global runqueue for RT tasks but the complexity impact was much bigger.)

(note that these scheduler changes are resonably self-contained and do
not depend on other parts of PREEMPT_RT, so in theory they could be
added to mainline too, after some time - given lots of testing and broad
agreement.)
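
The ‘RT overload counter’ idea from the message above can be sketched
as a small user-space model. Everything here is an assumption made for
illustration: the names rt_nr_running, rt_overload, rt_task_activate(),
rt_task_deactivate() and rt_push_task() are hypothetical, NR_CPUS is
fixed at 2 to match the 2-way example, and the model ignores task
priorities and locking, which the real scheduler cannot do.

```c
#include <assert.h>

#define NR_CPUS 2 /* models the "2-way box" from the message */

int rt_nr_running[NR_CPUS]; /* RT tasks queued on each CPU's runqueue */
int rt_overload;            /* number of runqueues holding 2+ RT tasks */

/* An RT task becomes runnable on 'cpu'. */
void rt_task_activate(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (++rt_nr_running[cpu] == 2)
        rt_overload++; /* this runqueue just became overloaded */
}

/* An RT task leaves the runqueue of 'cpu'. */
void rt_task_deactivate(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS && rt_nr_running[cpu] > 0);
    if (rt_nr_running[cpu]-- == 2)
        rt_overload--; /* back to at most one RT task here */
}

/*
 * Try to push one RT task away from an overloaded 'cpu' to a CPU that
 * has no RT task. Returns the target CPU, or -1 if nothing was moved.
 * The early return is the cheap path: no scan while rt_overload is 0.
 */
int rt_push_task(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (rt_overload == 0 || rt_nr_running[cpu] < 2)
        return -1;
    for (int target = 0; target < NR_CPUS; target++) {
        if (rt_nr_running[target] == 0) {
            rt_task_deactivate(cpu);
            rt_task_activate(target);
            return target;
        }
    }
    return -1; /* every CPU already runs an RT task */
}
```

In this model, two RT tasks activated on CPU 0 raise rt_overload to 1,
and a push moves one of them to CPU 1 and returns the counter to 0. A
third RT task then has nowhere to go, so the counter stays nonzero,
which corresponds to the ‘RT overload’ mode described in the message.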

comments regarding the hard parts of this work

Ingo says (at:
http://groups-beta.google.com/group/linux.kernel/msg/cf036477d30ab736)

some of the harder stuff:

- the handling of per-CPU data structures (get_cpu_var())
- RCU and softirq data structures
- the handling of the IRQ flag

comments about the number of raw spinlocks needed

Ingo says (at:
http://groups-beta.google.com/group/linux.kernel/msg/e63b2860d2e993dd)

Sven Dietrich sdietr...@mvista.com wrote:

IMO the number of raw_spinlocks should be lower, I said teens before.

Theoretically, it should only need to be around hardware registers and
some memory maps and cache code, plus interrupt controller and other
SMP-contended hardware.

yeah, fully agreed. Right now the 90 locks i have means roughly 20% of
all locking still happens as raw spinlocks.

But, there is a ‘correctness’ minimum set of spinlocks that must be
raw spinlocks - this i tried to map in the -T4 patch. The patch does run
on SMP systems for example. (it was developed as an SMP kernel - in fact
i never compiled it as UP :-|.) If code has per-CPU or preemption
assumptions then there is no choice but to make it a raw spinlock, until
those assumptions are fixed.

Rationale

This feature is intended to provide much better realtime scheduling
response for a Linux system.

Resources

Projects

Various parties are working on ports: Time Sys
and Monta Vista, in particular, seem to have made ports to PPC and ARM
platforms.

Specifications

None that I’m aware of.

Online resources

The original announcement for voluntary-preemption:

http://people.redhat.com/mingo/realtime-preempt/older/ANNOUNCE-voluntary

Here’s some stuff by Jonathan Corbet:

There’s a page of links about RT for audio at:

http://www.affenbande.org/~tapas/wiki/index.php?Low%20latency%20for%20audio%20work%20on%20linux%202.6.x

A brief introduction to the RT patch (in Japanese only):

http://www.atmarkit.co.jp/fembedded/rtos03/rtos03a.html

Paper: “Embedded GNU/Linux and Real-Time an executive summary”,
2010, by Robert Berger
- This paper, prepared for the Embedded World Conference 2010,
  compares different real-time approaches (including RT-preempt
  and dual-kernel approaches).
- The paper has an extensive list of references, which are very
  good.

Downloads

Patch

 See http://redhat.com/~mingo/realtime-preempt/

Utility programs

[other programs, user-space, test, etc. related to this technology]

How To Use

- apply patch
- choose desired preemption level
- compile kernel

Configuration variables

The patch introduces (or modifies) the following configuration
variables:

Variable	Purpose
ASM_SEMAPHORES
BLOCKER
CRITICAL_IRQSOFF_TIMING
CRITICAL_PREEMPT_TIMING
CRITICAL_TIMING
FRAME_POINTER
LATENCY_TIMING
LATENCY_TRACE
MCOUNT
PREEMPT
PREEMPT_BKL
PREEMPT_DESKTOP
PREEMPT_HARDIRQS
PREEMPT_NONE
PREEMPT_RT
PREEMPT_SOFTIRQS
PREEMPT_TRACE
PREEMPT_VOLUNTARY
RTC_HISTOGRAM
RT_DEADLOCK_DETECT
RWSEM_GENERIC_SPINLOCK
RWSEM_XCHGADD_ALGORITHM
SPINLOCK_BKL
USE_FRAME_POINTER
WAKEUP_TIMING

Retrieved from the patch with the command:

grep "[+-]config " realtime-preempt-2.6.10-mm1-V0.7.34-01 | sed "s/[+-]config //" | sort | uniq

How to validate

[put references to test plans, scripts, methods, etc. here]

- use included trace feature, or
- use included latency overrun reporting mechanism
- Preemption_Instrumentation

Related projects

Monta Vista released a similar technology, which had the following
features (see
http://groups-beta.google.com/group/linux.kernel/msg/7eeef031d9ec1446):

These RT enhancements are an integration of features developed by
others and some new MontaVista components:

- Voluntary Preemption by Ingo Molnar
- IRQ thread patches by Scott Wood and Ingo Molnar
- BKL mutex patch by Ingo Molnar (with MV extensions)
- PMutex from Germany’s Universitaet der Bundeswehr, Munich
- MontaVista mutex abstraction layer replacing spinlocks with mutexes

Sample Results

[Examples of use with measurement of the effects.]

Case Study 1

Linux RT Benchmarking Framework
- http://www.opersys.com/lrtbf/
Summary of discussion on LKML (in Japanese)
- http://japan.linux.com/kernel/05/07/25/2334226.shtml?topic=1
- http://japan.linux.com/kernel/05/08/29/0817208.shtml?topic=1

Case Study 2

Trevor Woerner published results in November 2005 from latency
measurements he had been recording on the 2.6.14 kernel with
Ingo’s patches.

See
http://geek.vtnet.ca/embedded/LatencyTests/html/index.html

Case Study 3

Status

Rt_Preempt_Subpatch_Table
Status: [not started??]

(one of: not started, researched, implemented, measured, documented, accepted)
Architecture Support:

(for each arch, one of: unknown, patches apply, compiles, runs, works, accepted)
- i386: unknown
- ARM: unknown
- PPC: unknown
- MIPS: unknown
- SH: unknown

Future Work/Action Items

Here is a list of things that could be worked on for this feature:

- help with mainlining???
- perform testing on multiple platforms
- provide use cases for justification
- what else?
- break patch into manageable pieces - doesn’t Ingo use any kind of patch management system???

people who expressed interest

Manas Saksena, Jon Masters, Takeharu Kato, Ralph Siemsen, Jyunji Kondo
