realtime:documentation:rcu [2016/10/28 16:01] paulmck Move preemptible RCU to the end |
— (current) | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== RCU Configuration for Real-Time Systems ====== | ||
- | In mainline Linux, RCU carries out significant processing in softirq context, | ||
- | during which preemption is disabled. This wiki page describes how to configure | ||
- | RCU so as to minimize the resulting real-time latency degradation. | ||
- | |||
- | ===== RCU Callback Offloading ===== | ||
- | |||
- | By default in mainline Linux, RCU callbacks are invoked in softirq context. | ||
- | These callbacks often free memory, and the memory allocators can therefore | ||
- | impose large latencies when they take their slowpaths. Although these | ||
- | latencies cannot be avoided, they can be directed to the CPUs of your choice | ||
- | through use of RCU callback offloading. | ||
- | |||
- | To offload callbacks, build your kernel with ''CONFIG_RCU_NOCB_CPU=y''. | ||
- | To enable callback offloading on all CPUs, build with ''CONFIG_RCU_NOCB_CPU_ALL=y''. | ||
- | If you wish to be more selective, specify a list of CPUs to be offloaded with the | ||
- | ''rcu_nocbs'' kernel boot parameter. | ||
- | For example, ''rcu_nocbs=1,3-4'' would enable callback offloading on CPUs 1, 3, and 4. | ||
- | Note that offloading can be specified only at boot time, and cannot be changed | ||
- | at runtime. | ||
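As an illustration of how a list such as ''rcu_nocbs=1,3-4'' expands, here is a minimal Python sketch. The helper names ''parse_cpu_list'' and ''offloaded_cpus'' are hypothetical and are not part of any kernel tooling. The sketch handles only comma-separated CPU numbers and simple ''first-last'' ranges, and it assumes you pass it the text of the kernel command line (for example, the contents of ''/proc/cmdline'').

```python
def parse_cpu_list(spec):
    """Expand a simple CPU list such as "1,3-4" into a sorted list of CPU numbers.

    Only plain numbers and first-last ranges are handled in this sketch.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError("descending range: %r" % part)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def offloaded_cpus(cmdline):
    """Return the CPUs named by rcu_nocbs= in a kernel command-line string.

    Returns an empty list if the parameter is absent (hypothetical helper).
    """
    for token in cmdline.split():
        if token.startswith("rcu_nocbs="):
            return parse_cpu_list(token.split("=", 1)[1])
    return []
```

For the example above, ''parse_cpu_list("1,3-4")'' yields CPUs 1, 3, and 4.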
- | |||
- | Each CPU with offloaded callbacks will have a group of ''rcuo'' kthreads. | ||
- | For example, CPU 1 would have ''rcuob/1'' (for RCU-bh), ''rcuop/1'' (for RCU-preempt), | ||
- | and ''rcuos/1'' (for RCU-sched). | ||
- | These kthreads can be assigned to specific CPUs and can be assigned scheduling priorities | ||
- | as desired. | ||
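When scripting the placement of these kthreads, the first step is usually to pick them out by name. The following Python sketch, which assumes the ''rcuob/N'', ''rcuop/N'', and ''rcuos/N'' naming described above, maps a thread name to its RCU flavor and CPU. The function name ''classify_rcuo'' is hypothetical.

```python
import re

# Thread names follow the pattern described above: rcuo<flavor>/<cpu>.
_RCUO_NAME = re.compile(r"^rcuo([bps])/(\d+)$")
_FLAVORS = {"b": "RCU-bh", "p": "RCU-preempt", "s": "RCU-sched"}


def classify_rcuo(comm):
    """Return (flavor, cpu) for an rcuo kthread name, or None for any other name."""
    match = _RCUO_NAME.match(comm.strip())
    if match is None:
        return None
    return _FLAVORS[match.group(1)], int(match.group(2))
```

Given the PID of a matching thread, Python's ''os.sched_setaffinity(pid, cpus)'' is one way to bind it to a chosen set of CPUs on Linux. Doing so typically requires root privileges, and the warning below about overloading a single CPU applies.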
- | |||
- | There is of course no free lunch. | ||
- | Use of RCU callback offloading means that ''call_rcu()'' incurs greater overhead due to atomic operations, cache misses, and wakeups. | ||
- | The wakeup overhead alone can result in tens of percent throughput degradation on some workloads, which is why Linux distributions default to no callback offloading. | ||
- | This wakeup overhead can be shifted from the task invoking ''call_rcu()'' to the ''rcuo'' kthreads using the ''rcu_nocb_poll'' kernel boot parameter, but at the expense of degraded energy efficiency due to the polling wakeups. | ||
- | Note that care is required when assigning the ''rcuo'' kthreads to specific CPUs. For example, placing all of these kthreads on a single CPU might overload that CPU, which could throttle callback invocation and potentially even OOM the system. | ||
- | |||
- | Note that any ''nohz_full'' CPU will also have its RCU callbacks offloaded. This mode of operation also gracefully handles CPU-bound real-time user-space threads. | ||
- | |||
- | ===== RCU Priority Boosting ===== | ||
- | |||
- | One potential downside of preemptible RCU is that a low-priority task might be preempted in the middle of an RCU read-side critical section. | ||
- | If the system's higher-priority tasks consume all available CPU, that low-priority task might never resume, and thus might never leave its critical section. | ||
- | This in turn would prevent RCU grace periods from completing, eventually OOMing the system. | ||
- | |||
- | This normally indicates a design or configuration bug: Event-driven real-time applications should leave significant idle time in order to avoid queuing delays in the scheduler, among other things. This idle time would permit the low-priority task to proceed, in turn allowing grace periods to complete, thus avoiding OOM. | ||
- | |||
- | However, bugs can happen, including bugs involving infinite loops in high-priority real-time threads. | ||
- | Debugging these problems is more difficult if the system keeps hanging due to OOM. | ||
- | One way to ease debugging is to build with ''CONFIG_RCU_BOOST=y'', which by default will boost tasks blocking the current grace period for more than half a second to real-time priority level 1. | ||
- | The Kconfig options ''CONFIG_RCU_KTHREAD_PRIO'' and ''CONFIG_RCU_BOOST_DELAY'' provide further control over RCU priority boosting. | ||
- | Please see the Kconfig help text for more information. | ||
- | |||
- | ===== Expedited RCU Grace Periods ===== | ||
- | |||
- | Embedded systems sometimes have severe boot-time requirements, and RCU's grace-period delays can be a problem for these systems. | ||
- | If so, using the ''CONFIG_RCU_EXPEDITE_BOOT'' Kconfig option will cause RCU to expedite grace periods until ''init'' is spawned, thus speeding up the early boot process. | ||
- | In addition, the ''rcupdate.rcu_expedited'' and ''rcupdate.rcu_normal'' sysfs parameters can be used to enable and disable expedited grace periods at runtime. | ||
- | |||
- | However, it is unwise to use too many expedited grace periods while an event-driven real-time application is running, because expedited grace periods send IPIs to all non-idle CPUs. That said, RCU considers ''nohz_full'' CPUs to be idle, so CPU-bound real-time threads are not impeded by these IPIs. | ||
- | Use the ''rcupdate.rcu_normal'' sysfs parameter to completely disable RCU's expedited grace periods. | ||
- | Note that this does not come for free: Some networking configuration operations run much more slowly when ''rcupdate.rcu_normal'' is in effect. | ||
- | |||
- | ===== Real-Time, RCU, and softirqs ===== | ||
- | |||
- | The ''-rt'' patchset contains a patch that causes RCU to substitute kthreads for most of its softirq execution. | ||
- | This patch is not yet in mainline due to large performance degradation for some workloads. | ||
- | It is hoped that mainline will gain this capability sooner rather than later, but it will need to be disabled by default for non-real-time builds/boots. | ||
- | Until then, configure RCU as described on this page so as to minimize the real-time latency degradation caused by softirq processing. | ||
- | |||
- | ===== Preemptible RCU ===== | ||
- | |||
- | Although real-time kernel builds typically enable ''CONFIG_PREEMPT_RCU=y'' by default, you should double-check this. | ||
- | Failing to enable this Kconfig option can result in excessive latencies due to non-preemptible RCU read-side critical sections. |
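One way to double-check a Kconfig option is to search the configuration the kernel was built with. The Python sketch below assumes you have that configuration available as text. Where it lives varies by system: ''/boot/config-<release>'' and ''/proc/config.gz'' are common locations, but neither is guaranteed to exist. The helper name ''config_enabled'' is hypothetical.

```python
def config_enabled(config_text, option):
    """Return True if the kernel config text contains exactly "<option>=y".

    Lines such as "# CONFIG_FOO is not set" or "CONFIG_FOO=m" do not count.
    """
    wanted = option + "=y"
    return any(line.strip() == wanted for line in config_text.splitlines())


# Illustrative config fragment, not taken from any real system.
SAMPLE_CONFIG = """\
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_BOOST is not set
CONFIG_RCU_NOCB_CPU=y
"""
```

The same check applies to the other options discussed on this page, such as ''CONFIG_RCU_NOCB_CPU'' and ''CONFIG_RCU_BOOST''.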