realtime:documentation:howto:applications:cpuidle

Differences

This shows you the differences between two versions of the page.

realtime:documentation:howto:applications:cpuidle [2023/09/27 15:46]
costa.shul [Configurations to guard critical cores from interference] cpu-partitioning
realtime:documentation:howto:applications:cpuidle [2024/06/28 11:32] (current)
costa.shul [Reference] sysfs
Line 13:
 ===== Configurations to guard critical cores from interference =====

-It would help to understand some basic configurations used in a real-time application environment to help reduce interference into the cores that run the real-time applications.  These configurations are done in kernel boot parameters.  Real-time applications can be run in “mixed mode” where some cores run real-time applications referred to as “critical cores” while other cores run regular tasks. If not running in mixed mode then all the cores would be running real-time applications and some of the configurations discussed below may not be necessary. See [[realtime:documentation:howto:tools:c:start|CPU partitioning]] and [[https://docs.kernel.org/admin-guide/kernel-parameters.html#cpu-lists|cpu lists in The kernel's command-line parameters]] for details.
+It would help to understand some basic configurations used in a real-time application environment to help reduce interference into the cores that run the real-time applications.  These configurations are done in kernel boot parameters.  Real-time applications can be run in “mixed mode” where some cores run real-time applications referred to as “critical cores” while other cores run regular tasks. If not running in mixed mode then all the cores would be running real-time applications and some of the configurations discussed below may not be necessary. See [[realtime:documentation:howto:tools:cpu-partitioning:start|CPU partitioning]] and [[https://docs.kernel.org/admin-guide/kernel-parameters.html#cpu-lists|cpu lists in The kernel's command-line parameters]] for details.

 isolcpus=//list of critical cores// –
Line 88:
 ===== Tools used to measure latencies =====

-[[realtime:documentation:howto:tools:cyclictest:start|Cyclictest]] is used to measure the latencies while turbostat is used to identify the C states that are selected and their residencies.
+[[realtime:documentation:howto:tools:cyclictest:start|Cyclictest]] is used to measure the latencies while [[https://manpages.debian.org/testing/linux-cpupower/turbostat.8.en.html|turbostat]] is used to identify the C states that are selected and their residencies.
 See [[https://man.archlinux.org/man/cyclictest.8.en|cyclictest manpage]].
  
Line 134:
 CPU 3 is the critical core running real-time workloads.  It is isolated and protected as described above.

-At each point we can use turbostat to check the C states used in a CPU as follows:
+At each point we can use [[https://manpages.debian.org/testing/linux-cpupower/turbostat.8.en.html|turbostat]] to check the C states used in a CPU as follows:
 <code>
 $turbostat --debug
Line 200:
 CPU topology plays an important role on how the processor utilizes the power saving capabilities of the different C states. Processors have multiple cores and the operating system groups logical CPUs within each core.  Each of these groupings has shared resources that can be turned off only when all the processing units in that group reach a certain C state.  If one logical CPU in a core can enter a deep C state but other logical CPUs are still running or at a lesser power saving C state, the CPU that can enter the deep state will be held at a less power saving state.  This is because if the shared resources are turned off, then the other CPUs that are still running, will not be able to run.  The same applies to package C states.  A package can enter a deep C state only when all the cores in that package enter a certain deep C state, when the package level components can be turned off.

-When designing a multi-core real-time application, assign tasks to a cluster of cores that can go idle at the same time.  This may require some static configuration and knowledge of processor topology. Tools like turbostat can be used to get an idea of the groupings.
+When designing a multi-core real-time application, assign tasks to a cluster of cores that can go idle at the same time.  This may require some static configuration and knowledge of processor topology. Tools like [[https://manpages.debian.org/testing/linux-cpupower/turbostat.8.en.html|turbostat]] can be used to get an idea of the groupings.

 Another area to consider is cache optimization.  Deeper C states would cause caches and TLBs to be flushed.  Upon resume, the caches need to be reloaded for optimal performance. This reloading can cause latencies at places where it was not expected based on earlier calibrations.  This can be avoided by adding logic in the methods described above to also force the cache to get repopulated by critical memory regions.  As the application wakes up from deeper C states earlier than the approaching critical phase, it can access the memory regions it would need to reference in the critical phase, forcing them to get reloaded in the cache.  This cache repopulating technique can be incorporated into any general cache optimization scheme the real-time application may be using. The technique applies not only to C states but also to any situation where the cache must be repopulated.
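
The core and package groupings discussed in this hunk can also be read directly from sysfs, in addition to turbostat. The following is a minimal sketch, not part of the wiki page: it assumes a Linux system that exposes ''thread_siblings_list'' and ''physical_package_id'' under ''/sys/devices/system/cpu/cpu*/topology/'', and the helper names ''parse_cpu_list'' and ''core_groups'' are made up for illustration.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs cpu-list string such as "0-2,5" into [0, 1, 2, 5].

    Only commas and simple ranges are handled, which is what the
    topology files contain; strides ("0-10:2") are not supported here.
    """
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def core_groups(sysfs_root="/sys/devices/system/cpu"):
    """Map each tuple of sibling logical CPUs (one core) to its package id."""
    groups = {}
    for topo in sorted(Path(sysfs_root).glob("cpu[0-9]*/topology")):
        try:
            siblings = parse_cpu_list((topo / "thread_siblings_list").read_text())
            package = int((topo / "physical_package_id").read_text())
        except (OSError, ValueError):
            # Offline CPUs or unusual platforms may lack these files.
            continue
        groups[tuple(siblings)] = package
    return groups


if __name__ == "__main__":
    for siblings, package in sorted(core_groups().items()):
        print(f"package {package}: logical CPUs {list(siblings)} share a core")
```

Logical CPUs listed together share a core, so they are the ones that have to go idle together before that core's shared resources can be turned off.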
Line 206:
 ===== Reference =====

-Kernel parameters: https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
+[[https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#:~:text=cpuidle|Kernel parameters]]

-Kernel scheduling ticks: https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt
+[[https://www.kernel.org/doc/html/latest/timers/no_hz.html|NO_HZ: Reducing Scheduling-Clock Ticks]]

-PM QoS: https://www.kernel.org/doc/Documentation/power/pm_qos_interface.txt
+[[https://www.kernel.org/doc/html/latest/power/pm_qos_interface.html|PM Quality Of Service Interface]]

-Cyclictest: https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/cyclictest
+[[realtime:documentation:howto:tools:cyclictest:start|Cyclictest]]

-Reducing OS jitter: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/kernel-per-CPU-kthreads.txt?h=v4.14-rc2
+[[https://www.kernel.org/doc/html/latest/admin-guide/kernel-per-CPU-kthreads.html|Reducing OS jitter due to per-cpu kthreads]]

-Good reference for C states: https://books.google.com/books?id=DFAnCgAAQBAJ&pg=PA177&lpg=PA177&dq=c+state+latency+MSR&source=bl&ots=NLTLrtN4JJ&sig=1ReyBgj1Ej0_m6r6O8wShEtK4FU&hl=en&sa=X&ved=0ahUKEwifn4yI08vZAhUFwVQKHW1nDgIQ6AEIZzAH#v=onepage&q=c%20state%20latency%20MSR&f=false
+[[https://books.google.com/books?id=DFAnCgAAQBAJ&pg=PA177&lpg=PA177&dq=c+state+latency+MSR&source=bl&ots=NLTLrtN4JJ&sig=1ReyBgj1Ej0_m6r6O8wShEtK4FU&hl=en&sa=X&ved=0ahUKEwifn4yI08vZAhUFwVQKHW1nDgIQ6AEIZzAH#v=onepage&q=c%20state%20latency%20MSR&f=false|Good reference for C states]]

-
+[[https://manpages.debian.org/testing/linux-cpupower/cpupower-idle-info.1.en.html|cpupower idle-info]] - Utility to retrieve cpu idle kernel information
+
+Sysfs: ''/sys/devices/system/cpu/cpu*/cpuidle/''
+
+Source: [[https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/linux/cpuidle.h|include/linux/cpuidle.h]]
+[[https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/cpuidle|drivers/cpuidle]]
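
The sysfs directory added in this revision can be inspected with a short script. The sketch below is an illustration rather than part of the page: it assumes the usual per-state attribute files ''name'', ''latency'' (exit latency in microseconds), ''usage'' (entry count), ''time'' (total residency in microseconds) and ''disable'' exist under ''cpuidle/stateN/'', and the function names ''read_idle_states'' and ''residency_share'' are hypothetical.

```python
from pathlib import Path


def read_idle_states(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return one dict per cpuidle state of `cpu`, ordered by state index."""
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    state_dirs = sorted(base.glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    states = []
    for state_dir in state_dirs:
        info = {"index": int(state_dir.name[len("state"):])}
        for attr in ("name", "latency", "usage", "time", "disable"):
            try:
                value = (state_dir / attr).read_text().strip()
                info[attr] = value if attr == "name" else int(value)
            except (OSError, ValueError):
                # Attribute missing or unreadable on this kernel: skip it.
                continue
        states.append(info)
    return states


def residency_share(states):
    """Percentage of the accumulated idle time spent in each named state."""
    total = sum(s.get("time", 0) for s in states)
    if total == 0:
        return {}
    return {s["name"]: 100.0 * s.get("time", 0) / total
            for s in states if "name" in s}


if __name__ == "__main__":
    states = read_idle_states(cpu=0)
    shares = residency_share(states)
    for s in states:
        name = s.get("name", "?")
        print(f"state{s['index']} {name}: exit latency {s.get('latency', '?')} us, "
              f"entered {s.get('usage', '?')} times, "
              f"{shares.get(name, 0.0):.1f}% of idle time")
```

The counters are cumulative since boot, so comparing two readings taken before and after a test run gives a per-run view similar to the residencies that turbostat reports.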
  
realtime/documentation/howto/applications/cpuidle.1695829614.txt.gz · Last modified: 2023/09/27 15:46 by costa.shul