realtime:documentation:howto:debugging:debug-steps

Differences

This shows you the differences between two versions of the page.

realtime:documentation:howto:debugging:debug-steps [2018/08/21 09:30]
ebugden [Measure the length] Update link
realtime:documentation:howto:debugging:debug-steps [2023/10/03 05:39] (current)
costa.shul Latency detection tools
Line 23: Line 23:
 If it is possible, also pay attention to when the latency happens. If the measured latencies are of a similar length and they happen in similar situations, then they are most likely caused by the same issue.
  
-A tool that is frequently used for measuring latencies is [[realtime:documentation:howto:tools:cyclictest:start|Cyclictest]]. Using Cyclictest correctly can be challenging at first, but if the tool is configured correctly and is run for a sufficient amount of time, then it can provide reasonably accurate measurements for most latencies. The tool does have some limitations which are described in various places in its documentation such as on the Cyclictest [[realtime:documentation:howto:debugging:cyclictest:test-design:start|test design]] page.
+A tool that is frequently used for measuring latencies is [[realtime:documentation:howto:tools:cyclictest:start|Cyclictest]]. Using Cyclictest correctly can be challenging at first, but if the tool is configured correctly and is run for a sufficient amount of time, then it can provide reasonably accurate measurements for most latencies. The tool does have some limitations which are described in various places in its documentation such as on the Cyclictest [[realtime:documentation:howto:tools:cyclictest:test-design|test design]] page.
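
The measurement idea behind a cyclic latency test can be illustrated with a minimal Python sketch: sleep until a deadline, then record how late the wake-up actually was. This is only an illustration of the principle, not a substitute for Cyclictest. The function names `measure_wakeup_latencies` and `summarize` are hypothetical, and the Python interpreter adds jitter of its own, so the numbers are not comparable to real Cyclictest results.

```python
import time


def measure_wakeup_latencies(interval_us=1000, loops=1000):
    """Sleep until successive deadlines and return how late each wake-up was (in us)."""
    interval_ns = interval_us * 1000
    latencies_us = []
    deadline_ns = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining_ns = deadline_ns - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        woke_ns = time.monotonic_ns()
        # Latency = actual wake-up time minus the intended wake-up time.
        latencies_us.append((woke_ns - deadline_ns) / 1000)
        # Base the next deadline on the actual wake-up so that one late
        # wake-up is not counted again in the following iterations.
        deadline_ns = woke_ns + interval_ns
    return latencies_us


def summarize(latencies_us):
    """Return the minimum, average and maximum of the measured latencies."""
    return {
        "min": min(latencies_us),
        "avg": sum(latencies_us) / len(latencies_us),
        "max": max(latencies_us),
    }
```

Calling `summarize(measure_wakeup_latencies())` gives a rough minimum, average and maximum. As with Cyclictest, a short run says little about rare latencies, so the loop count matters.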
  
 ===== Isolate the source =====
Line 53: Line 53:
 Once again, it is best to eliminate the most obvious possible causes before moving on to the more complex possible causes. Bugs in the application or in the operating system are much more common than bugs in the firmware and in the hardware. So, start by confirming that interrupts and preemption are not disabled for too long and then explore other possibilities if necessary. Latencies can be caused by so many things. The task could be waiting for a resource, waiting for a lock, waiting for a device, etc.
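
One possible way to check how long interrupts or preemption stay disabled is the kernel's ftrace latency tracers (`irqsoff`, `preemptoff`), which record the longest observed interval in `tracing_max_latency` in microseconds. The sketch below assumes that tracefs is mounted at `/sys/kernel/tracing`, that the kernel was built with these tracers, and that the script runs as root. The helper name `max_disabled_time_us` and the `run_workload` callback are hypothetical.

```python
from pathlib import Path

# Assumed mount point; some systems expose it at /sys/kernel/debug/tracing.
TRACEFS = Path("/sys/kernel/tracing")


def max_disabled_time_us(tracer, run_workload, tracefs=TRACEFS):
    """Run a workload under a latency tracer and return the recorded maximum (us)."""
    available = (tracefs / "available_tracers").read_text().split()
    if tracer not in available:
        raise RuntimeError(f"tracer {tracer!r} is not available in this kernel")

    (tracefs / "current_tracer").write_text(tracer)
    try:
        # Reset the recorded maximum so only this run is measured.
        (tracefs / "tracing_max_latency").write_text("0")
        run_workload()
        # Read the result before switching the tracer off again.
        return int((tracefs / "tracing_max_latency").read_text())
    finally:
        # Always restore the no-op tracer, even if the workload fails.
        (tracefs / "current_tracer").write_text("nop")
```

For example, `max_disabled_time_us("irqsoff", run_workload)` would report the longest interrupts-off interval seen while the (hypothetical) `run_workload` function reproduces the problem. The tracer itself adds overhead, so the result is an indication rather than an exact figure.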
  
-If after looking at the code there does not seem to be anything that explains why the latency happens, then the latency could be caused by the firmware or the hardware. The documentation about identifying [[realtime:documentation:howto:debugging:cyclictest-smi-ftrace|SMI latencies]] with function tracing can help confirm if this is the case. If the latency is indeed caused by the firmware or the hardware, then determining exactly why the latency is happening can become extremely difficult. This is because there is often very little documentation available about the behavior of these parts of a system so it is sometimes challenging to understand exactly what causes the latency.
+If after looking at the code there does not seem to be anything that explains why the latency happens, then the latency could be caused by the firmware or the hardware. The documentation about identifying [[realtime:documentation:howto:debugging:smi-latency:cyclictest-tracing|SMI latencies]] with function tracing can help confirm if this is the case. If the latency is indeed caused by the firmware or the hardware, then determining exactly why the latency is happening can become extremely difficult. This is because there is often very little documentation available about the behavior of these parts of a system so it is sometimes challenging to understand exactly what causes the latency.
  
 ===== Fix the problem =====
Line 64: Line 64:
  
 After applying the supposed fix, test the system in the original conditions that caused the latency. If the latency does not occur, then this confirms that the latency has been fixed, or at least that it does not occur under those conditions as frequently. However, if the latency is still observed, then it could mean that the problem was correctly identified but that the solution is wrong. It could also mean that a different latency was resolved because the tracing overhead changed the behavior of the system.
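
The before/after check described above can be made a little more concrete with a small sketch that compares two sets of latency samples collected under the same conditions. The function names and the idea of a single threshold in microseconds are assumptions made for illustration, and the threshold itself depends on the application's requirements.

```python
def exceed_rate(samples_us, threshold_us):
    """Fraction of samples whose latency is above the threshold."""
    return sum(1 for s in samples_us if s > threshold_us) / len(samples_us)


def compare_runs(before_us, after_us, threshold_us):
    """Compare latency samples measured before and after a supposed fix."""
    rate_before = exceed_rate(before_us, threshold_us)
    rate_after = exceed_rate(after_us, threshold_us)
    return {
        "max_before": max(before_us),
        "max_after": max(after_us),
        "rate_before": rate_before,
        "rate_after": rate_after,
        # True only if the latency now exceeds the threshold less often.
        "less_frequent": rate_after < rate_before,
    }
```

A lower rate after the fix only shows that the latency is less frequent under the tested conditions. It does not prove that the latency is gone, which is why both runs should be long enough to have caught the original problem.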
 +
 +
 +More information
 +  * [[realtime:documentation:howto:tools:start#latency_detection|Latency detection tools]]
realtime/documentation/howto/debugging/debug-steps.1534843815.txt.gz · Last modified: 2018/08/21 09:30 by ebugden