The SIL2LinuxMP project is a collaborative research project to provide procedures and methods to qualify Linux on a multi-core embedded platform at safety integrity level 2 (SIL2) according to IEC 61508 Ed 2.
Some more information on the SIL2LinuxMP project is available here:
Primary mentoring contact: Lukas Bulwahn, lukas.bulwahn at gmail.com, Nicholas Mc Guire, der.herr at hofr.at
Recently, efforts by Google's kernel developers have made it possible to compile large parts of the Linux kernel with the clang/LLVM compiler. The clang compiler includes a number of static analysis methods that identify typical bug patterns and code smells, which it reports to the user as compiler warnings. However, when one compiles the current Linux kernel with clang and enables certain warning classes, the compiler emits a tremendously large number of warnings.
The final goal is to find a setup that, with suitable post-processing and filtering, reduces the number of clang warnings to the point where Linux kernel developers can practically use the reported potential bugs, bug patterns and code smells to improve their patches during development.
A first analysis of specific cases of the clang warnings showed that many of the found warnings fall into a comparatively small set of classes that follow a very specific and characteristic syntactic pattern and hence, many warnings can be assessed once.
As a first step, one needs to determine the relevance of each type of clang warning for the Linux kernel and its contribution to avoiding certain typical bug patterns. If the clang warning type is in general considered relevant, the next step is to identify the typical set of subclasses of this warning. Then, we can write the syntactic patterns as coccinelle scripts to automatically group all warnings of the identified subclasses. As a last step, the subclasses can be assessed to determine whether they are generally false reports (that is, an irrelevant subclass within a relevant class of warnings), or whether the subclass points to code where a bug can be uncovered with comparatively high probability. Once the different classes have been identified and assessed, further warnings can be marked automatically and presented to the developers weighted by their confidence and relevance.
This way, developers should have a useful static analysis report for their patches, and the kernel community can gradually improve the different parts of the Linux kernel during the continuous evolution of the overall source code. If time within the project permits, this approach can further be applied to the findings of other static analysis tools, e.g., clang-tidy or sparse.
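Before any subclass can be described in coccinelle, the raw warnings have to be grouped by warning class. The following is a minimal sketch of that preparatory step, assuming the usual clang diagnostic shape `path:line:col: warning: message [-Wflag]`. The function names and the sample log are made up for illustration and are not part of any existing project tooling.

```python
import re
from collections import Counter

# Assumed shape of a clang diagnostic: path:line:col: warning: message [-Wflag]
WARNING_RE = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):(?P<col>\d+): warning: "
    r"(?P<msg>.*?)(?: \[(?P<flag>-W[^\]]+)\])?$"
)


def parse_warnings(log_text):
    """Return one dict per warning line found in a build log."""
    findings = []
    for raw in log_text.splitlines():
        match = WARNING_RE.match(raw.strip())
        if match:
            findings.append(match.groupdict())
    return findings


def count_by_flag(findings):
    """Count warnings per warning class; 'unknown' if no flag was printed."""
    return Counter(f["flag"] or "unknown" for f in findings)


# Hypothetical log excerpt, not taken from a real kernel build.
SAMPLE_LOG = """\
drivers/foo/bar.c:10:5: warning: unused variable 'x' [-Wunused-variable]
drivers/foo/baz.c:22:9: warning: unused variable 'ret' [-Wunused-variable]
fs/qux.c:7:14: warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
  CC      fs/qux.o
"""

if __name__ == "__main__":
    for flag, count in count_by_flag(parse_warnings(SAMPLE_LOG)).most_common():
        print(count, flag)
```

The per-class counts only indicate which warning classes are worth assessing first; identifying subclasses within a class still requires the syntactic patterns described above.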
Benefit: The student project provides more evidence that kernel development follows a defined process, and the discussion of identified patterns shows that a controlled and informed process is followed to decide which code patterns should generally be followed and which are enforced by the different tools and their setup.
To work on this project, you should show that you are able to use all the needed tools, especially coccinelle, and that you can properly analyse Linux kernel source code and write a precise and well-written report on your findings. Details on what is required for a successful application are provided on request.
The goal of this project is to develop a tool that describes, as a symbolic term, the conditions under which a source file is included in the build of the Linux kernel, i.e., a boolean formula over the kernel configuration flags and further kernel build variables.
For example, the `kernel/Makefile` includes the following build instructions:
    obj-$(CONFIG_FUTEX) += futex.o
    ifeq ($(CONFIG_COMPAT),y)
    obj-$(CONFIG_FUTEX) += futex_compat.o
    endif
Here, the file `kernel/futex.c` is included in the build if and only if CONFIG_FUTEX is set. Similarly, `kernel/futex_compat.c` is included if and only if both CONFIG_FUTEX and CONFIG_COMPAT are set.
The developed tool should be able to analyse this and provide for this example an output that would look like this:
    kernel/futex.c <- CONFIG_FUTEX
    kernel/futex_compat.c <- CONFIG_COMPAT && CONFIG_FUTEX
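For this particular example, a minimal sketch of such a tool could look as follows. It assumes that only two Makefile patterns matter, `obj-$(CONFIG_X) += file.o` and `ifeq ($(CONFIG_Y),y) ... endif`, and it deliberately skips everything else. The function name `file_conditions` is hypothetical, and the mapping from `file.o` to `file.c` is a simplification.

```python
import re

# The two Makefile patterns this sketch understands.
OBJ_RE = re.compile(r"^obj-\$\((CONFIG_\w+)\)\s*\+?=\s*(.+)$")
IFEQ_RE = re.compile(r"^ifeq\s*\(\$\((CONFIG_\w+)\)\s*,\s*y\)$")
CONDITIONAL_KEYWORDS = ("ifeq", "ifneq", "ifdef", "ifndef")


def file_conditions(makefile_text, directory):
    """Map each source file to the sorted CONFIG_ flags that must all be set.

    Objects under a conditional this sketch does not understand are skipped.
    A later rule for the same file overwrites an earlier one.
    """
    conditions = {}
    guards = []  # enclosing conditions; None marks an unsupported conditional
    for raw in makefile_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        keyword = line.split()[0]
        ifeq = IFEQ_RE.match(line)
        if ifeq:
            guards.append(ifeq.group(1))
        elif keyword in CONDITIONAL_KEYWORDS:
            guards.append(None)
        elif keyword == "else":
            if guards:
                guards[-1] = None  # negated branches are not modelled
        elif line == "endif":
            if guards:
                guards.pop()
        else:
            obj = OBJ_RE.match(line)
            if obj and None not in guards:
                flags = sorted(set(guards) | {obj.group(1)})
                for name in obj.group(2).split():
                    if name.endswith(".o"):
                        conditions[f"{directory}/{name[:-2]}.c"] = flags
    return conditions


MAKEFILE = """\
obj-$(CONFIG_FUTEX) += futex.o
ifeq ($(CONFIG_COMPAT),y)
obj-$(CONFIG_FUTEX) += futex_compat.o
endif
"""

if __name__ == "__main__":
    for path, flags in file_conditions(MAKEFILE, "kernel").items():
        print(path, "<-", " && ".join(flags))
```

The sketch represents each condition as a plain conjunction of flags. A real tool would need a richer formula type to express negation, disjunction and the distinction between built-in and module builds.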
Achieving this perfectly would be a Herculean task: as the kernel build system is make, which is already quite complex by itself and can further invoke arbitrary shell scripts, it would require reimplementing the complete make build system and shell evaluation as a symbolic evaluator. This would be much too complex and laborious considering the value of the information to be collected.
However, we are convinced that the Makefiles in the kernel tree use only a rather small number of patterns that are relevant for obtaining the information we are interested in. Hence, the symbolic expression can be determined by writing a rather simple symbolic evaluator that understands the most prominent patterns and ignores the parts of the Makefile that are not relevant for the symbolic expression.
This symbolic evaluator provides a good first approximation of that information, sufficient to address further questions. As the number of covered patterns increases, the approximation improves. At any point of the development, the quality of the symbolic evaluation can be measured against ground truth by comparing the tool's symbolic prediction with actual kernel builds for a large number of concrete build configurations.
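The comparison against ground truth can be sketched as follows, assuming the symbolic result is available as a mapping from file to a list of required flags and that the set of files actually compiled for one configuration has been extracted from a real build. All names and the data below are hypothetical.

```python
def predicted_files(conditions, enabled_flags):
    """Files whose required flags are all enabled in the given configuration."""
    return {
        path for path, flags in conditions.items()
        if set(flags) <= enabled_flags
    }


def compare_with_build(conditions, enabled_flags, actually_built):
    """Report where the symbolic prediction disagrees with one real build."""
    predicted = predicted_files(conditions, enabled_flags)
    return {
        "missed": sorted(actually_built - predicted),    # built, not predicted
        "spurious": sorted(predicted - actually_built),  # predicted, not built
    }


# Hypothetical symbolic result and hypothetical build observation.
CONDITIONS = {
    "kernel/futex.c": ["CONFIG_FUTEX"],
    "kernel/futex_compat.c": ["CONFIG_COMPAT", "CONFIG_FUTEX"],
}

if __name__ == "__main__":
    report = compare_with_build(
        CONDITIONS,
        enabled_flags={"CONFIG_FUTEX"},
        actually_built={"kernel/futex.c"},
    )
    print(report)
```

Repeating this over many concrete configurations gives a simple quality measure: the fewer missed and spurious files, the better the approximation.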
Furthermore, these boolean expressions can be combined with the constraints of the kconfig tool to find potential contradictions in the overall kernel build configuration setup.
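As an illustration of such a contradiction check, the sketch below tests by brute force whether a build condition can ever hold together with a configuration constraint. The flags `CONFIG_A` and `CONFIG_B` and the constraint between them are invented for this example. Exhaustive enumeration only works for a handful of variables, so a real tool would more likely hand the formulas to a SAT solver.

```python
from itertools import product


def satisfiable(variables, predicates):
    """True if some assignment of the variables satisfies every predicate."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(predicate(env) for predicate in predicates):
            return True
    return False


VARIABLES = ["CONFIG_A", "CONFIG_B"]


def build_condition(env):
    """Hypothetical Makefile condition: the file needs both flags."""
    return env["CONFIG_A"] and env["CONFIG_B"]


def kconfig_constraint(env):
    """Hypothetical Kconfig rule: CONFIG_B depends on !CONFIG_A."""
    return (not env["CONFIG_B"]) or (not env["CONFIG_A"])


if __name__ == "__main__":
    if not satisfiable(VARIABLES, [build_condition, kconfig_constraint]):
        print("contradiction: the file can never be part of a build")
```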
Once this tool works properly at file granularity, we could consider extending it to provide precise information for each line within a file. Again, we would follow a similarly pragmatic approach, parsing a small set of patterns to determine whether certain lines are only included under certain build configuration conditions.
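A line-level variant could start from a single preprocessor pattern. The sketch below assumes that only `#ifdef CONFIG_X ... #endif` is understood. Lines under any other conditional (`#if`, `#ifndef`, `#else`, `#elif`) are reported with an unknown condition instead of a guessed one. The function name is hypothetical.

```python
import re

IF_RE = re.compile(r"^#\s*(if|ifdef|ifndef)\b")
IFDEF_CONFIG_RE = re.compile(r"^#\s*ifdef\s+(CONFIG_\w+)\s*$")
ELSE_RE = re.compile(r"^#\s*(else|elif)\b")
ENDIF_RE = re.compile(r"^#\s*endif\b")


def line_conditions(source_text):
    """Return (line_number, flags) for every non-directive line.

    flags is a sorted list of CONFIG_ flags that must be set, or None if
    the line sits under a conditional this sketch does not understand.
    """
    stack = []  # enclosing conditions; None marks an unsupported one
    result = []
    for number, raw in enumerate(source_text.splitlines(), start=1):
        line = raw.strip()
        if IF_RE.match(line):
            match = IFDEF_CONFIG_RE.match(line)
            stack.append(match.group(1) if match else None)
        elif ELSE_RE.match(line):
            if stack:
                stack[-1] = None  # negated branches are not modelled
        elif ENDIF_RE.match(line):
            if stack:
                stack.pop()
        else:
            flags = None if None in stack else sorted(set(stack))
            result.append((number, flags))
    return result


SOURCE = """\
int always;
#ifdef CONFIG_COMPAT
int compat_only;
#endif
"""

if __name__ == "__main__":
    for number, flags in line_conditions(SOURCE):
        print(number, flags)
```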
The result of this symbolic evaluation is useful in multiple ways:
To work on this project, you should show that you are able to use all the needed tools, and that you can implement a simple interpreter, e.g., that you can write a parser, interpreter and static analyser for a small toy programming language. Details on what is required for a successful application are provided on request.
Task: Run Facebook Infer on the Linux kernel source code, write models in Facebook Infer to improve the analysis.
This is quite a challenging project; in the application, we expect you to show that you can run Facebook Infer on the complete Linux kernel source code and that you have obtained a first analysis result.
If you cannot run Facebook Infer on the complete Linux kernel source, you should show that you understand why Facebook Infer fails on certain parts, suggest alternative workarounds and solutions, and have already applied the workarounds, so that you can run Facebook Infer on the kernel source code with a fully known, understood and limited number of workarounds.
The project proposal should include first technical steps that show how you write models in Infer.
We use a number of tools (checkpatch.pl, coccinelle scripts, sparse, etc.), and these tools report certain findings. While the valid findings are addressed by the kernel developers, the invalid ones are manually assessed and not acted upon. Over time, as the valid findings are addressed, the proportion of invalid findings increases compared to newly appearing valid findings, because the invalid findings of those tools are not marked and tracked across kernel versions.
In this GSoC project, the student should work out methods and tools to track the tool findings and make these tools useful in the Linux kernel community.
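One possible building block for such tracking is a fingerprint that stays stable when a finding merely moves to a different line number, so that a finding already assessed as invalid can be recognised again in a later kernel version. The sketch below is only one assumption about how this could be done. The field names (`tool`, `path`, `message`, `source_line`) are hypothetical and do not correspond to the output format of any of the tools mentioned.

```python
import hashlib


def fingerprint(tool, path, message, source_line):
    """Hash a finding without its line number, ignoring whitespace changes."""
    normalized = " ".join(source_line.split())
    key = "\x00".join([tool, path, message, normalized])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_findings(findings, known_invalid):
    """Drop findings whose fingerprint was already assessed as invalid."""
    return [f for f in findings if fingerprint(**f) not in known_invalid]


if __name__ == "__main__":
    # Hypothetical finding assessed as invalid in an earlier kernel version.
    old = {
        "tool": "sparse",
        "path": "kernel/foo.c",
        "message": "example message",
        "source_line": "x  =  y;",
    }
    known_invalid = {fingerprint(**old)}

    # The same finding after re-indentation, plus a genuinely new one.
    current = [
        dict(old, source_line="\tx = y;"),
        dict(old, message="another example message"),
    ]
    for finding in new_findings(current, known_invalid):
        print(finding["path"], finding["message"])
```

A fingerprint like this breaks when the flagged line itself is edited or the file is renamed; handling those cases is part of the open design question.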
The Linux kernel community has a number of tools to ensure the quality of the continuous kernel development. Among these tools are coccinelle, sparse, checkpatch.pl, lock dependency validator, KASAN, syzkaller and many more.
In the GSoC project, the student should find suitable ways to make the Linux developers aware of the tools' findings. There are various ways in which this could be implemented, e.g.:
This project idea is quite wide and we expect the student to provide a more specific description of the task to tackle with some evidence that he/she will be able to implement the proposal.
The goal of this project is to create and gather detailed information on bugs in the Linux kernel. To obtain this information, the student will have to manually assess a large number of commits from the Linux kernel stabilisation process.
The assessment should allow us to answer these kinds of questions:
This project requires creating an ontology of bugs; this involves reading general literature and considerable effort assessing and understanding bug-fix commits. You will mainly not be programming new features, but you will learn a lot about the kernel code by examining and assessing bug fixes and trying to derive general statements from these observations.