gsoc:2018-gsoc-safety-critical-linux

Differences

This shows you the differences between two versions of the page.

gsoc:2018-gsoc-safety-critical-linux [2018/02/23 13:05]
lukas.bulwahn created first proposals
gsoc:2018-gsoc-safety-critical-linux [2018/03/19 13:04] (current)
lukas.bulwahn Two more proposals
Line 15: Line 15:
 ===== Student Project Proposals =====
  
-==== Filtering clang compiler warnings with coccinelle scripts ====
+==== Tailoring clang compiler warnings with coccinelle scripts ====
  
-Recently, some efforts by Google's kernel developers have made it possible to compile large parts of the Linux kernel with the clang/LLVM compiler.
-
-The clang compiler is known to include a number of static analysis methods that identify typical bug patterns and bad code smells, which the compiler indicates to the user by compiler warnings.
-
-However, when one compiles the current Linux kernel with clang and switches on to warn on certain warning classes, the compiler shows a tremendous large number of warnings.
+Recently, some efforts by Google's kernel developers have made it possible to compile large parts of the Linux kernel with the clang/LLVM compiler. The clang compiler is known to include a number of static analysis methods that identify typical bug patterns and bad code smells, which the compiler reports to the user as compiler warnings. However, when one compiles the current Linux kernel with clang and enables certain warning classes, the compiler shows a tremendously large number of warnings.
  
 The final goal is to find a setup that reduces the number of clang compiler warnings through suitable post-processing and filtering, so that Linux kernel developers can practically improve their submitted patches during development using the reported findings of potential bugs, bug patterns and code smells.
Line 27: Line 23:
 A first analysis of specific cases of the clang warnings showed that many of the found warnings fall into a comparatively small set of classes that follow a very specific and characteristic syntactic pattern; hence, many warnings can be assessed at once.
  
-As a first step, one needs to determine the relevance of the type of clang warnings for the Linux kernel and its contribution to the avoidance of certain typical bug patterns. If the clang warning type in general considered relevant, the next step is to identify the typical set of subclasses of this warning.
-
-We can write the syntactic patterns in coccinelle scripts to automatically group all warnings of the identified subclasses. As a last step, the subclasses can then be assessed to determine if they
-are generally false reports (so, an irrelevant subclass within the relevant class of warnings), or if this subclass points to code where a bug can be uncovered with a comparatively high probability.
-
-Once the different classes have been identified and assessed, further warnings can be automatically marked and these warnings can be presented to the developers weighted by the confidence and relevance of the warnings
-
-This way, the developers should have useful static analysis report for their patches and the kernel community can gradually improve the different parts of the Linux kernel during the continuous evolution of the overall source code.
+As a first step, one needs to determine the relevance of the type of clang warnings for the Linux kernel and its contribution to the avoidance of certain typical bug patterns. If the clang warning type is in general considered relevant, the next step is to identify the typical set of subclasses of this warning. Then, we can write the syntactic patterns in coccinelle scripts to automatically group all warnings of the identified subclasses. As a last step, the subclasses can then be assessed to determine whether they are generally false reports (that is, an irrelevant subclass within the relevant class of warnings), or whether the subclass points to code where a bug can be uncovered with a comparatively high probability. Once the different classes have been identified and assessed, further warnings can be automatically marked, and these warnings can be presented to the developers weighted by the confidence and relevance of the warnings.
  
-If time within the project permits, this approach can further be applied to finding of other static analysis tools, e.g., clang tidy or sparse.
+This way, the developers should have a useful static analysis report for their patches, and the kernel community can gradually improve the different parts of the Linux kernel during the continuous evolution of the overall source code. If time within the project permits, this approach can further be applied to the findings of other static analysis tools, e.g., clang-tidy or sparse.
  
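The grouping and weighting steps described above can be sketched roughly as follows. This is only an illustration in Python and does not use coccinelle: it groups warnings by the `[-W...]` flag that clang prints at the end of a diagnostic, whereas the proposal envisions finer syntactic subclasses. The table `ASSESSED_WEIGHTS` and its values are hypothetical placeholders for the manual assessment.

```python
import re
from collections import defaultdict

# Usual shape of a clang diagnostic: path:line:col: warning: message [-Wflag]
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+): warning: "
    r"(?P<msg>.*?) \[(?P<flag>-W[^\]]+)\]$"
)

# Hypothetical result of a manual assessment: warning class -> relevance weight.
ASSESSED_WEIGHTS = {
    "-Wsometimes-uninitialized": 0.9,
    "-Wunused-parameter": 0.1,
}


def parse_warnings(log_text):
    """Return one dict per clang warning line found in a build log."""
    found = []
    for line in log_text.splitlines():
        match = DIAGNOSTIC.match(line.strip())
        if match:
            found.append(match.groupdict())
    return found


def group_by_class(warnings):
    """Group parsed warnings by their -W flag."""
    groups = defaultdict(list)
    for warning in warnings:
        groups[warning["flag"]].append(warning)
    return dict(groups)


def rank_classes(groups, default_weight=0.5):
    """Return (flag, weight, count) rows, most relevant class first.

    Classes that have not been assessed yet get default_weight.
    """
    rows = [
        (flag, ASSESSED_WEIGHTS.get(flag, default_weight), len(items))
        for flag, items in groups.items()
    ]
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))
```

A report for developers could then list the warnings of the highest-ranked classes first and hide or collapse the classes assessed as mostly false reports.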
 Required Knowledge:
Line 75: Line 64:
 However, we are convinced that the Makefiles in the kernel tree have only a
 rather low number of patterns that are relevant to obtain the information
-that we are interested in.
-
-Hence, the symbolic expression can be quite simply be determined by writing a
-rather simple symbolic evaluator that understands the most prominent patterns
+that we are interested in. Hence, the symbolic expression can be determined quite simply by writing a rather simple symbolic evaluator that understands the most prominent patterns
 and ignores the parts in the Makefile that are not relevant for the symbolic
 expression.
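As a rough illustration of such an evaluator, the Python sketch below handles a single prominent kbuild pattern, `obj-$(CONFIG_X) += file.o` (plus the literal `obj-y` and `obj-m` forms), and ignores every other line. It assumes that the information of interest is which configuration option controls each object file; the function name `symbolic_conditions` is made up for this example.

```python
import re

# Matches "obj-$(CONFIG_FOO) += foo.o bar.o" as well as "obj-y += core.o".
OBJ_LINE = re.compile(
    r"^obj-(?:\$\((?P<cfg>CONFIG_\w+)\)|(?P<lit>[ym]))"
    r"\s*[+:]?=\s*(?P<objs>.*)$"
)


def symbolic_conditions(makefile_text):
    """Map each object (or subdirectory) to the set of conditions building it.

    A condition is either a CONFIG_ symbol name or the literal "y"/"m".
    Lines that do not match the known pattern are skipped on purpose.
    """
    conditions = {}
    # Join lines that end with a backslash continuation.
    text = makefile_text.replace("\\\n", " ")
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        match = OBJ_LINE.match(line)
        if not match:
            continue  # not relevant for the symbolic expression
        condition = match.group("cfg") or match.group("lit")
        for obj in match.group("objs").split():
            conditions.setdefault(obj, set()).add(condition)
    return conditions
```

A real evaluator would need more patterns (for example composite objects and `ifdef` blocks), but the structure stays the same: recognise a small set of forms and skip the rest.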
Line 114: Line 100:
 To work on this project, you should show that you are able to use all the needed tools, and that you can implement a simple interpreter, e.g., you can write a parser, interpreter and static analyser for a small toy programming language. Details on what is required for a successful application are provided on request.
  
 +==== Apply Facebook Infer to Linux kernel source code ====
 +
 +Task: Run Facebook Infer on the Linux kernel source code and write models in Facebook Infer to improve the analysis.
 +
 +This is quite a challenging project; in the application, we expect you to have shown that you can run Facebook Infer on the complete Linux kernel source code and that you have obtained a first analysis result.
 +
 +If you cannot run Facebook Infer on the complete Linux kernel source, you should show that you understand why Facebook Infer fails on certain parts, suggest alternative workarounds and solutions, and have already applied the workarounds, so that you can run Facebook Infer on the kernel source code with a limited number of fully known and understood workarounds.
 +
 +The project proposal should include first technical steps that show how you write models in Infer.
 +
 +Required Knowledge:
 +  - Required: Very good understanding of static analysis
 +  - Required: Very good knowledge of C, skill to READ AND UNDERSTAND source code in the Linux kernel in independent work
 +  - Required: Very good knowledge of make and python
 +  - Required: Good knowledge and practical experience with OCaml
 +  - Required: good analytical skills to understand why static analysis reports findings in certain source code parts, solid documentation skills, good English writing skills with clear precise style
 +  - Desired: Basic knowledge of the kernel build system
 +
 +==== Develop Methods for Tracking Tool Analysis Findings over Time ====
 +
 +We use a number of tools (checkpatch.pl, coccinelle scripts, sparse, etc.), and these tools report certain findings.
 +While the valid ones are addressed by the kernel developers, the invalid tool findings are manually assessed and not acted upon. As the valid findings are addressed over time, the proportion of invalid findings increases compared to newly appearing valid findings, because invalid findings of those tools are not marked and tracked across the various versions.
 +
 +In this GSoC project, the student should work out methods and tools to track the tool findings and make these tools useful in the Linux kernel community.
 + 
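One possible starting point for such tracking, shown here only as a sketch, is to give every finding a fingerprint that does not depend on line numbers, so that the same finding can be recognised again after unrelated edits move the code. The function names and the choice of fields below are assumptions made for this illustration, not an existing tool.

```python
import hashlib


def fingerprint(tool, path, message, source_line):
    """Return a short stable identifier for one tool finding.

    The line number is deliberately left out; the whitespace-normalised
    text of the flagged source line is used instead.
    """
    normalized = " ".join(source_line.split())
    key = "\0".join([tool, path, message, normalized])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def compare_versions(old, new, known_invalid):
    """Classify findings between two kernel versions.

    old, new and known_invalid are sets of fingerprints; known_invalid
    holds findings that were manually assessed as invalid.
    """
    persisting = old & new
    return {
        "new": new - old,                            # need assessment
        "fixed": old - new,                          # no longer reported
        "still_invalid": persisting & known_invalid,  # can be hidden
        "still_open": persisting - known_invalid,
    }
```

With this scheme, a finding marked invalid once stays marked in later versions as long as the flagged line itself is unchanged; renamed files or reworded tool messages would need additional handling.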
 +Required Knowledge:
 +  - Required: Very good knowledge of C, skill to READ AND UNDERSTAND source code in the Linux kernel in independent work
 +  - Required: Very good knowledge of python
 +  - Required: Good understanding of git
 +  - Recommended: Some understanding of static analysis tools
 +  - Recommended: Some understanding of coccinelle
 +
 +==== Make Linux kernel community aware of tool findings ====
 +
 +The Linux kernel community has a number of tools to ensure the quality of the continuous kernel development. Among these tools are coccinelle, sparse, checkpatch.pl, the lock dependency validator, KASAN, syzkaller and many more.
 +
 +In the GSoC project, the student should find suitable ways to make the Linux developers aware of the tools' findings. There are various ways in which this could be implemented, e.g.:
 +
 +  - Setting up an infrastructure that runs those tools on patches provided on the mailing list and reports the findings back to the patch authors
 +  - Including the tool findings in the elixir development service
 +  - Providing means to tag and comment on tool findings in the distributed Linux kernel development
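For the first option, one small building block would be turning raw tool output into a message for the patch author. The Python sketch below parses text in the layout that checkpatch.pl prints by default (a `WARNING:`/`ERROR:`/`CHECK:` line followed by a `#N: FILE: path:line:` line); the exact layout is an assumption here and should be checked against the checkpatch.pl version in use. Running the tool and sending mail are left out, and `format_reply` is a made-up helper.

```python
import re

HEADLINE = re.compile(r"^(ERROR|WARNING|CHECK): (.*)$")
LOCATION = re.compile(r"^#\d+: FILE: (\S+?):(\d+):$")


def parse_report(text):
    """Extract findings (severity, message, file, line) from a report."""
    findings = []
    current = None
    for line in text.splitlines():
        head = HEADLINE.match(line)
        if head:
            current = {
                "severity": head.group(1),
                "message": head.group(2),
                "file": None,  # stays None for commit-message findings
                "line": None,
            }
            findings.append(current)
            continue
        loc = LOCATION.match(line)
        if loc and current is not None and current["file"] is None:
            current["file"] = loc.group(1)
            current["line"] = int(loc.group(2))
    return findings


def format_reply(author, findings):
    """Build a plain-text summary that could be sent to the patch author."""
    lines = [
        f"Hi {author},",
        "",
        f"automated checks reported {len(findings)} finding(s) on your patch:",
    ]
    for item in findings:
        if item["file"]:
            where = f"{item['file']}:{item['line']}"
        else:
            where = "commit message"
        lines.append(f"  {item['severity']}: {item['message']} ({where})")
    return "\n".join(lines)
```

Keeping the parsed findings as plain dictionaries makes it straightforward to add parsers for other tools and feed them into the same reporting step.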
 +
 +This project idea is quite wide and we expect the student to provide a more specific description of the task to tackle with some evidence that he/she will be able to implement the proposal.
 +
 +Required Knowledge:
 +  - Required: Very good knowledge of a suitable programming language, e.g., python
 +  - Required: Good understanding of git
 +  - Required: Good knowledge of C
 +  - Recommended: Basic understanding of how the kernel community works
 +  - Recommended: Basic understanding of the kernel tools
 +
 +==== Assess, Aggregate and Provide Detailed Statistics on Kernel Bug Fixes ====
 +
 +The goal of this project is to create and gather detailed information on bugs in the Linux kernel. To obtain this information, the student will have to manually assess a large number of commits from the Linux kernel stabilisation process.
 +
 +The assessment should allow us to answer these kinds of questions:
 +
 +  - Which types of problems are identified in the early stage of the stabilisation process?
 +  - Which types of problems are identified in the later stages of the stabilisation process?
 +  - How are problems identified, by review, by testing, by tools, by production use?
 +
 +This project will require creating an ontology of bugs; this involves reading general literature and considerable effort assessing and understanding the bug fix commits.
 +You will mainly not program new features, but you will learn a lot about the kernel code by looking at and assessing bug fixes and trying to derive general statements from these observations.
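Although the work is mostly manual, a small script can aggregate the assessments into the statistics asked for above. The Python sketch below assumes each assessed commit is recorded with three hypothetical fields: the release candidate in which the fix landed (`rc`), a category from the bug ontology (`bug_type`), and how the problem was found (`found_by`). The threshold separating early from late stages is likewise an arbitrary assumption.

```python
from collections import Counter


def summarize(records, early_until=3):
    """Aggregate manually assessed bug-fix commits.

    Returns (by_phase, found_by): by_phase maps "early"/"late" to a
    Counter of bug types, found_by counts the detection methods.
    Fixes up to and including -rc<early_until> count as "early".
    """
    by_phase = {"early": Counter(), "late": Counter()}
    found_by = Counter()
    for record in records:
        phase = "early" if record["rc"] <= early_until else "late"
        by_phase[phase][record["bug_type"]] += 1
        found_by[record["found_by"]] += 1
    return by_phase, found_by
```

Calling `by_phase["early"].most_common()` then gives the bug types seen in the early stage, ordered by frequency, and `found_by` answers how problems were identified.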
 +
 +Required Knowledge:
 +  - Required: Very good knowledge of C, skill to READ AND UNDERSTAND source code in the Linux kernel in independent work
 +  - Required: Very good understanding of software development processes and software development methods
 +  - Required: Good understanding of git
gsoc/2018-gsoc-safety-critical-linux.1519391142.txt.gz · Last modified: 2018/02/23 13:05 by lukas.bulwahn