User Tools

Site Tools


gsoc:2018-gsoc-lsb-projects

2018 GSoC LSB projects

Go back to the main GSoC Linux Foundation page

Mailing list: lsb-discuss at lists dot linux-foundation dot org

IRC: #lsb on irc.freenode.net

Workgroup Resources

Code License: GPL/BSD, specs: GNU FDL

Mentors: Alexey Khoroshilov (alexey dot khoroshilov at gmail dot com), Jeff Licquia (licquia at linuxfoundation dot org), Vadim Mutilin, Denis Silakov (dsilakov at gmail dot com).

Distributed scheduler development for Linux kernel modules static verification

New Klever framework used for static verification of kernel modules in Linux Driver Verification (LDV) program consists of three components: Web interface “Bridge”, verification tasks generator “Core” and orchestration system “scheduler”. The first one allows a user to configure and start a verification process. The second one generates an environment model and inserts asserts into the source code to check given modules against rule specifications, then it prepares verification tasks as a result. The third component schedule runs of Core and Verification tools to generate verification tasks using Core and solve them using available static verification tools.

Currently “Scheduler” allows employing a single workstation or a virtual machine to perform verification. In practice, verification requires many verification resources (days and even weeks of CPU time, dozens and sometimes hundreds of GB of RAM). So there is a need in the development of a distributed version of the Scheduler.

It is supposed to use already existing orchestration systems for this task such as Kubernetes. Moreover, Scheduler was initially designed for distributed execution, so there is either no necessity in rewriting scheduling algorithms or developing the component from scratch.

Desired knowledge: Python, Kubernetes
Mentors: Ilja Zakharov, Eugene Novikov (eunovm at gmail dot com)

Develop New LSB Desktop Test Suite

LSB Desktop tests [1,2] is a part of LSB certification tests. Existing tests do not cover all interfaces included into LSB and LSB specification continues to grow (http://linuxbase.org/navigator/browse/status.php). The goal of the project is to develop tests for newly included libraries and for interfaces that have no tests yet (http://linuxbase.org/navigator/coverage/cov2.php).

Gtk3/Gdk3 and friends are at the moment very high on our list of untested items. The test development ideally should be coordinated with upstream projects so the tests can be useful for them as well.

[1] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/desktop-test/files
[2] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/t2c-desktop-tests/files

Desired knowledge: C, Testing
Mentors: Alexey Khoroshilov, Roman Zybin

Analysis and fixing of race condition warnings in the Linux kernel

One of the most difficult to catch causes of bugs are race conditions — they may manifest itself only on rare schedules, and they are hard to fix — they usually require rethinking and careful selection of synchronization mechanism. Various methods exist for detecting race conditions. One of them is a static analysis, which allows to find errors in the module parts hardly reachable with testing.

The LDV Tools static verification framework performs the analysis of Linux kernel modules and detects both errors of incorrect usages of API between modules and kernel core and data race conditions – when two or more threads can access the same shared data simultaneously without proper synchronization. Race conditions are symptoms of bugs in the kernel, but not always bugs.

As the result LDV Tools reports a number of warnings for a kernel module. Not all warning are real bugs, because of inaccuracy of LDV Tools in both analysis of module itself and assumptions about an environment of the module. Depending on analyzed subsystem the true positives rate is from 15% to 40%.

For each data race warning the LDV Tools reports one or more error paths. An error path is a sequence of statements in the module source code including call to module callbacks and other functions, branches of if statements, the number of loop iterations. For a data race condition the tool reports two paths describing the execution sequences of statements for each thread participating in the race. The accesses to shared data and the synchronization primitives are highlighted, e.g. the set of acquired mutex_locks are shown.

For each warning reported by LDV Tools a student should determine whether it is a bug or false warning. For that purpose a student should analyze each path for the feasibility in the Linux kernel. Moreover, a student should make sure the paths representing the threads may work in parallel in the Linux kernel and the assumptions about module environment are correct.

For accomplishing the task it is required to use the whole gamut of available means. A student should analyze kernel source code and documentation to determine the possibility of parallel execution of threads. For more precise analysis a student should use debugging tools and monitor thread creation inside the kernel subsystem for which the module is developed. It is required to use special tools modeling erroneous conditions, like RaceHound for reproducing the conditions along the path of the warning.

If a warning is classified as a bug then a student should prepare a patch fixing the bug and send it to the developers of the module. At the worst case, if by some reason it is impossible to prepare a patch a student should prepare and send a bug report.

The false warnings should be classified. A student should determine if the warning is caused by inaccuracy of assumptions about the module environment, like the order of module callback invocations and the possibility to execute in parallel with each other, with interruptions, workqueue callbacks, etc. The warning may also be caused by inaccuracies in handling synchronization primitives, analysis of shared data, path conditions.

A student is expected to analyze and to classify about 500 warnings in a recent Linux kernel.

Desired knowledge: Linux kernel development, synchronization primitives, debugging tools for Linux kernel.
Mentors: Vadim Mutilin, Alexey Khoroshilov

Development of environment model specifications for static verification of Linux Kernel

The Linux Driver Verification (LDV) program aims to apply heavy-weight static verification tools to find bugs in Linux kernel modules. In contrast to widely used static analysis tools, the approach considers a module as a whole and performs thorough verification empowered by formal techniques.

The LDV framework generates an environment model as an additional C code for each provided kernel module on the base of its interface and runs a static verification tool which checks only that source code which is reachable with the generated environment model. Before the framework can create the environment model, an expert should prepare two kinds of specifications. The first one just describes declarations of an interface (types, references between them). The second specifies how environment model should triggers callbacks of a kernel module which implement the interface. Environment model specifications strongly influence verification results. Deficiencies in the model can lead to missed bugs as well as to false alarms.

The task assumes the following steps:

  • Study verification coverage reports generated by the framework. Choose several kernel interfaces which are not covered by the existing specification set.
  • Study existing environment model specifications and tests (simple kernel modules which implement the interface with stubs) examples. Develop own specifications for interfaces chosen during the previous step. Specifications of both referred kinds are quite short and easy to prepare. But before development, it is necessary to learn DSL languages and study kernel code to understand how the kernel invokes callbacks implemented by kernel modules, how it registers and deregisters them and how it initializes callback parameters.
  • Obtain and analyze new verification results. Prove then now more code is covered. Analyze new verification results and prepare patches in case of found errors.

Desired knowledge: C, understanding of Linux kernel including interrupt handling and deferred works execution.
Mentors: Ilja Zakharov, Alexey Khoroshilov

Improve environment model for memory safety verification of the Linux Kernel

The project is similar for the previous one, but the main goal of improvements is not to improve code coverage, but it is to make environment model more precise to avoid false positives in memory safety verification. For example, GSoC project of 2017 reported that 340 warning are false alarms because of environment model inaccuracies. The root causes of these false alarms have to be fixed.

A student is expected to write model and analyze changes in verification results for about 70 functions from recent Linux kernels. Models are C functions representing memory sensitive paths similar to the original code from static verification point of view but they are more simple. Those function models which help to improve verification will be included in LDV environment model. If a student detects a bug in the kernel then she should either prepare a patch fixing the bug and send it to the developers of the module or to send a bug report.

Desired knowledge: C, Linux kernel.
Mentors: Ilja Zakharov, Alexey Khoroshilov

Development of Rule Specifications for Correct Usage of Kernel Core API

Linux Driver Verification (LDV) program is devoted to developing an integrated framework for device drivers verification and adopting leading edge research results to enhance quality of verification tools. Currently LDV tools are used to check Linux device drivers against several safety rule specifications, each of them formally defines what is correct usage of a particular part of kernel core API. Another way of looking at the safety rules is that a rule represents one or more typical misuse of kernel core API.

By the moment, the framework contains around 30 rule specifications that has already helped to find and fix more than 300 issues in device drivers of vanilla Linux kernel.

This project (or several ones) is aimed at extending number of rules supported by the framework.

We have a collection of several dozens rules written in informal way, but other rules can be used as well. Formalization process usually includes the following steps:

  • identification of all elements of kernel core API relevant to the rule;
  • writing a rule specification that consists of
    • model state (one or more C variables);
    • transitions between states linked to appropriate events in device driver (usually calls of kernel core API);
    • a set of assertions (that represents misuse of kernel core API);
  • writing unit tests covering all the interfaces;
  • checking all drivers against the rule, analyzing results, reporting bugs and fixing the rule specification.

Rule specifications are written in a mix of C code and AspectJ-like notation that allows to merge the model code with original source code of device driver.

More than one application can be approved. Formalization of ~5 rules is expected from each student.

Desired knowledge: C, aspect-oriented programming, Linux kernel space API
Mentors: Alexey Khoroshilov, Vadim Mutilin, Eugeny Novikov

Extend and Update LSB Core Test Suite

LSB Core tests [1,2] is the main part of LSB certification tests. Originally they were developed for POSIX (IEEE Std 1003.1-1990) and LSB 3.1. LSB specification was expanded since that time (http://linuxbase.org/navigator/browse/status.php). The goal of the project is to fix open issues in the tests, i.e. to fix issues appearing on modern Linux distributions and to accomodate all relevant changes in text of specifications, as well as to expand coverage of the test suite to newly added LSB Core interfaces.

VSX-PCTS was implemented using TET/VSXgen test harness. The OLVER test suite was developed using model-based testing technology UniTESK. The technology follows well-known Design by Contract approach to represent formal specifications. Program contracts consist of preconditions and postconditions of interface operations and data type invariants. The main advantage of program contracts is that they enable automatic construction of test oracles that check the conformance of the target system behavior to the specifications.

Traceability of formal specifications to text of original LSB/POSIX specifications is supported using the following ideas:

  • each elementary requirement in original specification obtains a unique identifier;
  • each of the requirements is explicitly linked to a fragment of text in the original specification;
  • all checks in preconditions and postconditions are marked by identifiers of corresponding requirements.

[1] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/olver-core-tests/files
[2] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/runtime-test/files

Desired knowledge: C, SEC, model based testing
Mentors: Alexey Khoroshilov, Roman Zybin

Add support for C++11 to LSB

It is required to capture data of C11 ABI for [[http://ispras.linuxbase.org/index.php/LSB_Database_Home , C++ ABI, SQL databases, Perl
Mentors: Jeff Licquia, Denis Silakov, Alexey Khoroshilov

gsoc/2018-gsoc-lsb-projects.txt · Last modified: 2018/01/29 22:34 by khoroshilov