Table of Contents

2016 GSoC LSB projects

Go back to the main GSoC Linux Foundation page

Mailing list: lsb-discuss at lists dot linux-foundation dot org

IRC: #lsb on irc.freenode.net

Workgroup Resources

Code License: GPL/BSD, specs: GNU FDL

Mentors: Alexey Khoroshilov (alexey dot khoroshilov at gmail dot com), Jeff Licquia, Vladimir Rubanov, Denis Silakov (dsilakov at gmail dot com).

Package Dependencies Support in LSB Database and Navigator

LSB Navigator is a web portal that provides users with different information about LSB itself and Linux ecosystem in general. In particular, it displays data about packages, libraries, binary symbols and other entities available in different Linux distributions. All this information is stored in the LSB Database and collected by automated tools.

An important aspect of distribution structure which is not currently reflected in the database in package dependencies. For users and developers it would be useful to know what every package provides/requires and which packages should be installed in particular distribution to satisfy given dependencies. The reason for the lack of this feature is that LSB itself always tried to eliminate the need for package dependencies. However, in real life it is hard to completely get rid of them and services that displays infrormation about package dependencies are quite popular among developers.

However, existing services (such as http://rpm.pbone.net/http://pkgs.org/ or http://rpmfind.net/) are maintained by volunteers and provides information about only a limited set of distributions with preference for the community systems. The LSB Database contains data about wider range of enterprise distributions, so we believe that LSB Navigator with additional data will supplement existing services, not compete with them. And in any case the advantage of LSB Navigator is that it can provide additional information about package executable files, libraries, exported symbols, etc.

The aim of this project is to add possibility to collect data about package dependencies to the LSB data collection tools and implement support for these data in LSB Navigator (lists of package dependencies provides/requires, search among package dependencies and more intellectual features such as comparison of packages from different distributions). It is possible that the schema of the LSB Database will have to be improved, as well.

Desired knowledge: Perl, PHP, MySQL/Mariadb, basic knowledge of Linux packages (RPM and Deb formats).
Mentor: Denis Silakov

Extend and Update LSB Core Test Suite

LSB Core tests [1,2] is the main part of LSB certification tests. Originally they were developed for POSIX (IEEE Std 1003.1-1990) and LSB 3.1. LSB specification was expanded since that time (http://linuxbase.org/navigator/browse/status.php). The goal of the project is to fix open issues in the tests, i.e. to fix issues appearing on modern Linux distributions and to accomodate all relevant changes in text of specifications, as well as to expand coverage of the test suite to newly added LSB Core interfaces.

VSX-PCTS was implemented using TET/VSXgen test harness. The OLVER test suite was developed using model-based testing technology UniTESK. The technology follows well-known Design by Contract approach to represent formal specifications. Program contracts consist of preconditions and postconditions of interface operations and data type invariants.  The main advantage of program contracts is that they enable automatic construction of test oracles that check the conformance of the target system behavior to the specifications.

Traceability of formal specifications to text of original LSB/POSIX specifications is supported using the following ideas:

[1] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/olver-core-tests/files
[2] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/runtime-test/files

Desired knowledge: C, SEC, model based testing
Mentors: Alexey Khoroshilov, Roman Zybin

Develop New LSB Desktop Test Suite

LSB Desktop tests [1,2] is a part of LSB certification tests. Existing tests do not cover all interfaces included into LSB and LSB specification continues to grow (http://linuxbase.org/navigator/browse/status.php). The goal of the project is to develop tests for newly included libraries and for interfaces that have no tests yet (http://linuxbase.org/navigator/coverage/cov2.php).

Gtk3/Gdk3 and friends are at the moment very high on our list of untested items. The test development ideally should be coordinated with upstream projects so the tests can be useful for them as well.

[1] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/desktop-test/files
[2] http://bzr.linuxfoundation.org/loggerhead/lsb/devel/t2c-desktop-tests/files

Desired knowledge: C, Testing
Mentors: Alexey Khoroshilov, Roman Zybin

Development of Rule Specifications for Correct Usage of Kernel Core API

Linux Driver Verification (LDV) program is devoted to developing an integrated framework for device drivers verification and adopting leading edge research results to enhance quality of verification tools. Currently LDV tools are used to check Linux device drivers against several safety rule specifications, each of them formally defines what is correct usage of a particular part of kernel core API. Another way of looking at the safety rules is that a rule represents one or more typical misuse of kernel core API.

By the moment, the framework contains around 30 rule specifications that has already helped to find and fix more than 200 issues in device drivers of vanilla Linux kernel.

This project (or several ones) is aimed at extending number of rules supported by the framework.

We have a collection of several dozens rules written in informal way, but other rules can be used as well. Formalization process usually includes the following steps:

Rule specifications are written in a mix of C code and AspectJ-like notation that allows to merge the model code with original source code of device driver.

More than one application can be approved. Formalization of ~5 rules is expected from each student.

Desired knowledge: C, aspect-oriented programming, Linux kernel space API
Mentors: Alexey Khoroshilov, Vadim Mutilin, Eugeny Novikov

Environment model specifications

Linux Driver Verification (LDV) program is devoted to developing an integrated framework for device drivers verification and adopting leading edge research results to enhance quality of verification tools. The framework requires as input manually prepared rule specifications with formalized restrictions on correct Linux kernel API usage and environment model specifications that help to generate complete and precise environment models which are required to apply software model checkers to kernel modules.

A new version of LDV Tools that is under development (Klever) implements a new approach to generate environment models that allows to make them more accurate to reduce false positive rate and to reach more kernel code for its analysis. Environment model specifications consist of a module callbacks API description and restrictions on callbacks invocation. The former describes just API which cannot be extracted automatically and the latter contains generic patterns for a callbacks invocation and specific restrictions for the most complex and important ones. The specifications format is inspired by process algebras (like CCS or CSP), so acquaintance with them is preferable.

During the project a student should develop environment model specifications and tests to check their correctness to reduce the current false positive rate in at least two times for all modules from the latest Linux kernel versions verified against 10 core rule specifications. The task also includes works on code coverage improvement required to increase a number of functions which can be reached during analysis and thus reach currently unknown bugs. With the new set of specifications the number of confirmed bugs in a set of Linux kernel modules that are currently missed due to lack of good environment models should be decreased in at least two times. This also requires development environment model specifications. All new revealed bugs should be either fixed by the student or should be at least reported.

Desired knowledge: C, Python, Linux kernel space API, process algebra
Mentors: Alexey Khoroshilov, Vadim Mutilin, Eugeny Novikov

Analysis of race condition warnings in the Linux kernel

Race conditions are a kind of bugs that are hard to detect — they may manifest itself only on rare schedules, and they are hard to fix — they often require rethinking and careful selection of synchronization mechanism.

The LDV Tools static verification framework performs the analysis of Linux kernel modules and detects both errors of incorrect usages of API between modules and kernel core and data race conditions – when two or more threads can access the same shared data simultaneously without proper synchronization. Race conditions are symptoms of bugs in the kernel, but not always bugs.

As the result LDV Tools reports a number of warnings for a kernel module. Not all warning are bugs, because of inaccuracy of LDV Tools in both analysis of module itself and assumptions about an environment of the module. For example, for drivers/net/wireless subsystem of kernel version 3.16-rc1 the tool reports 493 data race warnings. For 23 of 117 analyzed warnings there is a possibility of data race, and 94 are false warnings.

For each warning the LDV Tools reports one or more error paths. An error path is a sequence of statements in the module source code including call to module callbacks and other functions, branches of if statements, the number of loop iterations. For a data race condition the tool reports two paths describing the execution sequences of statements for each thread participating in the race. The accesses to shared data and the synchronization primitives are highlighted, e.g. the set of acquired mutex_locks are shown.

Here are examples of warnings about potential race conditions:
http://linuxtesting.org/results/impl_reports?action=component_bug&name=ppob&status=Unreported

For each warning reported by LDV Tools a student should determine whether it is a bug or false warning. For that purpose a student should analyze each path for the feasibility in the Linux kernel. Moreover, a student should make sure the paths representing the threads may work in parallel in the Linux kernel and the assumptions about module environment are correct.

For accomplishing the task it is required to use the whole gamut of available means. A student should analyze kernel source code and documentation to determine the possibility of parallel execution of threads. For more precise analysis the student could use debugging tools and monitor thread creation inside the kernel subsystem for which the module is developed. Sometimes it is useful to use special tools modeling erroneous conditions, like RaceHound for reproducing the conditions along the path of the warning.

If warning is classified as bug then the student should either prepare a patch fixing the bug and send it to the developers of the module or at least send a bug report.

The false warnings should be classified. The student should determine if the warning is caused by inaccuracy of assumptions about the module environment, like the order of module callback invocations and the possibility to execute in parallel with each other, with interruptions, workqueue callbacks, etc. The warning may also be caused by inaccuracies in handling synchronization primitives, analysis of shared data, path conditions.

The student is expected to analyze about 300-500 warnings in the Linux kernel version 4.5 or higher, to classify warnings to real bugs/false warnings, to develop patches or bug reports and to describe reasons of false warnings.

Desired knowledge: Linux kernel development, synchronization primitives, debugging tools for Linux kernel, dynamic analysis tools, static analysis tools
Mentors: Alexey Khoroshilov, Vadim Mutilin