At the LSB Plenary 2014, the LSB workgroup met and talked about some of the problems the LSB has been having lately. The problem the LSB is intended to solve is still a problem, but it's moved a bit--as cloud, mobile, and the Web have taken priority over local app development, the problems with cross-distro development have moved along with it. Additionally, our rather heavy infrastructure, and the learning curve associated with it, have held us back, turning away potential contributors and making our products less useful to people who want to use them outside the narrow use cases we've been targeting.
Those present at the 2014 Collab Summit agreed that it's time for the LSB to become a bit more nimble and contributor-friendly.
Deficiencies in the current approach
We've identified a number of deficiencies in our current infrastructure and workflow.
Our current software and infrastructure have some amazing capabilities, but they also have a correspondingly steep (perhaps: amazingly high) learning curve associated with them. As they are purpose-designed, they are fairly inflexible. Historically, we have used very limited commit permissions, as the tools needed to remain usable for certification purposes on short notice. Some of the problems we've identified:
- The LSB database is useful, but difficult to change. It is essentially a binary blob modifiable only by a select few experts. The scripts and infrastructure we've built to help lower that learning curve are themselves difficult to use and buggy. Submissions made during the past several years are not always adequately robust, either. Our current efforts at getting GTK+ 3 in shape for LSB 5.0 serve as a great example of these characteristics.
- We rely on an LF-provided version control infrastructure, using a tool that is no longer actively developed (BZR), and with a workflow that isn't generally familiar to today's developers. To some degree, this is easy to partially address with a simple data conversion. But part of that change involves embracing some good features of the workflow established by sites like GitHub: easy forking and pull requests. We've assumed all along that developers can set up their own hosting for their version control changes or figure out how to submit changes via email; the GitHub model turns out to have lower barriers to entry for this. We will have to provide clear pointers and/or fairly detailed documentation for developers outside of typical Linux environments who use other code control systems, due to differences in philosophy.
- Historically, tests have been difficult to write or even contribute in the LSB system. Using TET for integration has resulted in a complex API and build system that's difficult to work with. We've made some efforts at simplifying this, which have largely not succeeded, as our experiment with the T2C ALSA test suite demonstrated. In the meantime, our bug tracker is full of simple one-off tests that would make excellent material for tests, except for the work needed to port and maintain these in our framework.
In short: the LSB is not appealing or widely touted as a useful project. The value chain between Linux distributors, commercial and other software providers, and customers has not been sold across the industry. The advantage of a binary standard over the previous generation of API standards has not been exploited. This has pretty much limited our developer resources to the "old guard".
- Traditional standards work isn't "sexy". This is just a fact of life to some degree, especially if there is no substantial controversy or incompatibility to be solved. The fact that we seem to work "in the shadows" of upstream bug trackers and impenetrable tests makes this problem more pronounced.
- The learning curve scares off those people who do decide this is important work, and it prevents volunteers with limited time from being effective in helping to make progress.
- Being bound into the infrastructure also makes it more difficult for outside groups to take advantage of what we do and to adapt it to their QA environments, for example.
- The industry focus on software at a higher level up the 'foodchain' than the OS (e.g., OpenStack contributors) results in more of the LF excitement and focus coming from those groups. With decreased focus on the Linux OS itself and LF decreasing expenditures on the LSB, there is less LF-funded effort. The balance between the basic 'fixed costs' of running the existing setup and the 'variable costs' of trying to produce timely updates and new releases just cannot be met by the resources presently allotted for the LSB (essentially: less than one FTE).
We have problems actually releasing stuff according to agreed 'feature based' release criteria for LSB 5 (compare to: time-based). A new major release probably HAS to have compelling features, or it is a vain act, so we really could not pursue time-based this round. This is due to the combination of our heavy and difficult infrastructure and our lack of developer resources. There are too few of us still working on the next release. We spend too much time fighting with our current toolset.
To solve these problems, we are laying the foundations for a "new LSB" project, which should absorb at least the current LSB resources in time while starting to solve the above problems.
Future work in the LSB will focus on solving practical problems current Linux distributors, software providers and customers are having with using and writing cross-distro Linux software. These won't need to necessarily be focused on our current areas of emphasis, such as symbol coverage in glibc or desktop environments, although these problems won't be expressly excluded. Rather, we want to address problems across the board of Linux development and deployment: cross-distro cloud deployment issues, Web framework problems, virtualization issues, etc.
For example, one of the early problems proposed to be solved by the LSB involves user identification and group identification harmonization between the distributions. This is perceived by some to be a problem for horizontal cloud-based deployments, as many of the common ways to share data across the cloud require harmonization between user identities. This was an issue we considered and rejected for traditional Linux development, as it was seen as more of a sysadmin / implementation problem; now, however, it affects the development of software that is explicitly designed to be distributed across multiple ephemeral systems, which is becoming an accepted model for new development and deployment on Linux today.
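To make the harmonization problem concrete, here is a minimal sketch in Python that compares two passwd(5)-format listings and reports the accounts whose numeric UIDs differ. The account names and UID values are invented for illustration and do not describe any real distribution. The helper names (`parse_passwd`, `uid_mismatches`) are likewise hypothetical and are not part of any LSB tool.

```python
from typing import Dict, List, Tuple


def parse_passwd(text: str) -> Dict[str, int]:
    """Map account name -> numeric UID from passwd(5)-format text."""
    uids: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd(5) layout: name:password:UID:GID:GECOS:directory:shell
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip malformed entries
        uids[fields[0]] = int(fields[2])
    return uids


def uid_mismatches(a: Dict[str, int], b: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """Return (name, uid_in_a, uid_in_b) for accounts present in both but with different UIDs."""
    shared = sorted(set(a) & set(b))
    return [(name, a[name], b[name]) for name in shared if a[name] != b[name]]


# Hypothetical sample data: two systems that agree on root but not on service accounts.
SYSTEM_A = """\
root:x:0:0:root:/root:/bin/sh
appsvc:x:501:501:App service:/srv/app:/bin/false
dbsvc:x:502:502:DB service:/srv/db:/bin/false
"""

SYSTEM_B = """\
root:x:0:0:root:/root:/bin/sh
appsvc:x:998:998:App service:/srv/app:/bin/false
dbsvc:x:997:997:DB service:/srv/db:/bin/false
"""

if __name__ == "__main__":
    for name, uid_a, uid_b in uid_mismatches(parse_passwd(SYSTEM_A), parse_passwd(SYSTEM_B)):
        print(f"{name}: UID {uid_a} on system A, UID {uid_b} on system B")
```

Files written by `appsvc` on one system would appear to belong to a different numeric owner on the other, which is the kind of mismatch that matters once data is shared between ephemeral systems.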
Specification vs. Distributions
One thing that's been clear: the abstract 'trailing standard' documentation of specifications has had its use, but the emphasis on it is too slow and heavyweight for the current pace of the Linux industry. By contrast, documentation of distribution differences (such as what we do in the Application Checker) is perceived by many to have more value in many places where we have emphasized the spec.
So, for the new LSB, we will not neglect our task of documenting the differences that exist. This will be important both for software developers and for us; we can't fix distro incompatibilities if we don't know what and where they are.
Part of this work may be to refactor the database to focus more on the actual content of the distributions, which may point the way to making it easier to draft specifications automatically based on things that are commonly shared by all distributions.
Test APIs, or the lack thereof
It's been observed that nearly everything about a test API is easily modeled in the traditional Unix API:
- Tests return either success or some failure code, which maps well to familiar Unix process return codes (zero or non-zero).
- Logging typically goes to various facilities (stdout, log files) and at various levels: informational, warning, or error notice (the more severe levels are also sometimes sent to stderr).
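The two points above can be sketched as follows. This is a minimal illustration in Python, not the LSB's actual tooling. It runs a test as an ordinary process, treats exit status zero as a pass and anything else as a failure, and captures stdout and stderr as the log. The `Outcome` type, the `run_test` helper, and the two inline test scripts are all hypothetical.

```python
import subprocess
import sys
from typing import List, NamedTuple


class Outcome(NamedTuple):
    passed: bool      # True when the process exited with status 0
    returncode: int   # raw Unix exit status
    stdout: str       # informational log output
    stderr: str       # warning/error log output


def run_test(argv: List[str], timeout: float = 30.0) -> Outcome:
    """Run one test as a child process and map its exit status to pass/fail."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return Outcome(proc.returncode == 0, proc.returncode, proc.stdout, proc.stderr)


# Hypothetical one-off tests, written the way a bug-report reproducer might be:
# print what is being checked, then exit zero on success or non-zero on failure.
PASSING_TEST = """
import sys
print("info: checking that sorted() orders integers")
sys.exit(0 if sorted([3, 1, 2]) == [1, 2, 3] else 1)
"""

FAILING_TEST = """
import sys
print("error: expected value not found", file=sys.stderr)
sys.exit(2)
"""

if __name__ == "__main__":
    for label, source in (("passing", PASSING_TEST), ("failing", FAILING_TEST)):
        outcome = run_test([sys.executable, "-c", source])
        print(label, "PASS" if outcome.passed else f"FAIL (exit {outcome.returncode})")
```

Because the only contract is the exit status and the standard streams, the tests themselves could be written in C, shell, or any other language; the Python child scripts here are only a convenience for a self-contained example.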
As mentioned, our bug tracker and those of others are filled with test code either written for, or readily adapted to, this simple test API. This can serve as the foundation for building up more complete regression test suites. At present, though, our toolset requires a significant configuration effort to make use of these tests, which is a lot of work and can reduce their accuracy. (This isn't just our problem, either; every general testing effort has this problem.)
So why have more complicated test APIs? Mostly, it's about the helper APIs on top of this base functionality (like the xUnit libraries that exist in pretty much every language) and the test tools and frameworks that manage test result data and make that data useful to developers. In addition, the previous explicit focus was to help distributors attain the goal of completing a certification assessment in a single action.
Thus, the top-heavy TET-based workflow we've used up to now will be considered legacy. New work will focus on a very streamlined test API, that looks like the standard C/Unix API, with optional helper libraries that can mimic many of the features of the sophisticated test suites. On top of this, we will develop frameworks that will be focused on making working with our tests easy: both on the test development side, and on the framework side. This will allow more effective selection of subsets of tests that are useful for more purposes than just a full certification, allowing integrating subsets of our tests with other people's workflows.
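As a rough sketch of what selecting subsets of tests could look like under such a streamlined API, the Python example below tags each test, runs only those matching the requested tags, and folds the individual exit statuses into one overall status. The registry layout, tag names, test names, and the `select` and `run_subset` helpers are assumptions made for illustration; the new framework's real design is not described in this document.

```python
import subprocess
import sys
from typing import Dict, List, Set, Tuple

# Hypothetical registry: test name -> (tags, command line to execute).
# The commands are trivial stand-ins that simply exit with a fixed status.
Registry = Dict[str, Tuple[Set[str], List[str]]]

REGISTRY: Registry = {
    "libc-smoke": ({"core", "certification"}, [sys.executable, "-c", "import sys; sys.exit(0)"]),
    "desktop-smoke": ({"desktop", "certification"}, [sys.executable, "-c", "import sys; sys.exit(0)"]),
    "known-bad": ({"core"}, [sys.executable, "-c", "import sys; sys.exit(1)"]),
}


def select(registry: Registry, wanted: Set[str]) -> List[str]:
    """Return the names of tests carrying at least one of the wanted tags."""
    return sorted(name for name, (tags, _) in registry.items() if tags & wanted)


def run_subset(registry: Registry, wanted: Set[str]) -> Tuple[int, Dict[str, bool]]:
    """Run the selected tests; overall status is 0 only if every one exited 0."""
    results: Dict[str, bool] = {}
    for name in select(registry, wanted):
        _, argv = registry[name]
        proc = subprocess.run(argv, capture_output=True, text=True)
        results[name] = proc.returncode == 0
    overall = 0 if all(results.values()) else 1
    return overall, results


if __name__ == "__main__":
    status, results = run_subset(REGISTRY, {"core"})
    for name, passed in results.items():
        print(f"{name}: {'PASS' if passed else 'FAIL'}")
    sys.exit(status)
```

A full certification run would then be one particular tag selection rather than the only supported mode. Note that in this sketch an empty selection reports success, which a real framework might prefer to treat as an error.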
Instead of living in a read-only self-hosted repository with limited commit access, we've initially set up a new organization at GitHub here:
The main project driving future development is the "lsb" project, which contains our current "works in progress". The emphasis will be on a light process, easy collaboration, and flexibility. Documentation and perhaps process wrappers that allow the “uninitiated” to participate fully will need to be provided so that suggestions for work efforts can be made by all of the members of the value chain – distributors, software providers and customers.
The current LSB work in 5.0 will be finished and released. It will be maintained with regular bugfix updates for the foreseeable future. Linux distributors should certify to LSB 5.0 to provide that assurance to their partners and customers; likewise, ISVs may deploy against its promises of behaviour and interfaces. Going forward, the new LSB will most likely refactor the current LSB to fit the new project structure and workflow.
Much of the current LSB has value:
- The SDK and Application Checker are seen as good things; indeed, they're almost seen as wizardry in some cases.
- The current distribution tests continue to have value in exposing bugs in current Linux distros.
- The database, heavy as it is, remains a valuable resource for documenting the differences between distros and potential pain points for portability.
- The standards themselves are still referenced and considered authoritative.
Future specifications from the LSB will likely look much different from how they look today, and the processes by which distributions and ISVs use the LSB will change quite a bit as well. At this point (April 2014 as of this writing), we don't know what that will look like.