
Ideas for Google Summer of Code projects

Idea #1: Support of Standard CHAOSS Formats for Description of Projects

[ Micro-tasks and place for questions ]

Currently, GrimoireLab uses its own format for describing a project, including the data sources (repositories to retrieve information from), the internal organization of the project (e.g., in subprojects), and specifics about how the data is to be presented. Standard formats for this information already exist and can be used directly or with some modifications. DOAP is among the most interesting of them, but there are many others.

This idea is about identifying the formats that projects use to describe themselves and adding support for them to GrimoireLab. This includes not only static formats, but also APIs.

The aims of the project are as follows:

The aims may require modifications to Mordred and other related tools to make them modular and simplify the implementation of support for future formats or APIs.
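As an illustration of what supporting a format such as DOAP could involve, here is a minimal sketch that reads a DOAP (RDF/XML) description and extracts the data sources it lists. The sample document, the `doap_to_sources` function, and the output mapping (data source name to list of URLs) are assumptions made for this example; they are not GrimoireLab's or Mordred's actual format or API.

```python
import xml.etree.ElementTree as ET

# Namespaces used by DOAP documents serialized as RDF/XML.
NS = {
    "doap": "http://usefulinc.com/ns/doap#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical DOAP description, invented for this example.
SAMPLE_DOAP = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#">
  <Project>
    <name>Example Project</name>
    <repository>
      <GitRepository>
        <location rdf:resource="https://example.org/example.git"/>
      </GitRepository>
    </repository>
    <bug-database rdf:resource="https://example.org/issues"/>
  </Project>
</rdf:RDF>
"""


def doap_to_sources(doap_xml):
    """Map each project name to its data sources (illustrative format)."""
    root = ET.fromstring(doap_xml.strip())
    projects = {}
    for project in root.iter("{%s}Project" % NS["doap"]):
        name = project.findtext("doap:name", default="unknown", namespaces=NS)
        sources = {}
        # Git repositories declared under doap:repository.
        git_path = "doap:repository/doap:GitRepository/doap:location"
        git_urls = [loc.get(RDF_RESOURCE) for loc in project.iterfind(git_path, NS)]
        if git_urls:
            sources["git"] = git_urls
        # Issue trackers declared as doap:bug-database.
        issue_urls = [
            db.get(RDF_RESOURCE) for db in project.iterfind("doap:bug-database", NS)
        ]
        if issue_urls:
            sources["issues"] = issue_urls
        projects[name] = sources
    return projects


if __name__ == "__main__":
    print(doap_to_sources(SAMPLE_DOAP))
```

A modular design would likely place one such reader behind a common interface for each supported format or API, so that the rest of the toolchain only sees the resulting mapping.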

Idea #2: Reporting of CHAOSS Metrics

[ Micro-tasks and place for questions ]

Currently, GrimoireLab includes a tool for reporting: Manuscripts. This tool reads data from a GrimoireLab Elasticsearch database and uses it to produce a PDF report with relevant metrics for a set of analyzed projects. Internally, Manuscripts uses Python code to produce charts and CSV tables, which are integrated into a LaTeX document to produce the final PDF. Other approaches, such as producing Jupyter notebooks, will be explored too.

This idea is about adding support to Manuscripts for producing reports based on the work of the CHAOSS Community. Since Manuscripts is still a moving target, this will also be a chance to participate in the general development of the tool itself and to turn it into a generic reporting system for GrimoireLab data.

The aims of the project are as follows:

Other aims, such as producing Jupyter notebooks as a final result or as an intermediate step, are completely within scope.
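The reporting flow described above (data in, CSV tables out, tables embedded in a LaTeX document) can be sketched roughly as follows. This is not Manuscripts' actual code: the records are hard-coded instead of being read from Elasticsearch, and the function names and the "commits per month" metric are assumptions chosen for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical records standing in for items read from the database.
COMMITS = [
    {"project": "alpha", "date": "2018-01-15"},
    {"project": "alpha", "date": "2018-01-20"},
    {"project": "beta", "date": "2018-01-03"},
    {"project": "alpha", "date": "2018-02-02"},
]


def commits_per_month(records):
    """Count commits per (project, YYYY-MM) and return sorted rows."""
    counts = Counter((r["project"], r["date"][:7]) for r in records)
    return sorted((project, month, n) for (project, month), n in counts.items())


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["project", "month", "commits"])
    writer.writerows(rows)
    return buffer.getvalue()


def to_latex_table(rows):
    """Render the rows as a LaTeX tabular (no escaping of special characters)."""
    lines = [r"\begin{tabular}{llr}", r"Project & Month & Commits \\", r"\hline"]
    lines += [f"{project} & {month} & {n} \\\\" for project, month, n in rows]
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = commits_per_month(COMMITS)
    print(to_csv(rows))
    print(to_latex_table(rows))
```

Keeping the metric computation separate from the output functions is one way to let the same numbers feed a PDF report, a CSV file, or a Jupyter notebook.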

Idea #3: Prototype New CHAOSS Metrics

[ Micro-tasks and place for questions ]

Create a library that CHAOSS Community Software projects like GHData can use to express project-level similarities between open source software projects. There are two components: a set of algorithms for integrating similarity measures over an array of project data, and an implementation of visualizations that uses our existing framework and possibly extends it.

The aims of the project are as follows:

  1. Build new metrics in Python/Flask/MetricsJS for the open source project GHData. This will create familiarity with the different metrics currently defined by the CHAOSS project, and will introduce the following user interaction design goals:
    1. Enabling comparisons between GitHub, Mozilla, and other open source project repositories and projects as a default design mechanism.
    2. Considering the different ways of building software to do temporal comparisons.
  2. Build machine learning algorithms that identify candidate “toxic interactions” in open source mailing lists and IRC channels, with the aim of making open source a more welcoming environment for diverse populations.
  3. Design and evaluate exploratory mechanisms for presenting project data, metrics, and analysis using a complex, hierarchical, and networked set of data structures. For example, there are two main ways a “commit” is defined in open source software: a) the explicit, individual “commit” record and b) “unique commits”. For each of these metrics, which can reasonably be calculated from source repositories, CHAOSS project stakeholders are interested in understanding them:
    1. By project
    2. Project organization
    3. Foundation
    4. Dependencies (including integration with libraries.io and other data sets)
    5. Individual
    6. Corporate organization
    7. Roles in a project (including people evolving from the periphery to the core).
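To make the two commit definitions in aim 3 concrete, the following sketch counts both per grouping key. It assumes that "unique commits" means distinct commit hashes, so that a commit present in several repositories (for example, in a fork) is counted once; the source does not define the term, so treat this as one possible reading. The records, field names, and `commit_counts` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical commit records; hash "a1" appears in a repository and its fork.
COMMIT_RECORDS = [
    {"hash": "a1", "repo": "org/core", "project": "core", "author": "ana", "org": "Acme"},
    {"hash": "a1", "repo": "fork/core", "project": "core", "author": "ana", "org": "Acme"},
    {"hash": "b2", "repo": "org/core", "project": "core", "author": "bo", "org": "Initech"},
    {"hash": "c3", "repo": "org/docs", "project": "docs", "author": "ana", "org": "Acme"},
]


def commit_counts(records, group_by):
    """Count commit records and distinct hashes for each value of `group_by`."""
    totals = defaultdict(int)
    hashes = defaultdict(set)
    for record in records:
        key = record[group_by]
        totals[key] += 1                  # every individual commit record
        hashes[key].add(record["hash"])   # assumed meaning of "unique commits"
    return {
        key: {"commit_records": totals[key], "unique_commits": len(hashes[key])}
        for key in totals
    }


if __name__ == "__main__":
    for dimension in ("project", "author", "org"):
        print(dimension, commit_counts(COMMIT_RECORDS, dimension))
```

The same function serves several of the dimensions listed above (project, individual, corporate organization) by changing only the `group_by` field. Dimensions such as dependencies or roles would need additional data that this sketch does not model.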

Each of these is a significant opportunity for a Google Summer of Code participant to engage, learn, and become part of a project.