Measuring and Assessing Open Source Project Impact and Community Health

a CZI EOSS Community Call Resource

Authors

Yani Bellini Saibene

Beth Duckles

Jonah Duckles

Kate Hertweck

Daniel S. Katz

Emily Lescak

Daniel McCloy

Reshama Shaikh

Dan Sholler

Adam Tyson

Published

October 29, 2024

Background

Open source software (OSS) projects are increasingly interested in tracking the impact of their packages and the health of their communities. The motivation to measure these attributes comes from both internal needs and external demands: Internally, project leaders use these measurements to set development roadmaps, make strategic decisions about resource allocation, and ensure that they are providing a welcoming, productive environment for contributors and users. Externally, projects use impact and health measurements to raise funds, report to funders, market their tools, and support career development for people who have contributed.

OSS Project Impact

Projects define and measure impact in a number of ways. In scientific OSS, impact is often defined as the intensity of usage in a given discipline: How much of the field and its work relies on scientists using the software? How critical is the software to advancing science in the domain? What other OSS projects depend on the software? What kinds of breakthroughs does the software enable that other options do not?

Projects tend to measure impact in a number of ways, most of which do not adequately answer the questions proposed above. The following activities instead act as rough proxies for assessing impact, often because they involve metrics that are easy to extract from open source tools:

Recording package download counts
Tracking citations in the scientific literature
Monitoring downstream uses of the software (e.g., dependencies)
Counting GitHub stars
Quantifying engagement at in-person and online events
Assessing interest from institutions, science and computing facilities, and industry partners in supporting software on their systems
Measuring changes in the number of contributors or contributions
Calculating past funding support and projecting future support needs

The above metrics can be useful as a starting point in assessing interest in and usage of software. But as pointed out in a 2015 blog post by Daniel S. Katz, the most difficult metrics to track often offer the most value when compared to easier-to-track metrics. Download counts, for example, are easy to record, but those counts tell us little about how the software is being used to advance science or any other goal. Tracking citations (and chains of citations) in scientific papers can be labor-intensive (particularly when examining what the authors used the software for) yet the activity paints a much clearer picture of actual usage and impact.

In the ideal scenario, then, projects would be able to track usage at a more detailed level and address critical questions, such as those in the table below. For each question, we have proposed possible metrics (or information-gathering activities) that might help provide a detailed view of impact. More work is certainly needed to identify other existing ways to answer these questions and to build new approaches to impact assessment.

Impact Question	Possible Metrics
What fraction of the field relies on the software to carry out its work?	Citations to the software in published papers Citations of papers citing the software
What breakthroughs has the software enabled?	Literature reviews of papers citing the software Case studies of software-enabled breakthroughs
What value have users, contributors, maintainers, and other stakeholders received from the software (e.g., better work, advancing skills and careers, or finding community and support)?	Qualitative interviews with the software community Surveys and skills assessments Soliciting testimonials Tracking career progression Social network analyses
Do other projects depend on this software?	Dependency graphs

OSS Project Community Health

Community health is somewhat more abstract and subjective than impact, making it difficult to define and measure. If we were to strive for a general definition of community health, it would likely include the degree of psychological safety¹ in a software community, the level of diversity in community members’ backgrounds and characteristics, the inclusiveness of the community, activity levels (e.g., active raising and closing of issues, traffic on communication forums, attendance at events), and the formalization of pathways in the community (e.g., contributor and developer on-ramps, leadership opportunities). Although these attributes are perhaps less quantifiable than impact, projects often take the following steps to measure community health:

¹ Psychological safety as we’re using it here means broadly that people feel they can voice their opinions, ideas, and concerns without negative consequences or repercussions. Psychologically safe environments tend to promote open communication, collaboration, learning and innovation and can support constructive disagreement, feedback culture and taking risks for growth and improvement.

Assessing the diversity of the community via surveys
Tracking the number of users, contributors, and developers/maintainers over time
Gauging churn and turnover
Quantifying the number of issues raised and closed
Counting attendees at meetings and surveying attendees about their experience
Conducting interviews with community members about their experiences

The above description and measurements of community health apply broadly to open source projects (and likely any community), but research/scientific software projects may have contextual differences that require alternative or additional description and measurement. We could, for example, assess:

The degree to which scientists in a given field benefit similarly from the software (i.e., is the community excluding some members of the discipline)?
The effects of “competitors” and the project’s relationship to them, including:
- Proprietary/commercial software vendors
- Other open source projects
The degree of support the project receives from the research/science community, including:
- Financial support
- Development and maintenance
Use of the software in scientific education

The Relationship between Software Impact and Community Health

Impact and health overlap in both definition and measurement. Activity levels such as contributions or engagement on communication forums, for example, can signal that a project has both significant uptake and a welcoming atmosphere (especially when there is a steady increase in new contributors and/or a low rate of churn). It is therefore useful to map impact and health onto one another and consider the characteristics of projects that have either Broad or Narrow Impact and either Low Community Health or High Community Health.

Characteristics of Projects by Impact and Health

	Low Community Health	High Community Health
Broad Impact	This project is used broadly, but the community is troubled. Characteristics: Critical in one or more ecosystems Difficult to maintain due to low community engagement Few newcomers	This project is the “ideal” - it is both critical to software ecosystems and has an active, engaged, welcoming community. Characteristics: Critical in one or more ecosystems Welcoming environment Formalized on-boarding processes Contains supportive materials such as contributor guides, Codes of Conduct
Narrow Impact	This project is targeted toward a small, niche community or is a “pet project.” Leaders and developer(s) are not necessarily responsive to a wider community. Characteristics: Used for niche scientific work Decisions are made by one or a small group of people with few formalized decision-making processes Newcomers are not recruited or actively welcomed	This project has a strong community, but the software has limited impact. Community members remain engaged due to a sense of belonging, networking, and skill-building. Characteristics: Software development is secondary to community-building efforts Issues, forums, and communication channels discuss community needs more than software development needs May have issues with sustainability

The Relationship between Impact, Health, and Sustainability

Software projects of the above quadrants can be sustained. For example, low-health broad-impact projects may survive many years on the strength of one or a few dedicated maintainers, as long as maintenance burden is kept low (e.g., by rejecting most feature enhancement requests, providing limited user support, etc). However, sustainability as a forward-looking property is a forecast, where a number of factors such as elements of impact and community health can be combined into a probabilistic measure of sustainability. Another way of looking at this is that software sustainability is a measure of the software project’s ability to survive challenge events that each have some chance of happening, such as the main developer leaving the project, a funding grant ending, etc. (see Towards Defining Lifecycles and Categories of Research Software), and that there is some set of likelihoods of the project surviving each such event.

General Questions for Discussion

How do you move from low community health to high community health?
How do you move from narrow impact to broad impact?
- How can we enable projects to preserve the “small-and-mighty,” innovative spirit while maximizing impact?
  - Small science teams drive innovations that enable big science organizations to have impact. Relevant reference: Ed Yong’s “Small Teams of Scientists Have Fresher Ideas”
  - In other words, a project could have a narrow impact as its stated vision (at first or in the long-term), and that can be a very good thing.

Questions for Discussion in Your Project’s Community

What is our desired impact?
- How has it changed?
- How can we continually reevaluate our desired impact?
How are we currently tracking impact and health, if at all?
What is our process for interpreting quantitative data, especially when data collection is automated?
- How can we supplement our quantified measures with qualitative data?
- Who needs to be at the table when assessing these data?

Resources on Metrics

CHAOSS - https://chaoss.community
Code for Science and Society’s Tracking Impact and Measuring Success in Data Education Events
PLoS Computational Biology’s Ten Simple Rules for Measuring the Impact of Workshops
CSIRO’s organization-wide approach to impact assessment