Measuring and Assessing Open Source Project Impact and Community Health
a CZI EOSS Community Call Resource
Background
Open source software (OSS) projects are increasingly interested in tracking the impact of their packages and the health of their communities. The motivation to measure these attributes comes from both internal needs and external demands: Internally, project leaders use these measurements to set development roadmaps, make strategic decisions about resource allocation, and ensure that they are providing a welcoming, productive environment for contributors and users. Externally, projects use impact and health measurements to raise funds, report to funders, market their tools, and support career development for people who have contributed.
OSS Project Impact
Projects define and measure impact in a number of ways. In scientific OSS, impact is often defined as the intensity of usage in a given discipline: How much of the field and its work relies on scientists using the software? How critical is the software to advancing science in the domain? What other OSS projects depend on the software? What kinds of breakthroughs does the software enable that other options do not?
In practice, most of the ways projects measure impact do not adequately answer the questions posed above. Instead, the following activities act as rough proxies for impact, often because they rely on metrics that are easy to extract from open source tools:
- Recording package download counts
- Tracking citations in the scientific literature
- Monitoring downstream uses of the software (e.g., dependencies)
- Counting GitHub stars
- Quantifying engagement at in-person and online events
- Assessing interest from institutions, science and computing facilities, and industry partners in supporting software on their systems
- Measuring changes in the number of contributors or contributions
- Calculating past funding support and projecting future support needs
The above metrics can be a useful starting point for assessing interest in and usage of software. But as Daniel S. Katz pointed out in a 2015 blog post, the metrics that are hardest to track often offer the most value. Download counts, for example, are easy to record, but they tell us little about how the software is being used to advance science or any other goal. Tracking citations (and chains of citations) in scientific papers can be labor-intensive, particularly when examining what the authors used the software for, yet it paints a much clearer picture of actual usage and impact.
In the ideal scenario, then, projects would be able to track usage at a more detailed level and address critical questions, such as those in the table below. For each question, we have proposed possible metrics (or information-gathering activities) that might help provide a detailed view of impact. More work is certainly needed to identify other existing ways to answer these questions and to build new approaches to impact assessment.
Impact Question | Possible Metrics |
---|---|
What fraction of the field relies on the software to carry out its work? | Citations to the software in published papers; citations of papers citing the software |
What breakthroughs has the software enabled? | Literature reviews of papers citing the software; case studies of software-enabled breakthroughs |
What value have users, contributors, maintainers, and other stakeholders received from the software (e.g., better work, advancing skills and careers, or finding community and support)? | Qualitative interviews with the software community; surveys and skills assessments; soliciting testimonials; tracking career progression; social network analyses |
Do other projects depend on this software? | Dependency graphs |
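As a rough illustration of the dependency-graph metric in the last row, the sketch below counts the packages that depend on a given package, either directly or through other packages. The package names and the `DEPENDS_ON` mapping are hypothetical; a real project would need to populate the mapping from whatever dependency data its ecosystem exposes.

```python
from collections import deque

# Hypothetical dependency data: each package maps to the packages it depends on.
DEPENDS_ON = {
    "core-lib": [],
    "plotting-tool": ["core-lib"],
    "analysis-suite": ["core-lib", "plotting-tool"],
    "lab-pipeline": ["analysis-suite"],
    "unrelated-app": [],
}


def downstream_dependents(package, depends_on):
    """Return every package that depends on `package`, directly or transitively."""
    # Invert the graph so we can walk from a package to the packages that use it.
    used_by = {name: set() for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(name)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)

    # Guard against circular dependencies listing the package as its own dependent.
    seen.discard(package)
    return seen


dependents = downstream_dependents("core-lib", DEPENDS_ON)
print(len(dependents), sorted(dependents))
```

A count like this says only that other projects rely on the software, not how critical that reliance is, so it is best read alongside the qualitative metrics in the table.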
OSS Project Community Health
Community health is somewhat more abstract and subjective than impact, making it difficult to define and measure. If we were to strive for a general definition of community health, it would likely include the degree of psychological safety [1] in a software community, the level of diversity in community members’ backgrounds and characteristics, the inclusiveness of the community, activity levels (e.g., active raising and closing of issues, traffic on communication forums, attendance at events), and the formalization of pathways in the community (e.g., contributor and developer on-ramps, leadership opportunities).
[1] As used here, psychological safety broadly means that people feel they can voice their opinions, ideas, and concerns without negative consequences or repercussions. Psychologically safe environments tend to promote open communication, collaboration, learning, and innovation, and can support constructive disagreement, a feedback culture, and taking risks for growth and improvement.
Although these attributes are perhaps less quantifiable than impact, projects often take the following steps to measure community health:
- Assessing the diversity of the community via surveys
- Tracking the number of users, contributors, and developers/maintainers over time
- Gauging churn and turnover
- Quantifying the number of issues raised and closed
- Counting attendees at meetings and surveying attendees about their experience
- Conducting interviews with community members about their experiences
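Two of the steps above, gauging churn and quantifying issues raised and closed, reduce to simple arithmetic once the raw data is in hand. The sketch below is one possible way to compute them, not a standard definition; the contributor names and issue counts are made up, and the function names `contributor_churn` and `issue_closure_ratio` are hypothetical.

```python
def contributor_churn(previous, current):
    """Compare the contributor IDs seen in two consecutive periods."""
    previous, current = set(previous), set(current)
    departed = previous - current
    newcomers = current - previous
    return {
        "retained": len(previous & current),
        "departed": len(departed),
        "newcomers": len(newcomers),
        # Share of the previous period's contributors who did not return.
        "churn_rate": len(departed) / len(previous) if previous else 0.0,
    }


def issue_closure_ratio(opened, closed):
    """Issues closed per issue opened in a period; None if none were opened."""
    return closed / opened if opened else None


# Made-up example data for two consecutive years.
last_year = {"ana", "bo", "chen", "dee"}
this_year = {"ana", "chen", "eli", "fay", "gus"}

print(contributor_churn(last_year, this_year))
print(issue_closure_ratio(opened=40, closed=30))
```

Numbers like these are most informative as trends over several periods, and they still need the surveys and interviews listed above to explain why contributors stay or leave.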
The above description and measurements of community health apply broadly to open source projects (and likely any community), but research/scientific software projects may have contextual differences that require alternative or additional description and measurement. We could, for example, assess:
- The degree to which scientists in a given field benefit similarly from the software (i.e., is the community excluding some members of the discipline?)
- The effects of “competitors” and the project’s relationship to them, including:
  - Proprietary/commercial software vendors
  - Other open source projects
- The degree of support the project receives from the research/science community, including:
  - Financial support
  - Development and maintenance
- Use of the software in scientific education
The Relationship between Software Impact and Community Health
Impact and health overlap in both definition and measurement. Activity levels such as contributions or engagement on communication forums, for example, can signal that a project has both significant uptake and a welcoming atmosphere (especially when there is a steady increase in new contributors and/or a low rate of churn). It is therefore useful to map impact and health onto one another and consider the characteristics of projects that have either Broad or Narrow Impact and either Low Community Health or High Community Health.
Characteristics of Projects by Impact and Health
| | Low Community Health | High Community Health |
|---|---|---|
| Broad Impact | This project is used broadly, but the community is troubled. Characteristics: critical in one or more ecosystems; difficult to maintain due to low community engagement; few newcomers | This project is the “ideal” - it is both critical to software ecosystems and has an active, engaged, welcoming community. Characteristics: critical in one or more ecosystems; welcoming environment; formalized on-boarding processes; contains supportive materials such as contributor guides and Codes of Conduct |
| Narrow Impact | This project is targeted toward a small, niche community or is a “pet project.” Leaders and developer(s) are not necessarily responsive to a wider community. Characteristics: used for niche scientific work; decisions are made by one or a small group of people with few formalized decision-making processes; newcomers are not recruited or actively welcomed | This project has a strong community, but the software has limited impact. Community members remain engaged due to a sense of belonging, networking, and skill-building. Characteristics: software development is secondary to community-building efforts; issues, forums, and communication channels discuss community needs more than software development needs; may have issues with sustainability |
The Relationship between Impact, Health, and Sustainability
Software projects in any of the above quadrants can be sustained. For example, low-health, broad-impact projects may survive for many years on the strength of one or a few dedicated maintainers, as long as the maintenance burden is kept low (e.g., by rejecting most feature enhancement requests or providing limited user support). However, sustainability is a forward-looking property and therefore a forecast: factors such as elements of impact and community health can be combined into a probabilistic measure of sustainability. Put another way, software sustainability measures a project’s ability to survive challenge events that each have some chance of happening, such as the main developer leaving the project or a funding grant ending (see Towards Defining Lifecycles and Categories of Research Software). For each such event, there is some likelihood that the project survives it.
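One way to make the idea of a probabilistic measure concrete is sketched below. It assumes, purely for illustration, that challenge events are independent of one another, that each event has a known chance of occurring within the forecast window, and that the project has a known chance of surviving each event if it occurs. Under those assumptions, the chance of surviving the window is the product over events of 1 - P(event) × (1 - P(survive | event)). The event list and all probabilities are invented, and this is not a method proposed by the cited work.

```python
from math import prod

# Invented example values:
# (event name, P(event occurs in the window), P(project survives | event occurs))
CHALLENGE_EVENTS = [
    ("main developer leaves", 0.2, 0.5),
    ("funding grant ends", 0.5, 0.8),
]


def survival_probability(events):
    """Chance of surviving every listed event, assuming the events are independent."""
    # For each event, the project fails only if the event occurs AND it does not
    # survive it, so the per-event survival chance is 1 - p_event * (1 - p_survive).
    return prod(1 - p_event * (1 - p_survive) for _, p_event, p_survive in events)


print(round(survival_probability(CHALLENGE_EVENTS), 2))
```

In practice the hard part is estimating the inputs, which is where elements of impact and community health come in: a project with many active maintainers, for instance, would plausibly have a higher chance of surviving the departure of any one developer.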
General Questions for Discussion
- How do you move from low community health to high community health?
- How do you move from narrow impact to broad impact?
- How can we enable projects to preserve the “small-and-mighty,” innovative spirit while maximizing impact?
  - Small science teams drive innovations that enable big science organizations to have impact. Relevant reference: Ed Yong’s “Small Teams of Scientists Have Fresher Ideas”
  - In other words, a project could have a narrow impact as its stated vision (at first or in the long-term), and that can be a very good thing.
Questions for Discussion in Your Project’s Community
- What is our desired impact?
  - How has it changed?
  - How can we continually reevaluate our desired impact?
- How are we currently tracking impact and health, if at all?
- What is our process for interpreting quantitative data, especially when data collection is automated?
  - How can we supplement our quantified measures with qualitative data?
  - Who needs to be at the table when assessing these data?
Resources on Metrics
- CHAOSS - https://chaoss.community
- Code for Science and Society’s Tracking Impact and Measuring Success in Data Education Events
- PLoS Computational Biology’s Ten Simple Rules for Measuring the Impact of Workshops
- CSIRO’s organization-wide approach to impact assessment