Study Finds Gender Bias in Open-Source Programming

Image credit: Jeff_golden. Image shared under a Creative Commons license.

A study comparing acceptance rates of contributions from men and women in an open-source software community finds that, overall, women’s contributions tend to be accepted more often than men’s – but when a woman’s gender is identifiable, her contributions are rejected more often.

“There are a number of questions and concerns related to gender bias in computer programming, but this project was focused on one specific research question: To what extent does gender bias exist when pull requests are judged on GitHub?” says Emerson Murphy-Hill, corresponding author of a paper on the study and an associate professor of computer science at North Carolina State University.

GitHub is an online programming community that fosters collaboration on open-source software projects. When people identify ways to improve code on a given project, they submit a “pull request.” Those pull requests are then approved or denied by “insiders,” the programmers who are responsible for overseeing the project.

For this study, researchers looked at more than 3 million pull requests from approximately 330,000 GitHub users, of whom about 21,000 were women.

The researchers found that 78.7 percent of women’s pull requests were accepted, compared to 74.6 percent for men.

However, when looking at pull requests by people who were not insiders on the relevant project, the results got more complicated.

Programmers who could easily be identified as women based on their names or profile pictures had lower pull request acceptance rates (58 percent) than users who could be identified as men (61 percent). But women programmers who had gender-neutral profiles had higher acceptance rates (70 percent) than any other group, including men with gender-neutral profiles (65 percent).
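
To make the comparison concrete, the short Python sketch below recomputes the gaps between women’s and men’s acceptance rates from the figures reported in this article. It is only an illustration: the names `rates`, `gap` and `overall_gap` are hypothetical, the inputs are the rounded percentages quoted above rather than the study’s underlying data, and the results are simple percentage-point differences that say nothing about statistical significance.

```python
# Acceptance rates for project outsiders, in percent, as quoted in the article.
# The keys are illustrative labels, not terms taken from the paper.
rates = {
    ("women", "identifiable"): 58,
    ("men", "identifiable"): 61,
    ("women", "neutral"): 70,
    ("men", "neutral"): 65,
}


def gap(profile):
    """Return women's rate minus men's rate, in percentage points."""
    return rates[("women", profile)] - rates[("men", profile)]


# Overall acceptance rates across all pull requests (78.7 vs. 74.6 percent).
overall_gap = round(78.7 - 74.6, 1)

for profile in ("identifiable", "neutral"):
    print(f"{profile}: {gap(profile):+d} points")
print(f"all pull requests: {overall_gap:+.1f} points")
```

With these inputs, the gap is negative for outsiders whose gender is identifiable and positive for outsiders with gender-neutral profiles, which is the reversal the researchers describe.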

“Our results indicate that gender bias does exist in open-source programming,” Murphy-Hill says. “The study also tells us that, in general, women on GitHub are strong programmers. We don’t think that’s because gender affects one’s programming skills, but likely stems from strong self-selection among women who submit pull requests on the site.

“We also want to note that this paper builds on a previous, un-peer-reviewed version of the paper, which garnered a lot of input that improved the research,” Murphy-Hill says.

The paper, “Gender Differences and Bias in Open Source: Pull Request Acceptance of Women Versus Men,” is published in the open-access journal PeerJ Computer Science. The paper was co-authored by Josh Terrell, a former undergraduate at Cal Poly; Andrew Kofink, a former undergraduate at NC State; Justin Middleton, a Ph.D. student at NC State; Clarissa Rainear, an undergraduate at NC State; Chris Parnin, an assistant professor of computer science at NC State; and Jon Stallings, an assistant professor of statistics at NC State. The work was done with support from the National Science Foundation under grant number 1252995.

-shipman-

Note to Editors: The study abstract follows.

“Gender Differences and Bias in Open Source: Pull Request Acceptance of Women Versus Men”

Authors: Josh Terrell, Cal Poly; Andrew Kofink, Justin Middleton, Clarissa Rainear, Emerson Murphy-Hill, Chris Parnin, and Jon Stallings, North Carolina State University

Published: May 1, PeerJ Computer Science

DOI: 10.7717/peerj-cs.111

Abstract: Biases against women in the workplace have been documented in a variety of studies. This paper presents a large scale study on gender bias, where we compare acceptance rates of contributions from men versus women in an open source software community. Surprisingly, our results show that women’s contributions tend to be accepted more often than men’s. However, for contributors who are outsiders to a project and their gender is identifiable, men’s acceptance rates are higher. Our results suggest that although women on GitHub may be more competent overall, bias against them exists nonetheless.

8 responses on “Study Finds Gender Bias in Open-Source Programming”

  1. Male? says:

    What a joke of a study.

    1. Matt Shipman says:

      How so? The paper garnered a great deal of discussion as a pre-print, and underwent a lengthy amount of revision. The n for the study is large. It’s undergone peer review. Substantive critiques are welcome, but simply calling something a “joke” doesn’t make it so. What are your concerns with the work?

  2. Skeptic says:

    Correctly applied the “correlation doesn’t imply causation” when saying that female programmers are not necessarily better because their PRs are accepted more often (if unidentifiable).

    Fails to apply same principle when saying that female programmers are victims of bias because their PRs are accepted less often (if identifiable).

    Typical case where the conclusion comes first, then the hunt for supporting evidence. When similar undesired conclusions might appear, the author hedges. Very selective reasoning.
