Developing a Methodology for Analyzing Open Source Communities
InApps is developing a methodology for analyzing open source communities. To begin this effort, we decided to look at the composition of several open source projects. In an initial analysis, we’ve learned a bit about the companies and the people who are participating in the development of OpenStack, Docker, Kubernetes, and other new stack technologies. Some of the initial research can be found in our recent ebook about the container ecosystem, where 50 of the 71 open source projects we cataloged had an identifiable corporate sponsor.
When we looked at the contributors to a few of these projects, we found that in many cases development was led by a single party. The table below shows that many of the more popular projects associated with containers are dominated by just a few companies.
Percentage of Contributions Coming From Employees: Select Projects in the Container Ecosystem
Project | Top Contributor | Share | Secondary Contributor | Share
Kubernetes | | 72% | Red Hat | 15%
Docker | Docker | 58% | Red Hat | 7%
Cloud Foundry | Pivotal/VMware | 67% | IBM | 11%
Mesos | Mesosphere | 49% | | 14%
The Cloud Foundry number requires a bit more explanation. Although our initial analysis found that Pivotal contributed 58 percent of the Cloud Foundry code, we also found another 10 percent from VMware, which shares the same parent company (EMC) as Pivotal. An additional 10 percent of the contributions come from “bots,” or continuous integration software pipelines that automatically submit code that could come from Pivotal or from third parties such as IBM. So Pivotal/VMware could be contributing as much as 77 percent of the Cloud Foundry code.
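The arithmetic behind that range can be sketched in a few lines of Python. This is only an illustration: the function name `attributed_share_bounds` is hypothetical, and the inputs are the combined Pivotal/VMware figure from the table (67 percent) and the bot figure from the paragraph above (10 percent).

```python
def attributed_share_bounds(known_share, unattributed_share):
    """Return (lower, upper) bounds on a group's share, in percent.

    known_share: percent of contributions directly attributed to the group.
    unattributed_share: percent from sources (here, bots) whose true origin
    is unknown and could belong entirely to the group or not at all.
    """
    lower = known_share
    upper = known_share + unattributed_share
    return lower, upper


# Figures from the article: 67% combined Pivotal/VMware, 10% from bots.
low, high = attributed_share_bounds(67, 10)
print(f"Pivotal/VMware share: between {low}% and {high}%")
```

The lower bound assumes none of the bot-submitted code originated with Pivotal or VMware; the upper bound assumes all of it did.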
It is worth noting that not all projects are dominated by a single company. Both Linux and OpenStack are more heterogeneous communities compared with the projects listed above:
Percentage of Contributions Coming From Employees: Projects With a More Diverse Contributor Base
Project | Top Contributor | Share | Secondary Contributor | Share
OpenStack | HPE | 18% | Red Hat | 17%
Linux | Intel | 11% | Red Hat | 8%
A more visual comparison of OpenStack and Cloud Foundry contributions would look something like this:
[Chart: OpenStack versus Cloud Foundry contributions by company]
If nothing else, the above numbers show that open source software development, at least for the enterprise, may not always be a community-driven process. And this is nothing new: Open source has long enjoyed a strong helping hand from corporations. We plan to investigate in a follow-up article whether this is a good or bad thing for our readers and the technologies they use. And we’d like to hear your feedback.
Methodology
To create the first table above, we collected data about contributors using a tool called Blockspring, which accessed the GitHub API to pull information about contributors to specific repositories. Although each project has multiple repositories, we chose to focus on the primary repository for each.
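The article's data came through Blockspring, so the following is not the tool the authors used. It is a minimal sketch, assuming direct unauthenticated access to GitHub's REST endpoint for listing repository contributors, of what pulling the same kind of data might look like. The helper names (`contributors_url`, `parse_contributors`, `fetch_contributors`) and the `max_pages` limit are illustrative choices.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def contributors_url(owner, repo, page=1, per_page=100):
    """Build the URL for one page of a repository's contributor list."""
    return (
        f"{API_ROOT}/repos/{owner}/{repo}/contributors"
        f"?per_page={per_page}&page={page}"
    )


def parse_contributors(payload):
    """Turn the JSON text of one response page into
    (login, contributions, account_type) tuples."""
    return [
        (item.get("login"), item.get("contributions", 0), item.get("type"))
        for item in json.loads(payload)
    ]


def fetch_contributors(owner, repo, max_pages=5):
    """Fetch up to max_pages pages of contributors.

    This makes live network requests and is subject to GitHub's rate
    limits for unauthenticated clients.
    """
    results = []
    for page in range(1, max_pages + 1):
        request = urllib.request.Request(
            contributors_url(owner, repo, page),
            headers={"Accept": "application/vnd.github+json"},
        )
        with urllib.request.urlopen(request) as response:
            rows = parse_contributors(response.read().decode("utf-8"))
        if not rows:  # an empty page means there is nothing left to read
            break
        results.extend(rows)
    return results
```

A call such as `fetch_contributors("kubernetes", "kubernetes")` would return the per-account commit counts for that one repository, which matches the article's choice to look only at each project's primary repository.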
Since GitHub does not identify a contributor’s employer, we identified it as follows. We first used company domain names that appeared in the email or website fields. However, because a majority of contributors provided Gmail addresses or no email address at all, we used other means to identify their employers. Blockspring, for instance, has an algorithm that cross-checks a person’s email address and username across several social networks and databases. Clearbit and FullContact APIs were also used to collect information.
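The first step, matching an email domain to an employer, can be sketched as follows. This is an assumption-laden illustration rather than the authors' actual procedure: the `PERSONAL_DOMAINS` set and the `DOMAIN_TO_EMPLOYER` mapping are small hypothetical examples, and a real analysis would need far larger lists plus the manual checks described in this section.

```python
# Webmail domains that say nothing about an employer (illustrative list).
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}

# Hypothetical domain-to-employer mapping, for illustration only.
DOMAIN_TO_EMPLOYER = {
    "redhat.com": "Red Hat",
    "docker.com": "Docker",
    "pivotal.io": "Pivotal",
    "ibm.com": "IBM",
}


def employer_from_email(email, domain_map=DOMAIN_TO_EMPLOYER):
    """Return the employer implied by an email address, or None if unknown."""
    if not email or "@" not in email:
        return None
    domain = email.rsplit("@", 1)[1].strip().lower()
    if domain in PERSONAL_DOMAINS:
        return None
    # Try the full domain first, then each parent domain
    # (for example, us.ibm.com falls back to ibm.com).
    parts = domain.split(".")
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in domain_map:
            return domain_map[candidate]
    return None


for address in ["dev@redhat.com", "someone@gmail.com", "x@us.ibm.com", None]:
    print(address, "->", employer_from_email(address))
```

Every address that comes back as `None` is a contributor who would need one of the fallback methods: cross-referencing services, personal websites, or a manual LinkedIn check.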
While none of these methods is perfect, they were accurate the vast majority of the time. For people who still had no company information, we reviewed every personal website that was provided. Additionally, if a real name was provided, we searched for the person on LinkedIn and verified that their picture and other information were similar to what was included on their GitHub profile.
Note that the number of contributions reviewed differs from that seen on GitHub’s own dashboards because of how we counted contributions from merged repositories and those handled by bots.
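To see why the handling of bots changes the totals, consider a minimal sketch of the aggregation step. The function name `contribution_shares`, the row layout, and the sample numbers are all hypothetical and are not the article's data.

```python
from collections import defaultdict


def contribution_shares(rows, include_bots=True):
    """Compute each employer's percentage of contributions.

    rows: iterable of (employer, contributions, is_bot) tuples.
    Bot contributions are either grouped under "Bots" or dropped entirely,
    which changes both the total and every employer's share.
    """
    totals = defaultdict(int)
    for employer, contributions, is_bot in rows:
        if is_bot:
            if not include_bots:
                continue
            employer = "Bots"
        totals[employer or "Unknown"] += contributions
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        name: round(100 * count / grand_total, 1)
        for name, count in totals.items()
    }


# Made-up sample: two companies and one CI bot account.
sample = [
    ("Company A", 60, False),
    ("Company B", 30, False),
    (None, 10, True),
]
print(contribution_shares(sample, include_bots=True))
print(contribution_shares(sample, include_bots=False))
```

With the bot counted, Company A holds 60 percent of the sample; with the bot dropped, the same commits amount to roughly two thirds. A difference in counting rules of this kind is enough to make the totals diverge from those on a dashboard.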
The second table, covering Linux and OpenStack, contains data from reports published by the organizations themselves: the Linux Foundation and the OpenStack Foundation, respectively. In both cases, they mined profiles from GitHub and other version control systems, as well as the text of commits and communication about issue resolution.
Docker, Hewlett Packard Enterprise, IBM, Intel, Pivotal, Red Hat and VMware are sponsors of InApps.
Feature image Groucho Marx via Pixabay. Chart icons via Freepik.
InApps is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker.
Source: InApps.net