ClickHouse has seemingly come out of nowhere to rival Elasticsearch as the database-related open source software project with the most active contributors. Runa Capital’s Konstantin Vinogradov continued his ongoing work with open source metrics in a new blog post that analyzed GitHub data for 358 of the 735 databases listed on dbdb.io. In his analysis, an active contributor is defined as anyone who makes at least one commit within a given 12-month period.
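The active-contributor definition above can be made concrete with a small counting routine. This is only a minimal sketch, not the method used in the analysis: the function name, the input shape (pairs of author and commit timestamp), and the 365-day window are assumptions for illustration, and the sample data is made up.

```python
from datetime import datetime, timedelta


def count_active_contributors(commits, as_of, window_days=365):
    """Count distinct authors with at least one commit in the window ending at `as_of`.

    `commits` is an iterable of (author, commit_datetime) pairs.
    The name, input shape, and 365-day window are illustrative assumptions.
    """
    start = as_of - timedelta(days=window_days)
    # A set keeps each author once, no matter how many commits they made.
    return len({author for author, when in commits if start < when <= as_of})


# Made-up sample data, for illustration only.
sample_commits = [
    ("alice", datetime(2020, 3, 1)),
    ("alice", datetime(2020, 9, 15)),   # second commit, same author
    ("bob", datetime(2019, 11, 30)),    # falls before the window starts
    ("carol", datetime(2020, 12, 31)),
]

active = count_active_contributors(sample_commits, as_of=datetime(2020, 12, 31))
print(active)
```

Under this reading, someone who commits once and someone who commits daily count the same, which is one reason the figure measures breadth of participation rather than volume of work.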
ClickHouse originated at Russia’s Yandex, and its commercialization is being led by Altinity. ClickHouse is column-oriented and allows analytical reports to be generated in real time using SQL queries. ClickHouse’s rise in popularity began in 2016, which happens to be when Apache Spark peaked. TiDB also had more than 200 active contributors in 2020. CockroachDB, Prometheus, MongoDB, and TrinoDB were in the second group of contenders with 150–170 active contributors.
Skeptics will complain that some of the aforementioned databases are no longer governed by open source licenses and so shouldn’t be called open source. Let’s put that debate aside for the moment, but perhaps a recent controversy is partly responsible for the drop in contributors seen in the chart. A closer look at the graphic shows that other databases have also seen drops in active contributors after they made similar changes to their licenses. However, those drops were short-term in nature.
Having a lot of developers regularly committing changes to a project shows that people, and probably several companies, are investing in a technology. It is not a proxy for actual adoption, nor does it mean there is deep community engagement in the development effort. That said, the data shows a robust number of developers working on projects at key junctures of the new data pipeline.
Cockroach Labs and MongoDB are sponsors of InApps.