This is a public report, and its contents may be used as long as the source is appropriately credited.

Methodology

Scope of respondents

More than 47,000 people participated in the 2021 Developer Ecosystem Survey. This report is based on the input of 31,743 developers from 183 countries or regions. The data was weighted according to several criteria, as described in the following paragraphs.

Data cleaning process

We used partial responses, except in cases where the respondent left the survey before answering the questions about their primary programming languages. We also used a set of criteria to identify and exclude suspicious responses. Here are some of the indicators we checked for:

Surveys that were filled out too fast.
Surveys from identical IP addresses, as well as surveys with responses that were overwhelmingly similar. If two responses were more than 75% identical according to their Szymkiewicz-Simpson overlap coefficient, we kept the one with more answered questions.
Surveys with conflicting answers, for example, “18-20 years old” combined with “more than 16 years of professional experience”.
Surveys with only a single option chosen for almost all the multiple-choice questions.
If multiple surveys were submitted from the same email address, we kept the survey that was the most complete.
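
As an illustration of the similarity check, the Szymkiewicz-Simpson overlap coefficient of two sets A and B is |A ∩ B| / min(|A|, |B|). The sketch below assumes that each response is represented as a set of (question, answer) pairs. The report does not say how responses were encoded, and the function names and sample answers are hypothetical.

```python
def overlap_coefficient(a: set, b: set) -> float:
    """Szymkiewicz-Simpson overlap coefficient: |A & B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def keep_more_complete(resp_a: dict, resp_b: dict, threshold: float = 0.75) -> list:
    """Return both responses if they are dissimilar, otherwise only the fuller one."""
    # Assumption: a response is a dict of question -> answer (hashable values),
    # compared as a set of (question, answer) pairs.
    pairs_a = set(resp_a.items())
    pairs_b = set(resp_b.items())
    if overlap_coefficient(pairs_a, pairs_b) > threshold:
        # Keep the response with more answered questions (ties keep the first).
        return [resp_a if len(resp_a) >= len(resp_b) else resp_b]
    return [resp_a, resp_b]
```

Because the coefficient divides by the size of the smaller set, a short response that is fully contained in a longer one scores 1.0, so the longer one is kept.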

Reducing the response burden

To shorten the survey and reduce its response burden, some sections were shown to respondents randomly. There were seven randomized sections, of which each respondent saw only two:

Continuous Integration, Issue Tracking, and VCS
Testing
DevOps and Hosting
Static analysis, Open-source, etc.
Education
Cross-platform and Microservices
Communication tools

For example, if a respondent selected Tester / QA Engineer or DevOps Engineer / Infrastructure Developer as their job role, they would be given one definite section about their job role plus one other section selected randomly.
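
The assignment rule can be sketched as follows. This is only an illustration: the mapping from job roles to sections is an assumption (the report does not say which section each role received), and `ROLE_SECTION` and `assign_sections` are hypothetical names.

```python
import random

SECTIONS = [
    "Continuous Integration, Issue Tracking, and VCS",
    "Testing",
    "DevOps and Hosting",
    "Static analysis, Open-source, etc.",
    "Education",
    "Cross-platform and Microservices",
    "Communication tools",
]

# Assumed mapping from job role to its definite section (not stated in the report).
ROLE_SECTION = {
    "Tester / QA Engineer": "Testing",
    "DevOps Engineer / Infrastructure Developer": "DevOps and Hosting",
}


def assign_sections(job_roles, rng=random):
    """Pick two sections: at most one definite section by role, the rest at random."""
    # Assumption: if several roles match, only the first one fixes a section.
    fixed = [ROLE_SECTION[role] for role in job_roles if role in ROLE_SECTION][:1]
    remaining = [s for s in SECTIONS if s not in fixed]
    return fixed + rng.sample(remaining, 2 - len(fixed))
```

Passing a seeded `random.Random` instance as `rng` makes the assignment reproducible.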

Despite our measures to reduce the work required of respondents while still pursuing our goal of covering as many research topics as possible, we’ve found that respondents on average spend more time taking the survey than we can reasonably request. We will revise the survey structure next year to try to improve the experience.

Targeting

To invite potential respondents to complete the survey, we used Twitter ads, Facebook ads, Instagram, Quora, VK, and JetBrains’ own communication channels. We also posted links to some user groups and tech community channels, and we asked our respondents to share the link to the survey with their peers.

Countries

This year we changed our targeting criteria and expanded our geographical coverage. We collected responses from across the world, allocating respondents to 6 regions, with the exception of the 18 countries that we’ve targeted in previous years’ research.

We collected sufficiently large samples from 23 geographical entities. These entities include 17 countries, which account for approximately 70% of all the developers worldwide: Argentina, Belarus, Brazil, Canada, China, France, Germany, India, Japan, Mexico, Russia, South Korea, Spain, Turkey, Ukraine, the United Kingdom, and the United States. The remaining countries were distributed among 6 regions:

Africa, the Middle East, and Central Asia
European countries not listed above
Southeast Asia and Oceania, Australia, and New Zealand
Central and South America
Eastern Europe, the Balkans, and the Caucasus
Northern Europe and Benelux

For each geographical region (except for Canada and Japan), we collected at least 300 responses from external sources, such as ads. Within some regions, we received abnormally large numbers of responses from certain countries (e.g. Nepal and Kenya). Some of these responses were excluded from the analysis to ensure a more representative distribution.

Localization

To minimize possible bias against non-English speaking respondents, the survey was also available in 9 additional languages: Chinese, French, German, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish.

Sampling bias reduction

To minimize bias, the report is based on data weighted with regard to the responses that came from Twitter ads, Facebook ads, Instagram, Quora, VK, and respondents’ referrals. We took each respondent’s source into account individually when applying the weighting procedures. We performed three stages of weighting to get a less biased picture of the worldwide developer population.

First weighting stage: populations of professional developers in 23 regions

In the first stage, we assembled the responses collected while targeting different countries, and then we applied our estimates of the populations of professional developers in each country to this data.

We took the survey data on professional developers and working students that came from ads posted on various social networks in the 23 regions, along with the data that came from various peer referrals. Then we weighted all these responses according to our estimated populations of professional developers in those 23 regions. This ensured that the distribution of the responses corresponded to the estimates of the numbers of professional developers in each country.
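
A minimal sketch of this kind of population weighting is shown below, assuming that every respondent in a region receives the same weight, equal to the estimated developer population of the region divided by the number of responses from it. The function name and all numbers are illustrative and are not JetBrains estimates.

```python
from collections import Counter


def region_weights(respondent_regions, population_estimates):
    """Return a per-respondent weight for each region: population / sample size."""
    counts = Counter(respondent_regions)
    return {
        region: population_estimates[region] / n
        for region, n in counts.items()
    }


# Made-up example: two responses from France, one from Japan.
regions = ["France", "France", "Japan"]
populations = {"France": 360, "Japan": 500}  # hypothetical population estimates
weights = region_weights(regions, populations)
```

With these weights, the weighted number of responses in each region equals the estimated population of that region.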

Second weighting stage: the proportions of currently employed and unemployed developers

In the second stage, we forced the proportion of students and unemployed respondents (who came to us through the same external ad campaigns) to be 17% in every country. We did this to maintain consistency with the previous year’s methodology, as that is the only estimate of their populations we have available.

As a result, we had a distribution of 19,281 responses from external sources weighted by country and employment status.
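
Forcing a fixed share can be sketched as a rescaling of weights within one country. The example assumes that the weights of employed respondents stay fixed and only the weights of students and unemployed respondents are scaled. The report does not state this detail, and the function name and numbers are hypothetical.

```python
def rescale_to_target_share(weights, in_group, target_share=0.17):
    """Scale the weights of group members so the group holds target_share of the total."""
    group_total = sum(w for w, g in zip(weights, in_group) if g)
    other_total = sum(w for w, g in zip(weights, in_group) if not g)
    if group_total == 0 or other_total == 0:
        return list(weights)  # nothing to rescale against
    # Solve f * group / (other + f * group) = target_share for the factor f.
    factor = target_share * other_total / ((1 - target_share) * group_total)
    return [w * factor if g else w for w, g in zip(weights, in_group)]


# Made-up example: three employed respondents and one student in a country.
new_weights = rescale_to_target_share(
    [100.0, 100.0, 100.0, 50.0], [False, False, False, True]
)
```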

Third weighting stage: employment status, programming languages, JetBrains products usage

The third stage was rather sophisticated, as it involved solving systems of equations. We took the 19,281 weighted responses from the second stage. For developers from each country, in addition to their employment status, we calculated the shares for each of the 30+ programming languages, as well as the shares for those who answered “I currently use JetBrains products” and “I have never heard of JetBrains or its products”. Those shares became constants in our equations.

The next step was to add two more groups of responses from other sources: JetBrains internal communication channels, such as JetBrains social-network accounts and our research panel, and social-network ad campaigns targeted at users of certain programming languages. This yielded 12,462 more responses, which we weighted to keep all those shares the same.

Solving the system of 30+ linear equations and inequalities

We composed a system of 30+ linear equations and inequalities that described:

The weighting coefficients for the respondents (for example, Fiona from our sample represents on average 180 software developers from France).
The specific values of their responses (Pierre uses C++, he is fully employed, and he has never heard of JetBrains).
The necessary ratios among their responses (for example, 27% of developers have used C++ in the past 12 months, and so on).

To solve this system of equations with the minimum variance of the weighting coefficients (which is important!), we used the dual method of Goldfarb and Idnani (1982, 1983), which helped us obtain optimal individual weighting coefficients for the respondents.
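
One way to state such a problem, offered only as an assumption about its general shape, is as a quadratic program. The symbols are not from the report: n is the number of respondents, w_i is the weight of respondent i, x_{ik} is 1 if respondent i has attribute k (for example, uses C++) and 0 otherwise, p_k is the required share for attribute k, and N is the total population to be represented.

```latex
\begin{aligned}
\min_{w_1,\dots,w_n}\quad & \sum_{i=1}^{n} \left(w_i - \frac{N}{n}\right)^2 \\
\text{subject to}\quad & \sum_{i=1}^{n} w_i = N, \\
& \sum_{i=1}^{n} w_i\, x_{ik} = p_k N, \quad k = 1,\dots,K, \\
& w_i \ge 0, \quad i = 1,\dots,n.
\end{aligned}
```

Since the weights sum to N, their mean is N/n, so the objective is n times the variance of the weights. The objective is strictly convex, which is the class of quadratic programs that the dual method of Goldfarb and Idnani is designed to solve.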

Lingering bias

Despite these measures, some bias is likely present, as JetBrains users might have been more willing on average to complete the survey.

Also, our community ecosystem is developing, and there might be some data fluctuations despite our weighting stages and efforts. For instance, the share of Kotlin users who compile their applications for JVM has grown in our data owing to Kotlin/JVM bias in our sources, although there have been no changes to the overall share of the Kotlin language.

We will continue to update and improve our weighting methodology in the future. Stay tuned for DevEco 2022!
