This is a public report, and its contents may be used as long as the source is appropriately credited.
Scope of respondents
More than 47,000 people participated in the 2021 Developer Ecosystem Survey. This report is based on the input of 31,743 developers from 183 countries or regions. The data was weighted according to several criteria, as described in the following paragraphs.
Data cleaning process
We used partial responses, except in cases where the respondent left the survey before answering the questions about their primary programming languages. We also used a set of criteria to identify and exclude suspicious responses. Here are some of the indicators we checked for:
- Surveys that were filled out too quickly.
- Surveys from identical IP addresses, as well as surveys with responses that were overwhelmingly similar. If two responses were more than 75% identical according to their Szymkiewicz-Simpson overlap coefficient, we kept the one with more answered questions.
- Surveys with conflicting answers, for example, “18-20 years old” combined with “more than 16 years of professional experience”.
- Surveys with only a single option chosen for almost all the multiple-choice questions.
- If multiple surveys were submitted from the same email address, we kept the survey that was the most complete.
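The similarity check in the list above can be sketched as follows. This is a minimal illustration, not the pipeline used for the report: it assumes each response is encoded as a set of (question, answer) pairs, and the function names and sample answers are hypothetical.

```python
def overlap_coefficient(a: set, b: set) -> float:
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def answer_set(response: dict) -> set:
    """Encode a response as (question, answer) pairs, skipping unanswered questions."""
    return {(q, a) for q, a in response.items() if a is not None}


def keep_more_complete(r1: dict, r2: dict, threshold: float = 0.75) -> list:
    """Keep both responses if they differ enough, else only the more complete one."""
    s1, s2 = answer_set(r1), answer_set(r2)
    if overlap_coefficient(s1, s2) > threshold:
        return [r1] if len(s1) >= len(s2) else [r2]
    return [r1, r2]


# Hypothetical responses: r1 is a subset of r2, so their overlap is 1.0.
r1 = {"q1": "Java", "q2": "IntelliJ IDEA", "q3": "Linux", "q4": None}
r2 = {"q1": "Java", "q2": "IntelliJ IDEA", "q3": "Linux", "q4": "Git"}
kept = keep_more_complete(r1, r2)
```

Because the coefficient divides by the size of the smaller set, a partial response that is fully contained in a longer one scores 1.0, which is why the more complete response is the one retained.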
Reducing the response burden
To shorten the survey and reduce its response burden, some sections were shown to respondents randomly. There were seven randomized sections, of which each respondent saw only two:
- Continuous Integration, Issue Tracking, and VCS
- DevOps and Hosting
- Static analysis, Open-source, etc.
- Cross-platform and Microservices
- Communication tools
For example, if a respondent selected Tester / QA Engineer or DevOps Engineer / Infrastructure Developer as their job role, they would be given one definite section about their job role plus one other section selected randomly.
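A rough sketch of this selection rule is shown below. The mapping from job role to section is an assumption made for illustration (the report does not state which section belongs to which role), and `pick_sections` is a hypothetical helper.

```python
import random


def pick_sections(sections, job_role, role_section, k=2, rng=random):
    """Choose k sections; a role-specific section, if any, is always included."""
    forced = role_section.get(job_role)
    chosen = [forced] if forced in sections else []
    pool = [s for s in sections if s not in chosen]
    # Fill the remaining slots with distinct sections drawn at random.
    return chosen + rng.sample(pool, k - len(chosen))


sections = [
    "Continuous Integration, Issue Tracking, and VCS",
    "DevOps and Hosting",
    "Static analysis, Open-source, etc.",
    "Cross-platform and Microservices",
    "Communication tools",
]
# Assumed mapping, for illustration only.
role_section = {"DevOps Engineer / Infrastructure Developer": "DevOps and Hosting"}

picked = pick_sections(sections, "DevOps Engineer / Infrastructure Developer", role_section)
other = pick_sections(sections, "Backend Developer", role_section)
```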
Despite our measures to reduce the work required of respondents while still pursuing our goal of covering as many research topics as possible, we’ve found that respondents on average spend more time taking the survey than we can reasonably request. We will revise the survey structure next year to try to improve the experience.
To invite potential respondents to complete the survey, we used Twitter ads, Facebook ads, Instagram, Quora, VK, and JetBrains’ own communication channels. We also posted links to some user groups and tech community channels, and we asked our respondents to share the link to the survey with their peers.
This year we changed our targeting criteria and expanded our geographical coverage. We collected responses from across the world. Apart from the 17 countries that we’ve targeted in previous years’ research, respondents were allocated to 6 regions.
We collected sufficiently large samples from 23 geographical entities. These entities include 17 countries, which account for approximately 70% of all the developers worldwide: Argentina, Belarus, Brazil, Canada, China, France, Germany, India, Japan, Mexico, Russia, South Korea, Spain, Turkey, Ukraine, the United Kingdom, and the United States. The remaining countries were distributed among 6 regions:
- Africa, the Middle East, and Central Asia
- European countries not listed above
- Southeast Asia and Oceania, Australia, and New Zealand
- Central and South America
- Eastern Europe, the Balkans, and the Caucasus
- Northern Europe and Benelux
For each geographical region (except for Canada and Japan), we collected at least 300 responses from external sources, such as ads. Within some regions, we received abnormally large numbers of responses from certain countries (e.g. Nepal and Kenya). Some of these responses were excluded from the analysis to ensure a more representative distribution.
To minimize possible bias against non-English speaking respondents, the survey was also available in 9 additional languages: Chinese, French, German, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish.
Sampling bias reduction
To minimize bias, the report is based on weighted data. The weighting relied on the responses that came from Twitter ads, Facebook ads, Instagram, Quora, VK, and respondents’ referrals, and we took each respondent’s source into account individually. We performed three stages of weighting to get a less biased picture of the worldwide developer population.
First weighting stage: populations of professional developers in 23 regions
In the first stage, we assembled the responses collected while targeting different countries, and then we applied our estimates of the populations of professional developers in each country to these data.
We took the survey data on professional developers and working students that came from ads posted on various social networks in the 23 regions, along with the data that came from various peer referrals. Then we weighted all these responses according to our estimated populations of professional developers in those 23 regions. This ensured that the distribution of the responses corresponded to the estimates of the numbers of professional developers in each country.
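The idea behind this stage can be illustrated with a simple post-stratification sketch. The region labels and population figures below are made up for the example, and `region_weights` is a hypothetical helper rather than the procedure actually used.

```python
from collections import Counter


def region_weights(regions, population):
    """Per-respondent weights so that weighted region shares match population shares."""
    counts = Counter(regions)
    n = len(regions)
    total_pop = sum(population[r] for r in counts)
    # weight = (population share of the region) / (sample share of the region)
    return [(population[r] / total_pop) / (counts[r] / n) for r in regions]


# Made-up sample: region X is over-represented relative to its population.
regions = ["X", "X", "X", "Y"]
population = {"X": 100, "Y": 300}
weights = region_weights(regions, population)

weighted_share_y = sum(w for w, r in zip(weights, regions) if r == "Y") / sum(weights)
```

In this toy sample, region Y holds 75% of the population but only 25% of the responses, so its single respondent is weighted up and the three respondents from X are weighted down, while the total weight stays equal to the sample size.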
Second weighting stage: the proportions of currently employed and unemployed developers
In the second stage, we forced the proportion of students and unemployed respondents (who came to us through the same external ad campaigns) to be 17% in every country. We did this to maintain consistency with the previous year’s methodology, as that is the only estimate of their populations we have available.
As a result, we had a distribution of 19,281 responses from external sources weighted by country and employment status.
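The second-stage adjustment could be sketched as a rescaling applied within each country. This is an illustration under the assumption that the country's total weight is kept unchanged; `fix_group_share` and the sample values are hypothetical.

```python
def fix_group_share(weights, in_group, target=0.17):
    """Rescale weights within one country so the flagged group has the target share."""
    total = sum(weights)
    group_total = sum(w for w, g in zip(weights, in_group) if g)
    rest_total = total - group_total
    if group_total == 0 or rest_total == 0:
        raise ValueError("both groups must be present to fix their proportions")
    group_factor = target * total / group_total
    rest_factor = (1 - target) * total / rest_total
    return [w * (group_factor if g else rest_factor) for w, g in zip(weights, in_group)]


# Made-up country sample: half of the respondents are students or unemployed.
stage1_weights = [1.0, 1.0, 1.0, 1.0]
student_or_unemployed = [True, True, False, False]
stage2_weights = fix_group_share(stage1_weights, student_or_unemployed)

group_share = sum(
    w for w, g in zip(stage2_weights, student_or_unemployed) if g
) / sum(stage2_weights)
```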
Third weighting stage: employment status, programming languages, JetBrains products usage
The third stage was rather sophisticated, as it included calculations obtained by solving systems of equations. We took those 19,281 weighted responses. For developers from each country, in addition to their employment status, we calculated the shares for each of the 30+ programming languages, as well as the shares for those who answered “I currently use JetBrains products” and “I have never heard of JetBrains or its products”. Those shares became constants in our equations.
The next step was to add two more groups of responses from other sources: JetBrains internal communication channels, such as JetBrains social-network accounts and our research panel, and social-network ad campaigns targeted at users of certain programming languages. This yielded 12,462 more responses, which we weighted to keep all those shares the same.
Solving the system of 30+ linear equations and inequalities
We composed a system of 30+ linear equations and inequalities that described:
- The weighting coefficients for the respondents (for example, Fiona from our sample represents on average 180 software developers from France).
- The specific values of their responses (for example, Pierre uses C++, he is fully employed, and he has never heard of JetBrains).
- The necessary ratios among their responses (for example, 27% of developers have used C++ in the past 12 months, and so on).
In order to solve this system of equations with the minimum variance of the weighting coefficients (which is important!), we used the dual method of Goldfarb and Idnani (1982, 1983) for quadratic programming, which helped us compute optimal individual weighting coefficients for the respondents.
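To make the minimum-variance idea concrete, here is a simplified sketch. It is not the Goldfarb-Idnani dual method: it handles equality constraints only, using the closed-form solution w = 1 + A^T (A A^T)^-1 (b - A·1) of the problem "minimize the squared deviation of the weights from 1 subject to A w = b". Unlike a full quadratic programming solver, it cannot enforce inequalities such as non-negative weights. The function names and the four-respondent sample are hypothetical; the 27% C++ share is the illustrative figure quoted above.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("constraints are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def min_variance_weights(indicators, targets):
    """indicators[k][i] is 1 if respondent i has attribute k; targets[k] is its required share."""
    n = len(indicators[0])
    # Row 0: weights sum to n. Row k: weighted share of attribute k equals targets[k].
    rows = [[1.0] * n] + [[x - p for x in ind] for ind, p in zip(indicators, targets)]
    b = [float(n)] + [0.0] * len(targets)
    residual = [bk - sum(row) for row, bk in zip(rows, b)]  # b - A·1
    gram = [[sum(x * y for x, y in zip(r1, r2)) for r2 in rows] for r1 in rows]  # A A^T
    lam = solve_linear(gram, residual)
    return [1.0 + sum(l * row[i] for l, row in zip(lam, rows)) for i in range(n)]


# Made-up sample: two of four respondents use C++, but the target share is 27%.
uses_cpp = [1, 1, 0, 0]
w = min_variance_weights([uses_cpp], [0.27])
cpp_share = sum(wi for wi, x in zip(w, uses_cpp) if x) / sum(w)
```

Respondents with identical answers receive identical weights here, which is the smallest-variance way to hit the target share. With many overlapping constraints the closed form can produce negative weights, which is one reason an inequality-aware method is needed in practice.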
Despite these measures, some bias is likely present, as JetBrains users might have been more willing on average to complete the survey.
Also, our community ecosystem is developing, and there might be some data fluctuations despite our weighting stages and efforts. For instance, the share of Kotlin users who compile their applications for the JVM has grown in our data owing to Kotlin/JVM bias in our sources, although there have been no changes to the overall share of the Kotlin language.
We will continue to update and improve our weighting methodology in the future. Stay tuned for DevEco 2022!
Thank you for your time!
We hope you found our report useful. Share this report with your friends and colleagues.
If you have any questions or suggestions, please contact us at email@example.com.