Data Collection and Use Policy
The JetBrains AI service can collect two types of data related to the usage of AI features: behavioral data and detailed data. The user fully controls both types of data collection.
Behavioral Data Collection
Behavioral data includes information such as:
Types of AI features used.
Rates of acceptance for suggestions from different AI features.
Performance data (such as the amount of time it took to generate AI suggestions).
User feedback on the quality of results produced by different AI features.
This type of data does not include any personally identifiable data or any source code files or fragments from the user’s project.
This data is used by various teams at JetBrains for analyzing product usage, improving product features, and training machine learning (ML) models that control the behavior of different product features (for example, controlling the automatic activation of ML features). It is not used for training ML models that generate code, text, or any other type of data from which outputs could be extracted.
Collection of this type of data is controlled by the standard data sharing settings (see the product documentation for details). It is disabled by default in EAP and release builds.
Detailed Data Collection
Detailed data collection includes full data about the interactions with large language models. This means the full text of inputs sent by the IDE to the large language model and its responses, including source code snippets.
Collection of this type of data is controlled by the option Tools | AI Assistant | Data Sharing | Allow detailed data collection. It is disabled by default in both EAP and release builds. Detailed data collection is only performed when the user enables this option and gives explicit consent to the collection of detailed data.
Access to the collected data will be restricted to the teams at JetBrains that specifically work on large language model development and integration. This data will be analyzed to understand product usage and identify opportunities for improvement. It will not be used for training any ML models that generate code or text, and it will not be revealed in any form to any other users.
We will also implement a retention policy for this data: it will be stored only for a limited time, not exceeding one year.
If the user does not opt in to detailed data collection, the inputs will be sent directly to the large language model (LLM) provider and processed according to that provider’s data collection and use policy, and the outputs will be sent directly to the user’s IDE. The inputs and outputs will not be persistently stored on JetBrains servers.