Migrations

After a language has been published and users have started using it, the language authors have to be careful with further changes to the language definition. In particular, removing concepts or adding and removing properties, children and references to concepts will introduce incompatibilities between the previous and the next language version. This impacts the users of the language if they update to the next language version, since they may discover that their model no longer matches the language definitions and get appropriate errors reported from their models.

MPS tracks versions of languages used in projects and provides automatic migrations to upgrade the usages of a language to the most recent versions. The language designers can create maintenance "migration" code to run automatically against the user code and thus change the user's code so that it complies with the changes made to the language definition. This is called language migration.

The full language migration story has several aspects:

Language designers can write scripts for migrating the user code and bundle them with the language
MPS automatically tracks language versions used in the client code
MPS controls that the user's project is up-to-date with all language changes
MPS runs the required migrations when they are needed

There are two types of migrations available in MPS:

Language migrations - migrations that upgrade the project to comply with the next version of the language definition. Each language migration is attached to a version of the language definition.
Project migrations - these are not triggered by language usages; instead, they define the conditions under which they should be run. These migrations are always applied to the whole project.

Language version

Languages store a version number in their module definition (.mpl) file. This number increases when a new migration is created in the language's "migration" aspect.

Modules that use languages contain a version number associated with each used language reference in the module (.msd, .mpl) file. These represent the language version used by the module. The number changes when the corresponding migration is run against this module to migrate it to a later language version.

The version number of a language can be viewed and modified manually in the Properties dialog for a language:

Notice that there are two numbers available:

Language version - updated each time the structure of the language changes
Module version - updated each time references to the nodes in the module are migrated. If you change a module that has sources, e.g. by moving nodes, a migration has to be run on the references to these nodes in the dependent modules. The module version tracks that.

Migration assistant

When MPS detects that the modules within the currently open project refer to versions of languages older than the ones present, a Migration assistant is run. It prompts the user whether the migrations should be run in order to update the project to the most recent versions of the languages.

A detailed list of the migrations that will be run is presented to the user:

If the user triggers the migration, the project is fully migrated. In case of problems preventing the migration, a list of problems together with the list of not migrated code is presented to the user.
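The version comparison that triggers the Migration assistant can be illustrated with a minimal plain-Java sketch. It is not MPS code: the class PendingMigrations, the method stepsBehind and the language names are hypothetical, and the maps merely stand in for the version numbers stored in module and language descriptors. It also assumes that each migration raises the language version by exactly 1, so the version difference equals the number of migration steps still to run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PendingMigrations {
    /**
     * For every language a module uses, compares the version recorded in the module
     * with the version of the installed language. Returns, per outdated language,
     * how many version steps the module is behind.
     */
    static Map<String, Integer> stepsBehind(Map<String, Integer> usedVersions,
                                            Map<String, Integer> installedVersions) {
        Map<String, Integer> pending = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> used : usedVersions.entrySet()) {
            Integer installed = installedVersions.get(used.getKey());
            // Languages that are not installed are ignored in this sketch.
            if (installed != null && used.getValue() < installed) {
                pending.put(used.getKey(), installed - used.getValue());
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // Hypothetical language names and versions.
        Map<String, Integer> used = Map.of("my.lang.a", 1, "my.lang.b", 3);
        Map<String, Integer> installed = Map.of("my.lang.a", 3, "my.lang.b", 3);
        System.out.println(stepsBehind(used, installed));
    }
}
```

An empty result corresponds to an up-to-date module, in which case no prompt would be needed.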

Defining language migrations

Migrations are defined as Migration Classes in the migrations aspect of your language definition. Migration Classes are nodes of the MigrationScript concept defined in the jetbrains.mps.lang.migration language.

Numbering of languages and migrations

The name of each migration script holds a number
Each migration script defines a from version property

When a new migration script is created, the language version is increased by 1 and the fromVersion field in the migration is set to the old value of the language version. We can now say that the created migration script performs the migration from the old version to the new one.

Numbering of languages and migrations tips and tricks

No migrations can be "missed". If a language contains a migration from version X and from version Y, it should also contain a migration for each version between X and Y. If a migration is not found for some version, this means that no user is able to migrate from version X. Generation of such languages will end with an error.
It's not necessary to store all migrations for a language. If a language was "published" and it's necessary to remove some of the older migrations, they can be removed. The from-versions of the remaining migrations should form a range A..B, where A is any older version and B = <current version> - 1.
If a migration was created by mistake and wasn't published (meaning no user has run it on their project), it can be freely removed. After removing the migration, execute "Correct Language version" from the language's context menu - this action synchronizes the language's version with the last migration's version. BE VERY CAREFUL when doing this.
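The numbering rules above can be written down as a small consistency check. This is a minimal plain-Java sketch, not how MPS itself validates a language: the class MigrationRangeCheck and its method are hypothetical, and treating a language with no migrations as consistent is an assumption of the sketch.

```java
import java.util.List;
import java.util.TreeSet;

final class MigrationRangeCheck {
    /**
     * Returns true when the from-versions of the kept migrations form a gap-free
     * range A..B with B == currentLanguageVersion - 1.
     */
    static boolean isConsistent(List<Integer> fromVersions, int currentLanguageVersion) {
        if (fromVersions.isEmpty()) {
            return true; // assumption: a language without migrations has nothing to check
        }
        TreeSet<Integer> sorted = new TreeSet<>(fromVersions);
        int first = sorted.first();
        int last = sorted.last();
        if (last != currentLanguageVersion - 1) {
            return false; // the newest migration must lead to the current version
        }
        // A range without gaps contains exactly (last - first + 1) distinct versions.
        return last - first + 1 == sorted.size();
    }

    public static void main(String[] args) {
        System.out.println(isConsistent(List.of(2, 3, 4), 5)); // range 2..4, current version 5
        System.out.println(isConsistent(List.of(2, 4), 5));    // migration from version 3 is missing
    }
}
```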

Structure of a migration

There are several optional elements that migrations may provide:

execute after - to put an ordering constraint among migration scripts
produces annotation data - specifies the ConceptDeclaration that will be used to hold the migration data produced by this script and possibly consumed by a later migration script.
requires annotation data - specifies the ConceptDeclaration that will be used to represent the migration data produced by an earlier migration script. It also gives the data a logical name to represent it within this migration script.
produces data (deprecated) - legacy variant of transferring migration data, uses external files instead of node annotations.
requires data (deprecated) - legacy variant of transferring migration data.
description - a helpful textual description of the script
execute method - each migration defines an execute() method, which performs the actual model conversion for user models. The method receives the user module as a parameter and may refer to the defined elements in the required annotation data section.

Data production and consumption

The ability to pass data among migration scripts is useful in partitioning the migration process. One migration script may, for example, migrate nodes from an old concept to a new one, while a following migration script will migrate all references to the original nodes to point to the new nodes. For this to work, the first script has to store ids of the old and new nodes and publish the mapping as its produced data. The second migration script will consume the data as required data. Each time a reference to an old node has to be updated, the data will be used to find an id of the new node. Technically, producing data is simply attaching a special attribute containing data to any node that is close enough to the place to which the data is related. If there is no specific place to put annotation because it is related to the whole model, the data node will be attached as a new root in the current model.
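The producer/consumer split described above can be sketched in plain Java. This is only a model of the idea: real migration scripts attach the data to nodes as annotations, whereas here a Map stands in for the produced data, node ids are plain strings, and the class TwoStepMigration, its methods and the "new-" id scheme are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoStepMigration {
    /** Step 1 (producer): replaces every old node and records old id -> new id. */
    static Map<String, String> migrateNodes(List<String> oldNodeIds) {
        Map<String, String> oldToNew = new HashMap<>();
        for (String oldId : oldNodeIds) {
            oldToNew.put(oldId, "new-" + oldId); // hypothetical id scheme for the new nodes
        }
        return oldToNew; // this mapping plays the role of the produced data
    }

    /** Step 2 (consumer): retargets references by looking up the recorded mapping. */
    static List<String> migrateReferences(List<String> referenceTargets,
                                          Map<String, String> oldToNew) {
        List<String> migrated = new ArrayList<>();
        for (String target : referenceTargets) {
            // References to nodes that were not migrated keep their target.
            migrated.add(oldToNew.getOrDefault(target, target));
        }
        return migrated;
    }

    public static void main(String[] args) {
        Map<String, String> data = migrateNodes(List.of("n1", "n2"));
        System.out.println(migrateReferences(List.of("n1", "other", "n2"), data));
    }
}
```

Keeping the two steps separate means the second step can run on a different module than the first, as long as it can read the mapping.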

Migration scripts producing nodes with data should declare the concept of such nodes and use the putData() construction to insert each of such annotations into the model:

Nodes containing data can be retrieved by some other migration script running on another module depending on the module for which the data was produced:

Ordering of migration scripts

The implicit dependencies between migration scripts expressed through the requires annotation data and produces annotation data sections will take care of proper ordering of migration scripts. When a script is migrating some module, it can use data stored for this module and all its dependencies, so a consuming script will start migrating the module only after all the required producers have been run on all dependencies of the module. There is no need to express those dependencies explicitly.

However, in cases when it is necessary to execute some script only after some other script has been executed against the same module (regardless of dependencies), such an ordering constraint can be expressed through the execute after section. If, for example, some property was moved from one concept to its superconcept, which happens to be declared in another language, the migration can be expressed with two migration scripts. The first script, applicable to the subconcept, copies the property value from the old deprecated property to the new one. The second script, applicable to the superconcept, initializes the new property with some default value for those instances of the superconcept that are not instances of the subconcept. Let us also suppose that the second script does some other initialization that depends on the value of the moved property. The second script should then be executed only after the first one, and that on every module.
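How execute after constraints determine a run order can be sketched as a topological sort. The following plain-Java example is an illustration only and makes no claim about how MPS schedules scripts internally: the class ScriptOrdering and the script names are hypothetical, and the algorithm used is the standard Kahn's algorithm.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ScriptOrdering {
    /**
     * executeAfter maps each script to the scripts that must run before it.
     * Every script must appear as a key (with an empty list if unconstrained),
     * and predecessor lists must not contain duplicates.
     */
    static List<String> order(Map<String, List<String>> executeAfter) {
        // Number of predecessors that have not run yet, per script.
        Map<String, Integer> remaining = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : executeAfter.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String done = ready.remove();
            result.add(done);
            // Every script waiting for 'done' has one predecessor less.
            for (Map.Entry<String, List<String>> e : executeAfter.entrySet()) {
                if (e.getValue().contains(done)) {
                    int left = remaining.merge(e.getKey(), -1, Integer::sum);
                    if (left == 0) {
                        ready.add(e.getKey());
                    }
                }
            }
        }
        if (result.size() != executeAfter.size()) {
            throw new IllegalStateException("cyclic or unsatisfiable execute-after constraints");
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> constraints = new LinkedHashMap<>();
        constraints.put("InitSuperconceptProperty", List.of("CopySubconceptProperty"));
        constraints.put("CopySubconceptProperty", List.of());
        System.out.println(order(constraints));
    }
}
```

In the property-move scenario above, the script copying the subconcept property is ordered before the script initializing the superconcept property, whatever order the constraints are listed in.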

Languages for defining migrations

The jetbrains.mps.lang.migration language defines all concepts specific to migration scripts. When defining your migrations, you can use BaseLanguage together with the jetbrains.mps.lang.smodel and .query languages to manipulate the models. The ofType<model> construct may be of particular use to obtain models contained in the passed-in SModule:

sequence<SModel> models = m.getModels(); 
models.ofType<model>.selectMany({~model => model.nodes(BaseDocComment); }).forEach({~node => ... });

A typical migration first excludes the migration aspect models from migration and then scans for nodes that need to be migrated. A new node is created and initialised with the values and children of the old node. The old node is then replaced with the new node. Setting the id of the new node to the value of the id of the old node will allow references to this node to be migrated without losing their target:

void execute(SModule m) {
  sequence<model> models = ((sequence<model>) m.getModels()).where({~it => !it.isAspectModel(migration); });
  models.selectMany({~m => m.nodes(OldComponent); }).forEach({~oldNode =>
    node<NewComponent> newNode = <NEW component $( oldNode.name )$ {>;
                                            *( oldNode.member )*
    ((SNode) newNode/).setId(((SNode) oldNode/).getNodeId());
    oldNode.replace with(newNode);
  });
}

Schematically:

The transformation is applied to some node. As a result, we have a reference to the old node (call it No) and a new node (Nn).
IDs of No's descendants are preserved automatically: if a was-descendant node is a descendant of the output node after the transformation, it already has the same id.
ID of No: MPS determines whether No is a descendant of an output node.
1. If yes, we already have the target for references that pointed to the No (this is for "wrap" cases - the node is "wrapped" in another node as a result of the transformation)
2. If no, the Nn gets the ID of No (that's for the case when we changed the concept of a node, but the old node is semantically equivalent to the new one)
No is replaced with Nn in the containing model.
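The id rule in this scheme can be modelled with a few lines of plain Java. The sketch uses a hypothetical, minimal Node class instead of the MPS SNode API and only shows the decision between the "wrap" case and the "changed concept" case.

```java
import java.util.ArrayList;
import java.util.List;

final class IdPreservation {
    /** Minimal tree node with an id and children. A stand-in, not the MPS SNode API. */
    static final class Node {
        String id;
        final List<Node> children = new ArrayList<>();

        Node(String id) {
            this.id = id;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** True when candidate sits anywhere below root. */
    static boolean isDescendant(Node root, Node candidate) {
        for (Node child : root.children) {
            if (child == candidate || isDescendant(child, candidate)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Wrap case: the old node survives inside the new one and keeps its own id.
     * Otherwise the new node takes over the old node's id, so references still resolve.
     */
    static void transferId(Node oldNode, Node newNode) {
        if (!isDescendant(newNode, oldNode)) {
            newNode.id = oldNode.id;
        }
    }

    public static void main(String[] args) {
        Node wrapped = new Node("1");
        Node wrapper = new Node("2").add(wrapped);
        transferId(wrapped, wrapper);
        System.out.println(wrapper.id); // wrapper keeps its own id in the wrap case

        Node replaced = new Node("1");
        Node replacement = new Node("9");
        transferId(replaced, replacement);
        System.out.println(replacement.id); // replacement takes over the old id
    }
}
```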

Concept replacement

If a language designer decides to remove a language concept and perhaps replace it with a new one, she should not remove the concept definition from the language immediately. Instead, the concept should be deprecated first and a migration script should be provided to migrate the user code away from the deprecated concept.

The deprecated concept can be completely removed (but does not have to be) in the version following the one in which it was deprecated. The migration scripts that refer to the deprecated concept then have to be removed, too.

Defining project migrations

Project migrations are not typically used by language developers, but rather by the MPS team to describe changes in the model file format, in the module dependencies system and other project-wide things.

Project migrations are run against the whole project, so it's up to the MPS developer to think about how the migration will work when a part of a project changes. E.g. the user can update her project from the VCS, and in this case it may not be enough to know that the project was migrated once; updated modules may still have to be migrated.
MPS does not guarantee the order in which project migrations will be run, so you can't write mutually dependent project migrations.

Nevertheless, users can write their own project migrations. There's no special language for project migrations, so they are written as Java/BaseLanguage classes and contributed through plugin.xml. From here on, we assume that you already have an MPS plugin and write the project migration in it.

Note that if a project migration is written in a solution, this solution must have the IdeaPlugin enabled in the Facets tab of the Solution Properties dialog and the plugin id set in the Idea Plugin tab.

Adding a new project migration

Create a class for the migration implementing the ProjectMigration interface. For most cases, it's convenient to inherit from the BaseProjectMigration class.
Create an ApplicationComponent that will contribute the new migrations. Do not forget to register it in plugin.xml.
Contribute all your project migrations from the created ApplicationComponent using the ProjectMigrationsRegistry.addProjectMigration() method.

Saving data from project migrations

Project migrations can use the MigrationProperties project component to persist their data. The persisted data is stored in the .mps folder of the project, so it is shared among the project's developers through VCS.

Multiple branches

Migrating projects that use multiple branches has a few additional challenges. Check out the Using Migration with branching documentation for details.

Migration Ant Task

There's an ant task to run all migrations in a project from an ant script. This task can be used for automatic testing of migrations and/or for checking whether a project has been migrated.

This task requires the MPS home path to be set by

defining mpshome task attribute or
defining mps_home environment property or
defining mps.home environment property - this is the preferred way

Home path is the path to the folder that contains the build.txt file. E.g. under Mac OS this will end with "/Contents/"

Repository contents may be specified using the <repository> tag:

If a plugin is needed for a project to migrate, this can be specified in the <migrate> ant task. The corresponding plugin will be enabled, together with its dependencies.

Examples

For concrete examples of how to define migrations, you can check out the migrations sample project that comes bundled with MPS. You will see migration scripts to migrate two simple mutually interconnected languages. One of them uses data to pass information about migrated nodes between two migration scripts, while the other relies on node id manipulation.

Changes made by migrations in Local History view

Migrations cooperate with the Local History functionality.

After running migrations, it's possible to review all the changes made to the project by each of the migrations. Open the Local History view for the project's folder, a module or a model, select any two changes and press Ctrl + D to see the difference.

It's also possible to revert a change or a group of changes from the Local History view as well as from the Diff dialogs.

Migration assistant in IntelliJ IDEA

The IntelliJ IDEA plugin can also run language migrations. Just like in MPS itself, the Migration assistant will update models in IDEA projects to match the currently installed versions of used languages.

Discovering deprecated code

Deprecation is a recommended mechanism to indicate to the users of a language that an element will be removed in one of the next versions of your language. MPS provides several handy finders to help users eliminate deprecated code. Find Usages of Deprecated can find all usages of deprecated elements. The report of the found usages groups the entries by the expected version of the code removal. This makes it easier to recognise their severity and prioritise their elimination.

Last modified: 26 February 2021