MPS 2020.1 Help

Regexp language

Regular expressions are one of the earliest DSLs in wide use. Nearly all modern programming languages support regular expressions in one way or another, and MPS is no exception. Regular expression support in MPS is implemented as a BaseLanguage extension.

We also recommend checking out the Regular Expressions Cookbook for a more thorough introduction to the language.

Defining regular expressions

The regexp language lets you create an instance of the java.util.regex.Pattern class using a special pattern expression: /regexp/. In the generated code, MPS creates a final static field in the outermost class for each pattern expression, so each pattern is compiled only once during application runtime.

Pattern pattern = /[a-z]+/;
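To make the "compiled only once" point concrete, here is a minimal plain-Java sketch of what such generated code might look like. The class name, the field name REGEXP_0, and the accessor method are assumptions made for illustration; the names MPS actually generates may differ.

```java
import java.util.regex.Pattern;

class GeneratedPatternSketch {
    // Hypothetical field name: the pattern is compiled once, when the class is initialized.
    private static final Pattern REGEXP_0 = Pattern.compile("[a-z]+");

    // Stands in for the place where the /[a-z]+/ expression is used.
    static Pattern pattern() {
        // Every call returns the same compiled instance; nothing is recompiled.
        return REGEXP_0;
    }

    public static void main(String[] args) {
        System.out.println(pattern().matcher("hello").matches()); // true
        System.out.println(pattern() == pattern());               // true
    }
}
```

Holding the compiled Pattern in a static final field avoids repeating the relatively expensive Pattern.compile call each time the expression is evaluated.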

You can add any of three options after the closing slash of the regexp:

/i: Case-insensitive matching.

/s: Single-line mode. The dot character class also matches newline delimiters.

/m: Multiline mode. ^ and $ match at the start and end of any line within the string (instead of only at the start or end of the whole string).

The options can be turned on or off by typing or deleting the corresponding character in the editor, or through the Inspector. A preview of the generated regular expression is also available in the Inspector.
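For readers who know java.util.regex, the three options behave like the standard Pattern flags shown in the sketch below. The mapping to these particular flags is an assumption based on the option descriptions; the code MPS generates may express the options differently (for example, as inline flags). The class and field names are hypothetical.

```java
import java.util.regex.Pattern;

class RegexpOptionsSketch {
    // Behaves like /i: letters match regardless of case.
    static final Pattern CASE_INSENSITIVE = Pattern.compile("[a-z]+", Pattern.CASE_INSENSITIVE);

    // Behaves like /s: the dot also matches line terminators.
    static final Pattern DOT_ALL = Pattern.compile("a.b", Pattern.DOTALL);

    // Behaves like /m: ^ and $ match at line boundaries inside the input.
    static final Pattern MULTI_LINE = Pattern.compile("^b$", Pattern.MULTILINE);

    public static void main(String[] args) {
        System.out.println(CASE_INSENSITIVE.matcher("HeLLo").matches()); // true
        System.out.println(DOT_ALL.matcher("a\nb").matches());           // true
        System.out.println(MULTI_LINE.matcher("a\nb\nc").find());        // true
    }
}
```

Without the respective flag, each of the three checks above would print false.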

Re-using Definitions

To reuse a regular expression for a frequently used pattern across your project, create a separate root:

model -> New -> jetbrains.mps.baseLanguage.regexp -> Regexps

Each reusable regular expression should have a name and optionally a description.

regexp Identifier {
  // no description
  (identifier: [a-z A-Z _] [a-z A-Z _ 0-9]+)
}
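As a loose plain-Java analogy (not a description of what MPS generates), a shared definition resembles a named pattern fragment kept in one place and reused by several patterns. All names in this sketch are hypothetical, and the fragment mirrors the Identifier definition above under the assumption that the spaces in the MPS notation are editor layout only.

```java
import java.util.regex.Pattern;

class SharedRegexps {
    // Hypothetical shared fragment, mirroring the Identifier definition above.
    static final String IDENTIFIER = "[a-zA-Z_][a-zA-Z_0-9]+";

    // Two patterns that reuse the same fragment.
    static final Pattern ASSIGNMENT = Pattern.compile("^(" + IDENTIFIER + ")\\s*=");
    static final Pattern CALL = Pattern.compile("^(" + IDENTIFIER + ")\\(");

    public static void main(String[] args) {
        System.out.println(ASSIGNMENT.matcher("count = 1").find()); // true
        System.out.println(CALL.matcher("print(x)").find());        // true
        System.out.println(CALL.matcher("count = 1").find());       // false
    }
}
```

Changing the fragment in one place updates every pattern built from it, which is the same benefit a reusable Regexps root provides.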

Pattern Match Operator

The =~ operator returns true if the string matches the specified pattern.

"string or variable" =~ /regexp/

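In plain Java, a test like this is written with a Matcher. The sketch below shows the two standard checks side by side, because this page does not state which one =~ corresponds to: find() succeeds if the pattern occurs anywhere in the input, while matches() requires the whole input to match. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class MatchOperatorSketch {
    private static final Pattern WORD = Pattern.compile("[a-z]+");

    // Succeeds if the pattern occurs anywhere in the input.
    static boolean findsIn(String input) {
        return WORD.matcher(input).find();
    }

    // Succeeds only if the entire input matches the pattern.
    static boolean matchesWhole(String input) {
        return WORD.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(findsIn("abc 123"));      // true
        System.out.println(matchesWhole("abc 123")); // false
        System.out.println(matchesWhole("abc"));     // true
    }
}
```

If the difference matters for your pattern, anchoring it explicitly with ^ and $ removes the ambiguity.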
Capturing Text

Parentheses in the expression create a capture group. To refer to the group later, give it a name.

/^ (name: [a-z A-Z _] [a-z A-Z _ 0-9]+) /

Examples

if ("any string" =~ /^ # define (identifier: [a-z A-Z _] [a-z A-Z _ 0-9]+) /) {
  process(identifier);
}

If the pattern matches the string, the matched value is captured into identifier and can be accessed inside the if block.
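For comparison, the same idea in plain java.util.regex uses a named group, (?<name>...), and reads it back with Matcher.group(String). This is a sketch under stated assumptions: the Java pattern is a hand-written approximation of the MPS expression above (it assumes the spaces in the MPS notation are editor layout and that whitespace separates #define from the name), and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupSketch {
    // Hand-written approximation of the MPS pattern; "identifier" is the group name.
    private static final Pattern DEFINE =
            Pattern.compile("^#define\\s+(?<identifier>[a-zA-Z_][a-zA-Z_0-9]+)");

    // Returns the captured name, or an empty Optional if the line does not match.
    static Optional<String> definedName(String line) {
        Matcher matcher = DEFINE.matcher(line);
        return matcher.find() ? Optional.of(matcher.group("identifier")) : Optional.empty();
    }

    public static void main(String[] args) {
        // The capture is only available when the match succeeded.
        definedName("#define MAX_SIZE 100").ifPresent(System.out::println); // MAX_SIZE
        System.out.println(definedName("int x = 1;").isPresent());          // false
    }
}
```

The MPS form is shorter because the named group becomes a variable that is in scope inside the if block, so no explicit Matcher handling is needed.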

Don't forget to check out the Regular Expressions Cookbook for a more thorough introduction to the language.

Last modified: 18 June 2020