Code inspection: Use UTF-8 string literal
UTF-8 is one of the most commonly used character encodings, particularly on the internet. However, in .NET, the char and string types use UTF-16 to represent their values. Obtaining the UTF-8 representation of a string therefore requires an additional step, such as calling System.Text.Encoding.UTF8.GetBytes(), which performs the conversion at runtime. To avoid this runtime cost, some developers perform the encoding in advance and embed the resulting byte array in the source code as follows:
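For instance, a precomputed array for the text "Hello" might look like the sketch below. The class and field names are hypothetical, and the runtime call is included only to show that both approaches yield the same bytes.

```csharp
using System;
using System.Linq;
using System.Text;

public static class PrecomputedGreeting
{
    // Hypothetical example: the UTF-8 bytes of "Hello", encoded by hand ahead of time.
    private static readonly byte[] HelloUtf8 = { 0x48, 0x65, 0x6C, 0x6C, 0x6F };

    public static void Main()
    {
        // Runtime conversion, shown for comparison with the precomputed array.
        byte[] encodedAtRuntime = Encoding.UTF8.GetBytes("Hello");

        Console.WriteLine(encodedAtRuntime.SequenceEqual(HelloUtf8)); // True
    }
}
```

The array avoids the encoding step, but the text it represents is no longer visible in the source, which is the readability problem this inspection targets.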
C# 11 introduces a new, simpler way to represent UTF-8 strings in the source code without any runtime overhead:
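A minimal sketch of the new syntax, assuming a project that targets C# 11 or later (the class and property names are hypothetical): the u8 suffix on a string literal produces a ReadOnlySpan<byte> holding the UTF-8 bytes.

```csharp
using System;

public static class LiteralGreeting
{
    // A u8 literal has type ReadOnlySpan<byte>. It is exposed through a property
    // here because a span cannot be stored in a field of a class.
    private static ReadOnlySpan<byte> HelloUtf8 => "Hello"u8;

    public static void Main()
    {
        Console.WriteLine(HelloUtf8.Length); // 5
    }
}
```

If a byte[] is needed instead of a span, calling ToArray() on the literal copies the bytes into a new array.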
This inspection helps recognize existing ways of representing UTF-8 strings and replace them with the new language feature to improve the readability of your code.
It also detects usages of Encoding.UTF8.GetBytes() with string literals and helps transform them into the new UTF-8 string literals. This not only improves readability but also enhances performance by eliminating the need for runtime encoding.
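The transformation described above can be sketched as a before/after pair. This is an illustration with hypothetical names, and the exact code the quick-fix generates may differ from what is shown.

```csharp
using System;
using System.Linq;
using System.Text;

public static class HeaderNames
{
    public static void Main()
    {
        // Before: the literal is encoded to UTF-8 every time this line runs.
        byte[] before = Encoding.UTF8.GetBytes("Content-Type");

        // After: the compiler emits the UTF-8 bytes; ToArray() only copies them.
        byte[] after = "Content-Type"u8.ToArray();

        Console.WriteLine(before.SequenceEqual(after)); // True
    }
}
```

Where the consuming API accepts ReadOnlySpan<byte>, the ToArray() call can be dropped so that no array is allocated at all.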