Removing special characters and diacritic marks in C#
I used this trick in JavaScript a while back to remove diacritic marks, and the need for a similar transformation in C# came up this week.
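The following method simplifies strings such as “façade” into plain strings like “facade”. It decomposes each character into a base letter plus combining marks (Normalization Form D), drops the non-spacing marks, and then recomposes what is left (Normalization Form C).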
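// Requires: using System.Globalization; using System.Text;
private static string Simplify(string input)
{
    // Decompose accented characters into base letter + combining marks.
    string normalizedString = input.Normalize(NormalizationForm.FormD);
    StringBuilder stringBuilder = new StringBuilder();

    foreach (char c in normalizedString)
    {
        UnicodeCategory unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(c);

        // Keep everything except the combining (non-spacing) marks.
        if (unicodeCategory != UnicodeCategory.NonSpacingMark)
        {
            stringBuilder.Append(c);
        }
    }

    // Recompose any remaining sequences into their canonical form.
    return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
}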
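As a quick check of the behaviour, here is a minimal, self-contained sketch that wraps the same logic in a console program. The class name `SimplifyDemo` and the sample strings are my own assumptions for illustration; the method body repeats the one above so the file compiles on its own. It also shows one limitation worth knowing: letters such as “Æ” and “ø” have no canonical decomposition, so this approach leaves them untouched.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical wrapper class, used only so this example compiles on its own.
public static class SimplifyDemo
{
    // Same logic as the Simplify method above, made public for the demo.
    public static string Simplify(string input)
    {
        // Split accented characters into base letter + combining marks.
        string decomposed = input.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Skip combining (non-spacing) marks, keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        // Recompose whatever remains into canonical composed form.
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        Console.WriteLine(Simplify("façade"));       // facade
        Console.WriteLine(Simplify("Crème brûlée")); // Creme brulee

        // "Æ" and "ø" do not decompose into a base letter plus a mark,
        // so the string comes back unchanged.
        Console.WriteLine(Simplify("Ærøskøbing") == "Ærøskøbing"); // True
    }
}
```

If characters like “ø” also need to become “o”, that would take an explicit replacement table on top of this method, since Unicode normalization alone does not cover them.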
Written by Steve Fenton