Normalize Unicode text in CotEditor on Mac

In CotEditor, you can convert the text in a document to a consistent Unicode form by applying one of several Unicode normalization forms.

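Unicode normalization is the process of converting text into a standardized Unicode representation. Characters that look the same can be represented by different code point sequences: for example, é can be stored as the single code point U+00E9 or as the letter e (U+0065) followed by a combining acute accent (U+0301). Normalization eliminates those differences, so that equivalent text compares as equal.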
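
As a minimal illustration of the paragraph above, the following Python sketch uses the standard library's unicodedata module (not CotEditor itself) to show two strings that render identically but differ in their code points until they are normalized. The variable names are illustrative only.

```python
import unicodedata

# Two ways to write "é": a single precomposed code point,
# or a base letter followed by a combining accent.
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" + COMBINING ACUTE ACCENT

# The strings look the same on screen but are different sequences.
differs_before = precomposed != decomposed

# After normalizing both to the same form, they compare equal.
equal_after = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

print(differs_before, equal_after)
```

Any of the standard forms would work for the comparison, as long as both strings are normalized to the same one.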

CotEditor supports the following Unicode normalization forms:

Form              Description
NFD               Canonical decomposition.
NFC               Canonical decomposition, followed by canonical composition.
NFKD              Compatibility decomposition.
NFKC              Compatibility decomposition, followed by canonical composition.
NFKC Case-Fold    Applying NFKC, case folding, and removal of default-ignorable code points.
Modified NFD      Unofficial NFD-based normalization form used in HFS+.
Modified NFC      Unofficial NFC-based normalization form corresponding to Modified NFD.
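
To make the difference between the four standard forms concrete, here is a small Python sketch that applies each of them to the same sample string and lists the resulting code points. It relies on the standard library's unicodedata.normalize, which accepts only NFC, NFD, NFKC, and NFKD, so the remaining forms in the table are not covered here. The helper name code_points and the sample string are assumptions chosen for illustration.

```python
import unicodedata


def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Sample: precomposed "Å" (U+00C5) followed by the "fi" ligature (U+FB01).
sample = "\u00c5\ufb01"

# Apply each of the four standard normalization forms to the sample.
results = {
    form: unicodedata.normalize(form, sample)
    for form in ("NFD", "NFC", "NFKD", "NFKC")
}

for form, text in results.items():
    print(form, " ".join(code_points(text)))
```

The canonical forms (NFD, NFC) leave the ligature U+FB01 untouched and only split or recombine Å, whereas the compatibility forms (NFKD, NFKC) also replace the ligature with the plain letters f and i.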

Note: Modified NFD and Modified NFC follow the variant of Unicode normalization that macOS file systems such as HFS+ apply to file names. They are useful when working with text derived from macOS file names.
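
The following Python sketch approximates what a modified NFD could look like. It is not CotEditor's implementation. It assumes that the HFS+ variant differs from standard NFD only by leaving characters in the ranges U+2000 to U+2FFF, U+F900 to U+FAFF, and U+2F800 to U+2FAFF undecomposed; those ranges, the function name modified_nfd, and the run-by-run approach are all assumptions made for illustration.

```python
import re
import unicodedata

# ASSUMPTION: code point ranges that the HFS+ variant leaves undecomposed.
# The capturing group makes re.split() keep the excluded runs in its result.
_EXCLUDED_RUNS = re.compile(
    "([\u2000-\u2fff\uf900-\ufaff\U0002f800-\U0002faff]+)"
)


def modified_nfd(text):
    """Approximate an HFS+-style NFD (hypothetical helper, not a library API)."""
    parts = _EXCLUDED_RUNS.split(text)
    # split() alternates: even indexes are ordinary runs, odd indexes are
    # runs of excluded characters, which are passed through unchanged.
    return "".join(
        part if index % 2 else unicodedata.normalize("NFD", part)
        for index, part in enumerate(parts)
    )


# "é" is decomposed as in standard NFD; the ANGSTROM SIGN (U+212B) lies in
# an excluded range and is kept, although standard NFD would decompose it.
print(modified_nfd("\u00e9") == "e\u0301")
print(modified_nfd("\u212b") == "\u212b")
```

Because each run is normalized separately, combining marks that directly follow an excluded character may be ordered differently than a real file system would order them, so treat this only as a rough model of the behavior.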

Normalize Unicode text

  1. In the CotEditor app on your Mac, open a document.

  2. Select the text you want to normalize.

  3. Choose Text > Unicode Normalizations, then choose a normalization form.