Unicode Normalizer | NFC, NFD, NFKC & NFKD Forms
Normalize Unicode text to NFC, NFD, NFKC, or NFKD forms. Fix text comparison issues and standardize data for storage or processing.
How to Use This Tool
1. Select the desired normalization form (NFC is recommended for general web use).
2. Paste your text into the input area.
3. The tool instantly converts the text to the selected form.
4. Copy the result. You can also see the byte size change in the stats.
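For readers who want to reproduce these steps in a script, here is a minimal Python sketch using the standard `unicodedata` module. The function name `normalize_with_stats` is hypothetical, and measuring size in UTF-8 bytes is an assumption; the tool's stats may count bytes differently.

```python
import unicodedata

def normalize_with_stats(text: str, form: str = "NFC") -> dict:
    """Normalize text and report its UTF-8 byte size before and after.

    `form` must be one of "NFC", "NFD", "NFKC", "NFKD".
    UTF-8 is an assumed encoding for the size comparison.
    """
    result = unicodedata.normalize(form, text)
    return {
        "result": result,
        "bytes_before": len(text.encode("utf-8")),
        "bytes_after": len(result.encode("utf-8")),
    }

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT (decomposed input)
stats = normalize_with_stats("cafe\u0301", "NFC")
print(stats["bytes_before"], stats["bytes_after"])  # 6 5
```

Composing the accent shrinks the text by one byte here; decomposing with NFD would grow it instead.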
Use Cases & Examples
Fixing 'Mac vs Windows' File Names
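macOS has historically stored file names in decomposed form (HFS+ uses a variant of NFD), while file names created on Windows and Linux are usually composed (NFC), although those systems do not enforce a form. Normalize file names to NFC to ensure cross-platform compatibility.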
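A file-name check along these lines could look like the sketch below. The helper names `same_filename` and `names_needing_nfc` are hypothetical, and `unicodedata.is_normalized` assumes Python 3.8 or later.

```python
import unicodedata

def same_filename(a: str, b: str) -> bool:
    """Compare two file names after converting both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def names_needing_nfc(names: list) -> list:
    """Return the names that are not already in NFC."""
    return [n for n in names if not unicodedata.is_normalized("NFC", n)]

decomposed = "re\u0301sume\u0301.txt"  # 'e' + U+0301, as a decomposing file system might store it
composed = "r\u00e9sum\u00e9.txt"      # precomposed U+00E9

print(decomposed == composed)                      # False
print(same_filename(decomposed, composed))         # True
print(names_needing_nfc([decomposed, composed]) == [decomposed])  # True
```

The second helper only reports candidates; actually renaming files is left out because it depends on how the target file system treats equivalent names.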
Database Search & Indexing
Ensure consistent search results by normalizing all user input and stored data to a single form (usually NFC) before indexing.
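As an illustration, the sketch below normalizes both stored titles and queries with the same function before they touch the index. The in-memory dictionary and the names `index_key`, `add_document`, and `search` are stand-ins for a real database or search engine, not part of any library.

```python
import unicodedata

index = {}  # maps normalized title -> list of document ids (toy in-memory index)

def index_key(text: str) -> str:
    """Single normalization point shared by writes and reads."""
    return unicodedata.normalize("NFC", text)

def add_document(doc_id: int, title: str) -> None:
    index.setdefault(index_key(title), []).append(doc_id)

def search(query: str) -> list:
    return index.get(index_key(query), [])

add_document(1, "Zoe\u0308")  # decomposed: 'e' + U+0308 COMBINING DIAERESIS
add_document(2, "Zo\u00eb")   # composed: U+00EB

print(search("Zo\u00eb"))     # [1, 2]
print(search("Zoe\u0308"))    # [1, 2]
```

Routing every write and every query through one function keeps the two sides from drifting to different forms.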
Sanitizing User Input
Use NFKC/NFKD to fold compatibility characters into their plain equivalents (like ℍ to H, or ½ to 1⁄2, where the slash is U+2044 FRACTION SLASH rather than an ASCII "/") for easier parsing and filtering.
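The following sketch shows compatibility folding applied before a simple filter. The `BLOCKED` set and the `is_blocked` helper are hypothetical; a real filter would need more than a word list.

```python
import unicodedata

BLOCKED = {"script"}  # hypothetical blocklist, for illustration only

def fold_compat(text: str) -> str:
    """Fold compatibility characters to their plain equivalents (NFKC)."""
    return unicodedata.normalize("NFKC", text)

def is_blocked(word: str) -> bool:
    """Check the folded, lowercased word against the blocklist."""
    return fold_compat(word).lower() in BLOCKED

print(fold_compat("\u210d"))                # H  (from U+210D DOUBLE-STRUCK CAPITAL H)
print(fold_compat("\u00bd") == "1\u20442")  # True: 1, U+2044 FRACTION SLASH, 2
print(fold_compat("\u00bd") == "1/2")       # False: the slash is not ASCII "/"

fullwidth = "\uff53\uff43\uff52\uff49\uff50\uff54"  # fullwidth letters spelling "script"
print(fullwidth in BLOCKED)                 # False without folding
print(is_blocked(fullwidth))                # True after folding
```

Note that a parser expecting an ASCII slash in fractions would still need to map U+2044 itself after folding.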
Understanding Normalization Forms
NFC (Canonical Composition): The form the W3C recommends for web content, and the one most commonly encountered on Linux. Combines characters and diacritics into single code points where possible.
NFD (Canonical Decomposition): Associated with macOS file systems (HFS+ stores names in a variant of NFD). Separates characters and diacritics into distinct code points.
NFKC/NFKD (Compatibility): Similar to NFC/NFD but also normalizes compatibility characters (e.g., converting the 'ﬁ' ligature, U+FB01, to the two letters 'fi'). Note: This is a lossy conversion.
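To see the four forms side by side, the sketch below runs one sample through each of them with Python's `unicodedata.normalize` and counts the resulting code points. The sample string (the U+FB01 ligature followed by U+00E9) is chosen only for illustration.

```python
import unicodedata

sample = "\ufb01\u00e9"  # LATIN SMALL LIGATURE FI + LATIN SMALL LETTER E WITH ACUTE

forms = {form: unicodedata.normalize(form, sample)
         for form in ("NFC", "NFD", "NFKC", "NFKD")}

for form, text in forms.items():
    # ascii() escapes non-ASCII characters so the code points are visible
    print(form, len(text), ascii(text))

# Code point counts: NFC 2, NFD 3, NFKC 3, NFKD 4
```

The canonical forms leave the ligature alone and only compose or decompose the accent; the compatibility forms additionally split the ligature into 'f' and 'i', which cannot be undone by normalizing back.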
Frequently Asked Questions
Q. Why does my text look the same but fail comparison?
A. Unicode allows the same character to be represented in multiple ways (e.g., 'é' as the single code point U+00E9, or as 'e' followed by the combining acute accent U+0301). Normalization unifies these representations.
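One way to diagnose such a mismatch is to list the code points behind each string. The sketch below does this with `unicodedata.name`; the helper name `describe` is hypothetical.

```python
import unicodedata

def describe(text: str) -> list:
    """List each code point in `text` with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

single = "\u00e9"     # one code point
combined = "e\u0301"  # base letter + combining mark

print(describe(single))    # ['U+00E9 LATIN SMALL LETTER E WITH ACUTE']
print(describe(combined))  # ['U+0065 LATIN SMALL LETTER E', 'U+0301 COMBINING ACUTE ACCENT']

print(single == combined)  # False: different code point sequences
print(unicodedata.normalize("NFC", single)
      == unicodedata.normalize("NFC", combined))  # True after normalization
```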
Q. Which form should I use?
A. For most web applications and databases, NFC is the standard recommendation. Use NFD if you are dealing specifically with macOS file systems.
Q. Is NFKC safe for all text?
A. No. NFKC is a 'compatibility' normalization and can change the meaning of text (e.g., converting distinct mathematical symbols to plain letters). Use it with caution.
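A rough pre-check for this risk is sketched below: it lists the characters that NFKC would rewrite but NFC would leave alone. The helper `compat_only_changes` is hypothetical, and it inspects characters one at a time, which is an approximation because normalization can depend on neighboring characters.

```python
import unicodedata

def compat_only_changes(text: str) -> list:
    """Return characters that NFKC alters but NFC keeps (per-character approximation)."""
    return [ch for ch in text
            if unicodedata.normalize("NFKC", ch) != unicodedata.normalize("NFC", ch)]

power = "2\u2075"  # "2" followed by U+2075 SUPERSCRIPT FIVE, i.e. two to the fifth

print(unicodedata.normalize("NFKC", power))         # 25  (the exponent becomes a plain digit)
print(compat_only_changes(power) == ["\u2075"])     # True
print(compat_only_changes("caf\u00e9"))             # []  (nothing here is compatibility-only)
```

An empty result suggests NFKC and NFC would treat the text alike; a non-empty one is a cue to review those characters before applying NFKC.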