Character Set Detector | Identify Text Encoding

Encoding | Runs in Your Browser (No Uploads)

Detect the character encoding of a file. Supports UTF-8, UTF-16, and ASCII, and checks for a Byte Order Mark (BOM).

How to Use This Tool

  1. Drag and drop a file into the upload zone, or click to select one.

  2. The tool instantly analyzes the file header and content.

  3. View the detected Encoding, Confidence score, and BOM presence.

  4. Use this information to open the file correctly in your editor or application.

Use Cases & Examples

Fixing Garbled Text (Mojibake)

Determine if a broken file is actually UTF-8, ISO-8859-1, or Windows-1252 so you can decode it correctly.
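
As a rough illustration of why the encoding matters, the sketch below decodes the same bytes with each candidate encoding. The helper name `try_decodings` is hypothetical and is not part of this tool; it only shows how a wrong guess produces mojibake.

```python
def try_decodings(data: bytes, candidates=("utf-8", "cp1252", "iso-8859-1")):
    """Return {encoding: decoded text, or None if the bytes are invalid in it}."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

sample = "café".encode("utf-8")          # the bytes 63 61 66 C3 A9
for name, text in try_decodings(sample).items():
    print(f"{name:<11}{text!r}")
# utf-8 gives 'café'; cp1252 and iso-8859-1 both give the mojibake 'cafÃ©'
```

Note that the two single-byte encodings decode without error even though the result is wrong, which is why a successful decode alone does not prove the guess.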

Verifying Data Pipelines

Ensure that files generated by your system have the correct encoding and BOM settings before sending them to customers.

Codebase Migration

Scan source code files to identify legacy encodings before converting everything to UTF-8.
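
For a whole codebase, a script can do a first pass before you inspect individual files here. The following is a minimal sketch, assuming a strict UTF-8 decode is an acceptable test; the function name `find_non_utf8` and the default suffix list are illustrative choices, not anything this tool provides.

```python
import tempfile
from pathlib import Path

def find_non_utf8(root, suffixes=(".py", ".js", ".txt")):
    """Yield files under root whose bytes do not decode as strict UTF-8."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            try:
                path.read_bytes().decode("utf-8")
            except UnicodeDecodeError:
                yield path

# Demo on a throwaway directory with one UTF-8 file and one Windows-1252 file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ok.txt").write_bytes("naïve".encode("utf-8"))
    Path(tmp, "legacy.txt").write_bytes("naïve".encode("cp1252"))
    legacy = [p.name for p in find_non_utf8(tmp)]

print(legacy)  # ['legacy.txt']
```

Files flagged this way are candidates for a legacy encoding; which one still has to be determined per file.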

How Charset Detection Works

Byte Analysis: The tool reads the raw byte patterns of the file to guess the encoding.

BOM Check: It looks for a Byte Order Mark (BOM) at the beginning (e.g., `EF BB BF` for UTF-8).

Validation: It validates byte sequences against rules for UTF-8, ASCII, and other common encodings.
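
The three steps above can be sketched roughly as follows. This is not the tool's actual source; it is a simplified illustration under stated assumptions: the names `BOMS`, `is_valid_utf8`, and `detect` are hypothetical, the UTF-32 signatures are included only so a UTF-32 LE file is not mistaken for UTF-16 LE, and UTF-16 without a BOM is not handled.

```python
import codecs

# BOM signatures, longest first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),        # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]

def is_valid_utf8(data: bytes) -> bool:
    """Check the bytes against the well-formed UTF-8 sequence rules."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:                          # single byte (ASCII)
            i += 1
            continue
        lo, hi = 0x80, 0xBF                   # allowed range of the 2nd byte
        if 0xC2 <= b <= 0xDF:                 # 2-byte lead (C0/C1 are overlong)
            need = 1
        elif 0xE0 <= b <= 0xEF:               # 3-byte lead
            need = 2
            if b == 0xE0: lo = 0xA0           # reject overlong forms
            if b == 0xED: hi = 0x9F           # reject surrogates
        elif 0xF0 <= b <= 0xF4:               # 4-byte lead
            need = 3
            if b == 0xF0: lo = 0x90           # reject overlong forms
            if b == 0xF4: hi = 0x8F           # reject code points > U+10FFFF
        else:
            return False                      # stray continuation or bad lead
        if i + need >= n:                     # sequence truncated
            return False
        if not lo <= data[i + 1] <= hi:
            return False
        if any(not 0x80 <= c <= 0xBF for c in data[i + 2 : i + 1 + need]):
            return False
        i += 1 + need
    return True

def detect(data: bytes):
    """Return (encoding guess, has_bom)."""
    for bom, name in BOMS:                    # BOM check
        if data.startswith(bom):
            return name, True
    if all(b < 0x80 for b in data):           # byte analysis: pure 7-bit data
        return "ASCII", False
    if is_valid_utf8(data):                   # validation
        return "UTF-8", False
    return "unknown", False

print(detect(b"\xef\xbb\xbfhello"))           # ('UTF-8', True)
print(detect("héllo".encode("utf-8")))        # ('UTF-8', False)
print(detect(b"h\xe9llo"))                    # ('unknown', False)
```

A real detector would continue past "unknown" with statistical guesses for legacy encodings, which is where a confidence score becomes useful.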

Frequently Asked Questions

Q. Is my file uploaded to a server?

A. No. The analysis happens entirely in your browser using JavaScript. Your file data never leaves your device.

Q. Can it detect all encodings?

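A. It reliably detects common encodings such as UTF-8, UTF-16, and ASCII. Legacy single-byte encodings (for example, Windows-1252 vs ISO-8859-1) assign a character to nearly every byte value, so a file rarely fails validation in any of them, and detection is a best-guess estimate.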
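
To illustrate why this is only a guess, here is one common heuristic, shown as a sketch and not necessarily what this tool does. Windows-1252 and ISO-8859-1 differ only in the byte range 0x80 to 0x9F: ISO-8859-1 maps it to rarely used control characters, while Windows-1252 maps most of it to printable characters such as curly quotes and the euro sign. The function name `guess_single_byte` and its return strings are hypothetical.

```python
# The five byte values that Windows-1252 leaves unassigned.
CP1252_UNDEFINED = {0x81, 0x8D, 0x8F, 0x90, 0x9D}

def guess_single_byte(data: bytes) -> str:
    """Guess between Windows-1252 and ISO-8859-1 from bytes 0x80-0x9F."""
    c1 = {b for b in data if 0x80 <= b <= 0x9F}
    if not c1:
        # Outside 0x80-0x9F the two encodings decode identically.
        return "either"
    if c1 & CP1252_UNDEFINED:
        return "ISO-8859-1 (or another encoding)"
    return "Windows-1252 (likely)"

print(guess_single_byte(b"caf\xe9"))                  # either
print(guess_single_byte("“hi”".encode("cp1252")))     # Windows-1252 (likely)
```

When the answer is "either", the choice does not matter for that file, because both encodings yield the same text.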

Q. What is a BOM?

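A. A Byte Order Mark (BOM) is the invisible character U+FEFF placed at the start of a text file. Its encoded bytes identify the encoding and, for UTF-16, the endianness (byte order). It is optional in UTF-8, which has no byte-order variants, but common in files created on Windows.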
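
If the tool reports a UTF-8 BOM, the application reading the file may need to strip it. As a small example, assuming the file is processed with Python, the standard `utf-8-sig` codec removes a leading BOM when decoding and adds one when encoding:

```python
import codecs

raw = codecs.BOM_UTF8 + "id,name".encode("utf-8")   # EF BB BF followed by the text

kept = raw.decode("utf-8")          # BOM stays as a leading U+FEFF character
stripped = raw.decode("utf-8-sig")  # BOM is removed if present

print(repr(kept))      # '\ufeffid,name'
print(repr(stripped))  # 'id,name'

# Encoding with "utf-8-sig" writes the BOM back.
assert "id,name".encode("utf-8-sig") == raw
```

A BOM left in place is a frequent cause of a first CSV column name or JSON key that looks correct but fails to match.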
