Character Set Detector Online | Check File Encoding and BOM

Encoding | Runs in Your Browser (No Uploads)

Detect a text file's encoding and Byte Order Mark (BOM) before importing CSVs, logs, source files, translations, or customer exports. The detector checks for UTF-8, UTF-16, and ASCII, reports BOM presence and a confidence level, and runs entirely in your browser.

How to Use This Tool

  1. Select or drag a text, CSV, log, source, or export file into the detector. The file is read locally and is not sent to a server.

  2. Review the detected encoding, confidence score, and BOM result.

  3. Compare the result with the expected import target, such as UTF-8 without BOM or UTF-16LE.

  4. If text is garbled, reopen or convert the file using the detected encoding in your editor or data pipeline.

  5. Use the result before bulk imports, migrations, or customer data cleanup.
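
Step 4 can be scripted once the encoding is known. The following is a minimal Python sketch, assuming the detector reported UTF-16LE and the import target expects UTF-8 without BOM. The helper name `transcode_to_utf8` and the sample data are illustrative and not part of the tool.

```python
import codecs

def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes with the detected encoding, re-encode as UTF-8 without BOM."""
    text = raw.decode(source_encoding)
    # Some decoders (for example "utf-16-le") leave the BOM in the text
    # as U+FEFF, so drop it explicitly if it is still there.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text.encode("utf-8")

# Hypothetical sample: a BOM-marked UTF-16LE CSV fragment.
sample = codecs.BOM_UTF16_LE + "id,name\n1,Zoë\n".encode("utf-16-le")
converted = transcode_to_utf8(sample, "utf-16-le")
print(converted)
```

For real files, read the input in binary mode, pass the bytes through the helper, and write the result in binary mode so that no second, implicit re-encoding takes place.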

When to Use This Tool

Fix garbled text and mojibake

Identify whether a broken file is likely UTF-8, UTF-16, ASCII, or a legacy encoding candidate before reopening it with the correct decoder.

Check CSV files before import

Inspect exported CSV files before loading them into spreadsheets, databases, ETL jobs, or customer support systems that expect a specific encoding.

Audit source files before migration

Find files with BOM markers or non-UTF-8 encodings before converting a repository or documentation set to a consistent encoding.

Debug multilingual content exports

Review encoding clues when names, accents, Korean, Japanese, Chinese, or symbols appear corrupted after export or upload.

Common Mistakes

Assuming every text file is UTF-8

Many legacy tools and Windows workflows still create UTF-16, Windows-1252, or BOM-marked files. Check encoding before blaming the importer.

Ignoring BOM differences

A UTF-8 BOM can be harmless in some tools but break headers, scripts, or CSV field names in others. Confirm whether the target accepts BOM markers.
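
The effect on CSV field names is easy to reproduce. This short Python sketch uses made-up sample data and shows how a BOM ends up inside the first header when the file is decoded as plain UTF-8, and how Python's `utf-8-sig` codec avoids that.

```python
import csv
import io

# Hypothetical file content: UTF-8 BOM followed by a two-column CSV.
raw = b"\xef\xbb\xbfid,name\n1,Ada\n"

# Plain "utf-8" keeps the BOM as U+FEFF, glued to the first header.
with_bom = csv.DictReader(io.StringIO(raw.decode("utf-8")))
print(with_bom.fieldnames)

# "utf-8-sig" strips a leading BOM if present and is harmless if absent.
without_bom = csv.DictReader(io.StringIO(raw.decode("utf-8-sig")))
print(without_bom.fieldnames)
```

A lookup such as `row["id"]` fails on the first reader because the actual key is `"\ufeffid"`, which looks identical to `id` in most terminals and editors.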

Treating encoding detection as certainty

Some legacy encodings share similar byte patterns, so detection is a best-effort signal. Verify suspicious files with a small preview before bulk conversion.
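
To see why detection can only be a best-effort signal, consider the sketch below. It decodes the same three bytes with several single-byte code pages; the byte values and the candidate list are chosen for illustration only. Every decode succeeds, so nothing in the bytes themselves marks one answer as wrong.

```python
# Three bytes that are valid in many single-byte encodings.
raw = bytes([0xE9, 0xE8, 0xE0])

candidates = ["cp1252", "latin-1", "cp1251", "cp1253"]
decoded = {name: raw.decode(name) for name in candidates}

for name, text in decoded.items():
    # Each line prints different, equally "valid" text for the same bytes.
    print(f"{name}: {text}")
```

Windows-1252 and Latin-1 yield accented Latin letters, Windows-1251 yields Cyrillic, and Windows-1253 yields Greek. Only knowledge of the expected language or a known sample of the text can settle which one is right.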

Examples

Check a CSV export before upload

Detect BOM and encoding before importing customer data into a database or spreadsheet.

Input
customers-export.csv
Bytes start with: EF BB BF 69 64 2C 6E 61 6D 65
Output
Encoding: UTF-8
BOM: Yes
Confidence: High
Next step: remove the BOM if the importer reads the first header as "\uFEFFid" (BOM included) instead of "id".

Diagnose garbled multilingual text

Use encoding clues to reopen a file that shows mojibake after export.

Input
support-log.txt
Expected text: 안녕하세요
Preview: ì•ˆë…•í•˜ì„¸ìš”
Output
Likely issue: UTF-8 bytes decoded as a legacy single-byte encoding
Next step: reopen the file as UTF-8.
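
This kind of fault can be reproduced and reversed in a few lines of Python. The sketch assumes the wrong decoder was Windows-1252; the example itself only says "a legacy single-byte encoding", so treat the code page as a guess to verify on a small sample.

```python
original = "안녕하세요"

# Simulate the fault: UTF-8 bytes decoded with a single-byte code page.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# Reverse it: recover the original bytes, then decode them as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

The round trip only works if no characters were lost along the way. If the garbled text was saved through a step that replaced unmappable bytes with "?" or similar, go back to the original file and reopen it as UTF-8 instead.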

How character set detection works

The detector reads raw bytes in the browser and checks for known Byte Order Mark sequences such as UTF-8, UTF-16LE, and UTF-16BE markers at the start of the file.
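
A BOM check of this kind can be sketched in Python as follows. This is not the tool's actual code: the function name `detect_bom` is hypothetical, and only the three markers named above are covered. The BOM constants come from Python's standard `codecs` module.

```python
import codecs
from typing import Optional

# Signatures for the three markers mentioned above.
# UTF-32 BOMs are deliberately left out of this sketch.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),         # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),  # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16BE"),  # FE FF
]

def detect_bom(raw: bytes) -> Optional[str]:
    """Return the encoding named by a leading BOM, or None if there is none."""
    for signature, name in BOMS:
        if raw.startswith(signature):
            return name
    return None

# First bytes of the CSV example: EF BB BF 69 64 2C 6E 61 6D 65
header = bytes.fromhex("EF BB BF 69 64 2C 6E 61 6D 65")
print(detect_bom(header))
print(header[3:].decode("ascii"))
```

Only the first few bytes are needed for this check, so it stays fast even on very large files.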

It validates byte patterns against common Unicode encodings and ASCII-compatible ranges, then reports a confidence score rather than claiming perfect certainty for every legacy encoding.
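
For files without a BOM, the validation step might look roughly like the Python sketch below. The function name `classify` and the confidence labels are assumptions made for illustration; the tool's real scoring is not documented here.

```python
from typing import Tuple

def classify(raw: bytes) -> Tuple[str, str]:
    """Return (encoding guess, confidence label) for BOM-less bytes."""
    # Pure 7-bit data is ASCII, which is also valid UTF-8.
    if all(byte < 0x80 for byte in raw):
        return ("ASCII", "high")
    try:
        # Strict decoding raises on any malformed UTF-8 sequence.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: probably a legacy encoding, which cannot
        # be pinned down from the bytes alone.
        return ("unknown legacy encoding", "low")
    return ("UTF-8", "high")

print(classify(b"id,name"))
print(classify("Zoë".encode("utf-8")))
print(classify(b"Zo\xeb"))
```

Valid multi-byte UTF-8 is a strong signal because text in legacy single-byte encodings rarely happens to form well-formed UTF-8 sequences, which is why the UTF-8 result can carry high confidence while the fallback cannot.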

Encoding detection is most reliable for BOM-marked files and valid UTF-8 or UTF-16 data. Similar single-byte encodings can require human review or a known sample of expected text.

Frequently Asked Questions

Q. Is my file uploaded when checking character encoding?

A. No. The file is read and analyzed in your browser. The raw bytes do not need to leave your device, which is useful for logs, exports, and customer files.

Q. Can the detector distinguish every legacy encoding?

A. No detector can guarantee every legacy encoding from bytes alone. UTF-8, UTF-16, ASCII, and BOM-marked files are detected much more reliably than closely related single-byte encodings.

Q. What does BOM mean in a text file?

A. BOM means Byte Order Mark. It is a hidden byte sequence at the start of a file that can identify UTF-8 or UTF-16 variants and sometimes affects CSV headers or scripts.

Q. Why does my CSV look garbled after import?

A. The import tool may be reading the file with a different encoding than the one used to create it. Detect the source encoding, then import or convert with that encoding selected.

Q. Should I remove a UTF-8 BOM?

A. Remove it only if the target system mishandles it. Some Windows tools add BOM markers, while some scripts and CSV importers treat the marker as part of the first field name.
