Unicode Escape & Unescape — \uXXXX Sequences

Guide

About Unicode Escape

Escape non-ASCII characters as \uXXXX sequences (UTF-16, the JavaScript / JSON form) or \xXX bytes, and decode such sequences back into text. Useful when debugging encoding issues, inspecting strings in logs, and porting strings between languages with different string conventions.

What this fixes

Encoding bugs are subtle. A backend writes UTF-8, a middleware reinterprets the bytes as Latin-1, and the front end displays mojibake. Escaping records the codepoints as plain ASCII, which passes through even a broken transcoding step unchanged.
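The failure described above can be reproduced in a few lines. This is a minimal sketch, assuming the middleware decodes UTF-8 bytes as Latin-1; the variable names are illustrative, and json.dumps stands in for the escaper because it emits \uXXXX for non-ASCII characters by default.

```python
import json

original = "caf\u00e9"  # "café"

# Simulate the bug: UTF-8 bytes reinterpreted as Latin-1.
mojibake = original.encode("utf-8").decode("latin-1")

# Escape first; the result contains only ASCII characters.
escaped = json.dumps(original)

# The same faulty transcoding leaves pure ASCII untouched.
survived = escaped.encode("utf-8").decode("latin-1")
recovered = json.loads(survived)

print(mojibake)   # cafÃ©
print(escaped)    # "caf\u00e9"
print(recovered)  # café
```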

Output forms

\u00E9: JavaScript / JSON / Java legacy (4-digit BMP codepoint)
\u{1F600}: JavaScript ES2015 (any codepoint, no surrogates)
\xE9: single-byte hex (Python bytes literal)
\U0001F600: Python long form (8-digit codepoint)
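As a rough illustration of how the four forms relate, the sketch below derives each one from a single character. The helper name escape_forms and its dictionary keys are hypothetical, and it assumes the \xXX form is only offered when the codepoint fits in one byte; the tool itself may behave differently.

```python
def escape_forms(ch: str) -> dict[str, str]:
    """Return the escape forms for one character (hypothetical helper)."""
    cp = ord(ch)
    # UTF-16 code units, big-endian, two bytes each. Characters above
    # U+FFFF come out as two units (a surrogate pair).
    raw = ch.encode("utf-16-be")
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    forms = {
        "u4": "".join(f"\\u{u:04X}" for u in units),
        "es2015": f"\\u{{{cp:X}}}",
        "python_long": f"\\U{cp:08X}",
    }
    if cp <= 0xFF:
        # Assumption: \xXX is only produced when the codepoint fits in one byte.
        forms["x2"] = f"\\x{cp:02X}"
    return forms


print(escape_forms("\u00e9")["u4"])      # \u00E9
print(escape_forms("\U0001F600")["u4"])  # \uD83D\uDE00
```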

Common workflows

Make a non-ASCII string survive a Latin-1 logger. Escape before logging; decode the lines later when investigating.

Embed a unicode char in a source file under a restricted charset. Generate the escape sequence, paste in source.

Inspect a string from a log. Paste escaped text, see what the codepoints actually represent.

Transcribe between languages. A \u00E9 in JavaScript is also \u00E9 in Java, but Python may use a different form, such as \xe9 or, for codepoints above U+FFFF, \U0001F600. The tool helps translate.
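For the log-inspection workflow, a small script can do the same job offline. The sketch below assumes the pasted text uses JSON-style \uXXXX escapes and contains no unescaped double quotes or control characters; describe is a hypothetical helper name.

```python
import json
import unicodedata


def describe(escaped: str) -> list[str]:
    """List each decoded character as 'U+XXXX NAME'."""
    # json.loads understands \uXXXX escapes, including surrogate pairs.
    text = json.loads(f'"{escaped}"')
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


for line in describe(r"\u00e9\ud83d\ude00"):
    print(line)
# U+00E9 LATIN SMALL LETTER E WITH ACUTE
# U+1F600 GRINNING FACE
```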

Frequently asked questions

\u or \x?

\u is 4 hex digits — covers any 16-bit codepoint (the BMP, most characters). \x is 2 hex digits — covers a single byte. For codepoints above U+FFFF (emoji, rare CJK), JavaScript uses surrogate pairs (two \u escapes); Python uses \U with 8 digits.
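Python's own string literals are a quick way to check these widths. The snippet below is only an illustration; the variable names are arbitrary.

```python
e_x = "\xe9"        # 2 hex digits
e_u = "\u00e9"      # 4 hex digits
e_U = "\U000000e9"  # 8 hex digits
same = e_x == e_u == e_U  # all three spell U+00E9

emoji = "\U0001F600"  # one codepoint above U+FFFF
code_points = len(emoji)                           # 1 in Python
utf16_units = len(emoji.encode("utf-16-le")) // 2  # 2, the surrogate pair JavaScript sees

raw_byte = b"\xe9"  # in a bytes literal, \xE9 is the single byte 0xE9

print(same, code_points, utf16_units, len(raw_byte))  # True 1 2 1
```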

Why escape?

Some systems mishandle UTF-8 — log aggregators that store text as Latin-1, JSON consumers that crash on raw non-ASCII, source files that need to compile under restricted charsets. Escaping flattens to ASCII while preserving meaning.

Surrogate pairs?

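Codepoints U+10000 to U+10FFFF (emoji, CJK extensions) are split into two 16-bit code units in UTF-16: a high surrogate followed by a low surrogate. JavaScript strings store these code units, so \u escapes follow suit; U+1F600, for example, becomes \uD83D\uDE00. Toggle to ES2015 \u{...} to write the codepoint directly instead.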
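The split is plain arithmetic defined by UTF-16: subtract 0x10000, then put the top 10 bits on 0xD800 and the bottom 10 bits on 0xDC00. A minimal sketch, with hypothetical function names:

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a codepoint above U+FFFF into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only codepoints U+10000..U+10FFFF use a surrogate pair")
    v = cp - 0x10000  # 20-bit value
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into one codepoint."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F600)
print(f"\\u{high:04X}\\u{low:04X}")  # \uD83D\uDE00
```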

Round-trip?

Yes. Escape → unescape → original.
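A round trip only holds if the escaper also escapes the backslash itself; otherwise input that already contains text like \u0041 would be decoded into something else. The sketch below shows this with Python's built-in unicode_escape codec, which emits Python-style forms (\xe9, \U0001f600) rather than this tool's output; the function names are illustrative, and unescape assumes ASCII-only input.

```python
def escape(text: str) -> str:
    # unicode_escape also doubles backslashes and escapes newlines and tabs.
    return text.encode("unicode_escape").decode("ascii")


def unescape(text: str) -> str:
    # Raises UnicodeEncodeError if the escaped text is not pure ASCII.
    return text.encode("ascii").decode("unicode_escape")


# "café", an emoji, and a literal backslash-u sequence that must survive.
original = "caf\u00e9 \U0001F600 \\u0041"
escaped = escape(original)

print(escaped)                        # caf\xe9 \U0001f600 \\u0041
print(unescape(escaped) == original)  # True
```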

Is the input sent anywhere?

No — runs locally.

Will it touch ASCII?

Not by default: ASCII characters pass through unchanged. The Escape all non-ASCII option escapes only chars > 127; Escape all escapes every char, ASCII included.
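The difference between the two options comes down to one condition. A minimal sketch, assuming the ES2015 \u{...} output form to keep it short; the function name and the escape_all parameter are hypothetical and do not reflect the tool's internals.

```python
def escape(text: str, escape_all: bool = False) -> str:
    """Escape chars above 127, or every char when escape_all is True."""
    return "".join(
        f"\\u{{{ord(ch):X}}}" if escape_all or ord(ch) > 127 else ch
        for ch in text
    )


print(escape("a\u00e9"))                   # a\u{E9}
print(escape("a\u00e9", escape_all=True))  # \u{61}\u{E9}
```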

Last updated: 2025-01-15