Unicode Escape — Online \uXXXX Encoder & Decoder

🔒 Runs in your browser — nothing is sent to a server

This Unicode escape converter turns any text into JavaScript / Java / Python / C escape sequences and back in a single click. Pick a direction (encode or decode), choose an escape format (`\uXXXX`, `\u{XXXXX}`, `\UXXXXXXXX`, `\xXX`, or legacy `%uXXXX`) and a scope (only non-ASCII, every character, or just control codes), and the converter walks each Unicode code point and emits the matching escape. Surrogate pairs for emoji and rare scripts are produced and consumed automatically. Everything runs inside your browser: your input never leaves your device, and nothing is uploaded, logged or sent to any server.

\uXXXX uses surrogate pairs for code points above U+FFFF (e.g. emoji).


When to use a unicode escape converter

You need unicode escapes any time a string has to travel through a channel that does not handle the full Unicode range cleanly: embedding non-ASCII text into a JavaScript or Java source file, writing a literal that survives an ASCII-only transport, generating a JSON payload where a downstream parser is fussy about UTF-8, building a regex that needs explicit surrogate halves, or sanity-checking a log line that arrived already-escaped from another system. Running the conversion in a trustworthy, offline-first page means the original text, and any escapes it might reveal, never touches a server.

How unicode escape and unescape work

Encoding walks the input one Unicode code point at a time (note: not one UTF-16 code unit — a surrogate pair counts as one character). Each code point is checked against the chosen scope (non-ASCII only, all characters, or control codes only) and, if it should be escaped, formatted in the chosen escape form. For `\uXXXX` and `%uXXXX`, code points above U+FFFF are emitted as a surrogate pair; for `\u{XXXXX}` and `\UXXXXXXXX`, the entire code point goes into one escape. Decoding is a single regex pass that recognises all five forms simultaneously, parses the hex digits with `parseInt(..., 16)`, and emits the character via `String.fromCodePoint`.
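The encoding walk described above can be sketched roughly as follows. This is an illustration, not the tool's actual source: the function name `escapeNonAscii` and its `format` values (`"u4"`, `"braces"`) are hypothetical, and the scope is fixed to "non-ASCII only" to keep the sketch short.

```javascript
// Sketch only: escapeNonAscii and its "format" values are hypothetical names.
function escapeNonAscii(text, format = "u4") {
  const hex4 = (n) => n.toString(16).toUpperCase().padStart(4, "0");
  let out = "";
  // for...of iterates a string by code point, so a surrogate pair arrives as one item.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      out += ch; // ASCII is outside the "non-ASCII only" scope
    } else if (format === "braces") {
      out += "\\u{" + cp.toString(16).toUpperCase() + "}";
    } else {
      // "u4": one \uXXXX per UTF-16 code unit; ch.length is 2 above U+FFFF
      for (let i = 0; i < ch.length; i++) out += "\\u" + hex4(ch.charCodeAt(i));
    }
  }
  return out;
}

console.log(escapeNonAscii("café"));         // caf\u00E9
console.log(escapeNonAscii("😀"));           // \uD83D\uDE00
console.log(escapeNonAscii("😀", "braces")); // \u{1F600}
```

Reading the two code units straight out of the string avoids doing the surrogate arithmetic by hand, since JavaScript strings already hold supplementary characters as pairs.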

Examples

Escape unicode characters: Greek "Γεια" into \uXXXX
Input: Γεια
Output: \u0393\u03B5\u03B9\u03B1

Unicode escape with diacritic: café (non-ASCII only)
Input: café
Output: caf\u00E9

JavaScript unicode escape: emoji 😀 as surrogate pair
Input: 😀
Output: \uD83D\uDE00

\uXXXX converter: decode back into the original text
Input: \u0048\u0065\u006C\u006C\u006F\u002C\u4E16\u754C
Output: Hello,世界

Unicode unescape: ES6 \u{...} form for full code points
Input: \u{1F600}
Output: 😀

FAQ

How do I unicode-escape a string?

Pick the Encode mode at the top, paste the text into the input, choose a format (`\uXXXX` is the JavaScript / Java default) and a scope, then click Escape. The converter walks each Unicode code point in the input and emits the matching escape sequence. Decoding is the same flow in reverse — flip the toggle to Decode.

What is a \uXXXX escape?

`\uXXXX` is JavaScript and Java's escape for a single 16-bit code unit, where `XXXX` is four hexadecimal digits. Code points U+0000–U+FFFF (the Basic Multilingual Plane) fit in one `\uXXXX`. Code points above U+FFFF (most emoji, mathematical alphanumeric symbols, rare scripts) are written as a surrogate pair: two `\uXXXX` escapes, the first in the D800–DBFF range, the second in DC00–DFFF.

How do I escape an emoji like 😀?

In `\uXXXX` mode the emoji 😀 (code point U+1F600) becomes a surrogate pair: `\uD83D\uDE00`. JavaScript and Java parse those two halves as a single character because that is exactly how UTF-16 stores supplementary code points. In `\u{XXXXX}` mode (ES6+, supported in modern JS) the same emoji is the single escape `\u{1F600}`. Pick the format your target language understands.

What is the difference between \uXXXX and \u{XXXXX}?

`\uXXXX` is fixed-width — exactly four hex digits, supports U+0000 to U+FFFF in one escape, surrogate pairs for everything else. `\u{XXXXX}` (introduced in ES6) accepts 1–6 hex digits and represents the full Unicode range up to U+10FFFF in a single escape. Use `\u{...}` when you can; fall back to `\uXXXX` for older JavaScript engines, Java string literals, and JSON.
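A quick way to see that the two forms are interchangeable inside JavaScript, but not inside JSON, is a check like the one below. The variable names are illustrative only.

```javascript
// Both literals produce the same two-code-unit string.
const viaPair = "\uD83D\uDE00";
const viaBraces = "\u{1F600}";
console.log(viaPair === viaBraces); // true

// JSON defines only the four-digit form, so the pair parses...
console.log(JSON.parse('"\\uD83D\\uDE00"') === viaBraces); // true

// ...while the braces form is rejected as a syntax error.
let jsonRejectsBraces = false;
try {
  JSON.parse('"\\u{1F600}"');
} catch (err) {
  jsonRejectsBraces = err instanceof SyntaxError;
}
console.log(jsonRejectsBraces); // true
```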

When should I use \xXX vs \uXXXX?

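`\xXX` is a two-digit hex escape limited to values 0x00–0xFF, which as code points are the first 256 of Unicode (U+0000 to U+00FF). It is common in Python, C and shell scripts for non-printable bytes. Be aware that the meaning varies by language: in JavaScript and in Python `str` literals `\xE9` is the code point U+00E9, while in C and in Python `bytes` literals it is the raw byte 0xE9, and C does not stop reading hex digits after two. Anything above U+00FF needs `\uXXXX` (BMP) or `\u{XXXXX}` (full range). The encoder throws a clear error if you ask for `\xXX` on a code point that does not fit.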
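The range check could look something like this minimal sketch. The helper name `toHexEscape` and the error wording are assumptions made for illustration; the tool's real message may differ.

```javascript
// Hypothetical helper: format one character as \xXX, or refuse if it does not fit.
function toHexEscape(ch) {
  const cp = ch.codePointAt(0);
  if (cp > 0xff) {
    const label = "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
    throw new RangeError(label + " does not fit in a \\xXX escape");
  }
  return "\\x" + cp.toString(16).toUpperCase().padStart(2, "0");
}

console.log(toHexEscape("é"));  // \xE9
console.log(toHexEscape("\n")); // \x0A
```

Calling `toHexEscape("世")` throws a `RangeError`, because U+4E16 is above U+00FF.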

What does the "non-ASCII only" scope do?

It escapes only code points ≥ U+0080 — so plain English text passes through unchanged and only diacritics, CJK glyphs, Arabic, Greek, Cyrillic and emoji are turned into escape sequences. This is the most common need: making a string safe to paste into source code or JSON without losing any characters. Pick "All characters" for fully escaped output.

What is the "Control characters only" scope for?

It escapes just the C0 control codes (`\x00`–`\x1F`) and `\x7F` (DEL). That covers the control-character part of what JSON serialisers and log formatters need: a string with raw newlines, tabs and NUL characters is unsafe to embed, but printable text, including non-ASCII, can stay unescaped. (JSON string literals additionally require `"` and `\` to be escaped, which this scope does not touch.) Pick this scope when you want minimal noise but still need to neutralise control characters.
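The three scopes boil down to three predicates over a code point. The sketch below uses hypothetical names (`SCOPES`, `selectForEscaping`) and only reports which characters each scope would pick, without formatting any escapes.

```javascript
// Hypothetical scope table: each entry decides whether a code point gets escaped.
const SCOPES = {
  "non-ascii": (cp) => cp >= 0x80,
  "all": () => true,
  "control": (cp) => cp <= 0x1f || cp === 0x7f,
};

// Return the characters of `text` that the given scope would escape.
function selectForEscaping(text, scope) {
  const shouldEscape = SCOPES[scope];
  return [...text].filter((ch) => shouldEscape(ch.codePointAt(0)));
}

const sample = "a\tb\u00E9"; // "a", TAB, "b", "é"
console.log(selectForEscaping(sample, "non-ascii"));  // [ 'é' ]
console.log(selectForEscaping(sample, "control"));    // [ '\t' ]
console.log(selectForEscaping(sample, "all").length); // 4
```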

Can the decoder handle every escape format?

Yes. The decoder auto-detects the format: it scans the input for `\uXXXX`, `\u{XXXXX}`, `\UXXXXXXXX`, `\xXX` and `%uXXXX` simultaneously, decodes each match, and combines surrogate-pair `\uXXXX` halves back into the original supplementary code point automatically. Anything that is not a well-formed escape passes through unchanged, so partial input is never silently dropped.
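A single-pass decoder of this kind might be sketched as below. The regex and the name `unescapeUnicode` are assumptions for illustration; the tool's own pattern is not shown in this page and may differ in detail, for example in how it treats out-of-range values.

```javascript
// One alternation per escape form; each has its own capture group.
const ESCAPE_RE =
  /\\u\{([0-9a-fA-F]{1,6})\}|\\U([0-9a-fA-F]{8})|\\u([0-9a-fA-F]{4})|\\x([0-9a-fA-F]{2})|%u([0-9a-fA-F]{4})/g;

function unescapeUnicode(text) {
  return text.replace(ESCAPE_RE, (match, braces, big, u4, x2, pct) => {
    // Exactly one group is defined per match; the rest are undefined.
    const value = parseInt(braces ?? big ?? u4 ?? x2 ?? pct, 16);
    if (value > 0x10ffff) return match; // not a code point: leave the text as-is
    // A lone surrogate becomes one code unit; two adjacent halves then
    // form a valid pair simply by sitting next to each other in the string.
    return String.fromCodePoint(value);
  });
}

console.log(unescapeUnicode("\\u0048\\u0065\\u006C\\u006C\\u006F\\u002C\\u4E16\\u754C")); // Hello,世界
console.log(unescapeUnicode("\\uD83D\\uDE00") === unescapeUnicode("\\u{1F600}"));        // true
console.log(unescapeUnicode("broken \\u12 stays"));                                      // broken \u12 stays
```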

Glossary

Unicode code point

A Unicode code point is the integer that uniquely identifies a character in the Unicode standard, written `U+` followed by 4 to 6 hex digits. The range runs from U+0000 to U+10FFFF, 1,114,112 slots in total, of which roughly 150,000 are currently assigned. A unicode escape is just a textual representation of one code point (or, for `\uXXXX`, one UTF-16 code unit) in source code.
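To make the `U+` notation concrete, here is a small sketch that lists the code points of a string. Both helper names are hypothetical.

```javascript
// Hypothetical helper: format a code point in U+ notation (at least four hex digits).
function formatCodePoint(cp) {
  return "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
}

// Spreading a string iterates by code point, so astral characters stay whole.
function listCodePoints(text) {
  return [...text].map((ch) => formatCodePoint(ch.codePointAt(0)));
}

console.log(listCodePoints("Γεια")); // [ 'U+0393', 'U+03B5', 'U+03B9', 'U+03B1' ]
console.log(listCodePoints("a😀"));  // [ 'U+0061', 'U+1F600' ]
console.log("a😀".length);           // 3, because length counts UTF-16 code units
```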

\uXXXX escape

`\uXXXX` is the four-hex-digit Unicode escape used in JavaScript, Java, JSON and C#. It addresses one UTF-16 code unit (U+0000 to U+FFFF). For code points above U+FFFF the source must use a surrogate pair — two `\uXXXX` escapes — because that is how UTF-16 represents supplementary characters. A `\uXXXX` converter is any tool that reads or writes this form.

\u{XXXXX} (ES6 escape)

`\u{XXXXX}` is the variable-width Unicode escape introduced in ECMAScript 2015. It accepts 1–6 hex digits between the curly braces and addresses any Unicode code point up to U+10FFFF in a single escape, no surrogate pair required. Use it whenever your runtime supports it; it is supported by every modern browser and Node.js since 6.

Surrogate pair

A surrogate pair is two consecutive 16-bit code units that together encode one supplementary Unicode code point (U+10000 to U+10FFFF). The first unit lies in U+D800–U+DBFF (high surrogate), the second in U+DC00–U+DFFF (low surrogate). UTF-16 strings, which is what JavaScript uses internally, store every supplementary (astral) character as a surrogate pair; that includes most emoji, although a few older ones such as ☺ (U+263A) sit in the Basic Multilingual Plane and need only one code unit.
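The arithmetic behind a surrogate pair is the standard UTF-16 split of a 20-bit offset into two 10-bit halves. The function names below are illustrative only.

```javascript
// Split a supplementary code point into its high and low surrogate.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("not a supplementary code point");
  }
  const offset = codePoint - 0x10000;    // 20-bit value
  const high = 0xd800 + (offset >> 10);  // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff); // bottom 10 bits
  return [high, low];
}

// Recombine a high/low surrogate into the original code point.
function fromSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

const [hi, lo] = toSurrogatePair(0x1f600);
console.log(hi.toString(16), lo.toString(16));      // d83d de00
console.log(fromSurrogatePair(hi, lo) === 0x1f600); // true
```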

Unicode unescape

Unicode unescape is the reverse of unicode escape — converting `\uXXXX`, `\u{XXXXX}`, `\UXXXXXXXX`, `\xXX` or `%uXXXX` back into the literal Unicode characters they represent. A robust unescape function is format-agnostic: it scans for any of the common forms, parses the hex digits as a code point, and combines surrogate-pair halves into single characters.

JavaScript unicode escape

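JavaScript string literals accept three escape forms that name a character by number: `\xXX` (two hex digits, U+0000 to U+00FF), `\uXXXX` (always available), and `\u{XXXXX}` (ES2015+, also valid in template literals and in regular expressions with the `u` flag). `String.raw`-tagged templates deliberately leave all of them unprocessed. The legacy `%uXXXX` form is not literal syntax at all; it is the output of the deprecated `escape()` global and is decoded by `unescape()`. Modern code should write `\u{...}` where possible; the other forms exist for compatibility with older engines, JSON, and legacy URL-encoding routines that still emit `%uXXXX`.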
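The behaviour of the legacy `%uXXXX` form, and of `String.raw`, can be checked directly in a browser console or Node.js. This is a demonstration of built-in functions, not of the converter itself.

```javascript
// escape()/unescape() are deprecated but still present in browsers and Node.js.
console.log(escape("世"));             // %u4E16
console.log(escape("é"));              // %E9
console.log(unescape("%u4E16"));       // 世
console.log(encodeURIComponent("世")); // %E4%B8%96 (UTF-8 percent-encoding, the modern form)

// String.raw keeps the escape as literal text instead of decoding it.
const raw = String.raw`\u{1F600}`;
console.log(raw.length);                     // 9
console.log(`\u{1F600}` === "\uD83D\uDE00"); // true
```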
