HTML Decode — Online HTML Entity Decoder

🔒 Runs in your browser — nothing is sent to a server

HTML decode any string containing entities back into its original plain-text characters in a single click. Paste content with escapes like `&amp;`, `&lt;`, `&gt;`, `&quot;`, `&#39;` or hex forms like `&#x27;` (exactly what you get from a scraped page, a log line, a CMS export or a database field) and this HTML decoder resolves every entity it recognises to the character it represents. Named entities, decimal numeric references and hex numeric references are all handled in one pass. Everything runs inside your browser: your input never leaves your device, and nothing is uploaded, logged or sent to a server.

Two-pane view: input and output side by side

When to use an HTML decoder

You reach for an HTML decoder whenever data that was once embedded in HTML needs to be treated as plain text again: a CMS export that stored comment bodies in pre-escaped form, an RSS feed that arrives with entities inside `<title>` and `<description>`, a scraper that collected `innerHTML` instead of `textContent`, a webhook payload where a partner re-encoded your strings for transport, or a log line that double-escaped user input. Running the value through a trustworthy, offline-first decoder is the fastest way to get back the literal characters — no copy-pasting through a remote service that might log what you paste.

How HTML decoding works

HTML decoding is a single pass over the input string with one regex. The decoder looks for `&name;`, `&#decimal;` or `&#xhex;` patterns; each match is classified by what follows the ampersand: `#x` marks a hex reference, `#` alone a decimal reference, and a letter a named entity. Named matches are looked up in the entity table (`amp` → `&`, `lt` → `<`, and so on). Decimal matches parse the digits as base 10, and hex matches parse the digits after `&#x` as base 16. The resulting integer is converted via `String.fromCodePoint`, which correctly emits surrogate pairs for characters above U+FFFF such as emoji. Malformed or unknown sequences pass through unchanged, a safer default than silently dropping them.
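The pass described above can be sketched roughly as follows. This is a minimal illustration, not this page's actual source: the names `NAMED`, `ENTITY` and `decodeHtml` are hypothetical, and the entity table is deliberately cut down to a handful of entries.

```javascript
// Hypothetical, deliberately tiny entity table (a real table has many more entries).
const NAMED = new Map([
  ["amp", "&"], ["lt", "<"], ["gt", ">"],
  ["quot", '"'], ["apos", "'"], ["nbsp", "\u00A0"],
]);

// One regex covers all three forms: &#xHEX; | &#DEC; | &name;
const ENTITY = /&(?:#[xX]([0-9a-fA-F]+)|#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));/g;

function decodeHtml(input) {
  return input.replace(ENTITY, (match, hex, dec, name) => {
    if (name !== undefined) {
      // Named entity: table lookup; unknown names pass through unchanged.
      return NAMED.has(name) ? NAMED.get(name) : match;
    }
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // String.fromCodePoint throws above U+10FFFF, so leave such references as-is.
    if (codePoint > 0x10ffff) return match;
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeHtml("&lt;a title=&quot;Paul&#39;s blog&quot;&gt;Read&lt;/a&gt;"));
// <a title="Paul's blog">Read</a>
```

Returning `match` for anything the sketch does not recognise is what keeps unknown or out-of-range sequences visible in the output instead of dropping them.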

Examples

HTML decode tags and body text
Input
&lt;div&gt;Hello &amp; welcome&lt;/div&gt;
Output
<div>Hello & welcome</div>

Decode HTML entities inside an attribute
Input
&lt;a title=&quot;Paul&#39;s blog&quot;&gt;Read&lt;/a&gt;
Output
<a title="Paul's blog">Read</a>

HTML unescape a query string inside a link
Input
&lt;a href=&quot;/search?q=cats&amp;sort=new&quot;&gt;Cats&lt;/a&gt;
Output
<a href="/search?q=cats&sort=new">Cats</a>

Convert numeric HTML entities to characters
Input
Caf&#233; &#x2014; Z&#xfc;rich &#128512;
Output
Café — Zürich 😀

FAQ

How do I HTML decode a string?

Paste the encoded string into the input box above. The tool walks every entity (named like `&amp;`, decimal numeric like `&#39;`, or hex numeric like `&#x27;`) and substitutes the character it represents. No upload, no account. Programmatically, a comparable result comes from `const el = document.createElement("textarea"); el.innerHTML = s; return el.value;` in the browser, or `html.unescape(s)` in Python.

What is HTML decoding (HTML entity decoding)?

HTML decoding is the inverse of HTML encoding. It reads a string that contains HTML entities — tokens like `&amp;`, `&lt;`, `&gt;`, `&quot;`, `&#39;`, `&#x26;` — and replaces each one with the Unicode character it stands for. After decoding, you get back the original text exactly as it existed before the encoding step.

Which HTML entities does this decoder understand?

The decoder resolves the core named entities used by HTML: `&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`, `&nbsp;`, `&copy;`, `&reg;`, `&trade;`, `&hellip;`, `&mdash;`, `&ndash;`, `&lsquo;`, `&rsquo;`, `&ldquo;`, `&rdquo;`. On top of that it handles every decimal numeric reference (`&#NN;`) and every hex numeric reference (`&#xHH;`), which together cover the full Unicode range including diacritics, CJK and emoji.

How do I decode numeric HTML entities like &#39; or &#x27;?

Numeric entities point directly at a Unicode code point — `&#39;` is decimal 39 and `&#x27;` is hex 0x27, both the apostrophe `'`. The decoder parses digits after `&#` as decimal, or after `&#x` as hex, and emits `String.fromCodePoint` of that value. Paste the string and every numeric reference resolves in one pass.
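As a small worked example of that parsing step, the sketch below shows that the decimal and hex forms land on the same code point. The helper name `decodeNumericEntities` is hypothetical and handles numeric references only; named entities are left alone.

```javascript
// Both spellings of the apostrophe parse to the same integer.
const fromDecimal = parseInt("39", 10); // digits after "&#"
const fromHex = parseInt("27", 16);     // digits after "&#x"
console.log(fromDecimal === fromHex);   // true

// Hypothetical helper: resolves numeric references only.
function decodeNumericEntities(input) {
  return input.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Values above U+10FFFF are not valid code points; keep the original text.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

console.log(decodeNumericEntities("Paul&#39;s and Paul&#x27;s")); // Paul's and Paul's
```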

What is the difference between HTML decode and HTML unescape?

"HTML decode" and "HTML unescape" are two names for the same operation — converting HTML entities back into their literal characters. Some libraries expose the function as `decode`, others as `unescape`, `unescapeHtml` or `html.unescape`. The behaviour is identical: read entities, write characters, leave anything that is not a well-formed entity alone.

Why would encoded HTML appear in my data?

HTML entities leak into logs and exports whenever HTML text is re-serialised: a CMS that stores comments pre-escaped, a scraper that reads `innerHTML` instead of `textContent`, an RSS feed, a JSON API that double-encodes user content, or a database column filled by form handlers. Running the value through an HTML decoder restores the text a human expects to see.
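Double-encoded values need more than one pass. The sketch below illustrates this under some assumptions: `decodeOnce` and `decodeRepeatedly` are hypothetical names, the entity table is a small subset, and the `maxPasses` cap is an arbitrary safety limit.

```javascript
// Hypothetical minimal table, enough to show the double-encoding effect.
const NAMED = new Map([["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"]]);

function decodeOnce(input) {
  return input.replace(/&(?:#[xX]([0-9a-fA-F]+)|#([0-9]+)|([a-zA-Z]+));/g, (match, hex, dec, name) => {
    if (name !== undefined) return NAMED.has(name) ? NAMED.get(name) : match;
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// Decode until the string stops changing, or until maxPasses is reached.
function decodeRepeatedly(input, maxPasses = 3) {
  let current = input;
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = decodeOnce(current);
    if (next === current) break;
    current = next;
  }
  return current;
}

console.log(decodeOnce("&amp;lt;b&amp;gt;"));       // &lt;b&gt;
console.log(decodeRepeatedly("&amp;lt;b&amp;gt;")); // <b>
```

Repeated decoding is only appropriate when you know the data was encoded more than once; text that legitimately contained a literal `&lt;` would otherwise be decoded one step too far.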

Does this HTML decoder handle UTF-8, diacritics and emoji?

Yes. Numeric entities can reference any Unicode code point, so `&#233;` decodes to `é`, `&#xfc;` to `ü` and `&#128512;` to 😀. The decoder uses `String.fromCodePoint`, which handles characters outside the Basic Multilingual Plane — including emoji and CJK supplementary ideographs — without the surrogate-pair bugs older implementations had.
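To make the surrogate-pair point concrete, here is a short comparison of `String.fromCodePoint` with the older `String.fromCharCode`, using the emoji from the answer above. The variable names are illustrative only.

```javascript
const codePoint = 128512; // U+1F600, outside the Basic Multilingual Plane

// fromCodePoint emits the correct surrogate pair (two UTF-16 code units).
const viaCodePoint = String.fromCodePoint(codePoint);
console.log(viaCodePoint === "\uD83D\uDE00"); // true
console.log(viaCodePoint.length);             // 2

// fromCharCode truncates its argument to 16 bits, yielding U+F600 instead.
const viaCharCode = String.fromCharCode(codePoint);
console.log(viaCharCode === "\uF600");        // true
```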

Glossary

HTML entity

An HTML entity is a short token in the form `&name;`, `&#number;` or `&#xhex;` that stands in for a Unicode character inside HTML markup. Named forms like `&amp;`, `&lt;`, `&gt;`, `&quot;` cover the markup-meaningful characters, along with many common symbols; numeric forms address any code point by decimal or hex. An HTML decoder is the tool that walks these tokens and emits the character each one represents.

HTML decoder

An HTML decoder is any function or tool that accepts text containing HTML entities and returns the plain-character version. It scans the string for `&…;` sequences, classifies each as named, decimal numeric or hex numeric, and substitutes the referenced character. Well-behaved HTML decoders leave unknown or malformed sequences untouched instead of silently dropping them. This page is an HTML decoder that runs fully client-side.

HTML unescape

HTML unescape is the informal name for HTML decoding: the act of converting `&amp;`, `&lt;`, `&gt;`, `&quot;`, `&#39;` and every numeric reference back into the literal characters they stand for. Standard library helpers named `unescape`, `unescapeHtml` or `html.unescape` perform this operation. The output is meant to be handled as plain text; it may contain literal `<` and `>` characters, so it should be encoded again before being inserted into a page as markup.

Named vs numeric entities

Named entities like `&amp;`, `&quot;`, `&nbsp;` are readable aliases defined by the HTML spec: about 250 in HTML 4, and more than 2,000 in the current HTML standard. Numeric entities like `&#38;` (decimal) and `&#x26;` (hex) point directly at a Unicode code point and can represent any character, including emoji and CJK glyphs. A robust HTML decoder resolves both forms; named entities are looked up in a table, numeric entities are parsed and passed to `String.fromCodePoint`.

Encoding vs decoding direction

HTML encoding and HTML decoding are mirror operations. Encoding takes `<div>Tom & Jerry</div>` and produces `&lt;div&gt;Tom &amp; Jerry&lt;/div&gt;` so it can safely sit inside HTML source. Decoding reverses that — read the entities, write back the characters — so the data is readable again. You encode on the way into an HTML document and decode on the way out of one.
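The mirror relationship can be checked with a round trip. The sketch below assumes a minimal encoder that escapes only the five markup-meaningful characters; `encodeHtml` and `decodeBasic` are hypothetical names, and `decodeBasic` reverses exactly those five escapes and nothing else.

```javascript
// Hypothetical minimal encoder: escapes only the five markup-meaningful characters.
const ESCAPES = new Map([
  ["&", "&amp;"], ["<", "&lt;"], [">", "&gt;"], ['"', "&quot;"], ["'", "&#39;"],
]);

function encodeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => ESCAPES.get(ch));
}

// Inverse table, built by swapping each pair of ESCAPES.
const UNESCAPES = new Map([...ESCAPES].map(([ch, entity]) => [entity, ch]));

function decodeBasic(html) {
  return html.replace(/&(?:amp|lt|gt|quot|#39);/g, (entity) => UNESCAPES.get(entity));
}

const original = "<div>Tom & Jerry</div>";
const encoded = encodeHtml(original);
console.log(encoded);                           // &lt;div&gt;Tom &amp; Jerry&lt;/div&gt;
console.log(decodeBasic(encoded) === original); // true
```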
