HTML Entity Encoder/Decoder

Encode special characters to HTML entities or decode them back to plain text.

About This Tool

Paste raw text in one pane and read the HTML-entity-encoded version in the other. Reverse the direction by toggling decode mode, which converts entities (named like `&amp;` and numeric like `&#34;`) back to their literal characters.

Use it when escaping content for safe embedding in HTML, debugging double-encoded output (`&amp;amp;` is the smell), or pulling apart what a CMS did to your text on save.

The encoder handles the five HTML-significant characters (`&`, `<`, `>`, `"`, `'`) by default, and a strict mode also encodes every non-ASCII character as a numeric entity. The decoder accepts both named and numeric entities, including hex (`&#x4F;`).

HTML entities exist because some characters (`<`, `>`, `&`) have structural meaning in HTML, so using them as literal text requires escaping them. The encoder substitutes named or numeric entities for these characters: `<` becomes `&lt;`, `>` becomes `&gt;`, `&` becomes `&amp;`, `"` becomes `&quot;`, `'` becomes `&#39;` (or `&apos;` in XML-strict mode). The decoder runs the reverse, accepting both named (`&copy;`) and numeric (`&#169;`, `&#xA9;`) forms.
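The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: `ENTITY_MAP`, `encode_minimal`, and `decode` are hypothetical names, and the decoder simply delegates to the standard library's `html.unescape`.

```python
import html

# Assumed mapping, copied from the paragraph above; the tool's real table may differ.
ENTITY_MAP = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
}


def encode_minimal(text: str, xml_strict: bool = False) -> str:
    """Replace the five HTML-significant characters with entities."""
    table = dict(ENTITY_MAP)
    if xml_strict:
        table["'"] = "&apos;"
    # Each input character is visited once, so the '&' inside a freshly
    # generated entity is never encoded a second time.
    return "".join(table.get(ch, ch) for ch in text)


def decode(text: str) -> str:
    """Decode named, decimal, and hex entities."""
    return html.unescape(text)


print(encode_minimal("<p>Tom & Jerry</p>"))  # &lt;p&gt;Tom &amp; Jerry&lt;/p&gt;
print(decode("&copy; &#169; &#xA9;"))        # © © ©
```

Building the output character by character, rather than chaining `str.replace` calls, avoids the ordering bug where `&` is replaced after the other characters and corrupts the entities just produced.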

Worked example. Input: `<p>Tom &amp; Jerry</p>` after typical CMS encoding. Decoded: `<p>Tom & Jerry</p>`. Encoded again (every HTML-significant character escaped): `&lt;p&gt;Tom &amp; Jerry&lt;/p&gt;`. The double-encoding gotcha: input `Tom &amp; Jerry` (already encoded once) re-encoded becomes `Tom &amp;amp; Jerry`, which then decodes to `Tom &amp; Jerry` (still encoded). Recognize the smell: `&amp;amp;` in the page source, or a literal `&amp;` showing in the rendered page, means double-encoding happened somewhere upstream and the data needs to be decoded once more.
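One way to spot double-encoding programmatically is to decode repeatedly and count how many passes actually change the text. The sketch below assumes Python's `html.unescape` as the decoder; `decode_fully` and `max_rounds` are hypothetical names chosen for illustration.

```python
import html


def decode_fully(text: str, max_rounds: int = 5) -> tuple[str, int]:
    """Decode until the text stops changing.

    Returns the final text and the number of passes that changed it.
    A count of 2 or more suggests the input was encoded more than once.
    """
    rounds = 0
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
        rounds += 1
    return text, rounds


print(decode_fully("Tom &amp;amp; Jerry"))  # ('Tom & Jerry', 2)
print(decode_fully("Tom &amp; Jerry"))      # ('Tom & Jerry', 1)
```

Treat the count as a diagnostic rather than an automatic fix: text that legitimately discusses entities (for example, documentation showing `&amp;` as literal characters) would be over-decoded by this loop.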

When to use which form. For HTML body text where you want literal characters: minimal entities (the five above) are sufficient. For HTML attribute values (especially within quoted attributes): also escape the quote character matching the attribute's quote style. For email templates rendered in legacy clients: use named entities for non-ASCII characters because some clients lose UTF-8 encoding metadata. For modern web HTML with proper UTF-8 declared: don't entity-encode non-ASCII characters at all — the bytes go through fine and entities just bloat the output.
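For the legacy-email case, a rough sketch of named-entity output for non-ASCII characters might look like the following. It assumes Python's `html.entities.codepoint2name` table, which covers only the HTML 4 names, and falls back to a decimal numeric entity when no name exists. The function name `encode_non_ascii_named` is hypothetical.

```python
from html.entities import codepoint2name


def encode_non_ascii_named(text: str) -> str:
    """Encode non-ASCII characters as named entities where a name exists.

    ASCII characters pass through untouched, so the five HTML-significant
    characters still need to be escaped separately.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            # No HTML 4 name for this code point: fall back to decimal.
            out.append(f"&#{cp};")
    return "".join(out)


print(encode_non_ascii_named("Café © 2024 ☃"))  # Caf&eacute; &copy; 2024 &#9731;
```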

The XSS angle. HTML entity encoding is a standard defense against cross-site scripting (XSS) when reflecting user input into HTML. Encoding `<` as `&lt;` prevents user input from being parsed as a tag. The encoder is a useful tool, but for production XSS protection rely on your framework's built-in output escaping (React JSX, Handlebars double-stash `{{ }}` rather than the unescaped triple-stash `{{{ }}}`, Jinja autoescape). These escape at every output point automatically, and safe output also depends on context (HTML body, HTML attribute, URL, JavaScript), which this generic encoder doesn't distinguish.
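A small illustration of both halves of that point, using only Python's standard `html.escape`. The `render_comment` helper is hypothetical and stands in for whatever a real templating layer would do.

```python
import html


def render_comment(user_input: str) -> str:
    """Embed untrusted text in an HTML body context."""
    # html.escape covers &, <, > and, with the default quote=True, " and '.
    return f"<p>{html.escape(user_input)}</p>"


# HTML-body context: the tag is neutralized.
print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

# URL context: entity encoding changes nothing, so the dangerous scheme survives.
print(html.escape("javascript:alert(1)"))
# javascript:alert(1)
```

The second call is why entity encoding alone is not enough for attributes such as `href`: the value contains no HTML-significant characters, so it passes through unchanged and needs URL-scheme validation instead.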

Numeric vs named entities. Named entities are more readable (`&copy;` vs `&#169;`) but limited to a predefined list (about 250 in HTML 4, expanded to more than 2,000 in HTML5). Numeric entities work for any Unicode code point but are less readable. For characters not in the named list, decimal (`&#NNN;`) and hex (`&#xHHH;`) forms are equivalent, and most decoders handle both. The tool emits decimal numeric entities by default because they're more widely supported in older parsers.
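The equivalence of the three forms is easy to check. The sketch below is an assumption about how a decimal-by-default numeric encoder could look, not the tool's own code; `to_numeric` and `use_hex` are hypothetical names.

```python
import html


def to_numeric(text: str, use_hex: bool = False) -> str:
    """Encode every non-ASCII character as a numeric entity (decimal by default)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif use_hex:
            out.append(f"&#x{cp:X};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)


print(to_numeric("© 2024"))                # &#169; 2024
print(to_numeric("© 2024", use_hex=True))  # &#xA9; 2024

# Named, decimal, and hex forms all decode to the same character.
print(html.unescape("&copy; &#169; &#xA9;"))  # © © ©
```

For the decimal case, Python's built-in `"xmlcharrefreplace"` error handler gives the same result in one line: `"©".encode("ascii", "xmlcharrefreplace").decode("ascii")`.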

The about text and FAQ on this page were drafted with AI assistance and reviewed by a member of the Coherence Daddy team before publishing. See our Content Policy for editorial standards.

Frequently Asked Questions