HTML Entity Encoder
Encode and decode HTML entities for safe rendering
About This Tool
Encodes characters as HTML entities and decodes entity references back to plain text. Supports named references (&amp;, &lt;, &gt;, &quot;, &apos;), numeric decimal (&#38;), and hexadecimal (&#x26;) forms.
Useful for embedding raw HTML inside attributes, code samples, or user-generated content where reserved characters would otherwise be parsed as markup.
The HTML5 specification defines more than 2,200 named character references, the vast majority of which are mathematical and typographic symbols (Greek letters, em-dash, copyright sign, etc.). The five core references are &amp; for ampersand, &lt; and &gt; for angle brackets, &quot; for double quote, and &apos; for apostrophe (the last added in HTML5; older HTML4 documents used &#39; instead). Numeric character references use either decimal (&#38; for the ampersand at code point 38) or hexadecimal (&#x26;) syntax and can encode any Unicode code point; named references are limited to the predefined set.
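The relationship between a character's code point and its two numeric forms can be sketched in a few lines of Python. This is an illustration only, not this tool's implementation; the helper names to_decimal_ref and to_hex_ref are hypothetical, while html.unescape comes from the standard library.

```python
import html


def to_decimal_ref(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"


def to_hex_ref(ch: str) -> str:
    """Return the hexadecimal numeric character reference for one character."""
    return f"&#x{ord(ch):X};"


print(to_decimal_ref("&"))  # &#38;
print(to_hex_ref("&"))      # &#x26;

# Both numeric forms and the named form decode to the same character.
print(html.unescape("&#38;") == html.unescape("&#x26;") == html.unescape("&amp;"))  # True
```

Because the reference is derived directly from the code point, the same helpers work for characters that have no named reference at all.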
A worked example: the code sample 'let x = 5 < 10 && y > 0;' contains three distinct reserved characters (<, &, and >). Encoding produces 'let x = 5 &lt; 10 &amp;&amp; y &gt; 0;', which the browser then parses as text rather than as malformed markup. Reverse direction: a string like '&copy; 2026 &mdash; All rights reserved' decodes to '© 2026 — All rights reserved'. Numeric references also handle characters outside the Basic Multilingual Plane (BMP): '&#x1F600;' decodes to the grinning face emoji.
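The same round trip can be reproduced with Python's standard html module. This is a minimal sketch for experimentation, assuming the standard library's escaping rules are close enough to this tool's for these inputs; it is not the tool's own code.

```python
import html

# Encoding: reserved characters become entity references.
source = "let x = 5 < 10 && y > 0;"
encoded = html.escape(source, quote=False)
print(encoded)  # let x = 5 &lt; 10 &amp;&amp; y &gt; 0;

# Decoding: named references become the characters they stand for.
decoded = html.unescape("&copy; 2026 &mdash; All rights reserved")
print(decoded)  # © 2026 — All rights reserved

# A hexadecimal reference to a code point outside the BMP.
emoji = html.unescape("&#x1F600;")
print(emoji == "\U0001F600")  # True
```

Passing quote=False leaves quotation marks alone, which matches the example above because the sample contains none.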
Decoding has security implications. Decoding entities and then injecting the result into an HTML document re-opens any cross-site scripting (XSS) vector that the encoding had closed. The pattern 'decode user input → display as HTML' is one of the most common XSS introductions in legacy PHP and ASP code. Modern frameworks (React, Vue, Svelte) escape by default and require an explicit opt-in to inject raw markup (dangerouslySetInnerHTML in React, v-html in Vue, {@html} in Svelte), which is the safe default. Encoding the same string twice produces double-encoded output ('&amp;' becomes '&amp;amp;'), and a single decode pass undoes only one of the two levels; this is a frequent source of stray '&amp;' artifacts visible in production pages.
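Both failure modes are easy to reproduce. The sketch below uses Python's standard html module; the sample strings are made up for illustration, and the approach shown (escape once, at output time) is one common convention rather than the only valid one.

```python
import html

# Double encoding: escaping an already-escaped string escapes the ampersand again.
once = html.escape("Tom & Jerry")
twice = html.escape(once)
print(once)   # Tom &amp; Jerry
print(twice)  # Tom &amp;amp; Jerry

# One decode pass removes only one level, leaving a visible "&amp;".
print(html.unescape(twice))  # Tom &amp; Jerry

# Decoding re-opens the XSS vector: the inert text becomes live markup.
stored = "&lt;script&gt;alert(1)&lt;/script&gt;"
live = html.unescape(stored)
print(live)  # <script>alert(1)</script>

# If decoded text must be shown in a page, escape it again at output time.
safe = html.escape(live)
print(safe == stored)  # True
```

Keeping stored values unescaped and escaping exactly once when rendering avoids having to guess how many levels of encoding a string already carries.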
Not every special character requires encoding in every context. In the text content of a UTF-8 document, only & and < strictly need escaping (> is usually escaped as well); quotes and non-ASCII characters can appear literally. In an attribute value with double-quote delimiters, the double quote must also be escaped (or single-quote delimiters used, in which case the apostrophe must be escaped instead). In a script element's content, the rules differ entirely: entities are not parsed, but the literal sequence '</script>' must be avoided.
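One way to keep these contexts apart is to use a separate helper for each. The following Python sketch is a simplified illustration under stated assumptions: the function names escape_text, escape_attr, and script_json are hypothetical, and the script helper only covers embedding data as JSON, with the '</' sequence broken up so the parser never sees a closing tag.

```python
import html
import json


def escape_text(value: str) -> str:
    """Escape for element text content: &, <, and > only."""
    return html.escape(value, quote=False)


def escape_attr(value: str) -> str:
    """Escape for a quoted attribute value: also escapes " and '."""
    return html.escape(value, quote=True)


def script_json(value) -> str:
    """Serialize data for a script element without emitting '</'.

    Entities are not parsed inside script content, so the closing-tag
    sequence is avoided with the escape "<\\/" instead, which JSON and
    JavaScript both read back as "</".
    """
    return json.dumps(value).replace("</", "<\\/")


print(escape_text('5 < 10 & "quoted"'))  # 5 &lt; 10 &amp; "quoted"
print(escape_attr('say "hi"'))           # say &quot;hi&quot;
print(script_json("</script>"))          # "<\/script>"
```

A production template engine handles more cases than this (for example unquoted attributes and URL contexts), so the helpers above are best read as a map of the rules rather than a complete sanitizer.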
The about text and FAQ on this page were drafted with AI assistance and reviewed by a member of the Coherence Daddy team before publishing. See our Content Policy for editorial standards.