HTML Entity Encoding: What You Need to Know
HTML entities are special codes that represent characters in HTML. They’re essential for displaying special characters, preventing XSS attacks, and ensuring your content renders correctly across all browsers.
What are HTML Entities?
HTML entities are codes that represent reserved or special characters:
&lt;div&gt; <!-- Renders as <div> -->
&copy; <!-- Renders as © -->
&quot;Hello&quot; <!-- Renders as "Hello" -->
An entity starts with & and ends with ;. They come in two forms:
Named Entities
&lt; < less than
&gt; > greater than
&amp; & ampersand
&quot; " quote
&apos; ' apostrophe
Numeric Entities
Using the character’s Unicode code point:
&#60; < decimal
&#x3C; < hexadecimal
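Both numeric forms can be derived directly from a character's code point. A minimal Python sketch, assuming a hypothetical helper named `numeric_entities` (it is not part of any library):

```python
def numeric_entities(char: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal numeric entities for one character."""
    code_point = ord(char)  # Unicode code point, e.g. 60 for "<"
    decimal = f"&#{code_point};"
    hexadecimal = f"&#x{code_point:X};"
    return decimal, hexadecimal


print(numeric_entities("<"))  # ('&#60;', '&#x3C;')
print(numeric_entities("©"))  # ('&#169;', '&#xA9;')
```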
Why Use HTML Entities?
1. Reserved Characters
HTML reserves certain characters for syntax. To display them, you must use entities:
<!-- Wrong: the bare < can be mistaken for the start of a tag -->
<p>5 < 10</p>
<!-- Correct: displays properly -->
<p>5 &lt; 10</p>
2. XSS Prevention
Encoding user input prevents cross-site scripting attacks:
// Dangerous: user input rendered directly
element.innerHTML = userInput;
// Safe: entities encode dangerous characters
element.innerHTML = encodeHTML(userInput);
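The same idea applies when HTML is assembled on the server. A minimal sketch in Python, assuming a hypothetical `render_comment` helper. `html.escape` is the standard-library function; the helper and its markup are illustrative only:

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap untrusted text in a paragraph, escaping it first."""
    # html.escape converts &, < and >, plus both quote characters by default.
    return f"<p>{html.escape(user_input)}</p>"


print(render_comment('<img src=x onerror="alert(1)">'))
# <p>&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</p>
```

Because the angle brackets arrive as entities, the browser shows the payload as text instead of creating an element.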
3. Special Symbols
Display characters not available on keyboards:
&copy; © Copyright
&trade; ™ Trademark
&mdash; — Em dash
&ndash; – En dash
Common HTML Entities
Essential Characters
| Character | Entity | Description |
|---|---|---|
| < | &lt; | Less than |
| > | &gt; | Greater than |
| & | &amp; | Ampersand |
| " | &quot; | Quote |
| ' | &apos; | Apostrophe |
Currency Symbols
| Character | Entity | Description |
|---|---|---|
| € | &euro; | Euro |
| £ | &pound; | Pound |
| ¥ | &yen; | Yen |
| ¢ | &cent; | Cent |
Mathematical Symbols
| Character | Entity | Description |
|---|---|---|
| × | &times; | Multiplication |
| ÷ | &divide; | Division |
| ± | &plusmn; | Plus-minus |
| ² | &sup2; | Superscript 2 |
Arrows
| Character | Entity | Description |
|---|---|---|
| ← | &larr; | Left arrow |
| → | &rarr; | Right arrow |
| ↑ | &uarr; | Up arrow |
| ↓ | &darr; | Down arrow |
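Named entities like the ones in these tables can also be looked up programmatically. A small Python sketch using the standard library's `html.entities.name2codepoint` mapping; the `describe_entity` helper is a hypothetical name introduced here for illustration:

```python
from html.entities import name2codepoint


def describe_entity(name: str) -> str:
    """Show a named entity next to its character and numeric form."""
    code_point = name2codepoint[name]  # raises KeyError for unknown names
    return f"&{name}; = {chr(code_point)} (U+{code_point:04X}, &#{code_point};)"


for name in ("copy", "euro", "times", "rarr"):
    print(describe_entity(name))
# &copy; = © (U+00A9, &#169;)
# &euro; = € (U+20AC, &#8364;)
# &times; = × (U+00D7, &#215;)
# &rarr; = → (U+2192, &#8594;)
```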
Using Our Encoder Tool
Our HTML Entity Encoder makes encoding and decoding easy:
Encoding
- Paste your text containing special characters
- Select “Encode”
- Click the button
- Get your encoded HTML-safe text
Decoding
- Paste text with HTML entities
- Select “Decode”
- Click the button
- Get your decoded plain text
Code Examples
JavaScript
// Encode HTML entities
function encodeHTML(str) {
  return str
    .replace(/&/g, '&amp;') // must run first so later entities are not re-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
// Decode HTML entities
function decodeHTML(str) {
  // The parsed document is inert: scripts in str do not run
  const doc = new DOMParser().parseFromString(str, 'text/html');
  return doc.body.textContent;
}
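// Or use browser APIs (innerHTML escapes &, < and >, but not quotes)
const encoded = new Option(str).innerHTML;
// Decode by letting a detached textarea parse the entities
const textarea = document.createElement('textarea');
textarea.innerHTML = encoded;
const decoded = textarea.value;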
Python
import html
# Encode
encoded = html.escape('<script>alert("XSS")</script>')
# Result: &lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;
# Decode
decoded = html.unescape('&lt;div&gt;Hello&lt;/div&gt;')
# Result: <div>Hello</div>
Security Considerations
XSS Prevention
Always encode user-generated content before displaying it in HTML:
// NEVER do this with untrusted input
element.innerHTML = userInput;
// Always encode first
element.innerHTML = encodeHTML(userInput);
Context Matters
Different contexts need different encoding:
- HTML content: Use &lt;, &gt;, &amp;
- Attribute values: Also encode quotes (&quot;)
- JavaScript strings: Use Unicode escapes (\u003C)
- URLs: Use URL encoding (%3C)
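The four contexts above can be sketched side by side in Python. This uses only standard-library calls (`html.escape`, `json.dumps`, `urllib.parse.quote`). The sample string and variable names are illustrative assumptions, and a real application would normally rely on its template engine's context-aware escaping:

```python
import html
import json
from urllib.parse import quote

untrusted = '<a href="x">Tom & Jerry</a>'

# HTML content: escape &, < and > (quotes can stay literal in text nodes).
html_text = html.escape(untrusted, quote=False)

# Attribute values: also escape " and '.
html_attr = html.escape(untrusted, quote=True)

# JavaScript strings: json.dumps yields a quoted string literal; < is then
# replaced with a Unicode escape so "</script>" cannot end an inline script.
js_string = json.dumps(untrusted).replace("<", "\\u003C")

# URLs: percent-encode every character that is not unreserved.
url_part = quote(untrusted, safe="")

print(html_text)  # &lt;a href="x"&gt;Tom &amp; Jerry&lt;/a&gt;
print(html_attr)  # &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
print(js_string)  # "\u003Ca href=\"x\">Tom & Jerry\u003C/a>"
print(url_part)   # %3Ca%20href%3D%22x%22%3ETom%20%26%20Jerry%3C%2Fa%3E
```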
Don’t Over-encode
Only encode what’s necessary. Over-encoding can cause issues:
<!-- Wrong: double-encoded -->
<p>&amp;lt;div&amp;gt;</p>
<!-- Correct -->
<p>&lt;div&gt;</p>
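Double encoding usually happens when text that is already escaped passes through an encoder a second time. A short Python sketch of the effect, using the standard `html` module (variable names are illustrative):

```python
import html

raw = "<div>"
once = html.escape(raw)    # &lt;div&gt;
twice = html.escape(once)  # &amp;lt;div&amp;gt;

# Entities are decoded only once when rendered, so the double-encoded
# version shows up on the page as the literal text "&lt;div&gt;".
print(html.unescape(once))   # <div>
print(html.unescape(twice))  # &lt;div&gt;
```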
Conclusion
HTML entity encoding is a fundamental skill for web developers. It ensures your content displays correctly and helps prevent security vulnerabilities.
Use our HTML Entity Encoder for quick encoding and decoding. Remember: encode untrusted content once, for the context where it will appear.