HTML Guides for control character
What Are Control Characters?
Control characters occupy code points U+0000 through U+001F and U+007F through U+009F in Unicode. They were originally designed for controlling hardware devices (e.g., U+0002 is “Start of Text,” U+0007 is “Bell,” U+001B is “Escape”). Apart from a few whitespace controls such as tab (U+0009) and line feed (U+000A), these characters have no visual representation and carry no semantic meaning in a web document.
The HTML specification explicitly forbids character references that resolve to most control characters. Even though the syntax `&#x2;` is a structurally valid character reference, the character it points to is not a permissible content character. The W3C validator raises this error to flag references such as `&#0;`, `&#1;`, `&#x2;`, `&#x1F;`, `&#x7F;`, and others that fall within the control character ranges.
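To make the ranges above concrete, here is a minimal Python sketch (not part of the validator) that resolves a numeric character reference to its code point and checks it against the control character ranges. The function names are hypothetical, and exempting only tab and line feed is a conservative assumption; a validator may be more lenient about other whitespace controls such as form feed.

```python
import re
from typing import Optional

# Numeric character reference: decimal (&#2;) or hexadecimal (&#x2;).
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Assumption: tab and line feed are treated as ordinary whitespace.
_ALLOWED_WHITESPACE = {0x09, 0x0A}


def is_control_code_point(cp: int) -> bool:
    """True for U+0000-U+001F and U+007F-U+009F, minus allowed whitespace."""
    in_range = 0 <= cp <= 0x1F or 0x7F <= cp <= 0x9F
    return in_range and cp not in _ALLOWED_WHITESPACE


def reference_code_point(ref: str) -> Optional[int]:
    """Return the code point of a numeric reference, or None for anything else."""
    match = _NUMERIC_REF.fullmatch(ref)
    if match is None:
        return None
    hex_digits, dec_digits = match.groups()
    return int(hex_digits, 16) if hex_digits else int(dec_digits)
```

With these definitions, `is_control_code_point(reference_code_point("&#x2;"))` evaluates to `True`, while a named entity such as `&copy;` is not a numeric reference and yields `None` from `reference_code_point`.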
Why This Is a Problem
- Standards compliance: The WHATWG HTML Living Standard defines a specific set of “noncharacter” and “control character” code points that must not be referenced. Using them produces a parse error.
- Unpredictable rendering: Browsers handle illegal control characters inconsistently. Some may silently discard them, others may render a replacement character (�), and others may exhibit unexpected behavior.
- Accessibility: Screen readers and other assistive technologies may choke on or misinterpret control characters, degrading the experience for users who rely on these tools.
- Data integrity: Control characters in your markup often indicate a copy-paste error, a corrupted data source, or a templating bug that inserts raw binary data into HTML output.
How to Fix It
- Identify the offending reference — look for character references like `&#1;`, `&#x2;`, `&#0;`, `&#x1F;`, or similar that point to control character code points.
- Determine intent — figure out what character or content was actually intended. Often, a control character reference is the result of a bug in a data pipeline or template engine.
- Remove or replace — either delete the reference entirely or replace it with the correct printable character or HTML entity.
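The steps above can be sketched as a small Python script. This is an illustrative helper under stated assumptions, not a validator feature: the function names are hypothetical, only references that end in a semicolon are matched, and the scan is purely textual, so it does not distinguish markup from script or comment content. Review each finding before removing anything, since the intended character may need to be restored by hand.

```python
import re

# Decimal (&#2;) or hexadecimal (&#x2;) numeric character reference.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def _code_point(match):
    """Code point encoded by a matched numeric reference."""
    hex_digits, dec_digits = match.groups()
    return int(hex_digits, 16) if hex_digits else int(dec_digits)


def _is_control(cp):
    # Assumption: tab (0x09) and line feed (0x0A) are treated as allowed.
    return (cp <= 0x1F or 0x7F <= cp <= 0x9F) and cp not in (0x09, 0x0A)


def find_control_references(html_text):
    """Step 1: list (line_number, reference, code_point) for each offender."""
    findings = []
    for line_no, line in enumerate(html_text.split("\n"), start=1):
        for match in _NUMERIC_REF.finditer(line):
            cp = _code_point(match)
            if _is_control(cp):
                findings.append((line_no, match.group(0), cp))
    return findings


def strip_control_references(html_text):
    """Step 3: delete offending references, leaving all others untouched."""
    def replace(match):
        return "" if _is_control(_code_point(match)) else match.group(0)
    return _NUMERIC_REF.sub(replace, html_text)


if __name__ == "__main__":
    sample = "<p>Some text &#2; more text</p>\n<p>Item &#8226; Details</p>"
    for line_no, ref, cp in find_control_references(sample):
        print(f"line {line_no}: {ref} resolves to U+{cp:04X}")
    print(strip_control_references(sample))
```

Step 2 (determining intent) stays manual: the script only reports where the references are and what they resolve to.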
Examples
Incorrect: Control character reference
This markup contains `&#2;`, which expands to the control character U+0002 (Start of Text) and triggers the validation error:
<p>Some text &#2; more text</p>
Incorrect: Hexadecimal form of a control character
The same problem occurs with the hexadecimal syntax:
<p>Data: &#x2;</p>
Correct: Remove the control character reference
If the control character was unintentional, simply remove it:
<p>Some text more text</p>
Correct: Use a valid character reference instead
If you intended to display a special character, use the correct printable code point or named entity. For example, to display a bullet (•), copyright sign (©), or ampersand (&):
<p>Item &#8226; Details</p>
<p>Copyright &copy; 2024</p>
<p>Tom &amp; Jerry</p>
Correct: Full document without control characters
<!DOCTYPE html>
<html lang="en">
<head>
<title>Example Page</title>
</head>
<body>
<p>This paragraph uses only valid character references: &amp; &lt; &gt; &copy;</p>
</body>
</html>
Common Control Character Code Points to Avoid
| Reference | Code Point | Name |
|---|---|---|
| `&#x0;` | U+0000 | Null |
| `&#x1;` | U+0001 | Start of Heading |
| `&#x2;` | U+0002 | Start of Text |
| `&#x7;` | U+0007 | Bell |
| `&#x8;` | U+0008 | Backspace |
| `&#xB;` | U+000B | Vertical Tab |
| `&#xC;` | U+000C | Form Feed |
| `&#x7F;` | U+007F | Delete |
If your content is generated dynamically (from a database, API, or user input), sanitize the data before inserting it into HTML to strip out control characters. Most server-side languages and templating engines provide utilities for this purpose.
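As one possible approach in Python, the sketch below strips raw control characters from a string and then escapes it for HTML. The function name `sanitize_for_html` is hypothetical, and keeping tab, line feed, and carriage return (as ordinary whitespace and line endings) is an assumption you may want to adjust for your data.

```python
import html

# Assumption: keep tab, line feed, and carriage return; drop every other
# code point in U+0000-U+001F and U+007F-U+009F.
_KEEP = {0x09, 0x0A, 0x0D}
_STRIP_TABLE = {
    cp: None
    for cp in list(range(0x00, 0x20)) + list(range(0x7F, 0xA0))
    if cp not in _KEEP
}


def sanitize_for_html(text: str) -> str:
    """Remove control characters, then escape &, <, >, and quotes."""
    cleaned = text.translate(_STRIP_TABLE)
    return html.escape(cleaned, quote=True)


if __name__ == "__main__":
    raw = "Tom \x02& Jerry\x07 <b>"
    print(sanitize_for_html(raw))  # Tom &amp; Jerry &lt;b&gt;
```

Stripping before escaping keeps the two concerns separate: the translation table deals with characters that should never reach the page, and `html.escape` deals with characters that are valid but have special meaning in markup.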