URL Encode / Decode
Encode and decode text to and from URL percent-encoded format.
URL Encode & Decode — Percent-Encoding Web Standards Guide
Uniform Resource Locators (URLs) are the addresses used to locate resources on the World Wide Web. The URL syntax permits only a limited subset of the ASCII character set. Any character outside that subset, and any character that has a special meaning within a URL's structure but is being used as plain data, must be translated into a safe format. This process is known as URL encoding or percent-encoding.
This guide details why URL encoding is necessary, which characters require encoding, how the encoding process works, and the standard lookup codes.
---
1. Why Is URL Encoding Necessary?
A URL consists of several structural components (e.g., scheme (protocol), host (domain name), path, query string, and fragment). Certain characters have designated syntactic roles within these components:
- `?` separates the path from the query string.
- `&` separates individual query parameters.
- `=` separates a parameter name from its value.
- `/` separates path segments.
If you need to include these characters as actual data inside a query parameter (for instance, searching for the phrase "Cats & Dogs"), you cannot use the `&` symbol directly because the web server will interpret it as the boundary of a new query parameter. Encoding ensures the server reads the data correctly.
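As a minimal sketch of the problem, the TypeScript below builds the same query string with and without encoding and then splits it the way a simple server-side parser might. The `parseQuery` helper is a hypothetical illustration, not the behavior of any particular server; `encodeURIComponent` and `decodeURIComponent` are the standard JavaScript built-ins.

```typescript
// Hypothetical helper: split a query string on "&" and "=", roughly the way a
// simple server-side parser would, then decode each name and value.
function parseQuery(query: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    const name = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    result[decodeURIComponent(name)] = decodeURIComponent(value);
  }
  return result;
}

const term = "Cats & Dogs";

// Unencoded: the "&" inside the search term is read as a parameter boundary,
// so "q" ends up as "Cats " and a stray parameter named " Dogs" appears.
const rawQuery = "q=" + term;
console.log(parseQuery(rawQuery));

// Encoded: the "&" travels as %26 and the whole phrase arrives in one piece.
const safeQuery = "q=" + encodeURIComponent(term);
console.log(safeQuery); // q=Cats%20%26%20Dogs
console.log(parseQuery(safeQuery)); // "q" is "Cats & Dogs"
```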
---
2. Character Categories in URLs
RFC 3986 defines two character sets, unreserved and reserved. Together with the characters that belong to neither set, this gives three categories:
Unreserved Characters
These characters have no special meaning and can be used anywhere in a URL without encoding:
- `A` to `Z` and `a` to `z`
- `0` to `9`
- `-` (hyphen), `_` (underscore), `.` (period), and `~` (tilde)
Reserved Characters
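These characters have special structural meanings. They must be encoded when they are used as data rather than in their structural role:
- General delimiters: `:`, `/`, `?`, `#`, `[`, `]`, `@`
- Sub-delimiters: `!`, `$`, `&`, `'`, `(`, `)`, `*`, `+`, `,`, `;`, `=`
The percent sign `%` is not part of the reserved set. It is the escape character that introduces an encoded byte, so a literal `%` must always be written as `%25`.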
Excluded/Invalid Characters
These characters may never appear literally in a URL and must always be encoded. Examples include spaces, double quotes, angle brackets (`<`, `>`), curly braces (`{`, `}`), the backslash, the pipe, the caret, the backtick, and all non-ASCII Unicode characters.
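The three categories can be expressed as a small classifier. This is a sketch under the assumption that the input is a single character; `classify` and the two constants are hypothetical names, and the character sets follow RFC 3986's unreserved, gen-delims, and sub-delims definitions (which do not include `%`).

```typescript
type Category = "unreserved" | "reserved" | "other";

// RFC 3986 unreserved set: letters, digits, "-", ".", "_", "~".
const UNRESERVED_CHARS = /^[A-Za-z0-9\-._~]$/;
// RFC 3986 reserved set: gen-delims followed by sub-delims.
const RESERVED_CHARS = ":/?#[]@" + "!$&'()*+,;=";

// Hypothetical helper: classify a single character.
function classify(ch: string): Category {
  if (UNRESERVED_CHARS.test(ch)) return "unreserved";
  if (ch.length === 1 && RESERVED_CHARS.indexOf(ch) !== -1) return "reserved";
  // Everything else (space, "{", "%", non-ASCII, ...) must be percent-encoded
  // whenever it is meant literally.
  return "other";
}

for (const ch of ["a", "~", "&", "[", " ", "{", "%", "\u00E9"]) {
  console.log(JSON.stringify(ch), classify(ch));
}
```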
---
3. How Percent-Encoding Works
To encode a character:
1. Convert the character to its UTF-8 byte sequence (one byte for an ASCII character, two to four bytes for any other character).
2. Write each byte as a two-digit hexadecimal number.
3. Prefix each two-digit hexadecimal value with a percent sign `%`.
Examples:
- Space: ASCII value 32 is `20` in hexadecimal. It becomes `%20` (or `+` in query strings that use the `application/x-www-form-urlencoded` form format).
- Ampersand (`&`): ASCII value 38 is `26` in hexadecimal. It becomes `%26`.
- Percent Sign (`%`): Since the percent sign marks the start of an encoded sequence, it must be encoded as `%25` to be treated as actual data.
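The steps above can be sketched in TypeScript as follows. `percentEncode` is a hypothetical helper written for illustration: it encodes every byte outside the unreserved set, which is stricter than the built-in `encodeURIComponent` (that function leaves `!`, `*`, `'`, `(`, and `)` untouched). `TextEncoder` is the standard API for obtaining UTF-8 bytes in browsers and Node.js.

```typescript
// Characters that never need encoding (the RFC 3986 unreserved set).
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: percent-encode everything except unreserved characters.
function percentEncode(input: string): string {
  // Step 1: get the UTF-8 bytes of the input.
  const bytes = new TextEncoder().encode(input);
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const byte = bytes[i];
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // safe as-is
    } else {
      // Step 2: write the byte as two uppercase hex digits.
      const hex = byte.toString(16).toUpperCase();
      // Step 3: prefix with "%".
      out += "%" + (hex.length < 2 ? "0" + hex : hex);
    }
  }
  return out;
}

console.log(percentEncode(" ")); // %20
console.log(percentEncode("&")); // %26
console.log(percentEncode("%")); // %25
console.log(percentEncode("Cats & Dogs")); // Cats%20%26%20Dogs
```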
---
Common URL Encoding Lookup Table
| Character | Description | Encoded Value |
|---|---|---|
| ` ` | Space | `%20` (or `+` in form-encoded query strings) |
| `!` | Exclamation Mark | `%21` |
| `#` | Hash / Fragment identifier | `%23` |
| `$` | Dollar Sign | `%24` |
| `%` | Percent Sign | `%25` |
| `&` | Ampersand (parameter separator) | `%26` |
| `+` | Plus Sign (often represents space) | `%2B` |
| `/` | Forward Slash (path separator) | `%2F` |
| `:` | Colon | `%3A` |
| `;` | Semicolon | `%3B` |
| `=` | Equals Sign (key/value separator) | `%3D` |
| `?` | Question Mark (query separator) | `%3F` |
| `@` | At Symbol | `%40` |
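The table can also be checked in the other direction by decoding each value. The sketch below uses the built-in `decodeURIComponent`; the `TABLE` constant simply restates the rows above, and `decodeFormValue` is a hypothetical helper. Note that `decodeURIComponent` treats `+` as a literal plus sign, so the `+` shorthand for a space has to be handled separately when decoding form-style query strings.

```typescript
// Rows from the lookup table: [character, percent-encoded form].
const TABLE: Array<[string, string]> = [
  [" ", "%20"], ["!", "%21"], ["#", "%23"], ["$", "%24"], ["%", "%25"],
  ["&", "%26"], ["+", "%2B"], ["/", "%2F"], [":", "%3A"], [";", "%3B"],
  ["=", "%3D"], ["?", "%3F"], ["@", "%40"],
];

for (const [char, encoded] of TABLE) {
  const decoded = decodeURIComponent(encoded);
  console.log(encoded, "->", JSON.stringify(decoded), decoded === char ? "ok" : "MISMATCH");
}

// Hex digits are case-insensitive when decoding.
console.log(decodeURIComponent("%2f")); // /

// "+" is not turned into a space by decodeURIComponent.
console.log(decodeURIComponent("a+b")); // a+b

// Hypothetical helper for form-style query values: turn "+" into a space
// first, then percent-decode. A literal plus must arrive as %2B.
function decodeFormValue(value: string): string {
  return decodeURIComponent(value.replace(/\+/g, " "));
}

console.log(decodeFormValue("Cats+%26+Dogs")); // Cats & Dogs
console.log(decodeFormValue("1%2B1")); // 1+1
```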
---
Non-ASCII and Unicode Characters
For characters outside the standard ASCII range (such as emoji, accented letters, or Chinese characters), the character is first converted to its multi-byte UTF-8 representation, and then each byte is percent-encoded.
For example, the character `é` (UTF-8 bytes `C3 A9`) encodes to `%C3%A9`. The thumbs-up emoji `👍` (UTF-8 bytes `F0 9F 91 8D`) encodes to `%F0%9F%91%8D`.
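To see the multi-byte case concretely, the following sketch lists the UTF-8 bytes of each character and compares the hand-built result with the built-in `encodeURIComponent`. `utf8Hex` is a hypothetical helper; `TextEncoder` is the standard UTF-8 encoder available in browsers and Node.js.

```typescript
// Hypothetical helper: the UTF-8 bytes of `text` as two-digit uppercase hex.
function utf8Hex(text: string): string[] {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0"));
}

const eAcute = "\u00E9"; // é (U+00E9)
const thumbsUp = "\u{1F44D}"; // thumbs-up emoji (U+1F44D)

for (const ch of [eAcute, thumbsUp]) {
  const hex = utf8Hex(ch);
  // Prefix every byte with "%" and join them into one encoded sequence.
  const manual = hex.map((h) => "%" + h).join("");
  console.log(hex.join(" "), manual, manual === encodeURIComponent(ch));
}

// Decoding reverses the process: the bytes are reassembled into one character.
console.log(decodeURIComponent("%C3%A9") === eAcute); // true
console.log(decodeURIComponent("%F0%9F%91%8D") === thumbsUp); // true
```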
Frequently Asked Questions
What is URL encoding?
URL encoding (also known as percent-encoding) converts characters that are unsafe in a URL into a format that can be transmitted reliably. Each such character is replaced by a percent sign (%) followed by two hexadecimal digits for each of its UTF-8 bytes.
Why do URLs need to be encoded?
URLs may only contain a limited subset of the US-ASCII character set. Characters such as question marks and ampersands have specific structural meanings in a URL, and others such as spaces are not allowed at all, so they must be encoded to avoid parsing conflicts.
Which characters must be encoded?
Reserved characters (like ?, &, =, +, #, and /) must be encoded when they represent data rather than URL structure. A space is not a reserved character, but it may never appear literally in a URL, so it must always be encoded.
What is the difference between encodeURI and encodeURIComponent?
In JavaScript, encodeURI assumes the input is a full URL, so it does NOT encode characters that have a structural role in a URL (like :, /, ?, &, =). encodeURIComponent encodes those characters as well, which makes it suitable for encoding individual query parameter names and values.
How does percent-encoding work?
Percent-encoding represents each byte of a character's UTF-8 encoding as a percent symbol followed by the byte's two-digit hexadecimal value. For example, the space character in UTF-8 is byte 32 (hex 20), which translates to %20.
Should a space be encoded as %20 or +?
Under RFC 3986, a space is encoded as %20. The plus sign (+) stands for a space only in the application/x-www-form-urlencoded format used by HTML form submissions and by query-string parsers that follow it. Outside that format, + is a literal plus sign, so %20 is the safer choice.
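The difference between `encodeURI` and `encodeURIComponent`, and between `%20` and `+`, is easiest to see side by side. The sketch below uses only standard JavaScript built-ins (`encodeURI`, `encodeURIComponent`, `decodeURIComponent`, and `URLSearchParams`, which applies the form-encoding rules); the sample strings and the `example.com` address are made up for illustration.

```typescript
const value = "a b&c=d/e?f";

// encodeURI keeps URL structure characters such as & = / ? intact.
console.log(encodeURI(value)); // a%20b&c=d/e?f

// encodeURIComponent encodes them, so the string is safe as a single value.
console.log(encodeURIComponent(value)); // a%20b%26c%3Dd%2Fe%3Ff

// Typical use: encode only the parameter value, not the whole URL.
const url = "https://example.com/search?q=" + encodeURIComponent("Cats & Dogs");
console.log(url); // https://example.com/search?q=Cats%20%26%20Dogs

// URLSearchParams serializes with form encoding, where a space becomes "+".
const form = new URLSearchParams({ q: "Cats & Dogs" }).toString();
console.log(form); // q=Cats+%26+Dogs

// It also decodes "+" back to a space; decodeURIComponent does not.
console.log(new URLSearchParams(form).get("q")); // Cats & Dogs
console.log(decodeURIComponent("Cats+%26+Dogs")); // Cats+&+Dogs
```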