URL Encode / Decode

Encode and decode text to and from URL percent-encoded format.

URL Encode & Decode — Percent-Encoding Web Standards Guide

Uniform Resource Locators (URLs) are the addresses used to locate resources on the World Wide Web. Because of historical networking design, URLs can only contain a specific, limited subset of characters from the ASCII character set. Any characters outside this range — or characters with special meanings within a URL's structure — must be translated into a safe format. This process is known as URL Encoding or Percent-Encoding.

This guide details why URL encoding is necessary, which characters require encoding, how the encoding process works, and the standard lookup codes.

---

1. Why Is URL Encoding Necessary?

A URL consists of several structural components (e.g., protocol, domain name, path, query parameters, and fragments). Certain characters have designated syntactical roles within these components:

  • `?` separates the path from the query string.
  • `&` separates individual query parameters.
  • `=` separates a parameter name from its value.
  • `/` separates path segments.

If you need to include these characters as actual data inside a query parameter (for instance, searching for the phrase "Cats & Dogs"), you cannot use the `&` symbol directly because the web server will interpret it as the boundary of a new query parameter. Encoding ensures the server reads the data correctly.
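
The effect is easy to reproduce. The sketch below uses Python's standard `urllib.parse` module; the choice of Python and the parameter name `q` are assumptions made for illustration and do not describe any particular server. It shows how an unencoded `&` cuts the value short and how encoding preserves it.

```python
from urllib.parse import parse_qs, quote

# Unencoded: the "&" is read as a parameter boundary, so the value is cut short.
raw_query = "q=Cats & Dogs"
broken = parse_qs(raw_query)
print(broken["q"])          # ['Cats ']

# Encoded: the "&" travels as %26 and is restored only after parsing.
safe_query = "q=" + quote("Cats & Dogs", safe="")
print(safe_query)           # q=Cats%20%26%20Dogs
fixed = parse_qs(safe_query)
print(fixed["q"])           # ['Cats & Dogs']
```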

---

2. Character Categories in URLs

RFC 3986 defines two character sets, unreserved and reserved. Every other character falls outside both sets, which gives three practical categories:

Unreserved Characters

These characters have no special meaning and can be used anywhere in a URL without encoding:

  • `A` to `Z` and `a` to `z`
  • `0` to `9`
  • `-` (hyphen), `_` (underscore), `.` (period), and `~` (tilde)

Reserved Characters

These characters have special structural meanings. They must be encoded if they are used as data instead of their structural role:

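  • `!`, `*`, `'`, `(`, `)`, `;`, `:`, `@`, `&`, `=`, `+`, `$`, `,`, `/`, `?`, `#`, `[`, `]`

The percent sign `%` is not part of the reserved set. It is the escape character that introduces an encoded byte, so a literal `%` used as data must always be written as `%25`.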

Excluded/Invalid Characters

These characters may never appear literally in a URL and must always be encoded (e.g., spaces, double quotes, angle brackets `<` and `>`, curly braces `{` and `}`, and non-ASCII Unicode characters).
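
As a rough illustration of the three categories, the sketch below classifies single characters. The function name `classify` and the category labels are hypothetical; the reserved set is taken from the RFC 3986 `gen-delims` and `sub-delims` rules, and the percent sign is treated as an escape character rather than a reserved one.

```python
import string

# Unreserved: letters, digits, and four punctuation marks.
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")

# Reserved: RFC 3986 gen-delims plus sub-delims.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS


def classify(ch: str) -> str:
    """Return a (hypothetical) category label for a single character."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    # Spaces, "<", "{", non-ASCII characters, and a literal "%" all land here.
    return "must-encode"


for ch in ["a", "~", "&", "/", " ", "{", "\u00e9"]:
    print(repr(ch), classify(ch))
```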

---

3. How Percent-Encoding Works

To encode a character:

1. Convert the character to its UTF-8 byte sequence (one byte for ASCII characters, two to four bytes for all others).

2. Write each byte as a two-digit hexadecimal number.

3. Prepend a percent sign `%` to each two-digit hexadecimal value.

Examples:

  • Space: ASCII value 32 is `20` in hexadecimal. It becomes `%20` (or a `+` in form-encoded query strings, the `application/x-www-form-urlencoded` format).
  • Ampersand (`&`): ASCII value 38 is `26` in hexadecimal. It becomes `%26`.
  • Percent Sign (`%`): Since the percent sign marks the start of an encoded sequence, it must be encoded as `%25` to be treated as actual data.
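
The three steps can be sketched in a few lines of Python. The function name `percent_encode` is hypothetical, and this version encodes every byte outside the unreserved set, which suits a single data value rather than a complete URL.

```python
import string

UNRESERVED = frozenset(string.ascii_letters + string.digits + "-_.~")


def percent_encode(text: str) -> str:
    """Percent-encode every byte of text that is not an unreserved character."""
    pieces = []
    for byte in text.encode("utf-8"):      # step 1: UTF-8 bytes
        ch = chr(byte)
        if ch in UNRESERVED:
            pieces.append(ch)              # safe as-is
        else:
            pieces.append(f"%{byte:02X}")  # steps 2 and 3: "%" + two hex digits
    return "".join(pieces)


print(percent_encode(" "))            # %20
print(percent_encode("&"))            # %26
print(percent_encode("100%"))         # 100%25
print(percent_encode("Cats & Dogs"))  # Cats%20%26%20Dogs
```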

---

Common URL Encoding Lookup Table

| Character | Description | Encoded Value |
|---|---|---|
| ` ` | Space | `%20` or `+` |
| `!` | Exclamation Mark | `%21` |
| `#` | Hash / Fragment identifier | `%23` |
| `$` | Dollar Sign | `%24` |
| `%` | Percent Sign | `%25` |
| `&` | Ampersand (parameter separator) | `%26` |
| `+` | Plus Sign (often represents space) | `%2B` |
| `/` | Forward Slash (path separator) | `%2F` |
| `:` | Colon | `%3A` |
| `;` | Semicolon | `%3B` |
| `=` | Equals Sign (key/value separator) | `%3D` |
| `?` | Question Mark (query separator) | `%3F` |
| `@` | At Symbol | `%40` |
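
The values in the table can be regenerated, and reversed, with Python's standard `urllib.parse` module. This is only a sketch; the names `TABLE_CHARS`, `encoded`, and `decoded` are chosen for illustration.

```python
from urllib.parse import quote, unquote

TABLE_CHARS = " !#$%&+/:;=?@"

# safe="" tells quote() to encode every character outside the unreserved set.
encoded = {ch: quote(ch, safe="") for ch in TABLE_CHARS}
for ch, code in encoded.items():
    print(repr(ch), "->", code)

# Decoding reverses the mapping: each %XX sequence becomes its original byte.
decoded = unquote("Cats%20%26%20Dogs%3F")
print(decoded)  # Cats & Dogs?
```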

---

Non-ASCII and Unicode Characters

For characters outside the standard ASCII range (such as emojis or non-English alphabets like accented letters or Chinese characters), the character is first converted to its multi-byte UTF-8 representation, and then each byte is percent-encoded.

For example, the character `é` (UTF-8 bytes `C3 A9`) encodes to `%C3%A9`. The thumbs-up emoji `👍` (UTF-8 bytes `F0 9F 91 8D`) encodes to `%F0%9F%91%8D`.
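
Both examples can be checked with a short Python sketch. The helper `utf8_hex` is hypothetical, and the characters are written as escape sequences so the bytes are unambiguous.

```python
from urllib.parse import quote, unquote


def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


E_ACUTE = "\u00e9"        # é
THUMBS_UP = "\U0001F44D"  # thumbs-up emoji

for ch in (E_ACUTE, THUMBS_UP):
    # Each UTF-8 byte becomes one %XX sequence.
    print(utf8_hex(ch), "->", quote(ch, safe=""))

# Decoding reassembles the bytes and interprets them as UTF-8 again.
print(unquote("caf%C3%A9"))
```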

Frequently Asked Questions

URL encoding (also known as percent-encoding) converts unsafe characters in a URL into a safe format that can be transmitted across the internet. It replaces characters with a percent sign (%) followed by two hex digits.

URLs can only contain a limited subset of US-ASCII characters. Characters such as question marks and ampersands have specific functional meanings in URLs, and spaces are not allowed at all, so they must be encoded to avoid parsing conflicts.

Reserved characters (like ?, &, =, +, #, and /) must be encoded when they represent data rather than URL structure. A space is not a reserved character; it is not permitted in a URL at all, so it must always be encoded.

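In JavaScript, encodeURI assumes the input is a full URL, so it does NOT encode characters that have structural roles in a URL (like :, /, ?, &, =, and #). encodeURIComponent encodes those characters as well, leaving only letters, digits, and - _ . ! ~ * ' ( ) untouched, which makes it suitable for encoding individual query parameter values.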
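
The difference can be approximated in Python by changing which characters are treated as safe. The two functions below are hypothetical analogues, not the JavaScript implementations, and `example.com` is a placeholder domain.

```python
from urllib.parse import quote

# Punctuation both JavaScript functions leave unescaped (besides letters and digits).
JS_UNRESERVED_MARKS = "-_.!~*'()"
# Structural characters that only encodeURI leaves unescaped.
JS_URI_RESERVED = ";/?:@&=+$,#"


def encode_uri_like(url: str) -> str:
    """Rough analogue of encodeURI: keep URL structure intact."""
    return quote(url, safe=JS_URI_RESERVED + JS_UNRESERVED_MARKS)


def encode_uri_component_like(value: str) -> str:
    """Rough analogue of encodeURIComponent: encode structural characters too."""
    return quote(value, safe=JS_UNRESERVED_MARKS)


print(encode_uri_like("https://example.com/search?q=Cats & Dogs"))
# https://example.com/search?q=Cats%20&%20Dogs
print(encode_uri_component_like("Cats & Dogs"))
# Cats%20%26%20Dogs
```

In the first output the `&` survives, so the query string is still split at that point; only the component-style encoding protects the value.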

Percent-encoding represents each UTF-8 byte of a character as two hexadecimal digits prefixed by a percent symbol. For example, the space character in UTF-8 is the single byte 32 (hex 20), which translates to %20.

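Under the standard RFC 3986 specification, a space is encoded as %20. In the application/x-www-form-urlencoded format used by HTML form submissions, a space in the query string is represented as a plus (+) instead. %20 is valid in every part of a URL, whereas + means a space only in form-encoded query strings.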
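
Python's `urllib.parse` exposes both conventions, which makes the difference easy to see. This is a sketch; the parameter name `q` is an assumption made for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus, urlencode

text = "Cats & Dogs"

percent_style = quote(text, safe="")  # Cats%20%26%20Dogs
form_style = quote_plus(text)         # Cats+%26+Dogs
print(percent_style)
print(form_style)

# urlencode() builds a form-style query string, so spaces become "+" by default.
print(urlencode({"q": text}))                   # q=Cats+%26+Dogs
print(urlencode({"q": text}, quote_via=quote))  # q=Cats%20%26%20Dogs

# The decoder must match the encoder: unquote() leaves "+" untouched.
print(unquote(form_style))       # Cats+&+Dogs
print(unquote_plus(form_style))  # Cats & Dogs
```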
