Toolflux

URL encoding explained: percent-encoding encoder and decoder

URL encoding step by step: every %-code colour-coded and explained, double-encoding alert, encode and decode in the browser.

When a URL looks like https%3A%2F%2Fexample.com%2Fsearch%3Fq%3DCaf%C3%A9, it is not broken - it is encoded. Each %XX replaces a character that would otherwise carry structural meaning or sit outside ASCII. The URL Encoder below renders every replacement as a coloured chip, input beside output, and flags double-encoding.

Function
Like encodeURIComponent: also escapes URL separators such as `: / ? #`

The result explained

Input
https://example.com/suche?q=Café München&page=1
Output
https%3A%2F%2Fexample.com%2Fsuche%3Fq%3DCaf%C3%A9%20M%C3%BCnchen%26page%3D1
UTF-8 bytes: 75

Runs in your browser. No network call. No account.

What is URL encoding and why does it look like that?

URL encoding, or percent-encoding, is a translation layer between browser and server. URLs follow a fixed grammar: : ends the scheme, / separates path segments, ? starts the query, # starts the fragment. Characters that break that grammar (spaces, umlauts, a stray / in a value) get masked as %XX, where XX is the byte value in hex.

Without that layer, invoice/2026.pdf reads as a sub-path; & in a query value tears the parameter structure apart. The panel splits your input into categories: plain grey, URL delimiters green, parameter separators purple, non-ASCII bytes orange, spaces blue. Every chip on the left has a matching chip on the right.
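
A minimal sketch of that masking in JavaScript, using the built-in encodeURIComponent. The host example.com, the /files/ path and the company parameter are made up for illustration.

```javascript
// Hypothetical values that would collide with URL grammar if left raw.
const fileName = "invoice/2026.pdf";
const company = "Smith & Sons";

// Encode each value on its own before placing it into the URL.
const pathSegment = encodeURIComponent(fileName); // "/" becomes %2F
const queryValue = encodeURIComponent(company);   // "&" becomes %26, space becomes %20

const fileUrl = `https://example.com/files/${pathSegment}?company=${queryValue}`;
console.log(fileUrl);
// https://example.com/files/invoice%2F2026.pdf?company=Smith%20%26%20Sons
```

Encoding the values separately, rather than the assembled URL, keeps the structural / and ? intact while the ones inside the values are masked.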

When do encoded URLs show up?

Encoded URLs show up wherever text with special characters or non-ASCII travels through URL structures. Developers are not the only readers - SEOs, marketers, content operators, translators and students hit %20 or %C3%BC too, usually without realising it is a translation.

  • SEOs read canonicals with %20 for spaces - raw spaces are invalid in a URL; only the encoded form is valid.
  • Marketers build links like https://shop.de/?utm_source=newsletter&utm_campaign=frühling 2026 - the raw space breaks the link, while modern browsers usually UTF-8-encode the umlaut on their own.
  • Email addresses like my+name@example.com trip up query strings because form-style decoders read + as a space; my%2Bname is safer.
  • CMS exports ship URLs already encoded; skip the decode step and double-encoding follows.
  • Translators see Caf%C3%A9 in exports where Café was intended.
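
The + pitfall from the email example can be reproduced with the standard URLSearchParams parser. This is a small sketch; the parameter name email is an assumption chosen for illustration.

```javascript
const email = "my+name@example.com";

// Raw "+" in a query string: form-style parsers read it as a space.
const rawResult = new URLSearchParams("email=" + email).get("email");
console.log(rawResult); // my name@example.com

// Encoded first: "+" travels as %2B and survives the round trip.
const encodedEmail = encodeURIComponent(email);
const safeResult = new URLSearchParams("email=" + encodedEmail).get("email");
console.log(encodedEmail); // my%2Bname%40example.com
console.log(safeResult);   // my+name@example.com
```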

URI Component or Full URI - which mode fits?

The mode switch above picks between encoding a single value and cleaning up a whole URL. Rule of thumb: URI Component for anything inside a URL (query value, path segment, fragment content). Full URI for a ready-made URL where : / ? # should stay put.

Scenario | Right choice | What breaks otherwise
Query value from user input | URI Component | & and = stay literal; parameters collapse
Path segment with slashes | URI Component | / stays literal; segment falls apart
Smooth over an assembled URL | Full URI | URL structure is dismantled
Fragment content | URI Component | # is read as delimiter

Default to URI Component; Full URI is the exception.
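
In JavaScript the two modes roughly correspond to the built-ins encodeURIComponent and encodeURI. The mapping of Full URI to encodeURI is an assumption made for this sketch; the tool itself only states that URI Component behaves like encodeURIComponent.

```javascript
const assembled = "https://example.com/suche?q=Café München&page=1";

// Full-URI style: structural characters (: / ? & =) stay in place,
// only the space and the non-ASCII characters are masked.
const fullUri = encodeURI(assembled);
console.log(fullUri);
// https://example.com/suche?q=Caf%C3%A9%20M%C3%BCnchen&page=1

// Component style: the whole string is treated as one value,
// so the URL structure is masked as well.
const component = encodeURIComponent(assembled);
console.log(component);
// https%3A%2F%2Fexample.com%2Fsuche%3Fq%3DCaf%C3%A9%20M%C3%BCnchen%26page%3D1
```

The second output is only what you want when the entire URL is itself a value, for example a redirect target passed as a query parameter.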

What do %20, %3A and %C3%A9 mean?

Every %XX is a byte in hex. %20 = 0x20 = ASCII space, %3A = :, %2F = /. Anything outside safe ASCII gets masked this way. Non-ASCII goes through UTF-8 first and is encoded byte by byte, which is why é produces two percent-sequences (%C3%A9).
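
The byte-by-byte rule can be sketched by hand. The helper name percentEncodeBytes is hypothetical, and unlike a real encoder it masks every character, including safe ones, purely to show the mapping from UTF-8 bytes to %XX.

```javascript
// Hypothetical helper: turn every UTF-8 byte of the input into %XX.
function percentEncodeBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeBytes(" ")); // %20
console.log(percentEncodeBytes(":")); // %3A
console.log(percentEncodeBytes("/")); // %2F
console.log(percentEncodeBytes("é")); // %C3%A9 (two UTF-8 bytes)
```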

The panel categories cover every input. Plain characters (letters, digits, - . _ ~) stay literal, grey. URL delimiters : / ? # [ ] @ carry structural meaning; in URI Component mode they become %3A, %2F, %3F, %23, %5B, %5D and %40. Parameter separators ! $ & ' ( ) * + , ; = are the RFC 3986 sub-delimiters, purple. Spaces become %20, except in application/x-www-form-urlencoded strings where + is the space. Two edge-case categories catch legacy control characters and already-encoded runs.
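
A rough sketch of how such a categorisation could work, following the RFC 3986 character classes named above. The function names and category labels are hypothetical, this is not the panel's actual implementation, and the two edge-case categories are folded into "other".

```javascript
const GEN_DELIMS = ":/?#[]@";     // RFC 3986 gen-delims
const SUB_DELIMS = "!$&'()*+,;="; // RFC 3986 sub-delims

// Hypothetical classifier for a single character (one code point).
function classifyChar(ch) {
  if (/^[A-Za-z0-9._~-]$/.test(ch)) return "plain";
  if (GEN_DELIMS.includes(ch)) return "delimiter";
  if (SUB_DELIMS.includes(ch)) return "separator";
  if (ch === " ") return "space";
  if (ch.codePointAt(0) > 0x7f) return "non-ascii";
  return "other"; // control characters, "%", and anything else
}

// Array.from iterates by code point, so an emoji stays one character.
function classifyText(text) {
  return Array.from(text, (ch) => [ch, classifyChar(ch)]);
}

console.log(classifyText("q=Café 1"));
```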

Double-encoding: when %3A turns into %253A

Double-encoding happens when an encoded string runs through an encoder again. The % itself gets encoded: %3A becomes %253A, %20 becomes %2520. The panel spots the pattern, raises an alert and offers a one-click switch to decode mode.

  1. Check the signature. Lots of %25 runs? %25 encodes % itself - a tell the string was encoded already.
  2. Try decoding. One pass gives readable text if singly encoded; remaining %XX means it was doubled.
  3. Find the source. Usually a tool chain without a decode step, or a CMS export with layered escape logic.

Try it: paste https%3A%2F%2Fexample.com%2Fsearch%3Fq%3DCaf%C3%A9%20Munich%26page%3D1 into the URL Encoder. The alert appears automatically.
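
The signature check and the decode pass can be sketched in a few lines. The helper looksDoubleEncoded is hypothetical, and the %25 test is only a heuristic: a value that legitimately contains a literal percent sign followed by two hex digits would also match.

```javascript
// Heuristic: %25 followed by two hex digits is a "%XX" whose "%" was encoded again.
function looksDoubleEncoded(s) {
  return /%25[0-9A-Fa-f]{2}/.test(s);
}

const original = "Café:";
const once = encodeURIComponent(original); // Caf%C3%A9%3A
const twice = encodeURIComponent(once);    // Caf%25C3%25A9%253A

console.log(looksDoubleEncoded(once));  // false
console.log(looksDoubleEncoded(twice)); // true

// One decode pass peels off one layer; remaining %XX means another layer is left.
console.log(decodeURIComponent(twice) === once);    // true
console.log(decodeURIComponent(once) === original); // true
```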

Umlauts, emojis and non-Latin scripts

Everything beyond ASCII travels through UTF-8. UTF-8 splits a character into one to four bytes; each byte is percent-encoded. That is why umlaut URLs are longer than they look and emoji URLs are genuinely long: ü is two bytes (%C3%BC), 🚀 four (%F0%9F%9A%80).

Take https://example.jp/search?q=東京タワー. Each Japanese character is three UTF-8 bytes - encoding yields fifteen %XX in a row. With https://site.com/post?title=Hello 🚀 World the byte counter shows the jump the rocket's four bytes contribute. On legacy systems, the same umlaut URL can arrive as %C3%BC (UTF-8) or %FC (Latin-1).
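
The byte counts can be checked with TextEncoder. The helpers utf8ByteLength and decodeLatin1 are hypothetical names for this sketch; the Latin-1 decoder simply maps each %XX to the code point with the same number, which is how Latin-1 is defined.

```javascript
// Number of UTF-8 bytes a string occupies before percent-encoding.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

console.log(utf8ByteLength("ü"));          // 2
console.log(utf8ByteLength("🚀"));         // 4
console.log(utf8ByteLength("東京タワー")); // 15

console.log(encodeURIComponent("🚀")); // %F0%9F%9A%80

// Legacy case: %FC is a Latin-1 byte, not valid UTF-8,
// so decodeURIComponent throws a URIError on it.
function decodeLatin1(encoded) {
  return encoded.replace(/%([0-9A-Fa-f]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(decodeURIComponent("M%C3%BCnchen")); // München
console.log(decodeLatin1("M%FCnchen"));          // München
```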

Frequently Asked Questions

What does %20 mean in a URL?

%20 is the percent-encoded space. The digits after the percent sign are the hex byte value - 0x20 is the ASCII space. In application/x-www-form-urlencoded query strings + is the space instead.
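
The two conventions can be compared side by side in JavaScript: URLSearchParams follows application/x-www-form-urlencoded, while encodeURIComponent and decodeURIComponent do not treat + specially. The parameter name q is only an example.

```javascript
const value = "Café München";

// Form-style serialisation: the space becomes "+".
const formStyle = new URLSearchParams({ q: value }).toString();
console.log(formStyle); // q=Caf%C3%A9+M%C3%BCnchen

// Plain percent-encoding: the space becomes %20.
const percentStyle = "q=" + encodeURIComponent(value);
console.log(percentStyle); // q=Caf%C3%A9%20M%C3%BCnchen

// Decoding differs too: only the form-style parser turns "+" into a space.
console.log(decodeURIComponent("a+b"));               // a+b
console.log(new URLSearchParams("q=a+b").get("q"));   // a b
console.log(new URLSearchParams("q=a%20b").get("q")); // a b
```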

How do I detect and fix double-encoded URLs?

Lots of %25 runs are the first signal - %25 encodes % itself. A single decode pass turns a double-encoded string back into its singly encoded form; a second pass returns the original text. The panel above spots the pattern and offers the switch to decode mode.

Does encoding change what a URL points to?

When encoding and decoding stay correct, the URL resolves after decoding to the same resource as the original input. Partially raw or broken forms can parse differently or be rejected. Trouble usually starts with double-encoding or a skipped decode step.

How long can an encoded URL be?

RFC 3986 sets no hard limit; browsers and servers do. Rule of thumb: URLs under ~2000 characters clear nearly every pipeline; beyond that, behaviour gets inconsistent (Internet Explorer historically cut off around 2083). Modern browsers tolerate longer URLs; the practical limit usually comes from the server, proxy or CDN. The byte counter helps when emojis or non-Latin scripts inflate the length.
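
A simple pre-flight check along those lines, as a sketch: it measures the encoded length, since that is what travels over the wire. The function name fitsConservativeLimit and the default of 2000 are assumptions taken from the rule of thumb above, not a limit defined by any standard.

```javascript
// Hypothetical check: does the encoded URL stay under a conservative length?
function fitsConservativeLimit(rawUrl, limit = 2000) {
  return encodeURI(rawUrl).length <= limit;
}

const emojiUrl = "https://site.com/post?title=Hello 🚀 World";
console.log(emojiUrl.length);            // 42 (the rocket counts as two UTF-16 units)
console.log(encodeURI(emojiUrl).length); // 56 (the rocket alone takes 12 characters)
console.log(fitsConservativeLimit(emojiUrl)); // true

// 400 umlauts look short but encode to 6 characters each.
const longUrl = "https://example.com/?q=" + "ü".repeat(400);
console.log(encodeURI(longUrl).length);      // 2423
console.log(fitsConservativeLimit(longUrl)); // false
```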

Why are : and / encoded differently?

They carry structural meaning: : separates the scheme, / separates path segments. In Full URI mode both stay intact so the URL stays readable. In URI Component mode they get masked so your value is not read as a path or query.