Base64 explained: encoder and decoder with byte breakdown
Base64 step by step: 3-byte groups as chips, 4-character blocks beside them, padding and URL-safe variant explained. Encode and decode in the browser.
When you spot cGFzc3dvcmQ6IHN1cGVyc2VjcmV0 in a config file or an API token starting with eyJ, you are looking at Base64. It is not encryption and not a protocol; it is a translation layer that packs arbitrary bytes into 64 safe ASCII characters so they fit through text channels like URLs, JSON strings, and email headers. The encoder below shows your input as UTF-8 bytes in 3-byte groups and pairs each group with the matching 4-character output block.
The result explained
Runs in your browser. No network call. No account.
What is Base64 and why does it look like that?
Base64 packs three bytes (24 bits) into four characters from a 64-symbol alphabet (A-Z, a-z, 0-9, +, /). The size grows by about 33 percent - the price you pay for a result made of harmless ASCII that travels safely through systems that otherwise only accept text.
The panel above splits your input into four categories: ASCII bytes grey, UTF-8 lead bytes orange, UTF-8 continuation bytes amber, padding = purple. Every 3-byte group on the left has a 4-character block on the right. With hello (5 bytes) you get two blocks - the second one ends in =. With München you can see how Mü becomes three bytes (4D C3 BC) and then the block TcO8.
A bit strip runs underneath the chips: three 8-bit bytes on the input side, four 6-bit sextets on the output side. Both sides carry exactly 24 bits, just sliced differently. Click any block and the strip jumps to that quantum - so you can watch 4D C3 BC (i.e. 01001101 11000011 10111100) regroup into the sextets 19 28 14 60, which the standard alphabet spells T c O 8.
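The regrouping the bit strip shows can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's own code: encodeQuantum is a hypothetical helper name, and it only handles a full 3-byte group (no padding).

```javascript
// Standard Base64 alphabet from RFC 4648 §4.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Regroup one full 3-byte group (24 bits) into four 6-bit sextets.
// encodeQuantum is a hypothetical name chosen for this sketch.
function encodeQuantum(b0, b1, b2) {
  const bits = (b0 << 16) | (b1 << 8) | b2; // one 24-bit integer
  const sextets = [
    (bits >> 18) & 0x3f, // top 6 bits
    (bits >> 12) & 0x3f,
    (bits >> 6) & 0x3f,
    bits & 0x3f, // bottom 6 bits
  ];
  const chars = sextets.map((s) => ALPHABET[s]).join("");
  return { sextets, chars };
}

const { sextets, chars } = encodeQuantum(0x4d, 0xc3, 0xbc);
console.log(sextets); // [ 19, 28, 14, 60 ]
console.log(chars); // TcO8
```

Each sextet is simply an index into the 64-character alphabet, which is why the same 24 bits can be read as three bytes or four characters.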
When do you run into Base64?
Base64 turns up wherever bytes have to travel through text channels. Developers are not the only readers - power users, sysadmins, and students hit it too, often without realising it is a translation rather than a secret code.
- Kubernetes secrets store data values as Base64 inside the YAML manifest (echo -n "password" | base64 returns cGFzc3dvcmQ=). stringData is the plain-text alternative.
- Data URIs in HTML like data:image/png;base64,iVBORw0KG... embed images straight into the markup.
- JWT tokens are three URL-safe Base64 parts: header, payload, signature. → Decode JWT header
- Email attachments have travelled as Base64 through SMTP since the 90s, because the protocol only guarantees 7-bit ASCII.
- Browser APIs like FileReader.readAsDataURL hand back Base64. JWK fields from crypto.subtle.exportKey("jwk", ...) use Base64url.
If you need the bytes as a hex list instead of Base64 - to compare against xxd, a hex viewer, or openssl output - the "Copy hex" action hands them over ready to paste.
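For readers who want the same conversion in their own scripts, here is a rough sketch of going from Base64 to a hex list. base64ToHex is a hypothetical helper, and the lowercase, space-separated output is an assumption; the tool's "Copy hex" format may differ.

```javascript
// Decode Base64 into raw bytes, then render them as a hex list.
// base64ToHex is a hypothetical helper; the separator is an assumption.
function base64ToHex(b64) {
  const binary = atob(b64); // one character per byte, code values 0x00-0xFF
  return Array.from(binary, (ch) =>
    ch.charCodeAt(0).toString(16).padStart(2, "0")
  ).join(" ");
}

console.log(base64ToHex("TcO8")); // 4d c3 bc
console.log(base64ToHex("aGVsbG8=")); // 68 65 6c 6c 6f
```

Drop the separator in join() if you want a continuous hex string instead.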
Standard or URL-safe - which variant fits?
The variant switch picks between the two alphabets defined in RFC 4648. Standard (§4) uses + and /. Both are special in URLs and have to be percent-encoded if they appear there. URL-safe (§5) replaces + with -, / with _, and usually drops the padding =, so the string is safe to drop into a URL, a filename, or a cookie value.
| Scenario | Right choice | What breaks otherwise |
|---|---|---|
| Bytes in YAML, JSON, headers | Standard | Strict decoders reject - and _; Standard is the default almost everywhere |
| Token in a URL path or query | URL-safe | + and / need extra percent-encoding |
| Filename derived from a hash | URL-safe | / would create a fresh path segment |
| JWT (header, payload, signature) | URL-safe | The token format requires Base64url without padding; Standard output is rejected |
Default to Standard. Switch to URL-safe only when the Base64 string ends up in a URL or a filename.
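Converting between the two variants is a character swap plus padding handling. A minimal sketch, assuming the input is already valid Base64; toBase64Url and fromBase64Url are hypothetical helper names:

```javascript
// Standard -> URL-safe: swap the two special characters, drop the padding.
function toBase64Url(b64) {
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// URL-safe -> Standard: swap back, then pad to a multiple of four characters.
// Assumes valid input; a length of 4n+1 can never be valid Base64.
function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  return b64 + "=".repeat((4 - (b64.length % 4)) % 4);
}

// The bytes FB FF encode to "+/8=" in the standard alphabet.
console.log(toBase64Url("+/8=")); // -_8
console.log(fromBase64Url("-_8")); // +/8=
```

Restoring the padding before calling atob is the cautious choice, since not every decoder accepts unpadded input.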
What does the = at the end mean?
Padding = rounds the final block up to four characters. Base64 consumes three bytes at a time. If your input is a multiple of three bytes, you get no =. One byte left over yields two =. Two bytes left over yield one =.
- hello (5 bytes) → one full 3-byte block plus 2 bytes left → aGVsbG8=
- hi (2 bytes) → 2 bytes left → aGk=
- Foo (3 bytes) → exact fit → Rm9v, no padding
- a (1 byte) → 1 byte left → YQ==
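Padding is not just decoration. It keeps every encoded string at a multiple of four characters, so a decoder can tell from the number of = how many bytes the final block holds. The URL-safe variant tends to drop the padding because the same information is implicit in the string length: two characters left over mean one byte, three characters left over mean two bytes. The decoder above accepts both forms.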
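The padding rule is plain modular arithmetic. The sketch below uses two hypothetical helpers, paddingFor and decodedLength, to show both directions: how many = a given byte count produces, and how many bytes a Base64 string holds, padded or not.

```javascript
// Number of "=" characters at the end of a padded encoding.
// 0 leftover bytes -> 0, 1 leftover byte -> 2, 2 leftover bytes -> 1.
function paddingFor(byteLength) {
  return (3 - (byteLength % 3)) % 3;
}

// Decoded byte count from the string alone, with or without padding.
// Every character carries 6 bits; incomplete trailing bits are discarded.
function decodedLength(b64) {
  const body = b64.replace(/=+$/, "");
  return Math.floor((body.length * 6) / 8);
}

console.log(paddingFor(5)); // 2 bytes left over -> 1
console.log(paddingFor(1)); // 1 byte left over  -> 2
console.log(decodedLength("aGVsbG8=")); // 5
console.log(decodedLength("aGVsbG8")); // 5, same answer without padding
```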
Why is btoa() tricky with umlauts?
btoa(string) is a 90s browser API. It expects a byte string: characters with code values from 0x00 to 0xFF. A ü (U+00FC) is inside that range, so btoa("München") does not throw. The result is TfxuY2hlbg==, because ü is treated as the single byte 0xFC. The UTF-8 form that APIs usually expect is TcO8bmNoZW4=.
The moment an emoji or a Chinese character shows up, btoa throws InvalidCharacterError, because those characters cannot fit into one byte. The encoder above handles both cases by taking the detour through TextEncoder: first turn the text into UTF-8 bytes, then call String.fromCharCode(...bytes), then btoa. Decoding works the same way in reverse: atob returns a byte string, its character codes go into a Uint8Array, and TextDecoder decodes those UTF-8 bytes back into text. With München you can see the jump from 7 characters to 8 bytes in the panel above - the ü accounts for two of them.
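The TextEncoder detour can be sketched as a pair of small wrappers. The names encodeUtf8Base64 and decodeUtf8Base64 are hypothetical, and this is not necessarily how the tool is written: the sketch builds the byte string in a loop rather than with String.fromCharCode(...bytes), because spreading a very large array can exceed an engine's argument limit.

```javascript
// Text -> UTF-8 bytes -> byte string -> Base64.
function encodeUtf8Base64(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Base64 -> byte string -> UTF-8 bytes -> text.
function decodeUtf8Base64(b64) {
  const binary = atob(b64); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(btoa("München")); // TfxuY2hlbg== (ü as the single byte 0xFC)
console.log(encodeUtf8Base64("München")); // TcO8bmNoZW4= (ü as C3 BC)
console.log(decodeUtf8Base64("TcO8bmNoZW4=")); // München
```

Both sides of an exchange have to agree on the byte encoding; decoding the Latin-1 form with a UTF-8 decoder will not give back the original ü.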
Frequently Asked Questions
What is Base64 and why does it exist?
Base64 is an encoding that packs arbitrary bytes into 64 safe ASCII characters (A-Z, a-z, 0-9, +, /). It started in email, which for decades could only carry 7-bit ASCII. Today Base64 turns up in data URIs, JWT tokens, Kubernetes secrets, and anywhere binary data has to travel through a text channel.
What does the = at the end mean?
The = is padding. Base64 packs three bytes into four characters. If your input is not divisible by three, the last block is padded with =. One = means two data bytes in the final block, two = means one data byte. The URL-safe variant usually drops the padding.
Why does btoa() produce junk on umlauts?
btoa(string) expects a byte string. Umlauts like ü are treated as Latin-1 bytes, while emoji and many other characters throw. The tool code routes through TextEncoder and produces the UTF-8 Base64 form that servers and APIs expect. The URL Encoder uses UTF-8 the same way.
Is Base64 encryption?
No. Base64 is encoding, not encryption. Anyone with the string and a decoder gets the plain text back. If you need secrecy, reach for encryption (AES, RSA, libsodium).
When do I need Base64 in a URL?
Whenever bytes have to travel through a URL without being mangled - JWT tokens, OAuth state parameters, webhook signatures, sometimes image hashes in paths. Switch to the URL-safe variant in those cases, or you end up with double encoding. More on the mechanics in the URL Encoder.