Base64 shows up in data URIs, email attachments, JWTs, and auth headers — anywhere binary data needs to travel through a text-only channel. It's simple once you see the mechanics, and understanding it clears up a common and dangerous misconception: Base64 is not a form of security.
Base64 is a binary-to-text encoding scheme. It represents arbitrary binary data using only 64 printable ASCII characters, so the result is safe to place in text-based formats and protocols. It's defined in RFC 4648.
The standard alphabet is:
- A–Z, a–z (52 characters)
- 0–9 (10 characters)
- + and / (2 characters)

That's 64 symbols, hence the name, plus = used only for padding.
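As a quick illustration, the standard alphabet can be assembled from Python's built-in string constants. The name ALPHABET is only chosen for this sketch; the position of each character in the string is the 6-bit value it stands for.

```python
import string

# Standard Base64 alphabet (RFC 4648): values 0-25 map to A-Z,
# 26-51 to a-z, 52-61 to 0-9, 62 to '+', and 63 to '/'.
ALPHABET = (
    string.ascii_uppercase
    + string.ascii_lowercase
    + string.digits
    + "+/"
)

print(len(ALPHABET))                                           # 64
print(ALPHABET[0], ALPHABET[26], ALPHABET[52], ALPHABET[63])   # A a 0 /
```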
Plenty of systems were designed to carry text, not raw bytes: email bodies, URLs, JSON string fields, XML, HTTP headers. Sending raw binary through them can corrupt the data — a stray byte might be interpreted as a control character, a line break, or a delimiter. Base64 sidesteps the problem by re-expressing binary as safe text on the way in, and decoding it back on the way out.
It's worth being precise with terminology: Base64 is an encoding, not a cipher, not compression, and not a hash. It's fully reversible and adds size.
Base64 processes input 3 bytes at a time. Three bytes are 24 bits; those 24 bits are split into four 6-bit groups, and each group (a value from 0–63) maps to one character in the alphabet.
So 3 bytes of input become 4 characters of output. Walking through the text Man:
| Char | ASCII | 8-bit binary |
|---|---|---|
| M | 77 | 01001101 |
| a | 97 | 01100001 |
| n | 110 | 01101110 |
Concatenated, that's 010011010110000101101110. Regrouped into 6-bit chunks: 010011 010110 000101 101110 → values 19 22 5 46 → T W F u. So Man encodes to TWFu.
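The same regrouping can be sketched in a few lines of Python. The function encode_group is a hypothetical helper written for this walkthrough; it handles only a full 3-byte group, with no padding logic, and is checked against the standard library's base64 module.

```python
import base64
import string

ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding handling)."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three bytes into a single 24-bit integer.
    n = int.from_bytes(three_bytes, "big")
    # Slice out four 6-bit values, most significant group first.
    values = [(n >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[v] for v in values)

print(encode_group(b"Man"))  # TWFu

# Cross-check against the standard library encoder.
assert encode_group(b"Man") == base64.b64encode(b"Man").decode("ascii")
```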
When the input length isn't a multiple of 3, the output is padded with = so it stays a multiple of 4:
| Input | Bytes | Base64 |
|---|---|---|
| Man | 3 | TWFu |
| Ma | 2 | TWE= |
| M | 1 | TQ== |
That trailing = carries no data — it's only there so the length is a clean multiple of four.
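The padding behavior is easy to reproduce with Python's standard base64 module. This small sketch encodes the three inputs and collects the results in a dictionary (the name results is just for illustration).

```python
import base64

# Encode inputs of 3, 2, and 1 bytes and observe the padding each one gets.
results = {}
for text in ("Man", "Ma", "M"):
    encoded = base64.b64encode(text.encode("ascii")).decode("ascii")
    results[text] = encoded
    print(text, "->", encoded, f"(length {len(encoded)})")
```

Every output is four characters long; only the number of trailing = characters differs.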
Because every 3 bytes become 4 characters, Base64 output is about 33% larger than the original. That's the trade-off for text safety — so compress or minify elsewhere when size matters, and don't Base64-encode large assets you could link to directly.
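The overhead follows directly from the 3-to-4 mapping: with padding, n input bytes produce 4 × ceil(n / 3) characters. A minimal sketch, where encoded_length is a hypothetical helper name:

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input."""
    # (n + 2) // 3 is ceil(n / 3) in integer arithmetic.
    return 4 * ((n_bytes + 2) // 3)

for n in (1, 3, 300, 1_000_000):
    size = encoded_length(n)
    print(f"{n} bytes -> {size} chars ({size / n:.2f}x)")

# Sanity check against the real encoder for a range of sizes.
for n in range(0, 50):
    assert encoded_length(n) == len(base64.b64encode(bytes(n)))
```

For very short inputs the ratio is worse than 33% because of padding; it settles toward 4/3 as the input grows.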
The standard + and / characters have special meaning in URLs and filenames. The URL-safe variant (RFC 4648 §5, used by JWTs among others) swaps them:
- + becomes -
- / becomes _
- = padding is often dropped

The encoded data is identical; only those two characters change so the string can sit in a URL, filename, or cookie untouched.
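Here is a small sketch of the URL-safe variant using Python's standard library. The helpers b64url_encode and b64url_decode are hypothetical names for this example; they strip the padding on the way out and restore it before decoding, since the standard library decoder expects padded input.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 without trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Decode URL-safe Base64, restoring any padding that was dropped."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

# These three bytes were picked because they encode to 6-bit values 62 and 63.
sample = bytes([0xFB, 0xFF, 0xBF])
print(base64.b64encode(sample).decode("ascii"))  # +/+/
print(b64url_encode(sample))                     # -_-_
print(b64url_encode(b"M"))                       # TQ
```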
- Data URIs: embedding a small image inline, as in url(data:image/png;base64,iVBORw0...), saving an HTTP request for tiny assets.
- HTTP Basic authentication: Authorization: Basic <base64("username:password")>.

This is the most important point: Base64 is fully reversible by anyone, with no key. It hides nothing and provides zero confidentiality.
Use Base64 to transport data, never to protect it. If you see a password or token "hidden" in Base64, treat it as plaintext. For confidentiality, encrypt the data first, then Base64-encode the ciphertext if you need it in text form.
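To make the point concrete, the sketch below builds a Basic authorization header value and then reverses it using nothing but the standard library. The function names basic_auth_header and decode_basic_auth are illustrative, not part of any library, and the credentials are the well-known example pair from the HTTP Basic authentication specification.

```python
import base64
from typing import Tuple

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an 'Authorization: Basic ...' header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Recover (username, password) from a Basic header value. No key needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization value")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, since the password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password

header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(decode_basic_auth(header))
```

Anyone who can see the header can run the second function, which is why Basic authentication is only reasonable over an encrypted connection such as HTTPS.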
Base64 encodes bytes, not characters. To encode text that contains non-ASCII characters (emoji, accents, non-Latin scripts), first convert it to bytes with UTF-8, then Base64-encode those bytes. Skipping the UTF-8 step is a classic source of "it works for English but breaks for everything else" bugs.
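In Python the two steps are explicit, because the base64 module accepts bytes rather than str. A minimal sketch, with encode_text and decode_text as hypothetical helper names:

```python
import base64

def encode_text(text: str) -> str:
    """Text -> UTF-8 bytes -> Base64 string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode_text(encoded: str) -> str:
    """Base64 string -> bytes -> text, assuming the bytes are UTF-8."""
    return base64.b64decode(encoded).decode("utf-8")

print(encode_text("é"))  # w6k=  (é is the two bytes C3 A9 in UTF-8)

original = "naïve café 🙂"
assert decode_text(encode_text(original)) == original
```

The decoder has to assume the same character encoding the encoder used; Base64 itself carries no record of it.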
The Base64 Encoder / Decoder converts text to Base64 and back, UTF-8 safe, right in your browser — the data you paste never leaves your device.
Is Base64 secure or encrypted? No. It's a reversible encoding with no key. Anyone can decode it instantly.
Why does Base64 end with = or ==? That's padding, added when the input length isn't a multiple of 3 so the output length is a multiple of 4.
Why is my Base64 string longer than the original? Encoding inflates size by roughly 33% — that's inherent to the 3-bytes-to-4-characters mapping.
What's the difference between Base64 and Base64url? Base64url replaces + and / with - and _ (and usually drops padding) so the string is safe in URLs and filenames.
Why does my encoded emoji or accented text break? Encode the text as UTF-8 bytes first; Base64 operates on bytes, not characters.