Base64 Encoding, Explained

Base64 shows up in data URIs, email attachments, JWTs, and auth headers — anywhere binary data needs to travel through a text-only channel. It's simple once you see the mechanics, and understanding it clears up a common and dangerous misconception: Base64 is not a form of security.

What is Base64?

Base64 is a binary-to-text encoding scheme. It represents arbitrary binary data using only 64 printable ASCII characters, so the result is safe to place in text-based formats and protocols. It's defined in RFC 4648.

The standard alphabet is:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

That's 64 symbols — hence the name — plus = used only for padding.
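
As a quick illustration, the alphabet can be built from Python's standard string constants. The names ALPHABET and index_of are just labels chosen for this sketch; the point is that a character's position in the string is the 6-bit value it stands for.

```python
import string

# Standard Base64 alphabet per RFC 4648: A-Z, a-z, 0-9, then '+' and '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Reverse lookup: character -> 6-bit value (used when decoding).
index_of = {ch: i for i, ch in enumerate(ALPHABET)}

print(len(ALPHABET))   # 64
print(ALPHABET[19])    # T
print(index_of["u"])   # 46
```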

Why Base64 exists

Plenty of systems were designed to carry text, not raw bytes: email bodies, URLs, JSON string fields, XML, HTTP headers. Sending raw binary through them can corrupt the data — a stray byte might be interpreted as a control character, a line break, or a delimiter. Base64 sidesteps the problem by re-expressing binary as safe text on the way in, and decoding it back on the way out.

It's worth being precise with terminology: Base64 is an encoding, not a cipher, not compression, and not a hash. It's fully reversible and adds size.

How Base64 works

Base64 processes input 3 bytes at a time. Three bytes are 24 bits; those 24 bits are split into four 6-bit groups, and each group (a value from 0–63) maps to one character in the alphabet.

So 3 bytes of input become 4 characters of output. Walking through the text Man:

Char  ASCII  8-bit binary
M     77     01001101
a     97     01100001
n     110    01101110

Concatenated, that's 010011010110000101101110. Regrouped into 6-bit chunks: 010011 010110 000101 101110 → values 19 22 5 46 → characters T W F u. So Man encodes to TWFu.
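
The same walk-through can be sketched in a few lines of Python. This is a minimal illustration that handles exactly one full 3-byte group and nothing else; encode_triplet is a name made up for this example, not a library function.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_triplet(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding logic)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles a full 3-byte group")
    # Pack the three bytes into one 24-bit integer.
    bits = int.from_bytes(chunk, "big")
    # Peel off four 6-bit groups, most significant first.
    values = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[v] for v in values)

print(encode_triplet(b"Man"))  # TWFu
```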

Padding

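When the input length isn't a multiple of 3, the last group is zero-filled up to a 6-bit boundary and the output is padded with = so it stays a multiple of 4 (one = for 2 leftover bytes, == for 1 leftover byte):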

Input  Bytes  Base64
Man    3      TWFu
Ma     2      TWE=
M      1      TQ==

That trailing = carries no data — it's only there so the length is a clean multiple of four.
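
To see where the padding comes from, here is a small hand-rolled encoder, written as a teaching sketch rather than a replacement for a real library. The function name b64encode_sketch is invented for this example; the last lines compare it against Python's built-in base64 module.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64encode_sketch(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full group
        # Zero-fill the short group so it is still 24 bits wide.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> s) & 0x3F] for s in (18, 12, 6, 0)]
        # Characters produced purely by the zero fill are replaced with '='.
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)

for sample in (b"Man", b"Ma", b"M"):
    encoded = b64encode_sketch(sample)
    # Cross-check against the standard library.
    assert encoded == base64.b64encode(sample).decode("ascii")
    print(sample, encoded)  # TWFu, then TWE=, then TQ==
```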

Size overhead

Because every 3 bytes become 4 characters, Base64 output is about 33% larger than the original. That's the trade-off for text safety — so compress or minify elsewhere when size matters, and don't Base64-encode large assets you could link to directly.
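
The overhead is easy to check numerically. The sketch below assumes padded output with no line wrapping (some email-oriented encoders insert line breaks, which adds a little more); encoded_length is a helper name made up for this example.

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input: 4 * ceil(n / 3)."""
    return 4 * ((n_bytes + 2) // 3)

for n in (1, 2, 3, 100, 1_000_000):
    actual = len(base64.b64encode(bytes(n)))  # bytes(n) is n zero bytes
    assert actual == encoded_length(n)
    print(f"{n} bytes -> {actual} chars ({actual / n:.2f}x)")
```

Tiny inputs pay proportionally more because of padding (1 byte becomes 4 characters), while large inputs settle at roughly 1.33x.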

URL-safe Base64

The standard + and / characters have special meaning in URLs and filenames. The URL-safe variant (RFC 4648 §5, used by JWTs among others) swaps them: + becomes - and / becomes _.

The encoded data is identical — only those two characters change so the string can sit in a URL, filename, or cookie untouched.
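
A short comparison using Python's base64 module shows the two variants side by side. The three input bytes are chosen only because they happen to produce both special characters.

```python
import base64

# 0xFB 0xFF 0xFE splits into the 6-bit values 62, 63, 63, 62.
raw = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +//+
print(url_safe)  # -__-

# Both decode back to the same bytes.
assert base64.b64decode(standard) == base64.urlsafe_b64decode(url_safe) == raw
```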

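Common use cases

Base64 turns up wherever binary data has to pass through a text-only channel: data URIs, email attachments, JWTs, and HTTP auth headers.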
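
Two of the uses named in the introduction, auth headers and data URIs, can be sketched with Python's base64 module. The username, password, and payload below are made up for illustration.

```python
import base64

# HTTP Basic auth: "user:password" is Base64-encoded into the Authorization header.
# These credentials are invented for the example.
credentials = "alice:s3cret".encode("utf-8")
auth_header = "Basic " + base64.b64encode(credentials).decode("ascii")

# Data URI: a small payload embedded directly in a URL.
payload = b"Hello"
data_uri = "data:text/plain;base64," + base64.b64encode(payload).decode("ascii")

print(auth_header)
print(data_uri)  # data:text/plain;base64,SGVsbG8=
```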

Base64 is not encryption

This is the most important point: Base64 is fully reversible by anyone, with no key. It hides nothing and provides zero confidentiality.

Use Base64 to transport data, never to protect it. If you see a password or token "hidden" in Base64, treat it as plaintext. For confidentiality, encrypt the data first, then Base64-encode the ciphertext if you need it in text form.
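
To make the point concrete, here is how little it takes to read a "hidden" value. The string below is a made-up example of the kind of thing that turns up in config files and request logs.

```python
import base64

# Looks opaque at a glance, but it is only encoded (invented example value).
hidden = "cGFzc3dvcmQxMjM="

# No key, no secret, no guessing: one standard-library call reverses it.
revealed = base64.b64decode(hidden).decode("utf-8")
print(revealed)  # password123
```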

A note on Unicode

Base64 encodes bytes, not characters. To encode text that contains non-ASCII characters (emoji, accents, non-Latin scripts), first convert it to bytes with UTF-8, then Base64-encode those bytes. Skipping the UTF-8 step is a classic source of "it works for English but breaks for everything else" bugs.
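
In Python the bytes-versus-characters distinction is explicit, which makes it a handy way to illustrate the rule. The sample text is arbitrary; other languages may convert implicitly (and sometimes wrongly), so treat this as a sketch of the principle rather than of every runtime's behavior.

```python
import base64

text = "caf\u00e9 \u2615"          # "café" plus a coffee-cup symbol
utf8_bytes = text.encode("utf-8")  # step 1: characters -> bytes

print(len(text))        # 6 characters
print(len(utf8_bytes))  # 9 bytes (the accent takes 2, the symbol takes 3)

encoded = base64.b64encode(utf8_bytes).decode("ascii")      # step 2: bytes -> Base64
decoded = base64.b64decode(encoded).decode("utf-8")         # reverse both steps
assert decoded == text

# Handing b64encode a str instead of bytes is rejected outright.
try:
    base64.b64encode(text)
except TypeError as exc:
    print("TypeError:", exc)
```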

Encode and decode Base64 online

The Base64 Encoder / Decoder converts text to Base64 and back, UTF-8 safe, right in your browser — the data you paste never leaves your device.

Frequently asked questions

Is Base64 secure or encrypted? No. It's a reversible encoding with no key. Anyone can decode it instantly.

Why does Base64 end with = or ==? That's padding, added when the input length isn't a multiple of 3 so the output length is a multiple of 4.

Why is my Base64 string longer than the original? Encoding inflates size by roughly 33% — that's inherent to the 3-bytes-to-4-characters mapping.

What's the difference between Base64 and Base64url? Base64url replaces + and / with - and _ (and usually drops padding) so the string is safe in URLs and filenames.

Why does my encoded emoji or accented text break? Encode the text as UTF-8 bytes first; Base64 operates on bytes, not characters.
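
One practical consequence of the Base64url answer above: when padding has been dropped, some decoders (Python's included) need it restored before they will decode. A minimal sketch, where b64url_decode is a helper name invented for this example:

```python
import base64

def b64url_decode(s: str) -> bytes:
    """Decode Base64url text whether or not its '=' padding was stripped."""
    # Add back however many '=' are needed to reach a multiple of 4.
    padding = "=" * (-len(s) % 4)
    return base64.urlsafe_b64decode(s + padding)

print(b64url_decode("TWFu"))  # b'Man'
print(b64url_decode("TWE"))   # b'Ma'
print(b64url_decode("TQ"))    # b'M'
```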

Keep going

Base64 Encode / Decode