Text to Binary

Free online text to binary tool. No signup required.

Built by Bob · Article by Lace · QA by Ben

How to use

  1. Use the Text To Binary tool above.

What the Text to Binary converter does

Every letter, digit, space, and emoji on your screen is a number underneath. The computer doesn't store the letter "A" — it stores the number 65, which it draws as "A" because a lookup table says so. Binary is just that same number written in base 2: a string of zeros and ones grouped into bytes. The Text to Binary converter takes whatever you type and shows you the underlying numbers.

Paste in Hi and the converter returns:

01001000 01101001

That's two bytes, eight bits each. The first byte (01001000) is decimal 72, which the ASCII table maps to the capital letter H. The second (01101001) is decimal 105, which maps to lowercase i. Same letters you typed, written in the language the CPU actually speaks.

Type Hello, world! and you get thirteen bytes back — one per character, including the comma, the space, and the exclamation mark. Spaces aren't invisible to the computer; they're just byte 32 (00100000), as real as any letter.
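Under the hood, this conversion is only a few lines of code. Here is an illustrative JavaScript sketch (not the tool's actual source) using the TextEncoder API the tool is built on:

```javascript
// Illustrative sketch: text -> spaced 8-bit binary, assuming UTF-8 encoding.
const textToBinary = (text) =>
  Array.from(new TextEncoder().encode(text))          // text -> UTF-8 bytes
    .map((byte) => byte.toString(2).padStart(8, "0")) // each byte -> 8 bits
    .join(" ");

console.log(textToBinary("Hi")); // "01001000 01101001"
```

Running it on "Hello, world!" yields thirteen space-separated bytes, matching the count above.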

Why anyone needs to see binary

Most people never look at binary directly. The computer handles it, your editor renders the result, and you move on. But there are a handful of moments where seeing the bits matters:

  • Learning computer science basics. Encoding is the first abstraction every CS student hits. Reading 01001000 as H the first time is when bits stop being mysterious.
  • Debugging character encoding bugs. When an emoji shows up as ????? or your CSV breaks on a special character, the problem is almost always a byte the receiving program didn't expect. Looking at the actual bytes tells you whether you're shipping UTF-8, Latin-1, or something stranger.
  • Working with low-level protocols. Network packet headers, serial port firmware, microcontroller code, embedded systems — anywhere bytes travel one at a time, you read them as binary.
  • Hobby and puzzle work. Escape rooms, CTF challenges, geocaching clues, Reddit's r/codes — binary is a popular "secret message" format because it looks cryptic but decodes trivially.
  • Teaching kids. Showing a 10-year-old that HI equals 01001000 01001001 is the moment binary stops being abstract.

Most online binary converters either bury the conversion under a sign-up wall or wrap it in a "developer suite" with twelve other features you don't want. This one just converts. Type on the left; binary on the right.

How character encoding actually works

The conversion runs in two steps. First, each character in your input is mapped to a number (its code point). Second, that number is written out in base 2.

For the first 128 characters — the English alphabet, digits, basic punctuation, and a handful of control codes — the mapping is defined by ASCII, the American Standard Code for Information Interchange, first published in 1963. ASCII reserves seven bits per character, which gives 128 possible values (0 through 127). The Text to Binary converter pads each value to 8 bits for readability, so the leading bit is always 0 for ASCII characters.

Once you move past character 127 — accented letters, currency symbols, emoji, Cyrillic, Chinese, anything beyond plain English — ASCII runs out. Modern systems use Unicode, which assigns a code point to every character in every writing system on Earth (currently around 150,000 characters and counting). Unicode itself is just the numbering; UTF-8 is how those numbers get written as bytes.

UTF-8 is variable-length: ASCII characters still take one byte, but characters above 127 take two, three, or four bytes. The encoding is clever — the first bits of each byte tell the reader how many bytes the character spans, so there's no ambiguity. The Text to Binary converter follows UTF-8 because that's the default encoding on the modern web (97% of pages, per the W3Techs survey).

A worked example, byte by byte

Let's walk through Hi! in full detail. Three characters, all in the ASCII range, so three bytes.

H → ASCII 72 → binary 01001000
i → ASCII 105 → binary 01101001
! → ASCII 33 → binary 00100001

To go from decimal to binary, you ask "which powers of 2 add up to this number?" For 72 that's 64 + 8, so positions 6 and 3 (counting from 0 on the right) are 1, everything else is 0. Written out: 0-1-0-0-1-0-0-0. Pad to a full byte by adding a leading 0 if needed, and you get 01001000.

For 105: 64 + 32 + 8 + 1 = 105. That's positions 6, 5, 3, and 0. Written out: 01101001.

For 33: 32 + 1 = 33. Positions 5 and 0. Written out: 00100001.

The converter does this for every character in your input and joins the bytes with spaces. For longer strings the principle is identical — it just scales up.
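The powers-of-2 walk above is mechanical enough to code directly. A small JavaScript sketch (toByte is a hypothetical helper name, not part of the tool):

```javascript
// Decimal -> 8-bit binary string by checking each power of 2, high to low.
function toByte(n) {
  let bits = "";
  for (let position = 7; position >= 0; position--) {
    bits += n & (1 << position) ? "1" : "0"; // is 2^position part of the sum?
  }
  return bits;
}

console.log(toByte(72));  // "01001000"
console.log(toByte(105)); // "01101001"
console.log(toByte(33));  // "00100001"
```

JavaScript's built-in `(72).toString(2).padStart(8, "0")` produces the same string.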

Common characters and their binary

A short reference for the characters people look up most often:

Character      Decimal   Binary (8 bits)   Notes
Space          32        00100000          Yes, spaces are real bytes
!              33        00100001          First printable punctuation
0              48        00110000          Digit zero, not numeric zero
9              57        00111001          Digits run 48–57 in order
A              65        01000001          Uppercase A
Z              90        01011010          Last uppercase letter
a              97        01100001          Lowercase, 32 above uppercase
z              122       01111010          Last lowercase letter
Newline (\n)   10        00001010          Linux/Mac line break
Tab (\t)       9         00001001          Horizontal tab

Notice the pattern: uppercase A through Z occupy 65–90, lowercase a through z occupy 97–122, and the offset between them is exactly 32. That's why flipping bit 5 (the bit worth 32) of any letter toggles its case, and why old-school programmers sometimes wrote case-toggling as a single XOR with 32.

The digits 0 through 9 occupy 48–57, which is also why subtracting 48 from a digit character gives you the numeric value. '7' - 48 = 7. Useful when you're parsing numbers character by character in low-level code.
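Both tricks can be demonstrated in a few lines of JavaScript (an illustrative sketch; the helper names are made up):

```javascript
// Flipping bit 5 (value 32) toggles the case of an ASCII letter.
const toggleCase = (ch) => String.fromCharCode(ch.charCodeAt(0) ^ 32);

// Subtracting 48 from each digit's code builds the number character by character.
function parseDigits(s) {
  let value = 0;
  for (const ch of s) {
    value = value * 10 + (ch.charCodeAt(0) - 48); // '7' is code 55; 55 - 48 = 7
  }
  return value;
}

console.log(toggleCase("A"));    // "a"  (65 ^ 32 = 97)
console.log(toggleCase("z"));    // "Z"  (122 ^ 32 = 90)
console.log(parseDigits("307")); // 307
```

Note that the XOR trick only makes sense for ASCII letters; applied to anything else it just produces a different, unrelated character.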

What happens beyond ASCII: UTF-8 in practice

Type a plain English letter and you get one byte. Type an accented letter and you get two. Type a Chinese character and you get three. Type an emoji and you usually get four.

The character "é" (Latin small e with acute) has code point 233. That fits in 8 bits in theory, but UTF-8 reserves the high bit (anything where the leading bit is 1) for multi-byte sequences. So "é" gets encoded as two bytes: 11000011 10101001. The leading 110 on the first byte means "this is the start of a 2-byte character," and the leading 10 on the second byte means "I'm a continuation."

The character "中" (Chinese "middle") has code point 20013, which needs 15 bits — too many for 2 bytes after UTF-8's overhead. It encodes as three bytes: 11100100 10111000 10101101. Leading 1110 on byte 1, leading 10 on bytes 2 and 3.

The emoji "😀" (grinning face) has code point 128512, which needs 17 bits. UTF-8 uses four bytes: 11110000 10011111 10011000 10000000. Leading 11110, then three continuation bytes.
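You can verify these encodings yourself with TextEncoder, which always emits UTF-8 (toBits is a hypothetical helper name):

```javascript
// Print each character's UTF-8 bytes in binary to see the length prefixes.
const toBits = (s) =>
  Array.from(new TextEncoder().encode(s))
    .map((b) => b.toString(2).padStart(8, "0"))
    .join(" ");

console.log(toBits("é"));  // "11000011 10101001"
console.log(toBits("中")); // "11100100 10111000 10101101"
console.log(toBits("😀")); // "11110000 10011111 10011000 10000000"
```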

This is why a 100-character tweet in English fits in 100 bytes but the same length in Chinese needs 300, and a message full of emoji can hit 400. The on-screen character count and the on-the-wire byte count are different numbers in any non-ASCII text.

Going the other direction: binary back to text

The Text to Binary converter has a toggle for the reverse direction. Paste in a string of bytes (spaces between each byte are tolerated; so are runs of zeros and ones with no separators, as long as the total length is a multiple of 8), and it returns the decoded text.

If your binary doesn't decode cleanly — wrong number of bits, invalid UTF-8 sequence, control characters that don't render — the converter tells you what's off rather than silently guessing. Most "this binary won't decode" cases come from one of three issues: a byte that's not 8 bits long, a UTF-8 multi-byte sequence that's incomplete, or someone hand-typed the binary and inverted a bit. Looking at the failing byte usually pinpoints which.
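A sketch of the reverse direction in JavaScript, using TextDecoder with fatal: true so malformed UTF-8 raises an error rather than being silently replaced (binaryToText is an illustrative name, not the converter's actual code):

```javascript
// Binary string -> text; throws on anything that wouldn't decode cleanly.
function binaryToText(binary) {
  const bits = binary.replace(/\s+/g, ""); // spaces between bytes are tolerated
  if (!/^[01]*$/.test(bits)) throw new Error("input contains non-binary characters");
  if (bits.length % 8 !== 0) throw new Error("bit count is not a multiple of 8");
  const bytes = new Uint8Array(bits.length / 8);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(bits.slice(i * 8, i * 8 + 8), 2);
  }
  // fatal: true -> incomplete multi-byte sequences throw instead of yielding U+FFFD
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(binaryToText("01001000 01101001")); // "Hi"
```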

If you specifically need to decode binary that came from another source, the Binary Decoder is the same conversion wrapped in a paste-and-go interface. The output is identical.

Related encoding tools

Binary is one number base among several. The Microapp encoding tools cover the common ones:

  • Binary to Decimal Converter — for going from base 2 to base 10 directly, without the character-encoding layer. Useful when you're working with raw numbers (network masks, bit flags) rather than text.
  • Number Base Converter — converts between binary, octal, decimal, and hexadecimal in one place. Useful for any task that crosses base boundaries.
  • Base64 Encoder/Decoder — for encoding binary data inside text-safe channels (email, JSON, URLs). Base64 is what you reach for when raw bytes need to ride through a string-only pipe.
  • MD5 Hash Generator — for producing a fixed-length fingerprint of any input. Different from encoding (which is reversible) — hashing is one-way.
  • SHA-256 Generator — the modern cryptographic hash, used everywhere from Bitcoin to TLS certificates.

Frequently asked questions

Why 8 bits per character and not 7?

ASCII is technically 7 bits, but bytes on modern hardware are 8 bits. The convention is to pad ASCII to a full byte with a leading zero — it makes alignment easier and matches how the byte sits in memory. The Text to Binary converter follows that convention.

What happens if I paste in something with accents or emoji?

It encodes as UTF-8. ASCII characters still take one byte each; accented letters take two; most CJK (Chinese, Japanese, Korean) characters take three; emoji and rare scripts take four. The total byte count will be larger than the character count if your input contains anything outside basic English.

Is my text sent to a server?

No. The conversion runs entirely in your browser using JavaScript's TextEncoder API. Nothing crosses the network. Closing the tab takes everything with it.

Can I paste a paragraph or only a few characters?

Either. The converter handles short snippets and full paragraphs equally well. There's no character limit beyond what your browser can hold in memory — typically tens of megabytes before any lag.

Why does my emoji take four bytes when a letter takes one?

Emoji are high-numbered Unicode code points (most sit above 125,000), and UTF-8 needs more bytes to encode larger numbers. The leading bits of each byte signal how many bytes the character spans. It's the cost of having one encoding that handles every writing system on Earth.

How do I read a binary string by hand?

Each bit is a power of 2. Reading right to left, the positions represent 1, 2, 4, 8, 16, 32, 64, 128. Sum the powers wherever there's a 1. For 01001000: 64 + 8 = 72, which is the ASCII code for H. Practice on a few short strings and the pattern sticks.
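The same right-to-left sum, sketched in JavaScript:

```javascript
// Sum 2^i for every position i (counting from the right) that holds a 1.
const bitsToDecimal = (bits) =>
  [...bits].reverse().reduce((sum, bit, i) => sum + (bit === "1" ? 2 ** i : 0), 0);

console.log(bitsToDecimal("01001000")); // 72
console.log(bitsToDecimal("00100001")); // 33
```

The built-in shortcut is `parseInt("01001000", 2)`.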

What's the difference between ASCII and UTF-8?

ASCII covers 128 characters using 7 bits. UTF-8 covers every Unicode character (around 150,000 and growing) using 1 to 4 bytes per character. UTF-8 is backwards-compatible with ASCII — any pure-ASCII text is also valid UTF-8 with identical bytes. The Text to Binary converter uses UTF-8 for everything, so ASCII input produces ASCII-style binary, and non-ASCII input produces multi-byte sequences.

Why doesn't my binary decode back to the same text?

Three common reasons: the bit count isn't a multiple of 8 (a byte is dropped or extended), a multi-byte UTF-8 sequence is incomplete (you have the first byte but not the continuation), or a stray space ended up in the middle of a byte. The Binary Decoder will point at the offending position rather than silently producing garbage.