Convert Emoji Characters to Unicode in JavaScript
Emojis are not just simple characters; most of them live in the Unicode Supplementary Planes. Understanding how JavaScript handles them is crucial for modern web development, especially when dealing with string lengths, databases, and cross-platform display.
The Problem: Why .length Lies
In JavaScript, strings are UTF-16 encoded. Characters like "A" or "☃" (Snowman) sit in the Basic Multilingual Plane (BMP) and take up a single 16-bit code unit. However, most modern emojis (like 😀) sit in the Supplementary Planes and are represented by two 16-bit code units, known as a surrogate pair.
'A'.length; // 1 (BMP)
'☃'.length; // 1 (BMP)
'😀'.length; // 2 (Supplementary Plane - Surrogate Pair)
'👨‍👩‍👧‍👦'.length; // 11 (Wait, what? See "ZWJ" below)
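The arithmetic behind that surrogate pair can be worked out by hand. The following is a minimal sketch of the standard UTF-16 encoding rule; the helper name `toSurrogatePair` is hypothetical and only serves this illustration.

```javascript
// Hypothetical helper: split a supplementary-plane code point into its
// UTF-16 lead and trail surrogates.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError('Not a supplementary-plane code point');
  }
  const offset = codePoint - 0x10000;        // 20 significant bits remain
  const lead = 0xD800 + (offset >> 10);      // top 10 bits -> lead surrogate
  const trail = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits -> trail surrogate
  return [lead, trail];
}

const [leadUnit, trailUnit] = toSurrogatePair(0x1F600);
console.log(leadUnit.toString(16), trailUnit.toString(16)); // "d83d de00"
console.log(String.fromCharCode(leadUnit, trailUnit) === '\u{1F600}'); // true
```

Two code units per emoji is exactly why .length reports 2 for a single visible character.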
Solution 1: String.prototype.codePointAt()
Before ES6, we used charCodeAt(), which returns a single 16-bit code unit, so for an emoji it only yields the first half of the surrogate pair. Modern JavaScript provides codePointAt(), which reads both halves and returns the full Unicode code point (a number up to 0x10FFFF).
const emoji = '😀';
// ❌ The old way (Incorrect for emojis)
console.log(emoji.charCodeAt(0)); // 55357 (Surrogate lead)
// ✅ The modern way
console.log(emoji.codePointAt(0)); // 128512
console.log(emoji.codePointAt(0).toString(16)); // "1f600"
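Documentation usually writes code points in "U+" notation, padded to at least four hex digits. Here is a small sketch of that formatting; `toUPlus` is a made-up name for this example. It also shows what happens if the index passed to codePointAt() lands in the middle of a surrogate pair.

```javascript
// Hypothetical helper: format the first code point of a string as U+XXXX.
function toUPlus(char) {
  const hex = char.codePointAt(0).toString(16).toUpperCase();
  return `U+${hex.padStart(4, '0')}`;
}

console.log(toUPlus('A'));          // "U+0041"
console.log(toUPlus('\u{1F600}'));  // "U+1F600"

// Index 1 points at the trail surrogate, so only that half is returned.
console.log('\u{1F600}'.codePointAt(1)); // 56832 (0xDE00)
```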
Solution 2: String.fromCodePoint()
To convert a Unicode number back to a visible emoji, use the static method String.fromCodePoint().
// Decimal
String.fromCodePoint(128512); // "😀"
// Hexadecimal
String.fromCodePoint(0x1F600); // "😀"
// Multiple points
String.fromCodePoint(0x1F1FA, 0x1F1F8); // "🇺🇸"
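The flag in the last line is built from two Regional Indicator Symbols, which run from U+1F1E6 (for "A") to U+1F1FF (for "Z"). As a sketch of that mapping, the hypothetical helper `flagFromCountryCode` below turns a two-letter code into the matching pair of code points. Whether the result renders as a flag depends on the platform's emoji font.

```javascript
// Hypothetical helper: map a two-letter country code to its pair of
// Regional Indicator Symbols.
function flagFromCountryCode(code) {
  if (!/^[A-Za-z]{2}$/.test(code)) {
    throw new TypeError('Expected a two-letter country code');
  }
  const REGIONAL_INDICATOR_A = 0x1F1E6; // code point paired with the letter "A"
  const points = [...code.toUpperCase()].map(
    (letter) => REGIONAL_INDICATOR_A + letter.charCodeAt(0) - 65 // 65 is "A"
  );
  return String.fromCodePoint(...points);
}

console.log(flagFromCountryCode('US') === String.fromCodePoint(0x1F1FA, 0x1F1F8)); // true
```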
Batch Conversion: Multiple Characters
If you have a string containing various emojis and characters, you can convert them all at once. For beginners, the spread syntax [...] is your best friend because it is "Unicode-aware": it iterates by code point, so it won't break surrogate pairs apart.
1. From String to Numbers (Encoding)
You can get an array of Unicode numbers (Decimal or Hex) using .map():
const text = "Hi 😀 🚀";
// Convert to Decimal numbers
const decimals = [...text].map(char => char.codePointAt(0));
console.log(decimals);
// [72, 105, 32, 128512, 32, 128640]
// Convert to Hexadecimal strings (common for CSS/JS escape)
const hexCodes = [...text].map(char => `0x${char.codePointAt(0).toString(16)}`);
console.log(hexCodes);
// ["0x48", "0x69", "0x20", "0x1f600", "0x20", "0x1f680"]
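Since hex values are mostly used inside escape sequences, the same .map() pattern can produce those strings directly. The sketch below uses two hypothetical helpers, `toJsEscapes` and `toCssEscapes`; it escapes every character, which is more than strictly needed for plain ASCII but keeps the example short.

```javascript
// Hypothetical helper: JavaScript code point escapes, e.g. \u{1F600}
function toJsEscapes(str) {
  return [...str]
    .map((ch) => `\\u{${ch.codePointAt(0).toString(16).toUpperCase()}}`)
    .join('');
}

// Hypothetical helper: CSS escapes, e.g. "\1F600 ".
// The trailing space terminates the hex sequence in CSS.
function toCssEscapes(str) {
  return [...str]
    .map((ch) => `\\${ch.codePointAt(0).toString(16).toUpperCase()} `)
    .join('');
}

console.log(toJsEscapes('Hi \u{1F600}'));  // \u{48}\u{69}\u{20}\u{1F600}
console.log(toCssEscapes('\u{1F600}'));    // \1F600 (followed by a space)
```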
2. From Numbers to String (Decoding)
To turn those numbers back into a readable string, use String.fromCodePoint with the spread syntax ...:
const myNumbers = [128512, 128640, 9731];
// This "spreads" the array items as individual arguments
const result = String.fromCodePoint(...myNumbers);
console.log(result); // "😀🚀☃"
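Code points often arrive as text in "U+" notation rather than as numbers. As a rough sketch, assuming whitespace-separated tokens like "U+1F600", a hypothetical `fromUPlus` helper could decode them like this:

```javascript
// Hypothetical helper: decode whitespace-separated "U+XXXX" tokens.
function fromUPlus(notation) {
  const points = notation.trim().split(/\s+/).map((token) => {
    if (!/^U\+[0-9A-Fa-f]{4,6}$/.test(token)) {
      throw new SyntaxError(`Not a U+ token: ${token}`);
    }
    return parseInt(token.slice(2), 16); // drop "U+" and parse the hex digits
  });
  return String.fromCodePoint(...points);
}

console.log(fromUPlus('U+1F600 U+1F680 U+2603')); // "😀🚀☃"
```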
3. Using a Simple Loop
If you find .map() confusing, a for...of loop works perfectly too:
const input = "🍎🍊";
for (const char of input) {
  const hex = char.codePointAt(0).toString(16);
  console.log(`Character: ${char} -> Unicode: U+${hex.toUpperCase()}`);
}
// Character: 🍎 -> Unicode: U+1F34E
// Character: 🍊 -> Unicode: U+1F34A
Complex Emojis: Zero Width Joiners (ZWJ)
Some emojis look like a single character but are actually a sequence of multiple Unicode code points joined by a special character called a Zero Width Joiner (ZWJ, \u200D).
For example, the "Family" emoji 👨‍👩‍👧‍👦 is composed of:
Man + ZWJ + Woman + ZWJ + Girl + ZWJ + Boy.
To split these into their individual code points, use the spread syntax [...] or Array.from(), which iterate by code point rather than by UTF-16 code unit:
const complexEmoji = '👨‍👩‍👧‍👦';
// ❌ Standard split (breaks the emoji)
console.log(complexEmoji.split(''));
// ["\ud83d", "\udc68", "\u200d", "\ud83d", ...]
// ✅ Unicode-aware iteration
const points = [...complexEmoji];
console.log(points);
// ["👨", "\u200d", "👩", "\u200d", "👧", "\u200d", "👦"]
// 1. Get code points as Hexadecimal (commonly used in docs)
const hexCodes = [...complexEmoji].map(c => `0x${c.codePointAt(0).toString(16)}`);
console.log(hexCodes);
// ["0x1f468", "0x200d", "0x1f469", "0x200d", "0x1f467", "0x200d", "0x1f466"]
// 2. Get code points as Decimal (Non-Hexadecimal)
const decimalCodes = [...complexEmoji].map(c => c.codePointAt(0));
console.log(decimalCodes);
// [128104, 8205, 128105, 8205, 128103, 8205, 128102]
// 3. Convert Decimal back to Emoji
console.log(String.fromCodePoint(...decimalCodes));
// "👨‍👩‍👧‍👦"
// 4. Convert Hex strings back to Emoji
// (each "0x..." string is coerced to a number by String.fromCodePoint)
const emojiFromHex = String.fromCodePoint(...hexCodes);
console.log(emojiFromHex);
// "👨‍👩‍👧‍👦"
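The composition also works in the other direction: joining the individual emojis with \u200D produces the sequence. The sketch below builds the family emoji from its parts and compares three ways of measuring it; `joinWithZwj` is a hypothetical helper name.

```javascript
const ZWJ = '\u200D'; // Zero Width Joiner

// Hypothetical helper: glue individual emojis into one ZWJ sequence.
function joinWithZwj(parts) {
  return parts.join(ZWJ);
}

// Man, Woman, Girl, Boy
const family = joinWithZwj(['\u{1F468}', '\u{1F469}', '\u{1F467}', '\u{1F466}']);

console.log(family.length);            // 11 UTF-16 code units (4 x 2 + 3 ZWJ)
console.log([...family].length);       // 7 code points (4 emojis + 3 ZWJ)
console.log(family.split(ZWJ).length); // 4 component emojis
```

Splitting on the ZWJ is safe here because \u200D is a single BMP code unit and never appears inside a surrogate pair.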
Practical Use Cases
- Database Storage: Ensure your database (like MySQL) uses utf8mb4 to store these 4-byte characters.
- Input Limiting: When limiting characters in a Bio or Tweet, count emojis with [...str].length instead of str.length. Note that this counts code points, so a ZWJ sequence such as the family emoji still counts as 7, not 1.
- Custom Text Renderers: Useful for Canvas-based games or high-performance UI components.
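For input limits that should treat a ZWJ sequence as one character, counting grapheme clusters is closer to what users see. The following sketch assumes a runtime that ships Intl.Segmenter (recent browsers and Node.js versions); `countGraphemes` is a hypothetical helper, and the exact segmentation depends on the Unicode data bundled with the runtime.

```javascript
// Hypothetical helper: count user-perceived characters (grapheme clusters).
// Assumes Intl.Segmenter is available in the current runtime.
function countGraphemes(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  let count = 0;
  for (const _segment of segmenter.segment(str)) {
    count += 1;
  }
  return count;
}

const bio = 'Hi \u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}';

console.log(bio.length);          // 14 code units
console.log([...bio].length);     // 10 code points
console.log(countGraphemes(bio)); // 4 graphemes: "H", "i", " ", and the family emoji
```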
Need an Interactive Tool?
If you're working with these conversions frequently, check out our Online Unicode & Emoji Converter. It allows you to bidirectionally convert between text, Hex, CSS, and JS escape sequences in real-time.