Encoding and Decoding

| Encoding Type | Description | Encoding Example | Used in Offense / Defense | Security Relevance |
| --- | --- | --- | --- | --- |
| UTF-8 | Variable-length encoding for Unicode (1–4 bytes). Most widely used. | "A" → 0x41, "é" → 0xC3 0xA9 | ✅ Common in XSS, LFI, filter bypass | Attackers mix character representations to evade input filters. Normalize before processing. |
| UTF-16 / UTF-32 | Unicode encodings built on 16-bit and 32-bit code units. UTF-16 is variable-width (2 or 4 bytes); UTF-32 is fixed-width. Often prefixed with a Byte Order Mark (BOM). | "A" → 0x00 0x41 (UTF-16BE) | ⚠️ Used in obfuscated PowerShell, malware droppers | Can bypass detection if the BOM or byte order is not handled. Normalize and scan memory for variants. |
| ASCII | 7-bit encoding for the basic Latin alphabet. | "A" → 0x41 | ✅ Used in shellcode, malware loaders | Seen in ASCII-art payloads and shell launchers. Straightforward for basic filters to inspect. |
| ISO-8859-1 (Latin-1) | 8-bit encoding for Western European characters. | "é" → 0xE9 | ⚠️ Exploited in charset confusion attacks | Can cause XSS/XSRF if the web server or browser misinterprets the encoding. Declare and normalize charset headers. |
| Windows-1252 | Superset of Latin-1 with smart quotes and symbols. | "“" → 0x93 | ⚠️ Used in phishing to replace characters deceptively | Smart-quote and symbol abuse can lead to visual deception and filter bypass. |
| Base64 | Binary-to-text encoding using a 64-character ASCII subset. | "Hi" → SGk= | ✅ Common in malware payloads, obfuscation, phishing emails | Decode PowerShell commands, scripts, and email content to detect payloads. |
| Hexadecimal | Represents each byte as two base-16 digits. | "Hi" → 0x48 0x69 | ✅ Shellcode, encoded payloads, registry obfuscation | Appears as %XX escapes in URLs, in malware configs, and in evasion techniques. Decode before logging. |
| Binary (ASCII) | Text encoded in 8-bit binary form. | "A" → 01000001 | ⚠️ Used in stego, low-level keyloggers | Covert channels or firmware attacks. Rare in phishing, but valuable in forensics. |
| Morse Code | Dots and dashes representing characters. | "S" → ... | 🧪 Seen in advanced phishing (e.g., Morse-obfuscated JS) | Used to sneak past traditional filters. Documented by Microsoft in a 2021 invoice-themed XLS.HTML phishing campaign. |
| Braille (Unicode) | Unicode block for the raised-dot writing system used by visually impaired readers. | "A" → ⠁ (U+2801) | 🧪 Rare; used in Unicode stego and obfuscation | Unicode abuse for detection evasion or visual trickery (e.g., malicious PDF). |
| QR Code / Barcode | Graphical encoding of data. | "Hello" → 📷 QR Code | ✅ Used in ransomware notes, phishing attachments | Attackers embed malicious URLs. In SOC environments, decode and validate the embedded URL before opening it. |
| Caesar Cipher | Rotational substitution cipher (e.g., ROT3). | "ABC" → "DEF" | 🧪 Seen in CTFs, basic malware obfuscation | Easy to decode. Sometimes still used in red team tricks or forum malware. |
| ROT13 | Caesar cipher with a fixed 13-letter rotation. | "HELLO" → "URYYB" | 🧪 Used for masking C2 commands or jokes | Occasionally used in old scripts or for visual bypass. Decode automatically during threat hunting. |
| URL Encoding | Encodes special characters as %XX. | " " → %20, "é" → %C3%A9 | ✅ Common in XSS, LFI, SQLi, and payload delivery | Critical vector in web attacks. Normalize URLs before validation and logging. |
| HTML Entities | Reserved HTML characters encoded for safe display. | "<" → &lt;, "&" → &amp; | ✅ Obfuscates malicious HTML in XSS payloads | Decode during sanitization to avoid injection via entity abuse. |
| Base32 | Binary-to-text encoding using a 32-character ASCII set. | "Hi" → JBUQ==== | ✅ Used in 2FA (TOTP), DNS tunneling, malware exfil | Base32 decoding in DNS logs helps detect covert channels. |
| Base58 | Binary-to-text encoding similar to Base64 that omits the ambiguous characters 0, O, I, and l. | "123" → HXRC (ASCII bytes, Bitcoin alphabet) | ✅ Used in Bitcoin, ransomware payment addresses | Seen in crypto wallet IDs and blockchain-related phishing or ransom notes. |
| Punycode | Encodes Unicode domain names into ASCII. | "münich.de" → xn--mnich-kva.de | ✅ Used in phishing via homograph domain spoofing | ɡoogle.com (with U+0261) is visually close to google.com. Enable IDN detection in email gateways and browsers. |
| Percent-Encoding | The formal name (RFC 3986) for URL encoding: special characters become %XX. | "@" → %40 | ✅ Used for bypass in query strings and POST bodies | Exploited in directory traversal and SQLi. Normalize percent sequences before analysis. |
| Quoted-Printable | MIME encoding for 8-bit text in emails. | "é" → =E9 (ISO-8859-1) | ✅ Used by Emotet, QakBot, and phishing kits | Email filters must decode this to extract URLs, attachments, or commands. |
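
Most of the example values in the table can be recomputed with the Python standard library, which is a quick way to check a value before relying on it. The sketch below is illustrative only; the `examples` dictionary and its keys are names chosen here, not part of any tool referenced above.

```python
import base64
import codecs
import html
import quopri
from urllib.parse import quote

# Recompute a handful of the table's examples (bytes.hex(sep) needs Python 3.8+).
examples = {
    "utf-8": "é".encode("utf-8").hex(" "),
    "utf-16-be": "A".encode("utf-16-be").hex(" "),
    "latin-1": "é".encode("latin-1").hex(),
    "cp1252": "\u201c".encode("cp1252").hex(),       # left smart quote
    "base64": base64.b64encode(b"Hi").decode("ascii"),
    "base32": base64.b32encode(b"Hi").decode("ascii"),
    "hex": b"Hi".hex(" "),
    "binary": format(ord("A"), "08b"),
    "rot13": codecs.encode("HELLO", "rot13"),
    "url": quote(" ") + " " + quote("é") + " " + quote("@"),
    "html": html.escape("<&"),
    "punycode": "münich.de".encode("idna").decode("ascii"),
    "quoted-printable": quopri.encodestring("é".encode("latin-1")).decode("ascii"),
}

for name, value in examples.items():
    print(f"{name:18} {value}")
```

Note that `"Hi"` is two bytes, so its Base32 form is four characters followed by four padding characters.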
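
The advice to normalize URLs and entities before validation matters most when a payload is encoded more than once (for example `%252e` for `.`). A minimal sketch of that idea, assuming a hypothetical helper named `peel_layers` and an arbitrary cap of five rounds, could look like this:

```python
import html
from urllib.parse import unquote


def peel_layers(value: str, max_rounds: int = 5) -> str:
    """Repeatedly percent-decode and HTML-unescape until the value stops changing.

    max_rounds is an assumed safety cap so hostile input cannot force
    unbounded work; tune it for your own pipeline.
    """
    current = value
    for _ in range(max_rounds):
        decoded = html.unescape(unquote(current))
        if decoded == current:
            break
        current = decoded
    return current


print(peel_layers("%252e%252e%252fetc%252fpasswd"))
print(peel_layers("%26lt%3Bscript%26gt%3B"))
```

Validation and logging should then run on the fully decoded value, not on the raw input.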
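
The Base64 and UTF-16 rows meet in PowerShell's `-EncodedCommand` parameter, which takes the Base64 encoding of a UTF-16LE string. The following sketch decodes such a blob; the function name and the sample command are made up for illustration.

```python
import base64


def decode_encoded_command(blob: str) -> str:
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE)."""
    return base64.b64decode(blob).decode("utf-16-le")


# Build a harmless sample the same way PowerShell expects it, then decode it.
sample = base64.b64encode("Write-Output 'hi'".encode("utf-16-le")).decode("ascii")
print(sample)
print(decode_encoded_command(sample))
```

Decoding with UTF-8 instead would leave a NUL byte after every ASCII character, which is one reason naive string matching misses these commands.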
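
For the Punycode row, one simple triage step is to decode every `xn--` label and flag hostnames that contain non-ASCII characters once decoded. This is a rough sketch using Python's built-in `punycode` codec; `decode_idn_labels` and `has_non_ascii_label` are hypothetical helper names, and a real gateway would also need confusable-character checks.

```python
def decode_idn_labels(hostname: str) -> str:
    """Decode each xn-- label of a hostname back to Unicode."""
    labels = []
    for label in hostname.split("."):
        if label.lower().startswith("xn--"):
            labels.append(label[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)


def has_non_ascii_label(hostname: str) -> bool:
    """True when the decoded hostname contains characters outside ASCII."""
    return not decode_idn_labels(hostname).isascii()


print(decode_idn_labels("xn--mnich-kva.de"))
print(has_non_ascii_label("xn--mnich-kva.de"))
print(has_non_ascii_label("google.com"))
```

A non-ASCII label is not malicious by itself, so treat the flag as a prompt for closer review.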
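
Caesar-style ciphers, ROT13 included, have only 25 useful shifts, so trying all of them is cheap during threat hunting. A small sketch, with `caesar_shift` and `brute_force` as illustrative names:

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Rotate ASCII letters by `shift` positions, leaving other characters alone."""
    shift %= 26
    upper, lower = string.ascii_uppercase, string.ascii_lowercase
    table = str.maketrans(
        upper + lower,
        upper[shift:] + upper[:shift] + lower[shift:] + lower[:shift],
    )
    return text.translate(table)


def brute_force(ciphertext: str) -> list:
    """Return (shift, candidate plaintext) pairs for every non-trivial shift."""
    return [(shift, caesar_shift(ciphertext, -shift)) for shift in range(1, 26)]


print(caesar_shift("ABC", 3))
print(caesar_shift("HELLO", 13))
for shift, candidate in brute_force("DEF")[:3]:
    print(shift, candidate)
```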
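
Base58 has no standard-library codec in Python, so the sketch below implements the plain encoding by hand, assuming the Bitcoin alphabet. It covers only raw Base58, not the checksummed Base58Check format used for wallet addresses, and `b58encode` is a name chosen here.

```python
# Bitcoin's Base58 alphabet: no 0, O, I, or l.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58 by treating them as one big-endian integer."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


print(b58encode(b"123"))
print(b58encode((123456789).to_bytes(4, "big")))
```

The result depends on whether the input is treated as text bytes or as a number: the ASCII bytes of `"123"` and the integer 123456789 give different strings.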