Encoding

Links

- Dark corners of Unicode
- Let’s Stop Ascribing Meaning to Code Points
- UTF-8 – “The most elegant hack” (2013)
- UTF-8 Everywhere (HN)
- Not everything is UTF-8 (2020) (Lobsters)
- Archie Markup Language (ArchieML) - Structured text format optimized for human writability.
- Unidecode - Lossy ASCII transliterations of Unicode text.
- BARE Message Encoding - Simple binary representation for structured application data. (Lobsters)
- Explaining text representation in computers (2020)
- Text Encoding: The Thing You Were Trying to Avoid (2020)
- Amazon Ion - Richly-typed, self-describing, hierarchical data serialization format offering interchangeable binary and text representations. (HN) (HN)
- Unicode In Five Minutes (2013)
- Unicode support. What does that actually mean? (2020)
- Fast UTF-8 validation (2020) (HN) (Code)
- UTF-8 Illustrator
- Coded Character Sets, History and Development (1980)
- Awesome Code Points - Curated list of characters in Unicode, that have interesting (and maybe not widely known) features or are awesome in some other way.
- Text rendering tests - Unicode’s test suite for text rendering engines.
- Executable PNGs (2020) (Lobsters) (HN)
- Unicode Proposal to Encode Subscripts/Superscripts for Mathematical Programming
- Emoji Under the Hood (HN)
- The history of UTF-8 as told by Rob Pike (Lobsters) (HN)
- Concise Encoding - Friendly data format for human and machine. Ad-hoc, secure, with 1:1 compatible twin binary and text formats and rich type support. (Code)
- Practical Reed-Solomon for Programmers (2021)
- Apache Avro - Data serialization system. (Web)
- Unicode sorting is hard & why browsers added special emoji matching to regexp (2021)
- Any Encoding, Ever (2021) (HN)
- casync - Content Addressable Data Synchronizer.
- ANSI Escape Codes (HN)
- Entropy coding in Oodle Data: Huffman coding (2021)
- Fun with Morse Code
- Latinendian vs Arabendian (2020)
- ICU - International Components for Unicode (Code)
- ruststep - STEP toolkit for Rust.
- Overview of Serialization Technologies (2018)
- Substrait - Cross-Language Serialization for Relational Algebra. (Code) (substrait-rs)
- OpenH264 - Open Source H.264 Codec.
- Unicode Normalization Forms: When ö ≠ ö (2021) (HN)
- Planus - Alternative flatbuffer implementation.
- TinyCBOR - Concise Binary Object Representation (CBOR) Library.
- Cheat sheets for the Portable Document Format
- Understanding UUIDs, ULIDs and String Representations (Lobsters) (HN)
- How UTF-8 works (2022) (HN) (Lobsters)
- What Every Programmer Absolutely, Positively Needs To Know About Encodings (2011) (HN)
- Why I invented “dash encoding”, a new encoding scheme for URL paths (2022) (Tweet)
- You Don't Know GIF – An analysis of a GIF file and some weird GIF features (2022)
- DeGauss - Avro schema compatibility checker.
- Hex: A Strategy Guide (HN)
- trrs - CLI tool to transform data between different encodings.
- Ask HN: Is there a tool to generate binary protocol figures out of a spec? (2022)
- Plain Text - Dylan Beattie (2021)
- Low Complexity Communication Codec (LC3)
- ltp - High performance, readable, and maintainable, in-place encoding format.
- Identity Crisis: Sequence v. UUID as Primary Key (Lobsters)
- Concise Encoding - Secure data format for a modern world. (HN)
- BIPF (Binary In-Place Format) Spec - Binary format designed for in-place (without parsing) reads, with schemaless json-like semantics.
- New UUID Formats (2022) (HN)
- UTF8.XYZ - Quick web app for fetching Unicode characters without extra fluff. (Code)
- How to estimate disk space
- muon - Compact and simple binary object notation. (Lobsters) (Doc)
- Character encoding and UTF-8 (2022) (HN)