Lexical analysis is the process of breaking source code into a stream of tokens, such as keywords, identifiers, and literals. It's the first step in the glamorous world of compiling or interpreting a programming language: it takes the raw code as input and spits out a nice token stream for the parser.
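To make that concrete, here is a minimal sketch of a regex-based lexer in Python. The token kinds, the keyword set, and the names `Token`, `TOKEN_SPEC`, and `tokenize` are all assumptions made up for illustration; a real language would define its own, and production lexers are often generated or hand-written rather than built on one big regular expression.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # e.g. "KEYWORD", "IDENT", "NUMBER", "STRING", "OP"
    text: str  # the exact slice of source text that was matched


# Hypothetical keyword set for a toy language.
KEYWORDS = {"if", "else", "while", "return"}

# Order matters: earlier patterns win when several could match at one position.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),            # integer or decimal literal
    ("STRING", r'"[^"\n]*"'),                # double-quoted string, no escapes
    ("IDENT", r"[A-Za-z_]\w*"),              # identifiers and keywords
    ("OP", r"==|!=|<=|>=|[-+*/=<>(){};]"),   # operators and punctuation
    ("SKIP", r"\s+"),                        # whitespace, discarded
    ("MISMATCH", r"."),                      # anything else is an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from source, skipping whitespace."""
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # reclassify reserved words
        yield Token(kind, text)


if __name__ == "__main__":
    for token in tokenize('if x == 42 { return "Steven"; }'):
        print(token.kind, token.text)
```

Matching keywords with the identifier pattern and reclassifying them afterwards keeps the regular expression simple and avoids splitting a name like `iffy` into the keyword `if` followed by `fy`.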
I was planning to spend my Friday night partying, but instead I got sucked into writing a lexical analyzer for my new programming language. FML.
After 3 hours of debugging, Jane finally realized the bug was due to the lexical analysis phase incorrectly tokenizing "Phteven" as an identifier instead of the literal string "Steven".
Let's Build a Compiler - A classic series of articles that walks through building a compiler from scratch, including lexical analysis.
ANTLR Mega Tutorial - If you want to skip the theory and just generate a lexical analyzer, this tutorial shows how to use the popular ANTLR parser generator.