← Lambdica / mathematical-foundations
String
Also known as: word
A string over an alphabet is a finite sequence of symbols drawn from that alphabet.
Formal definition
Given an alphabet , a string (or word) of length over is a function . The length is denoted . The unique string of length zero is the empty string, written .
The set of all strings of length exactly over is . The set of all finite strings is the Kleene closure
Concatenation is the operation that appends one string to another. Under concatenation with identity , forms a monoid — the free monoid generated by .
Intuition
A string is what you get when you take finitely many symbols from an alphabet and put them in a specific order. The empty string is a frequent source of bugs in proofs and implementations because it is simultaneously the unique length-zero object and the identity of concatenation. Treat it as a first-class citizen from the start; do not default to assuming strings have at least one character.
The finiteness is load-bearing. Infinite sequences over exist — they are called -words and live in — but they require a different theory altogether (Büchi automata and -regular languages). For the automata theory and computability covered on sibling pages, every input is a finite string.
Notes
A language over is any subset . Formal language theory studies which subsets can be specified by which kinds of machines.
The name Kleene closure honors Stephen Kleene, who introduced the operation in his 1951 RAND memorandum on regular expressions.
Connections
depends_on
References
sipser-2013-introduction