Table of Contents
1 The history of spam 3
2 Historical approaches to fighting spam 25
3 Language classification concepts 45
4 Statistical filtering fundamentals 63
5 Decoding: uncombobulating messages 87
6 Tokenization: the building blocks of spam 97
7 The low-down dirty tricks of spammers 111
8 Data storage for a zillion records 141
9 Scaling in large environments 157
10 Testing theory 177
11 Concept identification: advanced tokenization 197
12 Fifth-order Markovian discrimination 215
13 Intelligent feature set reduction 227
14 Collaborative algorithms 241
Appendix: Shining examples of filtering 257