This is a starter pack for people working on tokenization for Large Language Models (LLMs) and Natural Language Processing (NLP).