Snowball is a small language for writing stemming algorithms, used primarily in information retrieval and natural language processing.
Snowball was created by Dr. Martin Porter as a string processing language, partly to provide a canonical implementation of Porter's stemming algorithm and partly to make it easier to write stemmers for languages other than English.
A further aim of Porter's was to provide a way of creating and defining stemmers that could readily or automatically be translated into C, Java, or other programming languages. The Snowball compiler translates a Snowball script (a .sbl file) into either a thread-safe ANSI C program or a Java program. For ANSI C, each Snowball script produces a program file and a corresponding header file (with .c and .h extensions).
The name "Snowball" is a tribute to the SNOBOL programming language.