Preprocessing ranges from structuring and cleaning raw data so that it is usable at all, to transforming data so that algorithms can handle it or produce better results. Where possible, also add tags for the specific methods involved. Use this tag for meaningful preprocessing steps in a data pipeline, either ahead of an algorithm or as a standalone method.
Data preprocessing applies at several of the stages in which data can persist.
It can happen at a higher level, right before more meaningful processing steps such as analysis take place.
But preprocessing also starts as soon as raw data is generated and has to be brought into a meaningful, usable format.
Currently, the tag data-manipulation fits this lower-level description better, as does data-structures if the way the data is stored and queried is important.
Finding errors and missing values, and deciding how to handle them, is also a major part of it.
For that, prefer the tags data-cleaning and/or data-wrangling.
The tag data-preprocessing should focus more on rearranging and transforming data so that it is usable by algorithms or improves their results. Examples of preprocessing are encoding, scaling, or normalizing an already formatted dataset.
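As a rough illustration of the encoding and scaling steps mentioned above, here is a minimal sketch in plain Python. The function names (`min_max_scale`, `standardize`, `one_hot_encode`) and the sample data are hypothetical and chosen only for this example. In practice, scikit-learn provides comparable functionality through classes such as `MinMaxScaler`, `StandardScaler` and `OneHotEncoder`.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot_encode(labels):
    """Turn categorical labels into 0/1 indicator rows.

    Returns the sorted category list and one row per input label.
    """
    categories = sorted(set(labels))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for label in labels:
        row = [0] * len(categories)
        row[index[label]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical sample data, for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0]
colors = ["red", "green", "red", "blue"]

scaled = min_max_scale(heights_cm)
standardized = standardize(heights_cm)
categories, encoded = one_hot_encode(colors)

print(scaled)
print(standardized)
print(categories, encoded)
```

Which transformation is appropriate depends on the algorithm that consumes the data. Distance-based methods, for instance, tend to be sensitive to feature scale, while most algorithms cannot work with raw string categories at all.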
Preprocessing algorithms and techniques can be found in the scikit-learn modules Preprocessing and Normalization.
Further theory and examples of why data preprocessing is necessary are discussed in the section scikit-learn - Preprocessing data.