Good question! spaCy does not internally represent numeric tokens as numbers, so it has no explicit concept of their values. In that sense it can't tell the difference between valid and invalid values for an age.
However, spaCy does use "shape" features when representing tokens, and these will help it recognize valid ages. There are different kinds of shape features, but the one spaCy uses represents a word by converting each character to a symbol for its character type. It works like this:
- spaCy → xxxXx
- fish → xxxx
- Fish → Xxxx
- 23 → dd
- 1000 → dddd
- 22.7 → dd.d
Because of this, you could expect spaCy to learn that two-digit numbers are likely to be ages, while numbers with decimals or four digits aren't. On the other hand, this doesn't help it differentiate between 100 and 999, since both have the shape ddd.
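To make the shape idea concrete, here is a rough pure-Python sketch of that kind of feature. It is an approximation for illustration, not spaCy's actual code: the function name `word_shape` and the `max_run` parameter are my own, and the cap of four repeated symbols reflects my reading of the spaCy docs. In spaCy itself you would just read `token.shape_`.

```python
def word_shape(text, max_run=4):
    """Approximate a spaCy-style shape string for a single token."""
    shape = []
    last = ""
    run = 0
    for ch in text:
        if ch.isalpha():
            sym = "X" if ch.isupper() else "x"
        elif ch.isdigit():
            sym = "d"
        else:
            sym = ch  # punctuation and other characters are kept as-is
        # Count how long the current run of identical symbols is.
        run = run + 1 if sym == last else 1
        last = sym
        # Assumption: runs longer than max_run are truncated.
        if run <= max_run:
            shape.append(sym)
    return "".join(shape)


for word in ["spaCy", "Fish", "23", "22.7", "100", "999"]:
    print(word, "->", word_shape(word))
```

Note that 100 and 999 come out identical, so nothing in this feature separates a plausible age from an implausible one.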
For dates, shape will not help with determining whether a birthdate is valid or invalid. Shape is just one of spaCy's features, but the other features, like prefix and suffix, aren't really going to help with this either.
Since it's easy to verify numeric values in code, what I would suggest is matching broadly in spaCy and then using your own function to check whether dates or ages are valid by parsing them.
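As a minimal sketch of that second step, something like the following could run over whatever candidate strings spaCy gives you. Everything here is an assumption on my part rather than anything spaCy provides: the function names, the 120-year cap, and the ISO `YYYY-MM-DD` date format are placeholders you would adjust to your data.

```python
from datetime import date, datetime


def is_valid_age(text, max_age=120):
    """Return True if text is a whole number in a plausible age range."""
    # Reject decimals, signs, and non-ASCII digits before converting.
    if not (text.isascii() and text.isdigit()):
        return False
    return 0 <= int(text) <= max_age


def is_valid_birthdate(text, fmt="%Y-%m-%d", today=None, max_age=120):
    """Return True if text parses as a real, non-future date within max_age years."""
    try:
        born = datetime.strptime(text, fmt).date()
    except ValueError:
        # Wrong format or an impossible date such as February 30.
        return False
    today = today or date.today()
    if born > today:
        return False
    # Subtract one if this year's birthday has not happened yet.
    age = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    return age <= max_age


print(is_valid_age("23"), is_valid_age("999"), is_valid_age("22.7"))
print(is_valid_birthdate("1990-02-30"))
```

The `today` parameter is only there so the check can be tested against a fixed date; in normal use you would leave it out.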
Outside of spaCy in particular, the question of how NLP models represent numeric values is actually an increasingly popular research topic - if you'd like to know more about it this is a recent article on the topic: Do Language Models Know How Heavy an Elephant Is?