How to detect (length of) numbering in a numbered title?

Question

Is there a way (either a trained model or a deterministic function) in Python 3 to return the length of the numbering at the start of a title? For example:

"I. This is a big title" ---> length=len("I.")=2
"1.10 This a small title" ---> length=len("1.10")=4
"A)b) This is another title" ---> length=len("A)b)")=4
"C.2 This is a regular title" ---> length=len("C.2")=3
"This is not a title" ---> length=0

and so on.

I wrote a little function that uses regex to detect if a string begins with a numbering:

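import re

# Separator that may follow the numbering, optionally preceded by a space
sep = r'(\s|-|\s-|\)|\s\)|\.|\s\.|/|\s/|–|\s–)'
# '^' sits outside the group so it anchors every alternative, not only 'IX'
m_romans = re.search(r'^(IX|IV|VI{0,3}|I{1,3})' + sep, text)
m_letters = re.search(r'^([a-zA-Z])' + sep, text)
m_digits = re.search(r'^(\d+)' + sep, text)  # \d+ so that multi-digit numbers also match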

Maybe regex can help?

This comes down to writing a regex, or something functionally equivalent to a regex, for detecting a "numbering pattern". There are already lots of regex tutorials, so any additional help will simply be walking you through making explicit what you think a "numbering pattern" is. — Acccumulation, Jul 19 '18 at 15:09

AsheKetchum · Answer 1 · 2018-07-19T15:09:51.217

2

If the numbering is always at the start of the title and is separated from the rest by a space,

len(title.split()[0])

should work.

On second thought, you could take title.split()[0] and check that result against your regex. If it satisfies your definition of a numbering, return its length; otherwise return 0.
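The split-then-check idea could look roughly like the sketch below. The pattern is only an assumption about what counts as a numbering (digits, Roman-numeral letters, or single letters joined by "." or ")"), and the names numbering_length, COMPONENT, SEPARATOR and NUMBERING are made up for illustration; adjust them to your own definition.

```python
import re

# Assumed building blocks of a numbering token (not from the question):
# digits, Roman-numeral letters, or a single letter, each followed by '.' or ')'.
COMPONENT = r"(?:\d+|[IVXLCDM]+|[A-Za-z])"
SEPARATOR = r"[.)]"
# Either one or more "component + separator" groups with an optional trailing
# component (e.g. "1.10", "A)b)", "C.2", "I."), or bare digits (e.g. "3").
NUMBERING = re.compile(rf"(?:{COMPONENT}{SEPARATOR})+{COMPONENT}?|\d+")


def numbering_length(title: str) -> int:
    """Return the length of the first whitespace-separated token if it
    looks like a numbering, otherwise 0."""
    tokens = title.split()
    if not tokens:  # empty or whitespace-only title
        return 0
    first = tokens[0]
    return len(first) if NUMBERING.fullmatch(first) else 0


if __name__ == "__main__":
    for example in [
        "I. This is a big title",
        "1.10 This a small title",
        "A)b) This is another title",
        "C.2 This is a regular title",
        "This is not a title",
    ]:
        print(repr(example), numbering_length(example))
```

A bare letter such as the "A" in "A title" is deliberately rejected because it has no separator, but abbreviations like "e.g." would still be counted as numbering under this pattern, so it is a starting point rather than a complete rule.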

edited Jul 19 '18 at 15:09

answered Jul 19 '18 at 15:03

AsheKetchum


score 0 · Answer 2 · answered Jul 19 '18 at 15:07


You could try something like this: use a regex first to detect the numbering, then read off the position of the match, as described in

Return positions of a regex match() in Javascript?
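The linked question is about JavaScript; in Python the same information is available from the match object via Match.start() and Match.end(). A minimal sketch, assuming the numbering is made of digits, Roman-numeral letters or single letters joined by "." or ")" and is followed by whitespace or the end of the string (the pattern and the name numbering_prefix_length are illustrative, not from the question):

```python
import re

# Assumed shape of a numbering prefix; adapt to your own definition.
_COMPONENT = r"(?:\d+|[IVXLCDM]+|[A-Za-z])"
NUMBERING_PREFIX = re.compile(
    rf"(?:(?:{_COMPONENT}[.)])+{_COMPONENT}?|\d+)(?=\s|$)"
)


def numbering_prefix_length(title: str) -> int:
    """Return the end position of the numbering matched at the start of
    `title`, which equals its length, or 0 when nothing matches."""
    m = NUMBERING_PREFIX.match(title)  # match() only tries position 0
    return m.end() if m else 0


if __name__ == "__main__":
    print(numbering_prefix_length("1.10 This a small title"))
    print(numbering_prefix_length("This is not a title"))
```

Because re.match anchors at position 0, m.end() is directly the length of the numbering, with no need to split the title first.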


Rodrigo Espinoza
