I'm working on an NLP task where I have to analyse a corpus of sentences. Each word of a sentence is on its own line, and each of those lines carries the analysis for that word.
Sentences are separated by a blank line. I would like to give each sentence an ID, so that I can look up related information stored in other fields of another table. The desired result would be:
1 the
1 cat
1 is
1 black
2 the
2 moon
2 is
2 full
and so on, with every word on its own line. I think I should do this in Python, but I'm not sure how to go about it.
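A minimal sketch of one way this might be done in Python, assuming the input is a sequence of lines in which one or more blank lines mark a sentence boundary. The function name `number_sentences` and the sample data are made up for illustration. Each non-blank line is kept as it is and simply gets the sentence ID prepended.

```python
def number_sentences(lines):
    """Prefix each non-blank line with the ID of the sentence it belongs to.

    Hypothetical helper: assumes blank lines separate sentences.
    """
    sentence_id = 1
    in_sentence = False  # True once the current sentence has at least one line
    numbered = []
    for line in lines:
        content = line.strip()
        if content:
            numbered.append(f"{sentence_id} {content}")
            in_sentence = True
        elif in_sentence:
            # A blank line closes the current sentence.
            # Further consecutive blank lines are ignored.
            sentence_id += 1
            in_sentence = False
    return numbered


# Illustrative sample data, not taken from a real corpus file.
sample = ["the", "cat", "is", "black", "", "the", "moon", "is", "full"]
for row in number_sentences(sample):
    print(row)
```

Because an open file object yields its lines one by one, the same function should also accept a file opened with `open(path, encoding="utf-8")`, where `path` stands for wherever the corpus is stored.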