Regex capture 2 lines above regex match

Question

I need help capturing the headwords (ZYGOMA, ZOMA, ZYGMA) that appear two lines above each match of n. m. (noun, masculine) or n. f. (noun, feminine). I've tried different flags such as multiline and dotall, but I still can't get the headwords above the match. Any help will be greatly appreciated.

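import re


def main():
    # Escape the dots so they match literal periods, not any character.
    mypattern = re.compile(r'n\. (m\.|f\.)')
    # Open the file in a with-block so it is closed automatically.
    with open("m.txt") as mytext:
        for line in mytext:
            match = mypattern.search(line)
            if match:
                # Prints only the matched marker, e.g. "n. m."
                print(match.group())

if __name__ == "__main__":
    main()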

The text I'm using as a sample is:

ZYGOMA

n. m. T. d'Anatomie . Os de la pommette de la joue.

ZOMA

n. m. T. d'Anatomie . Os de la pommette de la joue.

ZYGMA

n. m. T. d'Anatomie . Os de la pommette de la joue.

The main file I'll parse looks like this:


score 1 · Accepted Answer · answered Jul 22 '18 at 12:07

Assuming the words being searched for are written in capitals:

import re

text = """
    ZYGOMA

    n. m. T. d'Anatomie . Os de la pommette de la joue.

    ZOMA

    n. m. T. d'Anatomie . Os de la pommette de la joue.

    ZYGMA

    n. m. T. d'Anatomie . Os de la pommette de la joue.

    A B C

    n. m. T. d'Anatomie . Os de la pommette de la joue.
"""

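# Capture a run of capitals (and spaces) followed, after whitespace, by "n. m." or "n. f.".
# The alternation is grouped so that it applies only to the m/f choice.
g = re.findall(r'([A-Z][A-Z ]*)\s+(?=n\. (?:m|f)\.)', text)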
print(g)

This prints:

['ZYGOMA', 'ZOMA', 'ZYGMA', 'A B C']
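If the real file is too large to read into one string, a line-by-line variant may be easier to work with. The following is only a sketch: the helper name `headwords` and the `MARKER` pattern are illustrative, and one sample entry has been changed to `n. f.` purely to show that both markers are handled. It remembers the last non-blank line and yields it whenever the next non-blank line starts with the marker.

```python
import re

# Matches a line that starts with "n. m." or "n. f." (leading spaces allowed).
MARKER = re.compile(r'\s*n\. (?:m|f)\.')


def headwords(lines):
    """Yield the last non-blank line seen before each marker line."""
    previous = None
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip the blank separator lines
        if MARKER.match(line) and previous is not None:
            yield previous
        previous = stripped


# Sample adapted from the question; the second marker is changed to "n. f."
sample = """ZYGOMA

n. m. T. d'Anatomie . Os de la pommette de la joue.

ZOMA

n. f. T. d'Anatomie . Os de la pommette de la joue.
"""

result = list(headwords(sample.splitlines()))
print(result)  # ['ZYGOMA', 'ZOMA']
```

Because `headwords` accepts any iterable of lines, an open file object (for example the `m.txt` from the question) could be passed in place of `sample.splitlines()`.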

For Unicode capitalized words, the solution is here: Python regex for unicode capitalized words
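As one possible way to handle accented capitals with only the standard library (whose `re` module has no `\p{Lu}` class), the line above the marker can be captured without any case restriction and then filtered with `str.isupper()`. This is a sketch under assumptions: the name `PATTERN` and the `ÉCOLE` entry with its definition are made up for illustration and do not come from the original file.

```python
import re

# Capture any non-blank line that is followed, after blank space,
# by "n. m." or "n. f."; the case check is done afterwards in Python.
PATTERN = re.compile(
    r'^[ \t]*(\S[^\n]*?)[ \t]*\n\s*(?=n\. (?:m|f)\.)',
    re.MULTILINE,
)

# Hypothetical sample: the first entry is invented to include an accented capital.
text = """
ÉCOLE

n. f. Établissement où l'on enseigne.

ZYGOMA

n. m. T. d'Anatomie . Os de la pommette de la joue.
"""

# str.isupper() is Unicode-aware, so "ÉCOLE" passes the filter.
words = [w for w in PATTERN.findall(text) if w.isupper()]
print(words)  # ['ÉCOLE', 'ZYGOMA']
```

Filtering in Python keeps the pattern short; the third-party `regex` package is an alternative if a Unicode uppercase class is preferred inside the pattern itself.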

You're a life saver. Thanks Andrej – Anthony Peter Kwawu, Jul 22 '18 at 12:21
