If you can already read the PDF and store its text in a string, you could do the following:
import re # Import the Regex Module
pdf_text = """
user: John
user: Doe
user id: 2
user id: 4
"""
# re.findall will create a list of all strings matching the specified pattern
results = re.findall(r'user:\s\w+', pdf_text)
# results is now ['user: John', 'user: Doe']
This means: find every match that starts with the literal string 'user:', followed by exactly one whitespace character ('\s'), followed by one or more ('+') word characters ('\w': letters, digits, and the underscore).
If you only want the "value" part back, use r'user:\s(\w+)', which tells the regex engine to capture the text matched by '\w+' in a group. When a pattern contains a single group, findall returns a list of that group's matches instead of the full matches, so the result would be:
results = re.findall(r'user:\s(\w+)', pdf_text)
# results is now ['John', 'Doe']
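If you need both the field name and its value, a pattern with two groups makes findall return a list of tuples. The following is only a sketch: the variable names sample_text and pairs are made up for illustration, the sample text is repeated so the snippet runs on its own, and using '\s*' (zero or more whitespace characters) instead of '\s' is an assumption to tolerate a missing space after the colon.

```python
import re

# Hypothetical sample text, assumed to look like the string used above
sample_text = """
user: John
user: Doe
user id: 2
user id: 4
"""

# Group 1 captures the field name ("user" or "user id"),
# group 2 captures the value. re.MULTILINE makes ^ and $ match
# at the start and end of every line, not only of the whole string.
pairs = re.findall(r'^(user(?: id)?):\s*(\w+)$', sample_text, flags=re.MULTILINE)

print(pairs)
# [('user', 'John'), ('user', 'Doe'), ('user id', '2'), ('user id', '4')]
```

The '(?: id)?' part is a non-capturing group, so it does not add a third element to each tuple.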
Take a look at the regex module documentation at: https://docs.python.org/3/library/re.html
Other functions can also help with more complex cases; finditer(), for example, yields match objects that expose each match's groups and its position in the text.
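As a rough illustration of finditer(), the sketch below collects each captured name together with the span where it was found. The names sample_text and found and the group name 'name' are made up for this example, and '\s*' is again an assumption to allow an optional space.

```python
import re

# Hypothetical sample text; the "user id" line does not match,
# because "user" is not directly followed by a colon there
sample_text = "user: John\nuser: Doe\nuser id: 2\n"

found = []
for match in re.finditer(r'user:\s*(?P<name>\w+)', sample_text):
    # match.group('name') is the captured value,
    # match.span() is the (start, end) position of the whole match
    found.append((match.group('name'), match.span()))

print(found)
# [('John', (0, 10)), ('Doe', (11, 20))]
```

Unlike findall, this gives you the full match object, which is handy when you also need to know where in the text each value came from.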
This regex guide could also be of help: https://www.regexbuddy.com/regex.html?wlr=1