
I have a .txt file in which each line holds all the information for one item, in this format:

1 +'item 1'+ [0, 0]
2 +'item 2'+ [0, 0]

The first number is the item ID, the string between the + signs is the item name, and the list at the end holds the item's stats. I need a regex that extracts the name between the + symbols, but none of the answers I have found do quite what I need, and I don't understand regex well. What pattern should I use to find the name?

Similar questions and answers that don't really answer my question: one, two

OakenDuck

2 Answers


Try isolating the item name with regular string methods. Splitting each line on '+' puts the name, still wrapped in its quotes, at index 1:

saved_names = []
with open('file.txt', 'r') as fr:
    for line in fr.readlines():
        name = line.split('+')[1]
        saved_names.append(name)

Or use regex:

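import re

# Compile the pattern once; the three groups capture the id, the name
# (including its surrounding quotes), and the stats.
pattern = re.compile(r'(.+)\s\+(.+)\+\s(.+)')

saved = []
with open('file.txt', 'r') as fr:
    for line in fr:
        match = pattern.match(line.strip('\n'))
        if match is None:
            continue  # skip lines that don't fit the expected format
        item_id, name, data = match.groups()
        saved.append((item_id, name, data))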
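
Applied to the two sample lines from the question, a variant of the regex approach might look like the sketch below. The names ITEM_RE and parse_item_line are hypothetical, and the sketch assumes every line follows the id +'name'+ [stats] layout exactly. It moves the quotes outside the name group so the captured name comes back without them, and uses ast.literal_eval to turn the stats text into a list.

```python
import ast
import re

# Hypothetical pattern: the quotes sit outside the name group, so the
# captured name has no surrounding quote characters.
ITEM_RE = re.compile(r"^(\d+)\s+\+'(.*)'\+\s+(\[.*\])$")

def parse_item_line(line):
    """Return (item_id, name, stats) for one line, or None if it doesn't match."""
    match = ITEM_RE.match(line.strip())
    if match is None:
        return None
    item_id, name, stats = match.groups()
    # ast.literal_eval safely converts the text "[0, 0]" into a Python list.
    return int(item_id), name, ast.literal_eval(stats)

# Sample lines copied from the question.
sample = ["1 +'item 1'+ [0, 0]", "2 +'item 2'+ [0, 0]"]
parsed = [parse_item_line(line) for line in sample]
print(parsed)
```

Returning None for lines that don't match leaves it to the caller to skip or report malformed lines.
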
BramAppel

It is better to use the split method, but if you really need regex, you can do it like this:

import re

file = 'filepath/to/your/text/file.txt'

with open(file, encoding='utf-8') as f:
    pattern = r'\'(.+)\''
    solution = re.findall(pattern, f.read())

print(solution)
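
As a small variation, here is a sketch that anchors the match on the + signs as well as the quotes, which keeps it from picking up any other quoted text that might appear on a line. The text variable is an in-memory stand-in for the file contents, assuming the format shown in the question.

```python
import re

# Stand-in for f.read(); assumes the line format shown in the question.
text = "1 +'item 1'+ [0, 0]\n2 +'item 2'+ [0, 0]\n"

# Match between +' and '+. The non-greedy .*? stops at the first closing '+
# rather than running on to a later quote on the same line.
names = re.findall(r"\+'(.*?)'\+", text)
print(names)
```

Because . does not match a newline by default, each match stays within a single line.
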
mrWiecek