
I have a file of chemical compounds in which each section name is followed by its data, sometimes spanning several lines, before the next section begins under a different name. I'm trying to read the 'NAME' entries (minus the 'NAME' keyword itself) and store each name (a compound can have several) in a list, stop when I reach the 'FORMULA' section, and then move on to the next 'NAME' section, but I don't know how. I'm a novice programmer. Here's an example:

[Image: Compound List Screenshot]

Here's my code so far:

li=[] #list of all names
for line in inputFile:
    if line[:5]=='ENTRY':
        items = line.split()
        cmNm = items[1] #compound Number
    else line[:4]=='NAME':
        items = line.split()
        cmName = items[]
        if line[:7]=='FORMULA':
            break
ΦXocę 웃 Пepeúpa ツ
Kevin

1 Answer

with open('/path/to/file.txt', 'r') as inputFile:
    for line in inputFile:
        try:
            # Skip lines until we find an entry
            while not line.startswith('ENTRY'):
                line = next(inputFile)
            # Set up for logging that entry
            cmNm = line.split()
            cmName = []
            # Move past the ENTRY line, then append all name lines
            line = next(inputFile)
            while not line.startswith('FORMULA'):
                cmName.append(line)
                line = next(inputFile)
            # Process cmNm/cmName for current compound before moving on
            print(str(cmNm) + " " + str(cmName))
        except StopIteration:
            pass  # Reached end of file

cmNm contains the split list of the ENTRY line.

cmName contains a list of the lines that together make up the name.

You'll need to add whatever processing you want to store or format cmNm and cmName; as written, the code just prints them as it goes.
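As a rough sketch of that processing step, the helper below turns the collected lines into a list of bare names. `clean_names` is a hypothetical name, and the code assumes two things the question doesn't confirm: that names are separated by trailing semicolons, and that the collected list may start with the ENTRY line. The sample lines are made up for illustration.

```python
def clean_names(raw_lines):
    """Turn raw NAME-section lines into a list of bare names."""
    names = []
    for raw in raw_lines:
        if raw.startswith('ENTRY'):
            continue  # ignore the ENTRY line if it was collected too
        # Drop the 'NAME' keyword from the first line of the section
        text = raw[4:] if raw.startswith('NAME') else raw
        # Assumption: a trailing ';' separates one name from the next
        text = text.strip().rstrip(';').strip()
        if text:
            names.append(text)
    return names


# Made-up sample lines, shaped like what the loop collects
example = [
    'ENTRY       C00001      Compound\n',
    'NAME        H2O;\n',
    '            Water\n',
]
print(clean_names(example))
```

You could call `clean_names(cmName)` in place of the `print` line and append the result to your master list.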

You can safely pass on StopIteration as long as the last valid entry has a FORMULA line; if it doesn't, that final entry is dropped without being printed.
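If you can't count on every entry having a FORMULA line, one alternative is to track state in a single pass instead of calling `next()` by hand. This is only a sketch: `iter_compounds` is a hypothetical name, and it assumes section keywords start in the first column while continuation lines are indented, which I can't verify without the file. The sample lines are made up.

```python
def iter_compounds(lines):
    """Yield (entry_id, names) pairs from an iterable of text lines."""
    entry_id, names, in_name = None, [], False
    for line in lines:
        if line.startswith('ENTRY'):
            if entry_id is not None:
                yield entry_id, names  # flush the previous compound
            fields = line.split()
            entry_id = fields[1] if len(fields) > 1 else ''
            names, in_name = [], False
        elif line.startswith('NAME'):
            in_name = True
            line = line[4:]  # drop the 'NAME' keyword
        elif not line[:1].isspace():
            # Any other keyword in column 0 (e.g. FORMULA) ends the names
            in_name = False
        if in_name:
            # Assumption: a trailing ';' separates one name from the next
            name = line.strip().rstrip(';').strip()
            if name:
                names.append(name)
    if entry_id is not None:
        yield entry_id, names  # last compound, with or without FORMULA


# Made-up sample; the second entry deliberately has no FORMULA line
sample = [
    'ENTRY       C00001      Compound\n',
    'NAME        H2O;\n',
    '            Water\n',
    'FORMULA     H2O\n',
    'ENTRY       C00002      Compound\n',
    'NAME        ATP;\n',
    "            Adenosine 5'-triphosphate\n",
]
for compound_id, compound_names in iter_compounds(sample):
    print(compound_id, compound_names)
```

Because a file object is itself an iterable of lines, `iter_compounds(inputFile)` should work the same way inside the `with open(...)` block.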

TemporalWolf