Python extracting blocks of text from a file

Question

I need to extract two blocks of text from several files and put them in separate lists using Python. The first block starts at line 30 and isn't too hard to extract. The second block starts 2 lines after the first block ends; the issue is that the blocks can be of variable length. For example:

prj_files = [
  line,
  line,
  etc
]

prj_files_2 = [
  line,
  etc
]

So I need to take all the lines between [] in the first block and put them in one list, and take the lines between [] in the second block and put them in another list. As of right now, I use:

for i, line in enumerate(prj):
  if i > 29:

to start on a specific line, and then a regular expression finds "]", at which point I break out of the for loop and record the line it ended on in cnt. Then I use another for loop starting at cnt + 2 to extract the second block. While I think this works, I feel like it's super inefficient since I'm basically doing the same thing twice. Is there an obvious better method that I'm missing?
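
For what it's worth, both blocks can be collected in a single pass by tracking whether the loop is currently inside a bracketed block. This is only a sketch, not the code described above: `extract_blocks`, its `start` parameter, and the sample entries are made-up names, and it assumes each block opens on a line ending in `[` and closes with `]` on a line by itself.

```python
def extract_blocks(lines, start=0):
    """Collect the lines between '[' and ']' for every block from index `start` on."""
    blocks = []
    current = None  # None means the loop is currently outside a block
    for i, line in enumerate(lines):
        if i < start:
            continue  # skip the header lines before the first block
        stripped = line.strip()
        if current is None:
            if stripped.endswith("["):
                current = []  # opening line such as "prj_files = ["
        elif stripped == "]":
            blocks.append(current)  # closing bracket on its own line
            current = None
        else:
            current.append(stripped.rstrip(","))  # drop the trailing comma
    return blocks


# Hypothetical sample standing in for the relevant part of one file.
sample = [
    "prj_files = [",
    "  a.v,",
    "  b.v,",
    "]",
    "",
    "prj_files_2 = [",
    "  c.v,",
    "]",
]
first, second = extract_blocks(sample)
```

Because the loop never restarts, it does not matter how long either block is or how many lines separate them.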

EDIT: Instead of parsing the file, I tried using import. I do think it's much simpler, but since I'm looping over a set of files, I hold the file name in a general variable. When I try to import using that variable, I get a "module doesn't exist" error. For example, my variable is named py_file, and import treats py_file as the module name instead of using the path stored in it. Is there a way to get around this?
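
One possible way around this is `importlib`, which can load a module from a path held in a variable. A minimal sketch, assuming each file is trusted, valid Python with a `.py` extension; `load_module_from_path`, the module name `prj_module`, and the throwaway demo file are hypothetical. Note that importing runs the file's top-level code.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(py_file, name="prj_module"):
    """Import the file at `py_file` and return the resulting module object."""
    spec = importlib.util.spec_from_file_location(name, py_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Demo with a throwaway file standing in for one of the project files.
with tempfile.TemporaryDirectory() as tmp_dir:
    py_file = os.path.join(tmp_dir, "proj_a.py")
    with open(py_file, "w") as handle:
        handle.write("prj_files = ['a.v', 'b.v']\nprj_files_2 = ['c.v']\n")
    module = load_module_from_path(py_file)

first, second = module.prj_files, module.prj_files_2
```

Each call returns its own module object, so the lists read from one file are not overwritten when the next file is loaded.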

The easiest way to parse these lines is probably to `exec` the file's contents, which is generally frowned upon, so, if I may ask, why are the files laid out like this? This is probably an [XY Problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). — TigerhawkT3, Jun 08 '15 at 20:05
The content of the first file is the first code block I had. So I'm actually extracting from another file of python code. I'm basically trying to extract the content of two lists of file paths. The regular expression is just re.match("]"). This should work because the ] is on its own line — Jpwang, Jun 08 '15 at 20:05
The files that I'm working with are all basically declaring an instance of a class where my function is located. So I think if I import the files, the lists that I'm trying to use would get overwritten each time. That being said, I'm relatively inexperienced, so I'm not entirely sure. — Jpwang, Jun 08 '15 at 20:13
Without more details, I can't understand exactly what you're trying to do, but the approach you're taking is basically the scenic route and you probably just need an `import` statement somewhere. — TigerhawkT3, Jun 08 '15 at 20:33
Yeah the more I think about it, the more I think import is the answer and I've just been being rather dumb about it. The details are rather complex, and I wanted to keep things concise, so sorry about the lack of info. — Jpwang, Jun 08 '15 at 20:38

score -1 · Accepted Answer · answered Jun 08 '15 at 20:08

I assume your file's content is:

prj_files = [
  line,
  line,
  etc
]

prj_files_2 = [
  line,
  etc
]

then you can do this:

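# Caution: exec runs whatever code the file contains; only use it on files you trust.
namespace = {}
with open(YOUR_FILE) as src:
    exec(src.read(), namespace)  # Python 3: exec is a function, not a statement

# The lists defined in the file are now entries in `namespace`.
with open(FIRST_FILE, "w") as f1, open(SECOND_FILE, "w") as f2:
    for line in namespace["prj_files"]:
        f1.write(line + "\n")  # one entry per line
    for line in namespace["prj_files_2"]:
        f2.write(line + "\n")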

farhawa


[**Do not ever use `eval` (or `exec`) on data that could possibly come from outside the program in any form. It is a critical security risk. You allow the author of the data to run arbitrary code on your computer.**](https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice) – Karl Knechtel Jul 07 '22 at 08:20
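
If the lists contain only literals, a safer route is to parse the file with `ast` and evaluate just the list values with `ast.literal_eval`, which never executes code. A rough sketch under that assumption: `read_list_assignments` and the sample source are hypothetical, and the entries are assumed to be quoted strings (anything that is not a literal makes `literal_eval` raise `ValueError`).

```python
import ast


def read_list_assignments(source, names=("prj_files", "prj_files_2")):
    """Return {name: value} for top-level `name = [...]` literals in `source`."""
    found = {}
    for node in ast.parse(source).body:  # parsing only; nothing is executed
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id in names:
                # literal_eval accepts only literals and never calls functions
                found[target.id] = ast.literal_eval(node.value)
    return found


# Hypothetical file content with quoted entries.
source = '''
prj_files = [
    "a.v",
    "b.v",
]

prj_files_2 = [
    "c.v",
]
'''
lists = read_list_assignments(source)
```

This needs no line numbers or bracket matching, since the parser locates the assignments wherever they appear in the file.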
