I have a text file to parse line by line using Python. I read the file as follows:

from bs4 import BeautifulSoup as bs

with open(filename) as f:
    soup = bs(f.read(), "html.parser")

Then I split all the lines of the text file into a list:

allLine = soup.text.split("\r\n")

Now I iterate over the list one line at a time, as below:

Method 1:

for line in allLine:
    # my task

I can also do the same iteration without first storing the data in a list, as below:

Method 2:

for line in soup.text.split("\r\n"):
    # my task

My question is:

Method 1 allocates extra space for the list 'allLine', while Method 2 does not appear to need that extra space. But will Method 2 perform the split for each of the 'n' lines?

Which method is more efficient?

num3ri
Smith Dwayne
  • There's no difference - Python is still creating a list internally to store the results of the `split()` - as far as I know, there is no method in the standard library to allow you to split a string without creating an intermediate list. You'd have to write your own function. – Chinmay Kanchi Feb 02 '19 at 08:46
  • Why would you need/use BeautifulSoup in order to iterate over lines of txt file? Check https://stackoverflow.com/a/6475407/4046632 – buran Feb 02 '19 at 08:52
  • @buran: Sorry to mention it. It is a general routine. Sometimes I may send an HTML file to the function. – Smith Dwayne Feb 02 '19 at 10:51
  • Those two formulations are entirely equivalent. You *can* use `StringIO` to do the line splitting and save the memory for the copies of each line, but then you *have* to keep the whole file-string around during processing, and I don’t know if it’s any faster. – Davis Herring Feb 02 '19 at 19:53
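
To illustrate the two suggestions in the comments, here is a minimal sketch (not a benchmark, and not something from the original post). `iter_lines` is a hypothetical hand-written helper that yields one piece at a time instead of building the whole list, and the second variant iterates over an `io.StringIO` object. The sample `text` stands in for `soup.text` and is an assumption for the example.

```python
import io


def iter_lines(text, sep="\r\n"):
    """Yield the pieces of `text` separated by `sep`, one at a time.

    Hypothetical helper: it produces the same pieces as text.split(sep),
    but never holds all of them in a list at once.
    """
    start = 0
    while True:
        end = text.find(sep, start)
        if end == -1:
            # No more separators: the remainder is the last piece.
            yield text[start:]
            return
        yield text[start:end]
        start = end + len(sep)


# Stand-in for soup.text (assumed sample data).
text = "first\r\nsecond\r\nthird"

# Variant 1: lazy splitting with the hand-written generator.
lazy_lines = list(iter_lines(text))

# Variant 2: io.StringIO yields one line per iteration. Each line keeps its
# terminator, so it is stripped here. Note that rstrip("\r\n") removes any
# trailing "\r" and "\n" characters, not only one exact "\r\n" pair.
stringio_lines = [line.rstrip("\r\n") for line in io.StringIO(text)]
```

In both variants the full string still has to stay in memory while iterating, as the last comment points out. Only the intermediate list of lines is avoided, so whether this is worthwhile depends on the file size and would need to be measured.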
