I have the following issue: a function reads a big file (a few MB) whose content looks like this:

0x05 0x00 0x00 0x00 0xCF 0x00 0x00 0x00 ; ..........
0xCF 0x00 0x00 0x00 0x22 0x00 0x00 0x00 ; ......"...
0x51 0x84 0x07 0x00 0x02 0x00 0x01 0x00 ; ..Q.......

My function has to read only the hex values and ignore the ";" and every character after it up to the end of the line.

The data I need from the first line is

"0x05 0x00 0x00 0x00 0xCF 0x00 0x00 0x00 "

I tried two methods. The first puts the code in a function that lives in a separate file:

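def ReadFileAsList(fileName):
    # Note: this assignment overrides the argument passed by the caller
    fileName = "2Output.txt"
    fileContentStr = ""
    with open(fileName,'r') as f:
        for line in f:
            # keep only the text before the ';' comment
            fileContentStr += line.split(';')[0]

    # split on whitespace to get the individual hex values
    fileContentList = fileContentStr.split()
    return fileContentList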

The second method has the same lines directly in my main .py file:

fileName = "2Output.txt"
fileContentStr = ""
with open(fileName,'r') as f:
    for line in f:
        fileContentStr += line.split(';')[0]

fileContentList = fileContentStr.split()

The second method is very fast; the first (with the function in a separate file) is very slow. What am I missing? Thanks for any hint.

Andrey Mazur
  • In absence of any other culprits, something tells me you *might* be missing out on the optimization CPython does when concatenating strings with `+=` in the function case. I don't remember the actual circumstances that cause that optimization to be disabled. Try creating a list of the strings and returning that to see if the difference is still there. – Dimitris Fasarakis Hilliard Nov 08 '17 at 15:02
  • @JimFasarakisHilliard It helped, thanks – Andrey Mazur Nov 08 '17 at 15:10
  • Under your `with open` line, you could build your list of trimmed lines using a list comprehension like `fileContentList = [line.split(';')[0] for line in f]` – Brad Campbell Nov 08 '17 at 15:16
  • How do you import the first script and call the method? – RockOnGom Nov 08 '17 at 15:17
  • I suggest reading this if you work with large texts: https://waymoot.org/home/python_string/ – RockOnGom Nov 08 '17 at 15:19
  • @BradCampbell my function "ReadFileAsList" is in the Functions.py file. In my main.py file I tried two options: `import Functions` `ReadFileAsList("filename.txt")`, and `from Functions import ReadFileAsList` `ReadFileAsList("fileName.txt")`. I know I made a mistake in my post in that I don't use the filename passed to the function, but I don't think it plays any role – Andrey Mazur Nov 09 '17 at 07:03
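
The list-based approach suggested in the comments can be sketched roughly as follows. This is only a minimal illustration, not the code the asker ended up with: the name `read_hex_tokens` is hypothetical, and it assumes every line has the "hex values ; ASCII dump" layout shown in the question. Tokens are collected in a list instead of growing one string with `+=`, so the result does not depend on any string-concatenation optimization.

```python
def read_hex_tokens(path):
    """Return the hex tokens of a dump file, ignoring ';' comments.

    Hypothetical helper: assumes each line looks like
    '0x05 0x00 ... ; ascii dump'.
    """
    tokens = []
    with open(path, "r") as f:
        for line in f:
            # Keep only the part before the first ';', then split on whitespace
            hex_part = line.split(";", 1)[0]
            tokens.extend(hex_part.split())
    return tokens
```

Called as `read_hex_tokens("2Output.txt")`, it would return a list of strings such as `"0x05"`; unlike the function in the question, it uses the path it is given.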

1 Answer

Reading and writing local variables is faster than doing the same with global variables. In CPython, local variables are stored in a fixed-size array, whereas global variables are stored in a true dictionary.
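
A small sketch of that distinction, assuming CPython. The function names `bump_global`, `bump_local`, and `time_both` are made up for illustration. The code object of a function lists its local names in `co_varnames` and the global names it looks up in `co_names`, and `timeit` can be used to compare the two loops. Actual timings depend on the machine and Python version, so none are shown here.

```python
import timeit

counter = 0  # module-level (global) name


def bump_global(n=1000):
    """Increment a global variable n times (looked up by name each time)."""
    global counter
    for _ in range(n):
        counter += 1


def bump_local(n=1000):
    """Increment a local variable n times (stored in the function's local slots)."""
    count = 0
    for _ in range(n):
        count += 1
    return count


def time_both(repeat=200):
    """Return (global_seconds, local_seconds) for `repeat` calls of each function."""
    return (
        timeit.timeit(bump_global, number=repeat),
        timeit.timeit(bump_local, number=repeat),
    )


# Local names live in co_varnames; global names the function uses live in co_names
print(bump_local.__code__.co_varnames)
print(bump_global.__code__.co_names)
```

Calling `time_both()` returns the two measurements so they can be compared on your own setup.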

Here is a link to a more in-depth answer

badger0053