I have a zipfile on my Google Drive. In that zipfile is an XML file, which I want to parse, extract a specific piece of information from, and then save that information on my local computer (or wherever).
My goal is to use Python and the Google Drive API (with the help of PyDrive) to achieve this. The workflow could be as follows:
1. Connect to my Google Drive via the Google Drive API (PyDrive)
2. Get my zipfile id
3. Load my zipfile into memory
4. Unzip it and obtain the XML file
5. Parse the XML and extract the desired information
6. Save it as a CSV on my local computer
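For reference, the unzip, parse, and save steps can be sketched with the standard library alone, assuming the archive is already available as a `bytes` object. The function names, the member name `xml_name`, and the element name `tag` are hypothetical placeholders, not anything taken from my real data.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET


def xml_values_from_zip(zip_bytes, xml_name, tag):
    """Return the text of every <tag> element of xml_name inside the archive."""
    # BytesIO wraps the raw bytes in a file-like object, so nothing is written to disk.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(xml_name) as xml_file:
            root = ET.parse(xml_file).getroot()
    return [element.text for element in root.iter(tag)]


def write_csv(values, stream, header):
    """Write one value per row to an open text stream, preceded by a header row."""
    writer = csv.writer(stream)
    writer.writerow([header])
    writer.writerows([value] for value in values)
```

To save the result locally, the stream would be a file opened with `open('result.csv', 'w', newline='')`.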
Right now, I am able to do steps 1, 2, 4, 5 and 6. But I don't know how to load the zipfile into memory without writing it to my local HDD first.
The following PyDrive code downloads the zipfile and places it on my local HDD, which is not exactly what I want:
toUnzip = drive.CreateFile({'id':'MY_FILE_ID'})
toUnzip.GetContentFile('zipstuff.zip')
I guess one solution could be as follows:
I could read the zipfile as a string with some encoding:
toUnzip = drive.CreateFile({'id':'MY_FILE_ID'})
zipAsString = toUnzip.GetContentString(encoding='??')
and then I could somehow (no idea how, perhaps StringIO could be useful) read this string with the Python zipfile library. Is this solution even possible? Is there a better way?
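A rough sketch of the direction I have in mind is below. A zip archive is binary data, so decoding it to a text string would probably corrupt it for most encodings; `io.BytesIO` is the bytes counterpart of `StringIO`, and `zipfile.ZipFile` accepts any seekable file-like object. The PyDrive part is an assumption I have not verified against my installed version: that `FetchContent()` downloads the file and leaves its raw bytes in the `content` attribute as a `BytesIO` object. The helper name `open_zip_in_memory` is made up.

```python
import io
import zipfile


def open_zip_in_memory(drive_file):
    """Open a PyDrive file as a ZipFile without writing it to disk.

    Assumption: drive_file.FetchContent() downloads the file and stores the
    raw bytes in drive_file.content as an io.BytesIO object.
    """
    drive_file.FetchContent()
    # Copy the bytes into a fresh buffer so the read position starts at zero.
    buffer = io.BytesIO(drive_file.content.getvalue())
    return zipfile.ZipFile(buffer)
```

If the assumption holds, usage would be `archive = open_zip_in_memory(drive.CreateFile({'id': 'MY_FILE_ID'}))`, followed by `archive.namelist()` to find the XML member.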