I'm trying to download and decompress a gzip file, then convert the decompressed file, which is in TSV format, into CSV, which would be easier to parse. I am trying to gather the data from the "Download Table" link in this URL. My code is below; it uses the same idea as in this post. However, I get the error IOError: [Errno 2] No such file or directory: 'file=data/irt_euryld_d.tsv' on the line with open(outFilePath, 'w') as outfile:

import os
import urllib2 
import gzip
import StringIO

baseURL = "http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?"
filename = "D:\Sidney\irt_euryld_d.tsv.gz" #Edited after heinst's comment below
outFilePath = filename[:-3]

response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())

compressedFile.seek(0)

decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb') 

with open(outFilePath, 'w') as outfile:
    outfile.write(decompressedFile.read())

#Now have to deal with tsv file
import csv

with open(outFilePath,'rb') as tsvin, open('ECB.csv', 'wb') as csvout:
    tsvin = csv.reader(tsvin, delimiter='\t')
    csvout = csv.writer(csvout) #Converting output into CSV Format

Thank You

user131983
  • Why do you have `file=` in front of your path? Is that part of the directory name? – heinst Jun 16 '15 at 15:01
  • @heinst Sorry. I will rectify that. However, the error still remains. – user131983 Jun 16 '15 at 15:02
  • Can you please update the question with the updated code so I can look? – heinst Jun 16 '15 at 15:03
  • @heinst Just updated it. – user131983 Jun 16 '15 at 15:04
  • I know the problem: you need the full path to the data folder, so `/Users/path/to/data/irt_euryld_d.tsv.gz` – heinst Jun 16 '15 at 15:05
  • @heinst Thank You. So I changed the line to `filename = "/Users/path/to/data/irt_euryld_d.tsv.gz"`, however, I still get the same error. Is this what you meant? – user131983 Jun 16 '15 at 15:07
  • Well, it has to be a valid path. Wherever the file `irt_euryld_d.tsv.gz` is, you have to provide the full path to it. For example, if I have `irt_euryld_d.tsv.gz` on my desktop, my path to it would be `/Users/heinst/Desktop/irt_euryld_d.tsv.gz`, and I would set that path as the `filename` value – heinst Jun 16 '15 at 15:08
  • @heinst Thanks. I think that resolved the issue. However, I have another one now. Can I edit the question to reflect this? – user131983 Jun 16 '15 at 15:13
  • You'll have to open another question for the new issue. Sorry – heinst Jun 16 '15 at 15:14

1 Answer

The path you assigned to filename was not a valid location to write a file to. Change filename = "data/irt_euryld_d.tsv.gz" to a valid path, meaning one whose folders already exist, pointing to wherever you want the irt_euryld_d.tsv.gz file to live. For example, if I wanted the irt_euryld_d.tsv.gz file on my desktop, I would set filename = "/Users/heinst/Desktop/data/irt_euryld_d.tsv.gz". Because that is a valid path, Python no longer raises the No such file or directory error.
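
For illustration, here is a minimal sketch of the whole download, decompress and convert pipeline. It assumes Python 3 (urllib.request instead of the question's urllib2) and assumes the data is UTF-8 text. The helper names download and gzip_tsv_to_csv are hypothetical, as is the choice to keep the URL query string separate from the local output path. The sketch also creates the output folder if it is missing, since open() does not do that on its own.

```python
import csv
import gzip
import io
import os
import urllib.request

# URL pieces as given in the question; the query string is NOT a local path.
BASE_URL = "http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?"
QUERY = "file=data/irt_euryld_d.tsv.gz"


def download(url):
    """Return the raw (still gzip-compressed) bytes served at url."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def gzip_tsv_to_csv(compressed_bytes, csv_path):
    """Decompress gzip-compressed TSV bytes and write them to csv_path as CSV."""
    out_dir = os.path.dirname(os.path.abspath(csv_path))
    # open() raises "No such file or directory" if a parent folder is missing,
    # so create the folder first.
    os.makedirs(out_dir, exist_ok=True)
    # Assumption: the file is UTF-8 encoded text.
    text = gzip.decompress(compressed_bytes).decode("utf-8")
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    with open(csv_path, "w", newline="", encoding="utf-8") as csv_out:
        # Copy every tab-separated row into the comma-separated output.
        csv.writer(csv_out).writerows(reader)
    return csv_path
```

A call would look something like gzip_tsv_to_csv(download(BASE_URL + QUERY), target), where target is a full path to a location you are allowed to write to, in the same spirit as the desktop path above.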

heinst