I had some data in an Excel file. I changed the file to a .csv file and tried to write some Python code to read it.

But I am getting unexpected output. My code looks like this:

import os

INPUT_DIR = os.path.join(os.getcwd(),"Input")
OUTPUT_DIR = os.path.join(os.getcwd(),"Output")
print INPUT_DIR, OUTPUT_DIR 

def read_csv():    
    files = os.listdir(INPUT_DIR)
    for file in files:
        file_full_name = os.path.join(INPUT_DIR,file)
        print file_full_name
        f = open(file_full_name,'r')
        for line in f.readlines():
            print "Line: ", line

def create_sql_file():
    print "Hi"


if __name__ == '__main__':
    read_csv()
    create_sql_file()

This gives very peculiar output:

 C:\calcWorkspace\13.1.1.0\PythonTest\src\Input C:\calcWorkspace\13.1.1.0\PythonTest\src\Output
C:\calcWorkspace\13.1.1.0\PythonTest\src\Input\Country Risk System Priority Data_01232013 - Copy.csv
Line:  PK**

Does anyone know what causes this?

Vivek

1 Answer

First, make sure you actually converted the file from Excel to CSV using Excel's Save As menu. Simply changing the extension doesn't work. The output you are seeing is data from Excel's native format: an .xlsx file is a ZIP archive, and "PK" is the signature at the start of every ZIP file.
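One quick way to confirm this diagnosis is to look at the first bytes of the file before trying to parse it. The following is a minimal Python 3 sketch; the helper name `looks_like_zip_archive` is made up for illustration, and it only tells you the file is a ZIP container (which .xlsx files are), not that it is definitely an Excel workbook.

```python
# First four bytes of a non-empty ZIP archive (and therefore of an .xlsx file).
ZIP_SIGNATURE = b"PK\x03\x04"


def looks_like_zip_archive(path):
    """Return True if the file at `path` starts with the ZIP signature.

    Hypothetical helper: a True result for a file named *.csv suggests
    the file was renamed rather than exported as CSV.
    """
    with open(path, "rb") as f:  # binary mode, so the bytes are read unchanged
        return f.read(4) == ZIP_SIGNATURE
```

If this returns True for a file in the Input directory, reopen the workbook in Excel and export it with Save As instead of renaming it.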

Once you have converted the files, use the csv module:

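import csv
import os

for filename in os.listdir(INPUT_DIR):
    # Python 2's csv module expects the file to be opened in binary mode
    with open(os.path.join(INPUT_DIR, filename), 'rb') as infile:
        # dialect is an argument of csv.reader, not of open();
        # use 'excel-tab' instead if the file was saved as tab-delimited text
        reader = csv.reader(infile, dialect='excel')
        for row in reader:
            print row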
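The snippet above uses Python 2 syntax (`print row`). For readers on Python 3, a roughly equivalent sketch is shown here. The function name `read_csv_rows` and the `.csv` extension filter are assumptions added for illustration, and the file's encoding is left at the platform default, which may need adjusting for files exported from Excel. Note that `dialect` is a parameter of `csv.reader`, not of `open()`.

```python
import csv
import os


def read_csv_rows(input_dir):
    """Yield (filename, row) for every row of every .csv file in input_dir."""
    for filename in sorted(os.listdir(input_dir)):
        if not filename.lower().endswith(".csv"):
            continue  # skip anything that is not named like a CSV file
        path = os.path.join(input_dir, filename)
        # newline="" lets the csv module handle line endings itself
        with open(path, newline="") as infile:
            # 'excel' (comma-delimited) is the default dialect; spelled out for clarity
            for row in csv.reader(infile, dialect="excel"):
                yield filename, row
```

Each row comes back as a list of strings, so a loop such as `for name, row in read_csv_rows(INPUT_DIR): print(name, row)` reproduces the output of the original loop.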

If you want to read raw Excel files, use the xlrd module. Here is a sample that shows how to read Excel files.

Burhan Khalid