
I'm having an issue where I'm trying to take data from two cells in an Excel spreadsheet and put them into a CSV file. The data is lat and lon coordinates, so they have to be side by side to be read by the program. Here is what I have:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import xlwt
import xlrd
import csv
import os, openpyxl, glob
from openpyxl import Workbook


with open ('test.csv', 'wb') as csvfile:
  spamwriter = csv.writer(csvfile, delimiter=',')

  for file in glob.glob ("/test"):
    wb = openpyxl.load_workbook('test-data.xlsx')
    ws = wb.active
    def lat():
      for row in ws.iter_rows('Q2:Q65536'):
        for cell in row:
          lat = cell.value
          return lat
    def lon():
      for row in ws.iter_rows('R2:R65536'):
        for cell in row:
          lon = cell.value
          return lon  

  cord=lat()+","+lon()
  print (lat()+","+lon()) #just to see if its working
  #spamwriter.writerow([cord]) uncomment to write to file

However, it only gives me the first row of data, not the rest of the rows (test-data has around 1,500 rows). How would I make it finish going through the file?

stcoyle
  • Not sure what this have to do with the tag Excel as it's a CSV you are reading. I have no knowledge of python, but shouldn't the FOR have a NEXT or something that encloses the loop? – Andreas May 16 '17 at 16:20

1 Answer


This may not be the most dynamic way, but I would use pandas for this task. It has a built-in pd.read_excel() function, and DataFrames have a .to_csv() method.

import pandas as pd
import string

latColumn = string.ascii_lowercase.index('q')   # map the Excel column letter to a zero-based index (use lower case)
longColumn = string.ascii_lowercase.index('r')  # does not work for two-letter columns like AA, BB, ...

# note: usecols was named parse_cols in pandas < 0.21
data = pd.read_excel('test-data.xlsx', 'Sheet1', usecols=[latColumn, longColumn])

csvOut = "foo.csv"
# cap the output at the first 65534 rows (the question reads the range Q2:Q65536)
data[:65534].to_csv(csvOut, index=False, header=False)

If you need to append to the file rather than replace it, change the data[:65534].to_csv(....) line to

with open(csvOut, 'a') as f:  # append to the .csv file of your liking
    data[:65534].to_csv(f, index=False, header=False)
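If you'd rather stay with openpyxl, the loop from the question can also be fixed directly: the return inside each helper function exits after the very first cell, which is why you only ever see one row. A minimal sketch of a corrected loop follows; it builds a tiny stand-in workbook so the example is self-contained, but with your real data you would just load test-data.xlsx and skip that part.

```python
import csv
import openpyxl

# Build a small stand-in workbook so the example runs on its own;
# with real data you would only do the load_workbook() step below.
wb = openpyxl.Workbook()
ws = wb.active
ws['Q1'], ws['R1'] = 'Lat', 'Long'
ws['Q2'], ws['R2'] = 40.7128, -74.006
ws['Q3'], ws['R3'] = 34.0522, -118.2437
wb.save('test-data.xlsx')

wb = openpyxl.load_workbook('test-data.xlsx')
ws = wb.active

with open('test.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    # Columns Q and R are the 17th and 18th columns; min_row=2 skips the header.
    for row in ws.iter_rows(min_row=2, min_col=17, max_col=18):
        lat, lon = row[0].value, row[1].value
        if lat is None or lon is None:
            continue  # skip blank cells instead of writing "None"
        spamwriter.writerow([lat, lon])
```

Because iter_rows stops at the sheet's last used row, there is no need to hard-code a 65536-row range, and the None check handles empty trailing cells.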
nanoPhD
  • you are a life saver!!! thanks so much! is there a way to catch bad data? for some reason some cells have "None" in them and i need to either replace it with 0 or remove it. – stcoyle May 16 '17 at 17:30
  • @stcoyle, If you're looking to replace all the "None" values with 0 then you can insert `data = data.replace(to_replace="None", value=0)`. If you want to remove the rows with that value then you can use `data = data[data.Lat != "None"]` and `data = data[data.Long != "None"]` – nanoPhD May 16 '17 at 17:46
  • In my test.xlsx dataset I have my latitude and longitude as "Lat" and "Long". You will have to set them to whatever yours are named. Check out [link](http://stackoverflow.com/questions/18172851/deleting-dataframe-row-in-pandas-based-on-column-value) – nanoPhD May 16 '17 at 17:48
  • you are a godsend! wish i asked this question last week haha thanks sooooo much! – stcoyle May 16 '17 at 17:54
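The None-handling advice from the comments can be sketched with a small self-contained frame. The column names "Lat" and "Long" here follow the answerer's own test file; substitute whatever your spreadsheet's headers actually are.

```python
import pandas as pd

# Hypothetical frame standing in for the spreadsheet data, including
# two cells that came through as the literal string "None".
data = pd.DataFrame({'Lat':  [40.7128, 'None', 34.0522],
                     'Long': [-74.006, 'None', -118.2437]})

# Option 1: replace every "None" value with 0.
filled = data.replace(to_replace='None', value=0)

# Option 2: drop any row whose Lat or Long is "None".
cleaned = data[(data.Lat != 'None') & (data.Long != 'None')]
```

Either result can then be written out with .to_csv() exactly as in the answer above.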