I am trying to add a new column to a CSV file in Python 3. The CSV file has a header row, and I don't need the first two columns at this point. The other 8 columns contain the coordinates of the 4 vertices of a polygon. I am trying to add a new column that holds the area calculated from those points. I have seen several similar questions on Stack Overflow and have tried to use the information there in my code. However, at the moment only the last line of the CSV is displayed, and I don't think the area is calculated correctly either. Any suggestions? (FYI, this is my first code working with a CSV.) Here is my code:

with open(poly.csv, 'rU')as input:
    with open ('polyout.csv', 'w') as output:
        writer = csv.writer(output, lineterminator='\n')
        reader=csv.reader(input)

        coords=[]
        row =next(reader)
        row =next(reader,None)
        coords=row[2:]

        prev_de=coords[-2]
        prev_dn=coords[-1]
        prev_de=float(prev_de)
        prev_dn=float(prev_dn)
        areasq=float(0)

        for de,dn in zip(coords[:-1:2], coords[1::2]):
            areasq+= (float(de)*float(prev_dn))-(float(dn)*float(prev_de))
            prev_de, prev_dn = de,dn
            area =abs(areasq)/2

        for row in reader:
            row.append(area)  
            coords.append(row)

        writer.writerows(coords)

        print(row)
Nikki
  • Can you try tabbing the second for? – MattCom May 31 '17 at 14:38
  • That calculates a different area in the column, but still not the answer I was expecting... Also, any idea why it is only printing the last line of the csv? – Nikki May 31 '17 at 14:46
  • it's printing the last line of the csv because the `print` is not inside the `second` for loop, thus it is called only after the second for loop has gone through the entire csv file and set `row` to the last line in the file – Matti Lyra May 31 '17 at 15:02
  • Thank you! That fixed the printing issue. Now it prints all the lines (although not the headers, but I will tidy that bit up later) – Nikki May 31 '17 at 15:12
  • Yes, the headers are missing because of the `next` calls, as those also advance the reader past a line – Matti Lyra May 31 '17 at 15:39
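
The comments above point at two separate problems: the area is computed once, before the loop, so every row receives the same value, and the `next` calls consume the header without writing it out. The following is a minimal sketch of one way to restructure the code using only the `csv` module, not the asker's final solution. The function names `shoelace_area`, `add_area_column` and `convert` are hypothetical, and it assumes that every data row holds two unused columns followed by x, y pairs.

```python
import csv


def shoelace_area(flat_coords):
    """Area of a polygon given as [x1, y1, x2, y2, ...], in either orientation."""
    values = [float(v) for v in flat_coords]
    points = list(zip(values[0::2], values[1::2]))
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def add_area_column(rows, skip=2):
    """Yield the header with an extra 'area' field, then each row with its own area."""
    rows = iter(rows)
    header = next(rows)  # read the header once and pass it through
    yield header + ['area']
    for row in rows:
        # The area is computed inside the loop, so it is specific to this row.
        yield row + [shoelace_area(row[skip:])]


def convert(in_path, out_path):
    """Read in_path, append the area column and write the result to out_path."""
    with open(in_path, newline='') as src, open(out_path, 'w', newline='') as dst:
        writer = csv.writer(dst, lineterminator='\n')
        writer.writerows(add_area_column(csv.reader(src)))


# Example call, assuming the file names from the question:
# convert('poly.csv', 'polyout.csv')
```

Keeping the row processing in a generator separate from the file handling makes it possible to check the area values on a few hand-written rows before touching any files.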

2 Answers

I would recommend you use pandas for this.

import pandas as pd
df = pd.read_csv('./poly.csv')
df['area'] = calculate_area(df)  # implement calculate_area
df.to_csv('polyout.csv', index=False)  # index=False leaves out the row index column

You're probably better off actually just using plain numpy; see the answers to the question "Calculate area of polygon given (x,y) coordinates".

Matti Lyra
  • Unfortunately I haven't learnt about pandas yet so was trying to only use math and csv – Nikki May 31 '17 at 14:44
  • That's exactly why I would encourage you to learn `pandas`, as it will make these kinds of operations much easier. Not exactly sure what your area calculation is doing; if you explain the `csv` structure I can probably help with the `calculate_area` implementation as well. – Matti Lyra May 31 '17 at 14:46
  • and the area is calculated by doing `E1_2 * N1_1` - `N2_2 - E1_1`? The `_1` and `_2` are row numbers in the csv – Matti Lyra May 31 '17 at 15:05
  • Um, I'm not sure exactly what you mean, but the equation I was using is the shoelace formula. – Nikki May 31 '17 at 15:33

My data: the 1st quadrilateral is given clockwise, the 2nd anticlockwise.

$ cat a.csv
a,b,x1,y1,x2,y2,x3,y3,x4,y4
a,b,3,3,3,9,4,9,4,3
e,f,0,0,5,0,5,5,0,5
$ 

Imports. I also import stdout so that I can show my results on screen.

from csv import reader, writer
from sys import stdout

use the csv classes

data = reader(open('a.csv'))
out = writer(stdout)

process the headers (assuming one row of headers)

headers = next(data)
headers = headers+['A']
out.writerow(headers)

loop on data, process data, output processed data

for row in data:
    # the list comprehension is unpacked in aptly named variables
    x1, y1, x2, y2, x3, y3, x4, y4 = [int(v) for v in row[2:]]
    # https://en.wikipedia.org/wiki/Shoelace_formula#Examples
    a = (x1*y2+x2*y3+x3*y4+x4*y1-y1*x2-y2*x3-y3*x4-y4*x1)/2
    row.append(a)
    out.writerow(row)

I have saved the above in a file named area.py and finally we have

$ python3 area.py
a,b,x1,y1,x2,y2,x3,y3,x4,y4,A
a,b,3,3,3,9,4,9,4,3,-6.0
e,f,0,0,5,0,5,5,0,5,25.0
$ 

To use the shoelace formula as written, remember that the points must be ordered anticlockwise (the clockwise quadrilateral above gives a negative area). If your data is ordered clockwise, just write a = -(...
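
The sign of the result in the output above tracks the orientation of the vertices: the clockwise quadrilateral gives -6.0 and the anticlockwise one gives 25.0. As a small sketch of that idea (the names `signed_area` and `orientation` are hypothetical helpers, not part of the answer's script, and the usual convention of a y axis pointing up is assumed), the signed value can be kept and used to report the orientation:

```python
def signed_area(points):
    """Signed shoelace area of a polygon given as [(x, y), ...].

    Positive when the vertices run anticlockwise (y axis pointing up),
    negative when they run clockwise.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to the first vertex
        total += x1 * y2 - x2 * y1
    return total / 2


def orientation(points):
    """Describe the vertex order based on the sign of the area."""
    area = signed_area(points)
    if area > 0:
        return "anticlockwise"
    if area < 0:
        return "clockwise"
    return "degenerate"


print(signed_area([(3, 3), (3, 9), (4, 9), (4, 3)]))  # -6.0
print(signed_area([(0, 0), (5, 0), (5, 5), (0, 5)]))  # 25.0
```

If the orientation of the input is unknown or mixed, taking `abs(signed_area(points))` gives the area regardless of vertex order.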

gboffi