Read csv file with 3 columns as a grid with first 2 columns as coordinates and third column as value

Question

Hi there, I'm still new to Python and learning as I go. I am trying to read a CSV file with 3 columns: the first 2 columns are coordinates and the third column is the value. Below is an example of the CSV file content.

322000.235 582999.865 149.309 
322000.235 582999.615 149.29 
322000.235 582999.365 149.276 
322000.235 582999.115 149.26 
322000.235 582998.865 149.246 
322000.235 582998.615 149.245 
322000.235 582998.365 149.235 
322000.235 582998.115 149.228 
322000.235 582997.865 149.223 
322000.235 582997.615 149.226 
322000.485 582999.865 149.249 
322000.485 582999.615 149.217 
322000.485 582999.365 149.224 
322000.485 582999.115 149.243 
322000.485 582998.865 149.249 
322000.485 582998.615 149.256 
322000.485 582998.365 161.259 
322000.485 582998.115 149.257 
322000.485 582997.865 149.26 
322000.485 582997.615 149.274 
322000.735 582999.865 149.193 
322000.735 582999.615 149.159 
322000.735 582999.365 149.179 
322000.735 582999.115 149.215 
322000.735 582998.865 149.242 
322000.735 582998.615 149.261 
322000.735 582998.365 160.274 
322000.735 582998.115 149.29 
322000.735 582997.865 149.321 
322000.735 582997.615 149.342 
322000.985 582999.865 149.156 
322000.985 582999.615 149.128 
322000.985 582999.365 149.16 
322000.985 582999.115 149.205 
322000.985 582998.865 149.239 
322000.985 582998.615 149.265 
322000.985 582998.365 149.289 
322000.985 582998.115 149.324 
322000.985 582997.865 149.373 
322000.985 582997.615 149.401

I need to read it as follows:

(322000.235 582999.865 149.309 )    (322000.485 582999.865 149.249 )    (322000.735 582999.865 149.193 )    (322000.985 582999.865 149.156 )
(322000.235 582999.615 149.29  )    (322000.485 582999.615 149.217 )    (322000.735 582999.615 149.159 )    (322000.985 582999.615 149.128 )
(322000.235 582999.365 149.276 )    (322000.485 582999.365 149.224 )    (322000.735 582999.365 149.179 )    (322000.985 582999.365 149.16  )
(322000.235 582999.115 149.26  )    (322000.485 582999.115 149.243 )    (322000.735 582999.115 149.215 )    (322000.985 582999.115 149.205 )
(322000.235 582998.865 149.246 )    (322000.485 582998.865 149.249 )    (322000.735 582998.865 149.242 )    (322000.985 582998.865 149.239 )
(322000.235 582998.615 149.245 )    (322000.485 582998.615 149.256 )    (322000.735 582998.615 149.261 )    (322000.985 582998.615 149.265 )
(322000.235 582998.365 149.235 )    (322000.485 582998.365 161.259 )    (322000.735 582998.365 160.274 )    (322000.985 582998.365 149.289 )
(322000.235 582998.115 149.228 )    (322000.485 582998.115 149.257 )    (322000.735 582998.115 149.29  )    (322000.985 582998.115 149.324 )
(322000.235 582997.865 149.223 )    (322000.485 582997.865 149.26  )    (322000.735 582997.865 149.321 )    (322000.985 582997.865 149.373 )
(322000.235 582997.615 149.226 )    (322000.485 582997.615 149.274 )    (322000.735 582997.615 149.342 )    (322000.985 582997.615 149.401 )

I put something together that goes through column 3 and compares each value with the values directly before and after it using .shift(-1) and .shift(1). That does the job, but I get a lot of unnecessary data. What I actually want is to check each value not just against the values next to it in the file, but as a grid against its adjacent values, which will be 4 checks in most cases. In the linked example (example_value check), red is the value to check and blue marks all the adjacent values.

Here is my script so far. I hope this is enough information and is understandable. I don't know whether I can alter the script or should start over, and how. I hope someone can help.

from __future__ import print_function

import pandas as pd
import os
import re


Dir = os.getcwd()
Blks = []
CSV = []



# Collect the .txt and .csv files in the working directory.
Blks = [each for each in os.listdir(Dir) if each.endswith('.txt')]
print (Blks)

CSV = [each for each in os.listdir(Dir) if each.endswith('.csv')]
print (CSV)



limit = 3
tries = 0

while True:
        print ("----------------------------------------------------")
        spikewell = float(raw_input("Please Enter Parameters: "))
        tries += 1
        if tries == 4:
            print ("----------------------------------------------------")
            print ("Entered incorrectly too many times.....Exiting")
            print ("----------------------------------------------------")
            break
        else:
            if spikewell > 50:
               print ("parameters past limit (20)")
               print ("----------------------------------------------------")
               print (tries)
               continue
            elif spikewell < 0:
               print ("Parameters can't be negative")
               print ("----------------------------------------------------")
               print (tries)
               continue
            else:
               spikewell
               print ("Parameters are set")
               print (spikewell)
               print ("Searching files")
               print ("----------------------------------------------------")

        for z in Blks:
            df = pd.read_csv(z, sep=r'\s+', names=['X','Y','Z'])
            z = sum(df['Z'])
            average = z / len(df['Z'])




        for terrain in Blks:
                for df in terrain:
                    df = pd.read_csv(terrain, sep=r'\s+', names=['X','Y','Z'])


                    spike_zleft = df['Z'] - df['Z'].shift(1)
                    spike_zright = df['Z'] - df['Z'].shift(-1)
                    wzdown = -(df['Z'] - df['Z'].shift(-1))
                    wzup_abs = abs(df['Z'] - df['Z'].shift(1))
                    wzdown_abs = abs(wzdown)
                    spikecsv = ('spikes.csv')
                    wellcsv = ('wells.csv')




                    spikes_search = df.loc[(spike_zleft  > spikewell) & (spike_zright > spikewell)]
                    with open(spikecsv, 'a') as f:
                        spikes_search[['X','Y','Z']].to_csv(f, sep='\t', index=False)


                    well_search = df.loc[(wzup_abs > spikewell) & (wzdown > spikewell)]
                    with open(wellcsv, 'a') as f:
                        well_search[['X','Y','Z']].to_csv(f, sep='\t', index=False)

                    print ("----------------------------------------------------")
                    print ('Search completed')
                    if len(spikes_search) == 0:
                        print ("0 SPIKE\S FOUND")
                    else:
                        print (terrain)
                        print (str(len(spikes_search)) + " SPIKE\S FOUND")


                    if len(well_search) == 0:
                        print ("0 WELL\S FOUND")
                    else:
                        print (str(len(well_search)) + " WELL\S FOUND")


                    break

        break

The question is not clear, to me at least. What exactly are you trying to do here? — Anvesh, Apr 13 '16 at 09:07
You may want to look into pivoting: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html — tfv, Apr 13 '16 at 09:10
@Anvesh I'm trying to search through 3 columns, with the third column as my values. The first 2 columns are just "coordinates" or a location, like x and y on a grid, and the 3rd column is the value. I'm looking to pick up large changes in the values by comparing them to their adjacent values. Hope this makes more sense. — Edwin Page, Apr 13 '16 at 09:34

score 0 · Answer 1 · edited May 23 '17 at 12:24

There are two issues here, importing the data and using it. It would have been better to describe what you were doing with the data, rather than providing your whole script. Please see https://stackoverflow.com/help/mcve

Regarding CSV input, use the csv module!

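import csv

with open('FILENAME','r') as f:
    data = []
    # The sample is space-separated, so override the default comma delimiter.
    readr = csv.reader(f, delimiter=' ', skipinitialspace=True)
    for line in readr:
        # Drop the empty field left by a trailing space before converting.
        data.append([float(i) for i in line if i])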

But your pandas code (pd.read_csv) already does this.
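As a rough sketch of how the pandas route could produce the grid layout shown in the question, the X/Y/Z rows can be pivoted so that Y becomes the row index and X the columns. The four inline rows below are taken from the question's sample; reading from io.StringIO instead of a real file is only an assumption made to keep the example self-contained.

```python
import io
import pandas as pd

# A few rows of the sample from the question (whitespace-separated X Y Z).
sample = """\
322000.235 582999.865 149.309
322000.235 582999.615 149.29
322000.485 582999.865 149.249
322000.485 582999.615 149.217
"""

df = pd.read_csv(io.StringIO(sample), sep=r"\s+", names=["X", "Y", "Z"])

# Rows are indexed by Y, columns by X, and each cell holds the Z value.
grid = df.pivot(index="Y", columns="X", values="Z")

# Put the largest Y first so the layout matches the grid in the question.
grid = grid.sort_index(ascending=False)
print(grid)
```

Calling grid.to_numpy() (or grid.values on older pandas) then gives a plain 2D array of Z values for neighbour comparisons.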

If you are doing numeric work with arrays (which it looks like you are) then you should look into numpy http://www.numpy.org/. This module, or rather collection of modules, probably already has a function for what you are trying to do. Specifically, you are looking for local minima and maxima.

Once you have numpy arrays, you'll find other people trying to do the same thing: Find all local Maxima and Minima when x and y values are given as numpy arrays

Also: Get coordinates of local maxima in 2D array above certain value
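For the 4-neighbour check described in the question, a minimal numpy sketch might look like the following. The function name find_spikes, the small example grid, and the threshold of 5 are all hypothetical choices for illustration, and the rule used here (a cell must exceed every available neighbour by more than the threshold) is one possible reading of "spike", not the only one.

```python
import numpy as np

def find_spikes(z, threshold):
    """Boolean mask of cells that exceed each of their up/down/left/right
    neighbours by more than `threshold` (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    # Pad with -inf so a missing neighbour at the edge always passes the test.
    p = np.pad(z, 1, mode="constant", constant_values=-np.inf)
    up, down = p[:-2, 1:-1], p[2:, 1:-1]
    left, right = p[1:-1, :-2], p[1:-1, 2:]
    return ((z - up > threshold) & (z - down > threshold) &
            (z - left > threshold) & (z - right > threshold))

# Made-up elevation grid with a single spike in the middle.
z = np.array([
    [149.2, 149.3, 149.2],
    [149.3, 161.3, 149.2],
    [149.2, 149.3, 149.3],
])

mask = find_spikes(z, threshold=5.0)
print(np.argwhere(mask))  # row/column positions of flagged cells
```

Note that this strict rule would not flag the two high values in the question's sample (161.259 and 160.274), because they sit next to each other and differ by less than 5. Comparing each cell against the median of its neighbours instead might suit that case better.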

@dodell, the data is terrain data (GIS): coordinate systems with elevation. I'm trying to find spikes over a certain criterion of about 5 meters. Sorry about posting the entire script; it wasn't intended to confuse because of its size, it was just an example of what I have so far. Noted on the minimal example link. I will check out the links you added and see what I can figure out. Thanks. — Edwin Page, Apr 13 '16 at 10:41
