I need to take a column from one CSV file and compare it with a column from another CSV file to find matching values.

I cannot use pandas. I can extract the columns, but I am stuck after that.

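import csv

def first():
    # Collect the second field (index 1) of every row in 1.csv.
    list_pk = []
    with open('1.csv', newline='') as csv_file:
        for row in csv.reader(csv_file):
            if len(row) > 1:
                list_pk.append(row[1])
    return list_pk

def sec():
    # Collect the first field (index 0) of every row in 2.csv.
    list_fk = []
    with open('2.csv', newline='') as csv_file:
        for row in csv.reader(csv_file):
            if row:
                list_fk.append(row[0])
    return list_fk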
JohnPix
  • 2
    Should the whole row match or only specific columns? Also, how big are your files? Can they be loaded in memory? – urban Sep 27 '19 at 08:57
  • @urban A specific column. The files are small, 4000 rows at most. – JohnPix Sep 27 '19 at 08:57
  • 1
    Did you read the [documentation for the `csv` module](https://docs.python.org/3/library/csv.html)? Anyway, I find that instead of writing Python scripts it's usually easier to use `csvkit` to `csvjoin` the files together and `csvgrep` the values, although sometimes these tools are limited – Giacomo Alzetta Sep 27 '19 at 09:04
  • Define what you mean by "compare" _in your question_. You're also not opening the csv files properly; I suggest you look at the examples in the documentation. – martineau Sep 27 '19 at 09:41
  • I guess you should be storing the first item of each line in some data structure. – Stop harming Monica Sep 27 '19 at 12:17
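
The suggestions in the comments above (use the `csv` module, open files with `newline=''`, and store one column in a data structure) can be combined roughly as follows. This is only a minimal sketch: the helper names `read_column` and `find_matches` are made up for illustration, and it assumes that "match" means a value from one column also occurs somewhere in the other column, regardless of row position.

```python
import csv

def read_column(path, index):
    """Return the values found at position `index` in each row of a CSV file."""
    with open(path, newline='') as csv_file:
        # Skip blank or short rows instead of raising IndexError.
        return [row[index] for row in csv.reader(csv_file) if len(row) > index]

def find_matches(pk_path, pk_index, fk_path, fk_index):
    """Return values of the second file's column that also occur in the first file's column."""
    # A set gives fast membership tests; a few thousand rows fit in memory easily.
    pk_values = set(read_column(pk_path, pk_index))
    return [value for value in read_column(fk_path, fk_index) if value in pk_values]

# Example call, using the file names and column positions from the question:
# matches = find_matches('1.csv', 1, '2.csv', 0)
```

Values are compared as strings exactly as they appear in the files, so differences in surrounding whitespace or letter case would count as non-matches unless they are normalized first.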

1 Answer

I hope this helps:

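def findMatch(column_index=0):
  # Walk both files in parallel and print the line pairs whose chosen column differs.
  with open('old.csv', 'r', newline='') as t1, open('new.csv', 'r', newline='') as t2:
    for line1, line2 in zip(t1, t2):
      # Fields in a CSV line are separated by commas, not spaces.
      field1 = line1.rstrip('\r\n').split(',')[column_index]
      field2 = line2.rstrip('\r\n').split(',')[column_index]
      if field1 != field2:
        print(line1, line2)
findMatch()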

Using zip_longest, which also handles files with different numbers of lines:

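from itertools import zip_longest
def findMatch():
  # zip_longest pads the shorter file with None, so no trailing lines are skipped.
  with open('old.csv', 'r', newline='') as t1, open('new.csv', 'r', newline='') as t2:
    for line in zip_longest(t1, t2):
      print(line)
      if line[0] != line[1]:
        print("nq")  # the two lines are not equal
findMatch()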

Reference: itertools.zip_longest

soheshdoshi
  • 1
    When you `open()` csv files in Python 3.x, you should use `newline=''` as shown in the documentation. – martineau Sep 27 '19 at 10:07
  • @soheshdoshi `if line1.split(' ')[1] != line2.split(' ')[0]: IndexError: list index out of range` It seems I'm getting this exception because the columns have different lengths – JohnPix Sep 27 '19 at 12:11
  • @Nikolai That's why I gave you the zip/izip reference. Please do some research – soheshdoshi Sep 28 '19 at 05:48
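
The `IndexError` mentioned in the comments appears whenever a line has fewer fields than the index being read, which is likely when a comma-separated line is split on spaces. One possible way to avoid it, sketched here under assumptions, is to parse the rows with `csv.reader`, pair them with `zip_longest`, and treat a missing row or field as `None`. The names `cell` and `differing_rows` are hypothetical, and note that this compares rows position by position, so it only reports differences between rows on the same line number.

```python
import csv
from itertools import zip_longest

def cell(row, index):
    """Return row[index], or None when the row is missing or too short."""
    if row is None or len(row) <= index:
        return None
    return row[index]

def differing_rows(old_path, old_index, new_path, new_index):
    """Yield (line_number, old_value, new_value) for row pairs whose columns differ."""
    with open(old_path, newline='') as old_file, open(new_path, newline='') as new_file:
        # zip_longest pads the shorter file with None instead of stopping early.
        pairs = zip_longest(csv.reader(old_file), csv.reader(new_file))
        for line_number, (old_row, new_row) in enumerate(pairs, start=1):
            old_value = cell(old_row, old_index)
            new_value = cell(new_row, new_index)
            if old_value != new_value:
                yield line_number, old_value, new_value

# Example call with the indices quoted in the comment above:
# for item in differing_rows('old.csv', 1, 'new.csv', 0):
#     print(item)
```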