2

I have two csv files.

one.csv:

1, 12.1455675, -13.1287564, 23, 9, 4.5, 4
2, 12.5934593, -13.0856385, 14, 5, 9.7, 6
3, 12.0496204, -13.8938582, 14, 6, 3.4, 9
4, 12.1456084, -12.1939589, 45, 2, 3.4, 8

two.csv:

9, 12.0496, -13.8939, .3, 55
3, 12.1456, -13.1288, 3.4, 9

What I want to do is match the two csv files based on columns one and two. I want another csv file that has the matched columns 1 and 2, but also includes the corresponding 3rd column values from two.csv and 6th column values from one.csv. Like this:

12.0496, -13.8939, 55, 3.4
12.1456, -12.1288, 9, 4.5

I am unsure how to go about this especially when some of the values in two.csv are rounded.

Any help is greatly appreciated!

user2483176
  • 317
  • 1
  • 10
  • 21

2 Answers2

3

You could use pandas' io to read/write csv files and its database-style joining/merging capabilities to merge the files:

import pandas as pd

normalize = lambda x: "%.4f" % float(x) # round
df = pd.read_csv("one.csv", index_col=(0,1), usecols=(1, 2, 5),
                 header=None, converters=dict.fromkeys([1,2], normalize))
df2 = pd.read_csv("two.csv", index_col=(0,1), usecols=(1, 2, 4),
                  header=None, converters=dict.fromkeys([1,2], normalize))
result = df.join(df2, how='inner')
result.to_csv("output.csv", header=None) # write as csv

Result

12.0496,-13.8939,3.4,55
12.1456,-13.1288,4.5,9
jfs
  • 399,953
  • 195
  • 994
  • 1,670
1

This is a quite common question on SO.

As of myself, same answer: for a medium-term solution, import in a DB, then perform a query using a JOIN ...


Try a search: https://stackoverflow.com/search?q=combining+csv+python

Community
  • 1
  • 1
Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
  • I haven't been able to find any that match based on two columns and deal with the rounding issue though. – user2483176 Jun 24 '13 at 15:40
  • My answer is still valid thought. This is not (only) a joke: Perform a [join](http://en.wikipedia.org/wiki/Join_%28SQL%29) on two columns. In addition, you could even do some calculations at SQL level... – Sylvain Leroux Jun 24 '13 at 15:41
  • Could you possibly point me to an example? I am not familiar with join in python. – user2483176 Jun 24 '13 at 15:47
  • I wasn't taking about a "join in Python" but instead of using Python to perform a _SQL join_: Here is an example: http://zetcode.com/db/sqlite/joins/ You will find many ressources on the internet. As a side note, standard Python distributions embed Sqlite3 so you have no extra lib to install. – Sylvain Leroux Jun 24 '13 at 15:53
  • Thanks. Sorry I am not familiar with SQL at all.. do I need to convert my csv files to tables? – user2483176 Jun 24 '13 at 16:09
  • See http://stackoverflow.com/questions/2887878/importing-a-csv-file-into-a-sqlite3-database-table-using-python Import each CSV in its own table. After that, you will have all the SQL power at hand to query, update, join your data – Sylvain Leroux Jun 24 '13 at 16:31