1

I'm working on the integration of 2 CSV files.

Files are made by the following columns:

First .csv:

SKU | Name | Quantity | Active
121 | Jablko | 23 | 1

Another .csv consists following:

SKU | Quantity
232 | 4
121 | 2

I'd like to update 1.csv with data from 2.csv, in Linux, any idea how to do it in best way? Python?

Adam Lesniak
  • 878
  • 10
  • 32
  • Are there really spaces around the separators, and are the separators really `|`s? – Tim Pietzcker Jun 17 '13 at 17:50
  • 1
    Import both tables in a DB then query using a join. See http://stackoverflow.com/questions/2887878/importing-a-csv-file-into-a-sqlite3-database-table-using-python – Sylvain Leroux Jun 17 '13 at 17:51
  • 1 CSV file is generated by the warehouse sytem, another one is generated from the website/shop/database, thats impossible to do this in MySQL Separator is ',' – Adam Lesniak Jun 17 '13 at 17:55
  • 1
    For parsing, combining, and analyzing real-world data, I switched to Python [pandas](http://pandas.pydata.org/#quick-vignette) from SQL. Check it out. – Dan Allan Jun 17 '13 at 17:56

4 Answers4

3

The awk solution:

awk -F ' \\| ' -v OFS=' | ' '
    NR == FNR {val[$1] = $2; next}
    $1 in val {$3 = val[$1]} 
    {print}
' 2.csv 1.csv

The FS input field separator variable is treated as a regular expression while the output field separator is treated as a plain string, hence the different treatement of the pipe character.

glenn jackman
  • 238,783
  • 38
  • 220
  • 352
2

This is a solution with gnu awk (awk -f script.awk file2.csv file1.csv):

BEGIN {FS=OFS="|"}
FNR == NR {
     upd[$1] = $2
     next
}

{$3 = upd[$1]; print}
Dany Bee
  • 552
  • 3
  • 12
1

Untested, but something along the lines of:

import csv
from itertools import islice

with open('first.csv') as f1, open('second.csv') as f2, open('new', 'wb') as fout:
    sku_to_qty = dict(islice(csv.reader(f2, delimiter='|'), 1, None)) # create lookup table
    csvin = csv.reader(f1, delimiter='|') # input
    csvout = csv.writer(fout, delimiter='|') # output
    csvout.writerow(next(csvin)) # write header
    for row in csvin:
        row[2] = sku_to_qty.get(row[0], row[2]) # update or presume the same
        csvout.writerow(row)
Jon Clements
  • 138,671
  • 33
  • 247
  • 280
1
awk 'BEGIN{FS=OFS="|"}NR==FNR{a[$1]=$2;next}{$3=a[$1]" "}1' second.csv first.csv
jaypal singh
  • 74,723
  • 23
  • 102
  • 147