0

I am trying to figure out the best way to pull a value from a CSV file called lookup.csv based on a value in master.csv, and then save the new file as output.csv.

In the example below, the master file has three columns of data with the last column being City. I'd like to replace the City name with the City Code from the lookup.csv file.

I don't have a DB that I can lookup from so I am having to use the CSV file. I am trying to use FasterCSV with Ruby 1.8.7.

Example File Structure:

master.csv:

First Name | Last Name | City
Joey       | Jello     | Atlanta
Home       | Boy       | Dallas

lookup.csv:

City    | City ID
Atlanta | 12345
Dallas  | 12346
Houston | 12347

output.csv:

First Name | Last Name | City
Joey       | Jello     | 12345
Home       | Boy       | 12346
the Tin Man
  • 158,662
  • 42
  • 215
  • 303
mpowmap
  • 651
  • 1
  • 6
  • 11
  • "I don't have a DB" Why not look into SQLite? It's available for almost all OSes out there and great for temporary and in-memory databases. That'd simplify and speed up your task. Load the CSV into it, then you can do random accesses instead of sequential file reads. – the Tin Man Jun 21 '12 at 00:33

1 Answers1

1

I'm using 1.9, where FasterCSV is available as CSV in the standard lib. First I'd create a lookup hash out of lookup.csv:

cities = Hash[CSV.read('lookup.csv', :col_sep => ' | ').to_a[1..-1]]

If the file is very big, you might want to iterate over it with CSV.foreach and build the hash row by row:

cities = {}
CSV.foreach('lookup.csv', :col_sep => ' | ', :headers => true, :return_headers => false) do |line|
  cities[line['City']] = line['City ID']  
end  

Then iterate over master.csv, do a lookup of the city in the hash and write that to output.csv:

CSV.open('output.csv', "w", :headers => ['First Name', 'Last Name', 'City ID'], :write_headers => true) do |output|
  CSV.foreach('master.csv', :col_sep => ' | ', :headers => true, :return_headers => false) do |line|
    output << [line['First Name'], line['Last Name'], cities[line['City']]]
  end  
end
Michael Kohl
  • 66,324
  • 14
  • 138
  • 158