Given this CSV file:

date,name,st,code,num
2020-03-25,AB,53,2585,130
2020-03-26,AB,53,3208,151
2020-03-26,BA,35,136,1
2020-03-27,BA,35,191,1

I want to create the following hash with the given data:

{:AB=>[["2020-03-25", "2585"], ["2020-03-26", "3208"]], :BA=>[["2020-03-26", "136"], ["2020-03-27", "191"]]}

I tried this:

require 'csv'
h=Hash.new([])
CSV.foreach('file.csv', headers: true) do |row|
  h[row['st']] << [[row['date'], row['code']]]
end

but all I get is an empty hash h.

Khalil Gharbaoui
builder-7000

2 Answers

Let's first create the CSV file.

str =<<~_
date,name,st,code,num
2020-03-25,AB,53,2585,130
2020-03-26,AB,53,3208,151
2020-03-26,BA,35,136,1
2020-03-27,BA,35,191,1
_

FName = 't'
File.write(FName, str)
  #=> 120 

Now we can read the file row by row with CSV::foreach, which returns an enumerator when called without a block, and build the hash as we go.

require 'csv'

CSV.foreach(FName, headers: true).
  with_object(Hash.new { |h,k| h[k] = [] }) do |row,h|
    h[row['name'].to_sym] << [row['date'], row['code']]
end  
  #=> {:AB=>[["2020-03-25", "2585"], ["2020-03-26", "3208"]],
  #    :BA=>[["2020-03-26", "136"], ["2020-03-27", "191"]]}

I've used Hash::new with a block to create a hash h such that, when h has no key k, evaluating h[k] executes h[k] = [] and returns the new empty array. That way, h[k] << 123 results in h[k] #=> [123] even when h initially has no key k.
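
The behaviour of the block form can be checked in isolation. The following is only a minimal sketch; the keys and the values 123 and 456 are illustrative, and the plain hash is included purely for contrast.

```ruby
# Block form: on a missing key the block runs and stores a fresh array under that key.
h = Hash.new { |hash, key| hash[key] = [] }
h[:AB] << 123            # no key :AB yet, so [] is stored first, then 123 is pushed
h[:AB] << 456            # the key exists now, so the block is not called again
h[:BA]                   # merely reading a missing key also stores an empty array

h == { AB: [123, 456], BA: [] }   #=> true

# A plain hash, for contrast, returns nil for a missing key and stores nothing.
plain = {}
plain[:AB]               #=> nil
plain.empty?             #=> true
```

Note the side effect on the fourth line: with this default block, even a lookup adds the key.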

Alternatively, one could write:

CSV.foreach(FName, headers: true).with_object({}) do |row,h|
    (h[row['name'].to_sym] ||= []) << [row['date'], row['code']]
end  

One could also use a converter to turn the values of name into symbols, though some might see that as overkill here:

CSV.foreach(FName, headers: true,
  converters: [->(v) { v.match?(/\p{Alpha}+/) ? v.to_sym : v }] ).
  with_object(Hash.new { |h,k| h[k] = [] }) do |row,h|
    h[row['name']] << [row['date'], row['code']]
end  
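
A caveat on the converter: the pattern /\p{Alpha}+/ is not anchored, so it symbolizes any field that contains at least one letter. A minimal sketch of a stricter variant follows, assuming only fields that are alphabetic from start to end should become symbols. The name to_sym_if_alpha is hypothetical, and the data is a shortened in-memory copy of the question's file.

```ruby
require 'csv'

# Two rows copied from the question, held in memory instead of a file.
data = <<~EOS
  date,name,st,code,num
  2020-03-25,AB,53,2585,130
  2020-03-26,BA,35,136,1
EOS

# Hypothetical converter: symbolize a field only when it is alphabetic from start to end.
to_sym_if_alpha = ->(v) { v.match?(/\A\p{Alpha}+\z/) ? v.to_sym : v }

rows = CSV.parse(data, headers: true, converters: [to_sym_if_alpha])
rows.map { |row| row['name'] }   #=> [:AB, :BA]
rows.map { |row| row['date'] }   #=> ["2020-03-25", "2020-03-26"]
rows.headers                     #=> ["date", "name", "st", "code", "num"]
```

The headers stay strings because converters apply to data fields only; header rows are handled by the separate header_converters option.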
Cary Swoveland
  • Thanks, that worked. I don't recall using `each_with_object` before; I need to read its documentation. – builder-7000 Mar 30 '20 at 04:28
  • I initially gave an answer that treated the CSV file as an ordinary text file. I had second thoughts about that, however, mainly because it would break if the columns were rearranged and might break if new fields were added or unused fields were removed. – Cary Swoveland Mar 30 '20 at 05:09
  • Also, one set of CSV data _may_ span multiple lines :) – Ja͢ck Mar 30 '20 at 05:19
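
The multi-line point raised in the comments can be checked directly. This is a minimal sketch using made-up data (not from the question) in which the quoted name field of the only record contains a line break.

```ruby
require 'csv'

# Hypothetical data: one header line plus ONE record whose quoted field spans two lines.
data = "date,name,code\n2020-03-25,\"A\nB\",2585\n"

data.lines.size                          #=> 3 (physical lines)

table = CSV.parse(data, headers: true)
table.size                               #=> 1 (logical records)
table[0]['name']                         #=> "A\nB"
table[0]['code']                         #=> "2585"
```

Splitting on line breaks would see three lines here, while the CSV parser sees a single record, and access by header name keeps working if columns are later reordered.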

There is no need to read the CSV file as a plain text file. You can use the CSV library as you intended and address the actual issues at hand.

There are three issues here:

  1. This won't work:

    h = Hash.new([])
    

    use this instead:

    h = Hash.new {|h, k| h[k] = [] }
    

    See "Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])" as @jack commented.

  2. Keep headers: true (your code already passes it), because the first row of your file is a header row.

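  3. Pushing with << alone never stores a key when the default is Hash.new([]), which is why h stays empty. With the block form from issue 1, h[k] already stores the new array, so the explicit assignment below is redundant but harmless. Note also that the key should come from row['name'] rather than row['st'], and that the pushed element should be a single pair rather than [[...]]:

    h[row['name']] = h[row['name']] << [row['date'], row['code']]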
    

This will work for you:

require 'csv'
h = Hash.new { |h, k| h[k] = [] }

CSV.foreach('file.csv', headers: true) do |row|
  h[row['name']] = h[row['name']] << [row['date'], row['code']]
end

# transform_keys returns a new hash with symbol keys; h itself keeps its string keys.
h.transform_keys(&:to_sym)

#=> {:AB=>[["2020-03-25", "2585"], ["2020-03-26", "3208"]], :BA=>[["2020-03-26", "136"], ["2020-03-27", "191"]]}
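
The empty hash in the question follows from how Hash.new([]) treats missing keys (issue 1 above). A minimal sketch, with illustrative keys and values:

```ruby
# Hash.new([]) keeps ONE array object as the default for every missing key.
shared = Hash.new([])
shared[:AB] << 1     # mutates the default array; no key is stored in the hash
shared[:BA] << 2     # the same default array is mutated again

shared.empty?                     #=> true
shared[:anything]                 #=> [1, 2]
shared[:AB].equal?(shared[:BA])   #=> true (same object)
```

The pushed values are not lost; they accumulate in the shared default object, which is returned for every missing key while the hash itself stays empty.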
Khalil Gharbaoui