5

Is there a way to edit a CSV file using the map method in Ruby? I know I can open a file using:

CSV.open("file.csv", "a+")

and add content to it, but I have to edit some specific lines.

The foreach method is only useful to read a file (correct me if I'm wrong).

I checked the Ruby CSV documentation but I can't find any useful info.

My CSV file has less than 1500 lines so I don't mind reading all the lines.

the Tin Man
Badr Tazi
  • 2
    http://stackoverflow.com/a/3561424/4621324 – Axalix Oct 24 '15 at 19:02
  • 1
    While you can use `'a+'`, the better question is whether you should. If your code crashes mid-run, you stand a very good chance of corrupting your only file, which is not acceptable in a production environment. Instead, use `foreach` to read the original file and write to an entirely new file, adding the changes at the appropriate places. After the new file is completely written, rename the old file to a backup name, rename the new file to the old file's name, and *then* you are free to delete the backup. Don't slurp the original file, as it's slower and not scalable. – the Tin Man Oct 24 '15 at 21:12
  • does any of the below answer your question @Badoo – z atef Nov 04 '15 at 20:38
  • Please read "[mcve]". We want to know what you tried to solve this, and why it didn't work. Without that it looks like you didn't try and want us to write the code for you, which isn't the SO way. See https://meta.stackoverflow.com/q/274630/128421 and http://meta.stackoverflow.com/questions/261592 – the Tin Man Jun 08 '17 at 16:57
  • You don't say how you determine what lines have to be changed. By line-number? By content in the line? In general, no, `map` wouldn't be a good choice as it implies that every line has to be passed through the block. Instead, a simple conditional test iterating over the lines and looking for the ones that need to be changed would be more sensible. – the Tin Man Jun 08 '17 at 17:17
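The write-to-a-new-file-then-rename approach described in the comments above could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name `rewrite_csv` and the `.tmp` and `.bak` suffixes are made up for this example, and the block decides which rows change (here by zero-based line index).

```ruby
require 'csv'
require 'fileutils'

# Hypothetical helper (not part of the CSV library): streams `path` row by
# row, yields each row with its zero-based index, and writes whatever the
# block returns to a temporary file. The original is only renamed after the
# new file has been written completely.
def rewrite_csv(path)
  tmp_path    = "#{path}.tmp" # assumed naming scheme
  backup_path = "#{path}.bak" # assumed naming scheme

  CSV.open(tmp_path, 'wb') do |out|
    CSV.foreach(path).with_index do |row, index|
      out << yield(row, index)
    end
  end

  FileUtils.mv(path, backup_path) # keep the original as a backup
  FileUtils.mv(tmp_path, path)    # the new file takes the original's name
  backup_path
end
```

It might be called as `rewrite_csv('file.csv') { |row, i| row[0] = 'changed' if [3, 4, 5].include?(i); row }`. Because only one row is held in memory at a time, the file size should not matter much, and a crash mid-run leaves the original file untouched.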

3 Answers

6

Another answer using each.with_index():

require 'csv'

rows_array = CSV.read('sample.csv')

desired_indices = [3, 4, 5] # zero-based indices of the rows you would like to modify
target_column = 0           # hypothetical: index of the column to change, adjust as needed

rows_array.each.with_index do |row, index|
  if desired_indices.include?(index)

    # modify over here
    row[target_column] = 'modification'

  end
end

# now update the file
CSV.open('sample3.csv', 'wb') { |csv| rows_array.each{|row| csv << row}}

You can also use each_with_index {} instead of each.with_index {}.
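As a small illustration of how the two forms relate, the sketch below uses a made-up in-memory array instead of a real CSV file. Note that an argument passed to `with_index` only changes the number the count starts from; it does not skip any rows.

```ruby
# Made-up sample rows standing in for CSV.read output.
rows = [%w[a 1], %w[b 2], %w[c 3]]

# each_with_index and each.with_index number the rows identically, from 0.
plain   = rows.each_with_index.map { |row, i| [i, row[0]] }
chained = rows.each.with_index.map { |row, i| [i, row[0]] }

# with_index(1) still visits every row; only the numbering starts at 1.
offset  = rows.each.with_index(1).map { |row, i| [i, row[0]] }
```

So `with_index(n)` is handy for 1-based line numbers in messages, but the resulting number should not be used to index back into the array.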

Shiva
  • 2
    This is slurping the data which isn't a scalable solution so only do it if the file is guaranteed to never exceed available memory. See https://stackoverflow.com/q/25189262/128421. – the Tin Man Jun 08 '17 at 17:15
2

Here is a little script I wrote as an example of how to read CSV data, do something to the data, and then write the edited rows out to a new file:

read_write_csv.rb:

#!/usr/bin/env ruby
require 'csv'

src_dir = "/home/user/Desktop/csvfile/FL_insurance_sample.csv"
dst_dir = "/home/user/Desktop/csvfile/FL_insurance_sample_out.csv"
puts " Reading data from  : #{src_dir}"
puts " Writing data to    : #{dst_dir}"
# Create the output with CSV.open (not File.open) so each row is written as a CSV line.
# The block form closes the file automatically.
CSV.open(dst_dir, 'wb') do |csv_out|
  # read from the existing file, one row at a time
  CSV.foreach(src_dir, :headers => false) do |row|

    # each row is an Array of fields, so you can inspect them:
    # row.each_with_index { |rowcontent, col_num| puts "#{rowcontent} #{col_num}" }

    # OR array to hash .. just saying (needs an even number of fields)
    # h = Hash[*row]

    # OR use map to change every field (to_s guards against empty fields, which are nil)
    newrow = row.map { |field| field.to_s.capitalize }

    # Lastly, write the edited, regexed data ..etc to the output file.
    csv_out << newrow

  end
end

The output file has the desired data:

USER@USER-SVE1411EGXB:~/Desktop/csvfile$ ls
FL_insurance_sample.csv  FL_insurance_sample_out.csv  read_write_csv.rb

The input file data looked like this:

policyID,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
119736,FL,CLAY COUNTY,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,Residential,Masonry,3
206893,FL,CLAY COUNTY,190724.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,Residential,Wood,1
333743,FL,CLAY COUNTY,0,79520.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,Residential,Wood,3
172534,FL,CLAY COUNTY,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,Residential,Wood,1
the Tin Man
z atef
  • Thank you @the Tin Man for revising this. Tin Man is the man. – z atef Jun 12 '17 at 20:39
  • You should use the block form of `File.open(dst_dir, 'wb')` rather than assign to a variable. It's the Ruby way. – the Tin Man Jun 13 '17 at 17:35
  • I think there is a mistake: `csv_out = File.open(dst_dir, 'wb')` should be `csv_out = CSV.open(dst_dir, 'wb')`, otherwise the writing will be a single row corresponding to the concatenation of the different elements. – Thomas May 05 '20 at 07:48
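The difference between the two ways of opening the output file that the comments point out can be checked with a sketch like this one. The helper names `write_with_csv` and `write_with_file` are invented for the comparison; both use the block form so the file is closed automatically.

```ruby
require 'csv'

# Hypothetical helper: CSV.open yields a CSV writer, so `<<` with an Array
# emits one comma-separated line per row.
def write_with_csv(path, rows)
  CSV.open(path, 'wb') { |csv| rows.each { |row| csv << row } }
end

# Hypothetical helper: File.open yields a plain File, so `<<` just writes the
# Array's string form, with no CSV formatting and no line break per row.
def write_with_file(path, rows)
  File.open(path, 'wb') { |file| rows.each { |row| file << row } }
end
```

Given `[%w[119736 FL Masonry]]`, the first helper should produce the line `119736,FL,Masonry`, while the second does not produce valid CSV.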
2

Is there a way to edit a CSV file using the map method in Ruby?

Yes:

rows = CSV.open('sample.csv')
rows_array = rows.to_a

or

rows_array = CSV.read('sample.csv')

desired_indices = [3, 4, 5] # these are rows you would like to modify

edited_rows = rows_array.each_with_index.map do |row, index| 
  if desired_indices.include?(index)

    # simply return the row
    #   or modify over here
    row[3] = 'shiva'

    # store index in each edited rows to keep track of the rows
    [index, row]

  end
end.compact

# update the main row_array with updated data
edited_rows.each{|row| rows_array[row[0]] = row[1]}

# now update the file
CSV.open('sample2.csv', 'wb') { |csv| rows_array.each{|row| csv << row}}

This is a little messier, isn't it? I suggest using each_with_index without map to do this. See my other answer.
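If map is still preferred, a somewhat tidier variant is to have the block return either the edited row or the untouched one, so that no compact or second pass is needed. This is only a sketch: `edit_rows` and its parameters are names invented for the example, and it works on any array of rows, such as the result of CSV.read.

```ruby
# Hypothetical helper: returns a new array in which the rows at `indices`
# (zero-based) have `column` set to `value`; all other rows pass through
# unchanged. The input rows are not mutated.
def edit_rows(rows, indices, column, value)
  rows.each_with_index.map do |row, index|
    next row unless indices.include?(index)

    row.dup.tap { |copy| copy[column] = value }
  end
end
```

The result can then be written out with CSV.open in the same way as the code above does for rows_array.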

Shiva
  • 1
    Be careful using this technique as it's slurping the entire file into memory. Unless it's guaranteed the input will never exceed memory a line-by-line solution would be safer and faster. See https://stackoverflow.com/q/25189262/128421 – the Tin Man Jun 08 '17 at 17:15