I have a CSV file like the following:

ID      Required  -- these are headers
SD0005   Yes      -- row information

I have to validate each row against its header. Say ID may contain only letters and numbers and must be no more than 6 characters long. The Required column should be either yes or no in every row.

How can I achieve this in Ruby with good performance if I have to process a huge file with more than 1000 rows?

I'm reading each row's values by header as follows:

require 'csv'

CSV.foreach('file path', :headers => true) do |csv_obj|
  csv_obj['ID']
  csv_obj['Required']
end

Is there a way to know which condition failed while validating a column against its header for a row? I need to know which condition failed and print it out.

I'm new to Ruby and would appreciate any help.

Rosie G

1 Answer

To get the data into Ruby from the CSV file, try the following:

require 'csv'

# This reads the data in with headers, converts the column names to symbols,
# and then turns each resulting CSV::Row instance into a hash.
data = CSV.read('temp.csv', headers: true, header_converters: :symbol).map(&:to_h)

This should return the following:

=> [{:id=>"SD0005", :required=>"Yes"}, ...]
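Since the question mentions large files, here is a minimal sketch of a streaming alternative. `CSV.read` loads every row into memory, while `CSV.foreach` yields one `CSV::Row` at a time. The temporary file, its comma-separated contents, and the `seen_ids` array are assumptions made only so the sketch runs on its own.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample file (assumed comma-separated) so the sketch is runnable.
sample = Tempfile.new(['sample', '.csv'])
sample.write("ID,Required\nSD0005,Yes\nAB12,No\n")
sample.close

seen_ids = []

# CSV.foreach reads the file row by row instead of building one big array.
CSV.foreach(sample.path, headers: true, header_converters: :symbol) do |row|
  # In real use, validate or store the row here; collecting IDs is
  # only for demonstration.
  seen_ids << row[:id]
end

puts seen_ids.inspect
```

For a file of a few thousand rows either approach should be fine; streaming mainly matters once the file no longer fits comfortably in memory.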

Once you have all of your information in a format you can work with in Ruby, you could make a method that checks the validity of each ID.

def valid_id?(id_string)
  # Uses a regular expression to ensure the ID is 1 to 6
  # characters long and consists only of letters and digits.
  # The double bang (!!) turns a truthy value into `true`
  # and a falsy value into `false`.

  !!id_string.match(/\A[A-Za-z0-9]{1,6}\z/)
end

If you wanted to test another column's validity, do so with a separate method.

def valid_required?(req_column)
  # Ignores surrounding whitespace and letter case, so "Yes" and " yes" both pass
  %w[yes no].include?(req_column.to_s.strip.downcase)
end

Create a main validator method

def all_valid?(row)
  valid_id?(row[:id]) && valid_required?(row[:required])
end

Then keep only the records that pass every validation:

# #select keeps values whose block evaluates to `true`
valid_records = data.select { |record| all_valid?(record) }
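The question also asks which condition failed for a row. One possible approach, sketched here under assumptions, is to keep each rule next to a description and collect the descriptions of the rules that fail. The `RULES` table, the `validation_errors` method, the message wording, and the sample rows are all hypothetical names introduced for illustration, and `match?` assumes Ruby 2.4 or later.

```ruby
# Hypothetical rule table: column => list of [description, check] pairs.
RULES = {
  id: [
    ['must contain only letters and digits', ->(v) { v.to_s.match?(/\A[A-Za-z0-9]*\z/) }],
    ['must be 1 to 6 characters long',       ->(v) { (1..6).cover?(v.to_s.length) }]
  ],
  required: [
    ['must be yes or no', ->(v) { %w[yes no].include?(v.to_s.strip.downcase) }]
  ]
}.freeze

# Returns an array of messages; it is empty when the row passes every rule.
def validation_errors(row)
  RULES.flat_map do |column, checks|
    checks.reject { |_message, check| check.call(row[column]) }
          .map { |message, _check| "#{column} #{message} (got #{row[column].inspect})" }
  end
end

# Assumed sample rows in the same shape as the hashes built from the CSV.
rows = [
  { id: 'SD0005',   required: 'Yes' },
  { id: 'TOOLONG1', required: 'maybe' }
]

rows.each_with_index do |row, index|
  errors = validation_errors(row)
  puts "row #{index + 1}: #{errors.join('; ')}" unless errors.empty?
end
```

A row is valid exactly when `validation_errors(row)` is empty, so the same method could drive both the filtering and the error report.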
  • I have functions in the javascripts folder under the app folder (the directory structure created with a new Rails project). For the hash created above, how can I call a function for every column of a given row? Say I have to call valid_id for id, valid_required for required, and so on; each column in the header has its own function. I then want to store the complete row in the database only if validation passes for all of its columns. Could you be more specific about how to code this in Ruby? – Rosie G Sep 07 '15 at 08:19
  • You can create another method that checks the validity of a record's `required` property, like `valid_required?`, then create a method that runs all of your validations. Ex: Inside of `def all_columns_valid` write `valid_id? && valid_required?`. – Alex Flores Sep 08 '15 at 13:24
  • I'd encourage you to explore the Ruby CSV documentation. It should give you the understanding you need to debug these sorts of things. If all else fails, see here http://stackoverflow.com/questions/4822422/output-array-to-csv-in-ruby – Alex Flores Sep 09 '15 at 14:53
  • Thanks Alex. Will check it out – Rosie G Sep 14 '15 at 17:29