35

The task is to look up the value of a specific field (identified by its position in the line) in the row whose key field has a given value. The input is a simple CSV file (plain commas as separators, no quotes enclosing fields, never a comma inside a field) with a header in its first line.

User uynhjl has given an example (but with a different character as a separator):


import scala.io.Source

val src = Source.fromFile("/etc/passwd")
val iter = src.getLines().map(_.split(":"))
// print the uid for Guest
iter.find(_(0) == "Guest") foreach (a => println(a(2)))
// the rest of iter is not processed
src.close()

The question in this case is: how do I skip the header line when parsing?

Don Branson
Ivan
  • I have just written a question and comprehensive answer covering both parsing the input and then composing the output for a CSV file. It's located here: http://stackoverflow.com/a/32488453/501113 – chaotic3quilibrium Sep 09 '15 at 20:24

3 Answers

31

You can just use drop:

val iter = src.getLines().drop(1).map(_.split(":"))

From the documentation:

def drop (n: Int) : Iterator[A]: Advances this iterator past the first n elements, or the length of the iterator, whichever is smaller.
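
As a rough sketch of how drop(1) fits into the lookup described in the question, the whole thing could look like the block below. The object name `HeaderSkip`, the helper `lookup`, and its parameters are hypothetical names chosen for illustration, and the code assumes the simple CSV from the question (no quotes, no embedded commas). It takes an `Iterator[String]`, so it works equally with `src.getLines()` or with in-memory lines.

```scala
object HeaderSkip {
  // Hypothetical helper: find the first row whose key column equals `key`
  // and return the value in column `valueIdx` (both indices are 0-based).
  // Assumes a simple CSV whose first line is a header.
  def lookup(lines: Iterator[String], keyIdx: Int, key: String, valueIdx: Int): Option[String] =
    lines
      .drop(1)                 // skip the header line
      .map(_.split(",", -1))   // limit -1 keeps trailing empty fields
      .find(row => row.length > math.max(keyIdx, valueIdx) && row(keyIdx) == key)
      .map(_(valueIdx))
}
```

For example, `HeaderSkip.lookup(Iterator("name,role,uid", "Guest,visitor,201"), 0, "Guest", 2)` should yield `Some("201")`. Because `find` stops at the first match, the rest of the iterator is not processed, as in the original example.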

Travis Brown
  • This is **not the correct way to read a CSV file**. "Parsing CSV files properly is not a trivial matter"; see the **[CSV specification](https://en.wikipedia.org/wiki/Comma-separated_values#Specification)** and the next answer. – Peter Krauss Sep 20 '19 at 17:26
  • @PeterKrauss While the title focuses on the CSV part, the question itself makes it clear that what the user is trying to do is skip a line. – Travis Brown Sep 21 '19 at 15:11
  • Sorry Travis, the bold is not aimed at you; it is for the 48k people who viewed this page and lost time looking for a general "CSV solution". It seems to be a problem with the Scala standard libraries: there is no standard... Yet Scala is used for big data-centric projects (e.g. Spark) and there is no obvious CSV reader. – Peter Krauss Sep 22 '19 at 11:45
18

Here's a CSV reader in Scala. Yikes.

Alternatively, you can look for a CSV reader in Java, and call that from Scala.

Parsing CSV files properly is not a trivial matter. Escaping quotes, for starters.
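
To illustrate why quotes matter, here is a minimal sketch of a quote-aware line splitter. It is not taken from the reader linked above; `QuotedSplit` and `splitLine` are hypothetical names, and the sketch only covers commas inside double-quoted fields and doubled quotes (`""`) as an escaped quote. It does not handle quoted fields that span several lines, so a dedicated library remains the safer choice for real-world files.

```scala
object QuotedSplit {
  // Minimal sketch: split one CSV line, treating commas inside double
  // quotes as data and "" inside a quoted field as a literal quote.
  def splitLine(line: String): Vector[String] = {
    val fields = Vector.newBuilder[String]
    val cur = new StringBuilder
    var inQuotes = false
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (inQuotes) {
        if (c == '"') {
          // a doubled quote is an escaped quote; a single one closes the field
          if (i + 1 < line.length && line.charAt(i + 1) == '"') { cur += '"'; i += 1 }
          else inQuotes = false
        } else cur += c
      } else c match {
        case '"' => inQuotes = true
        case ',' => fields += cur.toString; cur.clear()
        case _   => cur += c
      }
      i += 1
    }
    fields += cur.toString // last field has no trailing comma
    fields.result()
  }
}
```

On a line such as `a,"This, that",c`, a plain `split(",")` produces four pieces, while `QuotedSplit.splitLine` should return the three fields `a`, `This, that` and `c`.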

Robert Harvey
  • I've seen this, but it looks too complex for my simple case. I don't need all those regexps as my files are very simple. – Ivan Sep 01 '10 at 00:24
  • I've just posted a way simpler solution (which is easily copy/pasted right into the local coding context) on this StackOverflow answer: http://stackoverflow.com/a/32488453/501113 – chaotic3quilibrium Sep 09 '15 at 20:37
  • This should be a comment at best, since it doesn't address the question (how to skip a row). – Travis Brown Feb 17 '16 at 14:04
4

First I read the header line using next(), and then the remaining lines are already in the src iterator. This works fine for me.

import scala.io.Source

val src = Source.fromFile(f).getLines()

// assuming first line is a header; next() leaves src safe to keep using
// (take(1) does not: an iterator should be discarded after calling take)
val headerLine = src.next()

// processing remaining lines
for (l <- src) {
  // split line by comma and process each field
  l.split(",").map { c =>
      // your logic here
  }
}
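
Keeping the header line around, rather than discarding it, also makes it possible to address columns by name. The following is only a sketch under the question's "simple CSV" assumption; `ByColumnName` and `lookup` are hypothetical names, not part of this answer's original code.

```scala
object ByColumnName {
  // Hypothetical helper: return the value of column `valueCol` in the first
  // row where column `keyCol` equals `key`. The first line is taken as the header.
  def lookup(lines: Iterator[String], keyCol: String, key: String, valueCol: String): Option[String] =
    if (!lines.hasNext) None
    else {
      // map each header name to its 0-based position
      val index = lines.next().split(",", -1).map(_.trim).zipWithIndex.toMap
      for {
        k   <- index.get(keyCol)
        v   <- index.get(valueCol)
        row <- lines.map(_.split(",", -1)).find(r => r.length > math.max(k, v) && r(k) == key)
      } yield row(v)
    }
}
```

With the lines `name,role,uid` and `Guest,visitor,201`, a call like `ByColumnName.lookup(lines, "name", "Guest", "uid")` should return `Some("201")`, and an unknown column name gives `None`.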
tuxdna
  • The problem with `split(",")` is that when you come across a string like `"This, that"`, it splits that too, even though it is part of a single field. – Chetan Bhasin Feb 06 '15 at 18:54
  • I just addressed the very common and erroneous "use split(",")" advice in my comprehensive answer to a CSV question here: http://stackoverflow.com/a/32488453/501113 – chaotic3quilibrium Sep 09 '15 at 20:25
  • The question says `simple CSV`. If the CSV is not simple, it is always better to use a dedicated CSV library. – tuxdna Sep 10 '15 at 07:20