scala> val st1 = "|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2|||"
scala> st1.split('|').length
resXX: Int = 39

scala> val st2 = "|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2| | |"
scala> st2.split('|').length
resXX: Int = 41

That is, the trailing empty fields are ignored by split. Is there any solution other than replacing every "||" with "| |"?

The expected output is Int = 41.

Indeed, in the real file I may have lines such as:

"|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2|||150"

that is, with a 42nd column containing a number (in this case the result is Int = 42).

Every line has the same number of | characters, but depending on the content of the columns, split('|').length returns a different result (31, 40, ..., 42).

I can understand the column after the last separator being dropped, but not the empty columns before it.

tschmit007

1 Answer


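This behaviour comes from Java, since that is where String#split is defined. As its documentation states, in the default case (limit = 0) trailing empty strings are discarded. Leading and interior empty strings are kept, which is why the result depends on where the last non-empty column sits.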
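As a small illustration of that rule, here is a sketch on a made-up input (the `sample` string and the `SplitLimitDemo` object are only for illustration, they are not from the question). It compares the default limit with a negative one, using the regex form of the separator:

```scala
object SplitLimitDemo {
  // Made-up input: two non-empty fields followed by three empty ones.
  val sample = "a|b|||"

  // limit = 0 (the default): trailing empty strings are dropped.
  val defaultSplit: Array[String] = sample.split("\\|")

  // Negative limit: every field is kept, including trailing empty ones.
  val keepAll: Array[String] = sample.split("\\|", -1)

  // Leading empty fields are not affected by the default limit.
  val leading: Array[String] = "||a".split("\\|")

  def main(args: Array[String]): Unit = {
    println(defaultSplit.length) // 2
    println(keepAll.length)      // 5
    println(leading.length)      // 3
  }
}
```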

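To make it work as you expect, pass a negative limit: str.split("\\|", -1). The two-argument overload is Java's split(String regex, int limit), so the separator has to be given as a regex string with the pipe escaped rather than as the Char '|'. Every field is then kept, including the empty one after the final separator.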
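Applied to the first line from the question, this could look like the sketch below. The `PipeFields` object and its two helpers are hypothetical names introduced here. Because a negative limit also keeps the field after the final separator, the full split has 42 entries; `withoutEmptyTail` is one possible way, assumed rather than taken from the question, to get the 41 the asker expects while still keeping a non-empty 42nd column such as `150`.

```scala
object PipeFields {
  // Hypothetical helper: split on a literal '|' and keep every field.
  def fields(line: String): Array[String] = line.split("\\|", -1)

  // Hypothetical helper: drop only the final field, and only when it is empty.
  // split with a negative limit always returns at least one element,
  // so calling .last here is safe.
  def withoutEmptyTail(line: String): Array[String] = {
    val all = fields(line)
    if (all.last.isEmpty) all.init else all
  }

  val st1 = "|||||||000001|09/01/2014|V|174500,00|22||BD |2540|LEC|1000|BEB|01|53||AE|111 ||49|94,22|6||||||||2|2|App|80|2|||"

  def main(args: Array[String]): Unit = {
    println(fields(st1).length)                    // 42: 41 separators delimit 42 fields
    println(withoutEmptyTail(st1).length)          // 41
    println(withoutEmptyTail(st1 + "150").length)  // 42: the last column is not empty
  }
}
```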

Cyrille Corpet