
I have a text file with the following contents:

abc

I created the file in Notepad and saved it as UTF-8. Calling the following function with "abc" from the Scala interpreter yields false. The function searches the file's contents for the given string followed by the file's final character.

import java.io.{FileInputStream, StringWriter}
import org.apache.commons.io.IOUtils

def isWordKnown(word: String): Boolean = {
    val inputStreamFromKnownWordsList = new FileInputStream("D:\\WordTracker\\new english\\test.txt")

    var stringWriter = new StringWriter()
    IOUtils.copy(inputStreamFromKnownWordsList, stringWriter, "UTF-8")
    val knownWordsString = stringWriter.toString()
    // Takes only the last character of the file as the "end of line"
    val endOfLine = knownWordsString.substring(knownWordsString.length() - 1, knownWordsString.length())
    return knownWordsString.contains(word + endOfLine)
}
user
  • See [What is the UTF-8 representation of “end of line” in text file](http://stackoverflow.com/questions/13836352/what-is-the-utf-8-representation-of-end-of-line-in-text-file) – user Feb 29 '16 at 13:49
  • Also [Answer to "What is character encoding and why should I bother with it"](http://stackoverflow.com/a/31760986/5159660) – user Feb 29 '16 at 14:19

2 Answers


The bug is that you assumed the EOL is one char long. On Windows it is two chars, "\r\n", so taking only the last char gives "\n", and "abc\n" does not occur in "abc\r\n".

scala> val data = "abc\r\n"
data: String =
"abc
"

scala> val end = data.substring(data.length - 1, data.length)
end: String =
"
"

scala> data.contains("abc" + end)
res0: Boolean = false

The EOL length on the current platform is scala.util.Properties.lineSeparator.length.

You might also check

scala> data.endsWith("abc" + util.Properties.lineSeparator)
res1: Boolean = true
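
One way to avoid depending on the EOL length at all is to compare whole lines instead of searching for the word plus a terminator. The following is only a minimal sketch, not the asker's code: `KnownWords`, `containsWord`, and the two-argument `isWordKnown` are hypothetical names, and it assumes the file is UTF-8 with one word per line. It also assumes Notepad may have written a byte order mark at the start of the file, which Java's UTF-8 decoder leaves in the string as `\uFEFF`.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object KnownWords {
  // Hypothetical helper: true if some line of `contents` equals `word`.
  // A leading byte order mark (assumed possible for Notepad files) is dropped first.
  // Splitting on an optional CR followed by LF handles both "\r\n" and "\n".
  def containsWord(contents: String, word: String): Boolean =
    contents.stripPrefix("\uFEFF").split("\r?\n", -1).exists(_ == word)

  // Reads the whole file as UTF-8 and delegates to containsWord.
  def isWordKnown(path: String, word: String): Boolean = {
    val bytes = Files.readAllBytes(Paths.get(path))
    containsWord(new String(bytes, StandardCharsets.UTF_8), word)
  }
}
```

Because the comparison is per line, `containsWord("abcd\r\n", "abc")` is false, whereas a plain `contains` would also match inside longer words.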
som-snytt

Your method is too complicated for what you want. Also, it's not really idiomatic Scala (using a return and a var?).

Look at this example:

def isWordKnown(word: String): Boolean = {
    val lines = scala.io.Source.fromFile("path/file.txt").mkString
    lines.contains(word)
}

And I get:

scala> isWordKnown("abc")
res1: Boolean = true

Since you are on Windows and asking about file encoding, print the file contents that Scala actually reads (a simple println) before calling lines.contains(word).
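
A plain println will not show invisible characters such as a carriage return, so it can help to print the character codes instead. This is a small sketch of that idea; `InspectText` and `codeUnits` are hypothetical names, not part of any library.

```scala
object InspectText {
  // Renders every char of the string as a hexadecimal code unit,
  // so that "\r", "\n" and other invisible characters become visible.
  def codeUnits(s: String): String =
    s.map(c => f"U+${c.toInt}%04X").mkString(" ")
}
```

For the string "abc\r\n" this gives `U+0061 U+0062 U+0063 U+000D U+000A`, which makes the two-character Windows line ending easy to spot.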

pedrorijo91
  • Or `fromFile(f).getLines exists (_ endsWith word)` if the intention is to check if a line ends with the word. – som-snytt Feb 29 '16 at 02:40