
I'm working with words and their phonemes. I found that in my code (and in the console) what look like two identical strings, 'b eh1 r z' for example, do not return true when compared, whether with double or triple equals. I did sanity tests in the Chrome console, in the Node console, and in the file itself, and they all return the expected results (i.e. only the 'stringified' variable seems corrupted). I'm racking my brains trying to figure out what's going on. This is what is not working as expected:

      let stringified = trialPhonemeSequence.join(" ")
      if (p == "z"){
        console.log(trialPhonemeSequence)
        let bearstring = 'b eh1 r z'
        console.log("SHould be adding 'z' at ", i, "so we got", trialPhonemeSequence, "and stringified", stringified)
        console.log(`String|${dictionary['bears']}| length ${dictionary['bears'].length} should equal |${stringified}| length ${stringified.length}: ${dictionary['bears'] == stringified} and ${bearstring == stringified}`);
      }

What the Chrome Console outputs

String|b eh1 r z| length 10 should equal |b eh1 r z| length 10: false and false
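
One detail in that output may be worth noting: the visible text 'b eh1 r z' has 9 characters, yet both strings report a length of 10, which hints at a character that does not show up when printed. A minimal sketch of one way to make such characters visible, assuming hypothetical sample values (the `suspect` string and the `reveal` helper are made up for illustration and are not taken from the real data):

```javascript
// Hypothetical values: two strings that can look alike when printed.
const expected = "b eh1 r z";
const suspect = "b eh1 r\r z"; // assumed hidden character, for illustration only

// JSON.stringify escapes control characters, so invisible ones show up as \r, \n, \t, ...
function reveal(str) {
  return JSON.stringify(str);
}

console.log(reveal(expected), expected.length); // "b eh1 r z" 9
console.log(reveal(suspect), suspect.length);   // "b eh1 r\r z" 10
console.log(expected === suspect);              // false
```

Logging `reveal(stringified)` next to `reveal(dictionary['bears'])` would show any escaped control character directly in the console.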

Here is the entire function up to that point for context. I don't think you want the entire minimal reproducible code, as it requires large dictionaries and datasets and initialization. The goal of this function is to input 'bear' and look for words that are a phonemic match, allowing for the addition of one phoneme (the 'z' sound in this test case).

function findAddedPhonemes(word, dictionary, hashMap)
{
  let matches = []
  let phonemeSequence = dictionary[word]
  let phonemeSequenceList = phonemeSequence.split(" ")
  for (let i = 0; i <= phonemeSequenceList.length; i++)
  {
    phonemeList.forEach((p, ind) => // all the items in the list
    {
      let trialPhonemeSequence = phonemeSequenceList.slice()
      trialPhonemeSequence.splice(i, 0, p) // insert p phoneme into index
      let stringified = trialPhonemeSequence.join(" ")
      if (p == "z"){
        console.log(trialPhonemeSequence)
        let bearstring = 'b eh1 r z'
        console.log(`String|${dictionary['bears']}| length ${dictionary['bears'].length} should equal |${stringified}| length ${stringified.length}: ${dictionary['bears'] == stringified} and ${bearstring == stringified}`);
      }
      if (stringified == "b eh1 r z"){  //THIS IS WHERE ITS BROKEN
        console.log("Bears stringified searching!!!!!!!!!!!!")
      }

      let hash = stringified.hashCode(dictKeys.length * 4)
      if (hashMap[hash] !== undefined)
      {
        hashMap[hash].forEach((o) =>
        {
          let key = getObjectsKey(o)
          if (checkIfIdentical(dictionary[key], stringified))
          {
            matches.push(key)
          }
        })
      }
    })
  }
  console.log("Matches", matches)
  return matches
}

EDIT (SOLVED):

There is a char 13 (carriage return, '\r') in the stringified string but not in the others. I think I understand where this is coming from. I was inserting a new phoneme with splice at each position in the word, and when splicing it onto the end of the word, the trailing '\r' is not automatically stripped, which results in comparison errors and wrong hash values. I now know one has to strip it manually. BTW the phoneme dictionary is here
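
A minimal sketch of how this can happen, under the assumption that the dictionary was loaded from a text file with Windows (CRLF) line endings and split on "\n" only. The two-line file content, the two-space separator, and the helpers `parseDictionary` and `appendPhoneme` are all hypothetical and only stand in for the real loading code:

```javascript
// Hypothetical dictionary file with CRLF line endings (assumption, not the real data).
const fileText = "bear  b eh1 r\r\nbears  b eh1 r z\r\n";

// Parse "word  phonemes" lines; the line separator is passed in so both cases can be compared.
function parseDictionary(text, lineSeparator) {
  const dict = {};
  for (const line of text.split(lineSeparator)) {
    if (line === "") continue;            // skip the empty piece after the last newline
    const sep = line.indexOf("  ");       // assumed two-space separator
    dict[line.slice(0, sep)] = line.slice(sep + 2);
  }
  return dict;
}

// Same idea as the splice in findAddedPhonemes, but only for the final position.
function appendPhoneme(sequence, p) {
  const list = sequence.split(" ");
  list.splice(list.length, 0, p);
  return list.join(" ");
}

const naive = parseDictionary(fileText, "\n");     // leaves "\r" at the end of every entry
const clean = parseDictionary(fileText, /\r?\n/);  // accepts both LF and CRLF

console.log(JSON.stringify(appendPhoneme(naive.bear, "z"))); // "b eh1 r\r z"
console.log(JSON.stringify(naive.bears));                     // "b eh1 r z\r"
console.log(appendPhoneme(naive.bear, "z") === naive.bears);  // false
console.log(appendPhoneme(clean.bear, "z") === clean.bears);  // true
```

In this sketch both mismatching strings have length 10, and inserting a phoneme anywhere before the last position keeps the '\r' at the very end, so only end-of-word insertions fail to match. Calling `.trim()` on each entry after loading would be another way to drop the stray character.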

Thanks @VLAZ !

        stringified.split("").map(c => {
          console.log(c.charCodeAt(0))
        })
        console.log("New word")
        bearstring.split("").map(c => {
          console.log(c.charCodeAt(0))
        })
        console.log(stringified==bearstring)

console output

gcr
  • Are the spaces the same kind of spaces? Are the characters the same characters? – VLAZ Feb 11 '21 at 19:38
  • Can you edit this snippet to be a [Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) (with proper inputs)? – esqew Feb 11 '21 at 19:40
  • try checking types too (are they both strings?). `console.log(typeof dictionary['bears'])` – Vitalii Feb 11 '21 at 19:41
  • @VLAZ Same kind of spaces? That sounds interesting. I am not sure how to check – gcr Feb 11 '21 at 19:42
  • Just do `str.split("").map(c => c.charCodeAt(0))` on the two values and compare the character codes. You can have, say, a normal space and a non-breakable space or alternatively a Latin `e` and a Cyrillic `e` or otherwise characters that look similar but are different. – VLAZ Feb 11 '21 at 19:45
  • @gcr - a couple of things to consider cleaning up or explaining: 1) Where does `phonemeList` come from (forEach() loop), nm on the `I` var. – Randy Casburn Feb 11 '21 at 19:48
  • @Vitalii They both come out as strings. What's interesting is the 'findAddedPhonemes' method works when the phonemes are added on non final syllables. For instance, trying bear it finds ablair, blaire and blare but not bears. Basically it hashes wrong and hence doesn't find it in the hash table, but it does exist there, I have confirmed – gcr Feb 11 '21 at 19:49
  • @VLAZ Better to do `[...str].map(c => c.codePointAt(0))` if you want Unicode compat. https://stackoverflow.com/q/4547609/215552 – Heretic Monkey Feb 11 '21 at 19:49
  • @VLAZ I think you solved it. I added an edit update with information. Basically there's an extra character (13) inside "stringified". Now comes the fun part, figuring out why that happened. – gcr Feb 11 '21 at 19:58
  • Looks like one of the strings has character code 13 (`'\r'`) in it (based on screenshots provided). It's a new line code in some systems https://en.wikipedia.org/wiki/Newline – Vitalii Feb 11 '21 at 20:07
  • @Vitalii yep that's it. It's a carriage return. I made an edit explaining what happened. When the phoneme p was 'spliced' onto the end of the word as opposed to inside it, it didn't splice away the '\r' (which I assume JS automatically appends, or maybe it came from the original text file). – gcr Feb 11 '21 at 20:09
  • @gcr Please use the edit you just added to your question as an Answer (in the section below) to make it easier for future visitors to this question to find the solution you eventually used (per generally-accepted Stack Overflow etiquette). – esqew Feb 11 '21 at 20:12
  • To echo what @gcr said: rather than adding `[Solved]` to the question's title, please remove the solution from your question and add it as an answer. After 24 hours you can accept it as an answer. – phuzi Feb 11 '21 at 20:25
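
The character-code comparison suggested in the comments can be wrapped in a small helper that reports only the first mismatch instead of logging every code. This is a sketch, not code from the thread: `firstDifference` is a hypothetical name, and the sample strings are made up for illustration. It uses the `[...str]` / `codePointAt` form so that characters outside the Basic Multilingual Plane are compared as whole code points.

```javascript
// Returns the first position (counted in code points) where two strings differ,
// together with the code points found there, or null if the strings are identical.
function firstDifference(a, b) {
  const pa = [...a].map(c => c.codePointAt(0));
  const pb = [...b].map(c => c.codePointAt(0));
  const n = Math.max(pa.length, pb.length);
  for (let i = 0; i < n; i++) {
    if (pa[i] !== pb[i]) {
      // A side that has run out of characters reports undefined here.
      return { index: i, left: pa[i], right: pb[i] };
    }
  }
  return null;
}

// Hypothetical inputs: a space (32) on the left, a carriage return (13) on the right.
console.log(firstDifference("b eh1 r z", "b eh1 r\r z"));
// { index: 7, left: 32, right: 13 }

console.log(firstDifference("b eh1 r z", "b eh1 r z")); // null
```

The same helper would also catch the look-alike cases mentioned above, such as a Latin `e` (101) against a Cyrillic `е` (1077).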
