I'm using the package stringdist to compare some vectors of strings but I keep getting a different answer than what I think I should when I try to test out the package.

I want to do this:

stringsim('PANDIAN', 'PANIAN', method="lv")
[1] 0.8571429

but applied to two columns of a data frame:

stringsim(testdf.lv$Last[1], testdf.lv$matchedname[1], method="lv")

But I get this error:

Error in UseMethod("lengths") : 
  no applicable method for 'lengths' applied to an object of class "factor"

I need to be able to do this because ideally, I would replace the row numbers with an i and run it in a loop. Is this even possible? I tried looking for similar errors but the other questions were not very helpful.

Steven Beaupré
grad_student
  • You really should provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) that defines `testdf.lv`; otherwise we have no idea what's in it. However, it sounds like you have factors rather than character variables. Try `stringsim(as.character(testdf.lv$Last[1]), as.character(testdf.lv$matchedname[1]), method="lv")` – MrFlick Oct 13 '15 at 21:36
  • @MrFlick Sorry, you are right. I should have included the full code. And YES! It was because the testdf.lv column was a factor instead of a character vector. Thank you! – grad_student Oct 13 '15 at 21:52

1 Answer

So thanks to @MrFlick. It turns out the data I was using in the column:

testdf.lv$Last

was mistakenly stored as a factor instead of a character vector. Converting that column to character with the following:

testdf.lv$Last <- as.character(testdf.lv$Last)

Fixed the error and I was able to rewrite the code into a for loop to go through the entire dataframe.
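
For anyone who wants to try this end to end, here is a minimal sketch of the fix. The data frame below is a made-up stand-in for `testdf.lv` (only the first pair of names comes from the question), and converting `matchedname` as well is an assumption on my part, following MrFlick's comment. Since `stringsim()` accepts vectors, a single call should cover all rows; the explicit loop is shown only for comparison.

```r
library(stringdist)

# Hypothetical stand-in for testdf.lv; values other than the first row are invented.
testdf.lv <- data.frame(
  Last        = c("PANDIAN", "SMITH", "JOHNSON"),
  matchedname = c("PANIAN", "SMYTH", "JOHNSON"),
  stringsAsFactors = TRUE  # forces factor columns, as in the question
)

# Convert both columns from factor to character before comparing.
testdf.lv$Last        <- as.character(testdf.lv$Last)
testdf.lv$matchedname <- as.character(testdf.lv$matchedname)

# Vectorised call: compares the two columns row by row.
testdf.lv$sim <- stringsim(testdf.lv$Last, testdf.lv$matchedname, method = "lv")

# Equivalent explicit loop over row indices.
sim_loop <- numeric(nrow(testdf.lv))
for (i in seq_len(nrow(testdf.lv))) {
  sim_loop[i] <- stringsim(testdf.lv$Last[i], testdf.lv$matchedname[i], method = "lv")
}
```

If the data comes from `read.csv()` or `data.frame()`, passing `stringsAsFactors = FALSE` at creation time is another way to avoid the factor columns in the first place.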

grad_student