Here is one column of my data frame, df$City:
(I have other columns, but I'm just showing one column for simplicity.)
City
Seattle
San Diego
Bern
SEATTLE
SEATTLE
BERN
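For anyone who wants to reproduce this, the column above can be rebuilt as a small data frame. This is only a sketch: the values are copied from the listing, and the real data frame has other columns that are not shown.

```r
# Minimal reproduction of the City column shown above.
# The values are copied from the question; other columns are omitted.
df <- data.frame(
  City = c("Seattle", "San Diego", "Bern", "SEATTLE", "SEATTLE", "BERN"),
  stringsAsFactors = FALSE
)

# A plain table() is case sensitive, so "Seattle" and "SEATTLE" are counted separately.
counts <- table(df$City)
print(counts)
```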
I want to do a frequency count on the cities. I want both "Seattle" and "SEATTLE" to be considered the same - basically, I want the frequency table calculation to be case insensitive.
If I use table(df$City), it gives me "Seattle" and "SEATTLE" as two different items. I tried to work around this by calling toupper(df$City) before table(), but I get the error: invalid multibyte string.
I checked the encoding of my file and it seems to be UTF-8, but I could be wrong. Is there a way for me to check the encoding?
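One way to check from inside R is to test the strings after they are loaded rather than the file on disk. The sketch below uses base R's validUTF8() and Encoding(). The vector x is a made-up example, not the real data: its second element deliberately contains the Latin-1 byte 0xFC, which is not valid UTF-8.

```r
# Hypothetical example vector: the second element holds the raw Latin-1 byte 0xFC
# ("Z", 0xFC, "rich"), which is not a valid UTF-8 sequence.
x <- c("Seattle", rawToChar(as.raw(c(0x5a, 0xfc, 0x72, 0x69, 0x63, 0x68))))

# TRUE for elements that are valid UTF-8, FALSE otherwise.
ok <- validUTF8(x)
print(ok)

# Positions of the offending elements, handy for inspecting the bad rows.
print(which(!ok))

# Encoding() reports the encoding mark R attached to each string
# ("unknown", "UTF-8", "latin1" or "bytes"); it is a declaration, not a validity check.
print(Encoding(x))
```

Running validUTF8(df$City) in the same way should show which rows, if any, trigger the invalid multibyte string error.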
Does anyone know how I can get a frequency table that is case insensitive? It doesn't have to be using my approach.
Thanks in advance!!
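A possible approach, sketched on the sample values from the question: normalize the case with toupper() on the column itself, then tabulate. The iconv() step is an assumption: it treats any invalid strings as Latin-1, which may not match the actual file, so the source encoding would need to be adjusted if it is something else.

```r
# Sample values copied from the question; the real data frame has more columns.
df <- data.frame(
  City = c("Seattle", "San Diego", "Bern", "SEATTLE", "SEATTLE", "BERN"),
  stringsAsFactors = FALSE
)

city <- df$City

# Re-encode only the strings that are not valid UTF-8.
# ASSUMPTION: the bad strings are Latin-1; change `from` if the file uses another encoding.
bad <- !validUTF8(city)
city[bad] <- iconv(city[bad], from = "latin1", to = "UTF-8")

# Fold everything to upper case so the count ignores case, then tabulate.
freq <- table(toupper(city))
print(freq)
```

tolower() would work the same way if lower-case labels are preferred in the result.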