I've pulled a table from Wikipedia, but the population numbers I'm looking for come with a bunch of junk attached. For instance, I get "!B9840748934017Â 8,244,910" when the number I'm after is just 8,244,910. I've cleaned up the character vector with a regex, using sub('![[:alnum:]]*[[:space:]]', '', x).
This works fine, leaving me with the character vector "8,244,910". When I try to convert it to numeric using as.numeric, however, it gets coerced to NA, and I'm unable to get an integer, no matter what conversions I try. Any thoughts?
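
For reference, here is a minimal reproduction of the conversion step. It starts from the already-cleaned string rather than the raw scraped value, and the variable names (cleaned, as_num, as_int) are just placeholders for illustration, not the names in my actual script.

```r
# The string left over after the sub() cleanup described above
cleaned <- "8,244,910"

# Both conversions return NA; R also emits a coercion warning,
# which is suppressed here only to keep the output short
as_num <- suppressWarnings(as.numeric(cleaned))
as_int <- suppressWarnings(as.integer(cleaned))

is.na(as_num)
is.na(as_int)
```

Note that the thousands separators are still part of the string at this point.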