
I am using the R package RLastFM. It works very well, but some characters in the returned data display as "black diamonds with white question marks". I'll try to post an example below and see whether it displays:

e.g. "Like A Prayer�"

I suspect this is an encoding issue, but how to resolve it is beyond my knowledge. Any ideas?

Code:

library(RLastFM)

lastkey <- "1234567890"  # my last.fm key (placeholder)
track1 <- data.frame(track.search("Like a Prayer", "Madonna", key = lastkey, parse = TRUE))

Output:

    track                          artist           listeners
1   Like a Prayer                  Madonna          492340
2   Like A Prayer [Live]           Madonna          3507
3   Like a Prayer 2008             Madonna          2624
4   Just like a prayer             Madonna          2408
5   Like a Prayer (Churchapella)   Madonna          2090
6   Madonna - Like A Prayer        Madonna          2462
7   Like A Prayer 2008 - Live      Madonna          2468
8   Like a Prayer (live)           Madonna          1663
9   Like A Prayer (Album Version)  Madonna          2314
10  like_a_prayer                  Madonna          157
11  Like A Prayer (Dance Remix)    Madonna          1629
12  Like  A Prayer                 Madonna          46
13  Like A Prayer.                 Madonna          46
14  Like A  Prayer                 Madonna          30
15  "Like a Prayer"                Madonna          28
16  like-a-prayer                  Madonna          15
17  Like A Prayer�                 �Madonna         14
18  Like a Prayer                  Madonna<U+3339>  12
...

Session Info:

R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RLastFM_0.1-5  RCurl_1.95-4.3 bitops_1.0-6   XML_3.98-1.1  

loaded via a namespace (and not attached):
[1] tools_3.1.1

Cheers B

RUser
  • It would help if you indicated exactly what code you were running that creates this problem. Provide some sample input and give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). With encoding problems, it's important to know your OS, R version, and default locale (pretty much everything in `sessionInfo()`). – MrFlick Sep 10 '14 at 02:42
  • I provided the info as requested. – RUser Sep 10 '14 at 07:32
  • Without a key, we can't see the actual bytes that are being returned from the server. It appears that some level of cleaning was done on the server end to replace certain Unicode characters with the canonical [replacement character](http://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%F6&mode=char), which indicates an unprintable character. It would be useful to see the raw bytes, e.g. `charToRaw(as.character(track1$track)[17])`, to check whether it has the `ef bf bd` sequence or something else. I was also assuming that `track` was a factor, but `Encoding(track1$track)` would be good to see as well, since you are on Windows. – MrFlick Sep 11 '14 at 00:24
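
The byte-level check suggested in the last comment can be tried without an API key. The following is a minimal sketch using base R only; the string `x` is a hypothetical stand-in for one value of `track1$track`, and the helper `has_replacement_char` is an illustrative name, not part of RLastFM. The actual bytes returned by the server may differ.

```r
# Hypothetical stand-in for a single track name; the real server bytes may differ.
x <- "Like A Prayer\ufffd"

# Declared encoding of the string ("unknown", "UTF-8", "latin1" or "bytes").
Encoding(x)

# Raw bytes of the string; U+FFFD is encoded in UTF-8 as ef bf bd.
bytes <- charToRaw(enc2utf8(x))
tail(bytes, 3)

# Illustrative helper (an assumption, not an RLastFM function):
# TRUE for elements that contain the Unicode replacement character U+FFFD.
has_replacement_char <- function(s) {
  grepl("\ufffd", enc2utf8(as.character(s)), fixed = TRUE)
}

has_replacement_char(c(x, "Like a Prayer"))
```

If the bytes really are `ef bf bd`, the original character was probably already replaced before the data reached R, so re-encoding on the client side is unlikely to recover it. If instead the bytes look like valid UTF-8 but `Encoding()` reports `"unknown"`, declaring the encoding with `Encoding(x) <- "UTF-8"` may be worth trying.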
