I am working with the names of some music artists through the Spotify API, and some of the strings cause problems because they contain accented characters. I don't know much about character encoding.

I'll provide more context a bit further below, but essentially I am wondering if there is a way in R to "simplify" characters with ornaments.

Specifically, I would like a function that takes c("ë", "ö") as input and returns c("e", "o"), removing the ornaments from the characters.


I don't think I can create a reproducible example because of the issues with API authentication, but for some context, when I try to run:

library(httr)

# `token` is an OAuth token obtained beforehand (not shown here).
artistName <- "Tiësto"
GET(paste0("https://api.spotify.com/v1/search?q=",
           artistName,
           "&type=artist"),
    config(token = token))

The following gets sent to the API:

https://api.spotify.com/v1/search?q=Tiësto&type=artist

This returns a 400 Bad Request error. I am trying to alter the strings I pass to the GET function so that I get some useful output.
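For context, the URL above contains a raw non-ASCII character. The following is only a sketch, assuming (not confirmed here) that the 400 comes from the unencoded `ë` rather than from the accent itself. It percent-encodes the name with base R's `URLencode`, which keeps the accent instead of removing it. The variable names `artist_name`, `encoded_name` and `search_url` are made up for illustration.

```r
# "\u00eb" is "ë", written as an escape so the script does not depend
# on the encoding of the source file.
artist_name <- "Ti\u00ebsto"

# Percent-encode the UTF-8 bytes of every character that is not URL-safe.
encoded_name <- URLencode(enc2utf8(artist_name), reserved = TRUE)

# Build the same search URL as above, now ASCII-only.
search_url <- paste0("https://api.spotify.com/v1/search?q=",
                     encoded_name,
                     "&type=artist")
print(search_url)
# [1] "https://api.spotify.com/v1/search?q=Ti%C3%ABsto&type=artist"
```

The resulting string could then be passed to `GET` in place of the pasted URL.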

Edit: I am not looking for a gsub-type solution, as that relies on me anticipating the sorts of accented characters which might appear in my data. I'm interested in whether there is a function already out there which does this sort of translation between different character encodings.

Ross

2 Answers

Here is what I found; it may work for you. It is simple and convenient to apply to any character data.

> artistName <- "Tiësto"
> iconv(artistName, "latin1", "ASCII//TRANSLIT")
[1] "Tiesto"
Sagar
  • This solution is nice and convenient if it works on the OP's system, but it is platform-dependent, so it might not work for everybody (on my Mac, it results in `"Ti\"esto"`). – Oriol Mirosa Aug 15 '17 at 13:53
  • @OriolMirosa - Didn't know about that. Thanks for your comment. – Sagar Aug 15 '17 at 14:14
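
To illustrate the platform dependence discussed in these comments, here is a minimal sketch of what `iconv` is working with. The same visible string can be stored as different bytes, and the `from` argument has to match the bytes R actually holds. The variable names are made up for illustration, and the transliterated result is deliberately shown without an expected output because it depends on the system's iconv implementation.

```r
# "Tiësto", with the accent written as an escape so R marks it as UTF-8
# regardless of the source file's encoding.
x <- "Ti\u00ebsto"

# The same text as UTF-8 bytes and as latin1 bytes.
utf8_bytes   <- charToRaw(x)
latin1_bytes <- charToRaw(iconv(x, from = "UTF-8", to = "latin1"))

print(utf8_bytes)    # 54 69 c3 ab 73 74 6f  (ë takes two bytes)
print(latin1_bytes)  # 54 69 eb 73 74 6f     (ë takes one byte)

# Declaring the encoding the string really has, instead of assuming latin1.
# The transliteration itself is still implementation-specific.
ascii_guess <- iconv(enc2utf8(x), from = "UTF-8", to = "ASCII//TRANSLIT")
print(ascii_guess)
```

Passing `from = "latin1"` only makes sense when the string really is stored as latin1 bytes, which is one reason the one-liner may behave differently from one system to another.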

Based on the answers to this question, you could do this:

artistName <- "Tiësto"

removeOrnaments <- function(string) {
  chartr(
    "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ",
    "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy",
    string
  )
}

removeOrnaments(artistName)

# [1] "Tiesto"
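
Because `chartr` is vectorised, the same approach also handles the `c("ë", "ö")` case from the question. The sketch below uses a shortened, purely illustrative table (`old_chars` and `new_chars` are made-up names, and the accents are written as `\u` escapes so the script does not depend on file encoding). It also shows the main limitation: a character missing from the table, such as `ő`, passes through unchanged, so it may be worth flagging what is left.

```r
# Shortened table for illustration only; the full table above covers more.
old_chars <- "\u00e4\u00eb\u00ef\u00f6\u00fc"  # ä ë ï ö ü
new_chars <- "aeiou"

remove_ornaments <- function(string) {
  chartr(old_chars, new_chars, string)
}

# ë, ö, Tiësto, and ő (which is not in the table).
inputs  <- c("\u00eb", "\u00f6", "Ti\u00ebsto", "\u0151")
cleaned <- remove_ornaments(inputs)
print(cleaned[1:3])
# [1] "e"      "o"      "Tiesto"

# Flag strings that still contain a code point outside ASCII,
# i.e. characters the table did not cover.
still_accented <- vapply(
  cleaned,
  function(s) any(utf8ToInt(s) > 127L),
  logical(1),
  USE.NAMES = FALSE
)
print(still_accented)
# [1] FALSE FALSE FALSE  TRUE
```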
Oriol Mirosa