
I have imported a movie dataset in CSV format. A few of the columns contain special symbols mixed in with the data I need (an example is given below, along with an image of the movie dataset). Do I have to remove those special characters individually, or is there a shortcut to remove them while importing the file into R? Thanks

Movie.csv Image

GENRE [{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]

Spoken Languages [{"iso_639_1": "en", "name": "English"}, {"iso_639_1": "es", "name": "Espa\u00f1ol"}]

  • You may need to convert using `fromJSON` from `library(jsonlite)` – akrun May 18 '20 at 03:40
  • I am facing this error while using `fromJSON`: Error: Argument 'txt' must be a JSON string, URL or file. – Taimur Khan May 18 '20 at 03:47
  • It is better to show a reproducible example with `dput` and an expected output. You can check [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more info to create an example so that others can reproduce or test your data – akrun May 18 '20 at 03:49
  • I'm unclear exactly what your CSV looks like or how you are importing it or what the exact desired result is. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick May 18 '20 at 04:03
  • I have tried using `dput()`, but the output is too large to paste here. I have added an image of the dataset to the original question – Taimur Khan May 18 '20 at 04:37
  • Neither of your textual inputs cause an error with `jsonlite::fromJSON`. They are both embedded frames (in R-speak), so you need to be able to deal with a list of frames. If you're using the tidyverse, it deals with nested frames quite well (as does `data.table`, but I don't assume you're using that). – r2evans May 18 '20 at 04:42
  • So you are saying that you have managed to import the data into R without those special characters? Can you please write the commands you used? Thanks @r2evans – Taimur Khan May 18 '20 at 04:48
  • `jsonlite::fromJSON('[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]')` and `jsonlite::fromJSON(' [{"iso_639_1": "en", "name": "English"}, {"iso_639_1": "es", "name": "Espa\u00f1ol"}]')` – r2evans May 18 '20 at 04:54
  • You can get a list of such frames with something like `lapply(movies$genres, jsonlite::fromJSON)`, assuming no errors. If *some* have errors, you'll need to use `try` or `tryCatch` to get at least some of them. For example `lapply(movies$genres, function(json) tryCatch(jsonlite::fromJSON(json), error=function(e) NA_character))` (untested). – r2evans May 18 '20 at 05:07
  • `fromJSON` runs smoothly when individual rows are entered (as in your second-to-last comment), but it does not work when the entire column is provided through `lapply`; it produces the error "Error: Argument 'txt' must be a JSON string, URL or file." When used with `tryCatch`, the following error is shown: "Error in value[[3L]](cond) : object 'NA_character' not found" – Taimur Khan May 18 '20 at 20:13
  • Thanks r2evans. The error was occurring because the data type of that column was integer, and I needed to convert it from integer to character. Now your `lapply` command is working smoothly – Taimur Khan May 20 '20 at 21:24
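
Putting the comments above together, here is a minimal sketch of what seems to have worked. The `movies` data frame, its `title` column, and the helper `parse_json_safely` are hypothetical stand-ins, since the real CSV was only shown as an image. The sketch assumes the column arrived as a factor (integer-backed), which would explain the "Argument 'txt' must be a JSON string" error. It also uses `NA_character_` with the trailing underscore, which is how R spells that constant; this would account for the "object 'NA_character' not found" error from the untested snippet.

```r
library(jsonlite)

# Hypothetical stand-in for the imported CSV (the real file is not shown).
movies <- data.frame(
  title = c("A", "B", "C"),
  genres = c(
    '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]',
    '[{"id": 14, "name": "Fantasy"}]',
    'not valid json'
  ),
  stringsAsFactors = TRUE  # assumption: the column did not arrive as character
)

# Parse one JSON string; fall back to NA if it cannot be parsed.
parse_json_safely <- function(json) {
  tryCatch(
    jsonlite::fromJSON(json),
    error = function(e) NA_character_
  )
}

# Convert to character first, then parse each row into its own data frame.
genre_frames <- lapply(as.character(movies$genres), parse_json_safely)

# Optionally collapse each frame's `name` column into a single string per movie.
genre_names <- vapply(
  genre_frames,
  function(frame) {
    if (is.data.frame(frame)) paste(frame$name, collapse = ", ") else NA_character_
  },
  character(1)
)

print(genre_names)
```

If the column really is a factor, passing `stringsAsFactors = FALSE` to `read.csv` at import time might avoid the `as.character` step altogether, though that depends on how the file was read.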

0 Answers