I am using RStudio with R 3.3.1 on Windows 7, and I have installed the CITAN package. I am trying to import bibliography entries from a CSV file that I exported from Scopus (untouched, exactly as exported), having chosen to export all available information.

This is the error that I get:

example <- Scopus_ReadCSV("scopus.csv")

Error in Scopus_ReadCSV("scopus.csv") : Column not found: `Source'.

In addition: Warning messages:

1: In read.table(file = file, header = header, sep = sep, quote = quote, : invalid input found on input connection 'scopus.csv'

2: In read.table(file = file, header = header, sep = sep, quote = quote, : incomplete final line found by readTableHeader on 'scopus.csv'

Column `Source' is there when I open the file, so I do not know why it says 'not found'.

GeorgiosA
  • Is this useful? http://stackoverflow.com/questions/5990654/incomplete-final-line-warning-when-trying-to-read-a-csv-file-into-r – J_F Sep 21 '16 at 11:39
  • That is kind of helpful, but still it is not working. I opened the csv with notepad++ and added an empty line in the end. Nothing changed, I get the exact same errors – GeorgiosA Sep 21 '16 at 14:58

2 Answers

Eventually I came to the following conclusions:

  1. The CSV file exported from Scopus was encoded as UTF-8 with a BOM (UTF-8-BOM), which R does not seem to recognize when using Scopus_ReadCSV("file.csv") or read.table("file.csv", header = TRUE, sep = ",", fileEncoding = "UTF-8").

  2. Even though the Scopus file uses a declared encoding, it can still contain some "strange" non-English characters that the read functions in R cannot handle. (I mainly found this problem in names with special characters.)
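
A quick way to check the first point is to look at the first three bytes of the file, since a UTF-8 BOM is the byte sequence EF BB BF. This is only a minimal sketch: has_utf8_bom is a hypothetical helper name, and the temporary demo file stands in for the real Scopus export.

```r
# Hypothetical helper: TRUE if the file starts with the UTF-8 byte order mark (EF BB BF).
has_utf8_bom <- function(path) {
  first_bytes <- readBin(path, what = "raw", n = 3L)
  identical(first_bytes, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Demo files (assumptions, not the real export): one with a BOM, one without.
with_bom <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("Authors,Source\nA,B\n")), with_bom)

without_bom <- tempfile(fileext = ".csv")
writeBin(charToRaw("Authors,Source\nA,B\n"), without_bom)

bom_found <- has_utf8_bom(with_bom)        # file written with a BOM
bom_missing <- has_utf8_bom(without_bom)   # file written without a BOM
```

For a real file you would call has_utf8_bom("scopus.csv") instead of using the demo files.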

Solutions for those issues:

  1. Open the CSV file in a text editor such as Notepad++ and re-save it as UTF-8 (that is, without the BOM), so that R can read it as UTF-8.

  2. When you run the read function in R, you will notice that it stops reading partway through (e.g. at the 40th of 200 records). Note exactly where it stopped, open the CSV in the text editor, find the special character at that position, and delete or replace it so that R no longer fails there.
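
Both steps can also be tried from within R, as a rough sketch rather than a definitive fix. Base R accepts fileEncoding = "UTF-8-BOM" when reading, which drops the BOM, and validUTF8() can point to the lines that contain invalid bytes. The demo file below is made up for illustration, and whether Scopus_ReadCSV passes fileEncoding through to read.table is an assumption that is not checked here, so the sketch uses read.csv directly.

```r
# Demo file standing in for a Scopus export (an assumption, not real data):
# a BOM, a header, one clean record, and one record containing the byte 0xFC
# (Latin-1 "u with umlaut"), which is not valid UTF-8.
demo <- tempfile(fileext = ".csv")
writeBin(
  c(as.raw(c(0xEF, 0xBB, 0xBF)),
    charToRaw("Authors,Source\n"),
    charToRaw("Smith,Scopus\n"),
    charToRaw("M"), as.raw(0xFC), charToRaw("ller,Scopus\n")),
  demo
)

# Step 2 of the answer: locate the lines that R cannot decode as UTF-8.
raw_lines <- readLines(demo, warn = FALSE)
is_valid <- validUTF8(raw_lines)
bad_lines <- which(!is_valid)   # line numbers to inspect in a text editor

# For the demo only, keep the valid lines; in practice you would fix the
# reported lines by hand instead of dropping records.
cleaned <- tempfile(fileext = ".csv")
writeLines(raw_lines[is_valid], cleaned, useBytes = TRUE)

# Step 1 of the answer: let R strip the BOM while reading.
clean <- read.csv(cleaned, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
```

Here bad_lines holds the line numbers of the records to look at in Notepad++, and names(clean) should show the header without any BOM debris in front of the first column name.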

GeorgiosA
  • Changing the encoding in Notepad++ does not work for me (I am using the usual `read.csv` though). Many scopus files seem to be corrupted to me. – anpami Feb 17 '21 at 11:02

Another solution that worked for me:

Open the file in Google Sheets, then download it from there again as a *.csv-file. R opens it correctly afterwards.

anpami