I have searched a lot on Google, but in vain. I am new to RStudio and am trying to use Rattle for data mining. I have a file that contains German special characters (ä, ö, ü and ß). I am using Rattle 3.3.0 with RStudio 0.98.1049. When I load the file in RStudio, it loads perfectly.

However, when I try to open the CSV file in Rattle, I get this error:

An error occured in the following command:

crs$dataset <- read.csv("file:///C:/Users/data_model.csv", dec=",",
na.strings=c(".", "NA", "", "?"), strip.white=TRUE, encoding="UTF-8").


The error message was:

Error in type.convert(data[[i]], as.is = as.is[i], dec = dec,
na.strings = character(0L)) :    invalid input 'FT-Stütze' in
'utf8towcs'

I am running R version 3.1.0 (2014-04-10) -- "Spring Dance" on a 64-bit platform. Does anyone know how to fix this?

Thomas
Sherry
  • There is a lot of irrelevant information in your post. It will be easier for others to help you if you narrow down the problem. Edit the CSV into the smallest possible example that has the problem, and include a link in your question. – Jeroen Ooms Oct 01 '14 at 14:06
  • A user had the same error [in this question](http://stackoverflow.com/questions/9637278/r-tm-package-invalid-input-in-utf8towcs). Have a look. – Will Beason Oct 01 '14 at 14:29
  • @Jeroen I have removed all the irrelevant text. The real problem is that Rattle is not able to read the file with the special characters (ä, ö, ü and ß). Is there any package I can include in Rattle so that it can convert this text from UTF-8 to wcs, since UTF-8 can handle German letters as well? Or am I missing something? – Sherry Oct 02 '14 at 06:51
  • Does that identical `read.csv` line that causes the error in Rattle work in RStudio? You've not explained exactly *how* you've read it in RStudio. – Spacedman Oct 02 '14 at 09:52
  • @Spacedman. I just used the `read.csv(filename.csv)` command, and the data is stored as a data frame. When I then use the `summary()` command, it shows me the following details: – Sherry Oct 02 '14 at 09:55
  • Does that (almost) IDENTICAL line that causes the error in Rattle work in RStudio? It's possible that the file *isn't* UTF-8 encoded, and whatever default options `read.csv` is using in RStudio handle it, but Rattle is forcing `encoding="UTF-8"`. I say almost: just do `dataset <- read.csv(....` etc., not `crs$dataset <- `. – Spacedman Oct 02 '14 at 09:57
  • @Spacedman. I just used the `read.csv(filename.csv)` command, and the data is stored as a data frame. When I then use the `head()` command, it shows me the following details: **code:** `data <- read.csv("test.csv"); head(data)` **outcome:** _ Task.List Baseline.FInish 1 NeuerMarkt_Ausführungsterminplan-255 2014-08-14 2 NeuerMarkt_Ausführungsterminplan-153 2014-09-18 3 NeuerMarkt_Ausführungsterminplan-18 2014-12-17 4 NeuerMarkt_Ausführungsterminplan-140 2014-09-26 5 NeuerMarkt_Ausführungsterminplan-158 2014-04-28 6 NeuerMarkt_Ausführungsterminplan-68 2014-08-22_ – Sherry Oct 02 '14 at 10:02
  • @Spacedman, yup, all lines work fine in RStudio. – Sherry Oct 02 '14 at 10:04
  • Confirm that: `read.csv("file:///C:/Users/data_model.csv", dec=",", na.strings=c(".", "NA", "", "?"), strip.white=TRUE, encoding="UTF-8")` works in RStudio. – Spacedman Oct 02 '14 at 10:05
  • Yes, it is working fine in RStudio, as I mentioned in the above comments (see the **outcome**). – Sherry Oct 02 '14 at 10:07
  • Final suggestion is you make your data file, or at least one with a line that breaks your rattle, available. My rattle reads UTF-8 just fine. – Spacedman Oct 02 '14 at 10:10
  • I also tried reading a similar CSV file in Rattle, with all the numerical values removed and only the German text and the dates left, and I got this error: `An error occured in the following command: crs$dataset <- read.csv("file:///C:/Users/asyed/Desktop/Testing Bins/R code/test.csv", na.strings=c(".", "NA", "", "?"), strip.white=TRUE, encoding="UTF-8"). The error message was: Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, na.strings = character(0L)) : invalid input 'NeuerMarkt_Ausführungsterminplan-255' in 'utf8towcs'` – Sherry Oct 02 '14 at 10:11
  • **3rd try** I removed all the German special characters and then read a similar CSV file; it loads, and Rattle is able to analyze the data as usual. **findings** Rattle is not able to convert the special characters to wcs on my PC. **procedure** I replaced all the special characters manually (ü -> ue, ä -> ae, ö -> oe). Now the **key question** is: how can I make Rattle read special characters? – Sherry Oct 02 '14 at 10:25
  • Possible duplicate of [What is character encoding and why should I bother with it](http://stackoverflow.com/questions/10611455/what-is-character-encoding-and-why-should-i-bother-with-it) – Raedwald Jan 21 '16 at 13:15
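
One way to test the hypothesis raised in the comments (that the file is not actually UTF-8, so forcing `encoding="UTF-8"` produces invalid input for `utf8towcs`) is sketched below. This is only a sketch under assumptions: the helper name `is_valid_utf8` is hypothetical, the sample file is a stand-in built as Latin-1 bytes (the real file's encoding is not known from the thread), and `validUTF8()` is only available in newer R releases, so the helper falls back to an `iconv()` round trip on older ones.

```r
# Hypothetical helper: TRUE for each string whose bytes are valid UTF-8.
# Uses validUTF8() where available; otherwise relies on iconv() returning
# NA for input it cannot convert from UTF-8.
is_valid_utf8 <- function(x) {
  if (exists("validUTF8", mode = "function")) {
    validUTF8(x)
  } else {
    !is.na(iconv(x, from = "UTF-8", to = "UTF-8"))
  }
}

# Stand-in for the real CSV: "FT-Stütze" written with a Latin-1 "ü" (byte 0xFC).
# This encoding is an assumption made for illustration only.
tmp <- tempfile(fileext = ".csv")
bytes <- c(charToRaw("Task.List,Value\nFT-St"), as.raw(0xFC), charToRaw("tze,1\n"))
writeBin(bytes, tmp)

# Read the lines without re-encoding and flag those that are not valid UTF-8.
raw_lines <- readLines(tmp, warn = FALSE)
bad <- which(!is_valid_utf8(raw_lines))
print(bad)

# If invalid lines are found, declaring the file's real encoding lets R
# re-encode the text while reading (here "latin1", per the assumption above).
dataset <- read.csv(tmp, fileEncoding = "latin1", stringsAsFactors = FALSE)
```

If the check flags lines in the real file, re-saving the CSV as UTF-8 before opening it in Rattle, or reading it in R with the matching `fileEncoding`, may avoid the error; whether that resolves this particular case is not confirmed in the thread.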

0 Answers