In the book A First Course in Statistical Programming with R by W. John Braun and Duncan J. Murdoch, page 31 (Data frames and the read.table function), the authors explain:

Data sets frequently consist of more than one column of data, where each column represents measurements of a single variable. Each row usually represents a single observation. This format is referred to as case-by-variable format.

For example, the following data set consists of four observations on the three variables x, y, and z :

x y z
61 13 4
175 21 18
111 24 14
124 23 18

If such a data set is stored in a file called pretend.dat in the directory myfiles on the C: drive (this is in Windows, but I use a Mac), then it can be read into an R data frame. This can be accomplished by typing pretend.df <- read.table("c:/myfiles/pretend.dat", header = T).

In a data frame, the columns are named. To see the x column, type pretend.df$x

Problem (book) : Display the row 1, column 3 element of pretend.df.

So far, I have created this file on my MacBook Pro with Excel (.xlsx or .csv??). I then ran pretend.df <- read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) and got these warnings:

Warning messages:
1: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
  line 1 appears to contain embedded nulls
2: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
  line 3 appears to contain embedded nulls
3: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
  line 4 appears to contain embedded nulls
4: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
  line 5 appears to contain embedded nulls
5: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  embedded nul(s) found in input

If I try the command pretend.df <- read.table("/Users/jg24/Documents/R/Classeur1.csv", header = T) instead, I get:

Warning message:
In read.table("/Users/jg24/Documents/R/Classeur1.csv", header = T) :
  incomplete final line found by readTableHeader on '/Users/jg24/Documents/R/Classeur1.csv'

Question: Could anyone tell me what's wrong with my last command, and how I could modify it?

P.S. I'm a new user of RStudio. Could anyone show me how to solve this problem with that software?

  • maybe this can help? [link](http://stackoverflow.com/questions/25872946/read-csv-throws-error) – simone Oct 09 '16 at 16:40
  • @simone The exercise wants me to use the read.table() function. I looked at the link, but I don't understand it, and it still doesn't work. Could you explain a full solution? – SpinningAtInfinity Oct 09 '16 at 22:07
  • did you save the file as a csv? you can do that with excel – simone Oct 10 '16 at 08:48
  • Yes, but if I use the command `pretend.df1 <- read.table(file.choose(), header = T, sep = "\t")` on the file `pretend.dat.csv` or `Classeur1.csv`, I get `Warning message: In read.table(file.choose(), header = T, sep = "\t") : incomplete final line found by readTableHeader on '/Users/jg24/Documents/R/pretend.dat.csv'`. I just don't know what is happening. – SpinningAtInfinity Oct 10 '16 at 10:50
  • I see. Have you tried this [link](http://stackoverflow.com/questions/5990654/incomplete-final-line-warning-when-trying-to-read-a-csv-file-into-r)? (answer by @NicolasStifani) – simone Oct 10 '16 at 12:32
  • The answer by @NicolasStifani concerns Windows, but I have a MacBook Pro. – SpinningAtInfinity Oct 11 '16 at 00:07

1 Answer

I don't have enough reputation to comment, so I'll say it here instead: I'd simplify everything. So, save it as a csv file if you can and use the readr package. From there, you can call read_csv.
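A minimal sketch of that route, assuming the readr package is installed and that the spreadsheet was exported with commas as separators. The temporary file is only a stand-in for the exported CSV, not the real path, and the fallback to base R's read.csv() is an assumption added for machines without readr.

```r
# Stand-in for the exported CSV: the book's four observations on x, y and z.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("x,y,z", "61,13,4", "175,21,18", "111,24,14", "124,23,18"), csv_path)

if (requireNamespace("readr", quietly = TRUE)) {
  # read_csv() reports the guessed column types; suppress that message here.
  pretend.df <- suppressMessages(readr::read_csv(csv_path))
} else {
  # Base R fallback (an assumption, not part of the suggestion above).
  pretend.df <- read.csv(csv_path, header = TRUE)
}

# Row 1 of the third column (z); this form works for data frames and tibbles.
pretend.df[[3]][1]
```

If Excel wrote semicolons instead of commas, which some regional settings do, the separator would need to change accordingly.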

Perhaps even better, you could download the preview release of RStudio and, in the Environment tab, click Import Dataset before following the remaining instructions. Without knowing more, it is hard to tell if that will work, but I suspect that it should.
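If the exercise has to be done with read.table(), the sketch below may help. It rests on a few assumptions: the file is a plain-text CSV export (an .xlsx file is a compressed binary format, which is most likely why read.table() reports embedded nulls), the separator is a comma, and the temporary file stands in for the real path. By default read.table() splits fields on whitespace, so a comma-separated file needs an explicit sep. The "incomplete final line" warning generally means only that the last line of the file has no trailing newline.

```r
# Stand-in for a comma-separated export of the book's data (hypothetical path).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("x,y,z", "61,13,4", "175,21,18", "111,24,14", "124,23,18"), csv_path)

# With the default sep = "" (whitespace), each line is read as a single field.
one_col <- read.table(csv_path, header = TRUE)
ncol(one_col)

# Naming the separator splits the three columns as intended.
pretend.df <- read.table(csv_path, header = TRUE, sep = ",")
pretend.df$x      # the x column
pretend.df[1, 3]  # the row 1, column 3 element
```

With a semicolon-separated export, sep = ";" would take the place of sep = ",".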

p0bs
  • I had already downloaded RStudio. When I clicked Import Dataset, the table was displayed in RStudio, but it didn't change anything with the file `Classeur.1.xlsx` or `Classeur.csv`. Could you tell me what I should do differently when using the function `read.table()`? – SpinningAtInfinity Oct 09 '16 at 21:23