In the book A First Course in Statistical Programming with R by W. John Braun and Duncan J. Murdoch, page 31 (Data frames and the read.table function), the authors explain:
Data sets frequently consist of more than one column of data, where each column represents measurements of a single variable. Each row usually represents a single observation. This format is referred to as case-by-variable format.
For example, the following data set consists of four observations on the three variables x, y, and z :
x y z
61 13 4
175 21 18
111 24 14
124 23 18
If such a data set is stored in a file called pretend.dat in the directory myfiles on the C: drive (this is in Windows, but I use a Mac), then it can be read into an R data frame. This can be accomplished by typing
pretend.df <- read.table("c:/myfiles/pretend.dat", header = T)
In a data frame, the columns are named. To see the x column, type
pretend.df$x
Problem (book): Display the row 1, column 3 element of pretend.df.
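For reference, here is a minimal sketch of the book's example that runs without any file on disk. It assumes the four rows above are passed to read.table through its text argument instead of a path. It shows the column access from the excerpt and one possible way to pick a single element by row and column.

```r
# Build the example data frame from inline text instead of a file on disk.
pretend.df <- read.table(text = "x y z
61 13 4
175 21 18
111 24 14
124 23 18", header = TRUE)

# Columns are named, so the x column can be extracted with $.
pretend.df$x

# Square brackets take [row, column]; this picks row 1, column 3.
pretend.df[1, 3]
```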
So far, I have created this file in Excel on my MacBook Pro (should it be saved as .xlsx or .csv?). From there, I ran
pretend.df <- read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T)
and got these warnings:
Warning messages:
1: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
line 1 appears to contain embedded nulls
2: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
line 3 appears to contain embedded nulls
3: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
line 4 appears to contain embedded nulls
4: In read.table("/Users/jg24/Documents/R/Classeur1.xlsx", header = T) :
line 5 appears to contain embedded nulls
5: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
embedded nul(s) found in input
If I try the command
pretend.df <- read.table("/Users/jg24/Documents/R/Classeur1.csv", header = T)
I get
Warning message:
In read.table("/Users/jg24/Documents/R/Classeur1.csv", header = T) :
incomplete final line found by readTableHeader on '/Users/jg24/Documents/R/Classeur1.csv'
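A minimal sketch of what may be going on, under some assumptions. An .xlsx file is a zipped binary format rather than plain text, which is most likely why read.table reports embedded nuls; saving as CSV from Excel (or using a package such as readxl) avoids that. For the CSV, the sketch assumes the file is comma-separated; the temporary file below is a hypothetical stand-in for Classeur1.csv, not the real file. If Excel wrote semicolons instead, which it does in some locales, sep = ";" or read.csv2 would be the analogue.

```r
# Hypothetical stand-in for Classeur1.csv: comma-separated values.
# The trailing "\n" matters: without a newline after the last row,
# read.table warns about an "incomplete final line" (the data is still read).
tmp <- tempfile(fileext = ".csv")
cat("x,y,z\n61,13,4\n175,21,18\n111,24,14\n124,23,18\n", file = tmp)

# The default sep = "" splits on whitespace only, so every line
# ends up as a single field and the data frame has one column.
wrong <- read.table(tmp, header = TRUE)
ncol(wrong)

# Naming the separator explicitly gives the three columns x, y, z.
pretend.df <- read.table(tmp, header = TRUE, sep = ",")

# read.csv is a wrapper with header = TRUE and sep = "," as defaults.
same.df <- read.csv(tmp)

pretend.df[1, 3]
```

In RStudio the same commands can be typed in the Console pane, and View(pretend.df) opens the result in a spreadsheet-like viewer to check that the columns were split as intended.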
Question: Could anyone tell me what's wrong with my last command, and how I could modify it?
P.S. I'm a new user of RStudio. Could anyone show me how to solve this problem in that software?