3

To remove rows from a data frame, I use a command like the following, which removes the first row:

data <- data[-1, ]

I need to remove the first 6 rows, so I used the following:

data <- data[-c(1,2,3,4,5,6), ]

or

data <- data[-(1:6), ]

This works as far as removing the rows goes, but it introduces a new column called row.names that I cannot get rid of unless I use the command:

row.names(data) <- NULL

What is the reason for this? Is there a better way of removing a number of rows/columns with one command?

Example: [image]

after the following code:

tquery <- tquery[-(1:6), ]

This is the data: [image]

Rich Scriven
WeakLearner
  • Don't vote to close - OP has discovered a gotcha with `View()` on dataframes. – smci Jan 19 '15 at 23:47
  • Is this a "gotcha" if it is clearly documented in a (very short) help file? `If there are row names on the data frame that are not 1:nrow, they are displayed in a separate first column called row.names.` – rawr Jan 19 '15 at 23:50
  • Another issue with `View()` under RStudio: http://stackoverflow.com/questions/19341853/r-view-does-not-display-all-columns-of-data-frame – smci Jan 19 '15 at 23:50
  • I hate `View` for other reasons: `mat1 <- \`colnames<-\`(matrix(1:10, 5, 2), c('x','y')); mat2 <- \`colnames<-\`(matrix(11:20, 5, 2), c('x','y')); View(cbind(mat1, mat2)); rownames(mat1) <- rownames(mat2) <- letters[1:5]; View(cbind(mat1, mat2))` With NULL rownames, it binds as expected. With non NULL rownames, the first occurrence is just repeated. It appears as expected in the console. wtf `View`? – rawr Jan 19 '15 at 23:54
  • `print.data.frame` and `View` are not intended to do the same thing (as per the description of each), IMO, and one could easily be fixed to be identical to the other.. but they are not – rawr Jan 19 '15 at 23:57
  • It seems like my above comment is also an rstudio bug since this behavior does not occur in my terminal – rawr Jan 20 '15 at 00:00

2 Answers

6

Although it looks that way, you are not actually adding a column to the data. What you are seeing is just a result of using View(). The function shows the "row.names" attribute of the data frame as the first column, but no column was added to the data frame itself.

This is expected and documented behavior. From the Details section of help(View):

If there are row names on the data frame that are not 1:nrow, they are displayed in a separate first column called row.names.

Since you subsetted the data, the row names are no longer 1:nrow (after dropping the first six rows they start at 7), and hence the extra column appears in the viewer.
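To make that concrete, here is a minimal sketch. The data frame `df` and its columns are hypothetical stand-ins for the asker's data; only base R is assumed.

```r
# Hypothetical data frame standing in for the asker's data
df <- data.frame(id = 1:10, value = letters[1:10])

trimmed <- df[-(1:6), ]      # drop the first six rows

ncol(df)                     # 2
ncol(trimmed)                # still 2: no column was added
rownames(trimmed)            # "7" "8" "9" "10", no longer 1:nrow

# Resetting the row names restores the default 1:nrow sequence,
# which should also make the extra column disappear in View()
rownames(trimmed) <- NULL
rownames(trimmed)            # "1" "2" "3" "4"
```

Resetting the row names is optional: it only changes how the rows are labelled, not the data in the columns.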

Print your data in the console and you'll see the difference.

View(mtcars) ## shows a row.names column, because the mtcars row names are not 1:nrow

versus

mtcars

Basically, don't trust View() to display an exact representation of the actual data. Instead use attributes(), *names(), dim(), length(), etc. or just peek at the data with head().
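As a rough illustration of those console checks, the sketch below subsets the built-in mtcars data set the same way as in the question. The variable name `sub` is just a placeholder.

```r
# mtcars ships with base R (package datasets): 32 rows, 11 columns
sub <- mtcars[-(1:6), ]

dim(sub)                       # 26 11
names(sub)                     # the 11 column names only
"row.names" %in% names(sub)    # FALSE: there is no such column
head(rownames(sub), 3)         # the row names are stored separately
names(attributes(sub))         # lists the attributes, including row.names
head(sub)                      # quick look at the first rows
```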

Rich Scriven
2

See the R help via "?row.names" for more info. From the documentation, "All data frames have a row names attribute"

?row.names ## get more information about row.names from the R help

row.names is not a new column, but rather an attribute of every data frame. It is metadata, and most functions that operate on the columns ignore it. One exception to keep in mind is export: write.csv() writes the row names as an unnamed first column by default, so pass row.names = FALSE if you do not want them in the file. This is similar to how Excel shows row numbers in the left margin, which are reference labels for the application rather than part of the data.

str(your_dataframe) ## see that those columns don't exist
colnames(your_dataframe) ## see column names
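One place where row names can show up is CSV export. A small sketch, assuming a hypothetical two-column data frame `df`, and using capture.output() only so the result can be inspected without writing a file:

```r
# Hypothetical data frame; dropping the first row leaves row names "2" and "3"
df <- data.frame(x = 1:3, y = c("a", "b", "c"))[-1, ]

# write.csv() prints to the console when no file is given
with_names    <- capture.output(write.csv(df))
without_names <- capture.output(write.csv(df, row.names = FALSE))

with_names[1]      # header begins with an empty "" field for the row names
without_names[1]   # header contains only "x" and "y"
```

So row names stay out of the column list, but by default they are still written on export unless row.names = FALSE is given.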
Super_John