I imported a csv file into R. The first column has my observations and I have 5 variables. However, when I import it into R it takes my column of observations as a variable, and tells me I have 6 variables. How do I make it understand that the first column of "cars" is a column of observations? I attach a picture for reference.

Thank you,

Mariana

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Pictures of data aren't very helpful. What function did you use to import the data? Are you trying to set the first column to row names? I'm not sure what the problem is exactly that you are trying to fix. Is there a reason it needs to be exactly 5 columns? – MrFlick Mar 04 '20 at 21:48
  • Hi Marianna - welcome to SO and congrats on your first post! Two suggestions (1) add the image inline as it is hard to find as a link. (2) Add the code using the code {} markup. Cheers! – CoolDocMan Mar 04 '20 at 21:53
  • So, you *can* set the first column as "row names" in R. But many people try not to use row names in R, as they're not as easy to work with. It's totally fine to have a column in your dataframe contain information that defines your observations; it doesn't mean it will be treated as an "outcome variable". – Marius Mar 04 '20 at 22:32

1 Answer

You should be able to specify this with the row.names parameter in read.csv. Although I can't say exactly what to type since I don't have the original dataset, it should be something like:

read.csv(file = "myfile.csv", row.names = 1)  # add any other options you need

indicating that row names can be found in the first column.
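Since I don't have your file, here is a minimal sketch of the difference using a made-up CSV written to a temporary file. The car names, the values, and the object names `csv_path`, `with_label_column` and `cars_df` are all hypothetical; substitute your own file path.

```r
# Write a small, made-up CSV to a temporary file (a stand-in for your real file).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "X,mpg,cyl,disp,hp,wt",
  "Mazda RX4,21.0,6,160,110,2.620",
  "Datsun 710,22.8,4,108,93,2.320",
  "Hornet 4 Drive,21.4,6,258,110,3.215"
), csv_path)

# Default import: the label column is read as an ordinary variable named X.
with_label_column <- read.csv(file = csv_path)
ncol(with_label_column)   # 6

# row.names = 1: the first column becomes the row names, not a variable.
cars_df <- read.csv(file = csv_path, row.names = 1)
ncol(cars_df)             # 5
rownames(cars_df)
```

Note that row names must be unique, so this only works if no label is repeated in the first column.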

If you're using some other method of importing the file (e.g. by using the RStudio graphical interface), there should be an option somewhere along the way to specify the location of row names.

Alternatively, a possibly easier approach is suggested by the read.csv documentation:

> If row.names is not specified and the header line has one less entry than the number of columns, the first column is taken to be the row names. This allows data frames to be read in from the format in which they are printed. If row.names is specified and does not refer to the first column, that column is discarded from such files.

Try deleting the X in the top left corner of your .csv file (and delete the comma that follows it) and see if that gets you anywhere.
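To illustrate what that looks like, here is a small sketch with made-up data passed through the `text` argument of read.csv instead of a file. The header has five entries while each data line has six fields, which is the situation the documentation describes. The values and the name `cars_df` are hypothetical.

```r
# Header has 5 entries, data lines have 6 fields: the "X," has been removed.
csv_text <- "mpg,cyl,disp,hp,wt
Mazda RX4,21.0,6,160,110,2.620
Datsun 710,22.8,4,108,93,2.320
Hornet 4 Drive,21.4,6,258,110,3.215"

# No row.names argument needed; the first column is taken as row names.
cars_df <- read.csv(text = csv_text)
ncol(cars_df)       # 5
rownames(cars_df)
```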

EDIT Marius has the right suggestion, by the way - just ignore the junk column and work with row numbers instead. (What's the harm?)
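If you go that route, a rough sketch of what it might look like is below. The data are made up, and the column name `model` is just an illustrative replacement for the default `X`. The label column stays in the data frame, and you leave it out only when a computation needs numeric columns.

```r
csv_text <- "X,mpg,cyl,disp,hp,wt
Mazda RX4,21.0,6,160,110,2.620
Datsun 710,22.8,4,108,93,2.320
Hornet 4 Drive,21.4,6,258,110,3.215"

cars_df <- read.csv(text = csv_text)

# Give the label column a meaningful name (hypothetical choice).
names(cars_df)[1] <- "model"

# Select only the numeric variables when you need them.
numeric_vars <- cars_df[, sapply(cars_df, is.numeric)]
ncol(numeric_vars)      # 5
colMeans(numeric_vars)
```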

Aaron Montgomery