Set a Data Frame Column as the Index of R data.frame object

Question

Using R, how do I make a column of a data frame the data frame's index? Let's assume I read in my data from a .csv file. One of the columns is called 'Date', and I want to make that column the index of my data frame.

For example, in Python with pandas, I would do the following:

import pandas as pd
df = pd.read_csv('/mydata.csv')
d = df.set_index('Date')

Now how do I do that in R?

I tried in R:

df <- read.csv("/mydata.csv")
d <- data.frame(V1=df['Date'])
# or
d <- data.frame(Index=df['Date'])

# But these just make a new data frame with one 'Date' column.
# The index is still 1, 2, 3, ... and not my dates.

You probably want to use `data.table`? http://cran.r-project.org/web/packages/data.table/index.html – Julián Urbano, Dec 17 '13 at 19:40
The `index` you are referring to is probably `row.names`, so: `row.names(d) <- df[['Date']]` – Ananta, Dec 17 '13 at 19:42

score 65 · Accepted Answer · edited Dec 17 '13 at 19:43

I assume that by "Index" you mean row names. You can assign to the row names vector:

rownames(df) <- df$Date
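
As a minimal sketch of the assignment above: the data frame below is made-up sample data standing in for the contents of mydata.csv, and the names `value_on_16th` and `row_name_class` are only for illustration.

```r
# Hypothetical data standing in for the contents of mydata.csv
df <- data.frame(
  Date  = c("2013-12-15", "2013-12-16", "2013-12-17"),
  Value = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# Use the Date column as row names; the values must be unique and non-missing
rownames(df) <- df$Date

# Rows can now be selected by their date label
value_on_16th <- df["2013-12-16", "Value"]

# Row names are stored as character strings, not as Date objects
row_name_class <- class(rownames(df))
```

Note that the row names are plain strings, so they work as labels for lookup but do not behave like dates, and the Date column itself is still present after the assignment.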

edited Dec 17 '13 at 19:43

Julián Urbano

answered Dec 17 '13 at 19:41

Matthew Lundberg

But it keeps Date as a column of the data frame. How do I remove it from the columns? – scls Mar 10 '16 at 20:24
You can remove it by assigning `NULL` to the column: `df$Date <- NULL` – Matthew Lundberg Mar 10 '16 at 20:54

score 22 · Answer 2 · answered Sep 20 '18 at 10:27

The index can be set while reading the data, in both pandas and R.

In pandas:

import pandas as pd
df = pd.read_csv('/mydata.csv', index_col="Date")

In R:

df <- read.csv("/mydata.csv", header=TRUE, row.names="Date")

answered Sep 20 '18 at 10:27

bli

For R, you can also give the column number for row.names. For instance if you want the first column to be the index, you can give row.names=1. – nobot Jul 18 '23 at 18:05
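
A small sketch of both forms of `row.names`, by column name and by column number. To keep it self-contained it reads from an inline string through the `text` argument instead of a file; the CSV contents are invented for illustration.

```r
# Hypothetical CSV contents, passed via `text` instead of a file path
csv_text <- "Date,Value
2013-12-15,1.5
2013-12-16,2.5
2013-12-17,3.5"

# Select the row-name column by its header name
by_name <- read.csv(text = csv_text, header = TRUE, row.names = "Date")

# Select the row-name column by its position (first column)
by_position <- read.csv(text = csv_text, header = TRUE, row.names = 1)

# In both cases Date is consumed as row names and is no longer a column
remaining_columns <- names(by_name)
```

Unlike assigning `rownames(df)` after reading, this leaves no separate Date column behind.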

score 13 · Answer 3 · answered Aug 20 '19 at 11:55

The tidyverse solution:

library(tidyverse)  # column_to_rownames() comes from the tibble package
df <- df %>% column_to_rownames(var = "Date")
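
For illustration, a self-contained sketch that loads only tibble (assumed to be installed) rather than the whole tidyverse. The sample data and the names `indexed` and `restored` are hypothetical.

```r
library(tibble)  # provides column_to_rownames() and rownames_to_column()

# Hypothetical data standing in for the contents of mydata.csv
df <- data.frame(
  Date  = c("2013-12-15", "2013-12-16", "2013-12-17"),
  Value = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# Move the Date column into the row names; the column is dropped
indexed <- column_to_rownames(df, var = "Date")

# The inverse operation turns the row names back into a regular column
restored <- rownames_to_column(indexed, var = "Date")
```

Both functions return a new data frame rather than modifying their input, so the result has to be assigned.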

answered Aug 20 '19 at 11:55

Koot6133

score 0 · Answer 4 · answered Jun 22 '23 at 11:10

The function `match` is helpful when you need the positions of the elements of one vector within a second vector. For example, after tabulating a vector I have a frequency table with two columns: the items and their frequencies. Suppose you want to add a third column holding a description of each item, and the descriptions live in another dataset that acts as a "dictionary", with one column listing every item and another giving the related name. First, save the result of matching the items in the first column of the frequency table against the item column of the dictionary; then use the saved positions to look up the related names.
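
A rough sketch of that lookup, with invented data; the vector `items`, the `dictionary` data frame, and the column names are all hypothetical.

```r
# Hypothetical raw data to tabulate
items <- c("b", "a", "b", "c", "a", "b")

# Frequency table with two columns: the item and its count
freq <- as.data.frame(table(items), stringsAsFactors = FALSE)
names(freq) <- c("item", "count")

# Hypothetical "dictionary" mapping every item to a descriptive name
dictionary <- data.frame(
  item = c("c", "a", "b", "d"),
  name = c("cherry", "apple", "banana", "date"),
  stringsAsFactors = FALSE
)

# Position of each frequency-table item within the dictionary's item column
idx <- match(freq$item, dictionary$item)

# Use the saved positions to pull the related names into a third column
freq$name <- dictionary$name[idx]
```

Items missing from the dictionary would get `NA` from `match`, and therefore `NA` in the new column.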

score -1 · Answer 5 · edited Aug 31 '21 at 01:51

When saving the data frame, use `row.names = FALSE`, e.g. `write.csv(prediction.df, "my_file.csv", row.names = FALSE)`

edited Aug 31 '21 at 01:51

ah bon

answered Mar 24 '21 at 12:14

Gaurav Sahu

This answer is nothing to do with setting an index – Conor Neilson Mar 24 '21 at 21:10
