26

I have the following table:

     X.5       X.6       X.7       X.8          X.9 X.10         X.11  X.12   X.13
17   Zip CuCurrent PaCurrent PoCurrent      Contact  Ext          Fax email Status
18  74136         0         1         0 918-491-6998    0 918-491-6659            1
19  30329         1         0         0 404-321-5711                              1
20  74136         1         0         0 918-523-2516    0 918-523-2522            1
21  80203         0         1         0 303-864-1919    0                         1
22  80120         1         0         0 345-098-8890  456                         1

how can make the first row 'zip, cucurrent, pacurrent...' to be the column header?

Thanks,

below is dput(dat)

structure(list(X.5 = structure(c(26L, 14L, 6L, 14L, 17L, 16L), .Label = c("", 
"1104", "1234 I don't know Ave.", "139.98", "300 Morgan St.", 
"30329", "312.95", "4101 S. 4th Street, Traff", "500 Highway 89 North", 
"644.04", "656.73", "72160", "72336-7000", "74136", "75501", 
"80120", "80203", "877.87", "Address1", "BZip", "General Svcs Admin (WPY)", 
"InvFileName2", "LDC_Org_Cost", "N/A", "NULL", "Zip"), class = "factor"), 
    X.6 = structure(c(7L, 2L, 3L, 3L, 2L, 3L), .Label = c("", 
    "0", "1", "301 7th St. SW", "800-688-6160", "Address2", "CuCurrent", 
    "Emergency", "LDC_Cost_Adj", "Mtelemetry", "N/A", "NULL", 
    "Suite 1402"), class = "factor"), X.7 = structure(c(8L, 3L, 
    2L, 2L, 3L, 2L), .Label = c("", "0", "1", "Address3", "Cucustomer", 
    "LDC_Misc_Fee", "NULL", "PaCurrent", "Room 7512"), class = "factor"), 
    X.8 = structure(c(14L, 2L, 2L, 2L, 2L, 2L), .Label = c("", 
    "0", "100.98", "237.02", "242.33", "335.04", "50.6", "City", 
    "Durham", "LDC_FinalVolume", "Leavenwoth", "Pacustomer", 
    "Petersburg", "PoCurrent", "Prescott", "Washington"), class = "factor"), 
    X.9 = structure(c(18L, 16L, 10L, 17L, 7L, 9L), .Label = c("", 
    "0", "1", "139.98", "20024", "27701", "303-864-1919", "312.95", 
    "345-098-8890", "404-321-5711", "644.04", "656.73", "66048", 
    "86313", "877.87", "918-491-6998", "918-523-2516", "Contact", 
    "LDC_FinalCost", "PoCustomer", "Zip"), class = "factor"), 
    X.10 = structure(c(14L, 2L, 1L, 2L, 2L, 9L), .Label = c("", 
    "0", "2.620194604", "2.710064788", "2.717239052", "2.766403162", 
    "202-708-4995", "3.09912854", "456", "804-504-7200", "913-682-2000", 
    "919-956-5541", "928-717-7472", "Ext", "InvoicesNeeded", 
    "LDC_UnitPrice", "NULL", "Phone"), class = "factor"), X.11 = structure(c(7L, 
    4L, 1L, 5L, 1L, 1L), .Label = c("", " ", "1067", "918-491-6659", 
    "918-523-2522", "Ext", "Fax", "InvoiceMonths", "LDC_UnitPrice_Original", 
    "NULL", "x2951"), class = "factor"), X.12 = structure(c(13L, 
    1L, 1L, 1L, 1L, 1L), .Label = c("", "0", "100.98", "202-401-3722", 
    "237.02", "242.33", "335.04", "50.6", "716- 344-3303", "804-504-7227", 
    "913- 758-4230", "919- 956-7152", "email", "Fax", "GSA", 
    "Supp_Vol"), class = "factor"), X.13 = structure(c(10L, 2L, 
    2L, 2L, 2L, 2L), .Label = c("", "1", "15", "202-497-6164", 
    "3", "804-504-7200", "Emergency", "MajorTypeId", "NULL", 
    "Status", "Supp_Vol_Adj"), class = "factor")), .Names = c("X.5", 
"X.6", "X.7", "X.8", "X.9", "X.10", "X.11", "X.12", "X.13"), row.names = 17:22, class = "data.frame")
neuron
  • 1,949
  • 1
  • 15
  • 30
PMa
  • 1,751
  • 7
  • 22
  • 28
  • A `dput()` of the table would help. But you can do a `colnames(dat) <- as.character(dat[1,])` to set the column names and normal R syntax to "delete" the first row. – hrbrmstr Apr 22 '14 at 02:34
  • @RichardScriven good point. In retrospect that does look like a missing `header=TRUE` in a `read.…` function, but why do the row names start at 17 then? – hrbrmstr Apr 22 '14 at 02:35
  • @hrbrmstr `as.character(dat[1,])` returns the numeric indices of the factor levels in the first row (as character). I do not know why that is. – Matthew Lundberg Apr 22 '14 at 02:38
  • Weird. I made sure it did what I thought it did on a huge data.table of netflow records. Ah. But none of those were factors (just checked). Wish we had a `dput()` to work from :-) – hrbrmstr Apr 22 '14 at 02:42
  • @RichardScriven, the `dat` is a subset of a csv file `gas`. I used this code to subset `dat = gas[17:22,7:15]`. Row 17 in my original post should be the header of `dat`, but I am not sure how to change that. – PMa Apr 22 '14 at 03:09
  • @PerriMa can you write `dput(gas)` in your console and paste the result here? That way you will get good answers. – RockScience Apr 22 '14 at 07:30
  • @RockScience - just did in my original post. thanks! – PMa Apr 22 '14 at 14:13

7 Answers7

37

This could be in a simple way :

step 1: Copy 1st row to header:

names(dat) <- dat[1,]

step 2: Delete 1st row :

dat <- dat[-1,]
Shalini Baranwal
  • 2,780
  • 4
  • 24
  • 34
28

If you don't want to re-read the data into R (which it seems like you don't from the comments), you can do the following. I had to add some zeros to get your data to read completely, so disregard those.

dat
##       V2        V3        V4        V5           V6  V7           V8    V9    V10
## 17   Zip CuCurrent PaCurrent PoCurrent      Contact Ext          Fax email Status
## 18 74136         0         1         0 918-491-6998   0 918-491-6659     0      1
## 19 30329         1         0         0 404-321-5711   0            0     0      1
## 20 74136         1         0         0 918-523-2516   0 918-523-2522     0      1
## 21 80203         0         1         0 303-864-1919   0            0     0      1
## 22 80120         1         0         0 345-098-8890 456            0     0      1

First take the first row as the column names. Next remove the first row. Finish it off by converting the columns to their appropriate types.

names(dat) <- as.matrix(dat[1, ])
dat <- dat[-1, ]
dat[] <- lapply(dat, function(x) type.convert(as.character(x)))
dat
##     Zip CuCurrent PaCurrent PoCurrent      Contact Ext          Fax email Status
## 1 74136         0         1         0 918-491-6998   0 918-491-6659     0      1
## 2 30329         1         0         0 404-321-5711   0            0     0      1
## 3 74136         1         0         0 918-523-2516   0 918-523-2522     0      1
## 4 80203         0         1         0 303-864-1919   0            0     0      1
## 5 80120         1         0         0 345-098-8890 456            0     0      1
Rich Scriven
  • 97,041
  • 11
  • 181
  • 245
  • One thing that is useful is the library `janitor`'s `clean_names` to avoid the `lapply` statement. so `dat <- janitor::clean_names(aum)` – roarkz Aug 30 '17 at 21:18
17

The cleanest way to do this is a simple function already designed for this purpose. You need janitor package.

janitor::row_to_names(dat)

If you want the nth row to be used for column names the second parameter of the function is the row number to be used. The default is 1.

Lazarus Thurston
  • 1,197
  • 15
  • 33
5

If you get it from a csv file, use the argument 'header' from read.csv

dat=read.csv("gas.csv", header=TRUE)

if you already have your data and don't want to / or cannot get it in a clean way, you can alwasy do

dat=structure(list(X.5 = structure(c(26L, 14L, 6L, 14L, 17L, 16L), .Label = c("", "1104", "1234 I don't know Ave.", "139.98", "300 Morgan St.", "30329", "312.95", "4101 S. 4th Street, Traff", "500 Highway 89 North", "644.04", "656.73", "72160", "72336-7000", "74136", "75501", "80120", "80203", "877.87", "Address1", "BZip", "General Svcs Admin (WPY)", "InvFileName2", "LDC_Org_Cost", "N/A", "NULL", "Zip"), class = "factor"), X.6 = structure(c(7L, 2L, 3L, 3L, 2L, 3L), .Label = c("", "0", "1", "301 7th St. SW", "800-688-6160", "Address2", "CuCurrent", "Emergency", "LDC_Cost_Adj", "Mtelemetry", "N/A", "NULL", "Suite 1402"), class = "factor"), X.7 = structure(c(8L, 3L, 2L, 2L, 3L, 2L), .Label = c("", "0", "1", "Address3", "Cucustomer", "LDC_Misc_Fee", "NULL", "PaCurrent", "Room 7512"), class = "factor"), X.8 = structure(c(14L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "0", "100.98", "237.02", "242.33", "335.04", "50.6", "City", "Durham", "LDC_FinalVolume", "Leavenwoth", "Pacustomer", "Petersburg", "PoCurrent", "Prescott", "Washington"), class = "factor"), X.9 = structure(c(18L, 16L, 10L, 17L, 7L, 9L), .Label = c("", "0", "1", "139.98", "20024", "27701", "303-864-1919", "312.95", "345-098-8890", "404-321-5711", "644.04", "656.73", "66048", "86313", "877.87", "918-491-6998", "918-523-2516", "Contact", "LDC_FinalCost", "PoCustomer", "Zip"), class = "factor"), X.10 = structure(c(14L, 2L, 1L, 2L, 2L, 9L), .Label = c("", "0", "2.620194604", "2.710064788", "2.717239052", "2.766403162", "202-708-4995", "3.09912854", "456", "804-504-7200", "913-682-2000", "919-956-5541", "928-717-7472", "Ext", "InvoicesNeeded", "LDC_UnitPrice", "NULL", "Phone"), class = "factor"), X.11 = structure(c(7L, 4L, 1L, 5L, 1L, 1L), .Label = c("", " ", "1067", "918-491-6659", "918-523-2522", "Ext", "Fax", "InvoiceMonths", "LDC_UnitPrice_Original", "NULL", "x2951"), class = "factor"), X.12 = structure(c(13L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "0", "100.98", "202-401-3722", "237.02", "242.33", "335.04", "50.6", "716- 344-3303", "804-504-7227", "913- 758-4230", "919- 956-7152", "email", "Fax", "GSA", "Supp_Vol"), class = "factor"), X.13 = structure(c(10L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "1", "15", "202-497-6164", "3", "804-504-7200", "Emergency", "MajorTypeId", "NULL", "Status", "Supp_Vol_Adj"), class = "factor")), .Names = c("X.5", "X.6", "X.7", "X.8", "X.9", "X.10", "X.11", "X.12", "X.13"), row.names = 17:22, class = "data.frame")
dat2 = dat[2:6,]   
colnames(dat2) = dat[1,] 
dat2
RockScience
  • 17,932
  • 26
  • 89
  • 125
  • the data.frame you gave doesn't have 22 rows... your question is not formulated properly. I have changed the answer now that you provided the dput – RockScience Apr 23 '14 at 02:03
1

Please use header=TRUE when importing data into R!

UseR10085
  • 7,120
  • 3
  • 24
  • 54
darkage
  • 857
  • 3
  • 12
  • 22
1

If you are able to re-read the data into R from the file, you could also just add the "skip" argument to read.csv to skip over the first 16 lines and use line 17 as the header:

dat=read.csv("contacts.csv", skip=16, nrows=5, header=TRUE)
Petter Friberg
  • 21,252
  • 9
  • 60
  • 109
Brad E
  • 106
  • 1
  • 3
0

Shalini Baranwal's answer is the best, so I will upvote it. However, future readers may get an error message when running that solution. My error was:

"Error in setnames(x, value) : Passed a vector of type 'list'. Needs to be type 'character'."

To get around this, my modified solution was to add an as.character() wrapper to the first step. The full solution below:

step 1: Copy 1st row to header:

dat <- mtcars
names(dat) <- as.character(dat[1,])

step 2: Delete 1st row :

dat <- dat[-1,]
  • 3
    This should be posted as a comment on—or, perhaps, a suggested edit of—the referenced answer. While useful, it doesn't offer a self-contained answer to the original question. – Jeremy Caney Dec 23 '21 at 00:10
  • I do not have enough reputation to comment on any post. My previous answer has been edited to be self-contained. Thanks – davidbaseball Dec 23 '21 at 16:21