-1

I am working on a requirement where the input data is in the format below.

Name XYZ AGE 30 Country India Mobile 1234567890
Name ABC AGE 35 Country Russia Mobile 2345678901

I want to import this data into R and reshape it, i.e. "Name", "AGE", "Country" and "Mobile" should become the column headers.

zx8754
puneet
  • How is the data stored? Is it in a text file? How are the fields delimited? – Robin Gertenbach May 06 '16 at 08:19
  • Welcome to Stack Overflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 May 06 '16 at 08:35

3 Answers

1

How about creating a data frame from the values first and then adding the names, as follows:

x <- c('Name XYZ AGE 30 Country India Mobile 1234567890',
           'Name ABC AGE 35 Country Russia Mobile 2345678901')

df <- as.data.frame(do.call(rbind, lapply(strsplit(x, ' '), function(i) i[c(FALSE, TRUE)])))
names(df) <- unlist(strsplit(x[1], ' '))[c(TRUE, FALSE)]
df
#  Name AGE Country     Mobile
#1  XYZ  30   India 1234567890
#2  ABC  35  Russia 2345678901
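
If the data lives in a text file rather than in a character vector, the same idea can be wrapped in a small function. This is only a sketch: the helper name `parse_key_value_lines` and the temporary file are made up for illustration, and it assumes every line carries the same labels in the same order, with no spaces inside labels or values.

```r
# Hypothetical helper: turn "label value label value ..." lines into a data frame.
# Assumes identical labels on every line and no spaces inside labels or values.
parse_key_value_lines <- function(lines) {
  # Split each line on runs of spaces, ignoring leading/trailing whitespace.
  tokens <- strsplit(trimws(lines), " +")
  # Keep every second token (the values) and stack the lines into a matrix.
  values <- do.call(rbind, lapply(tokens, function(tok) tok[c(FALSE, TRUE)]))
  df <- as.data.frame(values, stringsAsFactors = FALSE)
  # The labels are the odd-numbered tokens of the first line.
  names(df) <- tokens[[1]][c(TRUE, FALSE)]
  df
}

# A temporary file stands in for the real input file here.
path <- tempfile(fileext = ".txt")
writeLines(c("Name XYZ AGE 30 Country India Mobile 1234567890",
             "Name ABC AGE 35 Country Russia Mobile 2345678901"), path)

result <- parse_key_value_lines(readLines(path))
result
```

All columns come back as character; something like `type.convert(result, as.is = TRUE)` could be applied afterwards if numeric columns are wanted.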
Sotos
1

Assuming that the data is stored in a data.frame df1:

df1 <- read.table(text="Name XYZ AGE 30 Country India Mobile 1234567890
                        Name ABC AGE 35 Country Russia Mobile 2345678901")

You could create a new data.frame df2 by selecting every second (even-numbered) column:

df2 <- df1[c(FALSE,TRUE)]

and assign the column names by using every second (odd-numbered) entry in the first row of df1:

colnames(df2) <- unlist(df1[1, c(TRUE, FALSE)])

The data.frame df1 can then be deleted with rm(df1). This is the result for df2:

#> df2
#  Name AGE Country     Mobile
#1  XYZ  30   India 1234567890
#2  ABC  35  Russia 2345678901

The same procedure could be written as a one-liner. Arguably less clear, but certainly more compact:

df1 <- `colnames<-`(df1[c(FALSE,TRUE)], unlist(df1[1,c(TRUE,FALSE)]))

In that case the second data.frame df2 is not needed.
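
The selection with `c(FALSE, TRUE)` and `c(TRUE, FALSE)` relies on R recycling a short logical index along the whole object. A toy illustration on a plain character vector (the names `tokens`, `keys` and `vals` are made up for this sketch):

```r
# Toy vector standing in for one row of alternating labels and values.
tokens <- c("Name", "XYZ", "AGE", "30", "Country", "India")

# The two-element logical index is recycled to the length of tokens.
keys <- tokens[c(TRUE, FALSE)]   # keeps positions 1, 3, 5
vals <- tokens[c(FALSE, TRUE)]   # keeps positions 2, 4, 6

keys
# [1] "Name"    "AGE"     "Country"
vals
# [1] "XYZ"   "30"    "India"
```

Indexing a data.frame with a single logical vector, as in `df1[c(FALSE,TRUE)]`, applies the same recycling to its columns.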

RHertel
  • Worked for me, thanks. I am trying the other suggestions as well and will update here with the results. – puneet May 06 '16 at 09:37
0

A combination of matrix and unlist should do the trick, for example:

tidyData <- data.frame(matrix(unlist(dataByLine), nrow = length(fileByLines), byrow = TRUE), stringsAsFactors = FALSE)

If you had a minimal reproducible example, this would be easier to answer.
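
A runnable sketch of this approach, using the sample lines from the question. How `fileByLines` and `dataByLine` are built is an assumption here (a character vector of lines and its split on single spaces), and the last two steps, which separate labels from values, are an addition borrowed from the other answers:

```r
# Sample input; in practice this might come from readLines() on a file.
fileByLines <- c("Name XYZ AGE 30 Country India Mobile 1234567890",
                 "Name ABC AGE 35 Country Russia Mobile 2345678901")

# Assumed definition: one character vector of tokens per input line.
dataByLine <- strsplit(fileByLines, " ")

# Fill a matrix row by row, so each input line becomes one row.
allTokens <- matrix(unlist(dataByLine), nrow = length(fileByLines), byrow = TRUE)

# Odd columns hold the labels, even columns hold the values.
tidyData <- data.frame(allTokens[, c(FALSE, TRUE), drop = FALSE],
                       stringsAsFactors = FALSE)
names(tidyData) <- allTokens[1, c(TRUE, FALSE)]
tidyData
```

This only works if every line splits into the same number of tokens; otherwise `matrix()` will recycle or misalign the values.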

mondano