As far as I can see, you have three problems with your data:
- Blank lines in your data file.
- Some of the values in the first column start with an uppercase letter, some with lower case.
- The data is not in the format you would like to see (i.e. wide format).
This can be solved as follows:
1) Read the data using the correct separator and the blank.lines.skip parameter (and possibly also fill=TRUE). Adding strip.white=TRUE also removes the space that follows each colon:
mydf <- read.table(text="colA: dataA
colB: dataB
colC: dataC
ColA: dataA
ColB: dataB
ColC: dataC", sep=":", header=FALSE, blank.lines.skip=TRUE, strip.white=TRUE)
This gives:
> mydf
V1 V2
1 colA dataA
2 colB dataB
3 colC dataC
4 ColA dataA
5 ColB dataB
6 ColC dataC
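To see the effect on input that actually contains empty lines, here is a small sketch. The sample text and the names raw_txt and demo are made up for illustration, and stringsAsFactors=FALSE is only there so the result is a plain character column regardless of R version:

```r
# Hypothetical sample input with empty lines between the records
raw_txt <- "colA: dataA

colB: dataB

ColA: dataA
ColB: dataB"

demo <- read.table(text = raw_txt, sep = ":", header = FALSE,
                   blank.lines.skip = TRUE,   # drop the empty lines
                   strip.white = TRUE,        # trim the space after ':'
                   stringsAsFactors = FALSE)

nrow(demo)   # number of non-blank lines that were read
demo$V2      # values without a leading space
```

Only the four non-blank lines end up as rows, and the values in V2 have no leading space.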
2) Capitalize the first letter of the values in the first column:
mydf$V1 <- gsub('(^[a-z])','\\U\\1', mydf$V1, perl=TRUE)
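A quick check of what that replacement does, on a made-up vector (keys and alt are hypothetical names). \\U is a Perl-style case conversion, which is why perl=TRUE is needed. A variant with substr and toupper is shown for comparison; unlike the regular expression it does not check that the first character is a lowercase letter, but for values like these the result is the same:

```r
keys <- c("colA", "ColB", "colC")   # hypothetical input

# Perl-style replacement: upper-case the captured first letter
capitalized <- gsub('(^[a-z])', '\\U\\1', keys, perl = TRUE)
capitalized

# Variant without a regular expression
alt <- keys
substr(alt, 1, 1) <- toupper(substr(alt, 1, 1))
alt
```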
3) Reshape to wide format:
library(data.table)
dcast(setDT(mydf), rowid(V1) ~ V1, value.var = 'V2')[, V1 := NULL][]
which gives:
ColA ColB ColC
1: dataA dataB dataC
2: dataA dataB dataC
The reshaping solution above relies on the rowid function, which requires the development version (1.9.7) of data.table.
For more alternatives for reshaping your data, see "Transposing Long to Wide without Timevar"
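If rowid is not available in your installed version, the same idea can be sketched in base R. This is one possible alternative, not the approach used above: ave builds the within-key counter and reshape does the long-to-wide step. The names long, wide and id are made up for illustration, and long is rebuilt here so the snippet runs on its own:

```r
# Same shape as mydf after steps 1 and 2
long <- data.frame(V1 = rep(c("ColA", "ColB", "ColC"), 2),
                   V2 = rep(c("dataA", "dataB", "dataC"), 2),
                   stringsAsFactors = FALSE)

# Running count within each key: the n-th occurrence of a key belongs to record n
long$id <- ave(seq_along(long$V1), long$V1, FUN = seq_along)

# One row per record, one column per key
wide <- reshape(long, idvar = "id", timevar = "V1", direction = "wide")
names(wide) <- sub("^V2\\.", "", names(wide))   # drop the "V2." prefix reshape adds
wide$id <- NULL                                 # the counter is no longer needed
wide
```

Like the rowid version, this assumes the keys repeat in a consistent order, so that the n-th ColA, ColB and ColC belong to the same record.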