I have a data.table obtained from a somewhat quirky file:
library(data.table)
istub <- setDT(read.fwf('http://www.bls.gov/cex/pumd/2016/csxistub.txt',
                        widths = c(2, 3, 64, 12, 2, 3, 10), skip = 1,
                        stringsAsFactors = FALSE, strip.white = TRUE,
                        col.names = c("type", "level", "title", "UCC",
                                      "survey", "factor", "group")
))
One of the quirks of the file is that if type==2, the row merely holds a continuation of the previous row's title field.
So, I want to append the continuation title to the previous row's title. I assume there is only ever one continuation line per ordinary line.
For each example, please begin with:
df <- copy(istub) # avoids extra requests of file
Base R solution: (desired result)
I know I can do:
# if type == 2, "title" field should be appended to the above row's "title" field
continued <- which(df$type==2)
# You can see that these titles are incomplete,
# e.g., "School books, supplies, equipment for vocational and"
tail(df$title[continued-1])
df$title[continued-1] <- paste(df$title[continued-1],df$title[continued])
# Now they're complete
# e.g., "School books, supplies, equipment for vocational and technical schools"
tail(df$title[continued-1])
# And we could get rid of the continuation lines
df <- df[-continued]
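For anyone who cannot reach the BLS file, the same steps can be tried on a few made-up rows. This is only a sketch: the table toy and its contents are hypothetical stand-ins (only the "School books" title comes from the real data), and it keeps just the type and title columns.

```r
library(data.table)

# Hypothetical stand-in for istub: type == 2 rows continue the title above them
toy <- data.table(
  type  = c(1L, 1L, 2L, 1L),
  title = c("Food",
            "School books, supplies, equipment for vocational and",
            "technical schools",
            "Housing")
)

df <- copy(toy)

# Same steps as the base R solution above
continued <- which(df$type == 2)
df$title[continued - 1] <- paste(df$title[continued - 1], df$title[continued])
df <- df[-continued]

print(df)
```

The three remaining rows should all have type 1, with the second title completed by its continuation.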
However, I would like to practice some data.table fu.
Attempts using data.table
First I tried using shift() to subset in i, but that didn't work:
df[shift(type, type='lead')==2,
title := paste(title, shift(title, type='lead') ) ] # doesn't work
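As far as I can tell, the reason is that j only sees the rows selected by i, so the shift() inside j takes its lead within that subset rather than within the full table. A minimal sketch on made-up rows (toy, sel, seen and d are hypothetical names, not part of the BLS data):

```r
library(data.table)

# Hypothetical rows: two ordinary rows that are each followed by a continuation
toy <- data.table(
  type  = c(1L, 1L, 2L, 1L, 2L),
  title = c("A", "B and", "b-rest", "C and", "c-rest")
)

# Rows that i selects: those immediately followed by a type-2 row
sel <- toy[, which(shift(type, type = "lead") == 2)]

# What shift() sees inside j: the lead is taken among the selected rows only
seen <- toy[sel, shift(title, type = "lead")]

# The attempt itself, run on a copy so toy stays intact
d <- copy(toy)
d[shift(type, type = "lead") == 2,
  title := paste(title, shift(title, type = "lead"))]

print(sel)
print(seen)
print(d$title)
```

On these rows the selected titles get pasted to each other (or to NA) instead of to their own continuation lines.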
This works:
# %in% (rather than ==) keeps the last row, whose lead is NA, from becoming NA
df[, title := ifelse(shift(type, type = 'lead') %in% 2,
                     paste(title, shift(title, type = 'lead')),
                     title)]
Am I stuck with two shifts (seems inefficient) or is there an awesomer way?