
I am using fread() to get data like this:

fread(data, select = c("col1", "col2", "missing", "col3"))

where a column named "missing" doesn't exist in the data. fread() automatically drops the "missing" column and returns a dataset like this:

col1   col2   col3 
a      b      c
d      e      f

...

I am wondering if there is a way to change the result to:

col1   col2   missing  col3
a      b      NA       c
d      e      NA       f
Jonnce
  • You could consider adding the empty rows manually after reading the results. – JAD Jul 05 '17 at 14:27
  • I assume we are trying to read multiple files within a loop, then combine them when some might have mismatching column names, if so, see [this post](https://stackoverflow.com/questions/18003717/is-there-any-efficient-way-than-rbind-filllist). We can just fread all in a *lapply* loop then *rbind* with *fill*. – zx8754 Jul 05 '17 at 14:47
  • @zx8754 that solved my problem perfectly! Thank you – Jonnce Jul 07 '17 at 15:08
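
For readers who land here with the same multi-file problem, the approach from the comment above can be sketched roughly as follows. This is only a minimal sketch under assumptions: the two inline strings stand in for real files, and the names `inputs`, `wanted`, `tables`, and `combined` are made up for illustration.

```r
library(data.table)

# Hypothetical inputs: inline text stands in for two files whose columns differ
inputs <- c(
  "col1 col2 col3\na b c\nd e f",
  "col1 col2 missing col3\ng h i j"
)
# Desired final column order (assumed for this example)
wanted <- c("col1", "col2", "missing", "col3")

# Read every input separately, without select, so no warning is raised
tables <- lapply(inputs, function(txt) fread(text = txt, sep = " ", header = TRUE))

# fill = TRUE pads columns that are absent from an input with NA
combined <- rbindlist(tables, fill = TRUE)

# Reorder by reference so "missing" sits where it is wanted
setcolorder(combined, wanted)
print(combined)
```

With real files, `inputs` would be a vector of file paths and the call inside `lapply()` would be `fread(path)` instead of `fread(text = ...)`.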

1 Answer

This question is tagged with data.table and fread but has attracted a base R answer. Therefore, I felt obliged to post a data.table solution.

The OP wants to insert a column at a specific position in a data.table when that column is missing from the input. The OP seems to expect that the select parameter of fread() can be used for that purpose, but fread() skips any name it cannot find and prints a warning:

Warning message:
In fread("col1 col2 col3 \na b c\nd e f", :
Column name 'missing' not found in column name header (case sensitive), skipping.

So the missing column has to be added afterwards:

library(data.table)

# add column by reference
DT[, missing := NA][]
   col1 col2 col3 missing
1:    a    b    c      NA
2:    d    e    f      NA
# rearrange column order
setcolorder(DT, c("col1", "col2", "missing", "col3"))[]
   col1 col2 missing col3
1:    a    b      NA    c
2:    d    e      NA    f

Note that data.table executes both operations by reference, i.e., the whole object does not have to be copied in order to add a single column or to change the column order.
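
If several requested columns may be absent, the same two steps can be generalised. The following is a minimal sketch, not part of the original answer: `wanted` and `absent` are hypothetical names, the table is built with data.table() instead of fread() to keep the example self-contained, and the new columns are filled with NA_character_ on the assumption that the other columns are character.

```r
library(data.table)

# Columns the caller expects, in the desired order (assumed for this example)
wanted <- c("col1", "col2", "missing", "col3")

# Stand-in for the result of fread(): the "missing" column is not there
DT <- data.table(col1 = c("a", "d"), col2 = c("b", "e"), col3 = c("c", "f"))

# Names that were requested but are not present in DT
absent <- setdiff(wanted, names(DT))

# Add all absent columns by reference, filled with character NA
if (length(absent) > 0L) DT[, (absent) := NA_character_]

# Rearrange the column order by reference
setcolorder(DT, wanted)
print(DT)
```

Wrapping `absent` in parentheses on the left-hand side of `:=` makes data.table treat it as a vector of column names rather than as a literal column called "absent".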

Data

library(data.table)
DT <- fread(
"col1   col2   col3 
a      b      c
d      e      f",
select = c("col1", "col2", "missing", "col3"))
Uwe