1

I am new to this group (and also fairly new to R) and I have a question. I have a data.table like this:

   Date    V2                         Deal Type
1: 2009-1  Public sector bank         Corporate Bond-Investment-Grade
2: 2009-1  Private sector bank        Corporate Bond-Investment-Grade
3: 2009-7  Private sector industrial  Corporate Bond-Investment-Grade
4: 2009-1  Private sector bank        Corporate Bond-Investment-Grade
5: 2009-1  Private sector bank        Covered Bond
6: 2009-1  Public sector bank         Corporate Bond-Investment-Grade
7: 2009-1  Private sector bank        Corporate Bond-Investment-Grade

The question is how to recode the values in column V2. For example, I want "Public sector bank" and "Private sector bank" to appear in a new column as "financial", and "Private sector industrial" and "Public sector industrial" as "non-financial". I hope I have been sufficiently clear. Thank you very much for your help.

nicola
willpar
  • Possible duplicate of http://stackoverflow.com/questions/7531868/how-to-rename-a-single-column-in-a-data-frame-in-r, http://stackoverflow.com/questions/5824173/replace-a-value-in-a-data-frame-based-on-a-conditional-if-statement-in-r etc. – m-dz Apr 07 '16 at 10:31
  • When using the data.table package, I recommend its official cheat sheet: https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf – Berecht Apr 07 '16 at 12:33

3 Answers

1

replace() can be handy in this scenario. Assuming your data frame is DF and your new column is V2new:

# Create the new column V2new, replacing "Public/Private sector bank" with "financial"
DF$V2new <- replace(DF$V2, DF$V2 == "Public sector bank" | DF$V2 == "Private sector bank", "financial")
# Replace "Public/Private sector industrial" in V2new with "non-financial"
DF$V2new <- replace(DF$V2new, DF$V2new == "Public sector industrial" | DF$V2new == "Private sector industrial", "non-financial")
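
If there are more than a couple of labels to recode, a named lookup vector may be easier to maintain than chained replace() calls. The following is only a sketch in base R: the toy DF and the `lookup` vector are assumptions made for illustration, not part of the question's data.

```r
# Toy data standing in for the question's table (illustrative values only)
DF <- data.frame(
  V2 = c("Public sector bank", "Private sector bank",
         "Private sector industrial", "Public sector industrial"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: names are the original labels, values are the new groups
lookup <- c(
  "Public sector bank"        = "financial",
  "Private sector bank"       = "financial",
  "Public sector industrial"  = "non-financial",
  "Private sector industrial" = "non-financial"
)

# Index the lookup by V2; labels that are missing from the lookup become NA
DF$V2new <- unname(lookup[DF$V2])
print(DF)
```

Unlike replace(), this leaves no original label in the new column: anything not listed in `lookup` shows up as NA, which can make unexpected values easier to spot.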
0

Assuming your dataframe is called df, you could do something like:

df <- read.csv("data.csv", stringsAsFactors=FALSE)

df$newColumn[df$V2 == "Public sector bank" | df$V2 == "Private sector bank"] <- "financial"
df$newColumn[df$V2 == "Public sector industrial" | df$V2 == "Private sector industrial"] <- "non-financial"

Or, if you're sure that your V2 values contain the words "bank" and "industrial", and that's how you're deciding what to call the values in the new column, you could do this:

df$newColumn[grepl("bank", df$V2)] <- "financial"
df$newColumn[grepl("industrial", df$V2)] <- "non-financial"

The same code also works on a data.table, since a data.table is also a data.frame.
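
For a data.table specifically, the same grepl() idea can probably be written with `:=`, which adds the column by reference. This is a minimal sketch, assuming the data.table package is installed; the toy DT below is made up for illustration.

```r
library(data.table)

# Toy data.table with a V2 column like the one in the question
DT <- data.table(V2 = c("Public sector bank", "Private sector bank",
                        "Private sector industrial"))

# Filter rows in i, assign in j; rows that are not matched stay NA
DT[grepl("bank", V2), newColumn := "financial"]
DT[grepl("industrial", V2), newColumn := "non-financial"]
print(DT)
```

Rows whose V2 matches neither pattern keep NA in newColumn.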

Simon
0

If DT is your data.table:

DT[, V3 := ifelse(V2 %in% c("Public sector bank", "Private sector bank"), "Financial", "Non financial")]

It is usually good practice to normalize text fields, so you could consider:

DT[, V3 := ifelse(tolower(gsub(" ", "", V2)) %in% c("publicsectorbank", "privatesectorbank"), "Financial", "Non financial")]
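
To see what the normalization step does, here is a small base R sketch on a plain character vector. The inconsistent labels and the NA are hypothetical and only there to show the behaviour.

```r
# Hypothetical labels with inconsistent case and spacing (not from the question)
V2 <- c("Public sector bank", "PRIVATE  sector Bank",
        "Private sector industrial", NA)

# Lower-case and strip spaces so spelling variants collapse to the same key
key <- tolower(gsub(" ", "", V2))

# %in% returns FALSE for NA, so ifelse() puts NA into the "else" group
V3 <- ifelse(key %in% c("publicsectorbank", "privatesectorbank"),
             "Financial", "Non financial")
print(data.frame(V2, key, V3))
```

Note that with ifelse() everything that is not a bank, including missing values, ends up labelled "Non financial", so it may be worth checking V2 for unexpected values first.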

Hope this helps. I also recommend the data.table cheat sheet: https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf

Zahiro Mor
Gaurav Taneja