If I have a data frame df in R:

area  industry  value
1     31-33     6
2     44-45     1023
3     48-49     8

How would I replace 31-33 with 31, 44-45 with 44, and 48-49 with 48? Every example on this site that I've tried hasn't worked. My latest try was

levels(df$industry)[levels(df$industry)=="31-33"] <- "31"

But like everything else I've tried, once I actually write the data to a txt file and import it into SQL it appears as a null.

Keep in mind that there are more columns and an enormous number of rows with more industry codes besides these three, but these three are the only ones that need to be changed. Thanks.

J.Jack
  • `df$industry <- sub("-.*", "", as.character(df$industry))` Alternatively, you can do it on the levels of the factor. – jogo Mar 22 '17 at 15:13
  • I'll try this and get back to you jogo, thanks. – J.Jack Mar 22 '17 at 15:14
  • [This is better](http://stackoverflow.com/questions/25307899/r-remove-anything-after-comma-from-column) ... – Sotos Mar 22 '17 at 15:18
  • That worked, thank you very much! So this is replacing everything after the dash with nothing, correct? Then putting it back into the column as a character? – J.Jack Mar 22 '17 at 15:18
  • @J.Jack During "putting it back" the old column is substituted by the new one, eventually it is coerced to a factor (as part of the dataframe). – jogo Mar 22 '17 at 15:22
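
For reference, the two variants mentioned in the comments above can be sketched on a toy copy of the data. The values are the three rows from the question; `stringsAsFactors = TRUE` is an assumption that `industry` is stored as a factor, and `industry_chr` / `industry_fct` are names made up for this illustration.

```r
# Toy data frame rebuilt from the question; industry is assumed to be a factor.
df <- data.frame(
  area = 1:3,
  industry = c("31-33", "44-45", "48-49"),
  value = c(6, 1023, 8),
  stringsAsFactors = TRUE
)

# Variant 1: convert to character, then drop the dash and everything after it.
industry_chr <- sub("-.*", "", as.character(df$industry))

# Variant 2: apply the same substitution to the factor levels, so the
# result stays a factor.
industry_fct <- df$industry
levels(industry_fct) <- sub("-.*", "", levels(industry_fct))
```

Note that the pattern `"-.*"` touches every code that contains a dash, so this only fits if the three ranges are the only dashed codes in the column.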

1 Answer

You can try the sub function, like this:

df$industry <- sub("31-33", "31", df$industry)
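
If only these three codes should change, one possible extension is a small lookup vector, so all three replacements happen in one step and every other code is left alone. This is only a sketch: the toy data frame, the extra code "52" and the name `code_map` are made up for illustration and are not from the question.

```r
# Toy data: the three rows from the question plus one hypothetical
# code ("52") that must stay unchanged.
df <- data.frame(
  area = 1:4,
  industry = c("31-33", "44-45", "48-49", "52"),
  value = c(6, 1023, 8, 10),
  stringsAsFactors = FALSE
)

# Lookup vector: names are the old codes, values are the replacements.
code_map <- c("31-33" = "31", "44-45" = "44", "48-49" = "48")

# Work on a character copy so this also behaves if industry is a factor.
industry <- as.character(df$industry)

# Replace only the entries that appear in the lookup.
hit <- industry %in% names(code_map)
industry[hit] <- unname(code_map[industry[hit]])

df$industry <- industry
```

After this, `df$industry` is a plain character column.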

Liun