I have a data frame with 10 columns. Three sample rows (columns shown separated by `~`) look like this:

ROMANIA ~ ROMANIA ~ ROMANIA ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0
SWITZERLAND ~ RUSSIAN FEDERATION ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0  
INDIA ~ 0 ~ 0~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 

and many more rows.

I want to remove everything from the first occurrence of zero onward in each row, so that the final output looks like:

ROMANIA ~ ROMANIA ~ ROMANIA
SWITZERLAND ~ RUSSIAN FEDERATION
INDIA
zx8754
Romil
  • Are `~` characters actually in the dataframe column? Or are you trying to show different columns? Is it one column or 10 columns? – zx8754 Jul 11 '18 at 12:29
  • Please make your input data [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – zx8754 Jul 11 '18 at 12:30
  • I have shown 3 rows of a data frame that has 10 columns. – Romil Jul 11 '18 at 12:36
  • @Romil I misinterpreted your data, assuming it was strings. What do you want as the replacement for the 0? If NA, then just use: `df[df == 0] <- NA` – Wimpel Jul 11 '18 at 12:45

3 Answers

Assuming each row is a single string, use `gsub` to replace everything from the first occurrence of " ~ 0" onward (including that " ~ 0" itself) with "" (the empty string):

v <- c("ROMANIA ~ ROMANIA ~ ROMANIA ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0",
       "SWITZERLAND ~ RUSSIAN FEDERATION ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0",
       "INDIA ~ 0 ~ 0~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0" )

gsub(" ~ 0.*", "", v)

#[1] "ROMANIA ~ ROMANIA ~ ROMANIA"      "SWITZERLAND ~ RUSSIAN FEDERATION" "INDIA"    
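
If the data is really a data frame with one value per column (as the asker clarified in the comments), the same pattern can still be applied after collapsing each row into a single string. This is only a minimal sketch: the 3-column `df` below is a hypothetical stand-in for the real 10-column data, and `row_strings` and `trimmed` are illustrative names.

```r
# Hypothetical 3-column stand-in for the asker's 10-column data frame
df <- data.frame(
  V1 = c("ROMANIA", "SWITZERLAND", "INDIA"),
  V2 = c("ROMANIA", "RUSSIAN FEDERATION", "0"),
  V3 = c("ROMANIA", "0", "0"),
  stringsAsFactors = FALSE
)

# Collapse each row into one " ~ "-separated string
row_strings <- do.call(paste, c(df, sep = " ~ "))

# Drop everything from the first " ~ 0" onward, using the same pattern as above
trimmed <- gsub(" ~ 0.*", "", row_strings)
print(trimmed)
```

Note that a row whose first column is already 0 would keep that leading "0", because the pattern requires a preceding " ~ ".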
DJV
Wimpel

data:

library(magrittr)
df <- data.table::fread("
ROMANIA ~ ROMANIA ~ ROMANIA ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0
SWITZERLAND ~ RUSSIAN FEDERATION ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0  
                  INDIA ~ 0 ~ 0~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0 ~ 0", header = FALSE, sep = "~") %>% as.data.frame
#            V1                 V2      V3 V4 V5 V6 V7 V8 V9 V10
# 1     ROMANIA            ROMANIA ROMANIA  0  0  0  0  0  0   0
# 2 SWITZERLAND RUSSIAN FEDERATION       0  0  0  0  0  0  0   0
# 3       INDIA                  0       0  0  0  0  0  0  0   0

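code (keeps only the columns that are not entirely zero; `as.numeric` warns about NAs introduced by coercion for the text columns):

df[, sapply(df, function(x) as.numeric(x) %>% {sum(. == 0, na.rm = TRUE) != length(x)})]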

result:

#           V1                 V2      V3
#1     ROMANIA            ROMANIA ROMANIA
#2 SWITZERLAND RUSSIAN FEDERATION       0
#3       INDIA                  0       0
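
The result above still contains zeros in V2 and V3, because only the all-zero columns are dropped. To blank out everything from the first zero onward in each row, one option is a row-wise mask. The following is a minimal base-R sketch, assuming all columns are stored as character and that NA is an acceptable replacement (the question confirms neither); `mask_after_first_zero` is a hypothetical helper name and the 4-column `df` is a shortened stand-in for the real data.

```r
# Hypothetical 4-column stand-in for the asker's 10-column data frame
df <- data.frame(
  V1 = c("ROMANIA", "SWITZERLAND", "INDIA"),
  V2 = c("ROMANIA", "RUSSIAN FEDERATION", "0"),
  V3 = c("ROMANIA", "0", "0"),
  V4 = c("0", "0", "0"),
  stringsAsFactors = FALSE
)

mask_after_first_zero <- function(d) {
  m <- as.matrix(d)            # character matrix (assumes character columns)
  is_zero <- m == "0"
  # The running count of zeros along a row is > 0 from the first zero onward
  seen_zero <- t(apply(is_zero, 1, cumsum)) > 0
  m[seen_zero] <- NA
  as.data.frame(m, stringsAsFactors = FALSE)
}

out <- mask_after_first_zero(df)
print(out)
```

Using NA rather than deleting cells keeps the data frame rectangular; rows of different lengths would need a list instead.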
Andre Elrico

Since the sample data was not provided in a reproducible form, I could not test this fully, but try the following:

as.data.frame(lapply(df, function(y) gsub("\\s*~ 0.*", "", y)))  # \\s* also drops the space left before the "~"
RavinderSingh13