Converting the format of a dataframe in R

Question

I am new to this forum, so apologies for any mistakes. I have a data frame (a classification of substances into classes) in the following format:

	A	B	C	D
1	Organic compounds	Benzenoids	Benzene	NA
2	Organic compounds	Benzenoids	Benzene	NA
3	Organic compounds	Organic oxygen compounds	NA	NA
4	NA	NA	NA	NA
5	Organic compounds	Benzenoids	NA	NA

In the end I need a data frame with two columns. The result should look something like this:

class	count
Organic compounds; Benzenoids; Benzene	2
Organic compounds; Organic oxygen compounds	1
Organic compounds; Benzenoids	1

What is my first step? I tried to create a new column by pasting together the contents of all the other columns, like this:

df$class <- paste(df$A,df$B,df$C,df$D ,sep = "; ")

But the result is:

class
Organic compounds; Benzenoids; Benzene; NA
Organic compounds; Benzenoids; Benzene; NA
Organic compounds; Organic oxygen compounds; NA; NA
NA; NA; NA; NA
Organic compounds; Benzenoids; NA; NA

What would be a workable approach to get the final result?

Thanks a lot!

Have a look at [this](https://stackoverflow.com/q/13673894/15573469) — saz, Apr 09 '21 at 11:14

score 0 · Accepted Answer · answered Apr 09 '21 at 11:16

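    library(plyr)  # ddply(), .() and summarize() come from plyr, not dplyr

    # Paste the columns together, then strip the "; NA" pieces left by missing values
    df$class <- gsub('; NA', '', paste(df$A, df$B, df$C, df$D, sep = "; "))
    # Rows in which every column was NA collapse to the string "NA"; drop them
    df <- df[df$class != 'NA', ]

    # Count how often each class string occurs
    df <- ddply(df, .(class), summarize, count = length(class))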
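
One caveat with the `gsub` approach: it deletes every `; NA` substring, so a class whose name starts with "NA" would be truncated as well. A minimal base R sketch that avoids this by dropping missing values before pasting is shown below. The sample data is reconstructed from the question, the columns are assumed to be character vectors, and `result` is just an illustrative name:

```r
# Sample data reconstructed from the question; NA marks a missing level.
df <- data.frame(
  A = c("Organic compounds", "Organic compounds", "Organic compounds", NA, "Organic compounds"),
  B = c("Benzenoids", "Benzenoids", "Organic oxygen compounds", NA, "Benzenoids"),
  C = c("Benzene", "Benzene", NA, NA, NA),
  D = NA_character_,
  stringsAsFactors = FALSE
)

# Join only the non-missing values of each row, so NA never turns into the text "NA".
df$class <- apply(df[, c("A", "B", "C", "D")], 1,
                  function(x) paste(x[!is.na(x)], collapse = "; "))

# Rows where every level was missing end up as "", so drop them.
df <- df[df$class != "", ]

# Count each distinct class string.
result <- as.data.frame(table(class = df$class),
                        responseName = "count", stringsAsFactors = FALSE)
print(result)
```

This keeps everything in base R, so no extra package is needed.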

Ashish Baid


score 0 · Answer 2 · answered Apr 09 '21 at 11:51

Does this work?

library(dplyr)
library(tidyr)    # replace_na()
library(stringr)  # str_remove()
# as.character() guards against an all-NA column (such as D) being stored as logical
df %>% mutate(across(everything(), ~ replace_na(as.character(.), ''))) %>%
   mutate(class = str_remove(paste(A, B, C, D, sep = ';'), ';+$')) %>%
   count(class, name = 'count') %>% filter(class != '')
# A tibble: 3 x 2
  class                                      count
  <chr>                                      <int>
1 Organic compounds;Benzenoids                   1
2 Organic compounds;Benzenoids;Benzene           2
3 Organic compounds;Organic oxygen compounds     1
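
For comparison, `tidyr::unite()` has an `na.rm` argument that skips missing values while pasting, which might save the replace-and-trim steps. A rough sketch, assuming a reasonably recent tidyr and sample data reconstructed from the question (`result` is an illustrative name):

```r
library(dplyr)
library(tidyr)

# Sample data reconstructed from the question; NA marks a missing level.
df <- data.frame(
  A = c("Organic compounds", "Organic compounds", "Organic compounds", NA, "Organic compounds"),
  B = c("Benzenoids", "Benzenoids", "Organic oxygen compounds", NA, "Benzenoids"),
  C = c("Benzene", "Benzene", NA, NA, NA),
  D = NA_character_,
  stringsAsFactors = FALSE
)

result <- df %>%
  # Paste A..D into one column, leaving out missing values.
  unite("class", A, B, C, D, sep = "; ", na.rm = TRUE) %>%
  # A row with no values at all becomes an empty string; drop it.
  filter(class != "") %>%
  # Count each distinct class string.
  count(class, name = "count")

print(result)
```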

Thank you for your help! The solution from @Ashish Baid is a bit better for me to understand. — Flow91, Apr 09 '21 at 13:08
