Here is a sample of my data:

df<-read.table (text="ID    Name    Surname Colour  A1  A2  A3  Flow1   Day1    M1  M2  M3  Flow2   Day2    P1  P2  P3  Flow3   Day3
12  John    Smith   A   NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  N
12  John    Smith   B   NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  N
12  John    Smith   M   4   4   4   A   N   4   3   3   B   Y   2   3   2   Q   N
12  John    Smith   N   2   3   3   D   N   3   1   2   G   Y   3   3   2   R   N
22  Rose    Billy   OM  3   3   3   C   N   3   3   3   O   Y   3   4   4   G   N
22  Rose    Billy   OZ  4   4   4   F   N   4   4   4   P   N   5   5   5   G   N
22  Rose    Billy   QR  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
22  Rose    Billy   QP  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA

", header=TRUE)

I want to get this outcome:

 

   out<-read.table (text="ID    Name    Surname Colour  A1  A2  A3  Flow1   Day1    M1  M2  M3  Flow2   Day2    P1  P2  P3  Flow3   Day3
    12  John    Smith   M   4   4   4   A   N   4   3   3   B   Y   2   3   2   Q   N
    12  John    Smith   N   2   3   3   D   N   3   1   2   G   Y   3   3   2   R   N
    22  Rose    Billy   OM  3   3   3   C   N   3   3   3   O   Y   3   4   4   G   N
    22  Rose    Billy   OZ  4   4   4   F   N   4   4   4   P   N   5   5   5   G   N

    ", header=TRUE)

As you can see, I want to keep only the rows (colours) that have data and drop the rows that are entirely NA, which reduces my data set.

user330

2 Answers

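We may use `across()` inside `filter()`. Based on the input data and expected output, the goal appears to be dropping the rows with missing values in columns 'A1' to 'Day3'; the code below keeps only the rows where every one of those columns is non-NA.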

library(dplyr)
df %>%    
   filter(across(A1:Day3,  complete.cases))

Output:

  ID Name Surname Colour A1 A2 A3 Flow1 Day1 M1 M2 M3 Flow2 Day2 P1 P2 P3 Flow3 Day3
1 12 John   Smith      M  4  4  4     A    N  4  3  3     B    Y  2  3  2     Q    N
2 12 John   Smith      N  2  3  3     D    N  3  1  2     G    Y  3  3  2     R    N
3 22 Rose   Billy     OM  3  3  3     C    N  3  3  3     O    Y  3  4  4     G    N
4 22 Rose   Billy     OZ  4  4  4     F    N  4  4  4     P    N  5  5  5     G    N
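
As far as I know, newer dplyr releases (1.0.8 onwards) deprecate `across()` inside `filter()` in favour of `if_all()` / `if_any()`. Below is a minimal sketch of the same filter written that way, assuming dplyr >= 1.0.4. The data frame `df_small` is a hypothetical cut-down version of the question's data, and selecting `A1:last_col()` instead of `A1:Day3` is my assumption about how to make the filter follow extra columns (for example up to Day6) without editing the code.

```r
library(dplyr)

# Hypothetical cut-down version of the question's data (fewer columns)
df_small <- read.table(text = "ID Name Colour A1 Flow1 Day1
12 John A NA NA NA
12 John M 4 A N
22 Rose OM 3 C N
22 Rose QR NA NA NA", header = TRUE)

# Keep rows where every column from A1 to the last column is non-missing
res <- df_small %>%
  filter(if_all(A1:last_col(), ~ !is.na(.x)))

res$Colour
```

Note that `if_all()` requires all selected columns to be non-NA; swapping in `if_any()` would instead keep rows with at least one non-NA value, which may or may not be what is wanted.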
akrun
  • Thank you, but when I continue the data for other colours, I do not get that data. Can we do it? – user330 Nov 18 '21 at 19:53
  • Can we do it for columns? For example, if I have data until Day6 – user330 Nov 18 '21 at 19:56
  • Sorry. If you look at Colour, you will see the other colours QR, Q and QP. I want the outcome to include the data for these colours when it is available in other columns. Your help is appreciated – user330 Nov 18 '21 at 20:12
  • @user330 isn't this output the same as your expected `out`? – akrun Nov 18 '21 at 20:13
  • Thanks, it does work for these data. But when I have more similar data, say with Day6 as the last column, the code does not work – user330 Nov 18 '21 at 20:14
  • @user330 I didn't understand the logic based on colors QR, Q, QP which you didn't explain in your post. – akrun Nov 18 '21 at 20:15
  • i.e. where is the QR matching in other columns? I see only a single letter in Flow columns – akrun Nov 18 '21 at 20:16
  • In the Colour column there are 8 colours; with your code only M, N, OM and OZ appear, which is good. I want to see the other colours A, B, QR and QP when I have other similar columns. – user330 Nov 18 '21 at 20:26
  • @user330 can you update your post with an example that doesn't work with the code along with expected output so that it becomes more clear – akrun Nov 18 '21 at 20:29
  • @user330 sure, will check that. thanks – akrun Nov 19 '21 at 17:12
  • @user330 If I understand the pattern, the rows where some columns have only NA should be grouped together, whereas the rows with no NA are handled in a different way – akrun Nov 19 '21 at 17:33
We could use `na.omit()`, which drops every row that contains at least one NA:

library(dplyr)
df %>% 
  na.omit()

  ID Name Surname Colour A1 A2 A3 Flow1 Day1 M1 M2 M3 Flow2 Day2 P1 P2 P3 Flow3 Day3
3 12 John   Smith      M  4  4  4     A    N  4  3  3     B    Y  2  3  2     Q    N
4 12 John   Smith      N  2  3  3     D    N  3  1  2     G    Y  3  3  2     R    N
5 22 Rose   Billy     OM  3  3  3     C    N  3  3  3     O    Y  3  4  4     G    N
6 22 Rose   Billy     OZ  4  4  4     F    N  4  4  4     P    N  5  5  5     G    N
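
One thing to keep in mind is that `na.omit()` checks every column, including the identifier columns. Here is a small base R sketch of restricting the check to the measurement columns with `complete.cases()`. The data frame `df_small` (with a deliberately missing Name) and the helper variable `measure_cols` are hypothetical, made up only for illustration.

```r
# Hypothetical cut-down data; note the missing Name in the second row
df_small <- read.table(text = "ID Name Colour A1 Flow1 Day1
12 John A NA NA NA
12 NA M 4 A N
22 Rose OM 3 C N
22 Rose QR NA NA NA", header = TRUE)

# na.omit() looks at all columns, so the row with a missing Name is dropped too
n_omit <- nrow(na.omit(df_small))

# Restrict the check to the measurement columns (everything from A1 onwards)
measure_cols <- which(names(df_small) == "A1"):ncol(df_small)
kept <- df_small[complete.cases(df_small[, measure_cols]), ]

n_omit
kept$Colour
```

With the question's sample data the two approaches give the same rows, since the identifier columns there have no missing values.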
TarJae
  • Why does this not work when I continue it until Day 6? Any help? – user330 Nov 18 '21 at 20:03
  • If your data has the same structure, it should also work with any additional Day columns. Maybe you could `dput` the new data (until Day 6). – TarJae Nov 18 '21 at 20:07
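
The extended data was never posted, so the following is only a guess at what might be going on. If the additional column blocks are filled for different colours than the first block, then almost every row contains some NA and `na.omit()` removes it. The sketch below uses a hypothetical data frame `df_ext` (invented for illustration, not the asker's real data) to show how to inspect the gaps, keep rows with at least one non-missing measurement, and produce `dput()` output for sharing.

```r
# Hypothetical extended data: the second block is filled for other colours
df_ext <- read.table(text = "ID Colour A1 Day1 M1 Day2
12 A NA NA 2 Y
12 M 4 N NA NA
22 OM 3 N 3 Y
22 QR NA NA NA NA", header = TRUE)

# Only the OM row is complete, so na.omit() keeps a single row
n_omit <- nrow(na.omit(df_ext))

# Count missing values per column to see where the gaps are
na_counts <- colSums(is.na(df_ext))

# Keep rows that have at least one non-missing measurement instead
measure <- df_ext[, setdiff(names(df_ext), c("ID", "Colour"))]
kept <- df_ext[rowSums(!is.na(measure)) > 0, ]
kept$Colour

# Print a copy-pasteable definition of the data for sharing
dput(df_ext)
```

Whether "at least one non-missing value" is the right rule depends on the expected output for the extended data, which is why posting it with `dput()` would help.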