I have two dataframes:

>temp

      Var1                Freq1
1   file-upload             1
2   image-processing        1
3     mime-types            1
4       php                 5


>top 

        Var2               Freq2
1   file-upload             1
2   image-processing        1
3     mime-types            1
4       php                 5
5      upload               1
6      firefox              2
7   machine-learning        1
8     matlab                1
9        r                  2
10      c#                  7

Now I am doing:

m1 <- merge(temp, top, by.x = "Var1", by.y = "Var2", all.x = TRUE)

Then m1 will be:

       Var1               Freq1        Freq2
1   file-upload             1            1
2   image-processing        1            1
3     mime-types            1            1
4       php                 5            5

However, m1$Var1 reports 10 levels rather than 4, which causes a problem when I split m1 on the values of Var1:

x <- split(m1, m1$Var1)

length(x) is 10 rather than 4, and the 6 extra elements are empty data frames like this one:

$c#
[1] Var1 Freq1  Freq2  
<0 rows> (or 0-length row.names)
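
The behaviour described above can be reproduced with a small sketch. It assumes that temp was created by subsetting a larger data frame, so that Var1 is a factor still carrying all 10 levels; the question does not say how temp was built, so that part is a guess. stringsAsFactors = TRUE is set explicitly because R 4.0.0 and later no longer create factors by default.

```r
# Rebuild the data from the question. Var2 is made a factor explicitly.
tags <- c("file-upload", "image-processing", "mime-types", "php", "upload",
          "firefox", "machine-learning", "matlab", "r", "c#")
top <- data.frame(Var2 = tags,
                  Freq2 = c(1, 1, 1, 5, 1, 2, 1, 1, 2, 7),
                  stringsAsFactors = TRUE)

# Assumption: temp is a row subset, so its factor keeps all 10 levels.
temp <- top[1:4, ]
names(temp) <- c("Var1", "Freq1")

m1 <- merge(temp, top, by.x = "Var1", by.y = "Var2", all.x = TRUE)
nrow(m1)           # number of rows actually present
nlevels(m1$Var1)   # number of levels the factor still remembers

# split() creates one list element per level, including unused ones.
x <- split(m1, m1$Var1)
length(x)
```

With all.x = TRUE the key column of the result is taken from temp, so it inherits whatever levels temp$Var1 has, used or not.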

I want to remove these empty elements from the list. Alternatively, is there a way to merge so that the result has the same number of levels as the temp data frame?

Frank
tanay

2 Answers

You can wrap droplevels around the merge to remove unused levels:

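# stringsAsFactors = TRUE is needed in R >= 4.0.0 so that var is a factor
x <- data.frame(var = letters[1:3], freq1 = 1:3, stringsAsFactors = TRUE)
y <- data.frame(var = letters[2:4], freq2 = 2:4, stringsAsFactors = TRUE)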

merge(x,y)$var
[1] b c
Levels: a b c

droplevels(merge(x,y))$var
[1] b c
Levels: b c
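
Applied to the data in the question, the same idea might look like the sketch below. The way temp is constructed here (as a row subset of top that keeps all 10 levels) is an assumption made only to reproduce the symptom; the merge call itself is the one from the question.

```r
# Data from the question; the factor columns are created explicitly.
tags <- c("file-upload", "image-processing", "mime-types", "php", "upload",
          "firefox", "machine-learning", "matlab", "r", "c#")
top <- data.frame(Var2 = tags,
                  Freq2 = c(1, 1, 1, 5, 1, 2, 1, 1, 2, 7),
                  stringsAsFactors = TRUE)
temp <- top[1:4, ]                 # assumed origin of the extra levels
names(temp) <- c("Var1", "Freq1")

# droplevels() on a data frame drops unused levels from every factor column.
m1 <- droplevels(merge(temp, top, by.x = "Var1", by.y = "Var2", all.x = TRUE))

x <- split(m1, m1$Var1)
length(x)   # one element per remaining level
names(x)
```

Because droplevels() works on the whole data frame, there is no need to name the factor column explicitly.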
James

You can remove the empty levels by re-creating the factor, which keeps only the levels that actually occur:

m1$Var1 <- factor(m1$Var1)
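
A small sketch of why this works, using a made-up data frame df (not the one from the question) whose factor has one unused level. It also shows split() with drop = TRUE, which may be an alternative if only the split result matters and the factor itself can stay as it is.

```r
# Hypothetical example data: "c#" is a level but never appears as a value.
df <- data.frame(Var1 = factor(c("php", "r"), levels = c("c#", "php", "r")),
                 Freq1 = c(5, 2))
nlevels(df$Var1)                 # counts the unused level too

# Option 1: re-create the factor, keeping only the values that occur.
cleaned <- factor(df$Var1)
levels(cleaned)

# Option 2: leave the factor alone and let split() skip empty groups.
parts <- split(df, df$Var1, drop = TRUE)
names(parts)
```

Option 1 changes the data itself, so later calls such as table() or split() no longer see the empty levels; option 2 only affects that one split() call.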
Gavin Kelly