
I'm having a problem merging two data frames in R.

The first one has 103731 obs. of 6 variables. The variable I need to merge on has 77111 unique values; the remaining entries are NAs, stored with a value of 0. The second one holds the frequency of each of those values plus the frequency of the NAs, so it is a frame of 77112 obs. of 2 variables.

The result I need is the first frame joined with the frequency for the merging variable: a df of 103731 obs. with the frequency attached to each value of the merging variable (so repeated on every duplicate row if freq > 1, and also on each NA (or 0) row).

Can anybody help me?

The result I'm getting now is a data frame of 1,894,919 obs. I used:

tot <- merge(df1, df2, by = "mergingVar", all = FALSE, sort = FALSE)

I also tried the different 'all=' settings, and none of them gave the right df.

A5C1D2H2I1M1N2O1R2T1

1 Answer


Why don't you just take the frequency table of your first table?

a <- data.frame(a = c(NA, NA, 2,2,3,3,3))
data.frame(table(a, useNA = 'ifany'))

     a Freq
1    2    2
2    3    3
3 <NA>    2
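
If you want those counts attached to every original row without a merge, one base R option is to look each key up in the table by name. This is only a minimal sketch on the same toy data; the column name `freq` and the helper variables `key` and `tab` are illustrative, and the name-based lookup assumes the keys print the same way `table()` labels them (true for whole numbers and strings, not guaranteed for arbitrary doubles).

```r
# Toy data from above: two NAs, two 2s, three 3s.
a <- data.frame(a = c(NA, NA, 2, 2, 3, 3, 3))

key <- a$a
tab <- table(key, useNA = "ifany")   # counts per value, NA included

# match() pairs each row's key with the table entry of the same name;
# NA keys match the NA entry of the table.
a$freq <- as.integer(tab[match(as.character(key), names(tab))])

a
```

The row count and row order of `a` stay exactly as they were, which is what the question asks for.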

Or use mutate with ddply from plyr:

library(plyr)
ddply(a, .(a), mutate, freq = length(a))

   a freq
1  2    2
2  2    2
3  3    3
4  3    3
5  3    3
6 NA    2
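
If you would rather keep the merge() from the question, it may help to know that merge() returns one row per matching pair of keys, so a lookup table with repeated keys multiplies rows. That is one possible cause of the 1,894,919 obs., though I can't confirm it without seeing the data. The sketch below uses hypothetical stand-ins for `df1` and `df2` (with 0 marking the missing key) and checks the lookup for duplicates before merging.

```r
# Hypothetical stand-ins for the question's frames; 0 marks a missing key here.
df1 <- data.frame(mergingVar = c(0, 0, 2, 2, 3, 3, 3),
                  other = letters[1:7])

# Build the lookup from df1 itself so that each key appears exactly once.
df2 <- as.data.frame(table(mergingVar = df1$mergingVar),
                     responseName = "freq", stringsAsFactors = FALSE)
df2$mergingVar <- as.numeric(df2$mergingVar)  # table() returns the keys as text

# Repeated keys in the lookup are what multiply rows in a merge.
stopifnot(anyDuplicated(df2$mergingVar) == 0)

# Left join: keep every row of df1 and attach its frequency.
tot <- merge(df1, df2, by = "mergingVar", all.x = TRUE, sort = FALSE)
nrow(tot) == nrow(df1)
```

Running `anyDuplicated()` on the key column of your real `df2` should tell you quickly whether duplicated keys are the problem.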
7 NA    2
Paulo E. Cardoso