I'm new to R and I'm trying to join two tables. The field shared between the two tables is the date, but when I import the data it arrives with a different structure in each table.

First Table: [image]

Second Table: [image]

What I actually need is to join the data by operating system and exclude Linux, like an inner join in SQL with a condition on the operating system. Thanks.

user3600910
    Do not post your data as an image. Please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) – Jaap Nov 15 '15 at 07:53

2 Answers

Say that your first dataset is called df1 and the second one df2, you can join the two by calling:

merge(df1, df2, by = "operatingSystem")

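By default merge() performs an inner join. You can request other kinds of join with all = TRUE (full outer join), all.x = TRUE (left join, keeping every row of df1), or all.y = TRUE (right join, keeping every row of df2).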
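As a rough illustration of those options, here is a minimal sketch in base R. The data frames and their values are hypothetical (the original tables were only posted as images), and the column names newusers and users are assumed from the discussion in this thread.

```r
# Hypothetical toy data; the real tables were not available as text.
df1 <- data.frame(
  operatingSystem = c("Windows", "Macintosh", "Linux"),
  newusers = c(10, 5, 2),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  operatingSystem = c("Windows", "Macintosh"),
  users = c(100, 50),
  stringsAsFactors = FALSE
)

# Inner join (the default): only operating systems present in both tables.
inner <- merge(df1, df2, by = "operatingSystem")

# Left join: every row of df1, with NA where df2 has no match.
left <- merge(df1, df2, by = "operatingSystem", all.x = TRUE)

# Full outer join: every row of both tables.
full <- merge(df1, df2, by = "operatingSystem", all = TRUE)

# An explicit condition on the operating system, independent of the join type.
left_no_linux <- left[left$operatingSystem != "Linux", ]
```

In this toy data Linux drops out of the inner join only because it is missing from df2. If Linux appeared in both tables, the explicit filter in the last line would be needed to exclude it.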

David

I am a bit too lazy to reproduce your example, but I will give it a go as is.

First, in your second table, you need to convert the date column to an actual date.

You can do this easily with lubridate.

Assuming df1 and df2 are the first and second tables respectively:

library(lubridate)
df2$date <- ymd(df2$date)  #ymd function assumes `year` then `month` then `day` when converting

Then you can use dplyr's inner_join to perform the desired join.

From STAT 545:

inner_join(x, y): Return all rows from x where there are matching values in y, and all columns from x and y. If there are multiple matches between x and y, all combination of the matches are returned.

library(dplyr)
inner_join(df1, df2, by = c("date", "operatingSystem"))

This keeps only the rows in df1 that have a match in df2, so Linux stays out. It also keeps the newusers column and brings in df2$users; if both tables have a users column, dplyr disambiguates the two with suffixes (users.x and users.y by default).

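Note: You might need to convert df1$date to a Date object as well, for example with lubridate::date(df1$date), so that the date columns in both tables have the same type before joining.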
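Putting the steps together, a minimal end-to-end sketch might look like the following. The data are hypothetical, and the date formats are assumptions (a Date column in df1, "yyyymmdd" strings in df2), since the original tables were only shown as images. Here Linux is present in both tables, so an explicit filter is added to exclude it.

```r
library(dplyr)
library(lubridate)

# Hypothetical data; the date formats below are assumptions.
df1 <- data.frame(
  date = as.Date(c("2015-11-01", "2015-11-01", "2015-11-01")),
  operatingSystem = c("Windows", "Macintosh", "Linux"),
  newusers = c(10, 5, 2),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  date = c("20151101", "20151101", "20151101"),
  operatingSystem = c("Windows", "Macintosh", "Linux"),
  users = c(100, 50, 20),
  stringsAsFactors = FALSE
)

# Parse the "yyyymmdd" strings so both date columns are of class Date.
df2$date <- ymd(df2$date)

# Inner join on both keys, then drop Linux explicitly.
joined <- inner_join(df1, df2, by = c("date", "operatingSystem")) %>%
  filter(operatingSystem != "Linux")
```

The filter plays the role of the SQL WHERE condition on the operating system; the join itself only removes rows that have no match in the other table.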

Lefkios Paikousis