1

I have 2 files. One is a time_file which has 3000 rows and the other is userid file which has 2000 rows. I want to merge the two, so that each row (ID) in the userid file is paired with the full data from each row of the time_file.

Rows 1-3000 would show the first userid with each of the dates.

Rows 3001-6000 would show the 2nd userid with each of the dates, and so on.

Thanks in advance!

Time file

    mo  day year    date
    11  1   2015    11/1/2015
    11  2   2015    11/2/2015
    11  3   2015    11/3/2015
    11  4   2015    11/4/2015
    11  5   2015    11/5/2015
    .
    .

userid file

userid
154
155
157
158
159
160
.
.

Ideal format(what I want)

    mo  day year    date        userid
    11  1   2015    11/1/2015   154
    11  2   2015    11/2/2015   154
    11  3   2015    11/3/2015   154
    11  4   2015    11/4/2015   154
    11  5   2015    11/5/2015   154
    .
    .
    3   28  2017    3/28/2017   154
    3   29  2017    3/29/2017   154
    3   30  2017    3/30/2017   154
    3   31  2017    3/31/2017   154
    11  1   2015    11/1/2015   155
    11  2   2015    11/2/2015   155
    11  3   2015    11/3/2015   155
    11  4   2015    11/4/2015   155
    11  5   2015    11/5/2015   155
    11  6   2015    11/6/2015   155
eli-k
  • 10,898
  • 11
  • 40
  • 44
Hayden
  • 11
  • 2
  • 1
    Please share sample of your data using `dput()` (not `str` or `head` or picture/screenshot) so others can help. See more here https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 – Tung Sep 19 '18 at 21:52
  • @Tung : I've attached the screenshot! but will try to using dput() or another way to share the data sample. – Hayden Sep 19 '18 at 22:24
  • The SPSS solution is through `STATS CARTPROD` command. will post some code if the question is reopened. – eli-k Sep 20 '18 at 05:37

2 Answers2

1

Easiest solution in R I can think of, assuming you've gotten your time data in a data frame and your user data in a vector:

final_df <- cbind(date_df, "userid" = rep(user, each = 3000))

This will repeat each user_id 3000 times, then bind the user_id column to the date data frame.

Emma Clarke
  • 301
  • 1
  • 4
0

In SPSS you can use the cartesian product function for this:

First this recreates your example data:

data list free/mo  day year (3f4)   date (a12).
begin data.
11  1   2015    11/1/2015
11  2   2015    11/2/2015
11  3   2015    11/3/2015
11  4   2015    11/4/2015
11  5   2015    11/5/2015
end data.
DATASET NAME time_file.

data list free/ userid.
begin data.
154,155,157,158,159,160
end data.
DATASET NAME userid.

This will now combine the two tables like you requested:

STATS CARTPROD VAR1=userid INPUT2=time_file VAR2=mo  day year date 
/SAVE OUTFILE="path\your combined data.sav".
eli-k
  • 10,898
  • 11
  • 40
  • 44