1

I am a newbie in Python, I appreciate all help. I want a list of activities of a group, who purchased for 4$ this way: purchase_date(1,1,0,1,1,0,1) where purchase_date is the date of the purchase, and the arrays indeces+1 gives the days after the purchases. 1 means active day, 0 means not active day. E.g. 20190203(1,1,1,0,0,0,1) means the purchase was on 2019-02-03, and the user was active after that on 4th, 5th, 6th and 10th of February.

I tried the code below. Steps:

  1. Created a datatable with the purchases: four_dollar_buyers(user_pseudo_id,purchase_date). Queried it and loaded the result into the four_dollar_purchases list.
  2. Iterated over on four_dollar purchases
  3. Made 2 helper arrays: seven_days_date contains the dates after the purchases seven_days_number should contain ones and zeros (active or not in given day)
  4. Iterated over the seven_days_date, made a query from datatables of the given date, what gave the id-s of the active users on that day. Load the result of the query into a list named 'actives'
  5. If the user_id of the given purchase is in actives, then the seven_days_number array should change from 0 to 1 on the given index.
client = bigquery.Client(project="project")

QUERY = ('SELECT * FROM `project.four_dollar_buyers`')
query_job = client.query(QUERY)                             
four_dollar_purchases = list(query_job.result())                        

for row in four_dollar_purchases:                                       

  seven_days_date = ["","","","","","",""]                          
  seven_days_number = [0,0,0,0,0,0,0]                                   

  for i in range(7):
    date_time_obj = datetime.strptime(row[1], '%Y%m%d')                 
    date_time_obj = date_time_obj + timedelta(days=1)+timedelta(days=i)         
    seven_days_date[i] = date_time_obj.strftime("%Y%m%d")                   

  for idx, days in enumerate(seven_days_date):

    QUERY = ('''SELECT DISTINCT user_pseudo_id FROM 
    `project.events_'''+days+'''` WHERE event_name IN 
    ("activity_added")''')
    query_job = client.query(QUERY)
    actives = list(query_job.result())                          


  if row[0] in actives:                                 
    seven_days_number[idx] = 1                              


  print(row[1] + str(seven_days_number))

There is no error message anymore, but all result is like this 20181212(0,0,0,0,0,0,0). So for some reason the helper array does not change, after the purchase date it gives only zeros. I checked the variables row[0] and actives with pprint and both of them contains the right result.

Akos
  • 49
  • 2
  • 8

2 Answers2

0

Days is not an integer type as ralaxpy has suggested. So, you can use enumeration or something else in order to modify the list using the index.

mj1261829
  • 1,200
  • 3
  • 26
  • 53
  • Yeah, I tried it as edited the code in the question. No changes.:/ – Akos Jul 17 '19 at 12:02
  • Is row[0] in actives true? Also idx must be a value between 0 and len(seven_days_number), otherwise you wont notice any changes – mj1261829 Jul 18 '19 at 10:35
0
query_job = client.query(QUERY)
actives = list(query_job.result())

for dict in actives:
  if dict[0] == row[0]:
    seven_days_number[idx] = 1

print(row[1] + str(seven_days_number))
Magyar Miklós
  • 4,182
  • 2
  • 24
  • 42
  • 1
    Please explain why this is the appropriate solution to the question. The goal is to do more than toss out code, it's to educate so the OP knows what to do when a similar situation occurs. – the Tin Man Jul 21 '19 at 01:34
  • Thx, that's the solution. Disclaimer for other readers: Incmiko realized that my code compared list elements to a list of dictionaries, because the result of the query was a dict from which I made a list after. – Akos Jul 21 '19 at 10:55