0

I need help to get some json data from Pipedrive API(just a resquest.get), however, as expected, the json file is not formatted at all. As you can see in my python script bellow, I'll upload that file to my S3 Bucket, so I can create tables in AWS Glue. I'm having a lot of trouble to make this json file "readable" in AWS structure.

  • Sample of json file I'm working: [I need the "deals" section]

{"success":true,"data":[{"period_start":"2020-11-12 00:00:00","period_end":"2020-11-12 23:59:59","deals":[{"id":xx,"creator_user_id":xx,"user_id":XX,"person_id":XX,"org_id":XX,"stage_id":X}]}]}

  • Python script:
from datetime import datetime
import boto3
import requests
from jsonpath_ng import parse
import pandas as pd

now = datetime.now()
day_int = now.strftime("%d")
month_int = now.strftime("%m")
year_int = now.strftime("%Y")
yesterday = (int(day_int) - 1)
if yesterday == 0:
    yesterday = 31
#today = now.strftime("%Y" + "-" + "%m" + "-" + str(yesterday))
today = '2020-11-12'

response_01 = requests.get(
    'https://api.pipedrive.com/v1/deals/timeline?start_date=' + str(today) + '&interval=day&amount=' + str(
        days) + '&field_key=update_time&totals_convert_currency=' + currency + '&api_token=' + token_api)
raw_data = response_01.json()
x = ([match.value for match in parse('$..deals.[*].[*]').find(raw_data)])
y = json.dumps(x, default= str)

s3 = boto3.resource(
    's3',
    region_name=data_region,
    aws_access_key_id=key_id,
    aws_secret_access_key=access_key
)
client = boto3.client('s3')
client.put_object(Body=str(raw_data), Bucket='bucket_key', Key='object_key') ```

m_n_r
  • 21
  • 2
  • Yesterday's date is easier to calculate than you do it. Use `from datetime import date, timedelta` and then `yesterday = date.today() - timedelta(days=1)`. – Tomalak Nov 16 '20 at 17:13
  • Did you check this https://stackoverflow.com/questions/23306653/python-accessing-nested-json-data. According to that link, you should try accessing raw_data['data'][0]['deals']. – Hema Jayachandran Nov 16 '20 at 17:01
  • Thanks a lot guys, I end up doing a conversion to a Dataframe, then a CSV file thru this: – m_n_r Dec 01 '20 at 11:57
  • raw_data2 = raw_data_r['data'] list_data = pandas.json_normalize(raw_data2) csv_data = list_data.to_csv(index=False, header=True, sep=',', quoting=csv.QUOTE_ALL) – m_n_r Dec 01 '20 at 11:58

0 Answers0