I have written a code to retrieve JSON data from an URL. It works fine. I give the start and end date and it loops through the date range and appends everything to a dataframe.
The colums are populated with the JSON data sensor
and its corresponding values, hence the column names are like sensor_1
. When I request the data from the URL it sometimes happens that there are new sensors and the old ones are switched off and deliver no data anymore and often times the length of the columns change. In that case my code just adds new columns.
What I want is instead of new columns a new header in the ongoing dataframe.
What I currently get with my code:
datetime;sensor_1;sensor_2;sensor_3;new_sensor_8;new_sensor_9;sensor_10;sensor_11;
2023-01-01;23.2;43.5;45.2;NaN;NaN;NaN;NaN;NaN;
2023-01-02;13.2;33.5;55.2;NaN;NaN;NaN;NaN;NaN;
2023-01-03;26.2;23.5;76.2;NaN;NaN;NaN;NaN;NaN;
2023-01-04;NaN;NaN;NaN;75;12;75;93;123;
2023-01-05;NaN;NaN;NaN;23;31;24;15;136;
2023-01-06;NaN;NaN;NaN;79;12;96;65;72;
What I want:
datetime;sensor_1;sensor_2;sensor_3;
2023-01-01;23.2;43.5;45.2;
2023-01-02;13.2;33.5;55.2;
2023-01-03;26.2;23.5;76.2;
datetime;new_sensor_8;new_sensor_9;sensor_10;sensor_11;
2023-01-04;75;12;75;93;123;
2023-01-05;23;31;24;15;136;
2023-01-06;79;12;96;65;72;
My loop to retrieve the data:
start_date = datetime.datetime(2023,1,1,0,0)
end_date = datetime.datetime(2023,1,6,0,0)
sensor_data = pd.DataFrame()
while start_zeit < end_zeit:
q = 'url'
r = requests.get(q)
j = json.loads(r.text)
sub_data = pd.DataFrame()
if 'result' in j:
datetime = pd.to_datetime(np.array(j['result']['data'])[:,0])
sensors = np.array(j['result']['sensors'])
data = np.array(j['result']['data'])[:,1:]
df_new = pd.DataFrame(data, index=datetime, columns=sensors)
sub_data = pd.concat([sub_data, df_new])
sensor_data = pd.concat([sensor_data, sub_data])
start_date += timedelta(days=1)