I have some code that outputs the following Pandas DataFrame:

   group           text             date                 from      to       dollar      flow
1  whale_alert_io  17,807 ETH       2023-05-26 18:20:23  gemini    unknown  32,501,222  outflow
1  whale_alert_io  75,000,000 USDC  2023-05-26 17:43:23  usdc      unknown  74,988,300  outflow
1  whalebotalerts  1,846 BTC        2023-05-26 17:20:38  bitfinex  unknown  49,679,510  outflow
1  whale_alert_io  1,846 BTC        2023-05-26 17:20:20  bitfinex  unknown  49,489,576  outflow
1  whale_alert_io  792,592 FXS      2023-05-26 16:01:33  unknown   frax     5,459,692   inflow
1  whalebotalerts  500 BTC          2023-05-26 15:41:03  kucoin    unknown  13,456,948  outflow
1  whale_alert_io  1,215 BTC        2023-05-26 15:34:30  gemini    unknown  32,595,669  outflow

I'm trying to figure out how I can create a bar chart where the X axis is split into 30-minute blocks (based on the 'date' column) and the Y axis is the 'dollar' amount. Dollar amounts should be aggregated within each block according to the 'flow' column: if the net aggregate is an outflow, the bar should be red; otherwise it should be green.

I tried the following so far:

# Set 'date' column as the index
df.set_index('date', inplace=True)

# Group the data into 30-minute intervals
grouped = df.groupby(pd.Grouper(freq='30Min'))

# Calculate net aggregate of 'dollar' amounts based on 'flow' column
aggregated = grouped.agg({'dollar': 'sum', 'flow': 'last'})

# Determine color for each bar based on 'flow' column
colors = ['red' if flow == 'outflow' else 'green' for flow in aggregated['flow']]

# Plot the bar chart
plt.bar(aggregated.index, aggregated['dollar'], color=colors)
plt.xlabel('Time')
plt.ylabel('Dollar Amount')
plt.title('Net Dollar Amount by 30-Minute Intervals')
plt.xticks(rotation=45)
plt.show()

Error: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

Trenton McKinney
  • What's the full Stacktrace? Which line throws the error? – Joooeey May 26 '23 at 19:46
  • Hi @Joooeey - this one grouped = df.groupby(pd.Grouper(freq='30Min')) – Francisco Jurado May 26 '23 at 19:47
  • Okay, check the type of the "date" column with `df.dtypes`. Apparently it's not a datetime (I guess it's a string). Using `df.index = pd.to_datetime(df["date"])` instead of `set_index` should do the necessary conversion for you. Since the `dtype` is crucial here, you should include a copy-able definition of the dataframe in your question as explained here: https://stackoverflow.com/a/30691921/4691830 – Joooeey May 26 '23 at 19:58

1 Answer

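The error message is telling you the problem: your index isn't a DatetimeIndex. Most likely the 'date' column still holds strings, so `set_index` gives you a plain Index that `pd.Grouper(freq=...)` can't work with.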
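A quick way to confirm and fix this, as a minimal sketch. The two-row frame below is made up from the question's data, and it assumes the dates are stored as strings, which the question doesn't actually show:

```python
import pandas as pd

# Hypothetical stand-in for the question's frame; dates assumed to be strings
df = pd.DataFrame({
    "date": ["2023-05-26 18:20:23", "2023-05-26 17:43:23"],
    "dollar": [32_501_222, 74_988_300],
})

# Inspect the column type; anything other than datetime64 will trigger the TypeError
print(df["date"].dtype)

# Convert to real datetimes first, then use the column as the index
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Grouping by 30-minute bins now works because the index is a DatetimeIndex
grouped = df.groupby(pd.Grouper(freq="30min"))["dollar"].sum()
print(grouped)
```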

EDITED

My first answer was a little/way off the mark. Here's a full solution:

import pandas as pd
import matplotlib.pyplot as plt

mydates = ['2023-05-26 18:20:23','2023-05-26 17:43:23','2023-05-26 17:20:38','2023-05-26 17:20:20','2023-05-26 16:01:33','2023-05-26 15:41:03', '2023-05-26 15:34:30']
mydollars = [32,74,50,48,5,12,33]

my_df = pd.DataFrame({'date':mydates, 'dollar':mydollars, 'flow':['out','out','out','out','in','out','out']})

# Convert the date strings to datetimes and use them as the index,
# so that pd.Grouper has a DatetimeIndex to work with
my_df.index = pd.to_datetime(my_df.date)

# Sum the dollars per 30-minute bin; 'flow' is just the last value seen in each bin
agg = my_df.groupby(pd.Grouper(freq='30min')).agg({'dollar':'sum','flow':'last'})

colors = ['red' if flow == 'out' else 'green' for flow in agg['flow']]

# Plot the bar chart
plt.bar(agg.index, agg['dollar'], color=colors)
plt.xlabel('Time')
plt.ylabel('Dollar Amount')
plt.title('Net Dollar Amount by 30-Minute Intervals')
plt.xticks(rotation=45)
plt.show()

Running this, I'm pretty sure this is not the bar chart you want. But this solves the question you posted. :)
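If what you actually want is the net flow per 30-minute block, one possible approach is to sign the dollar amounts before summing. This is only a sketch under my own assumptions: outflows count as negative and inflows as positive, 'dollar' is already numeric (strip the commas first if it isn't), and the bar height shows the magnitude of the net amount. The `signed` column and the bar width are my additions, not something from your code:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Data copied from the question; 'dollar' is assumed to be numeric already
df = pd.DataFrame({
    "date": ["2023-05-26 18:20:23", "2023-05-26 17:43:23", "2023-05-26 17:20:38",
             "2023-05-26 17:20:20", "2023-05-26 16:01:33", "2023-05-26 15:41:03",
             "2023-05-26 15:34:30"],
    "dollar": [32_501_222, 74_988_300, 49_679_510, 49_489_576,
               5_459_692, 13_456_948, 32_595_669],
    "flow": ["outflow", "outflow", "outflow", "outflow",
             "inflow", "outflow", "outflow"],
})
df["date"] = pd.to_datetime(df["date"])

# Assumption: outflows are negative, inflows are positive
df["signed"] = df["dollar"] * df["flow"].map({"inflow": 1, "outflow": -1})

# Net amount per 30-minute bin; bins with no rows sum to 0
net = df.set_index("date")["signed"].resample("30min").sum()

# Red when the bin is a net outflow, green otherwise
colors = ["red" if value < 0 else "green" for value in net]

fig, ax = plt.subplots()
# On a date axis a numeric width is measured in days: 30 minutes = 1/48 day
ax.bar(net.index, net.abs(), width=1 / 48, align="edge", color=colors)
ax.set_xlabel("Time")
ax.set_ylabel("Dollar Amount")
ax.set_title("Net Dollar Amount by 30-Minute Intervals")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
```

Call `plt.show()` afterwards to display the figure. Setting the width explicitly matters because the default bar width on a date axis is 0.8 days, which is far wider than a 30-minute bin.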

Vincent Rupp