Here is the use case I am trying to solve:

I have a DataFrame with the following columns:

  • Name
  • Date
  • SubscriptionID
  • Sku
  • Type (sale or refund)

What I am trying to do is to loop through the entire dataset sorted by Date ascending.

Once sorted, the first occurrence of each (SubscriptionID, Sku) pair should get a new value, say interval_value, of 1. If the same pair comes up again while looping, increment the value by 1 if the record is a sale, or decrement it by 1 if it is a refund.
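To make the rule concrete, this is roughly the line-by-line version I have in mind. It is only a sketch: the sample rows and the function name `interval_by_loop` are made up for illustration, and it assumes Type holds exactly the lowercase strings "sale" and "refund".

```python
import pandas as pd

# Made-up sample rows; the column names follow the list above.
df_loop = pd.DataFrame({
    "Name": ["n1", "n1", "n1", "n1"],
    "Date": pd.to_datetime(["2021-01-05", "2021-02-05", "2021-02-20", "2021-01-10"]),
    "SubscriptionID": [10, 10, 10, 10],
    "Sku": ["A", "A", "A", "B"],
    "Type": ["sale", "sale", "refund", "sale"],
})

def interval_by_loop(frame):
    """Walk the rows in Date order, keeping a running value per (SubscriptionID, Sku)."""
    ordered = frame.sort_values("Date", kind="stable")
    running = {}   # (SubscriptionID, Sku) -> current interval value
    values = []
    for sub_id, sku, kind in zip(ordered["SubscriptionID"], ordered["Sku"], ordered["Type"]):
        key = (sub_id, sku)
        if key not in running:
            running[key] = 1        # first occurrence always starts at 1
        elif kind == "sale":
            running[key] += 1       # repeat sale
        else:
            running[key] -= 1       # refund (assumed to be the only other Type)
        values.append(running[key])
    # The list is in the same order as the sorted rows, so it lines up positionally.
    return ordered.assign(interval_value=values)

loop_result = interval_by_loop(df_loop)
```

This works, but it touches every row in Python, which is the part I would like to avoid on a large dataset.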

Essentially, I am trying to figure out how many times each subscription has been purchased. A subscription can potentially have 2 SKUs, hence I would like to do this using the SubscriptionID and Sku together.

In theory I can loop through the whole DataFrame and process it line by line. However, I am looking for how this could be accomplished with pandas, either using the apply method or in some other, more efficient fashion.
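One vectorized direction that might fit is a per-group cumulative sum: turn each row into a step of +1 or -1, force the first row of every (SubscriptionID, Sku) group to +1, then take `groupby(...).cumsum()`. The sketch below is an assumption about how this could look, not a confirmed answer; the sample rows, the helper name `add_interval`, and the temporary `_step` column are made up, and Type is assumed to hold exactly "sale" or "refund".

```python
import numpy as np
import pandas as pd

# Made-up sample rows; the column names follow the list above.
df = pd.DataFrame({
    "Name": ["a", "a", "b", "a", "a", "b"],
    "Date": pd.to_datetime([
        "2020-01-01", "2020-02-01", "2020-01-15",
        "2020-03-01", "2020-02-15", "2020-02-15",
    ]),
    "SubscriptionID": [1, 1, 2, 1, 1, 2],
    "Sku": ["x", "x", "y", "x", "z", "y"],
    "Type": ["sale", "sale", "sale", "refund", "sale", "sale"],
})

def add_interval(frame):
    """Return a Date-sorted copy with an interval_value column."""
    keys = ["SubscriptionID", "Sku"]
    out = frame.sort_values("Date", kind="stable").copy()

    # +1 for a sale, -1 for anything else (assumed to be a refund).
    step = pd.Series(np.where(out["Type"].eq("sale"), 1, -1), index=out.index)

    # The first row of each (SubscriptionID, Sku) group always counts as 1.
    is_first = out.groupby(keys).cumcount().eq(0)
    step = step.mask(is_first, 1)

    # Running total of the steps within each group, in Date order.
    out["interval_value"] = out.assign(_step=step).groupby(keys)["_step"].cumsum()
    return out

result = add_interval(df)
```

Because the rows are sorted before grouping, the cumulative sum follows Date order inside each group, and no Python-level loop or apply is involved.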

EDITED:

[Image: how the Interval column is calculated]

This is the logic I would like to implement (how to calculate the Interval column).

Looking forward to your response. Thanks.

tkansara

0 Answers