0

I have a doubt and would require a small support on Python.

  1. I wish to calculate the difference of Last value of Current month - Last value of Previous month ( this is tag wise)
  2. Then, the total sum of value calculated from number 1. against per month for all tags.

In the dataset, there are multiple values for December 31, and Jan1, Jan2.. Jan 30. Now I want to calculate as, Result = Last or Latest value of Jan 30(among multiple values) - Last value of the Previous month (Dec31)

The expected output looks like this: enter image description here

I'm taking the reference from the powerBI table were calculation is already done. I want the same to be done using python.

Case 1: Calculate the difference of Last value of Current month - Last value of Previous month ( as seen from the last column in table( Diff Value of CM - PM). Please have a look to see how the difference is taken.

Case 2:

Total Summation of the values (Diff value of CM - PM from Case1) for all the tags against each month and year.

Case 1 & 2 are connected. Case 1 is the sum of values per tag, and Case 2 is the total sum of values for all the tags.

Sample data can be accessed from here: Dataset

Please support me on this ?

Nishad Nazar
  • 371
  • 2
  • 3
  • 16
  • You must provide all your data in a [reproducible format](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples), not images – mozway Jul 26 '23 at 07:23
  • Hi, if you really want help, post a reproducible example, not only images... – Myron_Ben4 Jul 26 '23 at 07:23
  • Apologies, @mozway and Myron_Ben4 Will add the Gdrive link to the sample dataset, used in this project. – Nishad Nazar Jul 26 '23 at 09:10
  • Refrain from showing your dataframe as an image. Your question needs a minimal reproducible example consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. – itprorh66 Jul 26 '23 at 13:58

1 Answers1

0

Maybe something like this will work:

# Step 1: Load the dataset and preprocess the date column
df = pd.read_csv('your_dataset_name_here.csv')
df['Date'] = pd.to_datetime(df['LocalSystemDate'])
df['Month'] = df['Date'].dt.month
df['Year'] = df['Date'].dt.year

# Step 2: Calculate the difference between the last value of the current month
# and the last value of the previous month for each tag
df['Diff_CM_PM'] = df.groupby(['Tag', 'Year', 'Month'])['Value'].transform(lambda x: x.iloc[-1] - x.iloc[-2])

# Step 3: Group the data by month and year and calculate the total sum of the difference for all tags
result = df.groupby(['Year', 'Month'])['Diff_CM_PM'].sum().reset_index()     
print(result)

Please before executing code adapt the column names to your dataset.

notarealgreal
  • 734
  • 16
  • 29