I have successfully imported a temperature CSV file into a pandas DataFrame. I have also found the mean value of a specific range:

df.loc[7623:23235, 'Temperature'].mean()

where 'Temperature' is the column title in the DataFrame.

I would like to know if it is possible to change this expression to find the average of the last 25% (or 1/4) of the input range (7623:23235).

smci
fred
  • Please make sure to tag pandas questions [tag:pandas], not just [tag:python]. Tagging them pandas gets them seen faster by all the people who subscribe to that tag. Most Python users don't know pandas. – smci Dec 20 '22 at 19:54

2 Answers

Yes, you can use the quantile method to find the value that separates the highest 25% of the values in the input range, and then use the mean method to average the values at or above that threshold. Note that this selects the top 25% by temperature value, not the last 25% of the rows by position.

Here's how you can do it:

# Select the range once; loc includes both endpoints
temps = df.loc[7623:23235, 'Temperature']
quantile = temps.quantile(0.75)  # threshold for the highest 25% of values
mean = temps[temps >= quantile].mean()
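
To see what the quantile filter selects, here is a small sketch on made-up data (the numbers are invented for illustration and are not from the question's CSV). It compares the value-based selection with a position-based one; the two agree only when the largest readings happen to sit at the end of the range.

```python
import math

import pandas as pd

# Made-up readings that fall over time (illustrative only)
temps = pd.Series([30.0, 28.0, 26.0, 24.0, 22.0, 20.0, 18.0, 16.0])

# Value-based: rows at or above the 75th percentile (the highest readings)
threshold = temps.quantile(0.75)
top_quarter_mean = temps[temps >= threshold].mean()

# Position-based: the final quarter of the rows, whatever their values
n = math.ceil(len(temps) * 0.25)
last_quarter_mean = temps.tail(n).mean()

print(top_quarter_mean)   # mean of the highest readings
print(last_quarter_mean)  # mean of the most recent readings
```

On this falling series the two results differ, because the highest values are at the start rather than the end.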
SuperStew

To find the average of the last 25% of the values in a specific range of a column in a Pandas DataFrame, you can use the iloc indexer along with slicing and the mean method.

For example, given a DataFrame df with a column 'Temperature', you can find the average of the last 25% of the values in the range 7623:23235 like this:

import math

# Find the length of the range
length = 23235 - 7623 + 1

# Calculate the number of values to include in the average
n = math.ceil(length * 0.25)

# Calculate the index of the first value to include in the average
start_index = length - n

# Use iloc to slice the relevant range of values from the 'Temperature' column.
# iloc excludes the end position, so slice up to 23236 to keep row 23235,
# then calculate the mean of the last n values
mean = df['Temperature'].iloc[7623:23236].iloc[start_index:].mean()

print(mean)

This code first calculates the length of the range, then calculates the number of values that represent 25% of that range. It then uses the iloc indexer to slice the relevant range of values from the 'Temperature' column and calculates the mean of those values using the mean method.

Note that this code assumes the DataFrame has the default index (consecutive integers starting from 0), so that the positions used by iloc coincide with the labels used by loc in the question. If the index is not consecutive or does not start at 0, you may need to adjust the code accordingly.
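
If the index is not the default one, one possible adjustment is to select the range by label first and then take the tail by position. The sketch below assumes a hypothetical helper, `mean_of_last_fraction` (a name introduced here for illustration, not part of pandas), and a made-up frame whose index starts at 100.

```python
import math

import pandas as pd


def mean_of_last_fraction(series: pd.Series, fraction: float = 0.25) -> float:
    """Average the final `fraction` of the rows in `series`, by position."""
    # Round up, as in the code above, so a partial row counts as a whole one
    n = math.ceil(len(series) * fraction)
    # tail(n) takes the last n rows regardless of what the index labels are
    return series.tail(n).mean()


# Made-up frame whose index starts at 100 rather than 0 (illustrative only)
df = pd.DataFrame(
    {"Temperature": [float(t) for t in range(20)]},
    index=range(100, 120),
)

# loc selects by label and includes both endpoints: labels 104 to 115
selected = df.loc[104:115, "Temperature"]
result = mean_of_last_fraction(selected)
print(result)
```

For the range in the question, the same call would be mean_of_last_fraction(df.loc[7623:23235, 'Temperature']), which avoids computing the length and start offset by hand.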

Rahul Mukati
  • You generally don't need to do manual index arithmetic in pandas. It's unnecessary (idioms like logical indexing exist and are more compact, often one-liners) and a code smell that you're doing something suboptimal. – smci Dec 20 '22 at 20:26