I want to create a line graph of the daily maximum temperature for 2005 from my dataframe, with dates on the x-axis covering the whole year. When I run my code, I get this warning: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. See my code below:

import pandas as pd

data = pd.read_csv('https://raw.githubusercontent.com/dereksonderegger/444/master/data-raw/FlagMaxTemp.csv')
# Filter data for year 2005
data_2005 = data[data['Year'] == 2005]

# Create a date column combining Year, Month, and Day
data_2005.loc[:, 'Date'] = pd.to_datetime(data_2005[['Year', 'Month']].assign(day=1))
# Set Date as the index
data_2005.set_index('Date', inplace=True)

# Extract the daily maximum temperature columns
max_temp_cols = [str(i) for i in range(1, 32)]
max_temp_data = data_2005[max_temp_cols]

# Reshape the data to long format
max_temp_data = max_temp_data.melt(var_name='Day', value_name='Temperature', ignore_index=False)
max_temp_data.reset_index(inplace=True)

# Convert Day column to integer
max_temp_data['Day'] = max_temp_data['Day'].astype(int)
  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – NotAName Apr 21 '23 at 03:50

1 Answer

You have to break the reference between data and data_2005 using copy:

Without copy:

>>> data_2005 = data[data['Year'] == 2005]

>>> data_2005._is_copy
<weakref at 0x7fef0406ddb0; to 'DataFrame' at 0x7feeff3cbe50>

>>> hex(id(data))
'0x7feeff3cbe50'  # reference to data

With copy:

>>> data_2005 = data[data['Year'] == 2005].copy()

>>> data_2005._is_copy
None

So the rest of your code works without any warning.
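As a rough check of the fix, here is a minimal sketch on a small made-up frame in the same wide layout as the CSV (the values and the variable name `copy_warnings` are assumptions for illustration, not taken from the real file). It records any warnings raised while adding the Date column to the copied subset and keeps only those whose class name contains "SettingWithCopy":

```python
import warnings

import pandas as pd

# Made-up stand-in for the CSV: one row per (Year, Month), one column per day.
data = pd.DataFrame({
    "Year": [2004, 2005, 2005],
    "Month": [12, 1, 2],
    "1": [30.0, 41.0, 50.0],
    "2": [32.0, 39.9, 48.2],
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning instead of hiding repeats
    # .copy() makes data_2005 an independent DataFrame rather than a slice of data
    data_2005 = data[data["Year"] == 2005].copy()
    # First day of each month, built from the Year and Month columns
    data_2005["Date"] = pd.to_datetime(data_2005[["Year", "Month"]].assign(Day=1))

# Match on the class name so the check does not depend on where pandas defines it
copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]
print(len(copy_warnings))
```

The new column lands only on data_2005; the original data is left unchanged.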

I want to create a date variable from the dataframe, but the days are laid out across each row (one column per day of the month).

url = 'https://raw.githubusercontent.com/dereksonderegger/444/master/data-raw/FlagMaxTemp.csv'
data = (pd.read_csv(url, index_col=0)
          .melt(['Year', 'Month'], var_name='Day', value_name='Temp').dropna()
          .assign(Date=lambda x: pd.to_datetime(x[['Year', 'Month', 'Day']]))
          .sort_values('Date', ignore_index=True)[['Date', 'Temp']])

Output:

>>> data
            Date   Temp
0     1985-05-01  71.06
1     1985-05-02  71.06
2     1985-05-03  68.00
3     1985-05-04  68.00
4     1985-05-05  64.94
...          ...    ...
10877 2015-12-17  42.08
10878 2015-12-18  53.06
10879 2015-12-25  35.06
10880 2015-12-26  23.00
10881 2015-12-27  37.94

[10882 rows x 2 columns]
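From a long-format frame like this, the 2005 line graph asked about in the question could be sketched as follows. This is only an illustration under stated assumptions: the `wide` frame is made-up sample data in the same layout as the CSV, and `plot_daily_max`, `tidy`, and `temps_2005` are hypothetical names introduced here. It assumes matplotlib is installed.

```python
import pandas as pd

# Made-up sample in the same wide layout as the CSV (not real measurements).
wide = pd.DataFrame({
    "Year": [2004, 2005, 2005],
    "Month": [12, 1, 2],
    "1": [30.2, 41.0, 50.0],
    "2": [33.8, None, 48.2],
    "3": [28.4, 37.4, 51.8],
})

# Same reshaping chain as above: wide -> long, drop missing days, build dates.
tidy = (wide.melt(["Year", "Month"], var_name="Day", value_name="Temp").dropna()
            .assign(Date=lambda x: pd.to_datetime(x[["Year", "Month", "Day"]]))
            .sort_values("Date", ignore_index=True)[["Date", "Temp"]])

# Keep only 2005 and index the temperatures by date.
temps_2005 = tidy[tidy["Date"].dt.year == 2005].set_index("Date")["Temp"]


def plot_daily_max(temps, year=2005):
    """Draw a line graph of temps (a Series indexed by date) for one year."""
    import matplotlib.pyplot as plt  # imported here so the reshaping runs without it

    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(temps.index, temps.values)
    # Force the x-axis to span the whole year, even if some days are missing.
    ax.set_xlim(pd.Timestamp(year, 1, 1), pd.Timestamp(year, 12, 31))
    ax.set_xlabel("Date")
    ax.set_ylabel("Daily maximum temperature")
    ax.set_title(f"Daily maximum temperature, {year}")
    return ax
```

Calling plot_daily_max(temps_2005) and then matplotlib's plt.show() displays the figure. Setting the x-limits explicitly is one way to make the axis cover the full year; without it, matplotlib scales the axis to the dates that are present.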
Corralien