I get an error when passing a subset of a pandas DataFrame to GluonTS. Here's my code:
```
import pandas as pd
import matplotlib.pyplot as plt
import scipy
from gluonts.dataset.pandas import PandasDataset
from gluonts.dataset.split import split
from gluonts.torch import DeepAREstimator

# Load data from a CSV file into a PandasDataset
df = pd.read_csv(
    "city_temperature.csv"
)
df.head()
df.dropna()
# Keep only the rows for the city "Algiers"
df2 = df[df["City"]=="Algiers"]
dataset = PandasDataset(df2)
```
I am trying to select the rows whose city is "Algiers" from a global city temperature data set. City is a column of the data set, and I want to pass the filtered rows to PandasDataset(). I am not sure how to use this function: I searched the help files but could not find it. When I run the code, I get this error:
```
/tmp/ipykernel_712/289762879.py:9: DtypeWarning: Columns (2) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv(
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 15
     13 df.dropna()
     14 df2 = df[df["City"]=="Algiers"]
---> 15 dataset = PandasDataset(df2)
     17 # Split the data for training and testing
     18 #training_data, test_gen = split(df, offset=-36)
     19 #test_data = test_gen.generate_instances(prediction_length=12, windows=3)
   (...)
     31 # forecast.plot()
     32 #plt.legend(["True values"], loc="upper left", fontsize="xx-large")

File <string>:12, in __init__(self, dataframes, target, feat_dynamic_real, past_feat_dynamic_real, timestamp, freq, static_features, future_length, unchecked, assume_sorted, dtype)

File /opt/conda/lib/python3.10/site-packages/gluonts/dataset/pandas.py:119, in PandasDataset.__post_init__(self, dataframes, static_features)
    114 if self.freq is None:
    115     assert (
    116         self.timestamp is None
    117     ), "You need to provide `freq` along with `timestamp`"
--> 119     self.freq = infer_freq(first(pairs)[1].index)
    121 static_features = Maybe(static_features).unwrap_or_else(pd.DataFrame)
    123 object_columns = static_features.select_dtypes(
    124     "object"
    125 ).columns.tolist()

File /opt/conda/lib/python3.10/site-packages/gluonts/dataset/pandas.py:319, in infer_freq(index)
    316 if isinstance(index, pd.PeriodIndex):
    317     return index.freqstr
--> 319 freq = pd.infer_freq(index)
    320 # pandas likes to infer the `start of x` frequency, however when doing
    321 # df.to_period("<x>S"), it fails, so we avoid using it. It's enough to
    322 # remove the trailing S, e.g `MS` -> `M
    323 if len(freq) > 1 and freq.endswith("S"):

File /opt/conda/lib/python3.10/site-packages/pandas/tseries/frequencies.py:193, in infer_freq(index, warn)
    191 if isinstance(index, Index) and not isinstance(index, DatetimeIndex):
    192     if isinstance(index, (Int64Index, Float64Index)):
--> 193         raise TypeError(
    194             f"cannot infer freq from a non-convertible index type {type(index)}"
    195         )
    196     index = index._values
    198 if not isinstance(index, DatetimeIndex):

TypeError: cannot infer freq from a non-convertible index type <class 'pandas.core.indexes.numeric.Int64Index'>
```
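From the traceback, the failure seems to come from `pd.infer_freq` being called on the integer row index of the filtered frame. Here is a minimal sketch that appears to reproduce the same kind of `TypeError` without GluonTS. The toy frame and the `Date` and `AvgTemperature` column names are assumptions made up for illustration, not the real layout of my CSV:

```python
import pandas as pd

# Hypothetical stand-in for the CSV; only the "City" column is known to
# exist in the real file, the other column names are assumptions.
toy = pd.DataFrame(
    {
        "City": ["Algiers", "Algiers", "Algiers", "Oslo", "Oslo", "Oslo"],
        "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"] * 2),
        "AvgTemperature": [55.1, 54.3, 56.0, 30.2, 28.9, 31.5],
    }
)

# Boolean filtering keeps the default integer row labels as the index.
subset = toy[toy["City"] == "Algiers"]

# Inferring a frequency from an integer index raises a TypeError.
try:
    pd.infer_freq(subset.index)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# With a DatetimeIndex, the same call can infer a frequency.
indexed = subset.set_index("Date")
inferred = pd.infer_freq(indexed.index)
print(inferred)
```

If this reading is right, the subsetting itself is fine, and the problem is that the frame I pass to `PandasDataset` has no datetime index from which a frequency could be inferred.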
Can somebody help me with this?
I tried googling this error but could not find an answer.