I have a time series that is read using read_csv.
>series.head()
            Arrival   Price
Date
2010-01-02     40.9  1250.0
2010-01-05    155.9  1155.0
2010-01-06     50.6  2300.0
2010-01-07     99.0  1350.0
2010-01-08    159.5  2100.0
>series.describe()
           Arrival        Price
count  1847.000000  1847.000000
mean    409.292907  1154.412019
std     723.896792   815.889659
min       1.000000   200.000000
25%      50.650000   580.000000
50%     184.700000   900.000000
75%     460.850000  1412.500000
max    6496.000000  4750.000000
I need to split the series into two different series with the following condition:
if a row's 'Price' is less than 2000, add the row to s1series;
otherwise, add it to s2series.
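The condition above can be sketched with a boolean mask. This is only a minimal illustration: it assumes `series` is really a two-column DataFrame indexed by date (which the head() output suggests), and the sample values are a hypothetical copy of the first five rows, not the real file.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows shown above;
# the real data would come from read_csv instead.
series = pd.DataFrame(
    {
        "Arrival": [40.9, 155.9, 50.6, 99.0, 159.5],
        "Price": [1250.0, 1155.0, 2300.0, 1350.0, 2100.0],
    },
    index=pd.DatetimeIndex(
        ["2010-01-02", "2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08"],
        name="Date",
    ),
)

limit = 2000
mask = series["Price"] < limit  # boolean Series aligned on the Date index
s1series = series[mask]         # rows whose Price is below the limit
s2series = series[~mask]        # every other row (the "else" branch)

print(s1series)
print(s2series)
```

Both results keep the original date index and numeric columns. Rows with a missing Price would fall into `s2series`, since a comparison against NaN evaluates to False.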
I wrote the split function below, but the result does not seem to be a proper series, as the output of 's1series.describe()' shows:
from pandas import DatetimeIndex, Series

def split_series(series, limit):
    s1index = list()
    s1elements = list()
    s2index = list()
    s2elements = list()
    X = series.values  # 2-D array: one [Arrival, Price] row per date
    print(series.head())
    print(series.describe())
    for i in range(len(series.values)):
        # Column 1 is 'Price'
        if X[i][1] < limit:
            s1elements.append(X[i])
            s1index.append(series.index[i])
        else:
            s2elements.append(X[i])
            s2index.append(series.index[i])
    s1index = DatetimeIndex(s1index)
    s1series = Series(s1elements, s1index)
    print(s1series.head())
    print(s1series.describe())
    s2index = DatetimeIndex(s2index)
    s2series = Series(s2elements, s2index)
    return s1series, s1index, s2series, s2index
I get the following output:
>s1series.head()
2010-01-02     [40.9, 1250.0]
2010-01-05    [155.9, 1155.0]
2010-01-06     [50.6, 1285.0]
2010-01-07     [99.0, 1350.0]
2010-01-08    [159.5, 1380.0]
dtype: object
>s1series.describe()
count            1600
unique           1600
top     [95.1, 640.0]
freq                1
dtype: object
I strongly feel that there should be a better way to do this. Please help.