I have a time series that is read using read_csv.

>series.head()

            Arrival   Price
Date                       
2010-01-02     40.9  1250.0
2010-01-05    155.9  1155.0
2010-01-06     50.6  2300.0
2010-01-07     99.0  1350.0
2010-01-08    159.5  2100.0

>series.describe()

           Arrival        Price
count  1847.000000  1847.000000
mean    409.292907  1154.412019
std     723.896792   815.889659
min       1.000000   200.000000
25%      50.650000   580.000000
50%     184.700000   900.000000
75%     460.850000  1412.500000
max    6496.000000  4750.000000

I need to split the series into two separate series according to the following condition:

if series[]['Price'] is less than 2000 add it to s1series, 
else add it to s2series

I wrote the split function below, but the result does not seem to be a proper series, as the output of 's1series.describe()' shows.

from pandas import DatetimeIndex, Series

def split_series(series, limit):
    s1index = list()
    s1elements = list()
    s2index = list()
    s2elements = list()
    X = series.values  # 2-D array; each row is [Arrival, Price]
    print(series.head())
    print(series.describe())
    for i in range(len(X)):
        # column 1 holds Price
        if X[i][1] < limit:
            s1elements.append(X[i])
            s1index.append(series.index[i])
        else:
            s2elements.append(X[i])
            s2index.append(series.index[i])
    s1index = DatetimeIndex(s1index)
    # each element is a whole [Arrival, Price] row, so the Series holds objects
    s1series = Series(s1elements, s1index)
    print(s1series.head())
    print(s1series.describe())
    s2index = DatetimeIndex(s2index)
    s2series = Series(s2elements, s2index)
    return s1series, s1index, s2series, s2index

I get the following output

>s1series.head()

2010-01-02     [40.9, 1250.0]
2010-01-05    [155.9, 1155.0]
2010-01-06     [50.6, 1285.0]
2010-01-07     [99.0, 1350.0]
2010-01-08    [159.5, 1380.0]
dtype: object

>s1series.describe()

count              1600
unique             1600
top       [95.1, 640.0]
freq                  1
dtype: object

I strongly feel that there should be a better way to do this. Please help.

prabhakar
  • @jezrael, thank you for referring me to the other answer. It looks good for the dataframe. I could use it for the series.values converted as DataFrame. But I am not clear on how to retrieve the 'Date' that is in time series as well for the corresponding rows in data frame. – prabhakar Apr 07 '18 at 06:37
  • It is very similar; you need `df1.index[df1['Price'] > 2000]` to filter the index values. – jezrael Apr 07 '18 at 06:39
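
The boolean-indexing idea from the comment above can probably be sketched like this. It is only a sketch under assumptions: the sample frame `df` is made up from the five rows printed by `series.head()`, and the helper name `split_by_price` is hypothetical. Filtering the two-column object directly with a mask keeps the 'Date' index attached to each row, so there should be no need to rebuild a DatetimeIndex by hand, and the columns stay numeric rather than turning into `object`.

```python
import pandas as pd

# Hypothetical sample built from the rows shown by series.head() in the question.
df = pd.DataFrame(
    {
        "Arrival": [40.9, 155.9, 50.6, 99.0, 159.5],
        "Price": [1250.0, 1155.0, 2300.0, 1350.0, 2100.0],
    },
    index=pd.DatetimeIndex(
        ["2010-01-02", "2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08"],
        name="Date",
    ),
)


def split_by_price(frame, limit):
    """Return (rows with Price < limit, all remaining rows)."""
    mask = frame["Price"] < limit  # boolean Series aligned on the Date index
    return frame[mask], frame[~mask]


s1, s2 = split_by_price(df, 2000)
print(s1)        # rows below the limit, dates preserved
print(s2.index)  # dates of the rows at or above the limit
```

With this, `s1.describe()` should report numeric statistics per column, and the dates for each part are available as `s1.index` and `s2.index`.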
