How do I filter the dataframe in such a way?

Question

Here is the output of the dataframe

    Date        Upper_zone  Lower_zone  Stock_name  S/R
0   2018-02-12  163.40      155.75      ABFRL       Resistance becoming support
1   2017-03-16  200.00      189.10      CROMPTON    Resistance becoming support
2   2017-04-11  127.69      126.16      CUB         Resistance becoming support
3   2017-02-02  644.40      625.00      ENDURANC    Resistance becoming support
4   2019-08-27  15.70       15.20       GMRINFRA    Resistance becoming support
5   2020-01-30  1287.00     1233.90     IPCALAB     Resistance becoming support
6   2017-08-01  17236.00    16220.50    PAGEIND     Resistance becoming support
7   2018-09-11  3788.00     3570.00     PFIZER      Resistance becoming support
8   2019-06-20  1261.35     1235.05     PIDILITIND  Resistance becoming support
9   2018-09-26  17506.50    16803.40    SHREECEM    Resistance becoming support
10  2018-09-03  556.67      542.13      VBL         Resistance becoming support
11  2018-10-31  563.33      533.37      VBL         Resistance becoming support
12  2019-02-06  562.90      534.00      VBL         Resistance becoming support
13  2017-07-05  479.00      461.70      VOLTAS      Resistance becoming support

Now I want to keep only one line item per stock, the one with the latest date. Here VBL appears 3 times, but I only want the one VBL line item with the latest date, i.e. 2019-02-06, and I want to delete the remaining 2 line items.

Here is the code I used, with groupby:

x  = final_df.groupby('Stock_name')
y = x['Date'].max()
print(y)

Output

Stock_name
ADANITRANS    2019-08-01
BEL           2019-02-14
BERGEPAINT    2020-01-06
ICICIGI       2019-10-07
INDIGO        2019-01-21
INFY          2019-10-24
MARICO        2017-02-15
RELIANCE      2019-08-07
TCS           2019-01-14
Name: Date, dtype: object

How can I add the remaining columns to the output that I have received?

What is the output of your code? – theletz Mar 29 '20 at 10:54

score 0 · Answer 1 · answered Mar 29 '20 at 11:01


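You can use drop_duplicates on the Stock_name column and keep the last row for each stock (this relies on the rows already being in date order within each stock, as they are in the sample), like this: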

df.drop_duplicates(subset='Stock_name', keep="last")

or keep the row with the maximum value in the Upper_zone column, like this:

df.groupby('Stock_name', group_keys=False).apply(lambda x: x.loc[x.Upper_zone.idxmax()])

or you can sort the values by Upper_zone first and then use drop_duplicates.
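Since the question asks for the latest Date rather than the largest Upper_zone, here is a minimal sketch of the same two ideas applied to the Date column. The sample frame is a few rows copied from the question; the names latest and latest_alt are only illustrative, and the conversion with pd.to_datetime is an assumption (the question's Date column has dtype object).

```python
import pandas as pd

# A few rows copied from the question's dataframe (S/R column omitted for brevity).
final_df = pd.DataFrame({
    "Date": ["2018-09-03", "2018-10-31", "2019-02-06", "2017-07-05"],
    "Upper_zone": [556.67, 563.33, 562.90, 479.00],
    "Lower_zone": [542.13, 533.37, 534.00, 461.70],
    "Stock_name": ["VBL", "VBL", "VBL", "VOLTAS"],
})

# Assumption: parse Date so comparisons are chronological, not string-based.
final_df["Date"] = pd.to_datetime(final_df["Date"])

# Option 1: sort by Date, then keep the last (latest) row of each stock.
latest = (
    final_df.sort_values("Date")
    .drop_duplicates(subset="Stock_name", keep="last")
    .sort_values("Stock_name")
    .reset_index(drop=True)
)

# Option 2: find the row label of the latest Date per stock and select those rows.
latest_alt = final_df.loc[
    final_df.groupby("Stock_name")["Date"].idxmax()
].reset_index(drop=True)

print(latest)
```

Both variants keep every column of the selected rows, which is what the groupby(...)['Date'].max() attempt in the question was missing.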

answered Mar 29 '20 at 11:01

Binh


You can do: `df.groupby('Stock_name', group_keys=False).last()` instead of the `apply(...)` version; it should be faster. – Grzegorz Skibinski Mar 29 '20 at 11:10
Actually the apply is selecting the `max` value, not just the `last` value tho – Binh Mar 29 '20 at 11:12
Check the sample data in question - it's the same thing here ;) – Grzegorz Skibinski Mar 29 '20 at 11:14
Yes, but I think the answer needs to be more clear, so that others can apply in another dataframe too, you know. Anyway, thank you for your suggestion mate – Binh Mar 29 '20 at 11:17
