0
+---+----+
| A | B  |
+---+----+
| 4 | VK |
| 9 | MD |
| 9 | V  |
| 8 | VK |
| 9 | V  |
| 1 | N  |
| 9 | V  |
| 7 | VK |
| 0 | MD |
| 9 | VK |
| 8 | V  |
+---+----+

I want to create new rows from the given dataset (two columns). For example, in the given dataset I want to select a row based on 'B' column value 'VK' and two previous rows of Column 'A' to create a new row. So, that first row should be like

+---+---+---+----+
| 9 | 9 | 4 | VK |
+----------------+

So, new row  = ( Row that has 'VK' + add two previous values of column A )
             = [4 | VK] + [ 9 | 9]
             = [ 9 | 9] + [4 | VK]

This process should iterate and we should have a dataset like:

+---+---+---+----+
| 9 | 9 | 4 | VK |
| 1 | 9 | 8 | VK |
| 9 | 0 | 7 | VK |
+----------------+

Input df:

df = pd.DataFrame([[4,"VK"],
[ 9,"MD" ],
[ 9,"V"  ],
[ 8,"VK" ],
[ 9,"V"  ],
[ 1,"N"  ],
[ 9,"V"  ],
[ 7,"VK" ],
[ 0,"MD" ],
[ 9,"VK" ],
[ 8,"V"  ]],columns = [ "A" , "B"])
Let's try
  • 1,044
  • 9
  • 20
Shamsul Masum
  • 337
  • 1
  • 2
  • 14

2 Answers2

1

An option could be the following:

df = pd.DataFrame([[4,"VK"],
[ 9,"MD" ],
[ 9,"V"  ],
[ 8,"VK" ],
[ 9,"V"  ],
[ 1,"N"  ],
[ 9,"V"  ],
[ 7,"VK" ],
[ 0,"MD" ],
[ 9,"VK" ],
[ 8,"V"  ]],columns = [ "A" , "B"])

df2 = df[(df["B"] == "VK") & (df.index < len(df)-2)].reset_index()[["A","B"]]
ix = df[df["B"] == "VK"].index    
df2["Next"] = df.iloc[list(np.minimum(list(ix + 1), len(df)-1)),:].reset_index()["A"]
df2["2nd Next"] = df.iloc[list(np.minimum(list(ix + 2), len(df)-1)),:].reset_index()["A"]
df2 = df2[["2nd Next", "Next", "A","B"]]
df2

We just take the indexes where "VK" is and select the next ones.

Output:

df2

    2nd Next    Next    A   B
0   9           9       4   VK
1   1           9       8   VK
2   9           0       7   VK
Let's try
  • 1,044
  • 9
  • 20
  • Not what i am looking for but close. order should be: '"2nd Next', 'Next', 'A', 'B'. Moreover, where did you get the '2nd Next' value (8) on the last of your outcome? – Shamsul Masum Jul 29 '20 at 10:58
  • Hey, you can just change the [order](https://stackoverflow.com/questions/13148429/how-to-change-the-order-of-dataframe-columns#:~:text=You%20need%20to%20create%20a,columns%20in%20this%20new%20order.&text=You%20can%20also%20use%20a,inserted%20as%20the%20first%20column.) of the columns. In the code if I find an index bigger that the size of the dataset I just take the last value, this can be easily changed if that is not what you are looking for. – Let's try Jul 29 '20 at 12:19
  • the outcome I looked for is mentioned in the question. The only difference is that you got it on the wrong order and 4th row should not be there. Please take a look at the final dataset i am looking at the question. – Shamsul Masum Jul 29 '20 at 12:24
0

You could do this (Edited):

import pandas as pd
a = {'A':[4,9,9,8,9,1,9,7,0,9,8],
    'B':['VK','MD','V','VK','V','N','V','VK','MD','VK','V']}
df = pd.DataFrame(a)
df['C'] = df['A'].shift(-2)
df['D'] = df['A'].shift(-1)
df = df[(df['B'] == 'VK') & (pd.isna(df['C'])== False)]
df = df[['B', 'C','D','A']]

Output:

    B   C   D   A
0   VK  9.0 9.0 4
3   VK  1.0 9.0 8
7   VK  9.0 0.0 7
Jayesh
  • 11
  • 4