
I have a pandas DataFrame as below.

|Col_A   |Col_B   |Col_C|
|--------|--------|-----|
|  1470  | 31     |687  | 
|  NaN   | 51     |689  |
|  NaN   | 85     |690  | 
|  1470  | 78     |691  |
|  NaN   | 64     |692  |
|  NaN   | 78     |693  |
|  NaN   | 45     |694  |
|  1471  | 87     |697  |

I need to obtain a data set where values in Col_C are removed (set to null) based on a condition on Col_A. The condition is: only when two consecutive non-null values of Col_A are different (for example 1470 and 1471) should the corresponding values of Col_C become NaN.
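For reference, a reproducible version of the frame above (as the comments below request) can be built with:

```python
import pandas as pd
import numpy as np

# Sample data copied from the tables in the question
df = pd.DataFrame({
    'Col_A': [1470, np.nan, np.nan, 1470, np.nan, np.nan, np.nan, 1471],
    'Col_B': [31, 51, 85, 78, 64, 78, 45, 87],
    'Col_C': [687, 689, 690, 691, 692, 693, 694, 697],
})
```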

The outcome that I want is:

|Col_A   |Col_B   |Col_C|
|--------|--------|-----|
|  1470  | 31     |687  | 
|  NaN   | 51     |689  |
|  NaN   | 85     |690  | 
|  1470  | 78     |691  |
|  NaN   | 64     |NaN  |
|  NaN   | 78     |NaN  |
|  NaN   | 45     |NaN  |
|  1471  | 87     |697  |

Any help will be highly appreciated.

Dark Star
    [Please do not put images of data/code in your question](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question). Also, this isn't clear; please spend some more time to see how an [mcve](https://stackoverflow.com/help/minimal-reproducible-example) is created and edit your question. You can refer to [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – anky Sep 04 '20 at 19:18
  • Edited a second time. Hope it is clear now – Dark Star Sep 04 '20 at 19:31
  • Thank you, however that is still an image. The links I gave you would help you create a minimal and reproducible example. It might look overwhelming at first, but I am sure it will help you with this and future questions – anky Sep 04 '20 at 19:38
    Always provide a [mre], with **code, data, errors, current output, and expected output, as text**, not screenshots, because [SO Discourages Screenshots](https://meta.stackoverflow.com/questions/303812/). It is likely the question will be down-voted and closed. You are discouraging assistance because no one wants to retype your data or code, and screenshots are often illegible. [edit] the question and **add text**. – Trenton McKinney Sep 04 '20 at 21:10

1 Answer

df.loc[(df['C'] < 142) & (df['C'] >= 127), 'C'] = np.nan

Edit: Following the last revision of the question, this worked for me:

import numpy as np

value0 = df.loc[0, 'A']   # last non-null value of column A seen so far
idx0 = 0                  # row label of that value
toMissing = []            # row labels whose C values should become NaN

for i in df.index:
    value1 = df.loc[i, 'A']
    if not pd.isna(value1) and value1 != value0:
        # a different non-null value appeared: blank out the rows in between
        toMissing.extend(range(idx0 + 1, i))
        value0 = value1
        idx0 = i
    elif value1 == value0:
        # the same value repeated: move the window start forward
        idx0 = i

df.loc[toMissing, 'C'] = np.nan
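As an aside, one vectorized sketch (my own suggestion, not part of the answer above; it does rely on auxiliary filled series, and assumes the question's column names and that every NaN run in Col_A is bounded by non-null values) would be: a NaN row sits between two *different* values exactly when a forward fill and a backward fill of Col_A disagree on that row.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Col_A': [1470, np.nan, np.nan, 1470, np.nan, np.nan, np.nan, 1471],
    'Col_B': [31, 51, 85, 78, 64, 78, 45, 87],
    'Col_C': [687, 689, 690, 691, 692, 693, 694, 697],
})

# Rows where Col_A is NaN and the surrounding values differ
mask = df['Col_A'].isna() & (df['Col_A'].ffill() != df['Col_A'].bfill())
df.loc[mask, 'Col_C'] = np.nan
```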
  • Thanks for your answer. But I don't need a new column D. I just need to make all the values in column C null where two consecutive values in column A are different. – Dark Star Sep 04 '20 at 19:24
  • My previous answer was based on the pictures shown in the wrong order. See updated answer – Evangelos Con Sep 04 '20 at 19:32
  • @EvangelosCon, in your code I don't find any condition that checks where two values in column A differ (in this example, 1470 and 1471). Based on two different values in A, the corresponding values in C should be null. It seems you hard-coded the values of C between rows 127 and 142, but that is not what I want. – Dark Star Sep 04 '20 at 19:38
  • If you look carefully, the values in column C stay the same where the two consecutive values in column A are the same (1470 and 1470) – Dark Star Sep 04 '20 at 19:47
  • see updated answer – Evangelos Con Sep 04 '20 at 21:27
  • Hi Evangelos Con, thanks a lot! It works perfectly for me as well. By the way, it seems to be a little slow since you used a for loop. Is it possible to rewrite the code without the for loop to make it faster? – Dark Star Sep 05 '20 at 01:28
  • Sorry, given the dependence between the observations, I can't think of a quicker way that does not create auxiliary columns or use a for loop – Evangelos Con Sep 09 '20 at 12:12