I am wondering what the most efficient way is to create a column in a pandas DataFrame that contains 1 if the row's id_row exists in a given list and 0 otherwise.
I'm currently using a lambda function with apply to compute the result. My problem is that it takes a long time: my dataframe has around 2M rows and the list it checks against has between 100k and 200k items. If I'm not mistaken, each membership check scans the list, so the total work grows roughly as (number of rows) x (length of the list), which is effectively quadratic time (I'm really not sure though). Given the size of the objects, this runs really slowly.
The worst part is that I have to repeat this bit of code for over 100 other (different) dataframes.
Here is the function:
lst_to_add = [1,2,3.......,n]
df_table['TEST'] = df_table['id_row'].apply(lambda x: 1 if x in lst_to_add else 0)
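To make this reproducible, here is a small self-contained version of what I'm doing. The values are made up for illustration only; my real id_row column and list are far larger, so this sketch shows the approach, not the timing.

```python
import pandas as pd

# Toy stand-ins for the real data (hypothetical values, much smaller than 2M rows).
df_table = pd.DataFrame({"id_row": [1, 2, 3, 4, 5, 6]})
lst_to_add = [2, 4, 6]

# Same approach as above: every `in` check walks the list from the start,
# so the work grows with len(df_table) * len(lst_to_add).
df_table["TEST"] = df_table["id_row"].apply(lambda x: 1 if x in lst_to_add else 0)

print(df_table)
```

On the toy data, TEST is 1 for the rows whose id_row is 2, 4 or 6, and 0 for the others.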
I'm wondering how I could make this bit of code (way) more efficient. I thought of a 'divide and conquer' solution, perhaps using a recursive function, but I'm really open to any suggestions.
One last thing: I also have memory constraints, so if I had the choice I would prefer a method that takes a little more time but uses less memory than the alternative.