I want to remove all URLs from a column of strings.
My DataFrame has two columns: str_val [str] and str_length [int].
I am using the following code:
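import time

t1 = time.time()
# Pattern meant to match URLs: a scheme, a www prefix, or a bare domain followed by a path
reg_exp_val = r"((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+)"
# regex=True is explicit because pandas 2.0+ treats the pattern as a literal string by default
df_mdr_pd['str_val1'] = df_mdr_pd.str_val.str.replace(reg_exp_val, r'', regex=True)
print(time.time() - t1)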
When I run the code on 10,000 instances, it finishes in 0.6 seconds. For 100,000 instances the execution just gets stuck. I tried using .loc[i, i+10000] and running it in a for loop, but that did not help either.
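
For reference, the chunked attempt described above could be sketched roughly as follows. This is an illustration under assumptions, not the exact code that was run. The function name `strip_urls_in_chunks` and the sample strings are hypothetical, and it uses the standard-library `re` module on a plain list (for example the result of `df_mdr_pd.str_val.tolist()`) instead of pandas, so that it runs on its own. Timing each chunk separately shows which slice of the data stalls.

```python
import re
import time

# Same pattern as in the question, split across two lines for readability.
URL_RE = re.compile(
    r"((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)"
    r"(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+)"
)


def strip_urls_in_chunks(values, chunk_size=10000):
    """Remove URLs chunk by chunk and record how long each chunk takes.

    Hypothetical helper: returns the cleaned strings and a list of
    (chunk_start_index, elapsed_seconds) pairs.
    """
    cleaned, timings = [], []
    for start in range(0, len(values), chunk_size):
        t0 = time.perf_counter()
        chunk = values[start:start + chunk_size]
        cleaned.extend(URL_RE.sub("", text) for text in chunk)
        timings.append((start, time.perf_counter() - t0))
    return cleaned, timings


# Hypothetical sample data standing in for the real column.
sample = ["see http://example.com/page for details", "no link here"] * 5
cleaned, timings = strip_urls_in_chunks(sample, chunk_size=4)
for start, elapsed in timings:
    print(f"chunk starting at {start}: {elapsed:.4f}s")
```

If one chunk takes far longer than the others, the strings that slow the regex down are in that slice, and the slice can be narrowed further by calling the helper again with a smaller `chunk_size`.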