I have a DataFrame column with values like this:
Salary Offered
----------------------
£18,323 per annum
£18,000 - £22,000 per annum
Salary not specified
£15,000 - £17,000 per annum, pro-rata
£37,000 - £45,000 per annum
£9,100 - £9,152 per annum, OTE
£9.25 - £10.15 per hour
£35,000 - £40,000 per annum
£23,000 - £26,600 per annum
£18,000 - £25,000 per annum, inc benefits
So I ran the following command. It did a good job of replacing the pure string values (like "Salary not specified") with None, which I can then replace with random values, but the remaining values still have to be split on £ again:
In[13]: df = pd.DataFrame(df.salary_offered.str.split('£', n=1).tolist(),
                          columns=['flips', 'row'])
In[14]: df['row']
Out[14]:
0 18,323 per annum
1 18,000 - £22,000 per annum
2 None
3 15,000 - £17,000 per annum, pro-rata
4 37,000 - £45,000 per annum
5 9,100 - £9,152 per annum, OTE
6 9.25 - £10.15 per hour
7 35,000 - £40,000 per annum
8 23,000 - £26,600 per annum
9 18,000 - £25,000 per annum, inc benefits
A few rows also give the salary per hour, so those will need to be handled as well, which should be straightforward. What I really want is a separate column holding the mean of each salary range, something like this:
Salary (£)
---------------
18323
20000
18000
16000
41000
...
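
To make the goal concrete, here is a minimal sketch of the kind of transformation I am after. It assumes the column is named `salary_offered` as in the snippet above; the helper `mean_salary`, the pattern `AMOUNT` and the output column `salary_mean` are hypothetical names used only for illustration. Instead of splitting on £, it pulls out every amount that follows a £ sign with a regular expression and averages them per row:

```python
import re

import numpy as np
import pandas as pd

# Sample rows copied from the column shown above.
df = pd.DataFrame({
    "salary_offered": [
        "£18,323 per annum",
        "£18,000 - £22,000 per annum",
        "Salary not specified",
        "£15,000 - £17,000 per annum, pro-rata",
        "£9.25 - £10.15 per hour",
    ]
})

# A £ sign followed by digits/commas and an optional decimal part.
AMOUNT = re.compile(r"£([\d,]+(?:\.\d+)?)")


def mean_salary(text):
    """Hypothetical helper: mean of all £ amounts in text, NaN if there are none."""
    amounts = [float(a.replace(",", "")) for a in AMOUNT.findall(text)]
    return sum(amounts) / len(amounts) if amounts else np.nan


# One numeric column holding the midpoint of each range.
df["salary_mean"] = df["salary_offered"].apply(mean_salary)
print(df)
```

With this sketch, single values pass through unchanged, ranges become their midpoint, and rows without any amount come out as NaN (which could then be filled however is appropriate). Hourly rows are averaged in their own unit, so they would still need converting or filtering separately.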