I have a pandas data frame that looks like
Run Time ... K Recovery Ca Recovery
0 14 05:02:54 ... 61,498.671 (492.0%) 62,095.863 (496.8%)
1 19 08:14:59 ... 63,576.997 (508.6%) 63,986.691 (511.9%)
2 35 10:30:42 ... 63,609.755 (508.9%) 64,400.180 (515.2%)
I want it to only keep the percentages and remove everything that isn't numeric so that it looks like this:
Run Time ... K Recovery Ca Recovery
0 14 05:02:54 ... 492.0 496.8
1 19 08:14:59 ... 508.6 511.9
2 35 10:30:42 ... 508.9 515.2
I was able to isolate the percentages by adding this re.findall(r'\(.*?\)', CaRecovery)
function to each individual string when I was creating the lists that make up my pandas data base, however this gave me some odd formatting problems:
Run Time Be Recovery ... Al Recovery K Recovery Ca Recovery
0 14 05:02:54 [(98.2%)] ... [(487.1%)] [(492.0%)] [(496.8%)]
1 19 08:14:59 [(101.6%)] ... [(499.8%)] [(508.6%)] [(511.9%)]
2 35 10:30:42 [(101.5%)] ... [(502.9%)] [(508.9%)] [(515.2%)]
It added the square brackets around the parentheses, and now for some reason
df = df.replace(r'[%]', '', regex=True)
has no effect on the database.
I need it just to be the numbers so I can convert the columns into floats.