I have a NumPy array with values like "1.0s", "100ms", etc. I can't plot it (with pandas, after putting the array into a Series) because pandas doesn't recognize these values as numbers. How can I have NumPy or pandas convert them into numbers while respecting the suffixes?
Right out of the docs: http://pandas.pydata.org/pandas-docs/stable/timedeltas.html – Jeff Nov 10 '16 at 00:19
2 Answers
See the question "how do I get at the pandas.offsets object given an offset string".
- use pandas.tseries.frequencies.to_offset
- convert to timedeltas
- get total seconds
import pandas as pd
from pandas.tseries.frequencies import to_offset
# 'T' is the minute alias, so '0.5T' is 30 seconds
s = pd.Series(['1.0s', '100ms', '10s', '0.5T'])
pd.to_timedelta(s.apply(to_offset)).dt.total_seconds()
0 1.0
1 0.1
2 10.0
3 30.0
dtype: float64
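As a hedged alternative to the steps above: when the strings already use unit names that `pd.to_timedelta` parses on its own, the `to_offset` step may not be needed. This is only a sketch under that assumption; the sample values and the `seconds` name are illustrative, and the minute value is written as `'0.5min'` rather than `'0.5T'`.

```python
import pandas as pd

# Sample data (illustrative); every string uses a unit that
# pd.to_timedelta can parse directly.
s = pd.Series(['1.0s', '100ms', '10s', '0.5min'])

# Parse the strings into timedeltas, then express them as float seconds.
seconds = pd.to_timedelta(s).dt.total_seconds()
print(seconds)
```

The result is a float64 Series, so it can be passed to `.plot()` like any other numeric Series.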

piRSquared
This code could solve your problem.
import pandas as pd
# Test data
se = pd.Series(['10s', '100ms', '1.0s'])
# Pattern to match an integer or float followed by ms or s
pat = r"([0-9]*\.?[0-9]+)(ms|s)"
# Extracting the data
df = se.str.extract(pat, flags=0, expand=True)
# Renaming columns
df.columns = ['value', 'unit']
# Converting to number
df['value'] = pd.to_numeric(df['value'])
# Converting to the same unit (seconds to milliseconds)
is_sec = df['unit'] == 's'
df.loc[is_sec, 'value'] *= 1000
df.loc[is_sec, 'unit'] = 'ms'
# Now you are ready to plot!
print(df['value'])
# 0 10000.0
# 1 100.0
# 2 1000.0
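The same regex idea can also be sketched as a small standalone function that maps each suffix to a multiplier. This is a minimal sketch, not part of the answer's code: the names `UNIT_SECONDS`, `PATTERN`, and `to_seconds` are hypothetical, and only the `ms` and `s` suffixes from the question are assumed to occur.

```python
import re

# Seconds per unit; only the suffixes seen in the question are covered (assumption).
UNIT_SECONDS = {"ms": 1e-3, "s": 1.0}

# An integer or float followed by one of the known suffixes.
PATTERN = re.compile(r"([0-9]*\.?[0-9]+)(ms|s)")


def to_seconds(text):
    """Convert a string such as '100ms' or '1.0s' to a float number of seconds."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SECONDS[unit]


values = [to_seconds(t) for t in ["10s", "100ms", "1.0s"]]
print(values)
```

With pandas, something like `se.map(to_seconds)` should apply the function to every element of a Series and give a numeric Series that can be plotted.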

Romain