1

I have a NumPy array of strings like "1.0s", "100ms", etc. I can't plot it (with pandas, after putting the array into a Series), because pandas doesn't recognize these values as numbers. How can I get NumPy or pandas to convert them into numbers while taking the unit suffixes into account?

Turtle V. Rabbit
    right out of the docs: http://pandas.pydata.org/pandas-docs/stable/timedeltas.html – Jeff Nov 10 '16 at 00:19
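
Following the timedelta docs linked in the comment above, a minimal sketch of the direct route might look like this. It assumes every string uses a unit suffix that `pd.to_timedelta` can parse (such as `s` or `ms`); the sample values are illustrative only.

```python
import pandas as pd

# Illustrative sample data, mirroring the strings from the question
s = pd.Series(["1.0s", "100ms", "10s"])

# Parse the strings as timedeltas, then convert to float seconds for plotting
seconds = pd.to_timedelta(s).dt.total_seconds()
print(seconds)
```

The resulting float Series can be plotted directly, e.g. with `seconds.plot()`.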

2 Answers

1

See the question "How do I get at the pandas.offsets object given an offset string".


  • use pandas.tseries.frequencies.to_offset
  • convert to timedeltas
  • get total seconds

import pandas as pd
from pandas.tseries.frequencies import to_offset

s = pd.Series(['1.0s', '100ms', '10s', '0.5T'])
pd.to_timedelta(s.apply(to_offset)).dt.total_seconds()

0     1.0
1     0.1
2    10.0
3    30.0
dtype: float64
piRSquared
0

This code could solve your problem.

import pandas as pd

# Test data
se = pd.Series(['10s', '100ms', '1.0s'])

# Pattern to match an integer or float value followed by the unit (ms or s)
pat = r"([0-9]*\.?[0-9]+)(ms|s)"
# Extracting the data
df = se.str.extract(pat, flags=0, expand=True)
# Renaming columns
df.columns = ['value', 'unit']
# Converting to number
df['value'] = pd.to_numeric(df['value'])
# Converting to the same unit (seconds -> milliseconds)
is_sec = df['unit'] == 's'
df.loc[is_sec, 'value'] *= 1000
df.loc[is_sec, 'unit'] = 'ms'

# Now you are ready to plot !
print(df['value'])
# 0    10000.0
# 1      100.0
# 2     1000.0
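
If more suffixes than `ms` and `s` show up, the same extract-then-scale idea could be driven by a lookup table instead of a hard-coded branch. The sketch below uses only the standard library; `UNIT_SECONDS` and `to_seconds` are hypothetical names introduced for illustration, and the set of units is an assumption rather than anything the question specifies.

```python
import re

# Hypothetical lookup table: seconds per unit suffix (extend as needed)
UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "min": 60.0, "h": 3600.0}

# A number (integer or float) followed by an alphabetic unit suffix
_PATTERN = re.compile(r"\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z]+)\s*")

def to_seconds(text):
    """Convert a string such as '100ms' to a float number of seconds."""
    match = _PATTERN.fullmatch(text)
    if match is None or match.group(2) not in UNIT_SECONDS:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]

values = [to_seconds(v) for v in ["10s", "100ms", "1.0s"]]
print(values)
```

With pandas, the helper could be applied element-wise via `se.map(to_seconds)`. Raising on unknown suffixes, rather than silently returning NaN, makes bad input visible before it reaches the plot.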
Romain