6

I have the following data:

23:10:50        all     28.36      0.00      0.38      0.25      0.00     71.02
23:10:51        all     22.77      0.00      0.84      0.12      0.00     76.27
23:10:52        all     32.06      0.00      0.86      0.00      0.00     67.08
23:10:53        all     31.38      0.00      0.61      0.00      0.00     68.01
23:10:54        all     27.17      0.00      1.36      0.25      0.00     71.22
23:10:55        all     37.48      0.00      0.75      0.00      0.00     61.77
23:10:56        all     29.02      0.00      0.75      1.76      0.00     68.47
23:10:57        all     41.82      0.00      1.37      0.12      0.00     56.68
23:10:58        all     29.01      0.00      1.10      0.00      0.00     69.89
23:10:59        all     37.00      0.00      1.50      1.88      0.00     59.62
23:11:00        all     44.25      0.00      1.12      0.00      0.00     54.62
23:11:01        all     27.72      0.00      0.62      0.00      0.00     71.66
23:11:02        all     30.71      0.00      1.11      0.00      0.00     68.18
23:11:03        all     27.40      0.00      0.62      0.00      0.00     71.98
...

I parse it with pandas as follows:

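from datetime import datetime
import pandas as pd

# pd.datetime was an alias for datetime.datetime and is gone from recent pandas releases
dateparse = lambda x: datetime.strptime(x, '%H:%M:%S')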

data = pd.read_csv('../../data/cpu.dat', delim_whitespace=True, header=None, usecols=[0,2,4,7], names=['Time','User','System','Idle'], parse_dates=[0], date_parser=dateparse)

The first column is Hours:Minutes:Seconds, and my intention is for pandas to parse it that way. However, it produces the following:

0    1900-01-01 23:10:50
1    1900-01-01 23:10:51
2    1900-01-01 23:10:52
3    1900-01-01 23:10:53
4    1900-01-01 23:10:54
5    1900-01-01 23:10:55
6    1900-01-01 23:10:56
7    1900-01-01 23:10:57
8    1900-01-01 23:10:58
9    1900-01-01 23:10:59
10   1900-01-01 23:11:00
11   1900-01-01 23:11:01
12   1900-01-01 23:11:02
13   1900-01-01 23:11:03

Is there any way to get rid of the added year-month-day?

Regards, Max

Diziet Asahi
Max Nicholson
    I guess the question would be *why* do you not want to have the date in the dataframe? Usually, and for all practical purposes, it totally makes sense to have a date associated with the times. What exactly is the drawback of it? Knowing that, one may propose an alternative solution to the *actual* problem, which is for sure not that there are dates present in some dataframe, not doing any harm. – ImportanceOfBeingErnest Dec 18 '17 at 21:01

2 Answers

10

Try this, where timestr is the name of the column that contains string representations of times:

data['time'] = pd.to_datetime(data['timestr']).dt.time
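As a minimal sketch of the same idea, the snippet below builds a small stand-in DataFrame (the `timestr` values are copied from the question's first column; the DataFrame itself is hypothetical) and passes an explicit `format` so the parsing does not depend on format inference. Note that `.dt.time` yields plain `datetime.time` objects, so the resulting column is typically of object dtype rather than a datetime dtype.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: only the time strings.
data = pd.DataFrame({"timestr": ["23:10:50", "23:10:51", "23:10:52"]})

# Parse the strings with an explicit format; a placeholder date is still
# attached internally at this stage.
parsed = pd.to_datetime(data["timestr"], format="%H:%M:%S")

# .dt.time keeps only the time-of-day part as datetime.time objects.
data["time"] = parsed.dt.time

print(data)
```

Because the values are no longer datetimes, arithmetic such as subtracting two rows is not directly available on this column.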
Peter Leimbigler
1

If I understand your issue correctly, here is what I did. First, I renamed the time column of your DataFrame:

df.rename(columns={0:'Time'}, inplace= True)
df
Time            1         2         3         4         5         6         7 
23:10:50        all     28.36      0.00      0.38      0.25      0.00     71.02
23:10:51        all     22.77      0.00      0.84      0.12      0.00     76.27
23:10:52        all     32.06      0.00      0.86      0.00      0.00     67.08
23:10:53        all     31.38      0.00      0.61      0.00      0.00     68.01...

Now I can convert the Time column to timedelta64[ns]:

df.Time = pd.to_timedelta(df.Time)
df

When I type df.dtypes, I get this:

Time    timedelta64[ns]
1                object
2               float64
3               float64
4               float64
5               float64
6               float64
7               float64
dtype: object

So, if you convert your column to timedelta, your seaborn plot should work.
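A self-contained sketch of the timedelta approach, assuming a small hypothetical DataFrame with the question's `Time` and `User` column names. The `Elapsed` column is an addition for illustration only: it expresses each sample as seconds since the first one, which may be convenient as a numeric axis when plotting.

```python
import pandas as pd

# Hypothetical sample using values from the question's data.
df = pd.DataFrame({
    "Time": ["23:10:50", "23:10:51", "23:10:52"],
    "User": [28.36, 22.77, 32.06],
})

# "HH:MM:SS" strings convert directly to timedeltas; no date is attached.
df["Time"] = pd.to_timedelta(df["Time"])

# Illustrative extra column: seconds elapsed since the first sample.
df["Elapsed"] = (df["Time"] - df["Time"].iloc[0]).dt.total_seconds()

print(df.dtypes)
print(df)
```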

i.n.n.m