I have a dataframe that consists of several experimental runs with different 'x-scales' to zero-in on a particular modelled behaviour, i.e.

  • Exp1: xs = np.linspace(0.005,0.75,10)
  • Exp2: xs = np.linspace(0.015,0.035,20)

Combining these into a single dataframe for processing is as simple as a pd.concat but my difficulty is in plotting results.

ax = v.plot(
    figsize=(10, 13),
    kind='line',
    secondary_y='average_rx_delay',
    logy=True,
    title="Performance Comparison of Varying Packet Period Rates \n(counts on left, seconds on right)"
)
# ax.set_xlabel('Packet Emission rate (per second)')
ax.set_ylabel('Packet Count')

[image: resulting line plot]

As you can see, the data frame index is being treated as a set of category labels, so to speak, rather than as numeric values, which leads to unevenly spaced, skewed lines.

It's slightly easier to see why this is happening if you plot it bar-wise

bar plots of the problem

What I'm looking for is something like the below but as lines.

linearised scatter graph

That was generated lazily, going the long way:

import matplotlib.pyplot as plt

f, ax1 = plt.subplots()
# Counts go on the left-hand y-axis
ax1.scatter(list(v.index), v.collisions, c='r')
ax1.scatter(list(v.index), v.tx_counts, c='b')
ax1.scatter(list(v.index), v.rx_counts, c='g')
ax1.scatter(list(v.index), v.enqueued, c='y')
# The delay goes on a second y-axis that shares the same x-axis
ax2 = ax1.twinx()
ax2.scatter(list(v.index), v.average_rx_delay, c='c')

Basically, I want line plots to take the v.index as the x-axis value but stick to being actual numbers!

I've tried adding x=v.index to the plot call, and I've also tried adding the index back as another column and passing that column as x, but neither worked.

Any magical ideas or should I just start going the long untidy DIY way?

Update

As per @ajean's question, this is what a selection of the data looks like. Note that PER is the 'added in again' index column for the x=v.PER attempt mentioned above, but it's correctly discarded by the main .plot anyway.

dataframe screenshot

Bolster
  • I'm slightly confused, what does `v.index` look like? You indicate that it's being used as "series names" but it looks to me like those are columns. Where do your two x-scales live in the DataFrame? – Ajean Nov 24 '14 at 17:03
  • @ajean updated to include dataview. Basically when plotting with the DataFrame, it looks like it's ignoring the idea that the indexes might be numerical instead of simply categorical. Long story short, I'm looking for 'straight' lines with a bit more detail between 0.015 and 0.035 because that's where all the interesting stuff is happening. – Bolster Nov 24 '14 at 17:10
  • Hmmm. Your first plot example doesn't use `x=v.PER`, so it should discard that, which is fine. Perhaps check what kind of index you have (i.e. make sure it's a `Float64Index` or if it's something that would translate to categorical)? I feel like `plot()` should just work, unless you've hit a bug in pandas. If you post a copy-pasteable piece of the dataframe I can test it. – Ajean Nov 24 '14 at 17:19

1 Answer

It looks like your index is being treated as categorical input. You can try converting it to floats: df.column_name = df.column_name.astype(float) for a column, or df.index = df.index.astype(float) for the index itself. I have based this answer on "Converting strings to floats in a DataFrame". If you want lines instead of points, use plot instead of scatter.
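
As a rough sketch of that suggestion, assuming the index really does hold the x values as strings (the question never confirms this), something like the following might work. The data here is made up, only two of the question's column names are reused, and sorting the index is an extra step that is not in the original question; it is there so the concatenated runs are drawn left to right.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Made-up stand-in data: two runs with the x-scales from the question.
# The index is deliberately stored as strings to mimic a non-numeric index.
xs1 = np.linspace(0.005, 0.75, 10)
xs2 = np.linspace(0.015, 0.035, 20)
run1 = pd.DataFrame({"tx_counts": 1.0 / xs1, "average_rx_delay": 2.0 * xs1},
                    index=xs1.astype(str))
run2 = pd.DataFrame({"tx_counts": 1.0 / xs2, "average_rx_delay": 2.0 * xs2},
                    index=xs2.astype(str))
v = pd.concat([run1, run2])

# Convert the index to floats so pandas places points at their numeric x
# positions, then sort so the two runs interleave in increasing x order.
v.index = v.index.astype(float)
v = v.sort_index()

ax = v.plot(kind="line", secondary_y="average_rx_delay", logy=True)
ax.set_ylabel("Packet Count")
```

If the index is already numeric, the conversion is a no-op and the sort alone may be what is missing, since concatenated runs are otherwise drawn in the order they were appended.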

Hennep