I am new to Matplotlib. How do I prevent my graph from backtracking? Note the hook in the upper right of my graph. The x-axis values are strings and the y-axis values are floats.

[Image: My Matplotlib Graph]

Here is my data:

['192.168.0.1', 2.568]
['96.120.96.153', 14.139]
['96.110.232.169', 10.505]
['162.151.49.133', 11.446]
['68.86.90.225', 24.335]
['68.86.84.226', 23.631]
['68.86.83.94', 29.011]
['173.167.58.162', 35.688]
['209.58.57.17', 162.768]
['64.86.79.2', 187.42]
['64.86.21.104', 162.461]
['63.243.205.1', 166.525]
['120.29.217.66', 156.898]
['209.58.86.143', 156.785]
['120.29.217.66', 181.599]

And the corresponding code:

import matplotlib.pyplot as plt

# x axis values
a = []
# corresponding y axis values
b = []

for k in range(15):
    a.append(dataArray[k][0])
    b.append(dataArray[k][1])

# plotting the points
plt.plot(a, b)

# naming the x axis
plt.xlabel('Hop Addresses')

# naming the y axis
plt.ylabel('average time (ms)')

# giving a title to my graph
plt.title('Time vs. Hops')

# function to show the plot
plt.show()
Slot Machine
  • Can you post some code and your values? I suspect that your x values are not in order. – Cohan Jan 11 '20 at 15:53
  • That is correct, they are strings that are not in order. I'm trying to do a graph of a traceroute, time vs IP address, so I don't want them to be in order. – Slot Machine Jan 11 '20 at 15:55
  • so, the IP addresses are in order of access, but the times are independent of other times? – Cohan Jan 11 '20 at 16:10
  • Well, if they are not in order, they go back in time at some point. Sometimes, in life you have to make choices. – JohanC Jan 11 '20 at 16:12

2 Answers

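A line plot connects points in the order they are given, so the x values have to advance from one point to the next for the graph to make sense. Because the x values here are strings, a repeated address is drawn at the x position it already occupies, which pulls the line backwards. A simple fix is to use the index as the 'x' value instead. For instance, for your data as below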

data = [['192.168.0.1', 2.568],
        ['96.120.96.153', 14.139],
        ['96.110.232.169', 10.505],
        ['162.151.49.133', 11.446],
        ['68.86.90.225', 24.335],
        ['68.86.84.226', 23.631],
        ['68.86.83.94', 29.011],
        ['173.167.58.162', 35.688],
        ['209.58.57.17', 162.768],
        ['64.86.79.2', 187.42],
        ['64.86.21.104', 162.461],
        ['63.243.205.1', 166.525],
        ['120.29.217.66', 156.898],
        ['209.58.86.143', 156.785],
        ['120.29.217.66', 181.599]]

we can plot only the delays, in the order reported by traceroute, and let Matplotlib use the list index as the x value. The line then no longer doubles back.

from matplotlib import pyplot as plt
plt.plot([i[1] for i in data]) 
plt.show()
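
If you also want the addresses shown on the axis, one possible extension is to keep the index as the x value and only relabel the ticks. This is a minimal sketch under the assumption that the data is the same list of `[address, delay]` pairs as above; `plot_hops` is a hypothetical helper name, not a Matplotlib function.

```python
from matplotlib import pyplot as plt

data = [['192.168.0.1', 2.568], ['96.120.96.153', 14.139],
        ['96.110.232.169', 10.505], ['162.151.49.133', 11.446],
        ['68.86.90.225', 24.335], ['68.86.84.226', 23.631],
        ['68.86.83.94', 29.011], ['173.167.58.162', 35.688],
        ['209.58.57.17', 162.768], ['64.86.79.2', 187.42],
        ['64.86.21.104', 162.461], ['63.243.205.1', 166.525],
        ['120.29.217.66', 156.898], ['209.58.86.143', 156.785],
        ['120.29.217.66', 181.599]]


def plot_hops(hops):
    """Hypothetical helper: plot delay against hop index, labelled by address."""
    idxs = list(range(len(hops)))
    ips = [hop[0] for hop in hops]
    times = [hop[1] for hop in hops]

    fig, ax = plt.subplots()
    ax.plot(idxs, times, marker='o')      # x is the hop number, so it never repeats
    ax.set_xticks(idxs)                   # one tick per hop
    ax.set_xticklabels(ips, rotation=90)  # show the address instead of the index
    ax.set_xlabel('Hop Addresses')
    ax.set_ylabel('average time (ms)')
    ax.set_title('Time vs. Hops')
    fig.tight_layout()
    return ax
```

Calling `plot_hops(data)` followed by `plt.show()` should display the line with one labelled tick per hop, including both occurrences of the repeated address.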
darcamo

The issue you're having is that plt.plot() will plot lines from one data point to the next.
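
To see where the hook comes from, it may help to look at the x positions the points end up at. The sketch below does not call Matplotlib; it assumes that a categorical (string) axis gives each distinct label the position of its first appearance, and `category_positions` is a hypothetical helper that mimics that rule.

```python
def category_positions(labels):
    """Hypothetical helper: position of each label under a first-appearance rule."""
    first_seen = {}
    positions = []
    for label in labels:
        if label not in first_seen:
            # A new label takes the next free position on the axis.
            first_seen[label] = len(first_seen)
        # A repeated label reuses the position it was given earlier.
        positions.append(first_seen[label])
    return positions


ips = ['192.168.0.1', '96.120.96.153', '96.110.232.169', '162.151.49.133',
       '68.86.90.225', '68.86.84.226', '68.86.83.94', '173.167.58.162',
       '209.58.57.17', '64.86.79.2', '64.86.21.104', '63.243.205.1',
       '120.29.217.66', '209.58.86.143', '120.29.217.66']

positions = category_positions(ips)
print(positions[-3:])  # [12, 13, 12]
```

Under that assumption the last point sits at position 12, to the left of the point before it at position 13, so the line has to travel backwards to reach it.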

It is evident from the data that each time measurement is independent of the others, but the order of the IP addresses matters. A line drawn with plt.plot() tends to suggest a trend. Since the data is not a trend but a series of independent events, a different type of graph might be more appropriate. I suggest a bar graph in this case. And since the x-axis labels get bunched up, a horizontal bar chart would be even better.

You'll notice in the graph below that 120.29.217.66 appears twice, once for each hop. To account for this, plot against the index in the list rather than the IP address, and then replace the y-axis tick labels with the addresses.

import matplotlib.pyplot as plt

data = [
    ['192.168.0.1', 2.568],
    ['96.120.96.153', 14.139],
    ['96.110.232.169', 10.505],
    ['162.151.49.133', 11.446],
    ['68.86.90.225', 24.335],
    ['68.86.84.226', 23.631],
    ['68.86.83.94', 29.011],
    ['173.167.58.162', 35.688],
    ['209.58.57.17', 162.768],
    ['64.86.79.2', 187.42],
    ['64.86.21.104', 162.461],
    ['63.243.205.1', 166.525],
    ['120.29.217.66', 156.898],
    ['209.58.86.143', 156.785],
    ['120.29.217.66', 181.599],
]

idxs = range(len(data))
ips = [i[0] for i in data]
times = [i[1] for i in data]

plt.barh(idxs, times)  # plot times vs the index of the array

plt.ylabel('Hop Addresses')
plt.xlabel('average time (ms)')
plt.title('Time vs. Hops')

plt.yticks(idxs, ips)  # Replace tick labels with IP Addresses

plt.tight_layout()
plt.show()

[Image: horizontal bar chart produced by the code above]

Since we tend to read from top to bottom, you can also flip the y-axis by calling this before plt.show():

plt.gca().invert_yaxis()

[Image: the same bar chart with the y-axis inverted]

Cohan
  • Oh gosh, yes, I should have noticed that I had two identical addresses. That all makes perfect sense now. Thank you for noticing, and thanks for the formatting suggestions! – Slot Machine Jan 11 '20 at 20:49