I'm developing a function, extract_name_value(), that draws a step chart from the values of a pandas DataFrame in Python. It works fine so far, but I want to show the values of the variable points_axisy or values_list next to each marker: Script Here

I tried the following examples: "Data value at each marker", "Matplotlib scatter plot with different text at each data point" and "How to put individual tags for a matplotlib scatter plot?", which are close to what I want. I also tried plt.annotate(), but the values do not come out the way I want, and I think the annotations would cover too much of the graph and make it hard to read. Below is the code in which I use plt.annotate():

    # Function to extract the Name and Value attributes
    def extract_name_value(signals_df, rootXML):
        # print(signals_df)
        names_list = [name for name in signals_df['Name'].unique()]
        num_names_list = len(names_list)
        num_axisx = len(signals_df["Name"])
        values_list = [value for pos, value in enumerate(signals_df["Value"])]
        print(values_list)
        points_axisy = signals_df["Value"]
        print(len(points_axisy))

        colors = ['b', 'g', 'r', 'c', 'm', 'y']
    
        # Creation Graphic
        fig, ax = plt.subplots(nrows=num_names_list, figsize=(20, 30), sharex=True)
        plt.suptitle(f'File XML: {rootXML}', fontsize=16, fontweight='bold', color='SteelBlue', position=(0.75, 0.95))
        plt.xticks(np.arange(-1, num_axisx), color='SteelBlue', fontweight='bold')
        labels = ['value: {0}'.format(j) for j in values_list]
        print(labels)
        i = 1
        for pos, name in enumerate(names_list):
            # get data
            data = signals_df[signals_df["Name"] == name]["Value"]
            print(data)
            # get color
            j = random.randint(0, len(colors) - 1)
            # get plots by index = pos
            x = np.hstack([-1, data.index.values, len(signals_df) - 1])
            y = np.hstack([0, data.values, data.iloc[-1]])
            ax[pos].plot(x, y, drawstyle='steps-post', marker='o', color=colors[j], linewidth=3)
            ax[pos].set_ylabel(name, fontsize=8, fontweight='bold', color='SteelBlue', rotation=30, labelpad=35)
            ax[pos].yaxis.set_major_formatter(ticker.FormatStrFormatter('%0.1f'))
            ax[pos].yaxis.set_tick_params(labelsize=6)
            ax[pos].grid(alpha=0.4)
            i += 1

            for label, x, y in zip(labels, x, y):
                plt.annotate(label, xy=(x, y), xytext=(-20, 20), textcoords='offset points', ha='right', va='bottom', bbox=dict(boxstyle='round,pad=0.5', fc='yellow', alpha=0.5),
                             arrowprops=dict(arrowstyle='->', connectionstyle='arc3,rad=0'))

        plt.show()

What I get is the annotations spliced and in different positions.

What does my code need in order to show each value at each point?

I've also tried to use the code from the Matplotlib Marker Reference, but couldn't get it to work. Thank you very much in advance; any comment helps.

MayEncoding
  • Could you give an example of how you want your plots to look? – Yulia V Jul 09 '21 at 21:38
  • Hello @Yulia, ideally it would look like the example that I did in OneNote and have put here: [example_freehand.png](https://github.com/MarshRangel/Python/blob/master/example_freehand.png). I think we would have to use one of the variables marked in yellow, or the last one (y), that I show in the following image: [data_values_chart.png](https://github.com/MarshRangel/Python/blob/master/data_values_chart.png). But I haven't been able to find out how to do it. Thanks a lot. – MayEncoding Jul 09 '21 at 23:26

2 Answers

You can call the plt.annotate function in a loop to solve your problem.

I randomly generated some data and plotted it as a single plot. You can do the same inside a subplot, the function would be the same.

import numpy as np
import matplotlib.pyplot as plt

# sample data points for the plot
x = np.arange(1, 10)
y = np.linspace(20, 40, 9)

plt.figure(figsize=[15, 5], dpi=200)
plt.plot(x, y, drawstyle='steps-post', marker='o')
# annotate every change point in a loop
for i in range(len(x)):
    # the label is the rounded y value, anchored at the point's own (x, y) coordinates;
    # a constant offset of (-20, 20) points moves the text away from the marker
    plt.annotate(f"{round(y[i])}", (x[i], y[i]), xycoords='data',
                 xytext=(-20, 20), textcoords='offset points', color="r", fontsize=12,
                 arrowprops=dict(arrowstyle="->", color='black'))
plt.show()

You can remove the arrowprops if you don't want the arrows.
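
For the subplot case mentioned above, here is a minimal sketch. It assumes made-up data and a hypothetical helper named `annotate_points`; neither comes from the question. As far as I know, `plt.annotate` draws on the current Axes, which after `plt.subplots(...)` is normally the last one created, so calling `annotate` on each Axes object keeps every label on the subplot that owns the point.

```python
import matplotlib.pyplot as plt
import numpy as np


def annotate_points(ax, x, y, offset=(-20, 20)):
    """Label every (x, y) pair on the given Axes; returns the Annotation objects."""
    return [
        ax.annotate(f"{yi}", (xi, yi), xycoords="data",
                    xytext=offset, textcoords="offset points",
                    color="r", fontsize=8)
        for xi, yi in zip(x, y)
    ]


# made-up data (assumption): two signals sharing the same x positions
x = np.arange(5)
signals = {"signal_a": np.array([0, 1, 1, 0, 1]),
           "signal_b": np.array([2, 2, 3, 1, 0])}

fig, axes = plt.subplots(nrows=len(signals), sharex=True, figsize=(8, 4))
notes = {}
for ax, (name, y) in zip(axes, signals.items()):
    ax.plot(x, y, drawstyle="steps-post", marker="o")
    ax.set_ylabel(name)
    # annotate on this Axes, not on whatever pyplot considers "current"
    notes[name] = annotate_points(ax, x, y)

# plt.show()  # uncomment in an interactive session
```

Returning the Annotation objects is only a convenience, so the labels can be restyled or removed later.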

Edited

I used the example1.xml file in your GitHub repo and edited the function a bit. All I did was add a loop and an if-else condition to your function.

# Initial part is same as yours
names_list = [name for name in signals_df['Name'].unique()]
num_names_list = len(names_list)
num_axisx = len(signals_df["Name"])
values_list = [value for pos, value in enumerate(signals_df["Value"])]
points_axisy = signals_df["Value"]
colors = ['b', 'g', 'r', 'c', 'm', 'y']
# start new figure
plt.figure(figsize=[20,28],dpi=200)
#start a loop with the subplots
for i in range(len(names_list)):
    # the grid has num_names_list rows (14 for example1.xml) and 1 column; i+1 selects the current plot
    plt.subplot(num_names_list,1,i+1)
    # choose color (the upper bound of np.random.randint is exclusive)
    col=np.random.randint(0, len(colors))
    # get the locations of the values with the similar name in your list
    locs=signals_df['Name']==names_list[i]
    # get the values in those locations
    data=signals_df['Value'][locs]
    # arrange the x and y coordinates
    x = np.hstack([-1, data.index.values, len(signals_df) - 1])
    y = np.hstack([0, data.values, data.iloc[-1]])
    # plot the values as usual
    plt.plot(x, y, drawstyle='steps-post', marker='o', color=colors[col], linewidth=3)
    plt.ylabel(names_list[i], fontsize=8, fontweight='bold', color='SteelBlue', rotation=30, labelpad=35)
    plt.grid(alpha=0.4)
    # this loop is for annotating the values 
    for j in range(len(x)):
        # I found it is better to alternate the position of the annotations 
        # so that they wont overlap for the adjacent values
        if j%2==0:
            # In this condition the xytext position is (-20,20)
            # this posts the annotation box over the plot value
            plt.annotate(f"Val={round((y[j]))}",(x[j],y[j]),xycoords='data',
                         xytext=(-20,20), textcoords='offset points',color="r",fontsize=8,
                         arrowprops=dict(arrowstyle="->", color='black'),
                        bbox=dict(boxstyle='round', pad=0.5, fc='yellow', alpha=0.5))
        else:
            # In this condition the xytext position is (-20,-20)
            # this posts the annotation box under the plot value
            plt.annotate(f"Val={round((y[j]))}",(x[j],y[j]),xycoords='data',
                 xytext=(-20,-20), textcoords='offset points',color="r",fontsize=8,
                 arrowprops=dict(arrowstyle="->", color='black'),
                         bbox=dict(boxstyle='round', pad=0.5, fc='yellow', alpha=0.5))
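
The two branches above differ only in the sign of the vertical offset, so the alternation can also be written as a single call. This is a minimal sketch, assuming a hypothetical helper named `annotate_alternating` and made-up sample data instead of the XML file:

```python
import matplotlib.pyplot as plt
import numpy as np


def annotate_alternating(ax, x, y, dx=-20, dy=20):
    """Annotate each point, flipping the label above/below on alternate points."""
    annotations = []
    for j, (xj, yj) in enumerate(zip(x, y)):
        # even points get the box above the marker, odd points below it
        offset = (dx, dy if j % 2 == 0 else -dy)
        annotations.append(
            ax.annotate(f"Val={round(float(yj))}", (xj, yj), xycoords="data",
                        xytext=offset, textcoords="offset points",
                        color="r", fontsize=8,
                        arrowprops=dict(arrowstyle="->", color="black"),
                        bbox=dict(boxstyle="round,pad=0.5", fc="yellow", alpha=0.5))
        )
    return annotations


# made-up data (assumption), plotted the same way as in the function above
fig, ax = plt.subplots(figsize=(8, 3))
x = np.arange(6)
y = np.array([0, 3, 3, 1, 4, 2])
ax.plot(x, y, drawstyle="steps-post", marker="o", linewidth=3)
labels = annotate_alternating(ax, x, y)

# plt.show()  # uncomment in an interactive session
```

Inside the loop over `names_list`, the same helper could be called with `plt.gca()` as the first argument.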

New Function Result

I hope that it is useful.

ArunRaj131
  • Thanks @ArunRaj131. It is like what I was trying in the last image I share in my question, but what would it be like if I am using subplots, so that it keeps the name, values and the len of the signals? See `x = np.hstack ([- 1, data.index.values, len (signals_df) - 1])` `y = np.hstack ([0, data.values, data.iloc [-1]])` of the code I share. – MayEncoding Jul 13 '21 at 15:29
  • Hi, I tried to edit your code a little bit. See the edited answer and the code. Hope this helps. – ArunRaj131 Jul 14 '21 at 04:04
  • Thank you @ArunRaj131. One question... The name variable of the plt.ylabel where are you taking it from? – MayEncoding Jul 14 '21 at 13:56
  • How did you upload it to tkinter? – MayEncoding Jul 14 '21 at 14:14
  • I already fixed it by using the loop variable of the first for as an index inside plt.ylabel, like this: `names_list[i]`. I just want to know how you uploaded it to tkinter? Thank you in advance for your help. – MayEncoding Jul 14 '21 at 14:22
  • Oh sorry, I forgot to change the ylabels. I also did it just like you said. – ArunRaj131 Jul 14 '21 at 15:23
  • And, I didn't upload it to tkinter. I don't understand why you got that doubt. – ArunRaj131 Jul 14 '21 at 15:24
  • Oh it's true! you take it out by console. What happens is that in GitHub, I upload it to tkinter: [py script](https://github.com/MarshRangel/Python/blob/develop/Load_File_XML.py) but thanks, I'll keep checking it. – MayEncoding Jul 14 '21 at 15:45
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/234878/discussion-between-marsh-rangel-and-arunraj131). – MayEncoding Jul 14 '21 at 23:09

I think it should be quite close to what you are after. I randomly generate data, then annotate it using matplotlib.text. It's not very pretty, you might want to add some padding and more refinements, but I hope it gives a good idea!

If two points are too close, you might want to annotate one on the left, and the other one on the right, like I am doing for the first point. I have not seen such a situation in the examples that you have given, so it's not handled.

Function place_label(label, xy, position, ax, pad=0.01) places the label where you want it to be. The rest of the code is demonstrating that it works, using randomly generated data.

import random
import numpy as np
import matplotlib.pyplot as plt

# function that places the label given the desired position
def place_label(label, xy, position, ax, pad=0.01):

  # annotate in the initial position; with matplotlib's default alignment, xy is the lower-left anchor of the text
  t_ = ax.text(x=xy[0], y=xy[1], s=label, fontsize=16)

  # find useful values (rend is the module-level renderer created below)
  tbb = t_.get_window_extent(renderer=rend)
  abb = ax.get_window_extent(renderer=rend)
  a_xlim, a_ylim = ax.get_xlim(), ax.get_ylim()

  # now adjust the position if needed
  new_xy = [xy[0], xy[1]]

  relative_width = tbb.width/abb.width * (a_xlim[1] - a_xlim[0])
  pad_x = pad * (a_xlim[1] - a_xlim[0])
  assert(position[0] in ['l', 'c', 'r'])
  if position[0] == 'c':
    new_xy[0] -= relative_width/2
  elif position[0] == 'l':
    new_xy[0] -= relative_width + pad_x
  else:
    new_xy[0] += pad_x

  relative_height =  tbb.height/abb.height * (a_ylim[1] - a_ylim[0])
  pad_y = pad * (a_ylim[1] - a_ylim[0])
  assert(position[1] in ['b', 'c', 't'])
  if position[1] == 'c':
    new_xy[1] -= relative_height/2
  elif position[1] == 'b':
    new_xy[1] -= relative_height + pad_y
  else:
    new_xy[1] += pad_y

  t_.set_position(new_xy)

  return t_

# generate data, plot it and annotate it!
axes_qty = 9
axes_gap = 0.035

fig = plt.figure(figsize=(10, 8))
ax = [plt.axes([axes_gap, axes_gap/2 + i*(1/axes_qty), 1-2*axes_gap, 1/axes_qty-axes_gap]) for i in range(axes_qty)]
rend = fig.canvas.get_renderer()

for a_ in ax:
  x_ = [random.randint(0, 10) for _ in range(5)]
  x_ = np.unique(x_)
  y_ = [random.randint(0, 12) for _ in x_]
  # as x is shared, we set the limits in advance, otherwise the adjustments won't be accurate
  a_.set_xlim([-0.5, 10.5])
  
  # plotting the data
  data_ = [[x_[0], y_[0]]]
  for i in range(1, len(x_)):
    data_ += [[x_[i-1], y_[i]], [x_[i], y_[i]]]
  a_.plot([d[0] for d in data_], [d[1] for d in data_])

  mid_y = 0.5 * (a_.get_ylim()[0] + a_.get_ylim()[1])

  # now let's label it
  for i in range(len(x_)):
    # decide what point we annotate
    if i == 0:
      xy = [x_[0], y_[0]]
    else:
      xy = [x_[i-1], y_[i]]

    # decide its position
    position_0 = 'l' if i == 0 else 'r'
    position_1 = 'b' if xy[1] > mid_y else 't'

    place_label(label=str(xy[1]), xy=xy, position=position_0+position_1, ax=a_)

plt.show()
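
For the unhandled case of two points that are too close together, one possible approach is sketched below. The helper name `choose_sides` and the `min_gap` threshold (in data units) are assumptions, not part of the code above. It returns the horizontal position codes `'l'` and `'r'` that `place_label` expects, switching sides whenever a point sits closer than `min_gap` to the previous one.

```python
def choose_sides(x_values, min_gap=1.0):
    """Return 'l' or 'r' for each point, flipping sides for crowded neighbours."""
    sides = []
    for i, x in enumerate(x_values):
        if i == 0:
            # first point: label on the left, as in the example above
            sides.append('l')
        elif x - x_values[i - 1] < min_gap:
            # too close to the previous point: use the opposite side
            sides.append('l' if sides[-1] == 'r' else 'r')
        else:
            sides.append('r')
    return sides


# sorted x positions (made-up values); 3.0 and 3.5 are closer than min_gap
sides = choose_sides([0, 3, 3.5, 7])
```

In the labelling loop, `position_0` could then be read from `sides[i]` instead of being fixed to `'r'` for every point after the first.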
Yulia V
  • I've been going through your code, integrating the function place_label (label, xy, position, ax, pad = 0.01) with extract_name_value (signals_df, rootXML), but it doesn't come out exactly as I want it [check image](https://github.com/MarshRangel/Python/blob/develop/example_some_values.png), which is with a value in each marker, there are some values that don't even appear; by the way, the position doesn't matter. To review the code you can see the script directly [here](https://github.com/MarshRangel/Python/blob/develop/Load_File_XML.py). How could I make each value be in a marker? Thanks. – MayEncoding Jul 12 '21 at 23:19
  • One of the reasons is that you are calling `get_ylim` before plotting anything on the axes, so it cannot return limits that reflect your data. – Yulia V Jul 13 '21 at 09:45
  • Try to replace line 127 in your script with `0.5 * (min(y) + max(y))` and see what happens. – Yulia V Jul 13 '21 at 09:46
  • @MarshRangel the easiest way for you would be to start with my example - it should work, then replace my dummy data with your data - I guess it will still work, then add your axes labels, and other bells and whistles. I honestly think it will be the fastest way to get your task done :) – Yulia V Jul 13 '21 at 09:55
  • also, if, when adding bells and whistles, your code no longer works, you know that it's because of bells and whistles, and you will need to ask a different question on SO. – Yulia V Jul 13 '21 at 09:56
  • I replaced line 127 with this: `0.5 * (min (y) + max (y))` and I get the same – MayEncoding Jul 13 '21 at 15:23
  • I will try to replace all my code as you tell me in your fourth comment and I will let you know how it turns out, in the end I can return it as it was, if it does not work; but I hope so. Thanks. – MayEncoding Jul 13 '21 at 15:23
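
The `get_ylim` point raised in the comments above can be checked with a minimal sketch. It uses made-up data, not the script from the question: an empty Axes reports its default limits, so a midpoint computed from them says nothing about the data, while a midpoint computed from the values themselves does not depend on when it is evaluated.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# limits of an empty Axes: matplotlib's defaults, unrelated to any data
before = ax.get_ylim()

y = [2, 5, 9, 4]  # made-up values (assumption)
ax.plot(range(len(y)), y, drawstyle="steps-post", marker="o")

# limits after plotting: autoscaled around the data, including margins
after = ax.get_ylim()

# midpoint taken directly from the data, as suggested in the comments
mid_from_data = 0.5 * (min(y) + max(y))
```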