
I am trying to draw a scatter plot with connected points, using gnuplot 4.6 on Ubuntu 15.10.
My .dat file looks as follows:

    X               Y           ?
    63072000        33          New York
    64022400        12          Sacramento
    64022400        21          Seattle
    315532800       33          Boston
    639964800       21          San Francisco
    706320000       33          Seattle

So, the X-axis contains the date and the Y-axis contains the event, where the numbers encode the weather condition in groups (10 -> Sunny condition, 30 -> Rainy condition, and so on, where the second digit describes the severity). Both X and Y values can occur multiple times. The color (or shape) of the dots should indicate the location, which I marked with ? above. Ultimately, the graph should show the date, the event, and the trend (by connecting the dots).
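The encoding described above can be sketched in Python as follows. This is only an illustration: the function name `decode_weather` and the table `GROUP_LABELS` are hypothetical, and only the two groups named in the question (10 and 30) have labels; any other group falls back to a generic name.

```python
# Group labels named in the question; other groups are not specified there.
GROUP_LABELS = {10: "Sunny", 30: "Rainy"}


def decode_weather(code):
    """Split a two-digit weather code into (group label, severity)."""
    group, severity = divmod(code, 10)  # 33 -> group 3, severity 3
    label = GROUP_LABELS.get(group * 10, "group %d" % (group * 10))
    return label, severity


for code in (33, 12, 21):
    print(code, decode_weather(code))
```

Keeping the group and the severity separate makes it easier to label the Y-axis by group later on.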

I tried the following, which I found in this SO post:

    plot "weather.dat" u 1:2:3 with lines

But the X-range seems to be invalid. Does anyone see the error? :/

And one more thing: it doesn't matter whether the solution uses gnuplot or matplotlib. I am thankful for hints in any direction :)

Thanks!

EDIT

Thanks to armatita, the plot is now almost done: Semi-Final Plot

user1252280

2 Answers


The link you've posted shows a world map. The example I'm showing here only has markers with lines connecting them (so they are not geo-localized). In any case, using matplotlib (I don't know about gnuplot), you can adapt the following recipe:

    import random
    import matplotlib.pyplot as plt

    x = ['01.01.1960','12.01.1960','12.01.1960','01.01.1970','13.04.1980']
    y = ['Heavy Rain','Sunshine','Slight Hail','Heavy Rain','Slight Hail']
    l = ['New York','Sacramento','Seattle','Boston','San Francisco']

    nx,ny = [],[]
    for i in range(len(x)):
        nx.append(i)
        ny.append(-i)
        s = random.randint(100,150)
        m = random.choice(['o','s','^','d'])
        color = random.randint(0,255)/255,random.randint(0,255)/255,random.randint(0,255)/255
        plt.scatter(i,-i,s=s,marker=m,color=color,label=y[i])
        plt.text(i,-i,x[i])
    plt.plot(nx,ny,'--')
    plt.legend()
    plt.show()

This will give an image like this: [image: Markers with line connecting them]

Notice I'm changing the size of the marker, the marker itself, and the color, adding text to each point, and at the end showing a legend.
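If the marker and color should identify the location rather than be random, one option is to assign each distinct city a fixed style. The following is a minimal sketch under that assumption; the helper `style_by_city` and the two style lists are hypothetical names, not part of matplotlib.

```python
from itertools import cycle

# Candidate styles; the lists are cycled if there are more cities than styles.
MARKERS = ['o', 's', '^', 'd', '*']
COLORS = ['red', 'blue', 'green', 'orange', 'purple']


def style_by_city(cities):
    """Map each distinct city to one (marker, color) pair, in order of first appearance."""
    styles = {}
    pool = cycle(zip(MARKERS, COLORS))
    for city in cities:
        if city not in styles:
            styles[city] = next(pool)
    return styles


cities = ['New York', 'Sacramento', 'Seattle', 'Boston', 'San Francisco', 'Seattle']
styles = style_by_city(cities)
print(styles['Seattle'])
```

Each pair can then be passed to `plt.scatter(..., marker=m, color=c)`, so a city that occurs several times is always drawn the same way.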

If you want to draw a map considering real locations you might want to take a look at Basemap

EDIT (after the poster clarified their intention): The following code:

    xt = [63072000,64022400,64022400,315532800,639964800,706320000]
    y2 = [33,12,21,33,21,33]
    l  = ['New York','Sacramento','Seattle','Boston','San Francisco','Seattle']
    lm  = ['o','s','^','d','*','^']
    cl  = ['red','blue','green','orange','purple','green']

    import matplotlib.pyplot as plt
    from matplotlib.dates import YearLocator, MonthLocator, DateFormatter,AutoDateLocator
    import datetime

    # get the dates into something readable
    x2 = [datetime.datetime.fromtimestamp(i) for i in xt]
    years = YearLocator()   # every year
    months = MonthLocator()  # every month
    yearsFmt = DateFormatter('%Y')
    auto = AutoDateLocator()

    # plot lines and markers
    fig, ax = plt.subplots()
    ax.plot(x2, y2, '--', color='black')  # plot_date is deprecated in recent matplotlib; plot accepts datetimes
    for i in range(len(x2)):
        ax.scatter(x2[i],y2[i],s=300,marker=lm[i],color=cl[i])
        plt.text(x2[i],y2[i],l[i])

    # format the ticks
    ax.xaxis.set_major_locator(auto)
    ax.xaxis.set_major_formatter(yearsFmt)
    ax.xaxis.set_minor_locator(months)
    ax.autoscale_view()

    ax.set_yticks([10,20,30], minor=False)
    ax.set_yticklabels(['Sunny','More or Less','Rainy'])
    #ax.yticks([10,20,30], ['Sunny','More or Less','Rainy'], rotation='vertical')

    ax.fmt_xdata = DateFormatter('%Y-%m-%d')
    ax.grid(True)

    fig.autofmt_xdate()

    plt.show()

will plot the dates on the X-axis and the weather groups on the Y-axis, with one labelled marker per city and a dashed line connecting the points.

NOTE: I must say this is a really strange plot. Intuitively, it seems to me the trend should be shown for each city and not between cities (you should have one line per city, although you don't seem to have the data for such a plot). In any case, this is the code that does what you requested.
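For reference, a line per city could be sketched as below, reusing the sample data from the question. The function names `group_by_city` and `plot_series` are hypothetical, and with this sample only Seattle has more than one point, so only that city would get an actual line.

```python
from collections import defaultdict
import datetime

xt = [63072000, 64022400, 64022400, 315532800, 639964800, 706320000]
y2 = [33, 12, 21, 33, 21, 33]
l = ['New York', 'Sacramento', 'Seattle', 'Boston', 'San Francisco', 'Seattle']


def group_by_city(timestamps, values, cities):
    """Collect (date, value) points per city, sorted by date."""
    series = defaultdict(list)
    for t, v, city in zip(timestamps, values, cities):
        series[city].append((datetime.datetime.fromtimestamp(t), v))
    for points in series.values():
        points.sort()
    return dict(series)


def plot_series(series):
    """Draw one connected line (with markers) per city."""
    import matplotlib.pyplot as plt  # imported here so the grouping works without matplotlib
    fig, ax = plt.subplots()
    for city, points in series.items():
        dates, values = zip(*points)
        ax.plot(dates, values, marker='o', label=city)
    ax.legend()
    fig.autofmt_xdate()
    plt.show()


series = group_by_city(xt, y2, l)
print(sorted(series))
```

Calling `plot_series(series)` then draws the figure; with more observations per city, each line would show that city's own trend.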

armatita
  • Thanks a lot for your answer. I actually prefer matplotlib, but was recommended gnuplot for some reason. I appreciate your code, but I think there was a misunderstanding: I would like to have the time on the X-axis and the weather on the Y-axis. I also added a line to the sample data (a city can also occur multiple times). So I want to show which weather occurred in which city and when, and then see the trend and how it changed over time. Also, thanks a lot for the Basemap hint: I didn't know that before! – user1252280 Mar 18 '16 at 17:18
  • Sure. Matplotlib can handle dates directly; check this example: http://matplotlib.org/examples/pylab_examples/date_demo1.html As for the weather, you need to somehow transform the strings you have into some kind of quantitative variable, for instance lower values for less rain and higher values for more rain, in order to put them on the Y-axis. Once you've decided how you want to do it, just rewrite the labels in the plot. For this, check: http://matplotlib.org/examples/ticks_and_spines/ticklabels_demo_rotation.html – armatita Mar 18 '16 at 21:56
  • Wow, you are pretty fast at answering; I can't keep up with my questions ;) So, I made several changes as you suggested. First, the date is now an undelimited string (UNIX timestamp), so the X-axis contains increasing numbers. Second, as you kindly suggested, I now have numbers (groups) for each weather condition: Slight Rain (31), Rain (32), Heavy Rain (33), which I explain in the text around the plot. I also updated the initial question! Would you be so kind as to update your answer? I can then accept it as the right one. Thanks for your help :) – user1252280 Mar 19 '16 at 12:43
  • I've edited the answer, although I must say this is a really strange plot. Personally, I advise you to get more data and draw a line for each city. Trend analysis across different locations (considering that weather is what we are talking about) is, at the very least, uncommon (and debatable from a scientific point of view). – armatita Mar 19 '16 at 18:09
  • I am so very sorry that I keep us both busy and apparently am still not communicating the right thing :( I updated the question one final time with a plot; maybe it makes clearer what I tried to explain to you. The X-axis holds the Unix timestamps sorted in ascending order, and the Y-axis holds the number that corresponds to both the weather condition (which I will explain in the text around the plot) and the Unix timestamp. The last thing missing is the proper orientation of the axis labels (neither the X nor the Y labels sit right beside the ticks). Sorry for this big misunderstanding (and waste of your time)! – user1252280 Mar 19 '16 at 20:10
  • @herMan the code has everything you need to do what you are requesting. It has methods to position the ticks (ax.set_yticks([10,20,30], minor=False)) and methods to write whatever you want on the ticks (ax.set_yticklabels(['Sunny','More or Less','Rainy'])), and it handles both color and markers in the scatter call. I'm sure it will take you only a few minutes to solve these apparent problems and adapt whatever you need to adapt. – armatita Mar 19 '16 at 20:40
0

There are several corrections:

First, the column separator seems to be a tab, so you could write:

    set datafile separator "\t"

Otherwise, there would be an ambiguity between "Heavy Rain" "New York" and "Heavy" "Rain New York".

Second, the time format must be described:

    set xdata time
    set timefmt "%d.%m.%Y"

Third, the data values are labels rather than numbers, and the colors should depend on those values, so there are some options that might be interesting to try:

    plot ... with labels palette
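For anyone taking the matplotlib route instead, the first two corrections (tab-separated columns and the "%d.%m.%Y" date format) have a rough Python equivalent. This is only a sketch: the helper `read_rows` is a hypothetical name, and the two sample rows are assumed from the date, weather, and city strings used in the other answer rather than taken from the actual data file.

```python
import csv
import io
from datetime import datetime

# Two tab-separated sample rows (assumed layout: date, weather, city).
sample = (
    "01.01.1960\tHeavy Rain\tNew York\n"
    "12.01.1960\tSunshine\tSacramento\n"
)


def read_rows(handle):
    """Parse tab-separated rows into (date, weather, city) tuples."""
    rows = []
    for date_text, weather, city in csv.reader(handle, delimiter="\t"):
        # Same day.month.year format as the gnuplot timefmt above
        rows.append((datetime.strptime(date_text, "%d.%m.%Y"), weather, city))
    return rows


rows = read_rows(io.StringIO(sample))
print(rows[0])
```

Splitting on tabs keeps multi-word values such as "Heavy Rain" and "New York" in their own columns, which is the same ambiguity the separator setting avoids in gnuplot.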