1

Is it possible to plot a single variable as a scatter plot? I can already plot it as a line by computing the CCDFs and adding markers, but I want to know whether any alternative is available.

Input:

Input 1

tweetcricscore 51 high active

Input 2

tweetcricscore 46 event based
tweetcricscore 12 event based
tweetcricscore 46 event based

Input 3

tweetcricscore 1 viewers 
tweetcricscore 178 viewers

Input 4

tweetcricscore 46 situational
tweetcricscore 23 situational
tweetcricscore 1 situational
tweetcricscore 8 situational
tweetcricscore 56 situational

I can easily write scatter plot code with bokeh and pandas when I have both x and y values. But what about the case of a single value?

All the inputs are merged into one input and grouped by col[3]; the values are in col[2].

The code below is for a data set with 2 variables:

import pandas as pd
# bokeh.charts is deprecated; later Bokeh releases moved it to the separate bkcharts package
from bokeh.charts import Scatter, output_file, show

df = pd.read_csv('input.csv', header = None)

df.columns = ['col1','col2','col3','col4']

scatter = Scatter( df, x='col2', y='col3', color='col4', marker='col4', title='plot', legend=True)

output_file('output.html', title='output')

show(scatter)

Sample Output

[image]

Sitz Blogz
  • For a scatter plot you have to say what you want on the x-axis and what on the y-axis. Alternatively, you may use a `barplot` with the category (etc.) names on the x-axis. – MaxU - stand with Ukraine May 12 '16 at 18:46
  • Yes, it is clear that a scatter plot needs both x and y. But in this case I have univariate data as output, with different categories. Line and bar plots are very common; I am trying to find a better visualisation type. – Sitz Blogz May 12 '16 at 18:50
  • @MaxU One more reason for not plotting bar charts is that the inputs range from 1 to 1000s of values. – Sitz Blogz May 12 '16 at 19:39
  • How many different categories are you going to have? I'm __not__ asking about values... – MaxU - stand with Ukraine May 12 '16 at 20:17
  • @MaxU As shown above, there are three columns, with the third column as the category, and 5 categories at most. – Sitz Blogz May 12 '16 at 20:33
  • You can simply sum up your values for each category. Would that be useful for you? – MaxU - stand with Ukraine May 12 '16 at 20:49
  • @MaxU I cannot sum them up because these are outputs from a classification, and the categories are the classes. They can be written to one output file or to independent ones. These are final results of a sort, and the challenge here is how to show them in a better way. – Sitz Blogz May 12 '16 at 23:07
  • Perhaps a [swarmplot](http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.swarmplot.html)? – mwaskom May 12 '16 at 23:40
  • @mwaskom I am facing issues with `swarmplot`: the data length is about 8k+ rows. – Sitz Blogz May 13 '16 at 13:27

4 Answers

2

You could try a boxplot or violinplot. Alternatively if you don't like these and just want a vertical distribution of dots you could force a scatter to plot along a single x value. To do this you would need to create an array of a fixed value (say 1) that is the same length as the array you will be plotting:

import matplotlib.pyplot as plt

# data: the 1-D sequence of values to plot
ones = [1] * len(data)  # the same x value (1) for every point

plt.scatter(ones, data)
plt.show()

That will give you something like this:

[image]
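A variation on the same idea, as a rough sketch only: give each category its own x position, nudge the points sideways with a little random jitter, and draw a box plot underneath. The numbers are the sample rows from the question; the column names `value` and `category` and the output file name are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample rows from the question, merged into one frame.
# The column names "value" and "category" are made up for this sketch.
df = pd.DataFrame({
    "value": [51, 46, 12, 46, 1, 178, 46, 23, 1, 8, 56],
    "category": (["high active"] + ["event based"] * 3
                 + ["viewers"] * 2 + ["situational"] * 5),
})

# One array of values per category, in order of first appearance.
groups = {name: g["value"].to_numpy()
          for name, g in df.groupby("category", sort=False)}

rng = np.random.default_rng(0)  # fixed seed so the jitter is reproducible
fig, ax = plt.subplots(figsize=(8, 5))

# Box plots are drawn at x = 1, 2, 3, ... by default.
# Outlier markers are hidden because the raw points are overlaid below.
ax.boxplot(list(groups.values()), showfliers=False)

# Overlay the raw points with horizontal jitter so equal values do not overlap.
for pos, values in enumerate(groups.values(), start=1):
    x = pos + rng.uniform(-0.15, 0.15, size=len(values))
    ax.scatter(x, values, alpha=0.6)

ax.set_xticks(range(1, len(groups) + 1))
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("value")
fig.savefig("category_scatter.png")  # hypothetical output file name
```

With values spanning 1 to 1000s, `ax.set_yscale("log")` may make the spread easier to read.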

Grr
  • Thank you Grr for the solution. It's not about disliking line or bar plots. It's just that they are too common, and I want to find out whether there are more interesting visualisations out there that I should know about. – Sitz Blogz May 12 '16 at 19:06
  • One more reason for not plotting bar charts is that the inputs range from 1 to 1000s of values. – Sitz Blogz May 12 '16 at 19:40
  • How about a swarm plot? – Sitz Blogz May 13 '16 at 01:33
  • Yea that's a good one. Documentation is [here](https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.swarmplot.html). [This question](http://stackoverflow.com/questions/36153410/how-to-create-swarm-plot-with-matplotlib/36153957) has a good example of a swarmplot boxplot combo. – Grr May 13 '16 at 02:27
1

UPDATE:

Look at the Bokeh and Seaborn galleries; they might help you understand what kind of plot fits your needs.

You may try a violin plot like this:

import seaborn as sns
sns.violinplot(x="category", y="val", data=df)

[image]

or HeatMaps:

import numpy as np
import pandas as pd
from bokeh.charts import HeatMap, output_file, show

cats = ['active', 'based', 'viewers', 'situational']
df = pd.DataFrame({'val': np.random.randint(1,100, 1000), 'category': np.random.choice(cats, 1000)})

hm = HeatMap(df)
output_file('d:/temp/heatmap.html')
show(hm)
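
Since `bokeh.charts` is no longer part of current Bokeh releases, a similar heat map can be sketched with pandas and matplotlib alone. This is only an illustration under assumptions: the data are random, as in the snippet above, and the ten equal-width bins, the `val_bin` column and the output file name are choices made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random illustrative data in the same shape as the snippet above (not real results).
rng = np.random.default_rng(0)
cats = ["active", "based", "viewers", "situational"]
df = pd.DataFrame({
    "val": rng.integers(1, 100, size=1000),
    "category": rng.choice(cats, size=1000),
})

# Bin the values into ten equal-width ranges, then count rows per (category, bin).
bins = np.linspace(0, 100, 11)
df["val_bin"] = pd.cut(df["val"], bins=bins)
counts = pd.crosstab(df["category"], df["val_bin"])

# Draw the count table as a colour-coded grid.
fig, ax = plt.subplots(figsize=(9, 3))
image = ax.imshow(counts.to_numpy(), aspect="auto", cmap="viridis")
ax.set_xticks(range(counts.shape[1]))
ax.set_xticklabels([str(interval) for interval in counts.columns],
                   rotation=45, ha="right")
ax.set_yticks(range(counts.shape[0]))
ax.set_yticklabels(list(counts.index))
fig.colorbar(image, ax=ax, label="number of rows")
fig.tight_layout()
fig.savefig("heatmap.png")  # hypothetical output file name
```
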
MaxU - stand with Ukraine
0

You can plot the index on the x-axis and the column value on the y-axis:

df = pd.DataFrame(np.random.randint(0,10,size=(100, 1)), columns=list('A'))
sns.scatterplot(data=df['A'])

[image]
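Applied to the sample rows from the question, the same index-versus-value idea might look like the sketch below, here in plain matplotlib with one colour per category. The column names `value` and `category` and the output file name are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question, merged into one frame.
# The column names "value" and "category" are made up for this sketch.
df = pd.DataFrame({
    "value": [51, 46, 12, 46, 1, 178, 46, 23, 1, 8, 56],
    "category": (["high active"] + ["event based"] * 3
                 + ["viewers"] * 2 + ["situational"] * 5),
})

fig, ax = plt.subplots(figsize=(8, 4))

# One scatter call per category, so each gets its own colour and legend entry.
for name, group in df.groupby("category", sort=False):
    # group.index holds each row's position in the merged input.
    ax.scatter(group.index, group["value"], label=name)

ax.set_xlabel("row index")
ax.set_ylabel("value")
ax.legend(title="category")
fig.savefig("index_scatter.png")  # hypothetical output file name
```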

H_J
0

Something I use rather regularly is a "size plot" – a visualization similar to the one you're requesting where a single feature can be compared across groups. Here is an example using your data:

[Image: a size plot made using matplotlib]

Here is the code to achieve this size plot:

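import matplotlib.pyplot as plt

# df is assumed to hold the question's data, with the values in a
# 'tweetcricscore' column and the category in a 'type' column
fig, ax = plt.subplots(1,1, figsize=(8,5))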

colors = ['blue','green','orange','pink']

yticks = {"ticks":[],"labels":[]}
xticks = {"ticks":[],"labels":[]}

agg_functions = ["mean","std","sum"]

# Set size plot
for i, (label, group_df) in enumerate(df.groupby('type', as_index=False)):

    # Set tick
    yticks["ticks"].append(i)
    yticks["labels"].append(label)

    agg_values = group_df["tweetcricscore"].aggregate(agg_functions)

    # Series.iteritems() was removed in pandas 2.0; items() is the equivalent
    for ii, (agg_f, x) in enumerate(agg_values.items()):
        ax.scatter(x=ii, y = i, label=agg_f, s=x, color=colors[i])


        # Add your x axis
        if ii not in xticks["ticks"]:
            xticks["ticks"].append(ii)
            xticks["labels"].append(agg_f)


# Set yticks:
ax.set_yticks(yticks["ticks"]) 
ax.set_yticklabels(yticks["labels"], fontsize=12)

ax.set_xticks(xticks["ticks"]) 
ax.set_xticklabels(xticks["labels"], fontsize=12)


plt.show()
Yaakov Bressler