I've been trying to understand how to accomplish this very simple task of plotting two datasets, each with a different color, but nothing i found online seems to do it. Here is some sample code:
import pandas as pd
import numpy as np
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
ds1x = np.random.randn(1000)
ds1y = np.random.randn(1000)
ds2x = np.random.randn(1000) * 1.5
ds2y = np.random.randn(1000) + 1
ds1 = pd.DataFrame({'dsx' : ds1x, 'dsy' : ds1y})
ds2 = pd.DataFrame({'dsx' : ds2x, 'dsy' : ds2y})
ds1['source'] = ['ds1'] * len(ds1.index)
ds2['source'] = ['ds2'] * len(ds2.index)
ds = pd.concat([ds1, ds2])
Goal is to produce two datasets in a single frame, with a categorical column keeping track of the source. Then i try plotting a scatter plot.
scatter = hv.Scatter(ds, 'dsx', 'dsy')
scatter
And that works as expected. But i cannot seem to understand how to color the two datasets differently based on the source
column. I tried the following:
scatter = hv.Scatter(ds, 'dsx', 'dsy', color='source')
scatter = hv.Scatter(ds, 'dsx', 'dsy', cmap='source')
Both throw warnings and no color. I tried this:
scatter = hv.Scatter(ds, 'dsx', 'dsy')
scatter.opts(color='source')
Which throws an error. I tried converting the thing to a Holoviews dataset, same type of thing.
Why is something that is supposed to be so simple so obscure?
P.S. Yes, i know i can split the data and overlay two scatter plots and that will give different colors. But surely there has to be a way to accomplish this based on categorical data.