
I'm starting on a project where I want to create an interactive plot from this dataset:

this dataset

For now I'm just trying to plot the first row across the 2000 to 2012 columns. For that I use this:

import pandas as pd
from bokeh.io import output_file
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.plotting import show

output_file('test.html')

df = pd.read_csv('Swedish_Population_Statistics.csv', encoding="ISO-8859-1")
df.dropna(inplace=True)  # Drop rows with missing attributes
df.drop_duplicates(inplace=True)  # Remove duplicates

# Drop all the columns I don't use for now
df.drop(['region', 'marital_status', 'sex'], inplace=True, axis=1)

x = df.loc[[0]]

print(x)

Which gives me this dataframe:

    2000   2001   2002   2003   2004   2005   2006   2007   2008   2009   2010   2011   2012
0  10406  10362  10322  10288  10336  10336  10429  10585  10608  10718  10860  11121  11288

Now I want to take the column names as x-axis and the row values as y-axis.

This is where I'm stuck.

I figure the code would look something like this, but I can't work out what to put in x and y:

x = df.columns.tolist()  # Take the column names as a list
y = df.loc[[0]].values.tolist()  # Take the first row
source = ColumnDataSource(data=dict(x=x, y=y))

p = figure(title="Test")
p.line(x='x', y='y', source=source, line_color="blue", line_width=2)

I get this error:

BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('x', 13), ('y', 1)

I don't understand why the lengths differ, since I used tolist() on both.

Any help would be much appreciated; I've been trying to find a solution for the past 3 hours with no success.

Val1912
    Welcome to Stack Overflow! Please include any relevant information [as text directly into your question](https://stackoverflow.com/editing-help), do not link or embed external images of source code or data. Images make it difficult to efficiently assist you as they cannot be copied and offer poor usability as they cannot be searched. See: [Why not upload images of code/errors when asking a question?](https://meta.stackoverflow.com/q/285551/15497888) – Henry Ecker May 30 '21 at 01:40
    If you need assistance formatting a small sample of your DataFrame as a copyable piece of code for SO see [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888). – Henry Ecker May 30 '21 at 01:40

2 Answers

Okay, so I found my problem: y was a 2-dimensional list, but I needed a 1-dimensional one. That leads me to this working code:

output_file('test.html')

df = pd.read_csv('Swedish_Population_Statistics.csv', encoding="ISO-8859-1")
df.dropna(inplace=True)  # Drop rows with missing attributes
df.drop_duplicates(inplace=True)  # Remove duplicates

# Drop all the columns I don't use for now
df.drop(['region', 'marital_status', 'sex'], inplace=True, axis=1)

x = df.columns.tolist()
y = df.loc[[0]]
temp = []
temp2 = []

# Append each value of the dataframe row to a 1-dimensional list, one by one

for i in range(13):
    temp.append(y[str(2000+i)].tolist())
    temp2.append(temp[i][0])

p = figure(title="Test", sizing_mode="scale_both")
p.line(x, temp2, line_color="blue", line_width=2)
p.circle(x, temp2, fill_color="white", size=8)

show(p)

With this result:

Plot

Val1912

There's no need for a loop. You were on the right track, but you should not use double brackets:

>>> df.loc[0].values.tolist()
[111, 222, 333]

Then the dimensions of x and y are the same.
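To make the difference concrete, here is a minimal sketch. It assumes a small stand-in DataFrame (the first three year columns of row 0 from the question) instead of the real CSV; the names `nested` and `flat` are only for illustration. With a list selector, `df.loc[[0]]` returns a one-row DataFrame, so `.values.tolist()` gives a list containing one row. With a scalar label, `df.loc[0]` returns a Series, so the same call gives a flat list.

```python
import pandas as pd

# Stand-in for the CSV (assumption): three year columns of row 0 from the question
df = pd.DataFrame({"2000": [10406], "2001": [10362], "2002": [10322]})

x = df.columns.tolist()  # column names, one entry per year

# Double brackets: one-row DataFrame -> list holding a single row
nested = df.loc[[0]].values.tolist()

# Single brackets: Series -> flat list, one entry per year
flat = df.loc[0].values.tolist()

print(len(x), len(nested), len(flat))
```

Here `flat` has the same length as `x`, while `nested` has length 1, which matches the lengths reported in the warning. The flat list is the one to pair with `x`, for example via `ColumnDataSource(data=dict(x=x, y=flat))`.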

jonas