
I'm trying to plot a time series (dates against float values) as a scatter plot using pandas, but I get the odd error 'Length mismatch: Expected axis has 3 elements, new values have 2 elements' when I plot the data.

Here is the Python code I am using:

get_ipython().magic('matplotlib inline')
import matplotlib.pyplot as plt
import seaborn; seaborn.set()
import pandas as pandas


base_rate_over_time = pandas.read_csv("/Users/clarkj84/Desktop/boe-all-time-base-rate.csv")

base_rate_over_time = base_rate_over_time.drop(['SERIES'], axis=1)

base_rate_dates_as_series = pandas.Series(base_rate_over_time['DATE'])

base_rate_over_time['DATE'] = pandas.to_datetime(base_rate_dates_as_series)

base_rate_over_time.plot(0, 0)

base_rate_over_time.reset_index(inplace=True)

base_rate_over_time.columns = ['DATE','VALUE']

base_rate_over_time.plot(kind = 'scatter', x = 'DATE', y = 'VALUE')

plt.show()

Here is a snippet of the dataset I am attempting to plot:

       index       DATE  VALUE
0          0 1975-01-02  11.50
1          1 1975-01-03  11.50
2          2 1975-01-06  11.50
3          3 1975-01-07  11.50
4          4 1975-01-08  11.50
5          5 1975-01-09  11.50
6          6 1975-01-10  11.50
7          7 1975-01-13  11.50

What is causing the column error here?

  • I think there is a problem with plotting datetimes; check [this](https://stackoverflow.com/q/27472548) for possible solutions. – jezrael Apr 11 '18 at 08:40

2 Answers


It looks like the line

base_rate_over_time.columns = ['DATE','VALUE']

is the cause of the error. Assigning to columns renames the columns, but at that point the base_rate_over_time dataframe has 3 columns while only two new names are given. The third column is there because reset_index(), called in the line above, moves the old index into a new column named index.

You could simply remove these two lines, or use drop('index', axis=1, inplace=True) before renaming the columns. Note that axis=1 is needed, since drop() removes rows by default.
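
To illustrate the explanation above, here is a minimal sketch that reproduces the error and shows two ways around it. The three-row DataFrame is made-up sample data standing in for the CSV from the question (after SERIES has been dropped), and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's CSV
# after the 'SERIES' column has been dropped.
df = pd.DataFrame({
    "DATE": pd.to_datetime(["1975-01-02", "1975-01-03", "1975-01-06"]),
    "VALUE": [11.5, 11.5, 11.5],
})

# reset_index() moves the old index into a new column named 'index',
# so the frame now has three columns instead of two.
with_index = df.reset_index()
print(list(with_index.columns))

# Assigning two names to a three-column frame raises the ValueError
# from the question. A copy is used so with_index stays untouched.
broken = with_index.copy()
try:
    broken.columns = ["DATE", "VALUE"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Option A: drop the extra column (axis=1 selects columns), then rename.
fixed_a = with_index.drop("index", axis=1)
fixed_a.columns = ["DATE", "VALUE"]

# Option B: discard the old index when resetting, so no column is added.
fixed_b = df.reset_index(drop=True)
```

Either option leaves a two-column frame, so the scatter plot call on DATE and VALUE no longer trips over the rename.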

Shaido

In my case it was a delimiter problem.

I solved it with the statement below:

df = pd.read_csv("pos.csv", sep='"', header=None)

to get

"string1, string2"
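
As a rough sketch of how a delimiter mismatch can lead to the same error: if read_csv splits on the wrong character, the frame ends up with a different number of columns than expected, and a later rename fails. The semicolon-separated text below is invented for illustration and is not the file from this answer.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited data, used only for illustration.
raw = "a;b\n1;2\n3;4\n"

# With the default sep=',' no separator is found, so every line
# is parsed as a single field and the frame has one column.
wrong = pd.read_csv(io.StringIO(raw), header=None)
print(wrong.shape)

# Passing the delimiter the data actually uses gives two columns.
right = pd.read_csv(io.StringIO(raw), sep=";", header=None)
print(right.shape)

# Renaming the mis-parsed frame with two names raises a length mismatch.
try:
    wrong.columns = ["first", "second"]
    delimiter_error = None
except ValueError as exc:
    delimiter_error = str(exc)
print(delimiter_error)
```

Checking the shape of the frame right after reading is a quick way to spot this before renaming columns.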

joydeba