Plotly chart is a mess of lines after index converted to pandas datetime

Question

My Plotly chart is just a mess of zig-zagging lines (see chart here). This only happens after I use `df['Date'] = pd.to_datetime(df.index)` to convert the index to datetime format.

Full code:

#IMPORTS
import yfinance as yf
import time
import pandas as pd
import datetime
import numpy as np
import xlsxwriter
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# SETTING UP DF
df = ((pd.read_csv('Book1.csv')).set_index('Date'))[:-1]
df['SMA30'] = df.Total.rolling(30).sum()
df['SMA365'] = df.Total.rolling(365).sum()
df['Monthly Avg'] = df.SMA30.mean()
df['Date'] = pd.to_datetime(df.index)

# PLOTTING FIGURE
fig = go.Figure()
fig.update_layout(title = 'EQ Footfall')
fig.add_trace(go.Scatter(x=df['Date'], y=df.Total, name = 'Footfall Daily'))
fig.add_trace(go.Scatter(x=df.index, y=df.SMA30, name = 'SMA30'))
fig.add_trace(go.Scatter(x=df.index, y=df.SMA365, name = 'SMA365'))
fig.update_xaxes(rangeslider_visible=True)
fig.update_xaxes(tickangle=-45)

Using `df = pd.DataFrame({"Date":pd.date_range("1-jan-2010", periods=365*10).astype(str), "Total":np.random.randint(1,5, 365*10)}).set_index("Date")` instead of `read_csv()` plots correctly. I don't have access to your CSV, but I suspect there are data issues in your data frame — Rob Raymond, Aug 27 '21 at 06:28
I suspect that sorting the index would help. `df.sort_index()` — Oddaspa, Aug 27 '21 at 07:10
@ShanGovind Please share a sample of your data as described [here](https://stackoverflow.com/questions/63163251/pandas-how-to-easily-share-a-sample-dataframe-using-df-to-dict/63163254#63163254). And make sure that you provide a complete code snippet that reproduces your problem. — vestland, Aug 27 '21 at 07:11
Thank you @Oddaspa Your suggestion worked to resolve the issue. I will repost it as an answer below and credit you. — Shan Govind, Sep 15 '21 at 02:04

score 2 · Accepted Answer · answered Aug 27 '21 at 07:14

The order of the dates in the dataframe index is important.
I have simulated dates in non-sequential order in the format YYYYMMDD.
Without the line df = df.reindex(df.sort_index().index), the plot draws lines between points in row order, so when x is not sequential the line zig-zags back and forth.
When the date is a string it is treated as a categorical, so it behaves differently from a continuous variable.

import numpy as np
import pandas as pd
import plotly.graph_objects as go

# SETTING UP DF
# df = ((pd.read_csv('Book1.csv')).set_index('Date'))[:-1]
df = pd.DataFrame({"Date":pd.Series(pd.date_range("1-jan-2018", periods=int(365*2.5))).dt.strftime("%Y%m%d"), 
                   "Total":np.random.randint(1,5, int(365*2.5))}).set_index("Date")
# simulate dates in non-sequential order by shuffling the rows
df = df.sample(frac=1)
# restore sequential order; NB dates of format YYYYMMDD sort correctly as strings
df = df.reindex(df.sort_index().index)
df['SMA30'] = df.Total.rolling(30).sum()
df['SMA365'] = df.Total.rolling(365).sum()
df['Monthly Avg'] = df.SMA30.mean()
df['Date'] = pd.to_datetime(df.index)
# df["Date"] = df.index

# PLOTTING FIGURE
fig = go.Figure()
fig.update_layout(title = 'EQ Footfall')
fig.add_trace(go.Scatter(x=df['Date'], y=df.Total, name = 'Footfall Daily'))
fig.add_trace(go.Scatter(x=df.index, y=df.SMA30, name = 'SMA30'))
fig.add_trace(go.Scatter(x=df.index, y=df.SMA365, name = 'SMA365'))
fig.update_xaxes(rangeslider_visible=True)
fig.update_xaxes(tickangle=-45)
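
The note that YYYYMMDD strings are sortable does not carry over to every date format. As a minimal sketch, assuming the CSV held day-first strings such as 15/01/2021 (the real format of Book1.csv is not known), sorting the raw strings gives a non-chronological order, while converting to datetime first and then sorting gives the order a line chart needs:

```python
import pandas as pd

# Hypothetical day-first date strings (an assumption; the real CSV format is unknown)
df = pd.DataFrame(
    {"Total": [3, 1, 4, 2]},
    index=pd.Index(
        ["01/02/2021", "15/01/2021", "02/01/2021", "10/02/2021"], name="Date"
    ),
)

# Sorting the raw strings compares them character by character, not by date
string_sorted = df.sort_index().index.tolist()

# Converting to datetime first, then sorting, gives chronological order
df.index = pd.to_datetime(df.index, format="%d/%m/%Y")
df = df.sort_index()
```

Here `string_sorted` places 10/02/2021 before 15/01/2021, whereas the final `df` runs from 2 January to 10 February.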

score 1 · Answer 2 · answered May 05 '22 at 06:38

Another solution I have found is to change the date format in the CSV file to yyyy-mm-dd. This seems to be mainly an issue with how Plotly reads dates. Hope this helps.

Shan Govind

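If editing the CSV is not practical, a similar result can probably be reached at load time by telling pandas the date format explicitly. The sketch below uses a hypothetical in-memory stand-in for Book1.csv with day-first dates, which is an assumption. Without an explicit format, a string like 03/04/2021 is read month-first by default.

```python
import io

import pandas as pd

# Hypothetical stand-in for Book1.csv with day-first dates (an assumption)
csv_text = "Date,Total\n03/04/2021,5\n04/04/2021,7\n13/04/2021,6\n"
df = pd.read_csv(io.StringIO(csv_text))

# An explicit format removes the day/month ambiguity of strings like 03/04/2021
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# A sorted DatetimeIndex gives the plot a continuous, ordered x axis
df = df.set_index("Date").sort_index()
```

With the format given, the first row is parsed as 3 April 2021 rather than 4 March 2021.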

score 0 · Answer 3 · answered Sep 15 '21 at 02:10

Suggestion from @Oddaspa also worked for me:

Sorting the index would help: `df.sort_index()`

Shan Govind

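One detail worth noting when applying this suggestion: `sort_index()` returns a new DataFrame by default, so the result has to be assigned back. A minimal sketch with hypothetical out-of-order data, which also uses `is_monotonic_increasing` as a quick check for whether sorting is needed at all:

```python
import pandas as pd

# Hypothetical out-of-order data (an assumption for illustration)
df = pd.DataFrame(
    {"Total": [2, 3, 1]},
    index=pd.to_datetime(["2021-01-03", "2021-01-01", "2021-01-02"]),
)

# Quick diagnostic: False here, because the rows are out of order
was_sorted = df.index.is_monotonic_increasing

# Calling sort_index() without assignment leaves df unchanged
df.sort_index()
still_unsorted = not df.index.is_monotonic_increasing

# Assign the result back (or pass inplace=True) to actually reorder df
df = df.sort_index()
```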
