
I am creating a Python script that takes stock data and generates a graph from it. The problem is that stock data is not available for Saturdays and Sundays, so there are gaps in the dates. The graph looks like this:

[Image: stock graph]

As you can see, there are large gaps in the plot. I want to remove these gaps, that is, remove the dates that matplotlib generates automatically and that have no data, which make the graph look like this. Does anybody know how to do that? I have been trying everything for 3 days but haven't found a solution.

My code:

import pandas as pd
import pytz
import matplotlib.pyplot as plt
import json

data = json.loads(open("data.json", "r").read())

local_timezone = pytz.timezone("Asia/Kolkata")
data = data[:501]
# Extracting relevant data and creating a pandas DataFrame
timestamps = [
    pd.to_datetime(row["Data"][2]["ScalarValue"])
    for row in data
]
values = [float(row["Data"][3]["ScalarValue"]) for row in data]

df = pd.DataFrame({"Timestamps": timestamps, "Values": values})

df.set_index("Timestamps", inplace=True)



plt.figure(figsize=(12, 6))
plt.plot(df.index, df["Values"], label="Original Data")
plt.xlabel("Timestamps")
plt.ylabel("Price")
plt.title(
    "Stock Price"
)
plt.legend()
plt.grid(True)
plt.show()

data.json file:

[
  {
    "Data": [
      {
        "ScalarValue": "Nifty Bank"
      },
      {
        "ScalarValue": "price"
      },
      {
        "ScalarValue": "2023-01-20 06:58:00.000000000"
      },
      {
        "ScalarValue": "42644.2"
      }
    ]
  },
  {
    "Data": [
      {
        "ScalarValue": "Nifty Bank"
      },
      {
        "ScalarValue": "price"
      },
      {
        "ScalarValue": "2023-01-20 06:59:00.000000000"
      },
      {
        "ScalarValue": "42642.25"
      }
    ]
  },
...
]
Vitalizzare
  • What does `ScalarValue` look like when there is no data? – Vishnudev Krishnadas Aug 05 '23 at 09:14
  • If `ScalarValue` exists, it is guaranteed to have data. The issue is that objects are missing in between, meaning the whole "Data" array is absent for those times, and matplotlib fills in these dates automatically. – Jatin Thakur Aug 05 '23 at 09:51
  • I think you need this https://stackoverflow.com/questions/70392290/how-to-leave-gaps-in-plot-of-incomplete-timeseries – Vishnudev Krishnadas Aug 05 '23 at 10:05
  • If lines are misleading, it means you should plot only points, e.g. with `plt.scatter(...)`! Lines between points serve no objective purpose. – OCa Aug 05 '23 at 11:10

1 Answer


You can use the weekday() function from datetime (pandas timestamps provide it too). Try this with your complete data.json file:

import json

import matplotlib.pyplot as plt
import pandas as pd

with open("data.json", "r") as f:
    data = json.load(f)

data = data[:501]

# Parse each timestamp once and keep only weekday rows
# (weekday() returns 5 for Saturday and 6 for Sunday).
rows = [
    (pd.to_datetime(row["Data"][2]["ScalarValue"]), float(row["Data"][3]["ScalarValue"]))
    for row in data
]
rows = [(ts, value) for ts, value in rows if ts.weekday() not in (5, 6)]

df = pd.DataFrame(rows, columns=["Timestamps", "Values"])

df.set_index("Timestamps", inplace=True)



plt.figure(figsize=(12, 6))
plt.plot(df.index, df["Values"], label="Original Data")
plt.xlabel("Timestamps")
plt.ylabel("Price")
plt.title(
    "Stock Price"
)
plt.legend()
plt.grid(True)
plt.show()

The weekday() function returns the day of the week as an integer from 0 to 6, where 0 corresponds to Monday and 6 corresponds to Sunday. To skip Saturday and Sunday in your case, exclude the rows whose weekday is 5 or 6.
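One caveat worth keeping in mind: matplotlib spaces datetime values in proportion to real time, so dropping weekend rows may not by itself close the gap on the x-axis. A different technique, not part of the code above, is to plot against each sample's integer position and use the timestamps only as tick labels. The following is a minimal sketch under that assumption; `positional_ticks` and `plot_without_gaps` are hypothetical helper names introduced here for illustration, and the sample timestamps are made up.

```python
from datetime import datetime


def positional_ticks(timestamps, max_ticks=8, fmt="%Y-%m-%d %H:%M"):
    """Return (tick positions, tick labels) for a gap-free x-axis.

    Each sample sits at its integer position 0..n-1 instead of at its real
    time, so periods without data take up no horizontal space.
    """
    n = len(timestamps)
    if n == 0:
        return [], []
    step = max(1, -(-n // max_ticks))  # ceiling division: at most max_ticks labels
    positions = list(range(0, n, step))
    labels = [timestamps[i].strftime(fmt) for i in positions]
    return positions, labels


def plot_without_gaps(timestamps, values, max_ticks=8):
    """Plot values against their position and label ticks with timestamps."""
    # Imported here so positional_ticks stays usable without matplotlib.
    import matplotlib.pyplot as plt

    positions, labels = positional_ticks(timestamps, max_ticks)
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(range(len(values)), values, label="Original Data")
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_xlabel("Timestamps")
    ax.set_ylabel("Price")
    ax.set_title("Stock Price")
    ax.legend()
    ax.grid(True)
    fig.tight_layout()
    return fig


# Made-up sample: two Friday ticks, then two Monday ticks after a weekend.
sample_times = [
    datetime(2023, 1, 20, 15, 29),
    datetime(2023, 1, 20, 15, 30),
    datetime(2023, 1, 23, 9, 15),
    datetime(2023, 1, 23, 9, 16),
]
positions, labels = positional_ticks(sample_times, max_ticks=2)
```

With the DataFrame from the question, this could be called as `plot_without_gaps(list(df.index), df["Values"].tolist())` followed by `plt.show()`. The trade-off is that the x-axis is no longer linear in time, so the distance between two points says nothing about how much time passed between them.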

Sup