
I am trying to backtest my code on some data, but it raises "ValueError: No objects to concatenate".

This is my code:

import yfinance as yf
import numpy as np
import pandas as pd

sp_500 = yf.Ticker("MSFT")
sp_500 = sp_500.history(period="max")

# sp_500.plot.line(y="Close", use_index=True)

del sp_500["Dividends"]
del sp_500["Stock Splits"]

sp_500["Tomorrow"] = sp_500["Close"].shift(-1)

sp_500["Target"] = (sp_500["Tomorrow"] > sp_500["Close"]).astype(int)  # 1/0 instead of True/False

sp_500 = sp_500.loc["2022-01-01":].copy()

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score

model = RandomForestClassifier(n_estimators=100, min_samples_split=100, random_state=1)

train = sp_500.iloc[:-100]
test = sp_500.iloc[-100:]

predictors = ["Close", "Volume", "Open", "High", "Low"]
model.fit(train[predictors], train["Target"])

preds = model.predict(test[predictors])
preds = pd.Series(preds, index=test.index)

combined = pd.concat([test["Target"], preds], axis=1)


def predict(train, test, predictors, model):
    model.fit(train[predictors], train["Target"])
    preds = model.predict(test[predictors])
    preds = pd.Series(preds, index=test.index, name="Predictions")
    combined = pd.concat([test["Target"], preds], axis=1)
    return combined


def backtest(data, model, predictors, start=2500, step=250):
    all_predictions = []

    for i in range(start, data.shape[0] - step, step):
        train = data.iloc[i:(i + step)]
        test = data.iloc[(i + step):(i + 2 * step)]
        predictions = predict(train, test, predictors, model)
        all_predictions.append(predictions)

    return pd.concat(all_predictions)

predictions = backtest(sp_500, model, predictors)
print(predictions["Predictions"].value_counts())

I expected the output to be the counts of predicted price increases and decreases over the backtest, i.e. the number of 1s (price increase) and 0s (price decrease) in the "Predictions" column.

Vitalizzare
Devansh
  • Have you done even the most basic of debugging here? The error implies that either `preds` or `test["Target"]` is empty. Have you printed these values to see if that is the case? – Tim Roberts Jul 29 '23 at 19:39
  • Add the full error message. And make a [mre]. If the error occurs in `combined = pd.concat(...`, we obviously don't need the rest of the code. Can you reproduce the problem with even less code, for example without running RF? Trimming your code is also basic debugging. – Ignatius Reilly Jul 29 '23 at 20:25
  • See also [How to make good reproducible pandas examples](/q/20109391/4518341) – wjandrea Jul 30 '23 at 16:11

1 Answer

In the function backtest, you never check whether your data has enough rows for the default parameters start=2500 and step=250. I believe this is the issue. As of Jul 30, 2023, running the code up to the first call of backtest leaves only 394 records in sp_500, so the loop never executes its body:

for i in range(start, data.shape[0] - step, step)

because range(2500, 394-250, 250), i.e. range(2500, 144, 250), is empty: its start is already past its stop. Execution therefore jumps straight to the line

return pd.concat(all_predictions)

where all_predictions is still an empty list. That is why you get ValueError: No objects to concatenate: pd.concat was literally given no objects. To avoid it, either keep more history before calling backtest or pass smaller start and step values.
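
The arithmetic above can be checked without downloading any data. The following is only a minimal sketch: backtest_windows is a hypothetical helper (not part of the original code) that mirrors the index bounds used by the loop in backtest and fails early with a clearer message, and start=100, step=50 are illustrative values chosen to fit 394 rows.

```python
def backtest_windows(n_rows, start=2500, step=250):
    """Hypothetical helper: list the (train, test) row bounds that the
    loop in backtest would visit, or fail early if there are none."""
    windows = [
        ((i, i + step), (i + step, i + 2 * step))
        for i in range(start, n_rows - step, step)
    ]
    if not windows:
        # Same condition that leaves all_predictions empty in backtest.
        raise ValueError(
            f"No backtest windows for n_rows={n_rows}, start={start}, "
            f"step={step}; need n_rows - step > start."
        )
    return windows


# With the numbers from this answer, the range has no elements.
empty = list(range(2500, 394 - 250, 250))
print(empty)  # []

# Assumed smaller parameters that do fit 394 rows.
small = backtest_windows(394, start=100, step=50)
print(len(small), small[0])  # 5 ((100, 150), (150, 200))
```

A similar emptiness check could be placed inside backtest itself, just before the final pd.concat call, so that the function reports the size mismatch instead of the less obvious concatenation error.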

Vitalizzare