I am writing a loop that prints the formulas for a regression, because I want to test which pair of variables gives the best fit. I expect the following output:
ClaimNb ~ C(DrivAgeGroup)+ C(BMGroup)
ClaimNb ~ C(DrivAgeGroup)+ C(Area)
ClaimNb ~ C(BMGroup)+ C(Area)
But instead I am getting:
ClaimNb ~ C(DrivAgeGroup)+ C(BMGroup)
ClaimNb ~ C(DrivAgeGroup)
Why is that? Thanks.
remaining = ['C(Area)','C(DrivAgeGroup)','C(BMGroup)']
for candidate in remaining:
    fitted_var = remaining
    fitted_var.remove(candidate)
    formula = "{} ~ {} ".format('ClaimNb', '+ '.join(fitted_var))
    print(formula)
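
A likely cause: `fitted_var = remaining` does not copy the list. It only gives the same list a second name, so `fitted_var.remove(candidate)` shrinks `remaining` while the `for` loop is still iterating over it, and the loop runs out of elements after two passes. A minimal sketch of one way around this, assuming the goal is every formula with exactly one term left out (the names `response` and `formulas` are introduced here for illustration and are not in the original code):

```python
response = 'ClaimNb'
remaining = ['C(Area)', 'C(DrivAgeGroup)', 'C(BMGroup)']

formulas = []
for candidate in remaining:
    # Build a new list on each pass instead of aliasing `remaining`,
    # so the list being iterated over is never modified.
    fitted_var = [term for term in remaining if term != candidate]
    formula = "{} ~ {}".format(response, ' + '.join(fitted_var))
    formulas.append(formula)
    print(formula)
```

With three terms this prints three formulas, one per pair, and leaves `remaining` unchanged. Copying first with `fitted_var = list(remaining)` and then calling `fitted_var.remove(candidate)` should behave the same way, and `itertools.combinations(remaining, 2)` is another option if the pairs themselves are what is needed.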