I need to run two loops through my regression: one over the dependent variables, and one over a suffix for the prediction I need to save for each of them. I can run either loop on its own and it works fine, but not when I combine them in the same regression. I think the problem is in how the loop variables are mapped at the end of the regression syntax, after the %. I get the error "TypeError: list indices must be integers, not str", which I assume is because my dependent variables are read as strings in order to get their values from the SPSS data frame. Is there a way to map a for loop into a regression that includes string variables?

I have tried using the map() function, but I got an error saying that iteration is not supported.

begin program.
import spss,spssaux
dependent = ['dv1', 'dv2', 'dv3', 'dv4', 'dv5']
spssSyntax = ''
depList = spssaux.VariableDict(caseless = True).expand(dependent)
varSuffix = [1,2,3,4,5]


for dep in depList:
    for var in varSuffix:
        spssSyntax += '''
    REGRESSION 
      /MISSING LISTWISE 
      /STATISTICS COEFF OUTS R
      /CRITERIA=PIN(.05) POUT(.10) 
      /NOORIGIN 
      /DEPENDENT %(dep)s 
      /METHOD=FORWARD  iv1 iv2 iv3
      /SAVE PRED(PRE_%(var)d).
    '''%(depList[dep],varSuffix[var])
end program. 

The code above produces 'TypeError: list indices must be integers, not str'. How do I map the loops while also including a string?

J_R_H

1 Answer

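In Python, when you loop directly over an iterable, the loop variable takes each value in turn, not its position. There is therefore no need to index the original lists with depList[dep] and varSuffix[var]; doing so passes a string such as 'dv1' as a list index, which raises the TypeError. Use the loop variables dep and var directly.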
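
As a minimal sketch of that point, the snippet below reproduces the error outside SPSS. It assumes a plain list of strings (the hypothetical dep_list) as a stand-in for what expand() returns, so no spss import is needed:

```python
# Stand-in for the expanded SPSS variable list (assumption: plain strings).
dep_list = ['dv1', 'dv2', 'dv3']

# Looping directly yields the values themselves, not their positions.
seen = [dep for dep in dep_list]

# Indexing the list with one of those values reproduces the error,
# because this is the same as writing dep_list['dv1'].
try:
    dep_list[seen[0]]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(seen)
print(error_message)
```

The exact wording of the message differs slightly between Python 2 and Python 3, but the cause is the same in both.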

Additionally, consider str.format for string interpolation, which is generally preferred in Python 3 over the older, de-emphasized (though not deprecated) string modulo operator %:

for dep in depList:
    for var in varSuffix:
        spssSyntax += '''REGRESSION 
                           /MISSING LISTWISE 
                           /STATISTICS COEFF OUTS R
                           /CRITERIA=PIN(.05) POUT(.10) 
                           /NOORIGIN 
                           /DEPENDENT {0} 
                           /METHOD=FORWARD  iv1 iv2 iv3
                           /SAVE PRED(PRE_{1}).
                     '''.format(dep, var)
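
One related detail in the question's code: named placeholders such as %(dep)s expect a mapping on the right-hand side of %, so passing a tuple would still fail once the indexing is fixed. A rough sketch, using a shortened one-line template (an assumption for brevity, not the full REGRESSION command):

```python
# Shortened stand-in for the REGRESSION template (assumption).
template_named = '/DEPENDENT %(dep)s /SAVE PRED(PRE_%(var)d).'

# Named % placeholders take a dict, keyed by the placeholder names.
old_style = template_named % {'dep': 'dv1', 'var': 1}

# The str.format equivalent with named fields.
new_style = '/DEPENDENT {dep} /SAVE PRED(PRE_{var}).'.format(dep='dv1', var=1)

# Passing a tuple to named % placeholders raises a TypeError.
try:
    template_named % ('dv1', 1)
    tuple_error = None
except TypeError as exc:
    tuple_error = str(exc)

print(old_style)
print(new_style)
print(tuple_error)
```

Either style produces the same text here; named fields mainly help readability when the template is long.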

Alternatively, combine the two lists into a single loop with itertools.product, and build the string with a list comprehension and join instead of concatenating on each iteration with +=:

from itertools import product
import spss,spssaux

dependent = ['dv1', 'dv2', 'dv3', 'dv4', 'dv5']    
depList = spssaux.VariableDict(caseless = True).expand(dependent)
varSuffix = [1,2,3,4,5]

base_string = '''REGRESSION 
                   /MISSING LISTWISE 
                   /STATISTICS COEFF OUTS R
                   /CRITERIA=PIN(.05) POUT(.10) 
                   /NOORIGIN 
                   /DEPENDENT {0} 
                   /METHOD=FORWARD  iv1 iv2 iv3
                   /SAVE PRED(PRE_{1}).
              '''

# LIST COMPREHENSION UNPACKING TUPLES TO FORMAT BASE STRING
# JOIN RESULTING LIST WITH LINE BREAKS SEPARATING ITEMS
spssSyntax = "\n".join([base_string.format(*dep_var) 
                           for dep_var in product(depList, varSuffix)])
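
To see what product actually generates, here is a small sketch with plain lists standing in for the SPSS variables (dep_list and var_suffix are hypothetical names). Like the nested for loops, it pairs every dependent variable with every suffix:

```python
from itertools import product

# Plain lists standing in for depList and varSuffix (assumption).
dep_list = ['dv1', 'dv2', 'dv3', 'dv4', 'dv5']
var_suffix = [1, 2, 3, 4, 5]

# Every dependent variable is paired with every suffix, 5 x 5 in total.
pairs = list(product(dep_list, var_suffix))

print(len(pairs))   # 25
print(pairs[:5])    # [('dv1', 1), ('dv1', 2), ('dv1', 3), ('dv1', 4), ('dv1', 5)]
```

So the first dependent variable runs through all five suffixes before the second one starts, and the same PRE_ names come up again for each later variable.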

If you instead need to iterate over the two equal-length lists in parallel, element by element, use zip instead of product:

spssSyntax = "\n".join([base_string.format(d,v) 
                           for d,v in zip(depList, varSuffix)])

Or use enumerate to generate the index number:

spssSyntax = "\n".join([base_string.format(d,i+1) 
                           for i,d in enumerate(depList)])
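
As a rough check that the zip and enumerate variants give the same one-to-one pairing, the sketch below uses a shortened single-line template and plain lists (both assumptions, so that it runs outside SPSS):

```python
# Shortened stand-in for the full REGRESSION template (assumption).
base_string = ('REGRESSION /DEPENDENT {0} '
               '/METHOD=FORWARD iv1 iv2 iv3 /SAVE PRED(PRE_{1}).')

# Plain lists standing in for depList and varSuffix (assumption).
dep_list = ['dv1', 'dv2', 'dv3', 'dv4', 'dv5']
var_suffix = [1, 2, 3, 4, 5]

# zip pairs the two lists element by element: dv1 with 1, dv2 with 2, ...
zipped = [base_string.format(d, v) for d, v in zip(dep_list, var_suffix)]

# enumerate with start=1 generates the same suffixes from the position.
enumerated = [base_string.format(d, i)
              for i, d in enumerate(dep_list, start=1)]

spss_syntax = "\n".join(zipped)
print(spss_syntax)

# Inside an SPSS BEGIN PROGRAM block the string would then be run with
# spss.Submit(spss_syntax); that call is left out so this runs anywhere.
```

Each dependent variable then gets exactly one REGRESSION command and its own PRE_ suffix.
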
Parfait
  • Is there a way to have the loops work in tandem with each other instead of through each other? With both examples above (which worked beautifully in all other respects), each dependent variable first went through all 5 prediction suffixes before moving on to the next dependent. This left me with 5 predictions, all for the first dependent variable, and no predictions for the other dependent variables, since PRE_1 to PRE_5 were already taken. Did I just misuse itertools.product? Are there any other ways to link the two loops? Or possibly print the index number in the prediction suffix slot? – J_R_H Sep 21 '19 at 05:03
  • See edit with extensions for `zip` or `enumerate`. The `itertools.product` was to replicate your nested `for` loops. – Parfait Sep 21 '19 at 12:47