
I'm trying to loop through a dictionary, but use the looped keys (or values if I so choose) in a calculation.

EDIT 11/24: Adding additional context

So the example would be:

STEP 1) I have a CSV file with the following columns: time, Change, Column 1, Column 2, Column 3, Column 4

import pandas as pd

file = pd.read_csv("/Path/test.csv")

time   Change    Column 1   Column 2   Column 3   Column 4
01/01    .5          3          5          7         1
01/01    .3          5          1          4         2
01/01    .5          1          3          8         2
01/02    .3          5          1          4         1
01/02    .5          1          3          8         3

STEP 2) I have a dictionary that groups these columns and stores, as key-value pairs, the information needed for the calculations on each column. I am also assigning a code to each column for later use (NOTE: There are 10 groups of columns, each containing anywhere from 4-20 columns; the dictionary below is only an example):

dict = {  # note: this name shadows the built-in dict
    'group': {
        'a': {
            'ranges': {
                'a': {'<': 0, '>': -1},
                'b': {'<': 0, '>': -2}},
            'indicators': {
                'a': 'Column 1',
                'b': 'Column 2'}},
        'b': {
            'ranges': {
                'a': {'<': 0, '>': -1},
                'b': {'<': 0, '>': -2}},
            'indicators': {
                'a': 'Column 3',
                'b': 'Column 4'}}}}

STEP 3 [This is where I'm having trouble]) The 'Groups' are a collection of columns with 'similar' data as well as corresponding ranges I want to test against. I am trying to compare different combinations of groups with other groups. So compare ALL the different RANGES of Column 1 to all the different RANGES of Column 3, then 4... then compare ALL the different RANGES of Column 2 to all the different RANGES of Column 3, then 4. My thought was to assign a CODE based on the corresponding letters for GROUP, RANGES, and INDICATORS... so for example "aaa" would be GROUP a, RANGES {'<':0,'>':-1}, in INDICATOR Column 1
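The coding scheme described in STEP 3 could be sketched roughly as follows. This is only an illustration: `config` and `build_codes` are hypothetical names (the question calls the dictionary `dict`), and the letter order group, range, indicator is taken from the "aaa" example.

```python
from itertools import product

# Copy of the question's example dictionary under the hypothetical name `config`.
config = {
    "group": {
        "a": {
            "ranges": {"a": {"<": 0, ">": -1}, "b": {"<": 0, ">": -2}},
            "indicators": {"a": "Column 1", "b": "Column 2"},
        },
        "b": {
            "ranges": {"a": {"<": 0, ">": -1}, "b": {"<": 0, ">": -2}},
            "indicators": {"a": "Column 3", "b": "Column 4"},
        },
    }
}

def build_codes(cfg):
    """Map each 3-letter code (group, range, indicator) to (column name, bounds)."""
    codes = {}
    for group_key, group in cfg["group"].items():
        # Pair every range key with every indicator key inside the group.
        for range_key, ind_key in product(group["ranges"], group["indicators"]):
            code = group_key + range_key + ind_key
            codes[code] = (group["indicators"][ind_key], group["ranges"][range_key])
    return codes

codes = build_codes(config)
print(codes["aaa"])  # ('Column 1', {'<': 0, '>': -1})
```

With the example dictionary this yields eight codes (2 groups x 2 ranges x 2 indicators), each pointing at the column to test and the bounds to test it against.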

NOTE: I only need to compare 3 columns at a time, e.g. the loop can stop at 3 combinations, however I'd like to also know if 2 combinations are better than 3.
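One possible way to enumerate the 2- and 3-way combinations mentioned in the note is `itertools.combinations`. The sketch below assumes the codes already exist as 3-letter strings, and it also assumes (this is not stated in the question) that a combination should not use the same column twice with different ranges. `code_combinations` is a hypothetical helper name.

```python
from itertools import combinations

# Hypothetical codes in the group/range/indicator scheme.
# "aaa" and "aba" both refer to group a, indicator a (the same column).
codes = ["aaa", "aba", "baa", "bab"]

def code_combinations(codes, sizes=(2, 3)):
    """Yield tuples of codes for each size, skipping tuples that repeat a column."""
    for size in sizes:
        for combo in combinations(codes, size):
            # A column is identified by (group letter, indicator letter).
            columns = {(code[0], code[2]) for code in combo}
            if len(columns) == size:
                yield combo

combos = list(code_combinations(codes))
print(len(combos))  # 7
```

Keeping the sizes in one tuple makes it easy to compare the 2-column results with the 3-column results afterwards.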

STEP 4) I want to group the different combinations to see which combinations of COLUMNS and RANGES work the best, by calculating the MEAN, MAX, MIN, and then COUNT of each combination.
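A minimal sketch of the STEP 4 summary, assuming each code has already been turned into a boolean mask over the rows. The data is the sample table from STEP 1; the two conditions are made up so that they match some rows (the question's ranges between -1 and 0 match nothing in the sample), and `summarize` is a hypothetical helper name.

```python
import pandas as pd

# Sample rows copied from the table in STEP 1.
df = pd.DataFrame({
    "time": ["01/01", "01/01", "01/01", "01/02", "01/02"],
    "Change": [0.5, 0.3, 0.5, 0.3, 0.5],
    "Column 1": [3, 5, 1, 5, 1],
    "Column 2": [5, 1, 3, 1, 3],
    "Column 3": [7, 4, 8, 4, 8],
    "Column 4": [1, 2, 2, 1, 3],
})

def summarize(df, masks):
    """Return one row of `Change` statistics per code.

    `masks` maps a code to a boolean Series that selects its rows.
    """
    rows = {
        code: df.loc[mask, "Change"].agg(["mean", "max", "min", "count"])
        for code, mask in masks.items()
    }
    return pd.DataFrame(rows).T.rename_axis("Code")

# Hypothetical conditions, chosen only so the sample data has matches.
masks = {
    "aaa": df["Column 1"] > 2,
    "aab": df["Column 2"] < 4,
}
summary = summarize(df, masks)
print(summary)
```

The result has one row per code and the columns mean, max, min, and count, which is the shape of the desired output table.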

I am attempting to create one large loop for this, but in an effort to learn on my own I have only been asking the community about pieces of it. I now see that this piecemeal approach hides the context of my issue, so hopefully this additional explanation provides a bit more clarity.

My eventual desired output:

Code   Average   Max   Min   Count
aaa    0.25      0.5   0.3   3
aab    0.25      0.5   0.3   3
aac    0.25      0.5   0.3   3
aba    0.25      0.5   0.3   3
abb    0.25      0.5   0.3   3
abc    0.25      0.5   0.3   3
jrcart
  • Can you provide outputs for the example you provided? Maybe that would make the question clearer. Also, it would be nice to define the question more concisely, ideally as a function: what's your input and what's the output? – giliev Nov 21 '20 at 23:54
  • The output would be the calculation, so: file["Total"] = file["Column 1"]/file["Column 2"]. Really, my question is not about a full solution, just about how to reference the items in the columns. I will try to edit the question to provide a little more clarity. – jrcart Nov 22 '20 at 14:51

2 Answers


Are you just having trouble with iterating over nested dictionaries and accessing their keys and values? Something like this?

for group_name, columns in dict['group'].items():
    for col, comparisons in columns.items():
        for operator, number in comparisons.items():
            print(operator, number)
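Note that the loop above assumes each column maps directly to operator/number pairs. The dictionary as edited into the question has an extra level ('ranges' and 'indicators'), so the innermost loop would yield keys such as 'a' rather than operators. A sketch of a traversal for that layout, using a trimmed copy of the dictionary under the hypothetical name `config`:

```python
# Trimmed copy of the question's dictionary; `config` is a hypothetical name.
config = {
    "group": {
        "a": {
            "ranges": {"a": {"<": 0, ">": -1}},
            "indicators": {"a": "Column 1", "b": "Column 2"},
        }
    }
}

def walk(cfg):
    """Collect (group, range key, symbol, number) and
    (group, indicator key, column name) tuples."""
    comparisons, columns = [], []
    for group_key, group in cfg["group"].items():
        # 'ranges' holds the operator/number pairs ...
        for range_key, bounds in group["ranges"].items():
            for symbol, number in bounds.items():
                comparisons.append((group_key, range_key, symbol, number))
        # ... while 'indicators' holds plain column names.
        for ind_key, column in group["indicators"].items():
            columns.append((group_key, ind_key, column))
    return comparisons, columns

comparisons, columns = walk(config)
print(comparisons)  # [('a', 'a', '<', 0), ('a', 'a', '>', -1)]
print(columns)      # [('a', 'a', 'Column 1'), ('a', 'b', 'Column 2')]
```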

UPDATE:

If you want to create the code to be executed using the retrieved operators and numbers, one approach that I can think of is to use eval.

First, format the string, inserting the column name, operator and number:

f"df['{col}'] {operator} {number}"
# result: "df['Column 6'] > 6"

You can then evaluate this string using the built-in Python eval function:

eval(f"df['{col}'] {operator} {number}")

Note that this will not print anything; if you wanted to see the output, you'd have to add print:

eval(f"print(df['{col}'] {operator} {number})")
# This should print something like:
# index1 False
# index2 True
# index3 False
# Name: Column 6, dtype: bool

The entire code could look like this:

for group_name, columns in dict["group"].items():
    for col, comparisons in columns.items():
        for operator, number in comparisons.items():
            column_fulfils_condition = eval(f"df['{col}'] {operator} {number}")  
            if column_fulfils_condition["index1"]:
                print(f"Yay! index1 for {col} is {operator} than {number}")  # or do whatever else you need

eval can only handle very simple code (single expressions), so it may not be enough for your use case. There are other approaches, which have been discussed elsewhere.

natka_m
  • Thank you. My issue is accessing and using the values from the loop. So I am able to get the operator and the number, but how would I then use those in an equation? So for example: IF ['Column 1'] operator number... e.g. IF ['Column 1'] >-1 – jrcart Nov 24 '20 at 13:50
  • @jrcart I've updated my answer. This is only one available approach, and not necessarily the best one (have a look at the other new answer too), but it depends on what your data looks like and what you want to do. I'd say that if you have further doubts, you should ask a new question (but first search for the already existing ones, naturally ;)). – natka_m Nov 24 '20 at 14:41
  • Thank you. This helps, but it makes me realize that I probably need to re-create the dictionary, as I get a traceback once it gets to "indicators". Perhaps I need to put "indicators" as the first key, match on that, then move down through the key-value pairs from there. This is helpful though and points me further in the right direction. – jrcart Nov 24 '20 at 17:42

If I understand you correctly, you would like to turn the string representation of an operator into an operator you can apply to the values in the dictionary (is that correct?). In that case you could, for example, define a dictionary mapping each symbol to the corresponding function from the operator module,

import operator
ops = {'<': operator.lt,
       '<=': operator.le,
       '=': operator.eq,
       '>': operator.gt,
       '>=': operator.ge,
       '!=': operator.ne
      }

and then retrieve the operator for a symbol s via ops[s], e.g.

s = '<'
ops[s](1, 2)  # evaluates to True
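The same lookup also works on a whole pandas column, because the functions in the operator module simply delegate to the usual comparison operators. A sketch, assuming both entries of a range such as {'<': 0, '>': -1} must hold at once; `range_mask` is a hypothetical helper name and the data is made up:

```python
import operator

import pandas as pd

ops = {"<": operator.lt, ">": operator.gt}

def range_mask(series, bounds):
    """AND together every comparison in `bounds`, e.g. {'<': 0, '>': -1}."""
    mask = pd.Series(True, index=series.index)
    for symbol, number in bounds.items():
        # ops[symbol](series, number) is equivalent to `series <symbol> number`.
        mask &= ops[symbol](series, number)
    return mask

values = pd.Series([-0.5, -1.5, 0.2])  # made-up data
mask = range_mask(values, {"<": 0, ">": -1})
print(mask.tolist())  # [True, False, False]
```

The resulting boolean Series can be used to filter rows before a groupby or an aggregation, without building strings for eval.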
ctenar
  • Hopefully the edits I just made provide a bit more context. The question is more about 'using' the operators than about populating them: how do I pull them out of the dictionary and use them in a groupby calculation as I loop through numerous combinations? – jrcart Nov 24 '20 at 15:00
  • I didn't want to repeat what the existing answer already covers. But once you've pulled out the operator symbol and the values you want to apply it to, you can get an actual operator from a predefined dictionary, as I suggested, or you can use eval, as the author of the older answer suggested. – ctenar Nov 24 '20 at 15:06