1

I have a function in excel like this

=IF(B17="","",MIN(MAX(CEILING((B17-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10))

input:

Column1 output
512.96  10
307.41  3
413.76  7
323.65  4
376.84  5
368.79  5
367.77  5
345.65  4

It can be dissected as follows

ceiling((min-max)/10, 1)
max(ceiling, 1)
min(max,10)

I have the code running till ceiling function as below

def point_10_conversion(new_df):
    g = ((new_df.sub(new_df.min(axis=0))) / ((new_df.max(axis=0)) - (new_df.min(axis=0))))/10
    f = np.around(g.astype(np.double), 3)
    ceil = np.ceil(f)
    print(ceil)

can someone help to convert this excel function into pandas or python code? I am using dataframe for calculations.

Thank you in advance!!

dumb_coder
  • 315
  • 5
  • 21
  • kindly share some data with expected ouptut https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – sammywemmy Mar 16 '20 at 10:18
  • @sammywemmy, I have the updated the question likewise – dumb_coder Mar 16 '20 at 10:21
  • [Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael Mar 16 '20 at 10:22
  • 1
    @dumb_coder can you please put it as text in a code block as well as the output associated with that input? That way people can copy/paste into their interpreter to actually try stuff out on it... – Jon Clements Mar 16 '20 at 10:22
  • Thanks and what is `B$17:B$46` - do we have all the necessary information here to actually calculate the output given just `column1` here? (and it'd be handy to either have what you want the output as if it's a single value, or showing a `column2` with what the value should be...) – Jon Clements Mar 16 '20 at 10:33
  • @JonClements, consider the B17 as column 1 row 1. The output is likewise – dumb_coder Mar 16 '20 at 10:37

2 Answers2

1

Consider passing in a Pandas Series as parameter in order to return a same length Series as the Excel formula runs by individual cells to return results of same length results. Then either call the Python function for single column assignment or with DataFrame.transform for select columns or all columns assignment.

def point_10_conversion(ser):
    g = (ser - ser.min()) / ((ser.max() - ser.min())/10)    
    res = pd.Series(np.ceil(g))

    # SERIES APPLY APPROACH (POSSIBLY SLOWER)
    # m_res = res.apply(lambda x: min(max(x, 1), 10))

    # NUMPY ARRAY APPROACH
    m_res = np.minimum(np.maximum(res, 1), 10)

    return m_res


# ASSIGN A SINGLE COLUMN
df['Output'] = point_10_conversion(df['Column1'])

# ASSIGN SELECTED MULTIPLE COLUMNS (BY JOINING DFs)            
df = pd.concat([df, (df.reindex(['Column1', 'Column2', 'Column3'], axis = 'columns')
                       .transform(point_10_conversion)
                       .set_axis(['Col1_Output', 'Col2_Output', 'Col3_Output'], 
                                 axis = 'columns', inplace = False)
                    )],
               axis = 1)    

# REPLACE ALL COLUMNS (ASSUMING ALL INT/FLOAT TYPES)
df = df.transform(point_10_conversion)

Output (comparing to actual Excel formula output that does not match OP's posted numbers)

Excel

Excel Output

Python

  • Assign a Single Column

        Column1  Output
    0   512.96    10.0
    1   307.41     1.0
    2   413.76     6.0
    3   323.65     1.0
    4   376.84     4.0
    5   368.79     3.0
    6   367.77     3.0
    7   345.65     2.0
    
  • Assign Multiple Columns (with random generated data)

    np.random.seed(3162020)
    
    df = pd.DataFrame({'Column1': [512.96, 307.41, 413.76, 323.65, 376.84, 368.79, 367.77, 345.65],
                       'Column2': np.random.uniform(350, 500, 8),
                       'Column3': np.random.uniform(350, 500, 8)})
    
    # ASSIGN SELECTED MULTIPLE COLUMNS  
       Column1     Column2     Column3  Column1  Column2  Column3
    0   512.96  498.143814  465.920589     10.0     10.0      8.0
    1   307.41  405.430558  451.238911      1.0      4.0      7.0
    2   413.76  355.728386  362.713986      6.0      1.0      1.0
    3   323.65  498.231310  363.784559      1.0     10.0      1.0
    4   376.84  488.124593  420.322426      4.0     10.0      5.0
    5   368.79  469.047969  441.922624      3.0      8.0      7.0
    6   367.77  435.742375  492.355799      3.0      6.0     10.0
    7   345.65  474.028331  387.297520      2.0      9.0      2.0
    
    # REPLACE ALL COLUMNS (ASSUMING ALL INT/FLOAT TYPES)
       Column1  Column2  Column3
    0     10.0     10.0      8.0
    1      1.0      4.0      7.0
    2      6.0      1.0      1.0
    3      1.0     10.0      1.0
    4      4.0     10.0      5.0
    5      3.0      8.0      7.0
    6      3.0      6.0     10.0
    7      2.0      9.0      2.0
    

Online Demo (click Run at top)

Parfait
  • 104,375
  • 17
  • 94
  • 125
1

There is way of translating a function like the one presented in the question without necessarily having to write the Python yourself.

Libraries such as PyCel, Formulas, xlcalculator and Koala use an AST to translate Excel formulas into Python.

I am the project owner of xlcalculator so I will use that library in a demonstration. That said the other libraries are well capable of this particular task. Each library has different heritage and so they have different strengths.

Usually the above mentioned libraries read an Excel file, translates the formulas into Python and then provides functionality to evaluate. Xlcalculator can also parse a specially crafted dict which is what I'm taking advantage of here.

from xlcalculator import ModelCompiler
from xlcalculator import Model
from xlcalculator import Evaluator

input_dict = {
        "Sheet1!B16" : "Column1",
        "Sheet1!B17" : 512.96,
        "Sheet1!B18" : 307.41,
        "Sheet1!B19" : 413.76,
        "Sheet1!B20" : 323.65,
        "Sheet1!B21" : 376.84,
        "Sheet1!B22" : 368.79,
        "Sheet1!B23" : 367.77,
        "Sheet1!B24" : 345.65,

        "Sheet1!C16" : "OP results",
        "Sheet1!C17" : 10,
        "Sheet1!C18" : 3,
        "Sheet1!C19" : 7,
        "Sheet1!C20" : 4,
        "Sheet1!C21" : 5,
        "Sheet1!C22" : 5,
        "Sheet1!C23" : 5,
        "Sheet1!C24" : 4,

        "Sheet1!D16" : "Actual Output (Parfait)",
        "Sheet1!D17" : '=IF(B17="", "", MIN(MAX(CEILING((B17-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D18" : '=IF(B18="", "", MIN(MAX(CEILING((B18-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D19" : '=IF(B19="", "", MIN(MAX(CEILING((B19-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D20" : '=IF(B20="", "", MIN(MAX(CEILING((B20-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D21" : '=IF(B21="", "", MIN(MAX(CEILING((B21-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D22" : '=IF(B22="", "", MIN(MAX(CEILING((B22-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D23" : '=IF(B23="", "", MIN(MAX(CEILING((B23-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )',
        "Sheet1!D24" : '=IF(B24="", "", MIN(MAX(CEILING((B24-MIN(B$17:B$46))/((MAX(B$17:B$46)-MIN(B$17:B$46))/10),1),1),10) )'
    }

compiler = ModelCompiler()
my_model = compiler.read_and_parse_dict(input_dict)
evaluator = Evaluator(my_model)

print(evaluator.evaluate("Sheet1!C16"))
print("Sheet1!C17", evaluator.evaluate("Sheet1!C17"))
print("Sheet1!C18", evaluator.evaluate("Sheet1!C18"))
print("Sheet1!C19", evaluator.evaluate("Sheet1!C19"))
print("Sheet1!C20", evaluator.evaluate("Sheet1!C20"))
print("Sheet1!C21", evaluator.evaluate("Sheet1!C21"))
print("Sheet1!C22", evaluator.evaluate("Sheet1!C22"))
print("Sheet1!C23", evaluator.evaluate("Sheet1!C23"))
print("Sheet1!C24", evaluator.evaluate("Sheet1!C24"))
print()
print(evaluator.evaluate("Sheet1!D16"))
print("Sheet1!D17", evaluator.evaluate("Sheet1!D17"))
print("Sheet1!D18", evaluator.evaluate("Sheet1!D18"))
print("Sheet1!D19", evaluator.evaluate("Sheet1!D19"))
print("Sheet1!D20", evaluator.evaluate("Sheet1!D20"))
print("Sheet1!D21", evaluator.evaluate("Sheet1!D21"))
print("Sheet1!D22", evaluator.evaluate("Sheet1!D22"))
print("Sheet1!D23", evaluator.evaluate("Sheet1!D23"))
print("Sheet1!D24", evaluator.evaluate("Sheet1!D24"))
>python stackoverflow.py
OP results
Sheet1!C17 10
Sheet1!C18 3
Sheet1!C19 7
Sheet1!C20 4
Sheet1!C21 5
Sheet1!C22 5
Sheet1!C23 5
Sheet1!C24 4

Actual Output (Parfait)
Sheet1!D17 10.0
Sheet1!D18 1
Sheet1!D19 6.0
Sheet1!D20 1.0
Sheet1!D21 4.0
Sheet1!D22 3.0
Sheet1!D23 3.0
Sheet1!D24 2.0
bradbase
  • 409
  • 6
  • 9