SAS like Macros in Python

Question

I am very new to the world of Python and have started to learn the coding gradually. I am actually trying to implement all my SAS codes in Python to see how they work. One of my code involves using the macros. The code looks something like this

%macro bank_data (var,month);

proc sql;
create table work_&var. as select a.accountid, a.customerseg. a.product, a.month, b.deliquency_bucket from 
table one as a left join mibase.corporate_&month. as b
on a.accountid=b. accountid and a.month=b.&month;
quit
%mend;
% bank_data (1, 202010);
%bank_data(2,202011);
%bank_data(3,202012);

I am quite comfortable with the merging step in python but want to understand how do i do this macro step in Python?

Python doesn't have macros. The closest you can get is string formatting, like `"create table work_{var}.".format(var=1)` — ForceBru, May 10 '21 at 18:01
is there a way I can refer the different month tables i have at the bottom in Python? — Swati Kanchan, May 10 '21 at 18:09
@SwatiKanchan Write a function that takes a month value (and any other arguments) as parameter(s). — AlexK, May 10 '21 at 18:21
@SwatiKanchan stop trying to transliterate SAS into Python. *Learn Python* then write code in Python. — juanpa.arrivillaga, May 10 '21 at 18:26
@SwatiKanchan I can't give an example because I don't know what your function will actually do (what you will be merging, what method you will use for the merge, and what the output/returned value will be). Python does not have an equivalent of proc sql, so your SAS code cannot be easily translated to Python without incorporating third-party libraries. There is a lot of tutorials on Python functions, please spend some time learning how to write a basic function in Python and come back with a focused question on anything you don't understand. Good luck! — AlexK, May 10 '21 at 18:57
Just to be clear - "translate this code in X language into Y" is *not on topic here*. What is on topic is asking how to do a specific thing in X language, given the requirements, and then you can show how you did it in another language - but it still has to be a good question without the other (here, SAS) code. — Joe, May 10 '21 at 19:27
First convert your SQL to python merge/lookup/join. Then figure out how to filter it. Then wrap it in a function that you can pass the parameters too. Similar in SAS - first get your code working without macro, then add your macro stuff. — Reeza, May 10 '21 at 20:54
@Joe Actually I think my question got misunderstood. The question was not to basically replicate the proc sql part in Python - I wanted to understand how do i basically do this macro thing in python where I can repeat one code multiple times. — Swati Kanchan, May 11 '21 at 10:09

fthomson · Answer 1 · 2021-05-11T15:51:32.833

0

If you are trying to run this SQL from python I would suggest something like this

import pyodbc
var = [1,2,3]
months = [202010,202011,202012]

def bank_data(var, month):
    # search for how to format connection string
    cnc_string = "DRIVER={SQL Server};SERVER=YOURSERVER;DATABASE=YOURDATABASE;Trusted_Connection=yes"
    
    query = f"""
    proc sql;
    create table work_{var}. as select a.accountid, a.customerseg. a.product, a.month, b.deliquency_bucket from 
    table one as a left join mibase.corporate_{month}. as b
    on a.accountid=b. accountid and a.month=b.{month};
    quit
    """
    
    with pyodbc.connect(conn_str) as conn:
        conn.execute(query)

for v, m in zip(var, months):
    bank_data(v, m)

also I got a bit lazy you should really parameterize this to prevent sql injections pyodbc - How to perform a select statement using a variable for a parameter

edited May 11 '21 at 15:51

answered May 10 '21 at 22:42

fthomson

773
3
9

Thank you @fthomson. This was indeed helpful but as I mentioned above my question was not about writing this proc sql in Python but more to understand how can I basically loop the macro in Python. How do i repeat the same thing again and again for different months in Python like I could do in SAS using macro – Swati Kanchan May 11 '21 at 10:12
I am kind of guessing at what you are looking for, but it sounds like you just want to loop through a list of months and vars. This will input bank_data(1,202010), bank_data(2,202011) and so on. I would suggest looking into constructing loops and creating functions in python before continuing on with sql. – fthomson May 11 '21 at 15:55

score 0 · Accepted Answer · answered May 18 '21 at 19:20

Swati, this a a great way to learn Python, I hope that my answer helps.

Background

First, the data structure that best resembles a SAS dataset is the Pandas DataFrame. If you have not installed the Pandas library, I strongly encourage you to do so and follow these examples.

Second, I assume that the table 'one' is already a Pandas DataFrame. If that is not a helpful assumption, you may need to see code to import SAS datasets, assign file paths, or connect to a database, depending on your data management choices.

Also, here is my interpretation of your code:

%macro bank_data(var,month);
    proc sql;
        create table work.work_&var. as
        select      a.accountid,
                    a.customerseg,
                    a.product,
                    a.month,
                    b.deliquency_bucket
        from        work.one as a
        left join   mibase.corporate_&month. as b
        on          a.accountid = b.accountid
        and         a.month = b.&month;
    quit;
%mend;

Pandas Merge

Generally, to do a left join in Pandas, use Dataframe.merge(). If the left table is called "one" and the right table is called "corporate_month", then the merge statement looks as follows. The argument left_on applies to the left-dataset "one" and the right_on argument applies to the right-dataset "corporate_month".

month = 202010

corporate_month = 'corporate_{}.sas7bdat'.format(month)

work_var = one.merge(right=corporate_month, how='left', left_on=['accountid', 'month'], right_on=['accountid', month])

Dynamic Variable Assignment

Now, to name the resulting dataset based on a variable. SAS Macro's are simply text replacement, but you cannot use that concept in variable assignment in Python. Instead, if you insist on doing this, you will need to get comfortable with dictionaries. Below is how I would implement your requirement.

var = 1
month = 202010
dict_of_dfs = {}

corporate_month = 'corporate_{}.sas7bdat'.format(month)
work_var = 'work_{}'.format(var)

dict_of_dfs[work_var] = one.merge(right=corporate_month, how='left', left_on=['accountid', 'month'], right_on=['accountid', month])

As a Function

Lastly, to turn this into a function where you pass "var" and "month" as arguments:

dict_of_dfs = {}

def bank_data(var, month):

   corporate_month = 'corporate_{}.sas7bdat'.format(month)
   work_var = 'work_{}'.format(var)

   dict_of_dfs[work_var] = one.merge(right=corporate_month, how='left', left_on=['accountid', 'month'], right_on=['accountid', month])

bank_data(1, 202010)
bank_data(2, 202011)
bank_data(3, 202012)

Export

If you want to export each of the resulting tables as SAS datasets, look into the SASPy library.

@Greg- This was extremely helpful. Thank you so much – Swati Kanchan May 20 '21 at 05:48 — Swati Kanchan, May 20 '21 at 05:48