2

trying to take values from a user's input, known as the symbols parameter in my def tickers() function, and feed it into my SQL query as a filter. Right now it's vulnerable to injections:

@app.callback(
    dash.dependencies.Output('output-graph', 'figure'),
    [dash.dependencies.Input('dynamic-dropdown', 'value')])

def tickers(symbols):
    conn.rollback()
    if symbols == None:
        raise PreventUpdate
    elif symbols == '':
        raise PreventUpdate


    trace1 = [] #lines for stocks

    d = {} #dates
    p = {} #prices
    stock_info = {}

    for stock in symbols:
        stock_info[stock] = get_dict_resultset(f"SELECT date, close FROM security_price WHERE security_price.id  = '{stock}' ;")
        d[stock]  = [rec['date'] for rec in stock_info[stock]]
        p[stock]  = [rec['close'] for rec in stock_info[stock]]

        trace1.append(go.Scatter(x=d[stock],
                                 y=p[stock],
                                 mode='lines',
                                 opacity=0.7,
                                 name=stock,
                                 textposition='bottom center'))

I have sanitized my f-strings using the placeholder format. I couldn't get it to work using the format above.

For example, this works great:

def statsTable(symbols):
    conn.rollback()
    if symbols == None:
        conn.rollback()
        raise PreventUpdate
    
    
    placeholders = ", ".join(['%s' for _ in symbols])

    # PREPARED STATEMENT WITH PARAM PLACEHOLDERS
    sql = f"""SELECT id, cast(marketcap as money), cast(week52high as money), cast(week52low as money)
                     , to_char(dividend_yield * 100, '99D99%%')
                     , pe_ratio, ROUND(beta,2) 
              FROM security_stats 
              WHERE security_stats.id IN ({placeholders})
           """

Going back to the original, first code block, above. I tried adding %s as a placeholder to sanitize the same way I did before, but it does not work.

@app.callback(
    dash.dependencies.Output('output-graph', 'figure'),
    [dash.dependencies.Input('dynamic-dropdown', 'value')])

def tickers(symbols):
    conn.rollback()
    if symbols == None:
        raise PreventUpdate
    elif symbols == '':
        raise PreventUpdate

    placeholders = ", ".join(['%s' for _ in symbols])
    trace1 = [] #lines for stocks

    d = {} #dates
    p = {} #prices
    stock_info = {}

    for stock in placeholders:
        stock_info[stock] = get_dict_resultset(f"SELECT date, close FROM security_price WHERE security_price.id  = '{stock}' ;")
        d[stock]  = [rec['date'] for rec in stock_info[stock]]
        p[stock]  = [rec['close'] for rec in stock_info[stock]]

        trace1.append(go.Scatter(x=d[stock],
                                 y=p[stock],
                                 mode='lines',
                                 opacity=0.7,
                                 name=stock,
                                 textposition='bottom center'))

I'm trying to pull up and find the error code, but it's not coming up. I will update this post when I can.

andres
  • 1,558
  • 7
  • 24
  • 62
  • _"which is why you see it hashtagged out"_ The correct term is **commented out** – Pranav Hosangadi Sep 30 '20 at 22:05
  • 1
    How do you connect to the database? If you're using `psycopg`, does this answer your question? [Parameterized queries with psycopg2 / Python DB-API and PostgreSQL](https://stackoverflow.com/questions/1466741/parameterized-queries-with-psycopg2-python-db-api-and-postgresql) – Pranav Hosangadi Sep 30 '20 at 22:11
  • @PranavHosangadi thanks for correcting me, I have deleted the text as it does not accurately reflect the code [forgot to edit it out during my draft]. I do use psycopg, will check out the link you passed through. The issue I currently have is what happens if the list of values is dynamic and not static? For example, there can be 1 to 10 items in a value passed to the SQL query. – andres Sep 30 '20 at 22:18
  • 2
    Do not use any language-level facility to prepare an SQL statement. Use the SQL library's tools for parameterized queries. Start by reading the documentation. (It will generally include explicit warnings not to use language-level facilities for this.) – Karl Knechtel Sep 30 '20 at 22:26
  • 1
    See [sql](https://www.psycopg.org/docs/sql.html) for dynamic query generation in psycopg2. – Adrian Klaver Sep 30 '20 at 22:51
  • @AdrianKlaver thanks for sharing that my way. I checked it out and have a question in there example: `cur.execute( sql.SQL("insert into {} values (%s, %s)") .format(sql.Identifier('my_table')), [10, 20])` For this specific case, they know that there are two items in values, what if I do not know how many items are in the value before hand? – andres Sep 30 '20 at 23:48
  • See example further down in `psycopg2.sql.Placeholder`. – Adrian Klaver Oct 01 '20 at 00:24
  • @AdrianKlaver thank you I solved my issue. I have answered my own question and will write the solution there. Thanks again for your help. – andres Oct 01 '20 at 00:36

1 Answers1

0

Here is how to sanitize a dynamic query using a python loop when pulling data!

please note that my function get_dict_resultset is used to connect to the dB, pull data, and store it in a python dictionary.

def get_dict_resultset(query, param):
    cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
    cur.execute(query, param)
    ans =cur.fetchall()
    dict_result = []
    for row in ans:
        dict_result.append(dict(row))
    return dict_result

I added param as an argument in the function and cur.execute.

for stock in symbols:
    stock_info[stock] = get_dict_resultset("SELECT 
                                            date, close          
                                            FROM security_price 
                                            WHERE 
                                            security_price.id=%s;", [stock])

That is how I safely edited my code to sanitize my user inputs and protect my dB

andres
  • 1,558
  • 7
  • 24
  • 62
  • That is multiple selects. Why not `sym_ids = tuple([id for id in symbols])`. And then `security_price.id=%s;", [sym_ids]. See [tuple](https://www.psycopg.org/docs/usage.html#adapt-tuple). – Adrian Klaver Oct 01 '20 at 01:03
  • @AdrianKlaver, so removing the loop and adding you edits? – andres Oct 01 '20 at 01:11
  • Also there is [executemany](https://www.psycopg.org/docs/cursor.html) and [execute_batch](https://www.psycopg.org/docs/extras.html#fast-exec). – Adrian Klaver Oct 01 '20 at 01:35
  • What's your recommendation between the two? I'm trying to keep everything fast and less resource intensive as possible. Depending on the amount of inputs per user, server demands will exponentially grow. – andres Oct 01 '20 at 02:06
  • That is something you will need to test for your particular situation. `execute_batch` was created to improve performance. For `INSERT` and `UPDATE` it seems to do that. I have no experience with what it does for `SELECT`. – Adrian Klaver Oct 01 '20 at 04:16
  • 1
    Ignore my comment about `executemany` and `execute_batch` as they don't work well with `SELECT` statemants. – Adrian Klaver Oct 01 '20 at 16:57