-1

I have this dictionary which keys I later want to pass into a sql statement:

dic = {'address': 'Emil-Dannecker-Straße 2, 78234 Engen',
       'addressLink': 'https://atlas.immobilienscout24.de/adresse/78234-engen-emil-dannecker-str-2',
      'date': '28.12.2018',
      'distance': '85 m',
      'id': '109087587',
      'livingArea': '27 m²',
      'position': {'lat': 47.85883, 'lng': 8.76808},
      'price': '320 €',
      'priceSqm': '11,85 €/m²',
      'realtor': {'contactPageUrl': None, 'label': 'von Privat'},
      'roomCount': '1',
      'status': 'Deactivated'}

My desired output is this:

(realtor,
 distance,
 date,
 addressLink,
 status,
 roomCount,
 priceSqm,
 id,
 position,
 livingArea,
 address,
 price)

So far, I have tried tuple(dic.keys()), which gives me:

('realtor',
 'distance',
 'date',
 'addressLink',
 'status',
 'roomCount',
 'priceSqm',
 'id',
 'position',
 'livingArea',
 'address',
 'price')

Is there an easy way to transfrom the keys into the desired format?

Lisardo Erman
  • 148
  • 1
  • 8

2 Answers2

2

First of all: don't use user-provided input to generate SQL! That's a recipe for SQL injections, a major security flaw. Use a library like SQLAlchemy to handle dynamic SQL instead.

If you have specific formatting requirements, don't use the built-in containers. Those have only a single mode of representation, aimed at helping debugging.

Just print your values directly:

print("({})".format(",\n ".join(dic)))

There is no need to call dic.keys() here, iteration will give you the keys directly.

Take into account that the order of the keys may be arbirtary! Python 3.6 and newer preserve insertion order in dictionaries, but it may be that the way your dictionary was built still gives you a different order than you want to see in the output.

Demo:

>>> dic = {'address': 'Emil-Dannecker-Straße 2, 78234 Engen',
...        'addressLink': 'https://atlas.immobilienscout24.de/adresse/78234-engen-emil-dannecker-str-2',
...       'date': '28.12.2018',
...       'distance': '85 m',
...       'id': '109087587',
...       'livingArea': '27 m²',
...       'position': {'lat': 47.85883, 'lng': 8.76808},
...       'price': '320 €',
...       'priceSqm': '11,85 €/m²',
...       'realtor': {'contactPageUrl': None, 'label': 'von Privat'},
...       'roomCount': '1',
...       'status': 'Deactivated'}
>>> print("({})".format(",\n ".join(dic)))
(address,
 addressLink,
 date,
 distance,
 id,
 livingArea,
 position,
 price,
 priceSqm,
 realtor,
 roomCount,
 status)

You can't reliably use this for SQL insert statements!, not only because there is a potential for SQL injection, but also because any key names that are also reserved SQL keywords will lead to errors, and so will spaces or special characters.

E.g. this would be a really bad dictionary to try and turn into columns:

dic = {
    "select": "reserved keyword",
    'use " quotes': "contains special characters",
    "Robert'; DROP TABLE students; --": "There goes your table!",
}

You'd want to use double quotes around the column names to properly escape any problematic values. Any double quotes in the column name should be doubled. So to properly format and escape value, use this:

columns = "({})".format(',\n '.join([
    '"{}"'.format(key.replace('"', '""'))
    for key in dic]))

This, at least, protects you from SQL injections when generating column names:

>>> dic = {
...     "select": "reserved keyword",
...     'use " quotes': "contains special characters",
...     "Robert'; DROP TABLE students; --": "There goes your table!",
... }
>>> columns = "({})".format(',\n '.join([
...     '"{}"'.format(key.replace('"', '""'))
...     for key in dic]))
>>> print(columns)
("select",
 "use "" quotes",
 "Robert'; DROP TABLE students; --")

SQLAlchemy would handle this kind of detail for you, including validating if the column names even exist at all.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Many thanks for the elaborate response. The first bit of code is what I wanted. Since I'm working on a web crawler that writes the scraped pages to a database, SQL injection is not really an issue in this case (database is not publicly available). I just wanted to make sure that only those entries are written to the database that exist on the page. – Lisardo Erman Aug 26 '19 at 10:12
  • @LisardoErman: of course it is an issue. Those web pages you scrape are untrusted, don't think for one second that they could not be altered to include sql injection. – Martijn Pieters Aug 26 '19 at 10:13
1
print('(%s)' % ',\n '.join(dic.keys()))
Kostas Charitidis
  • 2,991
  • 1
  • 12
  • 23