When writing a regular expression, it is possible to write the expression across multiple lines and including annotation, then compile the expression using the re.VERBOSE
option before passing the compiled version. I'd like to do something similar with a pandas.read_sql_query
.
For example, instead of:
result = pd.read_sql_query('select a.gvkey, a.tic, a.datadate as fyearend, year(a.datadate) as year, month(a.datadate) as fyrc, b.datadate, month(b.datadate) as month, b.trt1m from COMPM.FUNDA a join COMPM.SECM b on a.gvkey = b.gvkey and year(a.datadate) = year(b.datadate) where a.TIC = "IBM" and a.datafmt = "STD" and a.consol="C" and a.indfmt = "INDL" and year(a.datadate)>1980', engine)
I would like to write something like:
q = """select a.gvkey,
a.tic, #COMMENTS
a.datadate as fyearend, #COMMENTS
year(a.datadate) as year, #COMMENTS
month(a.datadate) as fyrc, b.datadate,
month(b.datadate) as month,
b.trt1m
from COMPM.FUNDA a join COMPM.SECM b on a.gvkey = b.gvkey and year(a.datadate) = year(b.datadate)
where a.TIC = "IBM"
and a.datafmt = "STD"
and a.consol="C"
and a.indfmt = "INDL"
and year(a.datadate)>1980
"""
result = p.read_sql_query(q ,engine)
My question is related to this question about splitting python commands across multiple lines, but I'd like to add comments inside the query.
As I mentioned, what I'd like to do in the pandas/SQL case is similar to what can be done in the regular expression case with re.VERBOSE
. Here is an example with regex:
pattern = r'''\s(shares?| #COMMENTS
warrants?| #COMMENTS
stock| #AND SO ON...
(non)?vest(ed)?
)\b
'''
crit = re.compile(pattern_nopt, re.VERBOSE)
match=re.search(crit, string)
This would make the query more readable and I find it important to annotate queries exhaustively when sharing code with coauthors.