0

I'm trying to run the following code in Google Colab but I'm receving the error 'EOF in multi-line string'

import sqlite3
from google.colab import files
uploaded = files.upload()
#upload file Db-IMDB.db
connection = sqlite3.connect('./Db-IMDB.db')


Yash = pd.read_sql_query("""SELECT da.*
FROM (SELECT pd.pid, pd.name as Director, pa.pid, pa.name as actor, COUNT(*) as cnt,
             RANK() OVER (PARTITION BY pa.pid ORDER BY COUNT(*) DESC) as seqnum,
             COUNT(*) OVER (PARTITION BY pa.pid, COUNT(*)) as num_with_cnt,
      FROM M_director d JOIN
           Person pd
           ON pd.pid = d.pid JOIN
           M_cast c
           ON c.MID = d.MID JOIN
           Person pa
           ON pa.PID = c.PID
      GROUP BY pd.pid, pd.name, pa.pid, pa.name
     ) da
WHERE director = 'Yash Chopra' AND
      seqnum = 1 AND
      num_with_cnt = 1;""",connection)

print(Yash)

I've gone through queries posted by other users on this error, from what I've seen it is always due to a missing apostraphie ' however, I've checked my code and I didn't miss any. Can you please help me figure out what is causing the error ?

My references : Token error: EOF in multi-line statement Getting error when loading CSV file to dataframe using Jupyter notebook

Complete error :

ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 23))

---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
   1430             else:
-> 1431                 cur.execute(*args)
   1432             return cur

OperationalError: near "(": syntax error

During handling of the above exception, another exception occurred:

DatabaseError                             Traceback (most recent call last)
5 frames
/usr/local/lib/python3.6/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
   1429                 cur.execute(*args, **kwargs)
   1430             else:
-> 1431                 cur.execute(*args)
   1432             return cur
   1433         except Exception as exc:

DatabaseError: Execution failed on sql 'SELECT da.*
FROM (SELECT pd.pid, pd.name as Director, pa.pid, pa.name as actor, COUNT(*) as cnt,
             RANK() OVER (PARTITION BY pa.pid ORDER BY COUNT(*) DESC) as seqnum,
             COUNT(*) OVER (PARTITION BY pa.pid, COUNT(*)) as num_with_cnt,
      FROM M_director d JOIN
           Person pd
           ON pd.pid = d.pid JOIN
           M_cast c
           ON c.MID = d.MID JOIN
           Person pa
           ON pa.PID = c.PID
      GROUP BY pd.pid, pd.name, pa.pid, pa.name
     ) da
WHERE director = 'Yash Chopra' AND
      seqnum = 1 AND
      num_with_cnt = 1;': near "(": syntax error
  • 1
    Please post the exact error message. It should have a line number and probably point to where the error is. – Barmar Sep 21 '19 at 18:42
  • I'm sorry, I've posted the complete error, it says its near '(', I didn't find any issue before any of the parenthesis – Hemanth Ravavarapu Sep 21 '19 at 18:50
  • The sql syntax error is caused by using window functions with a version of sqlite older than 3.25. – Shawn Sep 21 '19 at 18:52
  • `as num_with_cnt,as num_with_cnt,`<<--remove the trailing comma.(this is one of the reasons why people prefer to put the commas at the beginning of the lines) – wildplasser Sep 21 '19 at 18:57
  • @wildplasser: Removed the comma, still getting the error – Hemanth Ravavarapu Sep 21 '19 at 19:02
  • `(PARTITION BY pa.pid, COUNT(*))` makes no sense to me. Same for `(PARTITION BY pa.pid ORDER BY COUNT(*) DESC)` – wildplasser Sep 21 '19 at 19:09
  • That's basically to avoid ties. I've removed num_with_cnt = 1 at the bottom, still getting the same error COUNT(*) OVER (PARTITION BY pa.pid, COUNT(*)) as num_with_cnt and – Hemanth Ravavarapu Sep 21 '19 at 19:11
  • From my research, the error about the multiline string may be an iPython issue. See https://github.com/ipython/ipython/issues/6864. Try running the script from the CLI instead of iPython. – Barmar Sep 21 '19 at 23:40

0 Answers0