Hi, I want to save the positions of all the lines that contain "CREATE TABLE" in a list. a) Is there a better, more correct way to do it? (I'm new to Python.) b) Why does it matter that tell() is being used during iteration? I thought it was a read-only operation, so merely reporting the position shouldn't interfere with the file iteration process.

So I have the following class:

class SQLParser(object):
    def __init__(self, filename):
        self.file = open(filename, 'r')
        self.createTablePositions = []
        self.insertIntoPositions = []

    def findCreateTable(self):
        for line in self.file:
            # compare with ==, not "is": identity checks on ints are unreliable
            if line.find("CREATE TABLE") == 0:
                print(line)
                self.createTablePositions.append(self.file.tell())


sqlhandler = SQLParser("sql.sql")
sqlhandler.findCreateTable()
print(sqlhandler.createTablePositions)

That yields the following error:

Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/sqlparser/sqlparser.py", line 18, in <module>
    sqlhandler.findCreateTable()
  File "C:/Users/user/PycharmProjects/sqlparser/sqlparser.py", line 12, in findCreateTable
    curPos = self.file.tell()
OSError: telling position disabled by next() call

I've searched the net and Stack Overflow but didn't find a direct solution to my problem. Solutions such as rewriting the next() method are currently beyond my knowledge, and I doubt this exercise aims for that.

Your advice will be highly appreciated!

LiorA

2 Answers


For starters, you never close the file, which is not good.

The error comes from how tell() interacts with iteration. Iterating with for line in file repeatedly calls next(), which reads ahead in larger chunks for speed, so the underlying file position no longer matches the line you were just given; Python therefore disables tell() until the iteration ends. If you need tell(), read the file with readline() instead (readlines() also works, but it loads the whole file into memory at once).

Also, file.tell() returns the position of the cursor in the file, not the number of the line you are on. For example, if the first line has 20 characters, then after file.readline() the method file.tell() would return 21 with a "\n" line ending, or 22 with "\r\n".
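Both points can be checked with a short experiment. This is only a minimal sketch, assuming Python 3; the helper names and the two-line sample file are made up for illustration and are not part of the original code.

```python
import os
import tempfile


def tell_inside_for_loop(path):
    """Call tell() while iterating; return the error message, or None if it worked."""
    with open(path, "r", encoding="utf-8") as f:
        for _line in f:
            try:
                f.tell()
            except OSError as exc:
                return str(exc)
    return None


def tell_after_readline(path):
    """Call tell() after each readline(); this path does not raise."""
    positions = []
    with open(path, "r", encoding="utf-8") as f:
        while f.readline():
            positions.append(f.tell())
    return positions


# Hypothetical sample file, written with "\n" line endings on every platform.
fd, sample_path = tempfile.mkstemp(suffix=".sql")
with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as out:
    out.write("CREATE TABLE a;\nINSERT INTO a;\n")

loop_error = tell_inside_for_loop(sample_path)
offsets = tell_after_readline(sample_path)
os.remove(sample_path)

print(loop_error)  # the OSError message from the question
print(offsets)     # cursor positions after each line, not line numbers
```

The values collected by tell_after_readline() are offsets into the file (where each line ends), which is why they grow by the length of each line rather than by one.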

Rather than what you are doing, I'd suggest approaching it the other way around and recording line numbers instead.

class SQLParser(object):
    """
    Parses a SQL file.
    """

    def __init__(self,filename):
        self.createTablePositions= self.findCreateTable(filename)
        self.insertIntoPositions=[]

    def findCreateTable(self, filename):
        temp = []
        with open(filename, 'r') as file:
            # the with statement closes the file when the block exits
            fileNum = 0
            for line in file.readlines():
                if "CREATE TABLE" in line:
                    print(line)
                    temp.append(fileNum)
                fileNum += 1
        return temp


sqlhandler = SQLParser("sql.sql")
print(sqlhandler.createTablePositions)

This way, the file is parsed when the class object is initialised.

You can then do something similar in another method to fill insertIntoPositions.
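If line numbers are all that is needed, the standard-library linecache module (linked in the comments) can fetch a line by its number later. The following is a rough sketch under that assumption; find_line_numbers and the sample file contents are hypothetical. Note that linecache.getline() counts lines from 1, whereas fileNum above starts at 0.

```python
import linecache
import os
import tempfile


def find_line_numbers(filename, marker="CREATE TABLE"):
    """Return 0-based indices of lines containing marker (same idea as findCreateTable)."""
    with open(filename, "r", encoding="utf-8") as f:
        return [i for i, line in enumerate(f) if marker in line]


# Hypothetical sample file standing in for sql.sql.
fd, sample_path = tempfile.mkstemp(suffix=".sql")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    out.write(
        "-- schema\n"
        "CREATE TABLE a (id INT);\n"
        "INSERT INTO a VALUES (1);\n"
        "CREATE TABLE b (id INT);\n"
    )

line_numbers = find_line_numbers(sample_path)
# linecache is 1-based, so shift each 0-based index by one.
fetched = [linecache.getline(sample_path, n + 1) for n in line_numbers]

linecache.clearcache()
os.remove(sample_path)

print(line_numbers)
print(fetched)
```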

Mixone
  • Hi, the thing is that I need the number of characters so I can take specific words and rephrase them in another file, but that direction is fine, I guess. – LiorA Apr 18 '16 at 08:01
  • Hiya, thanks first of all. Regarding closing the file, you're right, but I haven't finished the class, so it was just something I'd do last (by the way, isn't it like in C++, where the destructor of the object handles it by itself?). One last question: how do I jump to, let's say, row 16 in the file? If you can answer that (I'm trying to find it in the Python docs), that would be awesome! (That's why I wanted tell(), so I can use seek() later on!) – LiorA Apr 18 '16 at 08:19
  • for the jumping lines: [linecache](https://docs.python.org/3.5/library/linecache.html) – Mixone Apr 18 '16 at 11:09
  • In Python, unlike C++, garbage collection is automatic (with some exceptions, I'd guess, but they do not apply to this case). – Mixone Apr 18 '16 at 11:11

If your SQL file is too large to read into memory at once, there are two solutions according to this answer:

  1. Using file.readline() instead of next()

     with open(path, mode) as file:
         while True:
             line = file.readline()
             if not line:
                 break
             file.tell()

  2. Using offset += len(line) instead of file.tell()

     offset = 0
     with open(path, mode) as file:
         for line in file:
             offset += len(line)
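Applied to the question, the second option might look like the sketch below. It records where each matching line starts and later jumps back with seek(). Opening the file in binary mode is an assumption added here so that len(line) counts bytes; in text mode it counts characters, which can differ from the byte offset when the file has "\r\n" endings or multi-byte characters. The function names and sample contents are hypothetical.

```python
import os
import tempfile


def find_line_offsets(path, marker=b"CREATE TABLE"):
    """Return the byte offset at which each line starting with marker begins."""
    offsets = []
    offset = 0
    with open(path, "rb") as f:
        for line in f:
            if line.startswith(marker):
                offsets.append(offset)
            offset += len(line)  # bytes, because the file is opened in binary mode
    return offsets


def read_line_at(path, offset):
    """Seek to a recorded offset and return the line found there as text."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline().decode("utf-8")


# Hypothetical sample file standing in for sql.sql.
fd, sample_path = tempfile.mkstemp(suffix=".sql")
with os.fdopen(fd, "wb") as out:
    out.write(
        b"-- schema\n"
        b"CREATE TABLE a (id INT);\n"
        b"INSERT INTO a VALUES (1);\n"
        b"CREATE TABLE b (id INT);\n"
    )

create_offsets = find_line_offsets(sample_path)
create_lines = [read_line_at(sample_path, pos) for pos in create_offsets]
os.remove(sample_path)

print(create_offsets)
print(create_lines)
```

Recording the offset before adding len(line) gives the start of the line, which is usually what a later seek() needs; calling tell() after reading the line, as in the question, would give the position where the line ends.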
Zhou Hongbo