My goal is to print lines from a text file that belong together on one line. Some of them, however, are split across two lines in the extracted text. I resolved the first case, where the denominator is on the line after the numerator. In the else branch, though, every printed result seems to come from the same value/index.

import fitz  # this is pymupdf

with fitz.open("math-problems.pdf") as doc: #below converts pdf to txt
    text = ""
    for page in doc:
        text += page.getText()

file_w = open("output.txt", "w") #save as txt file
file_w.write(text)
file_w.close()

file_r = open("output.txt", "r") #read txt file
word = 'f(x) = '

#--------------------------
list1 = file_r.readlines()  # read each line and put into list

list2 = [k for k in list1 if word in k] # look for all elements with "f(x)" and put all in new list

list1_N = list1
list2_N = list2
list1 = [e[3:] for e in list1] # remove first three characters (they are always "1) " or "A) ")

char = str('\n')

for char in list2:
    index = list1.index(char)
    def digitcheck(s):
        isdigit = str.isdigit
        return any(map(isdigit,s))
    xx = digitcheck(list1[index])
    if xx:
        print(list1[index] + " / " + list1_N[index+1])
    else:
        print(list1[index] + list1[index+1]) # PROBLEM IS HERE, HOW COME EACH VALUE IS SAME HERE?


Output from terminal:

f(x) = x3 + x2 - 20x
 / x2 - 3x - 18

f(x) = 
2 + 5x

f(x) = 
2 + 5x

f(x) = 
2 + 5x

f(x) = 
2 + 5x

f(x) = x2 + 3x - 10
 / x2 - 5x - 14

f(x) = x2 + 2x - 8
 / x2 - 3x - 10

f(x) = x - 1
 / x2 + 8

f(x) = 3x3 - 2x - 6
 / 8x3 - 7x + 4

f(x) = 
2 + 5x

f(x) = x3 - 6x2 + 4x - 1
 / x2 + 8x


Process finished with exit code 0
Daniel Walker
  • I think the problem is in `index = list1.index(char)`. Whatever value of `char` results in `xx` being false is probably something that repeats in list2 and/or is always the same, so when you ask for its index in list1 you of course get the same result every time – Copperfield Feb 10 '21 at 02:06
  • Put some print calls in various places and/or print the contents of the lists so you can figure out the problem... – Copperfield Feb 10 '21 at 02:09
  • @Copperfield Thank you for the comment. `xx` produces a true or false value that says whether the line at each index contains a digit: `true` if it does, `false` if it does not. The `true` values repeat, yet have different indices. The `false` values repeat and have the same index. `xx` is supposed to produce repeating boolean values. – theanton205 Feb 10 '21 at 02:18
  • That is interesting, I guess, but not the problem. Like I said, I think the problem is in `char`, or more precisely in list2, which might look something like `['1',' ','2',' ','3']`. When you iterate over it, it works fine when char is `'1'`, for example, but when char is `' '` and you ask for its index, you get the first occurrence of it in the other list, and thus you always get the same result – Copperfield Feb 10 '21 at 02:33
  • @Copperfield Oh, I see. I am fairly new to Python so I appreciate your help. I will see if it's the issue with `char`. Thanks. – theanton205 Feb 10 '21 at 04:12
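
The behaviour described in these comments can be reproduced with a short sketch. The list contents below are made-up stand-ins for the extracted lines, not the actual PDF text; the point is only that `list.index` returns the position of the first match, so a repeated line always resolves to the same index.

```python
# Hypothetical stand-in for the lines read from output.txt
lines = [
    "f(x) = \n",
    "2 + 5x\n",
    "f(x) = x - 1\n",
    "x2 + 8\n",
    "f(x) = \n",
    "7 - 3x\n",
]

# list.index always returns the FIRST position where the value occurs
first = lines.index("f(x) = \n")

# so looking up every duplicate yields the same index each time
lookups = [lines.index(line) for line in lines if line == "f(x) = \n"]

print(first)    # 0
print(lookups)  # [0, 0]
```

The second `"f(x) = \n"` sits at position 4, but the lookup never reaches it, which matches the repeated `2 + 5x` lines in the output above.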

1 Answer

SOLVED. @Copperfield was correct: I had repeating values, so `list1.index` kept returning the same index. I solved this using a solution by @Shonu93 in here. Essentially, it locates the indices of all duplicate values, collects them in one list, elem_pos, and then prints the line at each of those indices from list1.

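# Assumption: `empty` was not defined in the original post; judging by the
# output above, it is the repeated line that has nothing after "f(x) = "
empty = 'f(x) = \n'

# collect the index of every occurrence of the repeated line, not just the first
elem_pos = []
for counter, line in enumerate(list1):
    if line == empty:
        elem_pos.append(counter)

# print each repeated line together with the line that follows it
for i in elem_pos:
    print(list1[i] + list1_N[i+1])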
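
An alternative sketch that avoids the lookup entirely: iterating with `enumerate` hands back each line's own position, so duplicates stop being a problem. The function name `join_split_lines` and the sample data are hypothetical, not taken from the original post; the digit check mirrors the heuristic in the question, where `/` is inserted only when the `f(x)` line already holds a numerator.

```python
def join_split_lines(lines, word="f(x) = "):
    """Pair each line containing `word` with the line that follows it."""
    joined = []
    for i, line in enumerate(lines):
        # skip lines without the marker, and a marker on the very last line
        if word not in line or i + 1 >= len(lines):
            continue
        head = line.rstrip("\n")
        tail = lines[i + 1].rstrip("\n")
        # Same heuristic as the question: a digit on the f(x) line means the
        # numerator is already there and the next line is the denominator
        if any(ch.isdigit() for ch in head):
            joined.append(head + " / " + tail)
        else:
            joined.append(head + tail)
    return joined


# Hypothetical sample data standing in for the extracted text
sample = [
    "f(x) = x - 1\n", "x2 + 8\n",
    "f(x) = \n", "2 + 5x\n",
    "f(x) = \n", "7 - 3x\n",
]
result = join_split_lines(sample)
for row in result:
    print(row)
```

Because the index comes from the loop itself, the two identical `"f(x) = \n"` lines are each paired with their own following line.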