I am trying to collect a series of integers from a PDF into a list in Python. The integers are the page numbers of the PDF, and because I am processing several PDFs, the page numbers of each one are printed on my terminal.

When I run the program, it prints the page numbers correctly. But when I try to collect them into a list (I want a list of integers so that I can put it into my pandas DataFrame), it prints None instead. Here's my code:

allmypdfs = []
files = []
num_pages = []
for folder in folders:
    alllfiles = os.listdir(folder)
    firstpdfs = ""
    for i in alllfiles:
        if '.pdf' or '.PDF' in i:
            firstpdfs = i
            print('PDF-Names--', firstpdfs)
            files.append(firstpdfs)
            break
    with open(folder + firstpdfs, 'rb') as fh:
        for page in PDFPage.get_pages(fh, caching=True, check_extractable=True):
            page_interpreter.process_page(page)

        parser = PDFParser(fh)
        document = PDFDocument(parser)
        for pages, pdfObjects in enumerate(PDFPage.create_pages(document)):
            dc = pages + 1
            print(dc)

        #other code
    print('HEYY', num_pages.append(dc))

This gives me an output like

289
290
291
292
293
HEYY None

While I want the output to be displayed as

289
290
291
292
293
HEYY [289, 290, 291, 292, 293]

Where am I making a mistake? I tried using map() together with list(), but that didn't help. I also tried appending the result inside the for loop, but that gave me the same result.

quamrana
technophile_3
  • Append doesn't return anything. If you want to see your array, you need just write `print('HEYY', num_pages)` – magicarm22 Jul 12 '21 at 09:25
  • There are multiple problems in this code. For instance, `if '.pdf' or '.PDF' in i:` does not do what you expect it to. – Karl Knechtel Sep 14 '22 at 15:22
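
To illustrate the second comment: Python reads `'.pdf' or '.PDF' in i` as `('.pdf') or ('.PDF' in i)`, and a non-empty string is always truthy, so the branch runs for every file. Below is a minimal sketch of that behaviour and one possible replacement; the helper name `is_pdf` and the sample file names are made up for illustration and are not from the question.

```python
# A file name that is clearly not a PDF (hypothetical example).
name = "notes.txt"

# Parsed as ('.pdf') or ('.PDF' in name). The non-empty string '.pdf'
# is truthy, so the whole expression is truthy for any file name.
always_true = bool('.pdf' or '.PDF' in name)


def is_pdf(filename):
    """Return True if the file name ends in .pdf, ignoring case."""
    return filename.lower().endswith('.pdf')


print(always_true)          # True, even for notes.txt
print(is_pdf("notes.txt"))  # False
print(is_pdf("Report.PDF")) # True
```

Checking the suffix with `endswith` also avoids matching names such as `my.pdf.bak`, which a plain substring test would accept.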

3 Answers

.append() works in place and does not return the list. A function with no explicit return value returns None by default, which is why None is printed. Move num_pages.append(dc) to its own line, then print the list. Also, the print call should be outside the loop, or it will print the list on every iteration.

    num_pages.append(dc)  # indented so that it runs inside the loop
print('HEYY', num_pages)
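
A small standalone sketch of the same point, using made-up page numbers rather than a real PDF: the return value of append is None, while the list itself has been changed.

```python
num_pages = []

# append mutates the list in place; its return value is None.
result = num_pages.append(289)
print(result)  # None

# Appending on its own line and printing the list afterwards
# shows the accumulated values.
num_pages.append(290)
print('HEYY', num_pages)  # HEYY [289, 290]
```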
quamrana

list.append(x) does not return the list; it mutates it in place and returns None. Do the append on its own line, then print the list next to 'HEYY'.

You almost certainly want the append inside the for loop, so that every page number is added to the list.
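
As a rough sketch of appending inside the loop: the list `pages` below is only a stand-in for whatever PDFPage.create_pages(document) yields, so the example runs without a PDF. Passing `start=1` to enumerate is one way to avoid the separate `pages + 1` step.

```python
# Stand-in for the page objects; the real code would iterate over
# PDFPage.create_pages(document) instead (assumption for illustration).
pages = ['page-a', 'page-b', 'page-c']

num_pages = []
for number, _page in enumerate(pages, start=1):
    num_pages.append(number)  # one append per page, inside the loop

print('HEYY', num_pages)  # HEYY [1, 2, 3]

# If only the numbers are needed, a range gives the same list.
same = list(range(1, len(pages) + 1))
```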

lollylegs2

I think you are trying to combine print and append, whereas you should keep those separate:

for folder in folders:
    alllfiles = os.listdir(folder)
    firstpdfs = ""
    for i in alllfiles:
        if i.lower().endswith('.pdf'):  # "'.pdf' or '.PDF' in i" is always true
            firstpdfs = i
            print('PDF-Names--', firstpdfs)
            files.append(firstpdfs)
            break
    with open(folder + firstpdfs, 'rb') as fh:
        for page in PDFPage.get_pages(fh, caching=True, check_extractable=True):
            page_interpreter.process_page(page)

        parser = PDFParser(fh)
        document = PDFDocument(parser)
        for pages, pdfObjects in enumerate(PDFPage.create_pages(document)):
            dc = pages + 1
            print(dc)
            num_pages.append(dc)
        #other code
        
    print('HEYY', num_pages)
quamrana