3

I am trying to write app in Python which will allow to convert xls file to pdf. The xls file has 3 columns: Index, PLN price and EUR price(prices are constant). What I want, is to generate printable pdf label with all this info for each index- big bolded index and below it prices. So basically the label should have big index, and these two prices, in other words one row should be one pdf page in exact size. And it also needs to have simple gui- just 2 buttons, upload file and generate.

For now I tried with openpyxl to get all the rows:

import openpyxl

wb = openpyxl.load_workbook('arkusz.xlsx')
ws = wb.get_sheet_by_name('Arkusz1')
mylist = []
for row in ws.iter_rows('A{}:C{}'.format(ws.min_row,ws.max_row)):
    for cell in row:
        mylist.append(cell.value)
print (mylist)

I get the rows but now I have trouble to write it to pdf. I can't find any lib that will suit my requirements. Could you please advise best lib for this app?

Saladyn
  • 33
  • 1
  • 3
  • If you search on SO or with google you may find https://stackoverflow.com/questions/2252726/how-to-create-pdf-files-in-python but also https://github.com/brechtm/rinohtype#readme or https://stackoverflow.com/questions/20854840/xlsx-and-xlslatest-versions-to-pdf-using-python. If this helps you to get on the road then post an answer yourself (OP can answer own question) when you succeeded (sharing knowledge). And don't forget to finalize the tour. It's free of charge ;-) – ZF007 Jan 01 '18 at 18:44

1 Answers1

4

If you're just reading from excel and then creating an original pdf, I would recommend just using pandas.read_excel for reading the .xlsx file.

For creating the pdf part, there are several options including pydf2, pdfdocument and FPDF. The FPDF library is fairly stragihtforward to use and is what I've used in this example. FPDF Documentation can be found here.

I've posted a fully reproducible example below, using pandas and fpdf (it also uses numpy to create a sample dataframe). I loop through the whole dataframe in my example but you could select specific rows based on the index if you wanted to do so.

import pandas as pd
import numpy as np
from fpdf import FPDF

# Creating a dataframe and saving as test.xlsx in current directory
df_1 = pd.DataFrame(np.random.randn(10, 2), columns=list('AB'))
writer = pd.ExcelWriter('test.xlsx')
df_1.to_excel(writer)
writer.save()

#read in the .xlsx file just created
df_2 = pd.read_excel('test.xlsx')

#creating a pdf in called test.pdf in the current directory
pdf = FPDF()
pdf.add_page()
pdf.set_xy(0, 0)
pdf.set_font('arial', 'B', 14)
pdf.cell(60)
pdf.cell(70, 10, 'Writing a PDF from python', 0, 2, 'C')
pdf.cell(-40)
pdf.cell(50, 10, 'Index Column', 1, 0, 'C')
pdf.cell(40, 10, 'Col A', 1, 0, 'C')
pdf.cell(40, 10, 'Col B', 1, 2, 'C')
pdf.cell(-90)
pdf.set_font('arial', '', 12)
for i in range(0, len(df_2)-1):
    col_ind = str(i)
    col_a = str(df_2.A.ix[i])
    col_b = str(df_2.B.ix[i])
    pdf.cell(50, 10, '%s' % (col_ind), 1, 0, 'C')
    pdf.cell(40, 10, '%s' % (col_a), 0, 0, 'C')
    pdf.cell(40, 10, '%s' % (col_b), 0, 2, 'C')
    pdf.cell(-90)
pdf.output('test.pdf', 'F')

Expected pdf output would look like this: pdf output

patrickjlong1
  • 3,683
  • 1
  • 18
  • 32