I am solving a timetable scheduling problem and want to print the final output as a PDF or a set of images. I have multiple sections, and each section has its own schedule.

I have created a 2D array for each section. The array is 5 x 5 (5 days, each with 5 slots), and each index of the array represents a lecture slot. This 2D array contains the lectures for every course in the timetable of that specific section. Sample output is below. (It's a dictionary in which each key is a section and the value for each key is a 2D array.)

CS-3B :  [['', '', 'DS ', '', 'COaAL '], ['', 'COaAL ', '', 'DS ', 'OOP '], ['DS-L ', 'DS-L ', 'OOP-L ', 'OOP-L ', 'FoM '], ['COaAL-L ', 'COaAL-L ', 'OOP ', '', ''], ['', 'FoM ', 'DE ', '', 'DE ']]
SE-3A :  [['', 'OOP-L ', 'OOP-L ', '', 'SRE '], ['SRE ', 'OOP ', 'DS-L ', 'DS-L ', ''], ['', 'DS ', '', '', 'MM '], ['DS ', 'MM ', '', 'LA ', ''], ['OOP ', 'HCI ', '', 'LA ', 'HCI ']]
CS-7F :  [['', '', '', '', ''], ['RSaG ', '', '', '', ''], ['ST ', '', 'RSaG ', '', ''], ['', '', '', '', ''], ['', 'ST ', '', '', '']]
CS-1C :  [['IS ', 'ECaC-L ', 'ECaC-L ', '', 'PF '], ['ECaC ', 'PF-L ', 'PF-L ', 'ECaC-L ', 'ECaC-L '], ['DLD ', 'ECaC ', '', 'PF ', 'ItIaCT '], ['DLD-L ', 'DLD-L ', 'IS ', 'LA ', ''], ['ECaC ', 'ECaC ', 'ItIaCT ', 'DLD ', 'LA ']]
CS-1D :  [['PF-L ', 'PF-L ', 'ECaC-L ', 'ItIaCT ECaC-L ', 'ItIaCT '], ['IS ', 'AP ', 'ECaC-L ', 'ECaC-L ', ''], ['PF ', 'PF ', '', 'ECaC ', ''], ['CaAG ', 'ECaC ', 'ECaC ', '', 'IS '], ['', 'CaAG ', '', 'ECaC ', 'AP ']]
CS-7A :  [['', 'DM ', '', 'PPiI ', 'DS '], ['AI-L ', 'AI-L ', '', 'AI ', 'IS '], ['', '', 'DS ', '', ''], ['SE ', 'SE ', '', 'PPiI ', ''], ['', 'AI ', 'IS ', '', 'DM ']]
CS-7B :  [['', 'DS ', '', 'DS ', 'DM '], ['', '', '', 'PPiI ', ''], ['', 'PPiI ', '', 'SE ', ''], ['', 'DM ', '', 'IS ', ''], ['', '', 'IS ', 'SE ', '']]
CS-1B :  [['LA ', '', '', 'DLD ', 'DLD '], ['ECaC ', 'IS ', '', 'PF ', 'ECaC '], ['ECaC-L ', 'ECaC-L ', 'DLD-L ', 'DLD-L ', 'ItIaCT '], ['ECaC ', 'PF-L ', 'PF-L ', 'ECaC-L ', 'ECaC-L '], ['ECaC ', 'PF ', 'IS ', 'LA ', 'ItIaCT ']]
CS-1A :  [['', 'PF-L ', 'PF-L ', 'ECaC ', ''], ['ECaC ', '', 'ItIaCT ', 'LA ', 'ECaC '], ['PF ECaC-L ', 'ItIaCT ECaC-L ', '', 'DLD-L ', 'DLD-L '], ['IS ', 'PF ', 'ECaC-L ', 'ECaC-L ', ''], ['DLD ', 'IS ', 'LA ', 'DLD ', 'ECaC ']]
CS-7D :  [['AML ', '', 'IS ', '', 'AML '], ['', '', '', '', ''], ['IS ', 'SfMD ', '', '', ''], ['', '', '', '', 'SfMD '], ['PPiI ', '', 'PPiI ', '', '']]
CS-7C :  [['SfMD ', '', '', 'AML ', ''], ['PPiI ', '', '', '', ''], ['', 'SfMD ', '', '', ''], ['', '', 'AML ', 'IS ', ''], ['', '', 'PPiI ', 'IS ', '']]
CS-3C :  [['MM ', 'COaAL-L ', 'COaAL-L ', 'DS ', ''], ['', '', '', '', ''], ['DS-L ', 'DS-L ', 'DS ', '', 'DE '], ['', '', '', '', ''], ['', 'DE ', '', '', 'MM ']]
CS-5C :  [['', 'CN-L ', 'CN-L ', '', 'CN '], ['PaS ', 'CN ', '', '', 'ToA '], ['', '', '', 'SDaA ', 'AP '], ['AP ', '', '', 'ToA ', 'SDaA '], ['', 'PaS ', '', '', '']]
CS-5B :  [['', '', 'WP ', '', ''], ['WP ', 'ToA ', 'MM ', 'CN-L ', 'CN-L '], ['SDaA ', '', '', 'MM ', 'CN '], ['SDaA ', '', '', 'ToA ', ''], ['', '', '', 'CN ', '']]
CS-1E :  [['PF-L ', 'PF-L ', 'AP ', 'ECaC ', 'ECaC '], ['ECaC-L ', 'ECaC-L ', 'PS ', 'ItIaCT ', 'AP '], ['', 'PF ', 'CaAG ', 'ECaC-L ', 'ECaC-L '], ['PS ', '', 'ItIaCT ', '', ''], ['', 'CaAG ', 'PF ', 'ECaC ', 'ECaC ']]
SE-3B :  [['LA ', '', '', '', ''], ['DS ', 'HCI ', '', '', ''], ['DS ', 'LA ', '', '', ''], ['', 'DS-L ', 'DS-L ', 'SRE ', 'F&A '], ['F&A ', 'HCI ', '', '', 'SRE ']]
SE-5B :  [['', '', '', 'PaS ', 'TaBW '], ['SCaD-L ', 'SCaD-L ', 'SCaD ', 'OR ', 'SQE '], ['', '', 'TaBW ', '', 'SCaD '], ['', 'SQE ', '', '', ''], ['PaS ', '', '', '', 'OR ']]
SE-5A :  [['OS-L ', 'OS-L ', 'OS ', 'SCaD-L ', 'SCaD-L '], ['OR ', 'DS ', '', 'OR ', 'TaBW '], ['DS-L ', 'DS-L ', 'PaS ', 'SCaD ', 'OS '], ['', 'SQE ', 'SCaD ', 'PaS ', 'TaBW '], ['', '', 'DS ', '', 'SQE ']]
CS-3A :  [['DS-L ', 'DS-L ', 'LA ', 'CaAG ', 'DS '], ['F&A ', 'DS ', 'DLD ', 'DS ', 'OOP '], ['CaAG ', 'LA ', 'COaAL ', 'OOP-L ', 'OOP-L '], ['DE AP ', 'COaAL-L ', 'COaAL-L ', 'OOP ', 'COaAL '], ['AP ', 'DE ', 'F&A ', 'DLD ', 'DS ']]

Please note CS-1D as an example:

CS-1D :  [['PF-L ', 'PF-L ', 'ECaC-L ', 'ItIaCT ECaC-L ', 'ItIaCT '], ['IS ', 'AP ', 'ECaC-L ', 'ECaC-L ', ''], ['PF ', 'PF ', '', 'ECaC ', ''], ['CaAG ', 'ECaC ', 'ECaC ', '', 'IS '], ['', 'CaAG ', '', 'ECaC ', 'AP ']]

There are two things that I need to take care of. First, every lab (a course ending with -L) has its lectures in consecutive slots. That means I want the two cells in the timetable to be merged horizontally when representing a lab.

Second, at some indexes there are two lectures happening at the same time. For example, notice the 4th slot of Monday (row index 0) in CS-1D: ItIaCT and ECaC-L are two different courses but have lectures at the same time. (In this 2D array, if two or more lectures happen at the same time, they are separated by a space in that index.) For this, I want the cell of that lecture slot to be divided horizontally to fit both lectures.

A sample final output looks something like this (each cell will also tell which instructor is teaching the course and in which room the class is being held).

I do not want 13 different slots, only five slots per day. My problem is:

  • I have to do this using Python and I do not know how to start. I have timetables created by an algorithm for each section (as shown above), but I can't figure out how to turn them into a rendered timetable.

  • Secondly, I want to make a PDF file that contains the timetables of all the sections, and I don't know how to do it. I am assuming that I need to make an image for each section's timetable and then combine all those images (like the one image of one section's timetable I shared above) into a PDF. However, I do not know how I would convert one timetable to an image.

Also, please note that I made something similar using plain HTML; the code and result are shared below. I am trying to replicate something similar using Python.

<!DOCTYPE html>
<html>
  <style>
.center
{
  text-align: center;
 
}
td{
  height:75px;
  width:150px;
}


  </style>
<body>
<!-- Heading -->
    <h1 class="center">BCS-7D</h1>

<!-- Table -->
    <table border="5" cellspacing="5" align="center">
        
<!-- Day/Periods -->
        <tr>
            <td class="center" ><br>
                <b>Day/Period</b></br>
            </td>
            <td class="center" >
                <b>I</b>
            </td>
            <td class="center" >
                <b>II</b>
            </td>
            <td class="center">
                <b>III</b>
            </td>
            <td class="center">
                <b>1:15-1:45</b>
            </td>
            <td class="center" >
                <b>IV</b>
            </td>
            <td class="center" >
                <b>V</b>
            </td>
           
        </tr>
<!-- Monday -->
        <tr>
            <td class="center">
                <b>Monday</b></td>
            <td class="center">Linear Algebra, Mr. Raheel Ahmad, Room 1</td>
            <td class="center">X</td>
            <td class="center">X</td>
            <td rowspan="6" class="center">
                <h2>L<br>U<br>N<br>C<br>H</h2>
            </td>
            <td colspan="2" class="center">LAB</td>
        
        </tr>
<!-- Tuesday -->
        <tr>
            <td class="center">
                <b>Tuesday</b>
            </td>
            <td class="center">X</td>
            <td colspan="2" class="center">LAB
            </td>
            
            <td class="center">X</td>
            <td class="center">X</td>
        </tr>
<!-- Wednesday -->
        <tr>
            <td class="center">
                <b>Wednesday</b>
            </td>
            <td class="center">Object Oriented Programming, Ms. Jen Ledger, Room 13<hr>Programming Fundamentals, Mr. Zahid Iqbal, Room 6</td>
            <td class="center">X</td>
            <td class="center">X</td>
            <td class="center">X</td>
            <td colspan="3" class="center">X
            </td>
        </tr>
<!-- Thursday -->
        <tr>
            <td class="center">
                <b>Thursday</b>
            </td>
            <td class="center">X</td>
            <td class="center">X</td>
            <td class="center">X</td>
            <td colspan="3" class="center">Object Oriented Programming - Lab, Ms. Zain Malik, Lab 6
            </td>
          
        </tr>
<!-- Friday -->
        <tr>
            <td class="center">
                <b>Friday</b>
            </td>
            <td colspan="2" class="center">LAB
            </td>
            <td class="center">X</td>
            <td class="center">X</td>
            <td class="center">X</td> 

        </tr>
       
    </table>
</body>
  
</html>

Screenshot of the output. (Please note that this is a hard-coded layout. The labs can be anywhere in the timetable (for a lab, two consecutive slots must be combined), and two lectures at the same time can also happen in any slot. For that, there should be a horizontal separator in that lecture slot.)

Awais Shahid

2 Answers

You can use Jinja for HTML templating and pdfkit to convert to PDF (for pdfkit to work you may have to install wkhtmltopdf on your OS; it is not a Python package). I made a direct example based on your data and on your HTML example:

from typing import List

import pdfkit
from jinja2 import FileSystemLoader, Environment

input_data = {
    "CS-3B": [['', '', 'DS ', '', 'COaAL '], ['', 'COaAL ', '', 'DS ', 'OOP '],
              ['DS-L ', 'DS-L ', 'OOP-L ', 'OOP-L ', 'FoM '], ['COaAL-L ', 'COaAL-L ', 'OOP ', '', ''],
              ['', 'FoM ', 'DE ', '', 'DE ']],
    "SE-3A": [['', 'OOP-L ', 'OOP-L ', '', 'SRE '], ['SRE ', 'OOP ', 'DS-L ', 'DS-L ', ''], ['', 'DS ', '', '', 'MM '],
              ['DS ', 'MM ', '', 'LA ', ''], ['OOP ', 'HCI ', '', 'LA ', 'HCI ']],
    "CS-7F": [['', '', '', '', ''], ['RSaG ', '', '', '', ''], ['ST ', '', 'RSaG ', '', ''], ['', '', '', '', ''],
              ['', 'ST ', '', '', '']],
    "CS-1C": [['IS ', 'ECaC-L ', 'ECaC-L ', '', 'PF '], ['ECaC ', 'PF-L ', 'PF-L ', 'ECaC-L ', 'ECaC-L '],
              ['DLD ', 'ECaC ', '', 'PF ', 'ItIaCT '], ['DLD-L ', 'DLD-L ', 'IS ', 'LA ', ''],
              ['ECaC ', 'ECaC ', 'ItIaCT ', 'DLD ', 'LA ']],
    "CS-1D": [['PF-L ', 'PF-L ', 'ECaC-L ', 'ItIaCT ECaC-L ', 'ItIaCT '], ['IS ', 'AP ', 'ECaC-L ', 'ECaC-L ', ''],
              ['PF ', 'PF ', '', 'ECaC ', ''], ['CaAG ', 'ECaC ', 'ECaC ', '', 'IS '],
              ['', 'CaAG ', '', 'ECaC ', 'AP ']],
    "CS-7A": [['', 'DM ', '', 'PPiI ', 'DS '], ['AI-L ', 'AI-L ', '', 'AI ', 'IS '], ['', '', 'DS ', '', ''],
              ['SE ', 'SE ', '', 'PPiI ', ''], ['', 'AI ', 'IS ', '', 'DM ']],
    "CS-7B": [['', 'DS ', '', 'DS ', 'DM '], ['', '', '', 'PPiI ', ''], ['', 'PPiI ', '', 'SE ', ''],
              ['', 'DM ', '', 'IS ', ''], ['', '', 'IS ', 'SE ', '']],
    "CS-1B": [['LA ', '', '', 'DLD ', 'DLD '], ['ECaC ', 'IS ', '', 'PF ', 'ECaC '],
              ['ECaC-L ', 'ECaC-L ', 'DLD-L ', 'DLD-L ', 'ItIaCT '], ['ECaC ', 'PF-L ', 'PF-L ', 'ECaC-L ', 'ECaC-L '],
              ['ECaC ', 'PF ', 'IS ', 'LA ', 'ItIaCT ']],
    "CS-1A": [['', 'PF-L ', 'PF-L ', 'ECaC ', ''], ['ECaC ', '', 'ItIaCT ', 'LA ', 'ECaC '],
              ['PF ECaC-L ', 'ItIaCT ECaC-L ', '', 'DLD-L ', 'DLD-L '], ['IS ', 'PF ', 'ECaC-L ', 'ECaC-L ', ''],
              ['DLD ', 'IS ', 'LA ', 'DLD ', 'ECaC ']],
    "CS-7D": [['AML ', '', 'IS ', '', 'AML '], ['', '', '', '', ''], ['IS ', 'SfMD ', '', '', ''],
              ['', '', '', '', 'SfMD '], ['PPiI ', '', 'PPiI ', '', '']],
    "CS-7C": [['SfMD ', '', '', 'AML ', ''], ['PPiI ', '', '', '', ''], ['', 'SfMD ', '', '', ''],
              ['', '', 'AML ', 'IS ', ''], ['', '', 'PPiI ', 'IS ', '']],
    "CS-3C": [['MM ', 'COaAL-L ', 'COaAL-L ', 'DS ', ''], ['', '', '', '', ''], ['DS-L ', 'DS-L ', 'DS ', '', 'DE '],
              ['', '', '', '', ''], ['', 'DE ', '', '', 'MM ']],
    "CS-5C": [['', 'CN-L ', 'CN-L ', '', 'CN '], ['PaS ', 'CN ', '', '', 'ToA '], ['', '', '', 'SDaA ', 'AP '],
              ['AP ', '', '', 'ToA ', 'SDaA '], ['', 'PaS ', '', '', '']],
    "CS-5B": [['', '', 'WP ', '', ''], ['WP ', 'ToA ', 'MM ', 'CN-L ', 'CN-L '], ['SDaA ', '', '', 'MM ', 'CN '],
              ['SDaA ', '', '', 'ToA ', ''], ['', '', '', 'CN ', '']],
    "CS-1E": [['PF-L ', 'PF-L ', 'AP ', 'ECaC ', 'ECaC '], ['ECaC-L ', 'ECaC-L ', 'PS ', 'ItIaCT ', 'AP '],
              ['', 'PF ', 'CaAG ', 'ECaC-L ', 'ECaC-L '], ['PS ', '', 'ItIaCT ', '', ''],
              ['', 'CaAG ', 'PF ', 'ECaC ', 'ECaC ']],
    "SE-3B": [['LA ', '', '', '', ''], ['DS ', 'HCI ', '', '', ''], ['DS ', 'LA ', '', '', ''],
              ['', 'DS-L ', 'DS-L ', 'SRE ', 'F&A '], ['F&A ', 'HCI ', '', '', 'SRE ']],
    "SE-5B": [['', '', '', 'PaS ', 'TaBW '], ['SCaD-L ', 'SCaD-L ', 'SCaD ', 'OR ', 'SQE '],
              ['', '', 'TaBW ', '', 'SCaD '], ['', 'SQE ', '', '', ''], ['PaS ', '', '', '', 'OR ']],
    "SE-5A": [['OS-L ', 'OS-L ', 'OS ', 'SCaD-L ', 'SCaD-L '], ['OR ', 'DS ', '', 'OR ', 'TaBW '],
              ['DS-L ', 'DS-L ', 'PaS ', 'SCaD ', 'OS '], ['', 'SQE ', 'SCaD ', 'PaS ', 'TaBW '],
              ['', '', 'DS ', '', 'SQE ']],
    "CS-3A": [['DS-L ', 'DS-L ', 'LA ', 'CaAG ', 'DS '], ['F&A ', 'DS ', 'DLD ', 'DS ', 'OOP '],
              ['CaAG ', 'LA ', 'COaAL ', 'OOP-L ', 'OOP-L '], ['DE AP ', 'COaAL-L ', 'COaAL-L ', 'OOP ', 'COaAL '],
              ['AP ', 'DE ', 'F&A ', 'DLD ', 'DS ']]
}


def organise_input_data(elements: List[List[str]]) -> List[list]:
    """
    Organises the input data to find double courses for easier use in templates
    """
    new_elements = []
    for day in elements:
        last_course = None
        course_list = []
        index = 0
        for course in day:
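            # cleanup data: trim whitespace; a space separates parallel
            # courses, so render it as a horizontal rule inside the cell
            course = course.strip().replace(" ", "<hr>")
            # check if long course (and not lunch time)
            if course != "" and course == last_course and index != 3:
                # widen the previous cell to two slots and add a placeholder
                # (colspan 0) that the template skips
                course_list[-1] = (course, 2)
                course_list.append(("none", 0))
                # the pair is complete; a further equal slot starts a new cell
                last_course = None
            else:
                course_list.append((course, 1))
                last_course = course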
            index += 1
        new_elements.append(course_list)

    return new_elements


def generate_html(template, name: str, elements: List[list]) -> str:

    new_elements = organise_input_data(elements=elements)

    rendered = template.render(
        name=name,
        monday=new_elements[0],
        tuesday=new_elements[1],
        wednesday=new_elements[2],
        thursday=new_elements[3],
        friday=new_elements[4]
    )

    with open(f"out_{name}.html", "w+") as file:
        file.write(rendered)

    return rendered


def run():
    # Init jinja
    file_loader = FileSystemLoader('.')
    env = Environment(loader=file_loader)
    template = env.get_template('template.html')

    full_text = ""
    for name, elements in input_data.items():
        full_text += generate_html(template=template, name=name, elements=elements)

    pdfkit.from_string(full_text, "out.pdf")


if __name__ == '__main__':
    run()
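
To see what the data preparation step produces, here is a small standalone sketch of the same idea applied to Monday of CS-1D. The helper name to_cells is made up for this illustration and is not part of the script above; it only mirrors the (name, colspan) tuple convention.

```python
from typing import List, Tuple


def to_cells(day: List[str]) -> List[Tuple[str, int]]:
    """Hypothetical helper: collapse one day's slots into (label, colspan) pairs."""
    cells: List[Tuple[str, int]] = []
    previous = None
    for index, slot in enumerate(day):
        # parallel courses are separated by a space; show them split by <hr>
        label = slot.strip().replace(" ", "<hr>")
        # same course as the previous slot, and not across the lunch break
        if label and label == previous and index != 3:
            cells[-1] = (label, 2)       # widen the previous cell
            cells.append(("none", 0))    # placeholder the template skips
            previous = None
        else:
            cells.append((label, 1))
            previous = label
    return cells


monday = ['PF-L ', 'PF-L ', 'ECaC-L ', 'ItIaCT ECaC-L ', 'ItIaCT ']
print(to_cells(monday))
# [('PF-L', 2), ('none', 0), ('ECaC-L', 1), ('ItIaCT<hr>ECaC-L', 1), ('ItIaCT', 1)]
```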

The organise_input_data function is only for data preparation. The template.html looks like:

<!DOCTYPE html>
<html>
  <style>
.center
{
  text-align: center;

}
td{
  height:75px;
  width:150px;
}


  </style>
<body>
<!-- Heading -->
    <h1 class="center">{{name}}</h1>

<!-- Table -->
    <table border="5" cellspacing="5" align="center">

<!-- Day/Periods -->
        <tr>
            <td class="center" ><br>
                <b>Day/Period</b></br>
            </td>
            <td class="center" >
                <b>I</b>
            </td>
            <td class="center" >
                <b>II</b>
            </td>
            <td class="center">
                <b>III</b>
            </td>
            <td class="center">
                <b>1:15-1:45</b>
            </td>
            <td class="center" >
                <b>IV</b>
            </td>
            <td class="center" >
                <b>V</b>
            </td>

        </tr>
<!-- Monday -->
        <tr>
            <td class="center">
                <b>Monday</b></td>
            {% for course in monday %}
                {% if loop.index == 4 %}
                    <td rowspan="6" class="center">
                        <h2>L<br>U<br>N<br>C<br>H</h2>
                    </td>
                {% endif %}
                {% if course[1] != 0 %}
                    <td colspan={{course[1]}} class="center">{{course[0]}}</td>
                {% endif %}
            {% endfor %}

        </tr>
<!-- Tuesday -->
        <tr>
            <td class="center">
                <b>Tuesday</b>
            </td>
            {% for course in tuesday %}
                {% if course[1] != 0 %}
                    <td colspan={{course[1]}} class="center">{{course[0]}}</td>
                {% endif %}
            {% endfor %}
        </tr>
<!-- Wednesday -->
        <tr>
            <td class="center">
                <b>Wednesday</b>
            </td>
            {% for course in wednesday %}
                {% if course[1] != 0 %}
                    <td colspan={{course[1]}} class="center">{{course[0]}}</td>
                {% endif %}
            {% endfor %}
        </tr>
<!-- Thursday -->
        <tr>
            <td class="center">
                <b>Thursday</b>
            </td>
            {% for course in thursday %}
                {% if course[1] != 0 %}
                    <td colspan={{course[1]}} class="center">{{course[0]}}</td>
                {% endif %}
            {% endfor %}

        </tr>
<!-- Friday -->
        <tr>
            <td class="center">
                <b>Friday</b>
            </td>
            {% for course in friday %}
                {% if course[1] != 0 %}
                    <td colspan={{course[1]}} class="center">{{course[0]}}</td>
                {% endif %}
            {% endfor %}

        </tr>

    </table>
</body>

</html>

Explanation: Jinja is a templating language/library, so you can create HTML with some variables and logic in it (mostly in {} brackets) and do not have to build the whole HTML in Python itself. I decided to create tuples for each course with the course name and duration (1 or 2 time slots in a row). If you control the whole process of timetable creation, you can generate this data directly instead of your input_data dictionary.

The example is very simple and stays very close to your given data to make the step to it easy. You can also create a (in my eyes) cleaner HTML file by using more logic in HTML/Jinja, like:
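
To make the tuple idea concrete, here is a minimal sketch of rendering a single row with Jinja. The inline template string, the names row_template and cells, and the sample tuples are assumptions for illustration only; the real template is the template.html file shown above.

```python
from jinja2 import Template

# Inline template for brevity (an assumption; the answer loads template.html).
row_template = Template(
    "<tr><td><b>{{ day }}</b></td>"
    "{% for label, span in cells %}"
    "{% if span != 0 %}<td colspan=\"{{ span }}\">{{ label }}</td>{% endif %}"
    "{% endfor %}</tr>"
)

# (name, colspan) tuples; colspan 0 marks a slot swallowed by the cell before it
cells = [("PF-L", 2), ("none", 0), ("ECaC-L", 1), ("ItIaCT<hr>ECaC-L", 1), ("ItIaCT", 1)]

row_html = row_template.render(day="Monday", cells=cells)
print(row_html)
```

The placeholder tuple never reaches the output, so the lab cell with colspan 2 fills both of its slots.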

{% for day in days %}
    {% with courses=course_list[loop.index-1] %}
        {% include 'template_day.html' %}
    {% endwith %}
{% endfor %}

and an extra template file (template_day.html) with the logic per day:

<tr>
    <td class="center">
        <b>{{day}}</b>
    </td>
    {% for course in courses %}
        {% if day == "Monday" and loop.index == 4 %}
            <td rowspan="6" class="center">
                <h2>L<br>U<br>N<br>C<br>H</h2>
            </td>
        {% endif %}
        {% if course[1] != 0 %}
            <td colspan={{course[1]}} class="center">{{course[0]}}</td>
        {% endif %}
    {% endfor %}
</tr>

With this the HTML is much shorter (because all days are iterated in the template and not declared manually). So if you want to add more data (like the room), you only have to change it in one place and not in five. The new part of the render call will be shorter too:

rendered = template.render(
        name=name,
        days=["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
        course_list=new_elements
    )
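
Since all sections go into a single PDF, it may help to put each timetable on its own page. Below is a minimal sketch, assuming each section has already been rendered to a table fragment (not a full HTML document) and that the converter honours the CSS page-break-after property, which wkhtmltopdf generally does. The helper name combine_sections and the PAGE_BREAK constant are made up for this illustration.

```python
from typing import Dict

# Empty block that asks the PDF converter to start a new page after it
PAGE_BREAK = '<div style="page-break-after: always;"></div>'


def combine_sections(section_tables: Dict[str, str]) -> str:
    """Hypothetical helper: wrap per-section table fragments in one HTML document."""
    parts = []
    for name, table_html in section_tables.items():
        parts.append(f'<h1 class="center">{name}</h1>\n{table_html}')
    # a page break goes between sections, not after the last one
    body = f"\n{PAGE_BREAK}\n".join(parts)
    return f"<!DOCTYPE html>\n<html>\n<body>\n{body}\n</body>\n</html>"


document = combine_sections({
    "CS-1D": "<table><tr><td>PF-L</td></tr></table>",
    "CS-7A": "<table><tr><td>DM</td></tr></table>",
})
print(document)
```

The returned string could then be passed to pdfkit.from_string in place of full_text.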
D-E-N

There are many ways to generate such a layout with dynamic data. For Python, I'd generally recommend either using Excel or using a web-based layout, and then generating your output PDF from that.

Using HTML Manipulation

If we are talking about the web-based layout, you can use BeautifulSoup, and the approach would look like this:

  1. Create a standard layout with the cells having unique ids for each position (you could use something similar to your last example).

  2. Loop through your bigger containers which might need to be split; you can use BeautifulSoup with an if condition to add different HTML depending on whether you need to split the data or not, e.g.:

import bs4

with open("template.html") as t:
    text = t.read()
    # name the parser explicitly so bs4 does not have to guess one
    soup = bs4.BeautifulSoup(text, "html.parser")

td = soup.find(id="B6")
entry = soup.new_tag("p")
entry.append("LAB")
# insert the new tag after the current tag
td.insert_after(entry)

  3. You can additionally, at this stage, add the layout dynamically, possibly row by row, adding the td tags instead of their contents.

  4. Finally, once done, you can save your HTML file using:

with open("template.html", "w") as t:
    t.write(str(soup))
  5. After saving it, you need to generate the PDF using pdfkit, which is very straightforward:

import pdfkit
# template.html is a local file, so from_file is the matching call
pdfkit.from_file('template.html', 'out.pdf')
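
The layout from step 1 (one unique id per cell) could also be generated instead of written by hand. Here is a rough sketch using only the standard library; the id scheme (column letter plus row number, e.g. "B3") and the names DAYS, SLOTS and build_grid are assumptions for illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
SLOTS = 5  # lecture slots per day


def build_grid() -> str:
    """Hypothetical helper: empty timetable with one addressable td per slot."""
    rows = []
    for row_number, day in enumerate(DAYS, start=1):
        # ids run A1..E1 for Monday, A2..E2 for Tuesday, and so on
        cells = "".join(
            f'<td id="{chr(ord("A") + col)}{row_number}"></td>'
            for col in range(SLOTS)
        )
        rows.append(f"<tr><td><b>{day}</b></td>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(build_grid())
```

Each cell can then be looked up with soup.find(id=...) and filled in as in step 2.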

Using Excel Manipulation

I will not go into as much depth for the Excel method, but if you prefer it you can look into using openpyxl, and you can see a number of examples here.

After generating your Excel file, you can convert it into a PDF using pandas together with pdfkit. You can find more information in the answer to this question, but to keep the info here for completeness' sake:

  1. Convert excel to a pandas object
  2. Convert pandas object to html
  3. Convert html to pdf

Sample code taken from Thomas Devoogdt's answer:

import pandas as pd
import pdfkit

df = pd.read_excel("file.xlsx")
df.to_html("file.html")
pdfkit.from_file("file.html", "file.pdf")

Conclusion

While there are many other methods to generate this kind of layout (e.g. using OpenCV), it can get very complicated to generate the layouts you have in mind manually. Using either Excel or, preferably, HTML manipulation (since it seems you have worked with HTML and can already create the layout you have in mind in it) would give you a more flexible approach.

Zaid Al Shattle