I am trying to extract borderless tables from a PDF document. I have tried a few combinations of the table_settings parameter, but pdfplumber does not recognize the borderless tables correctly.
The PDF file can be downloaded from the link.
Here is my code:
    import pdfplumber

    pdf_file = "pdffile"

    with pdfplumber.open(pdf_file) as pdf:
        for i in range(len(pdf.pages)):
            try:
                # Only the page at index 7 holds the table I am after
                if i == 7:
                    bold_title_text = pdf.pages[i]
                    ff = bold_title_text.extract_table(
                        table_settings={
                            "vertical_strategy": "text",
                            "horizontal_strategy": "lines",
                            "keep_blank_chars": True,  # boolean, not the string "True"
                            "snap_tolerance": 4,
                        }
                    )
                    # display() is available in Jupyter/IPython; ff[1] is the second row
                    display(ff[1])
            except IndexError:
                print("")
                break
Actual output: ['Element','nt Attribute Size Input Type Requirement']
Expected output: ['Element', 'Attribute', 'Size', 'Input Type', 'Requirement']
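
To make the expected result concrete, here is a minimal sketch of the column split I am after, assuming the column boundaries are known in advance. The helper name split_row_by_boundaries, the word positions, and the boundary values are all hypothetical and not taken from the actual PDF. The word dictionaries only mimic the "text" and "x0" keys that pdfplumber's extract_words() returns.

```python
from bisect import bisect_right


def split_row_by_boundaries(words, boundaries):
    """Group the words of one row into cells using explicit x boundaries.

    words: dicts with "text" and "x0" keys (same shape as extract_words() items).
    boundaries: sorted x positions at which each new column starts.
    """
    cells = [[] for _ in range(len(boundaries) + 1)]
    for word in sorted(words, key=lambda w: w["x0"]):
        # Count how many boundaries lie at or to the left of the word's left edge
        col = bisect_right(boundaries, word["x0"])
        cells[col].append(word["text"])
    return [" ".join(cell) for cell in cells]


# Hypothetical positions for the header row (not measured from the real PDF)
header_words = [
    {"text": "Element", "x0": 50.0},
    {"text": "Attribute", "x0": 150.0},
    {"text": "Size", "x0": 250.0},
    {"text": "Input", "x0": 300.0},
    {"text": "Type", "x0": 330.0},
    {"text": "Requirement", "x0": 400.0},
]
boundaries = [140.0, 240.0, 290.0, 390.0]  # assumed column starts

row = split_row_by_boundaries(header_words, boundaries)
print(row)
```

If boundaries like these can be read off the page, the same x positions (plus the table's left and right edges) could presumably be passed to pdfplumber itself through "vertical_strategy": "explicit" and "explicit_vertical_lines" in table_settings, instead of relying on the "text" strategy.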