I am using the Python package tabula-py to read a table from a PDF. Whenever a table cell contains a line break, the contents of that cell are split across multiple cells in the output.
I have searched for other Python packages that might solve this problem, but tabula-py seems to be the most reliable one for converting PDF tables into pandas DataFrames. If this cannot be fixed, I will have to fall back on an online service, which does produce the Excel output I want.
from tabula import read_pdf
# Read the tables from every page of the PDF
df = read_pdf("C:/Users/Desktop/test.pdf", pages='all')
I expected this to convert the PDF table correctly, with each multi-line cell kept as a single cell.
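One possible workaround is to repair the split cells in pandas after extraction. The following is only a minimal sketch, and it rests on an assumption that may not hold for every PDF: that each continuation line shows up as its own row with an empty (NaN) value in the first column. The helper name `merge_continuation_rows` and the sample data are hypothetical, made up purely for illustration.

```python
import pandas as pd

def merge_continuation_rows(df, key_col=0, sep=" "):
    """Merge rows whose key column is empty into the nearest row above.

    Assumption: a row with a missing value in the key column is the
    continuation of a multi-line cell from the previous row.
    """
    key = df.columns[key_col]
    # A non-empty key starts a new logical row; cumsum turns that into group ids.
    group_ids = df[key].notna().cumsum().to_numpy()

    def join_cells(col):
        # Join the non-missing pieces of one column within one logical row.
        # Note: values are converted to strings in the process.
        parts = [str(v) for v in col if pd.notna(v)]
        return sep.join(parts) if parts else None

    return df.groupby(group_ids).agg(join_cells).reset_index(drop=True)

# Hypothetical data imitating a table whose second row is a continuation line.
raw = pd.DataFrame({
    "Item": ["A", None, "B"],
    "Description": ["first line", "second line", "only line"],
})
result = merge_continuation_rows(raw)
print(result)
```

Separately, `read_pdf` accepts a `lattice=True` argument, which may be worth trying if the table has visible ruling lines between cells, since the cell boundaries are then taken from those lines rather than from whitespace. Whether it helps depends on the PDF.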