I'm hoping someone familiar with the tabula-py module for Python can help with this question. The tabula-py documentation does not make clear whether the tabula.read_pdf() function uses lattice or stream mode extraction by default when neither the lattice nor the stream argument is passed. Does the code somehow guess which of the two modes is preferable depending on the "table" encountered in the PDF text? If not, could you please clarify which of the two extraction modes is the default? (A fixed default would seem to make one of the two arguments redundant, since setting lattice to False would by definition mean setting stream to True, and vice versa.) Thanks in advance.
It's easy to set the tabula.read_pdf() mode to either lattice or stream extraction, so that's not my issue. I just want to know which of the two is used as the default extraction mode if I don't specify which one I want to use.
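
For reference, here is a minimal sketch of the three kinds of call I am comparing. The helper names mode_kwargs and read_tables are hypothetical, made up for this illustration, and pages="all" is just one possible choice; only tabula.read_pdf() and its lattice and stream keyword arguments come from tabula-py itself.

```python
from typing import Optional


def mode_kwargs(mode: Optional[str]) -> dict:
    """Build the lattice/stream keyword arguments for tabula.read_pdf().

    mode is "lattice", "stream", or None. None passes neither flag,
    which is the default case this question is asking about.
    """
    if mode is None:
        return {}
    if mode not in ("lattice", "stream"):
        raise ValueError(f"unknown mode: {mode!r}")
    return {mode: True}


def read_tables(pdf_path: str, mode: Optional[str] = None):
    """Hypothetical wrapper: read every table in pdf_path with the given mode."""
    # Imported lazily so mode_kwargs() works without tabula-py (and Java) installed.
    import tabula

    return tabula.read_pdf(pdf_path, pages="all", **mode_kwargs(mode))
```

With this, read_tables("example.pdf", "lattice") and read_tables("example.pdf", "stream") set the mode explicitly, while read_tables("example.pdf") passes neither argument. It is the behaviour of that last call that I am asking about ("example.pdf" is a placeholder file name).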