I am using tabula to read PDF tables, but I get a file-not-found error. I tried chardet (https://pypi.org/project/chardet/#files) to check whether there is an encoding problem, but the detected encoding was None.

from tabula import read_pdf
from tabulate import tabulate
df = read_pdf(r'C:\Users\YQ\IPA.pdf')
df

FileNotFoundError                         Traceback (most recent call last)
~\Anaconda3\lib\site-packages\tabula\wrapper.py in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, **kwargs)
    107     try:
--> 108         output = subprocess.check_output(args)
    109

~\Anaconda3\lib\subprocess.py in check_output(timeout, *popenargs, **kwargs)
    388     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
--> 389                **kwargs).stdout
    390

~\Anaconda3\lib\subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    465
--> 466     with Popen(*popenargs, **kwargs) as process:
    467         try:

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    768                                 errread, errwrite,
--> 769                                 restore_signals, start_new_session)
    770         except:

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1171                                          os.fspath(cwd) if cwd is not None else None,
-> 1172                                          startupinfo)
   1173         finally:

FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

JavaNotFoundError                         Traceback (most recent call last)
in
----> 1 df = read_pdf('C:\Users\YQ\IPA.pdf')
      2 df

~\Anaconda3\lib\site-packages\tabula\wrapper.py in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, **kwargs)
    109
    110     except FileNotFoundError as e:
--> 111         raise JavaNotFoundError(JAVA_NOT_FOUND_ERROR)
    112
    113     except subprocess.CalledProcessError as e:

JavaNotFoundError: java command is not found from this Python process. Please ensure Java is installed and PATH is set for java
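The final message points at the `java` executable rather than the PDF: the `FileNotFoundError` is raised while `subprocess` tries to launch a program, and tabula re-raises it as `JavaNotFoundError`. A minimal sketch for checking whether `java` is visible to the current Python process, using only the standard library (the helper names `find_java` and `java_version` are made up for illustration and are not part of tabula):

```python
import shutil
import subprocess


def find_java(search_path=None):
    """Return the full path of the `java` executable, or None if not found.

    With search_path=None, the PATH of the current Python process is used,
    which is the same PATH a subprocess launched from it would see.
    """
    return shutil.which("java", path=search_path)


def java_version(java_path):
    """Run `java -version` and return whatever text it prints."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True
    )
    # `java -version` usually writes to stderr, so check both streams.
    return result.stderr or result.stdout


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("java was not found on PATH for this Python process")
    else:
        print("java found at:", java)
        print(java_version(java))
```

If this reports that `java` was not found, installing Java and adding its `bin` directory to `PATH` (then restarting the notebook or shell so the new `PATH` is picked up) is the likely fix.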

  • I think the problem is with the path you are giving: `C:\\Users\\YQ\IPA.pdf`. The path separator varies by operating system. See https://stackoverflow.com/questions/16010992/how-to-use-directory-separator-in-both-linux-and-windows-in-python for more information. – Devang Padhiyar Apr 10 '19 at 04:20
  • The last one looks like Java is required and could not be found. – PowerStat Apr 10 '19 at 05:47
  • Thanks for answering. However, I am using Windows, this path was copied from the system, and I have tried '/' as well. It still doesn't work. – YQ Yang Apr 10 '19 at 07:52
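To rule out the path separator discussed in the comments, the different spellings of the path can be compared directly. This is only a sketch: it uses the path from the question, and `pdf_exists` is a hypothetical helper, not a tabula function.

```python
from pathlib import Path, PureWindowsPath

# Three ways of writing the same Windows path as a Python string.
raw = r"C:\Users\YQ\IPA.pdf"          # raw string, backslashes kept as-is
escaped = "C:\\Users\\YQ\\IPA.pdf"    # every backslash escaped
forward = "C:/Users/YQ/IPA.pdf"       # forward slashes

# The raw and fully escaped spellings are the identical string.
strings_match = raw == escaped

# PureWindowsPath treats both separators the same, on any OS.
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)


def pdf_exists(path):
    """Return True if `path` points at an existing regular file."""
    return Path(path).is_file()


if __name__ == "__main__":
    print("raw == escaped:", strings_match)
    print("backslash and forward-slash paths equal:", same_path)
    print("file exists:", pdf_exists(raw))
```

If `pdf_exists` returns True on the machine in question, the path is fine and the separator is not the cause of the error.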
