I'm using Python 3.5 with the Anaconda distribution. tabula-py version 1.1.1 is installed. When I run the following simple program:

    import tabula

    df = tabula.read_pdf("sample.pdf", pages=1, encoding="ISO-8859-1")
    df.columns = df.iloc[0]
    df.drop(0, inplace=True)

I get the following error message:

AttributeError: module 'tabula' has no attribute 'read_pdf'

HOWEVER: If I open Spyder and first type "import tabula" in the IPython console before running the code, it runs just fine. If I restart the kernel, I get the same error until I close and reopen Spyder.

Any thoughts? Thanks in advance.

Makoto
Steve Olsen
  • Yes. Spyder has a shared namespace. Imported modules are cached. You should not be relying on this behaviour; you should explicitly import the module at the top of your script – roganjosh Sep 24 '18 at 19:57
  • Did you call your script `tabula.py`? – roganjosh Sep 24 '18 at 20:01

1 Answer

Spyder has a shared namespace between your console and your scripts. I answered the reverse of this problem here.

Anything defined in the console will be accessible in the namespace of the scripts you run. The module imports are cached across all of your scripts, so you can import it once in the console and then access it in all of your scripts indefinitely (until you reset the kernel).
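The caching described above is ordinary Python behaviour rather than something specific to Spyder, and it can be observed through `sys.modules`. The following is a minimal sketch; the standard-library `json` module is used only as a stand-in for any imported module, and the variable names are illustrative.

```python
import importlib
import sys

import json  # stand-in for any module; the first import executes the module's code

# After the first import, the module object is stored in sys.modules.
cached = sys.modules["json"]

# A second import of the same name is a lookup in sys.modules;
# the module's code is not executed again.
import json as json_again

same_object = json_again is cached

# importlib.reload re-executes the module's code inside the existing module object.
reloaded = importlib.reload(json)
still_same = reloaded is cached

print(same_object, still_same)
```

Because the cache lives in the running interpreter, a name imported once in the console stays available until the kernel is restarted.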

You should not rely on this behaviour because the script will not work outside of Spyder. Instead, you should explicitly import the module at the top of your script.

In this case, it's likely that you've called your script tabula.py and you should rename it.
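One way to check for this kind of name shadowing is to ask Python where it would load the module from and whether the expected attribute is present. This is only a sketch: the helper `locate` is a hypothetical name introduced here, not part of tabula-py or Spyder.

```python
import importlib
import importlib.util


def locate(name, attr):
    """Report where `name` would be imported from and whether it exposes `attr`."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing importable under this name on the current sys.path.
        return {"found": False, "origin": None, "has_attr": False}
    module = importlib.import_module(name)
    return {
        "found": True,
        "origin": spec.origin,  # path of the file Python resolved for this name
        "has_attr": hasattr(module, attr),
    }


if __name__ == "__main__":
    # An origin inside site-packages suggests the installed library;
    # a path to a tabula.py in your own project suggests a shadowing script.
    print(locate("tabula", "read_pdf"))
```

Running this both from the script and from the console after a kernel restart should show whether the two resolve `tabula` to the same file.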

roganjosh
  • Not sure I understand your comment, roganjosh. Look at my script. The first line is "import tabula", so I did import it at the top of my script, and I didn't call it tabula-py. – Steve Olsen Sep 25 '18 at 14:30
  • @SteveOlsen yes, but imports are cached so the import at the top of your script does nothing after you import a module with that name in the console. What do you get for `import tabula; print(tabula.__file__)` a) after restarting the kernel and only importing in the script and b) restarting the kernel and only importing in console? – roganjosh Sep 25 '18 at 14:33
  • If I run it in the editor before running it in the console, I get the above error, "no attribute 'read_pdf'". If I run it in the console before running it in the editor, it works. If I restart the kernel, it won't work unless I shut down Spyder; I get the same error even if I do `import tabula` in the console. If I run `import tabula; print(tabula.__file__)` in either the console or the editor, I get the same thing: it maps to the `__init__.py` file in the tabula library. – Steve Olsen Sep 25 '18 at 14:58
  • One more note: If I open Spyder, highlight just the first row (import tabula) and run that single line in the editor, then re-run the entire program, that also works. But if I start out just running the program in the editor, it throws the error. – Steve Olsen Sep 25 '18 at 15:03
  • After trying every thread on the net on fixing this, with no luck, I changed to working on another Python project for a few days, which involved installing and updating various packages. When I came back to this project - the problem is gone. No idea why - sorry for others having this problem, this obviously doesn't help you. After 5 years developing apps in SAS, I love Python, but have to admit with greater power open source software can be a little buggy sometimes. – Steve Olsen Oct 17 '18 at 19:23