
#Background and Problem#

I'm trying to build a web scraper to back up my social media accounts (summer project, I know it's useless).

I'm trying to create a nice class structure, so I've come up with the following structure (I accept critique, I'm pretty new to programming):

\social-media-backup

    \Files
        __init__.py
        File.py
        Image.py
        Video.py

    \SocialMedia
        SocialMediaFeed.py
        SocialMediaPost.py
        \Instagram
            __init__.py
            \MediaTypes
                __init__.py
                GraphImage.py
                GraphVideo.py
            \SearchTypes
                __init__.py
                InstagramUser.py
        \Twitter
        \VSCO

(Twitter and VSCO are empty for now. Anything without an extension and starting with \ is a folder. Every file contains a class with the same name as the file.)

#Questions#

  1. Where can I learn Python's packaging system definitively? Any book or web-site recommendation?
  2. How would I import Image into GraphImage? How would I import File into SocialMediaPost?
  3. What do I need to write in __init__.py so that importing SOME_PACKAGE also imports every module in it? (e.g.: import Files and have Files.Image and Files.Video accessible).

(I know there are a lot of questions. They're written in order of importance)

#To accomplish importing File into SocialMediaPost I've tried:#

from Files.File import File

from ...Files.File import File

import File

from File import File

And almost any combination you can imagine.

I always get one of these errors: Unable to import, No module named '__main__.Files', or Attempted relative import beyond top-level package.

#Expected behavior#

I'm used to Java's way of doing this, and I cannot figure out how to do this in Python. It seems so messed up. I really miss just adding a package and a folder tree from where the compiler would run.

knocks desk THERE MUST BE A BETTER WAY

##THANKS!!##

Peanut14
  • Try starting from root of the project (but before, rename it to social_media_backup) - **"from social_media_backup.Files.File ..."**, also, it's widely recommended to name modules with lowercase, and use underscore instead of dash. Also, put **__init__.py** file in **social_media_backup**. – Damian Jan 28 '19 at 04:21
  • Putting `WhateverClass` in a file named `WhateverClass.py` is a *really bad* idea in Python. If you really want one class per file, at least name it `whateverclass.py` so you don't get super-confusing name collisions between the module and the class. – user2357112 Jan 28 '19 at 04:22
  • Stackoverflow is a better place for "One question at a time". One thing you should consider is, where do you launch this app from? You should search for `Minimal Python Package`. It's not very trivial, but you should take your time to read through Packaging – user1767754 Jan 28 '19 at 06:22
  • @Damian Thanks for your reply! I was using [this](https://www.python.org/dev/peps/pep-0008/#class-names) to name my classes, but I mixed it up with how you do things in Java. I think something clicked in me with this: when you import a module you get access to the class within it as `module.class`. However, if you `from module import class`, you can just use `class`? I'll change my naming convention and try all other solutions after that. – Peanut14 Jan 28 '19 at 15:54
  • @user2357112 I think I understand why with Damian's comment. I'll be sure to change that before attempting any other solution. – Peanut14 Jan 28 '19 at 16:04
  • @user1767754 I'll read up on that. I know I have much reading to do, but I haven't found a clear and concise source that gives many examples. In my past (hobby) Python projects, I've always tried imports until it worked, but it's about time I learn this. – Peanut14 Jan 28 '19 at 16:05

2 Answers


Lots of stuff is written about this... however most guides focus on how you do it, not what to do.

How I do it (for small to medium-sized projects):

  1. Do not mess with sys.path.
  2. Have one "project root" directory with your modules / packages underneath (as you already do).
  3. Use absolute imports always, except for "sister" modules.
  4. Always run as module, i.e. using python -m foo.bar

Concrete example using your structure. Assuming that your entry point might be in \SocialMedia\SocialMediaFeed.py, use import statements:

  • from . import SocialMediaPost (sister module)
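  • from SocialMedia import Instagram (child package; a bare import Instagram would be searched for in the project root, where it does not exist)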
  • from Files import Image (other module)

and run using: python -m SocialMedia.SocialMediaFeed

By running as a module from the project root (social-media-backup), that directory is always added to the module search path. This way absolute imports referring to its subfolders always work. By the way, you can print out the module search path using import sys; print(sys.path).
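To make the difference concrete, here is a minimal sketch that builds a throwaway project in a temporary directory and runs the same file both ways. The package and module names (files, social_media, feed, post, image) are hypothetical lowercase stand-ins for the layout in the question, not the asker's actual code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical miniature of the project layout, using lowercase module names.
DEMO_FILES = {
    "files/__init__.py": "",
    "files/image.py": "class Image:\n    pass\n",
    "social_media/__init__.py": "",
    "social_media/post.py": "class Post:\n    pass\n",
    "social_media/feed.py": (
        "from . import post        # sister module: relative import\n"
        "from files import image   # other package: absolute import\n"
        "print(post.Post.__name__, image.Image.__name__)\n"
    ),
}


def build_demo_tree(root: Path) -> None:
    """Write the demo files below the given project root."""
    for relative_path, source in DEMO_FILES.items():
        path = root / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source)


def run_python(args, cwd: Path) -> subprocess.CompletedProcess:
    """Run the current interpreter with the given arguments from cwd."""
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_tree(root)
    # Run as a module from the project root: both imports resolve.
    as_module = run_python(["-m", "social_media.feed"], root)
    # Run the same file as a plain script: the relative import fails.
    as_script = run_python([str(Path("social_media") / "feed.py")], root)

print("as module, exit code:", as_module.returncode)
print("as script, exit code:", as_script.returncode)
```

Run as a script, the file has no parent package, so the relative import cannot be resolved; that is the situation behind most of the errors listed in the question.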

Some of this might seem overcomplicated, but I found that following the above points pays off greatly as soon as you try to package up stuff for installation (keyword setup.py).

Edit: to answer the third question, have the package's __init__.py contain:

from . import File
from . import Image
from . import Video
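A small sketch of what those lines change, assuming two hypothetical throwaway packages (files_eager and files_bare) that differ only in their __init__.py. With the imports in place, a plain import of the package makes its submodules reachable as attributes; with an empty __init__.py they are not loaded until imported explicitly.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# __init__.py contents that pull the submodules in when the package is imported.
INIT_WITH_IMPORTS = "from . import image\nfrom . import video\n"


def make_package(root: Path, name: str, init_source: str) -> None:
    """Create a package with two submodules and the given __init__.py."""
    package_dir = root / name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(init_source)
    (package_dir / "image.py").write_text("class Image:\n    pass\n")
    (package_dir / "video.py").write_text("class Video:\n    pass\n")


def attribute_access_works(root: Path, name: str) -> bool:
    """Import only the package, then try to reach its submodules."""
    code = f"import {name}; {name}.image.Image; {name}.video.Video"
    result = subprocess.run(
        [sys.executable, "-c", code], cwd=root, capture_output=True, text=True
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_package(root, "files_eager", INIT_WITH_IMPORTS)
    make_package(root, "files_bare", "")
    eager_ok = attribute_access_works(root, "files_eager")
    bare_ok = attribute_access_works(root, "files_bare")

print("with imports in __init__.py:", eager_ok)
print("with empty __init__.py:", bare_ok)
```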
Torben Klein
  • Thanks for the awesome reply. I was trying to avoid modifying `sys.path` and `PYTHONPATH` at all costs. When you say I should always use absolute imports, are they relative to my CMD execution directory? I mean, if I execute `python -m SocialMedia.SocialMediaFeed` standing in `PATH\TO\MY\PROJECT\DIRECTORY\social-media-downloader`, shouldn't I make all of my imports `from social-media-backup.some.other.folder import some_thing`? (I've already changed - for _ in my project, as suggested). – Peanut14 Jan 28 '19 at 19:04
  • Yes - on script start, your current directory is added to the python search path (`sys.path` variable). The import statement should not include the root folder name: `from some.other.folder import some_thing`. Oh, and your package folder names must not include hyphens, dots, spaces or the like. Otherwise import is not possible. – Torben Klein Jan 29 '19 at 06:14
  • Yeah, I've fixed my naming after some commenters pointed it out. After understanding that 1 class =/= 1 module, I've flattened my structure a la Python Zen. I was getting confused due to the `-m` flag needing to be executed from the main package directory, so I had import errors all over the place. But thanks to your answer, I now understand what Python does to `sys.path` and how that affects my imports. I'll refer back to it if at any time I find import troubles again! Thanks! – Peanut14 Jan 29 '19 at 14:23

I would second the comments by Damian and user2357112 - try to avoid name collisions between folder, file and class/function when creating modules.

You probably won't be able to import anything outside of the current working directory without adding it to your PYTHONPATH. Adding a folder to your PYTHONPATH environment variable means that python will always check that folder when importing modules, so you'll be able to import it from anywhere.
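As a rough illustration of that effect, the sketch below tries to import a module that lives in an unrelated directory, first without and then with that directory on PYTHONPATH. The module name my_shared_module and both temporary directories are made up for the demo, and the variable is set only for the child process rather than permanently.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def can_import(module: str, cwd: str, extra_env=None) -> bool:
    """Try to import a module in a fresh interpreter started from cwd."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean slate for the demo
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        cwd=cwd,
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as lib_dir, \
        tempfile.TemporaryDirectory() as work_dir:
    # The module lives in lib_dir, but the interpreter starts in work_dir.
    (Path(lib_dir) / "my_shared_module.py").write_text("VALUE = 1\n")
    found_without = can_import("my_shared_module", work_dir)
    found_with = can_import(
        "my_shared_module", work_dir, {"PYTHONPATH": lib_dir}
    )

print("importable without PYTHONPATH:", found_without)
print("importable with PYTHONPATH:", found_with)
```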

There is a good thread on this already that will put you in the right direction: Permanently add a directory to PYTHONPATH? (It's a lot to cover in one post)

Cameron Hyde
  • Thanks for replying! I kind of knew about `PYTHONPATH`, but was trying to leave it alone as that would become too "computer-specific". This, in turn, would make my application hard to port to another computer. Right? It seems a bit hacky to add entries to `PYTHONPATH` via script, so I'm trying to avoid it. Anyways, I still think it's a useful tool to have in the toolbox :). – Peanut14 Jan 28 '19 at 19:10
  • Yes, in your case PYTHONPATH is not the way to go; I failed to realise that portability is important for you! I have added a single folder to my PYTHONPATH from which all my custom modules can be imported, but I should also have mentioned that you must be careful not to pollute the PYTHONPATH by shadowing existing modules! Torben's answer is much more useful for your problem! – Cameron Hyde Jan 29 '19 at 20:38