from python import confusion
At some point in your attempt to master Python, the import statement starts to cause trouble. In this article, I'm going to try to clarify how packages, modules and imports work together so you'll never have to play whack-a-mole again. I'll scaffold an 'import-naive' project structure, explain how Python import resolution works, and then use our new-found understanding to fix it.
Here's the layout:
app/
    app.py
    module_a.py
    lib/
        util.py              # problem: can't run this directly
        log.py
        format.py
    test/
        test_module_a.py
        test_util.py
And here it is again, annotated with the relevant import statements:
app/
    app.py                   # import module_a
    module_a.py              # from lib import util
    lib/
        util.py              # from . import log, format
        log.py
        format.py
    test/
        test_module_a.py     # from .. import module_a
        test_util.py         # from ..lib import util
From inside your app directory you can run the app as a script with python app.py. This also works when you run the top-level submodule with python module_a.py, as well as an import-free module in a subdirectory, such as lib/log.py. No problems so far.
But what if you have a file like lib/util.py that pulls in sibling files log.py and format.py? Unfortunately, running python lib/util.py as a script fails:
Traceback (most recent call last):
  File "/path/to/app/lib/util.py", line 1, in <module>
    from . import log, format
    ^^^^^^^^^^^^^^^^^^^^^^^^^
ImportError: attempted relative import with no known parent package
Seeing this, you might say to yourself: ah, my app needs to be a package. I'll go and add __init__.py files to the app/ and lib/ directories, because I've heard that will make an app into a package.
That's a good step to take. It means your app is now a regular package. But the truth is that since Python 3.3 your app was already a package by default, because a directory can be imported as a package even without an __init__.py file. Not a regular package, a namespace package. But a package nonetheless!
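To see the two kinds of package side by side, here is a minimal sketch. It assumes nothing about our app: it builds two throwaway directories in a temporary folder (demo_ns_pkg and demo_regular_pkg are hypothetical names I made up to avoid clashing with real modules), imports both, and reports what each one thinks its file is.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def package_kinds():
    """Import one directory without and one with __init__.py and describe each."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical package names, used only for this experiment.
        (root / "demo_ns_pkg").mkdir()
        (root / "demo_regular_pkg").mkdir()
        (root / "demo_regular_pkg" / "__init__.py").write_text("")

        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            ns = importlib.import_module("demo_ns_pkg")
            regular = importlib.import_module("demo_regular_pkg")
            # A namespace package has no __init__.py, so it has no file of its own.
            ns_file = getattr(ns, "__file__", None)
            regular_file = Path(regular.__file__).name
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(str(root))
            sys.modules.pop("demo_ns_pkg", None)
            sys.modules.pop("demo_regular_pkg", None)
    return ns_file, regular_file


print(package_kinds())
```

Both directories import without complaint. The only visible difference here is that the regular package is backed by its __init__.py, while the namespace package has no file of its own.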
After adding the __init__.py files, we have this structure:
app/
    __init__.py
    app.py                   # import module_a
    module_a.py              # from lib import util
    lib/
        __init__.py
        util.py              # from . import log, format
        log.py
        format.py
    test/
        __init__.py
        test_module_a.py     # from .. import module_a
        test_util.py         # from ..lib import util
It's definitely better, but it still fails when you run python lib/util.py.
Let's read the error more closely: 'attempted relative import with no known parent package'. What does 'no known parent package' mean? Isn't lib a package?
To sort this out, we'll need to dig into two features of the Python module system:
- the module attribute, __name__, which tells us how our module was loaded
- the sys.path attribute, which tells us where Python goes looking for modules referenced by the import statement
What's in a __name__?
Every module has a global __name__ attribute whose value indicates how that module is being run.
Let's demonstrate by adding a lib/exp.py file and putting print(__name__) at the top.
# file: lib/exp.py
print(__name__)
If we run it directly with python lib/exp.py, it prints __main__.
If we import it into our main app.py file and run python app.py, lib/exp.py prints something interesting: lib.exp. The value in this case tells us the full module path of exp relative to the root of the project. From this we learn that, from the perspective of app.py as the main module, the exp module is registered as a submodule of the lib package.
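If you'd rather not touch your own project to try this, the experiment can be sketched in a throwaway directory. The script below is my own scaffolding, not part of the app: it writes a tiny lib/exp.py and app.py into a temporary folder, then runs exp.py once as a script and once via an import from app.py, using whichever Python interpreter is running the sketch.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run(args, cwd):
    """Run the current interpreter with args inside cwd and return its stdout."""
    result = subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def name_demo():
    """Return the __name__ that lib/exp.py sees as a script and as an import."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "lib").mkdir()
        (root / "lib" / "__init__.py").write_text("")
        (root / "lib" / "exp.py").write_text("print(__name__)\n")
        (root / "app.py").write_text("import lib.exp\n")

        as_script = run(["lib/exp.py"], cwd=root)  # python lib/exp.py
        as_import = run(["app.py"], cwd=root)      # python app.py
    return as_script, as_import


print(name_demo())
```

The first value is the name exp.py gets when it is the entry point, the second is the name it gets when app.py imports it.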
So when imported, lib/util.py gets a module path and 'knows' where it sits in the app. But when run directly, it doesn't. What can we do about this?
The answer is to use the Python interpreter's -m flag, and pass in a module path (e.g. lib.util) instead of a file path (lib/util.py). But we haven't covered enough for that to make sense yet. We need to talk about sys.path first.
Module names resolve against sys.path
sys.path is a list of file paths that the Python interpreter assembles on startup to resolve module names specified in your import statements. Importantly, it checks the paths in that list in order and stops when it can resolve a module (again, namespace packages work a little differently). There is a ton of complexity to how sys.path is constructed (see the David Beazley talk in the references section), but all we need to know to solve our import issue is how it determines the first path in the list.
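The 'in order, first match wins' behaviour is easy to check for yourself. Here's a small sketch under my own assumptions: demo_dupmod is a hypothetical module name, and the two temporary directories stand in for two entries on sys.path that both contain a module with that name.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def first_match_wins():
    """Put the same module name in two directories and see which one is imported."""
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        # demo_dupmod is a hypothetical module name used only for this experiment.
        Path(first, "demo_dupmod.py").write_text("WHERE = 'first'\n")
        Path(second, "demo_dupmod.py").write_text("WHERE = 'second'\n")

        original_path = list(sys.path)
        sys.path[:0] = [first, second]  # 'first' is searched before 'second'
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_dupmod")
            return module.WHERE
        finally:
            # Restore sys.path and forget the demo module.
            sys.path[:] = original_path
            sys.modules.pop("demo_dupmod", None)


print(first_match_wins())
```

The copy in the second directory is never looked at, because the search stops at the first entry that can satisfy the import.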
Let's experiment again by printing sys.path at the top of our lib/exp.py.
# file: lib/exp.py
import sys  # needed for sys.path below

print(__name__)
print(sys.path)
If we run python lib/exp.py to execute the file as a script, its name is __main__ and the first path in sys.path becomes the directory where the __main__ file is located, /full/path/to/app/lib.
That means when it tries to resolve our imports, Python will first look inside of /full/path/to/app/lib.
Recall our previous error: 'attempted relative import with no known parent package'. Looking at sys.path, we can now see why Python fails at from . import log, format: the file was loaded as a top-level script named __main__, so the leading dot has no parent package to refer to, and there are no packages named 'lib' in /full/path/to/app/lib because /full/path/to/app/lib itself is the package we're looking for.
Don't fret! We can use python -m, which creates a different sys.path entry. From /full/path/to/app, we can run python -m lib.util and it will put the current working directory (where we are running python -m from) at the head of sys.path. In this case, it's /full/path/to/app. It will also load the module and run it as __main__.
Since the first sys.path entry is now one directory level up, the lib directory is visible as a package, so the module import resolves correctly.
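To watch both behaviours in one go, here is a rough sketch that rebuilds just the lib package in a temporary directory and runs util.py both ways. The file contents are my own stand-ins (empty log.py and format.py, plus an extra print in util.py that is not in the original app), so treat it as an illustration rather than the real project.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def script_vs_module():
    """Run lib/util.py as a script and as a module and return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        lib = root / "lib"
        lib.mkdir()
        for filename in ("__init__.py", "log.py", "format.py"):
            (lib / filename).write_text("")
        # The print line is added for illustration only.
        (lib / "util.py").write_text(
            "from . import log, format\n"
            "print(__name__, __package__)\n"
        )

        def run(*args):
            return subprocess.run(
                [sys.executable, *args], cwd=root, capture_output=True, text=True
            )

        as_script = run("lib/util.py")     # python lib/util.py
        as_module = run("-m", "lib.util")  # python -m lib.util
    return as_script, as_module


as_script, as_module = script_vs_module()
print(as_script.returncode, as_script.stderr.strip().splitlines()[-1])
print(as_module.returncode, as_module.stdout.strip())
```

The script run exits with an error and the familiar ImportError as the last line of its traceback. The -m run succeeds: the module is still named __main__, but its __package__ is now lib, which is what gives the leading dot something to refer to.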
To highlight the difference: for python -m, the current directory is your source code's module reference point for resolution. When running python app.py, it's instead the directory that contains the script file (whose name is '__main__').
Running everything with python -m
If python -m lets us run modules with the correct module paths for resolution, you might ask: can I run everything this way?
Yes, you can.
If we put our source directory, app/, into a parent directory, package-demo/, and run python -m app.app from our new root, package-demo, it will load app.py as a module and run it so that its __name__ is '__main__'.
Note that any absolute imports in your app will need adjusting now that we have created this directory, because they must start from the new root (import module_a becomes import app.module_a). Relative ones do not need to change.
Here's our new structure:
package-demo/                    # run python -m app.app from inside this directory
    app/
        app.py                   # import app.module_a
        module_a.py              # from app.lib import util
        lib/
            util.py              # from . import log, format
                                 # OR from app.lib import log, format
            log.py
            format.py
        test/
            test_module_a.py     # import app.module_a
            test_util.py         # import app.lib.util
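If you want to convince yourself that this layout really runs, the sketch below scaffolds it in a temporary directory and launches python -m app.app from the new root. A few things here are my own assumptions rather than part of the app: every file just prints its __name__ before doing its imports, the test directory is left out, and I've kept empty __init__.py files in app/ and app/lib/ so they are regular packages.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# File contents mirror the annotated layout; the print calls are for illustration.
FILES = {
    "app/__init__.py": "",
    "app/app.py": "print(__name__)\nimport app.module_a\n",
    "app/module_a.py": "print(__name__)\nfrom app.lib import util\n",
    "app/lib/__init__.py": "",
    "app/lib/util.py": "print(__name__)\nfrom . import log, format\n",
    "app/lib/log.py": "print(__name__)\n",
    "app/lib/format.py": "print(__name__)\n",
}


def run_app_as_module():
    """Scaffold package-demo/ and run python -m app.app from inside it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "package-demo"
        for relative_path, source in FILES.items():
            target = root / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(source)
        result = subprocess.run(
            [sys.executable, "-m", "app.app"],
            cwd=root, capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


print(run_app_as_module())
```

The entry point reports __main__, and every module it pulls in reports a full dotted path starting at app, which is exactly what the absolute and relative imports need in order to resolve.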
And what about tests?
We can also use python -m <package> to run builtin, standard library or pip-installed modules. Let's try it out with pytest.
Assuming you have installed pytest (pip install pytest will do it), make sure you're in package-demo and run python -m pytest -s app/test to execute your tests.
Conclusion
That's three separate types of files you can run, all with python -m: your app entrypoint, your submodules and your tests. I hope this helped clarify your mental model of how packages, modules and imports work together.
References
- David Beazley's talk, 'Live and Let Die', on Modules and Packages: https://www.youtube.com/watch?v=0oTh1CXRaQ0
- Stack Overflow, 'Relative imports for the billionth time': https://stackoverflow.com/questions/14132789/relative-imports-for-the-billionth-time/79209936#79209936
- Python 3 module tutorial: https://docs.python.org/3/tutorial/modules.html