I read some articles about using a virtual environment in Docker. Their argument is that the purpose of a virtual environment in Docker is to introduce isolation and limit conflicts with system packages etc.
However, aren’t Docker and Python-based images (e.g., python:*) already doing the same thing?
Can someone eli5 this whole thing?
It’s not necessary but there is no reason not to.
Pros:
- production and development programs are more similar
- upgrading your base image won’t affect your python packages
- you can use multi stage builds to create drastically smaller final images
Cons:
- you have to type `venv/bin/python3` instead of just `python3` in the RUN line of your Dockerfile
Hah my base python has never seen a command.
The biggest reason for me is that local dev happens in a venv, so translating to a container is then 100% 1:1.
> upgrading your base image won’t affect your python packages
Surely if upgrading python will affect your global python packages it will also affect your venv python packages?
> you can use multi stage builds to create drastically smaller final images
This can also be done without venvs; you just need to copy the packages to the location where global packages are installed.
Upgrading the base image does not imply updating your Python, and even updating your Python does not imply updating your Python packages (except for the standard library, of course).
Sure, but in the case where you upgrade python and it affects python packages it would affect global packages and a venv in the same way.
Sure, if that happens. But it may also not happen, which is actually usually the case. It’s not 100% safe, but it is safer.
It’s easy to set the PATH to include the venv in the Dockerfile; that way you never have to activate it, neither in the RUN line nor when you exec into the container. This also makes all your custom entry points super easy to use. Bonus: it’s super easy to use uv to get super fast image builds like that. See this example: https://gist.github.com/dwt/6c38a3462487c0a6f71d93a4127d6c73
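To see why putting the venv on the PATH removes the need to activate it, here is a rough Python sketch of the lookup the shell performs. It assumes nothing about the linked gist; the helper name `resolve_with_venv_first` and the throwaway venv location are made up for illustration. In a Dockerfile the equivalent would typically be an `ENV PATH=...` line with the venv's `bin` directory placed first.

```python
import os
import shutil
import tempfile
import venv
from pathlib import Path
from typing import Optional


def resolve_with_venv_first(venv_dir: Path, command: str = "python") -> Optional[str]:
    """Return the executable `command` resolves to when the venv's
    bin directory is placed at the front of the search path."""
    bin_dir = venv_dir / ("Scripts" if os.name == "nt" else "bin")
    search_path = os.pathsep.join([str(bin_dir), os.environ.get("PATH", "")])
    return shutil.which(command, path=search_path)


# Build a throwaway venv (without pip, to keep it fast) and check the lookup.
demo_dir = Path(tempfile.mkdtemp()) / "venv"
venv.create(demo_dir, with_pip=False, symlinks=(os.name != "nt"))
resolved = resolve_with_venv_first(demo_dir)
print(resolved)  # a path inside the throwaway venv, not the system interpreter
```

Because the venv's directory comes first, a plain `python` (and any console script installed into the venv) resolves there without an activate step.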
If you’re on an Apple silicon Mac, Docker performance can be atrocious if you are emulating. It can also be inconvenient to work with Docker volumes and networks. Python already has `pyenv` and tools like `poetry` and `rye`. Unless there’s a need for Docker, I personally would generally avoid it (tho I do almost all my deployments via Docker containers).
It’s a bit unclear to me what you refer to with “their argument”. What argument exactly?
The need for isolation inside a container, even with a Python image.
Are you referring to https://hynek.me/articles/docker-virtualenv/ ?
not exactly.
Does Hynek’s article convince you?
yes, but will need some more practical usage to fully grasp.
Could you share the article?
I can think of only two reasons to have a venv inside a container:
- If you’re running third-party services inside a container, pinned to different Python versions.
- If you do local development without Docker and have scripts that activate the venv from inside the script. If you move those scripts inside the container, there is no venv to activate anymore. But then it’s easy to just check an environment variable and skip the activation if inside Docker.
For most applications, it seems like an unnecessary extra step.
If you do multi stage builds (example here) it is slightly easier to use venvs.
If you use the global environment you need to hardcode the path to global packages. This path can change when base images are upgraded.
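A quick way to see why that path is fragile: the global package directory is derived from the interpreter itself, and on typical Linux images it contains the Python minor version. The sketch below only prints what the current interpreter reports; the exact layout depends on the base image and is not guaranteed.

```python
import sys
import sysconfig

# Where this interpreter installs pure-Python third-party packages.
global_site_packages = sysconfig.get_paths()["purelib"]
version_tag = f"{sys.version_info.major}.{sys.version_info.minor}"

print(f"Python {version_tag} installs packages into: {global_site_packages}")
```

If a multi stage build copies packages into a hardcoded copy of this path, a base image that ships a different Python minor version may look for them elsewhere. A venv at a fixed location of your choosing avoids hardcoding the interpreter's own layout.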
> But then it’s easy to just check an environment variable and skip, if inside Docker.
How is forcing your script to be Docker-aware simpler than just always creating a venv?
One Docker env variable and one line of code. Not a heavy lift, really. And next time I shell into the container I don’t need to remind everyone to activate the venv.
Creating a venv in Docker just for the hell of it is like creating a symlink to something that never changes or moves.
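For reference, the "one env variable and one line of code" approach might look roughly like this. The variable name `RUNNING_IN_DOCKER` is hypothetical; Docker does not set it for you, so you would have to define it yourself in the Dockerfile. The helper `should_activate_venv` is likewise made up for this sketch.

```python
import os
from typing import Mapping

# Hypothetical variable name; you would set it yourself in the Dockerfile.
DOCKER_FLAG = "RUNNING_IN_DOCKER"


def should_activate_venv(environ: Mapping[str, str] = os.environ) -> bool:
    """Return False when the (assumed) Docker flag is set to a truthy value."""
    return environ.get(DOCKER_FLAG, "").lower() not in {"1", "true", "yes"}


print(should_activate_venv({}))                  # True: no flag, activate the venv
print(should_activate_venv({DOCKER_FLAG: "1"}))  # False: inside Docker, skip it
```

Every script that activates a venv would need to call a check like this, which is the extra branch the reply below objects to.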
How can you be sure it’s one line of code? What if there are several codepaths, and venvs are activated in different places? And in any case, even if there is only one conditional needed, that is still one branch more than necessary to test.
Your symlink example does not make sense. There is something that is changing. In fact, it may even be the opposite: if you need to use file A in a container and file B otherwise, it may make perfect sense to symlink the correct file to C, so that your code does not need to care about it.