@tiran
Last active May 21, 2024 08:46
Negative Python user experience on Debian/Ubuntu

The user experience of Python on a minimal Debian or Ubuntu installation is bad. Core features like virtual environments, pip bootstrapping, and the ssl module are either missing or do not work as designed and documented. Some Python core developers, including me, are worried and consider Debian/Ubuntu's packaging harmful to Python's reputation and branding. Users don't get what they expect.

Reproducer

The problems can be easily reproduced with official Debian and Ubuntu containers in Docker or Podman. Debian Stable (Debian 10 Buster) comes with Python 3.7.3. Ubuntu Focal (20.04 LTS) has Python 3.8.5.

Run Debian container

$ docker run -ti debian:stable

Run Ubuntu container

$ docker run -ti ubuntu:focal

Install Python3

# apt update
# apt install python3

venv is broken

venv is another Python standard library module. It provides support for creating lightweight "virtual environments". The venv module is available but dysfunctional. It cannot create virtual environments out of the box.

# python3 -m venv /tmp/venv
The virtual environment was not created successfully because ensurepip is not
available.  On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.

    apt-get install python3-venv

You may need to use sudo with that command.  After installing the python3-venv
package, recreate your virtual environment.

Failing command: ['/tmp/venv/bin/python3', '-Im', 'ensurepip', '--upgrade', '--default-pip']

Update: Julien Palard wrote that one of his students ran into another issue with venv. Debian's venv can give invalid advice when a user has multiple Python versions installed.
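
A minimal sketch of how a script could cope with this: check whether ensurepip can be found and create the environment without pip when it cannot. The helper names ensurepip_available and create_venv are made up for this illustration. The sketch assumes that creating a venv without pip still succeeds on the affected systems, since the failure shown above happens in the ensurepip step.

```python
import importlib.util
import venv
from pathlib import Path


def ensurepip_available() -> bool:
    """Return True if the ensurepip package can be located (hypothetical helper)."""
    return importlib.util.find_spec("ensurepip") is not None


def create_venv(target, with_pip=None) -> Path:
    """Create a virtual environment and return the path of its pyvenv.cfg.

    When with_pip is None, pip is bootstrapped only if ensurepip was found.
    """
    if with_pip is None:
        with_pip = ensurepip_available()
    venv.EnvBuilder(with_pip=with_pip).create(str(target))
    return Path(target) / "pyvenv.cfg"
```

An environment created this way has no pip inside it, so packages still need another installation route.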

ensurepip is missing

The ensurepip package is part of Python's standard library and provides support for bootstrapping the pip installer into an existing Python installation or virtual environment. The ensurepip package is missing on Debian/Ubuntu.

# python3 -m ensurepip
/usr/bin/python3: No module named ensurepip
# pip
bash: pip: command not found

After python3-venv is installed, ensurepip fails with a different error message:

# python3 -m ensurepip
ensurepip is disabled in Debian/Ubuntu for the system python.

Python modules for the system python are usually handled by dpkg and apt-get.

    apt-get install python-<module name>

Install the python-pip package to use pip itself.  Using pip together
with the system python might have unexpected results for any system installed
module, so use it on your own risk, or make sure to only use it in virtual
environments.
# echo $?
1

distutils is stripped down and missing most code

The distutils package is mostly missing. Only the package root and distutils.version are available. The remaining code has been moved to python3-distutils by the Debian/Ubuntu packagers. The python3-distutils package is not installed with python3 and must be installed separately.

# python3
Python 3.7.3 (default, Jul 25 2020, 13:03:44)  
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import distutils
>>> from distutils import sysconfig
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
ImportError: cannot import name 'sysconfig' from 'distutils' (/usr/lib/python3.7/distutils/__init__.py)
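
Code that only needs installation paths or build variables can often use the top-level sysconfig module instead of distutils.sysconfig. The sketch below is an illustration, not a drop-in replacement: install_locations is a made-up helper, it assumes the sysconfig module is present in the minimal package, and the paths it reports may differ from what distutils would have returned on Debian/Ubuntu.

```python
import sysconfig


def install_locations() -> dict:
    """Collect a few values that callers often took from distutils.sysconfig."""
    paths = sysconfig.get_paths()
    return {
        # Directory for pure-Python packages (site-packages or dist-packages).
        "purelib": paths["purelib"],
        # Directory holding the Python C headers.
        "include": paths["include"],
        # Filename suffix for C extension modules; may be None on some builds.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }
```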

ssl module cannot verify connections

A minimal installation has no CA certificates because neither the python3 package nor the OpenSSL libraries depend on ca-certificates.

>>> import urllib.request
>>> urllib.request.urlopen("https://pypi.org/")
Traceback (most recent call last):
 ...
 File "<stdin>", line 1, in <module>
 File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
   return opener.open(url, data, timeout)
 File "/usr/lib/python3.7/urllib/request.py", line 525, in open
   response = self._open(req, data)
 File "/usr/lib/python3.7/urllib/request.py", line 543, in _open
   '_open', req)
 File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
   result = func(*args)
 File "/usr/lib/python3.7/urllib/request.py", line 1367, in https_open
   context=self._context, check_hostname=self._check_hostname)
 File "/usr/lib/python3.7/urllib/request.py", line 1326, in do_open
   raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>
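
One way to diagnose this before making a request is to look at the locations OpenSSL would search for trusted certificates. The following is a rough sketch; ca_store_report is a hypothetical helper, and an existing, non-empty directory is only a hint that certificates are installed, not proof that verification will succeed.

```python
import os
import ssl


def _dir_has_entries(path: str) -> bool:
    """Return True if the directory contains at least one entry."""
    with os.scandir(path) as entries:
        return any(True for _ in entries)


def ca_store_report() -> dict:
    """Describe the default CA file and CA directory seen by the ssl module."""
    paths = ssl.get_default_verify_paths()
    # cafile and capath are None when the file or directory does not exist.
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "has_cafile": paths.cafile is not None,
        "has_capath_entries": paths.capath is not None
        and _dir_has_entries(paths.capath),
    }
```

If both has_cafile and has_capath_entries are False, a failure like the one above is to be expected until CA certificates are installed.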

Incompatible OpenSSL downstream patch

Debian/Ubuntu have applied downstream patches to OpenSSL. The patches have broken user applications and Python's CI tests. Examples of issues and workarounds:

lib2to3 is missing

The lib2to3 package has been moved to the python3-lib2to3 package, which is not installed by default.

tkinter is in an extra package (ok)

The tkinter package is not part of the default distribution. For once this is a good decision. tkinter depends on libtk and a whole lot of X11 libraries. Graphical user interface libraries should not be installed by default on headless servers and containers. I just find it confusing that the tkinter package is provided by a python3-tk package and not by python3-tkinter.

Python 3.9 is missing dependency on tzdata

Paul Ganssle added a zoneinfo implementation with time zones to Python 3.9, see PEP 615. The feature requires the tzdata database. As of 2020-11-13, Debian's and Ubuntu's python3.9 packages are missing a dependency on the tzdata package. The zoneinfo module does not work without tzdata:

>>> import zoneinfo
>>> zoneinfo.available_timezones()
set()
>>> zoneinfo.ZoneInfo("CET")
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/zoneinfo/_common.py", line 24, in load_tzdata
    raise ZoneInfoNotFoundError(f"No time zone found with key {key}")
zoneinfo._common.ZoneInfoNotFoundError: 'No time zone found with key CET'

NOTE: The issue has been fixed by Anthony Sottile in the Deadsnakes PPA, see comment.

UPDATE: My Launchpad bug 1904271 was closed as Invalid. Matthias wrote that tzdata is a required package and pointed to Debian policy. However, the package is not installed by default in the official Debian and Ubuntu container images.
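
Applications that must run on such images can detect the missing database up front and fail with a clearer message. This is a sketch under assumptions: tzdata_available and require_zone are made-up names, and the hint in the error message relies on zoneinfo's documented fallback to the tzdata distribution on PyPI when no system database is found.

```python
import zoneinfo


def tzdata_available() -> bool:
    """Return True if zoneinfo can see at least one time zone key."""
    return bool(zoneinfo.available_timezones())


def require_zone(key: str) -> zoneinfo.ZoneInfo:
    """Load a time zone or raise an error that says how to fix the setup."""
    try:
        return zoneinfo.ZoneInfo(key)
    except zoneinfo.ZoneInfoNotFoundError as exc:
        raise RuntimeError(
            f"time zone {key!r} not found; install the system tzdata package "
            "or the tzdata distribution from PyPI"
        ) from exc
```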

New virtualenvs contain unwanted libraries

Virtualenvs contain de-vendored dependencies of pip and setuptools, see https://bugs.launchpad.net/ubuntu/+source/python-virtualenv/+bug/1904945

Expectations and Proposal

Minimizing the Python installation is a legitimate effort. However, a minimal installation of Python with core features missing should not be called a Python installation. Users should expect that package-manager install python3 gets them a working Python interpreter with the majority of stdlib packages (with the exception of the tkinter GUI and the test package).

I propose

  1. Debian's current minimized Python package python3 should rather be called python3-minimal or something similar. This package would still allow users to get a stripped-down interpreter if they explicitly ask for it.
  2. apt install python3 should provide a Python installation with working venv, ensurepip (*), distutils, and ssl modules.

(*) I define a working ensurepip as follows: python3 -m ensurepip does not fail and python3 -m pip works afterwards. This does not imply that the stdlib's pip bundle must be shipped with the Python distribution package. Debian could also provide an API-compatible ensurepip facade and make the python3 package depend on python3-pip.
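
The expectation in point 2 can be turned into a small smoke test. The sketch below only checks that the modules can be located, not that they function correctly; EXPECTED, is_importable, and missing_modules are names invented for this example. Note that distutils was removed from the standard library in Python 3.12, so it is reported as missing there on any platform.

```python
import importlib.util

# Modules named in the proposal; adjust the tuple for newer Python versions.
EXPECTED = ("venv", "ensurepip", "distutils.sysconfig", "ssl")


def is_importable(name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed.
        return False


def missing_modules(names=EXPECTED) -> list:
    """Return the subset of names that cannot be located."""
    return [name for name in names if not is_importable(name)]
```

Calling missing_modules() on a complete installation of Python 3.11 or older should return an empty list.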

@flowerbug

https://gist.github.com/tiran/2dec9e03c6f901814f6d1e8dad09528e#gistcomment-3598489

@stefanor thank you for writing that, you can ask me about what i was doing and i'm willing to help out on any issue i report as best i can. i did admit i was using Debian testing and perhaps it was a transition or update in progress issues. it did get resolved somehow, but i'm not sure how. :)

i'm actually probably a pretty good newbie user because while i do have computer experience i do not have python experience, so when i do hit something and report it it is likely something a novice user might be confused by. my e-mail address is available in the mailing list for debian-users, and that is a good place to respond as it does reach others besides me who might have a use for that information.

i'm also willing to help out any other way i can, but i'm overwhelmed at times like anyone else. the best i can do and have been doing is using Debian testing and reporting things as i come across them, that is my primary way of helping. if i'm submitting a bug or issue i'm always willing to dig into something, please don't be afraid to ask.

my primary problem in understanding Python and how it is set up in any system is figuring out how to get back to a guaranteed default state where i know that all Debian and the base Python packages are installed, verifiably so, and all caches are cleared out so what i am seeing is not the result of some cached code that is being used instead of what i think is being used.

the virtual environment is the primary way i hope to accomplish this. :)

i have both a stable and testing partition available and i can also set up unstable or experimental if needed, so breakages are often something i can work around for the short term. if you want me to test or try something out or have a question ask away.

as for Debian packaging of Python, i don't understand the issues or conflicts with upstream. i've read this whole conversation and it looks like there's a lot of history. knowing certain DDs (debian developers) and their communication styles for many years (from reading bug reports and various mailing lists; i don't interact with them often so they wouldn't know me personally), it is an issue that is personality and sometimes conflict driven, and one that causes friction even in the best of times, let alone when there is pressure and a release coming up.

@tiran
Author

tiran commented Feb 9, 2021

Y'all,

could you please move the discussion to a more appropriate place? I propose https://discuss.python.org/ .

@stefanor

stefanor commented Feb 10, 2021

@flowerbug: Ah, I misunderstood your issue earlier. I have no idea what I was reading, maybe I confused you with somebody else. The danger of trying to respond to too many things in one thread..

Yeah that looks like a broken state in the 3.8->3.9 transition, caused by a distutils mismatch to the stdlib.

That's reproducible with 3.9 and 3.10 co-installed. Let's see if we can improve it...

Debian Bug: https://bugs.debian.org/979819

@ehashman

I am going to follow up with a post to the Debian Python mailing list and will not continue monitoring this gist.

@ssbarnea

ssbarnea commented Mar 22, 2021

Apparently the ansible package installed by ubuntu/(debian?) is also broken: it cannot be called when a virtualenv is activated. The rpm version on fedora is not affected by the same bug, and if someone installs it using pip, it works.

Because GitHub has the "inspiration" of preinstalling ansible (from apt) on their GHA images, they also provide a partially broken set of ansible binaries, ones that work only in some cases. I am going to raise an issue with them, asking them to remove ansible from the preinstalled list, or replace it with a pip/pipx installed version, preferably at user level instead of root.

actions/runner-images#3001 -- request to remove broken ansible from GHA

@webknjaz

Apparently the ansible package installed by ubuntu/(debian?) is also broken: it cannot be called when a virtualenv is activated. The rpm version on fedora is not affected by the same bug, and if someone installs it using pip, it works.

AFAIK Fedora uses full interpreter paths in their installed scripts so they aren't affected by the env vars...

@stefanor

Debian strongly encourages full interpreter paths for that reason, too.
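
To illustrate the point about interpreter paths: a script whose first line is #!/usr/bin/env python3 is resolved through PATH, so activating a virtualenv changes which interpreter runs it, while a full path such as #!/usr/bin/python3 does not. A rough sketch for inspecting an installed script follows; read_shebang and uses_env_lookup are hypothetical helpers, not part of any packaging tool.

```python
import os


def read_shebang(path):
    """Return the shebang line of a script without the leading '#!', or None."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", "replace").strip()


def uses_env_lookup(shebang: str) -> bool:
    """Return True if the shebang delegates interpreter lookup to env."""
    parts = shebang.split()
    return bool(parts) and os.path.basename(parts[0]) == "env"
```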

@sfermigier

FYI:

root@puer:~# lsb_release -d
Description:	Ubuntu 20.04.2 LTS
root@puer:~# python -m ensurepip
ensurepip is disabled in Debian/Ubuntu for the system python.

Python modules for the system python are usually handled by dpkg and apt-get.

    apt-get install python-<module name>

Install the python-pip package to use pip itself.  Using pip together
with the system python might have unexpected results for any system installed
module, so use it on your own risk, or make sure to only use it in virtual
environments.

# But...
root@puer:~# apt-get install python-pip
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package python-pip

# 💡
root@puer:~# apt-get install python3-pip
Reading package lists... Done
Building dependency tree
Reading state information... Done
python3-pip is already the newest version (20.0.2-5ubuntu1.1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

# 😖
root@puer:~# pip

Command 'pip' not found, but there are 18 similar ones.

I have 25 years of experience with Python. I wonder how a beginner would feel in the same situation.

(BTW: the solution was to use pip3 instead of pip).
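
A way to sidestep the pip versus pip3 naming question is to run pip as a module of a specific interpreter. The sketch below builds such a command from Python; pip_command and pip_version are names invented for this example, and pip_version returns None when pip is not installed for the running interpreter.

```python
import subprocess
import sys


def pip_command(*args: str) -> list:
    """Build a command line that runs pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def pip_version():
    """Return pip's version line, or None if pip is unavailable."""
    result = subprocess.run(
        pip_command("--version"), capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```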

@barneygale

I'm a casual Python and Linux Mint user. Naive question: could pip be made to generate a .deb, and then hand off to apt/dpkg for the final installation? I think that would prevent apt and pip stomping on each other, and so might ameliorate some of the Debian concerns for including ensurepip. Virtualenv has been split up into plugins for different tasks. Could the same happen to pip, with a DebPackager plugin interfacing with apt/dpkg?

@webknjaz

could pip be made to generate a .deb, and then hand off to apt/dpkg for the final installation?

Pip is not the right tool for this. It should be a PEP 517 capable build front-end that works within the Debian packaging ecosystem. And even so, it has nothing to do with Debian's wish to delete/relocate parts of packages (OS-level package managers already use proper artifacts with all the files to build stuff).

@webknjaz

I think that would prevent apt and pip stomping on each other

There's an INSTALLER file in the installed dist metadata. It contains the string pip if the package was installed by pip; other package managers are supposed to put their own names there and only modify a package if they installed it, based on that file.
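
As a small illustration, the file can be read with importlib.metadata from the standard library. installer_of is a made-up helper name; it returns None when the distribution is not installed or ships no INSTALLER file.

```python
from importlib import metadata


def installer_of(dist_name: str):
    """Return the content of a distribution's INSTALLER file, or None."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")
    return text.strip() if text else None
```

For a package installed by pip, installer_of would be expected to return the string pip; what other package managers record there is up to them.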

@pradyunsg

pradyunsg commented Jan 6, 2022

@tiran If there have been some improvements on this front or consensus on actionable items here, it might make sense to update this gist to note those at the end.

PS: Happy new year folks! Hope you're doing well, to the extent that you can in the current state of things. :)

@tiran
Author

tiran commented Jan 7, 2022

@tiran If there have been some improvements on this front or consensus on actionable items here, it might make sense to update this gist to note those at the end.

There have been some improvements in the latest Debian and Ubuntu. However, bug https://bugs.launchpad.net/ubuntu/+source/python3.6/+bug/1879310 is still not fixed for Ubuntu 20.04 LTS and Debian 10. Python's ecosystem cannot rely on the presence of CA certificates for another 3 to 8 years.
