Skip to content

Instantly share code, notes, and snippets.

@Kasimir123
Last active July 25, 2023 16:34
Show Gist options
  • Save Kasimir123/9bdc38f2eac9a19f83ef6cc52c7ce82e to your computer and use it in GitHub Desktop.
Save Kasimir123/9bdc38f2eac9a19f83ef6cc52c7ce82e to your computer and use it in GitHub Desktop.
Skeleton of the patch that is being applied to projects vulnerable to CVE-2007-4559. Based off of code from pip's unpacking.py.
import os
def is_within_directory(directory, target):
abs_directory = os.path.abspath(directory)
abs_target = os.path.abspath(target)
prefix = os.path.commonprefix([abs_directory, abs_target])
return prefix == abs_directory
def safe_extract(tar, path=".", members=None, *, numeric_owner=False):
for member in tar.getmembers():
member_path = os.path.join(path, member.name)
if not is_within_directory(path, member_path):
raise Exception("Attempted Path Traversal in Tar File")
tar.extractall(path, members, numeric_owner=numeric_owner)
@mara004
Copy link

mara004 commented Feb 27, 2023

(1) False negatives, e.g. dest_dir = Path('/dest') and member.name = Path('/dest2').

Oh, I see. Thanks! (I wrongly assumed str(path) would add a trailing slash to directories.)
The is_relative_to() strategy should work correctly, though?

You could simply change dest_dir.resolve() to Path(path).resolve(), assuming that's how one initialises Path objects (you didn't show us from where you import this type).

Imported from pathlib, of course. Doing Path(dest_dir).resolve() would be an option but isn't necessary if the caller uses pathlib already.

@mara004
Copy link

mara004 commented Mar 2, 2023

@jowagner Erm, doesn't the os.path.commonprefix() strategy have exactly the same false negative, also in @Kasimir123's (and pip's) code above? Shouldn't it rather be os.path.commonpath()?

@mara004
Copy link

mara004 commented Mar 2, 2023

>>> from pathlib import Path
>>> import os.path
>>> p1 = Path("/a/b/c")
>>> p2 = Path("/a/b/c2")
>>> os.path.commonprefix([p1, p2])
'/a/b/c'
>>> os.path.commonprefix([p1, p2]) == str(p1)
True
>>> os.path.commonpath([p1, p2])
'/a/b'
>>> p2.is_relative_to(p1)
False

@mara004
Copy link

mara004 commented Jul 25, 2023

See also PEP 706 for recent progress on this.

Here's an updated safe extraction implementation I'm now using for my projects:
https://gist.github.com/mara004/6fe0ac15d0cf303bed0aea2f22d8531f


And FWIW, some threads confirming my suspicion that os.path.commonprefix() is invalid in the Trellix patch:
https://stackoverflow.com/a/43613742/15547292
http://grokbase.com/t/python/python-ideas/153nyg6qt1/os-path-commonprefix-yes-that-old-chestnut

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment