SNICKER BIP draft

  BIP: ??
  Layer: Applications
  Title: SNICKER - Simple Non-Interactive Coinjoin with Keys for Encryption Reused
  Author: Adam Gibson <[email protected]>
  Comments-Summary: No comments yet.
  Comments-URI: -
  Status: Proposed
  Type: Informational
  Created: -
  License: BSD-2-Clause

Abstract

SNICKER (Simple Non-Interactive Coinjoin with Keys for Encryption Reused) is a simple method for allowing the creation of a two party coinjoin without any synchronisation or interaction between the participants. It relies on the idea of reused keys (either reused addresses, or identification of ownership of signed inputs and thus pubkeys in signatures). The address reuse scenario may mean that this is suitable as a privacy repair mechanism, but it would likely be used more broadly as a method for opportunistic coinjoin requiring literally zero user (inter)action within wallets. The implementation requirements for wallet developers are minimal.

For a discursive treatment of the idea, see the original blog post[1] (although any conflicting details are superseded by this document).

The purpose of this document is to standardise those features of the protocol which could be shared across wallets.

Copyright

This BIP is licensed under the 2-clause BSD license.

Motivation

Existing use of CoinJoin [2] is very valuable in creating sets of utxos in an obfuscated/mixed state, leveraging to the maximum what has sometimes been called the "intrinsic fungibility" of Bitcoin.

As well as past systems such as SharedCoin[3] and DarkWallet[4], we have, as of writing in mid-2019, JoinMarket[5], Wasabi Wallet[6] and Samourai's[7] tools Whirlpool and Stowaway. Coinshuffle[8] is an example of an arguably more powerful cryptographic protocol to achieve similar goals, although as of writing is not in production use on Bitcoin.

The most difficult thing about these CoinJoin protocols and systems is that they require coordination between participants, which can be difficult in any case, but is *especially* difficult when we want to ensure a strong level of anonymity for the participants. A logical way to ameliorate this problem is to have a coordinating server to create CoinJoin transactions, but this is a tradeoff which degrades the privacy guarantees of the user (even if very subtly), and can also be fragile to attack (central point of failure). Giving any detailed information to a server over a network connection is problematic, even when those connections are over an anonymising network like Tor.

On the other hand, with no controlling central party, Sybil attacks can be quite effective in damaging the viability of such protocols in various ways: jamming the protocol, snooping at the network level, Sybilling to swamp the participant and break or severely damage the privacy boost he was trying to achieve, or, in cases like JoinMarket, damaging the economic incentive model.

SNICKER tries to occupy a radically different position in the landscape of tradeoffs here. No coordination or synchronisation is needed between users, all that is needed is one side to broadcast encrypted data, and the other side to receive it. The reception could happen over radio or satellite, so that it's basically physically impossible for anyone to know what happened (this is not a suggestion; just illustrating how important the broadcast element is). The SNICKER coinjoin model is very limited, as laid out in this document (2 party only!), but it's considered that for some scenarios, this is a worthwhile tradeoff because all of the coordination issues are no longer applicable.

Conceptual summary

The basic idea is for one party, using information only from the blockchain, to create a partially signed coinjoin transaction, deducing an output for a second party without their involvement, then encrypting that proposal and broadcasting it to the world in some way. Note the proposal is for a 2 party coinjoin (more parties is theoretically possible but much more complex). Such proposals can be constructed based on making inferences from public keys seen on the blockchain - either from address reuse or simply extracted from transaction inputs. Notice that multiple proposals with the same input utxos are not problematic. Whether such proposals will be taken up (i.e. downloaded from some server, decrypted, co-signed and broadcast onto the Bitcoin network) will depend on three things: whether the proposal is valid (i.e. whether keys were correctly identified), whether the counterparty is aware of the proposal, and whether they are willing to do the coinjoin. For the last of these, economic incentives are relevant.

Identification of candidates

For the first of the points just mentioned (validity of the proposal), it'll be important for the first party (the "Proposer") to find candidates with some reasonable (even if low) probability of success. This could be bootstrapped by choosing on-chain keys participating in transactions with certain flags (for example: JoinMarket transactions). Over time, this could be supplanted by searching for transactions which themselves are already SNICKER (hence "bootstrap" - after some time SNICKER txs could spread into a large graph).

We will see in the Specification below, that there are two ways proposed for making this identification - address reuse or inference of co-ownership of pubkeys in transaction inputs. It is worthy of note here that another option without address reuse will exist if and when Taproot[9] is activated on the Bitcoin network, since in that case the scriptPubKey will expose the pubkey directly.

Specification

Here we seek to standardise certain features, to make it feasible for a wallet software developer to create an implementation of SNICKER that is compatible with other wallets.

It is noted however that due to the scanning/searching nature of the behaviour of the Proposer, it may be that wallet developers would find focusing only on the Receiver side of the specification is more practical (or, occasionally, vice versa).

Definition of Terms

  • Proposer - scans the blockchain for relevant information, then creates a partially signed coinjoin transaction and publishes an encrypted version of it on the bulletin board.
  • Receiver - discovers the encrypted version of the partially signed coinjoin transaction on the bulletin board (or otherwise) and decrypts it, co-signs it (if desirable) and broadcasts it onto the Bitcoin network.
  • Bulletin board - Any public read/writable location, ideally publishing to it and reading from it should be anonymous, so a good example is a Tor hidden service.
  • c - the tweak used to generate a new destination for the Receiver (32 byte group element, encoded as for private keys).
  • C - ciphertext
  • ECDH(P, Q) - outputs a 32 byte shared secret between owners of P and Q, defined by the following operations:
    • given a secp256k1 pubkey P with known privkey p and a second secp256k1 pubkey Q:
    • perform scalar multiplication to derive a new pubkey: p * Q
    • serialize this pubkey as for Bitcoin (so, 02 or 03 are prepended to the x-coordinate and the complete serialization is exactly 33 bytes)
    • Calculate the 32 byte **shared secret** S as: S = SHA256(serialized pubkey from previous step)
    • Note that this process is that followed by the libsecp256k1 ECDH module[10]
  • ECIES-E(P, m) - outputs an encryption of message m to secp256k1 public key P, defined by the following operations[11]:
    • construct a new secp256k1 pubkey R from a newly generated privkey r.
    • perform scalar multiplication to derive a new pubkey S: S = r * P
    • serialize both these pubkeys as for Bitcoin (so, 02 or 03 are prepended to the x-coordinate and the complete serialization is exactly 33 bytes). Call these R-ser, S-ser.
    • calculate K = SHA512(S-ser)
    • Set IV equal to the first 16 bytes of K. Set K-AES to the second 16 bytes of K. Set K-MAC to the last 32 bytes of K.
    • Perform AES128 encryption of the message m with IV IV and encryption key K-AES. Use PKCS 7 as the padding method and CBC as the block cipher mode. Let the ciphertext output be C = AES128-CBC(IV, K-AES, m).
    • Calculate a MAC using HMAC-SHA256 with the key K-MAC to find mac = HMAC-SHA256(K-MAC, C).
    • Concatenate to find the output bytestring, first prepending the 4 magic bytes BIE1 (i.e. 0x42494531): o = 4 magic bytes || R-ser || C || mac.
    • Finally, the output can optionally be base64 encoded for transfer. Note that this algorithm, including the specific magic bytes, conforms to that used currently by Electrum[12].
  • ECIES-D(p, o) - outputs the decryption of an ECIES output o, assuming it had been encrypted to the pubkey P for which p (which the decryptor must possess) is the privkey.
    • base64 decode o if necessary.
    • check that the decoded o has first four bytes 0x42494531 and reject otherwise.
    • deserialize bytes 5 to 37 into a secp256k1 pubkey object R; reject if this operation fails.
    • perform scalar multiplication to derive a new pubkey S: S = p * R
    • serialize R and S as for Bitcoin (so, 02 or 03 are prepended to the x-coordinate and the complete serialization is exactly 33 bytes). Call these R-ser, S-ser.
    • calculate K = SHA512(S-ser)
    • Set IV equal to the first 16 bytes of K. Set K-AES to the second 16 bytes of K. Set K-MAC to last 32 bytes of K.
    • split the remaining bytes of o (from byte 38 onwards) into two sections: the last 32 bytes, called mac, and everything before them, called C (the ciphertext).
    • First check that mac is a valid mac on C with the key K-MAC: mac ?= HMAC-SHA256(K-MAC, C). Reject if false.
    • Perform AES128 decryption of the ciphertext C with IV IV and decryption key K-AES. Use PKCS 7 as the padding method and CBC as the block cipher mode. The final plaintext output of decryption, m, is then: m = AES128-CBC-Decrypt(IV, K-AES, C).
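
The ECDH and ECIES definitions above can be illustrated with a minimal pure-Python sketch. It is for exposition only: the curve arithmetic is not constant time and must not be used with real keys, and the helper names (ec_add, ec_mul, ser, ecdh, ecies_keys, split_envelope, mac_ok) are assumptions of this example, not part of the specification. The AES128-CBC step is omitted because the Python standard library has no AES; only the key derivation, the envelope layout and the MAC check are shown. The MAC here covers C only, following the text above; implementers aiming for Electrum compatibility should confirm against the referenced code exactly which bytes the MAC covers.

```python
import hashlib
import hmac

# secp256k1 parameters (public constants).
FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD) % FIELD
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], -1, FIELD) % FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return x, (lam * (a[0] - x) - a[1]) % FIELD


def ec_mul(k, point):
    """Double-and-add scalar multiplication (illustrative, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def ser(point):
    """33-byte compressed encoding: 02/03 prefix plus big-endian x-coordinate."""
    return bytes([2 + (point[1] & 1)]) + point[0].to_bytes(32, "big")


def ecdh(p, Q):
    """ECDH(P, Q) as defined above: SHA256 of the compressed point p * Q."""
    return hashlib.sha256(ser(ec_mul(p, Q))).digest()


def ecies_keys(scalar, point):
    """Split K = SHA512(S-ser), with S = scalar * point, into (IV, K-AES, K-MAC)."""
    K = hashlib.sha512(ser(ec_mul(scalar, point))).digest()
    return K[:16], K[16:32], K[32:]


def split_envelope(o):
    """Split a decoded ECIES output o into (R-ser, C, mac), checking the magic."""
    if o[:4] != b"BIE1" or len(o) < 4 + 33 + 32:
        raise ValueError("not a BIE1 envelope")
    return o[4:37], o[37:-32], o[-32:]


def mac_ok(k_mac, C, mac):
    """Check mac == HMAC-SHA256(K-MAC, C) using a constant-time comparison."""
    return hmac.compare_digest(hmac.new(k_mac, C, hashlib.sha256).digest(), mac)
```

The encryptor calls ecies_keys(r, P) and the decryptor calls ecies_keys(p, R); because r * P and p * R are the same point, both sides obtain the same IV, K-AES and K-MAC.
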
We cover the two types of SNICKER separately, starting with the simpler but perhaps less interesting version - address reuse.

SNICKER Protocol Version 00 - Reused Keys

Proposer actions

The Proposer will need to identify a set of addresses which are reused (specifically, addresses which currently have an unspent output and which have previously been spent from at least once). This could be global or restricted in some way (e.g. recency of usage, specific wallet type etc.), and may be found via direct blockchain scanning.

For each reused address A, he will find the public key P_A from the/a previous spend. He will then take these steps:

  • Find one and only one utxo of his own to spend, whose bitcoin value is greater than or equal to: value of utxo owned by A, plus approximately the transaction fee for a 2-in 3-out transaction.
  • Construct a new tweak c (see "Definition of Terms" above), using specifically this method (the reason for this specific method is made clear in "Storage of Keys" below):
    • First, take the public key of the utxo input chosen in the previous step, call it Q, and its corresponding private key q.
    • Calculate c = ECDH(Q, P_A)
  • Construct a new destination address using pubkey P_A + cG and of the same address type as A.
  • Construct a transaction with inputs: (existing utxo owned by A, his own utxo owned by Q) and outputs: (P_A + cG, his own freshly generated output addresses O1 and O2). The amounts will be such that each party (Proposer, Receiver) receives back approximately what they put into the transaction, but the acceptability of the amount is for the Receiver to decide. O1 MUST receive exactly the same amount as P_A+cG and O2 should contain the change required to balance for the Proposer.
  • He signs the transaction for his own input, so it is partially signed.
  • He serializes the partially signed transaction and the tweak value c according to the serialization specification below. Calling this serialization M, he constructs a **Proposal**: ECIES-E(P_A, M).
  • He publishes this Proposal (base64 encoded, as allowed by the definition of ECIES-E) to the Bulletin board.
These steps can be repeated for every address A. Note: he can if he wishes continue to reuse the same input utxo for each of these proposals, since if the Receivers find that it is already spent, there is no harm done. This would be a kind of "first come, first served".
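
The tweak construction in the steps above can be sketched as follows, again with illustrative, non-constant-time curve arithmetic. The key values and helper names are hypothetical and chosen only for this example. The sketch shows the Proposer deriving c and the destination P_A + cG from q and P_A, and the Receiver arriving at the same c from p_A and Q alone, which is what makes later recovery from blockchain data possible.

```python
import hashlib

# secp256k1 parameters (public constants).
FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def ec_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD) % FIELD
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], -1, FIELD) % FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return x, (lam * (a[0] - x) - a[1]) % FIELD


def ec_mul(k, point):
    """Double-and-add scalar multiplication (illustrative, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def ser(point):
    """33-byte compressed encoding of a point."""
    return bytes([2 + (point[1] & 1)]) + point[0].to_bytes(32, "big")


def tweak(own_priv, other_pub):
    """c = ECDH(own key, other key), read as a big-endian integer."""
    secret = hashlib.sha256(ser(ec_mul(own_priv, other_pub))).digest()
    c = int.from_bytes(secret, "big")
    if not 0 < c < N:
        raise ValueError("tweak is not a valid private key")
    return c


# Hypothetical example keys (never use values like these for real funds).
q = int("22" * 32, 16)    # Proposer's input key
p_A = int("11" * 32, 16)  # Receiver's reused key
Q, P_A = ec_mul(q, G), ec_mul(p_A, G)

# Proposer side: knows q and the public P_A.
c = tweak(q, P_A)
destination = ec_add(P_A, ec_mul(c, G))  # P_A + cG

# Receiver side: knows p_A and reads Q from the proposed transaction.
c_receiver = tweak(p_A, Q)
new_priv = (p_A + c_receiver) % N        # private key of the new output
```

Both sides compute the same c, and new_priv * G equals the destination key, so only the Receiver can spend the new output.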

Format of transaction Proposal

The proposer constructs the message m before encryption as follows:

  1. 7 bytes: magic, binary 'SNICKER' i.e. 0x534e49434b4552
  2. 2 bytes: version, specifically:
    • First byte is a version, currently only 00 and 01 are defined (see below for Version 01). So here this MUST be the byte 0x00.
    • Second byte represents a set of flags. For version 00 and 01 no flags are defined and so this value MUST be 0x00.
  3. 32 bytes: tweak value c calculated as explained above. This MUST be a valid element of the additive group of integers modulo N (where N is secp256k1's N), i.e. it must be a valid raw private key in the Bitcoin sense. The encoding must be a big endian serialization of the 256 bit integer. Note this must be fixed width i.e. always 32 bytes.
  4. Variable size: A partially signed bitcoin transaction according to the specifications of PSBT, as specified in BIP174[13]. Note that this should be the binary serialized variant and not the base64 encoded variant. See next subsection on rules for the content of this PSBT.
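
The message layout above can be sketched as a pair of helpers. The function names are assumptions of this example. The PSBT is treated as an opaque byte string; the only check made on it is the 5-byte magic prefix that BIP174 defines for the binary format, so a real implementation would still need a full PSBT parser.

```python
SNICKER_MAGIC = b"SNICKER"  # 0x534e49434b4552
PSBT_MAGIC = b"psbt\xff"    # binary PSBT prefix from BIP174
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def serialize_proposal(version, tweak, psbt):
    """Build m = magic || version || flags || c || PSBT (before encryption)."""
    if version not in (0, 1):
        raise ValueError("only versions 00 and 01 are defined")
    if not 0 < tweak < N:
        raise ValueError("tweak must be a valid private key")
    if not psbt.startswith(PSBT_MAGIC):
        raise ValueError("expected a binary (not base64) PSBT")
    flags = 0  # no flags are defined for versions 00 and 01
    return SNICKER_MAGIC + bytes([version, flags]) + tweak.to_bytes(32, "big") + psbt


def parse_proposal(m):
    """Split a decrypted message into (version, tweak, psbt bytes)."""
    if len(m) < 7 + 2 + 32 + len(PSBT_MAGIC):
        raise ValueError("message too short")
    if m[:7] != SNICKER_MAGIC:
        raise ValueError("bad SNICKER magic")
    version, flags = m[7], m[8]
    if version not in (0, 1) or flags != 0:
        raise ValueError("unsupported version or flags")
    tweak = int.from_bytes(m[9:41], "big")
    if not 0 < tweak < N:
        raise ValueError("tweak is not a valid private key")
    psbt = m[41:]
    if not psbt.startswith(PSBT_MAGIC):
        raise ValueError("payload is not a binary PSBT")
    return version, tweak, psbt
```
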
Partially signed transaction

The Proposer can be flexible in some aspects of creation of the transaction proposal above, but these conditions must be met for BOTH Version 00 and Version 01. Note that the list may differ substantially for other versions. Explicit choices are always preferred here, since differing implementations could otherwise risk creating watermarks that identify individual Proposers.

  1. At least one output MUST pay to the single public key P + cG, using (for Version 00) the same address type as for P; see subsection 'A note on address types' below.
  2. Transaction version MUST be 02
  3. Transaction locktime MUST be 0, and sequence numbers MUST be 0xffffffff. Opt-in RBF[14] is probably not practical in this type of protocol, so the proposal here requires using the simplest (default) values.
  4. There MUST be AT LEAST TWO outputs with the exact same output value in satoshis. One of those outputs must be the one to the Receiver as specified in item 1 above. Another must be of the same scriptPubKey type.
  5. There MUST be ONLY ONE input that the Receiver currently owns. This must be the/a utxo that is controlled by (address version of P) P.
  6. The Bitcoin transaction fee applied to the proposed transaction is not restricted; the Receiver will decide if it is acceptable or not.
  7. The AT LEAST ONE input NOT owned by the Receiver (see above) MUST be finalized as per the terminology of BIP174, that is to say, it/they must have fields 0x07 finalized scriptSig and/or 0x08 finalized scriptWitness completed and valid.
  8. The inputs and outputs ordering should be randomized, but BIP69[15] should NOT be used.
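
The structural rules in the list above can be sketched as a checker over a simplified transaction model. The TxIn and TxOut types and their field names are hypothetical stand-ins for whatever a wallet's PSBT parser produces. Signature validity, the fee rule and the ordering rule are not covered here.

```python
from dataclasses import dataclass


@dataclass
class TxIn:
    sequence: int
    receiver_owned: bool
    finalized: bool


@dataclass
class TxOut:
    value_sats: int
    script_type: str                # e.g. "p2pkh", "p2wpkh", "p2sh-p2wpkh"
    pays_tweaked_key: bool = False  # True for the output paying to P + cG


def rule_violations(version, locktime, inputs, outputs):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    if version != 2:
        problems.append("transaction version must be 2")
    if locktime != 0:
        problems.append("locktime must be 0")
    if any(i.sequence != 0xFFFFFFFF for i in inputs):
        problems.append("all sequence numbers must be 0xffffffff")
    if sum(i.receiver_owned for i in inputs) != 1:
        problems.append("exactly one input must belong to the Receiver")
    others = [i for i in inputs if not i.receiver_owned]
    if not others or not all(i.finalized for i in others):
        problems.append("every non-Receiver input must exist and be finalized")
    dest = next((o for o in outputs if o.pays_tweaked_key), None)
    if dest is None:
        problems.append("no output pays to P + cG")
    elif not any(o is not dest
                 and o.value_sats == dest.value_sats
                 and o.script_type == dest.script_type
                 for o in outputs):
        problems.append("no second output with equal value and script type")
    return problems


# Example: a 2-in 3-out proposal that satisfies the modelled rules.
inputs = [TxIn(0xFFFFFFFF, True, False), TxIn(0xFFFFFFFF, False, True)]
outputs = [TxOut(100_050, "p2wpkh", True),
           TxOut(100_050, "p2wpkh"),
           TxOut(149_400, "p2wpkh")]
```

Calling rule_violations(2, 0, inputs, outputs) on this example returns an empty list; changing the locktime or removing the equal-valued output yields a corresponding message.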

A note on address types

The spirit of the proposal as defined here for Versions 00 and 01 is that the new output created by the Proposer for the Receiver should be of the same type as the input being consumed, to somewhat improve privacy, but mostly to ensure that the wallet will be able to understand the new utxo created. However this could get quite complicated in case of customised scriptPubKeys created by more advanced wallets. Hence for simplicity we assume that addresses for which utxos are being consumed MUST be of one of the standard single key types, that is:

  • P2PKH (i.e. '1' addresses)
  • P2WPKH (segwit native single key, bech32 addresses)
  • P2SH-P2WPKH ('3' addresses wrapping segwit native p2wpkh)
Handling multisig or custom scripts including e.g. locktimes is currently considered out of scope but could in principle be added in future versions.

Receiver actions

The Receiver will need to keep a record of addresses (and corresponding keys) that he has reused. Let's call these addresses A as above.

The Receiver checks the Bulletin board periodically. He downloads all encrypted blobs that might be relevant (applying filters if the Bulletin board supports this, see section on Bulletin Board below).

For each blob (after being base64 decoded), calling the binary blob o, he should attempt to decrypt according to ECIES-D(p_A, o) where p_A is the private key of address A. As well as performing the checks defined in that operation, he can also check:

  1. Whether the first 7 bytes match the required SNICKER magic bytes: 0x534e49434b4552
  2. Whether the version byte is 0x00 (and the flags byte is also 0x00).
If those checks pass, he should find the value c, and a valid partially signed bitcoin transaction serialization. He then performs the following checks:
  1. Optional check of the tweak derivation. If this check is not performed, the second option in "Storage of Keys" below MUST NOT be assumed to be available (i.e. the wallet MUST import the new keys and MUST inform the user that persistent wallets are required for funds recovery):
    1. Take the public key Q of the finalized input in the PSBT, calculate c' = ECDH(P_A, Q) using the private key p_A, and check that c' matches the value c given in the proposal. If not, reject.
  2. Is one of the destination addresses, the address of the pubkey P_A + cG, with the same address type as A.
  3. Ensure that he has safely stored or imported the private key of the new output, (x + c) mod N where x is the private key of P_A, in the wallet before broadcast of the transaction.
  4. Receiver software must be cognizant of the fact that it is operating on untrusted input. The mere fact of a MAC check passing does not in this case prove anything at all about 'honest' behaviour, since the counterparty is using an entirely ephemeral and untrusted key. In particular it will be vital that the BIP174 parser does not have any vulnerabilities to malicious input.
  5. After storing the passed c-value and on successful parsing of the PSBT into an in-memory transaction proposal, the Receiver software should:
    • Check that the transaction version is 02, locktime is 0 and input sequence numbers are 0xffffffff.
    • Validate the signatures that must exist and be finalized on all but one input. This is to ensure that the Proposer spends their own coins, and not yours.
    • Check Receiver's ownership of the unsigned input (remembering that Proposers may have only guessed this and could be wrong). Obviously reject if un-owned.
    • Reconstruct the destination - from P+cG - P is already available at this point since we needed it to decrypt the proposal. Apply the same address type to recreate the scriptPubKey, as was mentioned above in "A note on address types".
    • Check the spent amount against the received amount. The Receiver is free to make their own judgement about the minimum amount of satoshis he receives as (satoshis received minus satoshis spent), it could be less than zero, or more; he should consider the network fee in his calculations of course. This decision can be considered as something set by the free market.
    • Assess whether the fee provided to the bitcoin network is suitable. If the transaction is highly desired for some reason, CPFP can be used to bump the fee but a reminder that RBF cannot, because the two sides cannot cooperate to sign a new version.
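
The amount and fee checks in the list above come down to simple arithmetic, sketched here with hypothetical numbers. The thresholds, the function names and the 250 vbyte size are assumptions of this example; each Receiver is free to set a different policy.

```python
def network_fee(input_values, output_values):
    """Fee paid to miners: total inputs minus total outputs, in satoshis."""
    return sum(input_values) - sum(output_values)


def receiver_delta(receiver_spent, receiver_received):
    """Satoshis received minus satoshis spent; may be negative."""
    return receiver_received - receiver_spent


def acceptable(delta_sats, fee_sats, vsize_vbytes, min_delta_sats, min_feerate):
    """Local policy: enough gain for the Receiver and a high enough fee rate."""
    return delta_sats >= min_delta_sats and fee_sats >= min_feerate * vsize_vbytes


# Hypothetical proposal: Receiver input 100,000 sats, Proposer input 250,000 sats.
input_values = [100_000, 250_000]
# Outputs: Receiver's P + cG, Proposer's equal-valued O1, Proposer's change O2.
output_values = [100_050, 100_050, 149_400]

fee = network_fee(input_values, output_values)  # 350,000 - 349,500 = 500
delta = receiver_delta(100_000, 100_050)        # +50 for the Receiver
```

With an assumed virtual size of 250 vbytes the fee rate is 2 sat/vB, so a policy of acceptable(delta, fee, 250, 0, 1.0) accepts the proposal while a policy demanding 5 sat/vB rejects it.
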
Note that there isn't really a need to check unspent-ness of inputs, at least strictly: the Receiver is technically in a race with other Receivers to broadcast the Coinjoin, and doesn't have any way to know if any other Receivers are likely to take up the offer. There is no risk of funds loss due to spend conflicts. He may, however, want to avoid this situation, in which case he should at least check that the inputs are unspent at the time of signing and broadcasting.

Assuming all checks pass the Receiver software can co-sign to complete the PSBT, then convert it to Bitcoin network serialization, and broadcast it onto the network, either automatically or based on user input. Implementors are reminded of the point made above, that the private key of the newly created utxo must be stored/imported in advance of broadcast.

Storage of Keys

New outputs created by SNICKER coinjoins are not directly controlled by a wallet's existing HD tree. Due to the ECDH mechanism used to create the tweaks for these keys, the Receiver wallet has two options:

  1. Import and persist - The newly created outputs pay to (address of) P + cG, which have corresponding private keys (x+c), where x is the private key of P. Most simply, the wallet MAY store these new keys separately (leaving the wallet free to ignore the original tweak value c), using an import function, which is already present in many wallets. This does of course leave the user unable to access the funds in the case of recovery from only a BIP32 master secret, so may often be considered insufficient.
  2. Re-derive from blockchain history - Because the tweak values c were derived via ECDH between pubkeys of addresses contained within the SNICKER coinjoin, and if the Receiver verified that this derivation was performed correctly at the time of construction of the coinjoin, it will be possible to find all utxos created via this mechanism using (a) the BIP32 master secret and (b) access to historical blockchain data:
    1. Using the master secret and the wallet's HD path, derive addresses as normal
    2. Use the blockchain to find all transactions that spend from any key in this HD wallet (i.e. all historical transactions, in general). For each spent output owned by us, check if the spending transaction fits the pattern of a SNICKER coinjoin (two equal sized outputs is enough of a check, since false positives are not a problem).
    3. For each of those transactions, check, for each of the two equal sized outputs, whether its destination address can be regenerated by taking c, found using the ECDH method described above, reconstructing the pubkey P_A + cG where P_A is the Receiver-owned (and previously reused) key, and then constructing the scriptPubKey from that.
The second option is clearly desirable, but may present meaningful additional complexity depending on the nature of the wallet. If the former is used it MUST be communicated to the user that funds are at risk if they lose their wallet persistence.

SNICKER Protocol Version 01 - Inferred Ownership

The requirements for Proposer and Receiver in Version 01 are the same as those in Version 00 except where specifically contradicted, or added to, in this section.

Here the general concept is: follow the same steps as above, but use the public key from the scriptSig (or witness for segwit) for single key redemptions, thus not requiring address reuse. This introduces a new consideration: the Proposer must guess or infer co-ownership of the input exposing said pubkey, and outputs.

This last point is the reason that, even though Version 01 is much more attractive than Version 00, since it does not require previous address reuse, it is nevertheless a little more complex for both sides to implement, and also the additional inference means the probability of success of any one proposal may be significantly reduced.

Proposer actions

The proposer will need to identify a set of transactions for which he has a degree of plausibility (here unspecified) that at least one input is associated with a certain (currently) unspent output. For the sake of plausibility that the proposal will be taken up, he may need to filter according to various other criteria - does the transaction appear to be from a certain wallet, is there some other form of advertisement of "ready to do SNICKER" either embedded in the transaction itself or on some other bulletin board, is the transaction sufficiently recent etc. etc.

It is worthy of note that in the "canonical" case of a two output transaction, the Proposer can try both outputs with any one input; one of the two will almost always be correct.

Assuming he finds a set of such txs T, each one will have a corresponding input I (note: not a utxo, because it is already spent) for which he extracts the pubkey P_I from either the scriptSig or the witness, and there will be a corresponding output which is unspent (see previous paragraphs), whose address we denote as A.

He will then take these steps (similar but not identical to previous - clarity is preferred here, so nothing is omitted, even if repeated):

  • Find one or more utxos of his own to spend, whose total bitcoin value is greater than or equal to: value of utxo owned by A, plus approximately the transaction fee for a 2-in 3-out transaction.
  • Construct a new tweak c (see "Definition of Terms" above), using specifically this method (the reason for this specific method was made clear in "Storage of Keys" above):
    • First, take the public key of the first utxo input (i.e. at the lowest index after randomization) chosen by the Proposer in the previous step, call it Q, and its corresponding private key q.
    • Calculate c = ECDH(Q, P_I).
  • Construct a new destination address using pubkey P_I + cG and of the same address type as A.
  • Construct a transaction with inputs: (utxo owned by A, his own utxos (the first of which is owned by Q)) and outputs: (P_I + cG, with address type as for A, his own freshly generated output addresses O1 and O2). The amounts will be such that each party (Proposer, Receiver) receives back approximately what they put into the transaction, but the acceptability of the amount is for the Receiver to decide. O1 MUST receive exactly the same amount as P_I+cG, and must be of the same scriptPubKey(address) type as A. O2 should contain the change required to balance for the Proposer.
  • He signs the transaction for his own inputs, so it is partially signed.
  • He serializes the partially signed transaction and the tweak value c according to the serialization specification in "Format of transaction Proposal" above. Calling this serialization M, he constructs a Proposal: ECIES-E(P_I, M).
  • He publishes this Proposal (base64 encoded, as allowed by the definition of ECIES-E) to the Bulletin board.
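
Before taking the steps above, a Version 01 Proposer has to pair exposed input pubkeys with unspent outputs of the same transaction. A minimal sketch of that enumeration follows. The SpentInput and Output types are hypothetical stand-ins for data a blockchain scanner would provide, and no heuristic for ranking the guesses is attempted.

```python
from dataclasses import dataclass
from itertools import product

# Single-key script types accepted for Versions 00 and 01.
STANDARD_TYPES = {"p2pkh", "p2wpkh", "p2sh-p2wpkh"}


@dataclass(frozen=True)
class SpentInput:
    pubkey_hex: str   # P_I, taken from the scriptSig or witness


@dataclass(frozen=True)
class Output:
    outpoint: str     # "txid:index" of the candidate utxo (address A)
    script_type: str
    unspent: bool


def candidates(inputs, outputs):
    """Pair every exposed pubkey with every unspent single-key output."""
    live = [o for o in outputs
            if o.unspent and o.script_type in STANDARD_TYPES]
    return [(i.pubkey_hex, o.outpoint) for i, o in product(inputs, live)]


# The "canonical" two output case: one guess per output for a given input.
example = candidates(
    [SpentInput("02aa")],
    [Output("txid:0", "p2wpkh", True), Output("txid:1", "p2wpkh", True)],
)
```

Each resulting pair (P_I, utxo) is one proposal to construct; for a two output transaction, at most one of the two guesses can actually belong to the owner of P_I.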

Receiver actions

The Receiver actions are as for Version 00, except:

  1. The Receiver will be keeping track of all used keys/addresses, and not only reused ones, as candidates for proposals.
  2. (Trivial) the Version byte check uses 0x01 not 0x00.
  3. As explained above in the Proposer actions section for Version 01, the value of c is calculated with c = ECDH(Q, P_I). However the same comments as for Version 00 apply here in the Receiver's choice of whether to use recoverable keys.

Implementation of the Bulletin Board

This is not a matter of any protocol-level consensus and so no specification is strictly required beyond the format of the encrypted proposal, which is already given above.

Before continuing it is worthy of note that such a Bulletin Board is not strictly necessary and SNICKER as defined in this BIP can still be carried out without any server, just through ad-hoc connections; as long as the transaction proposal format is standardised, wallets could still create such coinjoins.

Here we will only note some possible, natural implementation approaches, and issues with them:

General point: anonymity

We earlier mentioned the possibility of using a Tor Hidden Service for this purpose, which seems practical, but in any case network level anonymity for the Proposers who upload and the Receivers who download is more or less essential, as otherwise monitoring (especially by the Bulletin Board) may link the participants in CoinJoins to other metadata, other Bitcoin transactions etc.

Now we'll discuss the possible mechanics of the Bulletin Board's storage of proposals:

"Flat" storage

The simplest possible approach is for the Bulletin board to accept any and all encrypted blobs and store them in a flat list. This would require Receivers to download ALL proposals and check each of them against each candidate key P. This has a large advantage, namely that the Receiver reveals nothing through their network traffic. But this approach obviously could suffer from both scalability issues (Receiver computation time, bandwidth) and (related) spam attacks since there is no way to restrict fake proposals.

Indexed to keys

A step up from this (and likely, necessary) is to publish the intended P value in plaintext to the bulletin board, along with the encrypted transaction proposal. This could be considered a privacy leak; however, when we consider that SNICKER of Version 00 or 01 types will be identifiable as such on the blockchain, the existence of a public record of a Proposer choosing such a key is probably not an issue. Also, multiple proposals to keys will not be distinguishable due to the encryption, and moreover if the proposal is not taken up, nothing is leaked other than the intent of an unknown party to do a coinjoin with a key that they do not own. With indexing of this type, the Receiver's action is hugely easier, since they will only need to download proposals directly relevant to them, reducing their bandwidth and computation requirements. However, they still need to keep track of all P candidates in advance, of course. Also notably, they could obfuscate their actions somewhat by downloading proposals for a selection of keys, and not just their own.
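
The indexed approach, including the idea of also downloading proposals for other keys, can be sketched with an in-memory stand-in for the Bulletin board. The IndexedBoard class and fetch_with_decoys function are hypothetical; a real board would be a network service, ideally reached over an anonymising network.

```python
import random
from collections import defaultdict


class IndexedBoard:
    """In-memory stand-in for a Bulletin board indexed by pubkey (hex)."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def post(self, key_hex, blob):
        self._by_key[key_hex].append(blob)

    def keys(self):
        return list(self._by_key)

    def fetch(self, key_hex):
        return list(self._by_key.get(key_hex, []))


def fetch_with_decoys(board, own_keys, n_decoys, rng=random):
    """Request own keys plus random decoy keys; keep only own proposals."""
    others = [k for k in board.keys() if k not in own_keys]
    decoys = rng.sample(others, min(n_decoys, len(others)))
    requested = list(own_keys) + decoys
    rng.shuffle(requested)  # do not reveal which requests are the real ones
    downloaded = {k: board.fetch(k) for k in requested}
    mine = {k: v for k, v in downloaded.items() if k in own_keys}
    return mine, requested


board = IndexedBoard()
board.post("02aa", b"proposal-1")
board.post("02aa", b"proposal-2")
board.post("03bb", b"proposal-3")
board.post("03cc", b"proposal-4")
mine, requested = fetch_with_decoys(board, {"02aa"}, n_decoys=1)
```

The board operator sees requests for two keys and cannot tell from the request alone which one the Receiver owns; the Receiver still only attempts decryption on blobs indexed to its own keys.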

Anti-spam prevention

Even with indexing, the problem of near-infinite fake proposals to clog the bulletin board "channel" still exists. This can be prevented with either (a) hashcash[16] (grind the ephemeral key for ECIES for example), (b) fidelity bonds similar to the idea proposed by Chris Belcher for Joinmarket[17], (c) direct payments to a server for the right to post proposals or (d) proposals only allowed to be made by the Bulletin Board owner or some fixed group. All of these ideas may be possible; but not using any of them would almost certainly lead to a death-by-spam in any reasonably popular public system.
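
Option (a) can be illustrated with a generic hashcash-style sketch. As a simplification, it grinds a separate 8-byte nonce appended to the proposal rather than the ECIES ephemeral key mentioned above; the function names, the nonce format and the difficulty value are all assumptions of this example and not part of any specification.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Number of leading zero bits in a byte string."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def grind(proposal, difficulty_bits):
    """Find an 8-byte stamp so SHA256(proposal || stamp) has enough zero bits."""
    for nonce in count():
        stamp = nonce.to_bytes(8, "big")
        digest = hashlib.sha256(proposal + stamp).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return stamp


def verify(proposal, stamp, difficulty_bits):
    """Cheap check, done by the Bulletin board before storing a proposal."""
    digest = hashlib.sha256(proposal + stamp).digest()
    return len(stamp) == 8 and leading_zero_bits(digest) >= difficulty_bits


# Expected work doubles with each extra bit; 12 bits is about 4096 hashes.
stamp = grind(b"example encrypted proposal", 12)
```

The asymmetry is the point: the Proposer performs thousands of hashes per proposal, while the board verifies each with a single hash.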

Future improvements

The Versions 00 and 01 as explained above were intended to be (a) as simple as possible and (b) adhering to existing standards wherever possible. Much more variability and customization is possible and could be implemented in higher version numbers and using flags for additional features. Some ideas:

  • More efficient variants - if bandwidth is a concern we could reduce the size of proposals, principally by not using the PSBT full format but rather a custom tx and signature transfer format, trading off flexibility for small size. Both SNICKER magic bytes and even tweaks could be removed (tweaks could be inferred from ECDH shared secrets).
  • SNICKER as payments - if a payer was willing to wait a little longer than normal, a new variant of SNICKER may have a particularly good incentive alignment - the payer-Proposer could incentivise the Receiver to enable the payment of a fixed amount of the Proposer's choosing to a destination by adding a little to the Receiver's outputs. This would not fit in Versions 00 and 01 as current since it would require two outputs for the Receiver.
  • Use other features of Bitcoin like OP_RETURN or sign-to-contract or possibly "stealth addresses" to allow flagging of individual transactions or utxos as SNICKER candidates (improving discovery) (it is worth noting that pretty much any 'watermark' in existing transactions, like CoinJoins of various types, as well as address reuse, could be used today as such a discovery feature).
  • Allow SNICKER on more complex/custom scriptPubKey types than those listed above.
  • Allow more complex transaction structures.
  • More than 2 party joins based on multiple proposal transfers - a significantly more complex protocol.

Backwards Compatibility

SNICKER has no backwards compatibility concerns with respect to Bitcoin and is a purely optional client-side wallet feature.

Test Vectors

ECDH TVs

TODO

ECIES TVs

TODO

Transaction proposal TVs

TODO

Credits

Thanks in particular to @fivepiece (github) aka arubi (IRC freenode) who came up with many ideas here including specifically Version 01, and many others who offered thoughts on this concept.

References

  1. ^ https://joinmarket.me/blog/blog/snicker/
  2. ^ https://bitcointalk.org/index.php?topic=279249.0
  3. ^ https://en.bitcoin.it/wiki/Shared_coin - warning, link outdated, this system is defunct
  4. ^ https://github.com/darkwallet/darkwallet - as for previous, this is now defunct
  5. ^ https://github.com/Joinmarket-Org/joinmarket-clientserver
  6. ^ https://github.com/zkSNACKs/WalletWasabi
  7. ^ https://github.com/Samourai-Wallet
  8. ^ https://bitcointalk.org/index.php?topic=1497271 for announcement and https://www.ndss-symposium.org/ndss2017/ndss-2017-programme/p2p-mixing-and-unlinkable-bitcoin-transactions/ for latest version
  9. ^ https://github.com/sipa/bips/blob/bip-schnorr/bip-taproot.mediawiki#Constructing_and_spending_Taproot_outputs
  10. ^ ECDH implementation in libsecp256k1: https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/modules/ecdh/main_impl.h
  11. ^ https://en.wikipedia.org/wiki/Integrated_Encryption_Scheme
  12. ^ https://github.com/spesmilo/electrum/blob/fd5b1acdc896fbe86d585b2e721edde7a357afc3/electrum/ecc.py#L277-L292
  13. ^ https://github.com/bitcoin/bips/blob/master/bip-0174.mediawiki
  14. ^ https://github.com/bitcoin/bips/blob/master/bip-0125.mediawiki
  15. ^ https://github.com/bitcoin/bips/blob/master/bip-0069.mediawiki
  16. ^ http://hashcash.org/
  17. ^ https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-July/017169.html

@fivepiece

Future improvements :: Flagging SNICKER candidates - it might be possible to flag TXs using the nlocktime field, if otherwise unused, with some value like SNK#, where # is a bit field used for the SNICKER version and/or application flags for potential Proposers.

The same applies for the nsequence field.
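
A minimal sketch of how such a flag could be packed into the 4-byte field. The layout (ASCII "SNK" followed by one flag byte, read as a big-endian integer) and the function names are assumptions for illustration only; nothing in the draft fixes them.

```python
from typing import Optional

SNICKER_TAG = b"SNK"  # assumed 3-byte marker


def encode_flag(flags: int) -> int:
    """Pack the tag plus a one-byte bit field into a 32-bit field value."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one byte")
    return int.from_bytes(SNICKER_TAG + bytes([flags]), "big")


def decode_flag(field_value: int) -> Optional[int]:
    """Return the bit field if the value carries the tag, else None."""
    if not 0 <= field_value <= 0xFFFFFFFF:
        return None
    raw = field_value.to_bytes(4, "big")
    if raw[:3] != SNICKER_TAG:
        return None
    return raw[3]


if __name__ == "__main__":
    value = encode_flag(0b00000001)
    print(hex(value), decode_flag(value))
```

With this particular layout every encoded value is above 500,000,000, so as an nlocktime it would be read as a Unix timestamp (one that is already in the past) rather than a block height. Using nsequence instead would also interact with the BIP 125 and relative-locktime meanings of that field, which this sketch does not address.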

@elichai

elichai commented Sep 2, 2019

Without diving into the actual details of this BIP, a couple of questions:
Why this weird ECIES scheme?

  1. Why CBC? Why?? There's so much research on this. Please don't use CBC for anything new.
    Use AES-GCM, preferably 256-bit.
    Using AES-GCM also replaces the HMAC, which means you don't need the weird padding; it has good hardware support and is highly recommended right now.

@AdamISZ
Author

AdamISZ commented Sep 3, 2019

@elichai

I anticipated this question (and not only because you already asked it on IRC :) )

Why this weird ECIES scheme?

Well, it's linked to in the draft, but, it's because it's the only currently extant encryption protocol with bitcoin keys in the Bitcoin ecosystem - Electrum uses exactly this protocol. Now, on the one hand, that's not the same as it being used in Core, on the other, Electrum is a pretty big and established part of the ecosystem.
A lot of my reasoning here and in the rest of the document is based on me trying to avoid inventing anything new.

That argument isn't very strong though, I agree there.

Why CBC? Why?? There's so much research on this. Please don't use CBC for anything new.
Use AES-GCM, preferably 256-bit.

I'm quite familiar with this. I'd point out that the severe failings of CBC combined with something like PKCS7, in the context of TLS aren't applicable here because we're not in a client server context. The classic decryption oracle (and in particular padding oracle) attacks like that of Vaudenay and the later variants don't apply because there is no server oracle here to query.
Does that mean I think it's a good idea to use it when we have a modern authenticated encryption scheme which bypasses any such concerns anyway? Not really. I basically agree with you.
But: AES-CBC is being used almost everywhere for wallet encryption. I see it in Bitcoin Core, in Electrum, and I'm pretty sure I've seen it in several other codebases.
While this is not encryption of data at rest, it's also not encryption of data in a client-server context.

All of these reasons aside, I still agree with you though; there's not enough of a good reason to avoid using a modern standard rather than an old one which is a bit broken, even if not obviously broken in this context.

If you want to add more thoughts on it, like concretely an ECIES construction with AES-GCM (or where to go to find one), it'd be appreciated.

@elichai

elichai commented Sep 3, 2019

As you said I'm not convinced by that argument.

As for ECIES with AES-GCM? Take the shared secret from ECDH, use that as a key, and get a random IV. That's it.
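
A minimal sketch of the key-agreement half of that suggestion, in pure Python (3.8+ for modular inverse via `pow`). It shows that a Proposer holding an ephemeral private key and a Receiver holding the private key of the reused pubkey derive the same 32-byte key. Hashing the compressed shared point with SHA-256 is an assumption chosen for the example, and the arithmetic is not constant-time, so this is for illustration and not for production use. The AES-GCM step itself is left to a vetted library.

```python
import hashlib

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def point_mul(k, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def compressed(point):
    """Serialize a point in 33-byte compressed form."""
    x, y = point
    return (b"\x03" if y & 1 else b"\x02") + x.to_bytes(32, "big")


def ecdh_key(my_privkey, their_pubkey):
    """Derive a 32-byte symmetric key from the ECDH shared point."""
    shared = point_mul(my_privkey, their_pubkey)
    return hashlib.sha256(compressed(shared)).digest()


if __name__ == "__main__":
    receiver_priv, ephemeral_priv = 0xA1B2C3, 0x0D0E0F  # demo values only
    receiver_pub = point_mul(receiver_priv, G)
    ephemeral_pub = point_mul(ephemeral_priv, G)
    assert ecdh_key(ephemeral_priv, receiver_pub) == ecdh_key(receiver_priv, ephemeral_pub)
    print(ecdh_key(ephemeral_priv, receiver_pub).hex())
```

The derived key would then be used with an AEAD such as AES-256-GCM and a fresh nonce. Since GCM fails badly if a nonce repeats under the same key, generating a new ephemeral key for every proposal keeps each key single-use.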

@LaurentMT

Interesting idea.

Here are a few shower thoughts and a slightly updated version of the process that may help to address the identified challenges (finding candidate UTXOs, mitigation of spam).

My starting point is the observation that SNICKER transactions have a distinctive fingerprint and that in most cases it should be easy to identify which input is controlled by the Proposer or by the Receiver. This isn't a big issue since the main value proposition of SNICKER is to break the deterministic link between the inputs and the mixed outputs but maybe we can leverage this characteristic to get some additional benefits for the protocol.

The main "principles" for this updated version of the protocol would be:

  • the message board is used by the Receivers to publicly announce that they are willing to provide a given UTXO,
  • the message board is used by the Proposers to transfer their proposals (as described in this BIP),
  • pubkeys associated to UTXOs announced by the Receivers are public information,
  • pubkeys associated to UTXOs proposed by the Proposers are public information (or at least disclosed to the message board),
  • interactions of Receivers and Proposers with the message board are "authenticated" (i.e. requests sent to the message board contain a message signed by the Receiver/Proposer and proving knowledge of the privkey associated to the announced/proposed pubkey)

The updated protocol would be something like this:

1/ Receiver sends a request to the message board to announce a new UTXO U1 (with associated (privkey1, pubkey1)).

  • this request contains the signature of a message that is committing to U1 (might be something like the signature of H(txid_U1||vout_U1)).
  • this signature should prove that Receiver knows privkey1 (message directly signed with privkey1, message signed with a privkey derived from a shared secret after an ECDH dance between Receiver and the message board, etc)
  • on its side the message board enforces a few rules allowing to mitigate spam attacks from rogue receivers:
    • requests with an invalid signature (signature that doesn't prove the knowledge of the privkey associated to U1) are rejected,
    • the message board refuses to register more than N UTXOs created by a same transaction,
    • the message board deletes registered UTXOs that have been spent by a confirmed transaction.
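
The three rules above can be sketched as a small in-memory registry. This is only an illustration: the class and method names are hypothetical, the commitment is taken to be SHA-256 over txid plus a 4-byte little-endian vout (one possible reading of H(txid_U1||vout_U1)), and signature checking is passed in as a callable because the comment leaves the scheme open. The sketch also does not check that the pubkey actually controls the announced UTXO, which a real board would need to do against the chain.

```python
import hashlib
from typing import Callable, Dict, Set, Tuple

Outpoint = Tuple[bytes, int]  # (txid, vout)
Verifier = Callable[[bytes, bytes, bytes], bool]  # (pubkey, message, signature)


def commitment(txid: bytes, vout: int) -> bytes:
    """Message the Receiver signs: H(txid || vout), layout assumed."""
    return hashlib.sha256(txid + vout.to_bytes(4, "little")).digest()


class AnnouncementBoard:
    def __init__(self, verify: Verifier, max_per_tx: int) -> None:
        self.verify = verify
        self.max_per_tx = max_per_tx
        self.utxos: Dict[Outpoint, bytes] = {}

    def announce(self, txid: bytes, vout: int, pubkey: bytes, signature: bytes) -> bool:
        """Register a UTXO if it passes the anti-spam rules."""
        # Rule 1: reject requests whose signature does not verify.
        if not self.verify(pubkey, commitment(txid, vout), signature):
            return False
        if (txid, vout) in self.utxos:
            return True  # already registered, nothing to do
        # Rule 2: at most N registered UTXOs per creating transaction.
        same_tx = sum(1 for (t, _) in self.utxos if t == txid)
        if same_tx >= self.max_per_tx:
            return False
        self.utxos[(txid, vout)] = pubkey
        return True

    def prune_spent(self, spent: Set[Outpoint]) -> None:
        """Rule 3: drop UTXOs spent by confirmed transactions."""
        for outpoint in spent:
            self.utxos.pop(outpoint, None)
```

A board operator would call `prune_spent` with the outpoints consumed in each newly confirmed block.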

2/ The Proposer sends a request (unauthenticated) to the message board to retrieve a list of the announced UTXOs matching some criteria (range of amounts, UTXOs announced after a given date, etc.). Then, the Proposer selects an announced UTXO (U1) and one of her UTXOs (U2) and builds her proposal according to the description given in the BIP.

3/ The Proposer sends a request to the message board to register the proposal.

  • this request contains the encrypted proposal but it also contains the pubkeys associated to U1 and U2 and the signature of a message committing to U1 and U2 and proving the knowledge of the privkey associated to U2.
  • on its side the message board enforces a few rules allowing to mitigate spam attacks from rogue proposers:
    • requests with an invalid signature (signature that doesn't prove the knowledge of the privkey associated to U2) are rejected,
    • at any time T, the message board refuses to register more than N proposals for a given U2,
  • the message board deletes the proposals registered for UTXOs that have been spent by a confirmed transaction.
  • note: nothing prevents the Proposer from using different UTXOs in the public part of the request and in the SNICKER transaction. IMHO, this isn't a good strategy in terms of privacy (it leaks more info than needed to the message board) but the most important point is that it doesn't prevent the message board from enforcing rules mitigating spam attacks.

The main benefits of this updated version of the protocol are:

  • it should help to mitigate spam attacks by forcing an attacker to pay miner fees to generate a large number of UTXOs that will be used to spam the message board,
  • it should greatly simplify the identification of candidate UTXOs since they're explicitly announced by the Receiver,
  • the public nature of the registration of candidate UTXOs increases the plausible deniability provided to the Receiver and it allows playing a game similar to what is done by Samourai Wallet with STONEWALL and STONEWALLX2 txs. I mean, even if a user doesn't intend to participate in SNICKER transactions with other users, it's still beneficial for this user to announce a UTXO on the message board and to later build a SNICKER tx with herself.

A few additional (and unrelated) random thoughts:

  • maybe I missed this point in the BIP but I think it would be better to add a constraint stating that the inputs and the 2 mixed outputs of the transaction should have the same address type.
  • as I see it, the main benefit of SNICKER is that it potentially increases the plausible deniability provided to Receivers by publishing their announcements to the "whole world" (as in "unknown people"). On the other hand, it may also be its main "weakness" since a Proposer can't know if the selected Receiver is an entity running a Sybil Attack. Samourai Wallet has made a different choice for STONEWALL/STONEWALLx2 by relying on a WoT approach (Proposer selects the Receiver in its WoT of Paynyms) that mitigates the risk of Sybil attacks but slightly decreases the plausible deniability offered to the Receiver. In conclusion, I don't think that any of these approaches is "perfect" but being able to use both of them in a series of Coinjoin transactions might be a good combo :)

@lontivero

The beauty of SNICKER is that it is a non-interactive method to coinjoin where the candidates (receivers) can be discovered just by querying the blockchain. The updated protocol solves many problems but it is not SNICKER anymore, it is a completely different protocol.

@AdamISZ
Author

AdamISZ commented Nov 17, 2019

@LaurentMT @lontivero thanks for comments, I will answer shortly.

@LaurentMT

The beauty of SNICKER is that it is a non-interactive method to coinjoin where the candidates (receivers) can be discovered just by querying the blockchain. The updated protocol solves many problems but it is not SNICKER anymore, it is a completely different protocol.

IMHO, SNICKER is already an interactive protocol but I agree that the updated version provides a different model (conserving the asynchronous aspect of SNICKER).

@AdamISZ
Author

AdamISZ commented Nov 18, 2019

IMHO, SNICKER is already an interactive protocol but I agree that the updated version provides a different model (conserving the asynchronous aspect of SNICKER).

I've heard this before, and I'm intrigued what you mean. Consider the extreme example of Proposer broadcasting on EM spectrum and Receiver just using an antenna or some such ... in what sense is this interactive? It is certainly active in that Proposer must do something, and the Receiver must also do something. This is a small but significant difference to standard Bitcoin payments, in which, while the initiator must perform an action, the Receiver need not (except it must somehow be known what their address is).
So while there are differences in those two cases, it seems obvious to me that they are both in a radically different model to what I think of as interactive - where Alice and Bob must set up a connection either directly or indirectly in order to send messages in both directions. I think non-interactive describes that perfectly.
Consider the application of the Fiat Shamir transform to an identity protocol, creating a signature protocol - you remove the need for the Verifier to send a message to the Prover. The Verifier still needs to do something (verify!) but he no longer needs to send a message. And so it's considered non-interactive.

It's true that "asynchronous" is an extremely important descriptive term for what the purpose of SNICKER is, but I think non-interactive is, too, and I don't understand why people think that's wrong, so I'd be interested to hear an explanation.

@LaurentMT

LaurentMT commented Nov 19, 2019

Indeed, when I say that I consider SNICKER as an interactive protocol I mean that the outcome (the valid SNICKER transaction) can't be reached without an action from both participants. IMHO, this point is important because it has consequences on the plausible deniability offered by the protocol.

For instance, I would say that the selection of the UTXOs (spent or decoy) used in a ring signature of a Monero transaction is non-interactive and that provides plausible deniability to all the "participants". The guarantees provided by SNICKER are clearly different from that.

Considering that SNICKER is a privacy-oriented proposal and considering that 2-party coinjoins are mostly about deniability, I think it's better to apply the most constraining definition of the term. Hence my use of the term interactive.

@AdamISZ
Author

AdamISZ commented Nov 19, 2019

OK, thanks for letting me know how you're thinking about it.

To me, interactive always means that the 2 or more parties involved in a protocol have to send data in both directions. E.g. Diffie-Hellman is an interactive protocol when used to set up symmetric encryption in both directions, but can be used non-interactively, for example in ECIES.

A huge part of what makes coinjoin difficult is the necessity for parties to cooperate by sending each other messages.
There is a necessity for a flow in both directions. The purpose of this proposal is to remove precisely that scenario, and I think "non-interactive" conveys that correctly to the person not reading every detail.

Another way to say it; these are the properties that I think are important:

  • the proposal can be constructed by the proposer without the receiver, and the receiver needs no negotiation with proposer (complete decoupling, "non-interactivity")
  • the proposal can be broadcast without privacy failure

Re: your detailed ideas, I will try to read them tomorrow, but I do note that I've already had some other discussions with people about antispam, I wondered if this might help (curious coincidence, also about LSAG!): https://gist.github.com/AdamISZ/52aa2e4e48240dfbadebb316507d0749

@AdamISZ
Author

AdamISZ commented Nov 19, 2019

@LaurentMT

My starting point is the observation that SNICKER transactions have a distinctive fingerprint and that in most cases it should be easy to identify which input is controlled by the Proposer or by the Receiver. This isn't a big issue since the main value proposition of SNICKER is to break the deterministic link between the inputs and the mixed outputs but maybe we can leverage this characteristic to get some additional benefits for the protocol.

For sure, agreed, there is no pretence here to have a steganographic property, as there is with PayJoin. I imagine you could make hybrids of the two ideas, but that's a topic for another day.

<snipped, updated protocol description>

So, if I could summarise, this is the idea that: receivers advertise to server, and proposers and receivers both authenticate their utxos to get permission, to avoid spam. But proposal format is kept the same, to keep privacy of outputs.

Several comments about that:

First, on using utxos and signatures on them as a rate limiting feature, see the LSAG idea I linked at the end of the previous comment ^ . I think it might be possible to do that while preserving privacy.

Second, I think we lose quite a bit by requiring receivers to upload data at all, and utxos particularly. Connections between utxos could leak like that; unless you both used tor to upload and had strong trust in its effectiveness, and properly addressed timing metadata, then uploading more than 1 utxo could cause a nasty privacy failure (notice: that the Proposer might upload the pubkey for a proposal does not have any of these concerns!). This would also be additional work to be implemented into the wallet. The original proposal is nothing more than a download, albeit you might need to get clever there, if you're not able to download all proposals. Also there may be scenarios where upload is not even possible (the passive receiver model); while they are unlikely today, I don't want to cut them out because they are super powerful.
Some but not all of these comments apply to Proposers uploading their utxos too.

Third, I've always thought that the difference, although small, between accepting a coinjoin proposal and actively asking for a coinjoin before accepting a proposal, is important. At the very least, if you advertise willingness but don't do a join, then you've given up data you wouldn't have otherwise.

Fourth, if one did go down this road, which I agree is certainly possible, then I can see why you think it still has some value: even though both parties do sort of "register" with the server, they don't have to coordinate with each other directly, and don't need synchrony, and still retain the privacy of the ownership of the outputs.

In summary I think my feeling is:
I personally don't prefer this model but it seems possible; but we should focus on to whatever extent this changes the Proposal format, as that is the only thing the BIP is trying to standardise. I think the Proposal format can be the same; this would be a ruleset defined by a bulletin board, only.

the public nature of the registration of candidate UTXOs increases the plausible deniability provided to the Receiver and it allows playing a game similar to what is done by Samourai Wallet with STONEWALL and STONEWALLX2 txs. I mean, even if a user doesn't intend to participate in SNICKER transactions with other users, it's still beneficial for this user to announce a UTXO on the message board and to later build a SNICKER tx with herself.

This is certainly interesting but I'm not so much a fan of the fake transactions idea, at least at scale, because it's costly. However I can certainly see the argument. It's almost like a complete flipping on its head of the "receivers don't want to advertise because it weakly identifies", by having an "everyone identifies" approach. The issue I have here is that you are thinking about a specific ecosystem that you have some control over (Samourai wallet users), whereas I am thinking more of the entire Bitcoin wallet and user ecosystem, where I do not expect everyone will be signing up to a SNICKER type function. If it was even like 25% I think that would be marvellous. 90% isn't very realistic.

maybe I missed this point in the BIP but I think it would be better to add a constraint stating that the inputs and the 2 mixed outputs of the transaction should have the same address type.

In https://gist.github.com/AdamISZ/2c13fb5819bd469ca318156e2cf25d79#a-note-on-address-types the first sentence does state that "The spirit of the proposal as defined here for Versions 00 and 01 is that the new output created by the Proposer for the Receiver should be of the same type as the input being consumed, to somewhat improve privacy, but mostly to ensure that the wallet will be able to understand the new utxo created." But this is not good enough, I agree it has to clearly state that the Proposer and Receiver outputs must be the same type. I'll change that somehow.

On the other hand, it may also be its main "weakness" since a Proposer can't know if the selected Receiver is an entity running a Sybil Attack.

It kinda depends but I half agree: since the Receiver does not actually do anything in the purist version of SNICKER (they just have some transaction on the blockchain) they can't "attack"; but in the non-purist, practical version, Proposers are going to narrow in on transactions that are "flagged" as SNICKER-plausible, most of the time, so there is wiggle room for an attacker to try to do this. But I mean, outside of a "walled garden" where everybody is trying to get into one anon set, I think it's really amorphous and difficult to see the value of the attack. And Proposers have a lot of choice in what they propose against.

In the most general sense every conceivable version of these active privacy measures is susceptible to the same thing: N-1 participants are a Sybil. SNICKER is a weak protocol in anon set terms, for single coinjoins. If your defence against that is identification of your counterparties, I think that's the wrong direction to go ... I mean, in the limit, if you just did privacy with friends you could literally just swap privkeys, no fancy CoinSwap protocol necessary :)

I'd envisage a "low level hum" of coinjoins (some might even be payments although that's kind of ..v2 let's say) each of which has little value but overall being very damaging to spies. Trying to Sybil that globally? Meh, I doubt it.

@AdamISZ
Author

AdamISZ commented Nov 19, 2019

Hmm. I've realised I'm being kind of dumb saying "only the Proposal format matters". If the purpose of the BIP is to allow wallet developers to implement, then they certainly need to know whether their wallet will need to upload data or not!

I guess I'm with @lontivero in saying I think it's a rather radically different proposal that you make @LaurentMT but it's certainly got its merits. I prefer the simpler one with absolutely minimal impact on receivers.

@LaurentMT

I definitely have to read the doc about the LSAG idea. :)

From your answer, I guess that we're diverging on a few points (like the importance of sybil-resistance mechanisms for these kinds of 2-party schemes). But before I begin to write about the reasons I'm diverging on this, I would prefer to be sure that I'm not mistaken about the priorities that have informed the design of SNICKER.

My understanding is that the emphasis has been put on:

  • the minimization of interactions between participants (decreases the number of potential failures, etc),
  • the decentralization of the protocol,
  • a lower barrier to entry for potentially participating wallets (minimal requirements for the implementation in wallets used by Receivers),
  • a lower barrier to entry for potentially participating Receivers.

Am I right about these choices?

Note: Feel free to tell me if you prefer that we move this discussion to another channel to avoid polluting the comments of this gist. :)

@AdamISZ
Author

AdamISZ commented Nov 21, 2019

I agree with the items on that list (except 3 and 4 are kinda the same?), but the list misses the privacy preserving principle. I don't want the users to have to reveal any information to any centralised bulletin board, ideally. Short of that, I want to reduce what they reveal to a bare minimum. So that we're using something close to an untrusted message channel.

May as well discuss it here, I think.

@AdamISZ
Author

AdamISZ commented Nov 22, 2019

Made a small update as per @LaurentMT 's note about same scriptpubkey type for equal-outs.

@LaurentMT

I apologize for the late answer and for the upcoming long wall of text written in crappy english ;D

I agree with the items on that list (except 3 and 4 are kinda the same?)

Indeed, they're related. They both try to increase adoption. I would say that 3/ is the first step aiming to increase the number of wallets implementing SNICKER and that 4/ is the second step aiming to increase the adoption of SNICKER by the users of these wallets.

Ok. First allow me to state that I definitely consider all these objectives as good ones. My main "issue" with the current form of this BIP is the part that isn't written in the BIP but that has a strong impact on the guarantees provided by the system, i.e.:

  • A/ the challenges of implementing an anti-sybil mechanism protecting the Proposers during their selection of Receivers,
  • B/ the impact of generic coins selection algorithms on the deniability provided by previous SNICKER transactions (mostly related to 3).

A/ I understand that we're diverging on the importance of implementing an anti-sybil mechanism for a protocol like SNICKER. IMHO, there are many reasons to believe that some actors may want to sybil it (globally or partially). One reason is the same reason pushing some actors to sybil the bitcoin P2P network or Electrum servers in significant proportions. Another reason may be individual actors (whales?) willing to extract additional benefits from their dormant stacks by selling information resulting from the Sybil attacks. The information gathered by these individuals may be useless but may become very useful when aggregated by a single actor.

To counter such sybil attacks, the algorithms used by Receivers' wallets are key. A "naive" analysis based on candidate transactions considered in isolation will fail. The analysis of candidate transactions in their context (transactions graph) will greatly increase the cost of running the algorithms. It also makes it more difficult for honest Receivers to share information about rogue Proposers.

Additionally, every detail of these algorithms will be important. For instance, I would say that rejecting UTXOs which are "too old" should be mandatory.

B/ I agree with you that decreasing the cost of implementing privacy enhancing solutions in generic wallets (i.e. wallets that aren't privacy-focused) is a good goal to have but I don't think that these wallets can bypass some important requirements.

For instance, whatever the divergences that the teams may have about specific points, I think it's fair to say that privacy-focused wallets like JoinMarket, Wasabi and Samourai share important characteristics, the main one being coins selection algorithms prioritizing privacy (of past, present and future transactions). IMHO, a generic wallet implementing a protocol like SNICKER, without a prior work on its coins selection algorithm, will do more harm than good.

Here's a specific example: The BIP suggests that selection of UTXOs associated to reused addresses may be a solution to kick start the system while allowing the Receivers to repair some past mistakes. The idea is appealing in theory but is problematic in practice. The main sources of reused addresses are services/exchanges. Let's imagine that exchanges decide to implement SNICKER. That would provide a lot of liquidity to the system. The issue is that coins selection algorithms used by exchanges will always be driven by business and operational considerations first (fast service of clients' withdrawals, cleaning of huge wallets, etc). Thus it's more than likely that these coins selection algorithms will greatly damage the deniability provided to Proposers by past SNICKER transactions made with the exchanges.

These considerations aside, I would also suggest removing the idea of selecting UTXOs associated to reused addresses from this BIP, or imposing additional rules for building SNICKER proposals.

Considering that exchanges are the main sources of reused addresses, it's more than likely that such an algorithm would pick UTXOs controlled by exchanges. Even if exchanges don't implement SNICKER, they may very well monitor the proposals sent by Proposers and forward them to analytics companies. Considering that some analytics companies have already successfully convinced some exchanges to share an "anonymized" identification of their clients, I don't think that it would be very difficult to do the same for SNICKER proposals (for the sake of saving the children, etc).

An alternative mitigation addressing this risk of leaks to Receivers may be a rule imposing that a Proposer doesn't use the same 01 and 02 addresses for more than one proposal. But this solution may become problematic for wallet rescans if the ratio of non-responding Receivers is too high.

In conclusion, I'm sympathetic to the objectives driving the design of SNICKER but, in my opinion, some key elements are missing in this BIP if we want to go with the ambitious goal of a practical decentralized solution becoming a standard for the whole ecosystem. I know that it doesn't sound very positive (certainly because I don't believe that some of the challenges can be easily addressed on a short timeframe) but I truly hope that this feedback (resulting from observations made during the last years) will be useful.

@AdamISZ
Author

AdamISZ commented Nov 24, 2019

A/ I understand that we're diverging on the importance of implementing an anti-sybil mechanism for a protocol like SNICKER. IMHO, there are many reasons to believe that some actors may want to sybil it (globally or partially). One reason is the same reason pushing some actors to sybil the bitcoin P2P network or Electrum servers in significant proportions. Another reason may be individual actors (whales?) willing to extract additional benefits from their dormant stacks by selling information resulting from the Sybil attacks. The information gathered by these individuals may be useless but may become very useful when aggregated by a single actor.

Yes, we have a pretty different point of view on all this stuff! How I see it:

I don't agree that any use of coinjoin is actually making the system worse, except in cases where (a) it creates a network level trace - which is precisely the problem reduced to bare minimum by the simplest versions of SNICKER and (b) where it somehow encourages co-spending in forms worse than would otherwise exist (and this point is fairly murky, as you have to carefully consider counterfactuals about how coins would be spent, otherwise).

So discounting (a) and (b) even a coinjoin with N-1 sybils for an all-powerful global adversary does not leave you worse off than where you started. That's an important starting point. It means that a person fruitlessly coinjoining away for a long time is only a bit poorer in tx fees.

So that's an important starting point, but it's a weak starting point of course. But then you consider the realistic extra layer: there is not just one adversary. There are almost certainly many, and just as important, there are at least some and usually many non-adversaries, also, because there is mutual benefit here. An early paper on Joinmarket specifically quipped about 'friends' in this context, but it's deeper than just a quip. The more open and global the system is, the stronger this point of heterogeneity becomes. Note that by contrast, a server based system displays fragility - unless it is designed to have no role in join coordination, and only does messaging (I realise that the Chaumian case is sort of in between, but I'm in danger of getting off topic here ...).

So about your main example of harm here (where a Proposer chooses an exchange output because it's reused):

( First please note that the BIP is not really suggesting that the system could or would be bootstrapped primarily via address reuse; I think far more likely specialised wallets like JM/Samourai/Wasabi or other wallets with some watermark behaviour could be used, then bootstrapping could occur more via choosing existing SNICKER joins. I admit I don't really know though, anything is possible. )

I don't, in the end, buy that this is a real harm to me as a user who explicitly chooses to participate in coinjoins, although I understand your perspective.
It's rather similar to the arguments (which for sure had validity) against the idea of Joinmarket. We're talking about an open system where anyone can participate; that clearly has both good and bad sides. If anything, this one (SNICKER) is better on that particular front, because the Proposer can use blockchain analysis themselves to decide where to target.

So, if I am a Proposer, I am likely a specialised entity and can choose to join with whoever I like, based on some crude or sophisticated kind of blockchain scanning and analysis. And even if I am ignorant of the fact it's an Exchange, say, what do I lose by proposing a 2 party coinjoin with them? They learn nothing about me other than that some utxo out there in the world wants to do a join - unless the Receiver (here, the exchange) decides to infer that different proposals come from the same entity and are therefore using related utxos; but that's an assumption, and it doesn't look like a very safe one. So while I'm not seeing a realistic way that the Proposer is damaging their privacy, they are definitely helping it, if they succeed in damaging common input ownership heuristics and creating anon sets to the outside world.

An alternative mitigation addressing this risk of leaks to Receivers may be a rule imposing that a Proposer doesn't use the same O1 and O2 addresses for more than one proposal. But this solution may become problematic for wallet rescans if the ratio of non-responding Receivers is too high.

I think there's a key point that's wrong here: If I am a Proposer with utxo U1 and want to make a bunch of proposals with receiver utxos R1, R2, ... Rn, it has no privacy degrading consequence. Any single one of them learns that "U1 wants to do a join", and that information is essentially broadcast. Repeated proposals don't make it "worse" (if it's even bad). The only possible issue is if I have U1, U2 etc and want to propose with all (many) of them. That's what I was discussing above, and arguing that attempting to link them is dubious.

@LaurentMT

So discounting (a) and (b) even a coinjoin with N-1 sybils for an all-powerful global adversary does not leave you worse off than where you started. That's an important starting point. It means that a person fruitlessly coinjoining away for a long time is only a bit poorer in tx fees.

Considering the poor state of on-chain privacy provided by Bitcoin, I don't disagree that any improvement that can be achieved is a step in the right direction. Far from that. At the same time, I don't think we can simply ignore the argument that a system providing zero guarantee is worse than nothing (because it gives a false sense of security/privacy or because it may encourage some people to do transactions that they wouldn't do without this system).

Don't get me wrong. I don't expect that we can transform a 2-party coinjoin system into something providing strong guarantees (whatever "strong" means in this context). But I like to think that we can try to minimize the risks.

An early paper on Joinmarket specifically quipped about 'friends' in this context, but it's deeper than just a quip. The more open and global the system is, the stronger this point of heterogeneity becomes.

If you have a link to this paper, I'm interested. :)
I don't disagree with the idea that heterogeneity of honest and malicious participants may be a good thing for a global system (a kind of "divide & conquer" strategy) but the down-to-earth person in me can't forget that usage of privacy enhancing tools in Bitcoin is still very confidential.
Hence, my obsession for the "guarantees" provided by the system since its inception.

( First please note that the BIP is not really suggesting that the system could or would be bootstrapped primarily via address reuse; I think far more likely specialised wallets like JM/Samourai/Wasabi or other wallets with some watermark behaviour could be used, then bootstrapping could occur more via choosing existing SNICKER joins. I admit I don't really know though, anything is possible. )

Got it. I just want to point out that I still see a system based on reused addresses and one based on outputs generated by Joinmarket/Wasabi/Samourai as 2 very different models. The reason is that:

  • systems like Joinmarket/Wasabi/Samourai have anti-sybil mechanisms implemented for their own needs (fees paid to coordinator/makers in addition to fees paid to miners),
  • operators/developers of these systems have an incentive to monitor/avoid sybil attacks against their systems.

A SNICKER model relying on systems like these ones automatically benefits from these 2 points.

A model based on reused addresses is a very different beast which, IMHO, provides little benefit and at worst implies unnecessary risks (depending on the answer to the next point).

In the same spirit (and even if a BIP isn't the right place for these considerations ;), I would say that a heuristic detecting Joinmarket/Wasabi/Samourai outputs based on an analysis of transactions in isolation provides very different guarantees from a heuristic based on the analysis of these transactions in their contexts.

And obviously, SNICKER doesn't have to be limited to systems like Joinmarket/Wasabi/Samourai but they're good examples of systems having incentives and needs pretty well aligned with the incentives and needs of SNICKER.

I think there's a key point that's wrong here: If I am a Proposer with utxo U1 and want to make a bunch of proposals with receiver utxos R1, R2, ... Rn, it has no privacy degrading consequence. Any single one of them learns that "U1 wants to do a join", and that information is essentially broadcast. Repeated proposals don't make it "worse" (if it's even bad). The only possible issue is if I have U1, U2 etc and want to propose with all (many) of them. That's what I was discussing above, and arguing that attempting to link them is dubious.

The scenario I had in mind was more something like this:

Alice wants to initiate a SNICKER transaction for her UTXO IA1. She sends 2 proposals, one to Bob and one to Eve.

  • Proposal sent to Bob:
    • Inputs: IA1 (Alice), IB1 (Bob)
    • Outputs: OA1 and OA2 (Alice), OB1 (Bob)
  • Proposal sent to Eve:
    • Inputs: IA1 (Alice), IE1 (Eve)
    • Outputs: OA1 and OA2 (Alice), OE1 (Eve)

Bob agrees to collaborate. The SNICKER transaction is pushed to the network. Eve sees the transaction and is able to say that OA1 and OA2 are controlled by Alice (and by deduction that OB1 is controlled by Bob).
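The linkage in this scenario can be sketched in a few lines. This is only an illustration under simplifying assumptions: utxos and addresses are plain string labels, and the `Proposal` class and `outputs_attributed_to_proposer` helper are hypothetical names, not anything defined in the BIP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    inputs: frozenset      # labels of the utxos the proposed transaction spends
    outputs: frozenset     # labels of the output addresses it creates
    receiver_output: str   # the one output that belongs to the receiver


def outputs_attributed_to_proposer(seen: Proposal, broadcast_outputs: frozenset) -> frozenset:
    """Outputs of a broadcast transaction that the holder of `seen` can pin on the proposer.

    The receiver of `seen` knows every output in it other than their own belongs
    to the proposer, so any of those that reappear on chain are attributable.
    """
    proposer_outputs = seen.outputs - {seen.receiver_output}
    return proposer_outputs & broadcast_outputs


# Alice reuses OA1/OA2 in both proposals; Bob's version is the one broadcast.
to_bob = Proposal(frozenset({"IA1", "IB1"}), frozenset({"OA1", "OA2", "OB1"}), "OB1")
to_eve = Proposal(frozenset({"IA1", "IE1"}), frozenset({"OA1", "OA2", "OE1"}), "OE1")
leaked = outputs_attributed_to_proposer(to_eve, to_bob.outputs)

# With fresh output addresses per proposal, Eve's copy matches nothing on chain.
to_eve_fresh = Proposal(frozenset({"IA1", "IE1"}), frozenset({"OA3", "OA4", "OE1"}), "OE1")
leaked_fresh = outputs_attributed_to_proposer(to_eve_fresh, to_bob.outputs)
```

In the reused case `leaked` contains both OA1 and OA2, so Eve also learns that the remaining output is Bob's; in the fresh case `leaked_fresh` is empty.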

Maybe I've missed a point in the BIP or I don't interpret the sentence "his own freshly generated output addresses O1 and O2" in its most restrictive meaning (when I should) but my understanding is that the BIP allows this scenario, which produces an undesirable outcome.
If freshly generated should be understood as the strong constraint "O1 and O2 shouldn't be used in multiple proposals" and not as "O1 and O2 have never received a payment" then it's all good. I would just suggest adding an explicit point about this constraint in the "Partially signed transaction" section.

@AdamISZ
Author

AdamISZ commented Nov 28, 2019

Hi @LaurentMT
paper: https://www.bitcrime.de/presse-publikationen/pdf/BoehmeMoeser_Anonymity_WEIS2016.pdf

The scenario I had in mind

This scenario illustrates a crucial point I'd stupidly paid no attention to, which is the practicality of generating enough addresses as Proposer. I never had in mind that OA1, OA2 would be reused, so I was (and still am!) convinced that there is nothing lost by making multiple proposals against a single utxo U1, no matter what the behaviour of malicious parties at the Receiver end.
But thank you for pointing it out, because of course, this is another hurdle for Proposers in practice. We don't want them making 100K proposals (fake or real), so we need some rate limiting; but by the same token, we do want them to be able to broadcast "a lot" of proposals (for some values of "a lot") (they may have to pay for the privilege perhaps) for the practicality of achieving success - but that means they must be prepared to do the handling of large HD wallet/BIP32 branches/trees. I see this as just another illustration of the fact that proposals on anything but a trivial scale will be a somewhat specialised role, as envisaged in the original blog post ("Alisa").

Edit: I'd note Joinmarket makers have somewhat of a similar issue; in active periods, long running bots accumulate thousands or tens of thousands of used addresses, with sometimes significant gaps. The similarity is not coincidental.

@LaurentMT

Thanks for the link! :)

@nopara73

Anti-spam prevention

Even with indexing, the problem of near-infinite fake proposals to clog the bulletin board "channel" still exists. This can be prevented with either (a) hashcash[16] (grind the ephemeral key for ECIES for example), (b) fidelity bonds similar to the idea proposed by Chris Belcher for Joinmarket[17], (c) direct payments to a server for the right to post proposals or (d) proposals only allowed to be made by the Bulletin Board owner or some fixed group. All of these ideas may be possible; but not using any of them would almost certainly lead to a death-by-spam in any reasonably popular public system.

Could you elaborate on your ideal anti-spam protection?

@AdamISZ
Author

AdamISZ commented Jan 12, 2020

I'm not sure I'm best placed to make the judgement. Consider this problem could apply to other systems too.

But taking them in order:

No defence at all: I think this is fine for a first step of experimentation. I don't think anyone cares enough at this point to attack. I'd be happy to set up a server and just see what happens (if I had the ability and the time to do all the necessary groundwork ...).

Hashcash as a general idea has never really taken off in any context (other than bitcoin's PoW which is a very special case), so while it's the most technologically elegant I fear it applies even less well here than it did in cases like bitmessage (where it sorta kinda worked, a little bit). I'm dubious we could use it in any effective way unless there's a spin on it I'm missing.
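For concreteness, a hashcash-style stamp on a proposal could look roughly like the sketch below. The BIP text suggests grinding the ECIES ephemeral key; this sketch instead grinds a separate 8-byte nonce appended to the encrypted blob, which is an assumption made purely to keep the example self-contained. The use of SHA-256 and the function names `stamp` and `verify` are likewise illustrative.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of `digest`, read as a big-endian integer."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def stamp(blob: bytes, difficulty_bits: int) -> int:
    """Grind a nonce until SHA-256(blob || nonce) has at least `difficulty_bits` leading zeros."""
    for nonce in count():
        digest = hashlib.sha256(blob + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify(blob: bytes, nonce: int, difficulty_bits: int) -> bool:
    """A single hash is enough for the bulletin board to check a stamp."""
    digest = hashlib.sha256(blob + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


# Usage: about 2**12 hashes of work on average for the poster, one hash for the checker.
blob = b"encrypted proposal bytes (placeholder)"
nonce = stamp(blob, 12)
accepted = verify(blob, nonce, 12)
```

The cost to the poster doubles with each extra difficulty bit while verification stays at one hash, but nothing here stops an attacker with optimised hardware from paying that cost far more cheaply than an honest Proposer.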

Fidelity bonds seem a good indirect solution of a similar type. I have been a bit concerned about the privacy implications of them, hence a recent other gist suggesting there might be an offsetting crypto technique for that: https://gist.github.com/AdamISZ/52aa2e4e48240dfbadebb316507d0749 ... in particular, I liked the idea that you could somehow tag LSAG key images so that the same fidelity bond can be used for multiple applications (nobody would want 1BTC locked up for Joinmarket, 1BTC locked up for SNICKER, etc etc ...).

The next layer down I guess is micropayments over something like LN. Here we're moving even further into forcing Proposers, and bulletin board servers, into doing infrastructure set up. But we're also moving into more potential centralization with having payments to a server, and more potential fingerprinting of Proposers via their payments - note a big part of SNICKER as an idea is that Proposers can stay almost entirely offline (and asynchronous), and opening channels is not that. I think it does make sense economically though.

Lowest layer would be fully trusted payment, like a subscription model. Can also work, but centralization AND trust in a server (i.e. paying them larger sums in advance) as well as privacy issues without LN (not that even with LN, privacy issues are removed!) make this a kind of icky suggestion.

TLDR I don't know.

@nopara73

Hashcash as a general idea has never really taken off in any context (other than bitcoin's PoW which is a very special case), so while it's the most technologically elegant I fear it applies even less well here than it did in cases like bitmessage (where it sorta kinda worked, a little bit). I'm dubious we could use it in any effective way unless there's a spin on it I'm missing.

It's because the "Proposer" needs to do a bunch of proposals in the first place, which is problematic if we'd want to straightforwardly apply it as "one proposal = one computation", right?

@AdamISZ
Author

AdamISZ commented Jan 12, 2020

I'm not sure that the linear dependence between proposals and costs there would be problematic. That part seems reasonable.
I just think that an organised attacker can potentially arrange for some kind of large scale optimised computation.

It's interesting to compare it with the junk mail problem. I think the original concept of hashcash was something like: the attacker only gets benefit from 1 out of 1000 junk mails, so they have to send orders of magnitude more mails than an honest user. At least that kinda makes sense (although the idea still didn't get traction). Whereas here I think the attacker probably only wants to send maybe 10 times as much to be enough of a nuisance as to really damage the system. Maybe? See, that's part of my problem in trying to answer this spam question: I don't have hard metrics (server load, bandwidth requirements for users etc.) that could give me some sense of what the real limitations are.

Also for junk mail you can maybe make some economic calculation, if the attacker is incented by greed. Here, the only attacker we worry about is the one who simply wants to jam the system; if people send real requests, even if they are the FBI or whatever, fine, that's not an issue (in my opinion). How much is a malevolent attacker of that type willing to spend? I don't know, but such cases are not that common.

@themighty1

Just wanted to leave here a thought I had which is similar to "Indexed to keys" but with a slight twist.

Let's assume that the protocol specifies that Proposers must broadcast a certain fixed amount of proposals in one batch, say 10000.
The idea is that The Proposer must arrange the batch in such a manner that the Receiver's pubkey maps onto the index in the batch which is intended for the Receiver.

For specificity, if the last bytes of the intended Receiver's pubkey are e.g. 0x0101 (257 in decimal), then the Proposer must arrange their batch so that the item at index 257 is destined for the Receiver.
The Receiver now has to download only item with index 257 from each published batch and try to decrypt it to check if it was destined for him.
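A rough sketch of that mapping, to make it concrete. Two things here are assumptions rather than part of the idea as stated: the batch has one slot per possible key suffix (256**n_bytes slots, instead of a fixed 10000), and unused slots are padded with random bytes so that real and dummy entries look alike. The names `slot_for_pubkey` and `build_batch` are made up for the example.

```python
import os


def slot_for_pubkey(pubkey: bytes, n_bytes: int = 2) -> int:
    """Batch index for a receiver: the last `n_bytes` of the pubkey as a big-endian integer."""
    return int.from_bytes(pubkey[-n_bytes:], "big")


def build_batch(proposals: dict, n_bytes: int = 2, blob_len: int = 32) -> list:
    """Arrange encrypted blobs so each sits at the slot derived from its receiver's pubkey.

    `proposals` maps receiver pubkey -> encrypted blob. Empty slots are filled
    with random padding of `blob_len` bytes (an assumption of this sketch).
    """
    batch = [None] * (256 ** n_bytes)
    for pubkey, blob in proposals.items():
        slot = slot_for_pubkey(pubkey, n_bytes)
        if batch[slot] is not None:
            raise ValueError(f"two receivers map to slot {slot}")
        batch[slot] = blob
    return [blob if blob is not None else os.urandom(blob_len) for blob in batch]


# A 33-byte compressed pubkey ending in 0x0101 lands at index 257.
receiver_pubkey = bytes.fromhex("03" + "00" * 30 + "0101")
batch = build_batch({receiver_pubkey: b"\x01" * 32})
mine = batch[slot_for_pubkey(receiver_pubkey)]
```

The Receiver then fetches only `batch[257]` from each published batch and attempts decryption; two intended receivers sharing a suffix cannot both be served by the same batch in this form.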

@AdamISZ, will this be useful?

@AdamISZ
Author

AdamISZ commented Oct 8, 2021

@themighty1 well, hello again!

That sounds like an interesting idea, but I'm not sure I can see how it would work? The bulletin board aggregates proposals by different proposers (or different nyms on the anonymising network, anyway!), each of which could be a big batch or just single proposals, say.
Perhaps you see it as, the bulletin board itself will do this ordering, after being told by the proposer what key is attached to it? Would that cut down on storage? The key is maybe 33 bytes out of a few hundred. To get ordering to be aligned with the last N bytes of a key you'll need 2^(8N) in the list, right? Does that mean padding the list out? And of course if N is small there will be collisions in keys for the same position in the list.
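To put rough numbers on the list size and the collision question, here is a small calculation. It assumes the last N bytes of each key are uniformly distributed and independent, which is my simplification; the function names are illustrative.

```python
import math


def slots(n_bytes: int) -> int:
    """Number of list positions needed to cover every possible N-byte key suffix."""
    return 2 ** (8 * n_bytes)


def collision_probability(k: int, n_bytes: int) -> float:
    """Probability that at least two of `k` keys share the same N-byte suffix (birthday bound)."""
    m = slots(n_bytes)
    if k > m:
        return 1.0  # pigeonhole: more keys than slots
    # P(no collision) = prod_{i=0}^{k-1} (1 - i/m), summed in log space for stability.
    log_no_collision = sum(math.log1p(-i / m) for i in range(k))
    return 1.0 - math.exp(log_no_collision)


two_byte_slots = slots(2)
p_half = collision_probability(302, 2)
```

With N = 2 the list has 65536 positions, and under the uniformity assumption a collision becomes more likely than not at around 300 proposals in one list, so either the list is padded heavily or collisions have to be handled somehow.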

(I also think it depends on what problem we're trying to solve. It seems there are three concerns you could have with the retrieval of proposals by the receiver:

  • Privacy - is the receiver revealing information by choosing to download certain things. This is mainly addressed with the anonymising network connection; an alternative might be Private Information Retrieval, which in the limit of no wizardry means: download all, but with wizardry, presumably scales way better than that, but is probably quite hard to implement.
  • The storage and bandwidth requirements of the bulletin board
  • The bandwidth requirements of the receiver

Were you thinking mainly about reducing bandwidth requirements of the receiver? I'm not sure I can see how that happens here; in either case, the receiver says "give me X" whether it be "give me the proposal at position 257" or "give me proposals for the key 03....0101". It seems like it would mainly only help the bulletin board, although as per above, I'm not sure about that either.)

@themighty1

Hi, and thanks for breaking this down.
I didn't realize that the BB can aggregate the list, I thought that BB is like an IPFS where blobs are posted and retrieved without anyone curating the blobs.

Actually, the fact that the BB must be curated makes sense, because the BB owner must check the PoW of the blobs and discard spam blobs.

I think you boiled it down correctly with this:

in either case, the receiver says "give me X" whether it be "give me the proposal at position 257" or "give me proposals for the key 03....0101"

(Assuming that by 03....0101 you mean that only the last two bytes are revealed to the BB owner (as opposed to revealing the whole pubkey), thus slightly increasing privacy for the receiver).

@themighty1

Hey, @AdamISZ , could you pls give an update on this idea.
Has it been superseded with some better approach?
Is there any wallet which is planning to/already implemented this?
Thanks.

@AdamISZ
Author

AdamISZ commented May 12, 2022

Hey, @AdamISZ , could you pls give an update on this idea. Has it been superseded with some better approach? Is there any wallet which is planning to/already implemented this? Thanks.

Hi.
It was implemented (by me) in basic form in Joinmarket, but there wasn't much interest, and nobody else took an interest in working on it (and I also failed to get it published as a BIP, though I believe I went through the proper channels by initiating a discussion on the mailing list) so it remains disabled by default in Joinmarket (and unmaintained so, while it was initially functional, I can't say for sure it would work if someone tried it, today) and I haven't done any more work on it.

I think probably because the gains a system like this offers are small (minimal anonymity set), it doesn't capture people's imaginations as something worth putting time into. It's easy to paint a scenario where this is actually kinda cool/useful if a huge set of people were using it, but such a thing wouldn't bootstrap unless it was really compelling, even being useful at small scale.

But those are just general thoughts. Bottom line is without other people choosing to get involved, I couldn't justify spending more time on it after the initial coding.

(a minor point, that was mentioned somewhere but was only theoretical at the time of the proposal: use of taproot makes this easier due to plain pubkey outputs).
