@tevador
Last active December 10, 2024 20:03

JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for a general audience.

Chapters 3-7 contain technical specifications.

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. An RID consists of 25 alphanumeric characters, separated by underscores into groups of five for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RIDs. RIDs are also suitable for communication via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below the view-only wallet. One of the new tiers, called "FindReceived", is intended for wallet scanning and can only calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

The Janus attack is a targeted attack that aims to determine whether two addresses A and B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that the recipient owns both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serializes a 256-bit integer to a 32-byte little-endian output.
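
The two serialization functions can be sketched in Python as follows. The snake_case names are illustrative and not part of the specification; the length check is an added safeguard.

```python
def bytes_to_int256(x: bytes) -> int:
    """Deserialize a 256-bit little-endian integer from a 32-byte input."""
    if len(x) != 32:
        raise ValueError("expected exactly 32 bytes")
    return int.from_bytes(x, "little")


def int256_to_bytes(n: int) -> bytes:
    """Serialize a 256-bit integer to a 32-byte little-endian output.

    int.to_bytes raises OverflowError if n is negative or needs more than 32 bytes.
    """
    return n.to_bytes(32, "little")
```

For example, the serialized generator B from section 3.3.1 (09 followed by 31 zero bytes) deserializes to the integer 9.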

3.2 Hash function

The function Hb(k, x) with parameters b, k, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)
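
A minimal sketch of Hb and SecretDerive using Python's standard hashlib module. The function names hb and secret_derive are illustrative. Sequential mode is the hashlib default (fanout and depth both 1), and the personalization string is padded explicitly to the 16 bytes that Blake2b uses.

```python
import hashlib
from typing import Optional

# "Monero" padded with zero bytes to the 16-byte Blake2b personalization field.
PERSONALIZATION = b"Monero".ljust(16, b"\x00")


def hb(b: int, k: Optional[bytes], x: bytes) -> bytes:
    """Blake2b with output length b bytes, optional key k (max 64 bytes), input x."""
    return hashlib.blake2b(
        x,
        digest_size=b,
        key=k or b"",  # a null key means unkeyed hashing
        person=PERSONALIZATION,
    ).digest()


def secret_derive(k: bytes, x: bytes) -> bytes:
    """SecretDerive(k, x) = H32(k, x)."""
    return hb(32, k, x)
```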

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2^252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

| Point | Derivation | Serialized (hex) |
|-------|------------|------------------|
| B | generator of 𝔾1 | 0900000000000000000000000000000000000000000000000000000000000000 |

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo ℓ is calculated as d^-1 = 8*(8*d)^-1 to preserve the aforementioned property.

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.
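
The KeyDerive1 steps and the cofactor-preserving inverse can be sketched as below. This is an illustration under stated assumptions, not reference code: the function names are hypothetical, the inverse is returned as an unreduced integer so that it stays a multiple of 8, and the example key and input are arbitrary.

```python
import hashlib

# Prime order of the subgroups (section 3.3).
ELL = 2**252 + 27742317777372353535851937790883648493


def h32(k: bytes, x: bytes) -> bytes:
    """H32(k, x): keyed Blake2b with a 32-byte output (section 3.2)."""
    person = b"Monero".ljust(16, b"\x00")
    return hashlib.blake2b(x, digest_size=32, key=k, person=person).digest()


def key_derive1(k: bytes, x: bytes) -> bytes:
    """Derive a Curve25519 private key that is a multiple of 8."""
    d = bytearray(h32(k, x))
    d[31] &= 0x7F  # clear the most significant bit
    d[0] &= 0xF8   # clear the least significant 3 bits
    return bytes(d)


def invert_private_key(d: bytes) -> int:
    """Compute d^-1 = 8 * (8*d)^-1, with the inner inverse taken modulo ELL.

    The result satisfies d * d^-1 = 1 (mod ELL) and is itself a multiple of 8.
    """
    d_int = int.from_bytes(d, "little")
    return 8 * pow(8 * d_int, -1, ELL)  # three-argument pow needs Python 3.8+


# Arbitrary example values, for illustration only.
d = key_derive1(b"\x01" * 32, b"example")
d_inv = invert_private_key(d)
```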

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

| Point | Derivation | Serialized (hex) |
|-------|------------|------------------|
| G | generator of 𝔾2 | 5866666666666666666666666666666666666666666666666666666666666666 |
| U | Hp("seraphis U") | 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730 |
| X | Hp("seraphis X") | 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613 |

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.
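
A minimal sketch of KeyDerive2. The specification does not say how the 64-byte hash is interpreted as an integer; little-endian is assumed here to match the convention of section 3.1. The function name is illustrative.

```python
import hashlib

# Prime order of the subgroups (section 3.3).
ELL = 2**252 + 27742317777372353535851937790883648493


def key_derive2(k: bytes, x: bytes) -> int:
    """KeyDerive2(k, x) = H64(k, x) mod ELL, returned as an integer scalar."""
    person = b"Monero".ljust(16, b"\x00")
    h = hashlib.blake2b(x, digest_size=64, key=k, person=person).digest()
    # Assumption: the hash output is read as a little-endian integer.
    return int.from_bytes(h, "little") % ELL
```

Reducing a 512-bit value modulo the roughly 253-bit ℓ keeps the bias of the resulting scalar negligible, which is presumably why H64 is used rather than H32.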

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification refers to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The order of the characters has a unique prefix that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. It excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].
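
The properties listed above can be checked mechanically. The sketch below also maps between 5-bit values and characters in the same way as the CHARSET lookup in Appendix A. It deliberately does not pack bytes into 5-bit groups, because the bit order is not specified in this section. The helper names are illustrative.

```python
import string

ALPHABET = "xmrbase32cdfghijknpqtuwy01456789"


def encode_values(values):
    """Map a sequence of 5-bit values (0..31) to alphabet characters."""
    return "".join(ALPHABET[v] for v in values)


def decode_chars(text):
    """Map alphabet characters back to 5-bit values; raises ValueError on other characters."""
    return [ALPHABET.index(c) for c in text]


# Letters of the Latin alphabet that do not appear in the encoding.
excluded_letters = set(string.ascii_lowercase) - set(ALPHABET)
```

With this alphabet, the values 0, 1, 2 encode to "xmr", which is the unique prefix mentioned in point 1.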

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

| Field | Type | Description |
|-------|------|-------------|
| km | private key | wallet master key |
| kvb | private key | view-balance key |
| birthday | timestamp | date when the wallet was created |

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

| Field | Derivation |
|-------|------------|
| km | BytesToInt256(polyseed_key) mod ℓ |
| kvb | kvb = KeyDerive1(km, "jamtis_view_balance_key") |
| birthday | from Polyseed |

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

| Field | Derivation |
|-------|------------|
| km | setup ceremony |
| kvb | setup ceremony |
| birthday | setup ceremony |

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

| Field | Derivation |
|-------|------------|
| km | km = ks |
| kvb | kvb = KeyDerive1(km, "jamtis_view_balance_key") |
| birthday | entered manually |

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

| Field | Derivation |
|-------|------------|
| km | km = ks |
| kvb | kvb = kv |
| birthday | entered manually |

4.4 Additional keys

There are additional keys derived from kvb:

| Key | Name | Derivation | Used to |
|-----|------|------------|---------|
| dfr | find-received key | dfr = KeyDerive1(kvb, "jamtis_find_received_key") | scan for received outputs |
| dua | unlock-amounts key | dua = KeyDerive1(kvb, "jamtis_unlock_amounts_key") | decrypt output amounts |
| sga | generate-address secret | sga = SecretDerive(kvb, "jamtis_generate_address_secret") | generate addresses |
| sct | cipher-tag secret | sct = SecretDerive(sga, "jamtis_cipher_tag_secret") | encrypt address tags |

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

[Figure: key hierarchy]

4.6 Wallet access tiers

| Tier | Knowledge | Off-chain capabilities | On-chain capabilities |
|------|-----------|------------------------|-----------------------|
| AddrGen | sga | generate public addresses | none |
| FindReceived | dfr | recognize all public wallet addresses | eliminate 99.6% of non-owned outputs (up to § 5.3.5), link output to an address (except for change and self-spends) |
| ViewReceived | dfr, dua, sga | all | view all received except for change and self-spends (up to § 5.3.14) |
| ViewAll | kvb | all | view all |
| Master | km | all | all |

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.
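
The 99.6% and 256x figures follow from the size of the view tag. The sketch below assumes a 1-byte (8-bit) view tag that matches a non-owned output uniformly at random; the constant and function names are illustrative.

```python
# Assumption: the view tag is 1 byte, so a non-owned output matches with probability 1/256.
VIEW_TAG_BITS = 8

false_positive_rate = 1 / 2**VIEW_TAG_BITS  # 0.00390625
eliminated_fraction = 1 - false_positive_rate  # 0.99609375, i.e. about 99.6%
reduction_factor = 2**VIEW_TAG_BITS  # 256


def expected_candidates(non_owned_outputs: int) -> float:
    """Expected number of non-owned outputs that still pass the view tag check."""
    return non_owned_outputs * false_positive_rate
```

For instance, out of 256,000 non-owned outputs, about 1,000 would be expected to pass the filter and need to be checked by the higher-tier wallet.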

4.6.3 Payment validator (ViewReceived)

This tier combines the AddrGen and FindReceived tiers and can see all incoming payments to the wallet, but it cannot see outgoing payments or change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet that can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

| Key | Name | Value |
|-----|------|-------|
| Ks | wallet spend key | Ks = kvb X + km U |
| Dua | unlock-amounts key | Dua = dua B |
| Dfr | find-received key | Dfr = dfr Dua |

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2^128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].
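
As an illustration of stateless index generation, the sketch below draws j from a random (version 4) UUID using Python's standard uuid module. The function name is hypothetical, and the choice of UUID version is an assumption; the text above only says "a UUID".

```python
import uuid


def new_address_index() -> int:
    """Return a fresh 128-bit address index j without consulting any stored state.

    A version 4 UUID carries 122 random bits, so accidental collisions between
    independently generated indices are extremely unlikely.
    """
    return uuid.uuid4().int


j = new_address_index()
```

A point-of-sale terminal could call this once per order and never needs to coordinate with other terminals of the same wallet.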

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

| Keys | Name | Derivation |
|------|------|------------|
| kuj | spend key extensions | kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" \|\| j) |
| kxj | spend key extensions | kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" \|\| j) |
| kgj | spend key extensions | kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" \|\| j) |
| daj | address keys | daj = KeyDerive1(sga, "jamtis_address_privkey" \|\| j) |
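
The four derivations in the table can be sketched as follows. Two details are assumptions, since this section does not fix them: j is serialized as 16 little-endian bytes before concatenation, and the 64-byte hash in KeyDerive2 is read as a little-endian integer. The function and dictionary key names are illustrative, and the example secret is arbitrary.

```python
import hashlib

ELL = 2**252 + 27742317777372353535851937790883648493
PERSON = b"Monero".ljust(16, b"\x00")


def key_derive1(k: bytes, x: bytes) -> bytes:
    """Curve25519 private key: H32 with the top bit and the low 3 bits cleared."""
    d = bytearray(hashlib.blake2b(x, digest_size=32, key=k, person=PERSON).digest())
    d[31] &= 0x7F
    d[0] &= 0xF8
    return bytes(d)


def key_derive2(k: bytes, x: bytes) -> int:
    """Ed25519 private key: H64 reduced modulo ELL (little-endian assumed)."""
    h = hashlib.blake2b(x, digest_size=64, key=k, person=PERSON).digest()
    return int.from_bytes(h, "little") % ELL


def address_private_keys(s_ga: bytes, j: int) -> dict:
    """Derive the private keys for address index j from the generate-address secret."""
    j_bytes = j.to_bytes(16, "little")  # assumption: 128-bit little-endian index
    return {
        "k_u": key_derive2(s_ga, b"jamtis_spendkey_extension_u" + j_bytes),
        "k_x": key_derive2(s_ga, b"jamtis_spendkey_extension_x" + j_bytes),
        "k_g": key_derive2(s_ga, b"jamtis_spendkey_extension_g" + j_bytes),
        "d_a": key_derive1(s_ga, b"jamtis_address_privkey" + j_bytes),
    }


# Arbitrary example secret, for illustration only.
keys_0 = address_private_keys(b"\x03" * 32, 0)
keys_1 = address_private_keys(b"\x03" * 32, 1)
```

Only sga is needed here, which is why the AddrGen tier can produce addresses without any ability to scan the blockchain.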

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')
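
The sketch below assembles the 18-byte tag from an already encrypted index j' and checks the tag hint on receipt. Twofish is not part of the Python standard library, so BlockEnc is not implemented here and j' is passed in as an opaque 16-byte value. The concatenation order (j' first, then the hint) follows the tuple notation above and is otherwise an assumption, as are the function names.

```python
import hashlib

PERSON = b"Monero".ljust(16, b"\x00")


def tag_hint(s_ct: bytes, j_enc: bytes) -> bytes:
    """hj' = H2(sct, "jamtis_address_tag_hint" || j')."""
    data = b"jamtis_address_tag_hint" + j_enc
    return hashlib.blake2b(data, digest_size=2, key=s_ct, person=PERSON).digest()


def make_address_tag(s_ct: bytes, j_enc: bytes) -> bytes:
    """Build tj = (j', hj') from the 16-byte encrypted index j'."""
    if len(j_enc) != 16:
        raise ValueError("j' must be one 16-byte cipher block")
    return j_enc + tag_hint(s_ct, j_enc)


def tag_hint_matches(s_ct: bytes, tag: bytes) -> bool:
    """Cheap ownership pre-check: recompute the hint and compare, without decrypting j'."""
    if len(tag) != 18:
        return False
    return tag_hint(s_ct, tag[:16]) == tag[16:]


# Placeholder values: s_ct is arbitrary and j_enc stands in for BlockEnc(sct, j).
s_ct = b"\x04" * 32
j_enc = bytes(range(16))
tag = make_address_tag(s_ct, j_enc)
```

Because the hint is 2 bytes long, a tag created under a different sct passes this check with a probability of about 1 in 65,536, so the block cipher only needs to be run for likely matches.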

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.
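
One way to arrive at the figures in this section is shown below. The 18-byte tag size comes from section 5.1.2 and the 32-byte key size from section 3.3.1. The 8-byte size of the legacy encrypted payment ID is an assumption that is not stated in this document, and the calculation ignores any field framing overhead.

```python
ADDRESS_TAG_BYTES = 18      # encrypted address tag (section 5.1.2)
EPHEMERAL_KEY_BYTES = 32    # serialized Curve25519 public key De (section 3.3.1)
PAYMENT_ID_BYTES = 8        # assumption: legacy encrypted payment ID size

# 2-output transactions: two tags are added, one payment ID is removed.
two_output_increase = 2 * ADDRESS_TAG_BYTES - PAYMENT_ID_BYTES

# 3 or more outputs: one (De, tag) tuple per output.
per_output_tuple = EPHEMERAL_KEY_BYTES + ADDRESS_TAG_BYTES

print(two_output_increase, per_output_tuple)
```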

6. Address encoding

6.1 Address structure

An address has the following overall structure:

| Field | Size (bits) | Description |
|-------|-------------|-------------|
| Header | 30* | human-readable address header (§ 6.2) |
| K1 | 256 | address key 1 |
| D2 | 255 | address key 2 |
| D3 | 255 | address key 3 |
| t | 144 | address tag |
| Checksum | 40* | (§ 6.3) |

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

| network char | network type |
|--------------|--------------|
| "t" | testnet |
| "s" | stagenet |
| "m" | mainnet |

The software decoding an address shall abort if an invalid network character is encountered.
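
The three header checks from sections 6.2 to 6.2.2 can be sketched as a single function. The function name and the use of ValueError to signal "abort" are illustrative choices.

```python
NETWORK_TYPES = {"t": "testnet", "s": "stagenet", "m": "mainnet"}


def parse_header(address: str) -> str:
    """Validate the 6-character header and return the network type.

    Raises ValueError (i.e. aborts decoding) on any invalid header field.
    """
    if len(address) < 6:
        raise ValueError("address is shorter than the header")
    if address[:4] != "xmra":
        raise ValueError("missing 'xmra' prefix")
    if address[4] != "1":
        raise ValueError("unsupported version character")
    network = NETWORK_TYPES.get(address[5])
    if network is None:
        raise ValueError("invalid network character")
    return network
```

The example address in section 2.1 starts with "xmra1m", which this function would classify as a version 1 mainnet address.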

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x^8 + 3x^7 + 11x^6 + 18x^5 + 5x^4 + 25x^3 + 21x^2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 10^12 of not being detected. Reference code for calculating the checksum is in Appendix A.
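
The short check below relates the polynomial to the constants in Appendix A. Packing the eight coefficients below the leading term into 5-bit groups reproduces the first entry of GEN, and writing them as base32hex digits (0-9, A-V) gives the label "3BI5PLC1" from the appendix comment. Reading that label as base32hex is an interpretation, not something the appendix states. The last lines show where the "1 in 10^12" bound comes from.

```python
# Coefficients of x^7 ... x^0 from the polynomial (the leading x^8 term is implicit).
COEFFS = [3, 11, 18, 5, 25, 21, 12, 1]

GEN0 = 0x1AE45CD581  # first entry of GEN in Appendix A


def pack_5bit(values):
    """Pack 5-bit values into one integer, first value in the highest bits."""
    result = 0
    for v in values:
        result = (result << 5) | v
    return result


packed = pack_5bit(COEFFS)

# Assumed reading of "3BI5PLC1": base32hex digits, where A=10 ... V=31.
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
label = "".join(BASE32HEX[c] for c in COEFFS)

# An 8-character checksum over a 32-symbol alphabet has 32^8 = 2^40 possible values.
checksum_space = 32**8
```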

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t), encoded in 910 bits, and the checksum is the 8-character checksum (already in base32). The total length of the encoded address is 196 characters (=6+182+8).
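
The length arithmetic can be verified from the field sizes in section 6.1. The variable names are illustrative.

```python
# Field sizes in bits, from the table in section 6.1.
FIELD_BITS = {"K1": 256, "D2": 255, "D3": 255, "t": 144}

HEADER_CHARS = 6
CHECKSUM_CHARS = 8
BITS_PER_CHAR = 5  # base32

data_bits = sum(FIELD_BITS.values())                 # 910
data_chars, leftover_bits = divmod(data_bits, BITS_PER_CHAR)  # 182, 0
total_chars = HEADER_CHARS + data_chars + CHECKSUM_CHARS      # 196
```

Since 910 is a multiple of 5, the data section fills its 182 characters exactly, with no padding bits.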

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.
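
The benefit of upper-casing can be quantified. QR alphanumeric mode stores two characters in 11 bits, while byte mode needs 8 bits per character. The sketch below counts payload bits only (mode indicator and character count field are excluded) and uses a placeholder string of the right length rather than a valid address. The function names are illustrative.

```python
# Character set of QR alphanumeric mode: digits, upper-case letters and 9 symbols.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def to_qr_payload(address: str) -> str:
    """Upper-case an address and confirm that it fits QR alphanumeric mode."""
    payload = address.upper()
    if not set(payload) <= QR_ALPHANUMERIC:
        raise ValueError("address contains characters outside alphanumeric mode")
    return payload


def alphanumeric_mode_bits(n_chars: int) -> int:
    """Payload bits in alphanumeric mode: 11 per pair, 6 for a trailing single character."""
    return 11 * (n_chars // 2) + 6 * (n_chars % 2)


def byte_mode_bits(n_chars: int) -> int:
    """Payload bits in byte mode: 8 per character."""
    return 8 * n_chars


# Placeholder with the length of a Jamtis address; not a valid address.
placeholder = "xmra1m" + "x" * 190
payload = to_qr_payload(placeholder)
```

For a 196-character address this gives 1078 payload bits in alphanumeric mode compared with 1568 bits in byte mode.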

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

# Jamtis address checksum algorithm

# cyclic code based on the generator 3BI5PLC1
# can detect 5 errors up to the length of 994 characters
GEN=[0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

M = 0xffffffffff

def jamtis_polymod(data):
    c = 1
    for v in data:
        b = (c >> 35)
        c = ((c & 0x07ffffffff) << 5) ^ v
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))
@j-berman

j-berman commented Aug 22, 2023

Tx volume is hovering around ~20k txs per day these days, which is a floor of ~40k outputs per day. Let's assume ~65k outputs per day, which is an expected ~1 view tag match per day at a 1:65,536 hit rate. At that rate, any view-tag-matched enotes the server identifies around the time a user opens their wallet would almost certainly be the user's enotes. Further, any clusters of enotes the user spends/receives in a single day would stick out like a sore thumb to the server.

Seems at that hit rate and today's volume, the privacy gain of view tags is close to nil.

@kayabaNerve

kayabaNerve commented Aug 22, 2023 via email

@DangerousFreedom1984

  • Is the speed to recover enotes/balances of normal wallets decreasing? If so how much?
  • What is roughly the rate of people that use third-party servers to filter enotes for them?
  • If, let's say, only 1% of people would give their view keys to third parties to scan the blockchain for them, should we trade the speed recovery of 99% of users so those 1% can benefit from a more private recovery? (I'm unsure of the numbers, just a thought)
  • Giving away your sparse and dense priv keys is the same as giving away the priv find_receive key in the original seraphis, right?
  • (Just a thought) Unlike Bloom filters in Bitcoin, which in reality don't really enhance privacy, I believe that these changes would enhance privacy here due to the different layers of privacy that Monero already has. But what would be nice to see would be less information being communicated to the wallets, but I can't see any improvements here (today I guess we have the public ephemeral key, view-tag and onetime-address, right?). Would be nice to somehow get less info to improve speed recovery and privacy. No idea how.
  • It would have been really hard to make these changes if Seraphis were already in use as they are huge. I think we would have needed basically to multiply the seraphis lib by 2 since it touches almost every aspect of it. But I like also the idea of increasing the address for those who want that feature with more privacy. Do you think that these changes could work as an addon? Would the original Seraphis lib offer enough freedom for that? Maybe a good exercise to think about :p
  • I am willing to make the necessary changes in the knowledge proofs if these changes pass.
  • I'm still in the process of understanding and trying to answer these questions that I have so I don't have an opinion now but the efforts are very much appreciated. Thank you!

@UkoeHB

UkoeHB commented Aug 28, 2023

@jeffro256, here is my review of the proposed changes to the document. I will follow-up with an assessment of pros/cons in a later comment.

To summarize the proposal: Do two key derivations instead of one during the 'view tag filtering' piece of balance recovery. If one derivation is offloaded to a third party, then the second derivation gates access to the nominal address tag (and nominal address spend key).

  • deriving s_fr from k_ua

    • It would be better to derive s_fr from k_vb. That way k_dv and k_sv will have the same entropy as k_ua.
  • section 8.2.4 'Optimized Design'

    • Normal enotes: It should be 'three ECDH exchanges'. Also, adding an additional 32 bytes to s^sr_1 means you'll need two blake2b blocks instead of one (a block is 128 bytes, and iirc we only need one block for s^sr_1 currently), so it is technically four hash operations for normal enote secrets.
  • section 8.3.3

    • Formatting is messed up.
  • section 8.3.4 (needs proof-reading)

    • "we include a MAC-like hashes" -> "we include MAC-like hashes"
    • "and check it against" -> "and check them against"
    • "the ECDH exchange" -> "the ECDH exchanges"
    • start a quote with back ticks so they curl properly: ``
    • "ensuring the view tag derivation" -> "ensuring the view tag derivations"
    • Tentative rewrite: "We highlight the advantage of using two view tags, rather than one, in Section 8.5.1".
  • section 8.4.3

    • K_1 -> K_s
  • section 8.5.1

    • Revert section title changes.
    • "by checking view tags" -> "by checking its view tags"
    • "since it tends to be larger, and thus filters out more computation" -> This is introduced with no prior discussion about the recommended size of view tags (other than vaguely implied by the view tag names).
  • section 8.5.2

    • Self-send tau checks are no longer cheap, because there is no longer an address tag hint.
  • Comments

    • I am not entirely in agreement with rolling back the 'address tag' term. I think it is easier to handle than 'ciphered address index'.
    • Considering the self-send tau check issue, it would be better to just retain the address tag hint instead of adding in a separate view tag. (EDIT: the perf diff here is probably non-existent, so I retract this comment)

@jeffro256

@DangerousFreedom1984

Is the speed to recover enotes/balances of normal wallets decreasing? If so how much?

Honestly, this is really hard to say. I wanted to say that normal full wallet scanning was not going to be any slower than before, but @UkoeHB brought up an issue with the self-send tau checks (I haven't looked into it yet).

What is roughly the rate of people that use third-party servers to filter enotes for them?

I think the rate of people using light wallet servers now is very low because of the terrible privacy trade-offs (giving away your private view key). Fixing some of the privacy issues with light wallet servers and advertising those changes amongst the greater community would surely affect the usage rate.

If, let's say, only 1% of people would give their view keys to third parties to scan the blockchain for them, should we trade the speed recovery of 99% of users so those 1% can benefit from a more private recovery? (I'm unsure of the numbers, just a thought)

If it really was this low then I don't know if the trade-off would be worth it. I suspect it won't be this low, though. Just look at (e.g.) MyMonero downloads vs other apps.

But what would be nice to see would be less information being communicated to the wallets, but I can't see any improvements here (today I guess we have the public ephemeral key, view-tag and onetime-address, right?). Would be nice to somehow get less info to improve speed recovery and privacy. No idea how.

Unfortunately, unless some other technique is used to transmit chain data, the fewer enotes/txs the light wallet server associates with you, the smaller your anonymity set is, which means less bandwidth = less private. Idk how to solve that yet either.

I think we would have needed basically to multiply the seraphis lib by 2 since it touches almost every aspect of it

It doesn't really affect any part of Seraphis proper, just the Jamtis addressing layer, and just normal enote balance recovery at that. So it does require basically a complete rewrite of balance recovery code, but shouldn't actually expand it too much hopefully (working on that right now).

Do you think that these changes could work as an addon? Would the original Seraphis lib offer enough freedom for that?

Two problems with making these changes an optional add-on are that 1) you partition yourself to senders by giving away information about your type of wallet and 2) ecosystem developers now have to support 2 types of addresses, and you can see how well that ends up normally (e.g. current light wallets still don't support sub-addresses). I (and I'm sure others) would prefer if there was just one type of address, but it certainly could be done.

I am willing to make the necessary changes in the knowledge proofs if these changes pass

Thank you, I really appreciate it ;)

@jeffro256

@j-berman Thank you for your deep analysis and counter-arguments

To confront the initial steelman:

even with this proposal, a light wallet user should still expect that a 3rd party server is able to trace their transactions using statistical analysis.

Since Monero's conception, the network has never provided perfect privacy, only plausible deniability. There are over 9 years of on-chain data to perform statistical analysis upon, but never has the protocol been designed to allow deterministic de-anonymization. The implications of the ability for a light wallet server to more-or-less ~100% deterministically sense that a user owns an incoming payment in conditions outside of the user's control (public address sharing, multiple receives) are massive, especially in the western legal domain. Downgrading these attacks to statistical, especially where the risks decrease with greater transaction flow, may save people from legal battles in the future.

I agree with almost everything else, although I don't think we should completely consider the statement true in all cases:

The light wallet server can definitively tell when the user constructs a tx, and further can narrow in on a subset of plausible spends

This is a current design choice that is made for light wallet servers because of the convenience, but nothing about the CryptoNote protocol or Seraphis/Jamtis protocol requires this to be true. It is always possible to construct a transaction and broadcast it to the network directly, and even use a Tor tx proxy, bypassing the light wallet server and obfuscating the user's IP address. In this case, the user doesn't have to assume that the 3rd party daemon isn't colluding with the light wallet server, it knows it to be true, aside from a Sybil/Eclipse attack.

@j-berman

j-berman commented Aug 28, 2023

I'd say my commentary is most relevant toward understanding why a 2 byte view tag would offer basically no privacy advantage at today's tx volume due to its statistical surface, even with Tor and with connecting to 3rd party daemons to submit txs: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024?permalink_comment_id=4668705#gistcomment-4668705

I generally agree the idea to add an additional pub key does provide a stronger level of privacy though, which is the primary reason why I'm a proponent of the idea. I agree that when compared to the Jamtis light wallet tier without the additional pub key, this proposal downgrades the statistical attack surface (and the surface could become virtually non-existent with extremely high tx volume).

Still, it's worth keeping in mind that the statistical analysis surface the light wallet tier brings is more significant than Monero's current full wallets.

A while back someone proposed that full wallets only download data necessary to determine which outputs belong to a user, and then once identified, request the transactions of those outputs along with "chaff" (decoy) transactions, in order to minimize data needed to download when scanning. There was pushback on this idea because of the widened statistical surface enabling a node to potentially pinpoint a user's txs: https://www.reddit.com/r/Monero/comments/5wc2th/a_proposal_to_speed_up_wallet_sync_around_5x/de940mj/

It's worth keeping in mind the light wallet tier introduces a similar surface.

It is always possible to construct a transaction and broadcast it to the network directly, and even use a Tor tx proxy, bypassing the light wallet server and obfuscating the user's IP address.

This is what I was getting at in explaining how the optimal privacy profile of a light wallet client would communicate with a 3rd party daemon ideally not colluding with the server. Even with Tor though, if a 3rd party daemon combines logs with a light wallet server, the logs would show e.g. Bob just opened his light wallet client, then 1 person just requested paths in a merkle tree (1 path included one of Bob's view tag matched enotes)/fees/submitted a tx to the network, and Bob has a view tag match in that tx.

Unless there exists significant cover volume where tons of people are trying to construct txs at a specific point in time, then it's fairly trivial to guess Bob's tx, his spent enote, and his change enote.

However, yes, it's still a "guess" which I agree is stronger privacy than the current Jamtis light wallet tier's "100% certainty in some cases" and would improve with higher tx volume.

@jeffro256

@UkoeHB I've been thinking about the slowness of the self-send tau checks under the new addressing scheme, and yes you are right, they are slower since there are no address tag hints. However, since you can now do 3 bytes of view tag checks BEFORE doing the self-send tau checks, versus 1 byte of view tag checks before, the self-send tau checks will be done ~65536 times less often under the new scheme (more often if one's self-sends are a larger portion of total on-chain enote volume). Hopefully, this amortizes out to be slightly faster overall for most users.
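A rough sketch of where the ~65536 figure comes from, assuming view tags behave like uniformly random bits for enotes the wallet does not own. The enote count and the function names are hypothetical and only serve the illustration.

```python
def false_positive_rate(tag_bits: int) -> float:
    """Chance that an unowned enote passes a view tag check of tag_bits bits."""
    return 2.0 ** -tag_bits


def expected_tau_checks(unowned_enotes: int, tag_bits: int) -> float:
    """Expected number of unowned enotes that reach the self-send tau checks."""
    return unowned_enotes * false_positive_rate(tag_bits)


# Hypothetical number of unowned enotes scanned (assumption, not chain data).
UNOWNED_ENOTES = 10_000_000

checks_1_byte = expected_tau_checks(UNOWNED_ENOTES, 8)    # 1-byte view tag
checks_3_bytes = expected_tau_checks(UNOWNED_ENOTES, 24)  # 1-byte + 2-byte view tags
reduction = checks_1_byte / checks_3_bytes                # 2**16

print(f"tau checks with 1 byte:  {checks_1_byte:.1f}")
print(f"tau checks with 3 bytes: {checks_3_bytes:.3f}")
print(f"reduction factor:        {reduction:.0f}")
```

Owned self-send enotes always pass both tags, which is why the real reduction is smaller for wallets whose self-sends make up a larger share of on-chain volume.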

@kayabaNerve

Too many view tag bytes hurts privacy AFAIUI, @j-berman to properly state what I'm thinking of so we're all on the same page.

@jeffro256

To be clear, I say 3 bytes of view tags, but it is split into two view tags, a 1-byte and a 2-byte tag, which are each computed from two independent DH secrets. You can give access to compute just one view tag (presumably the 1-byte view tag) to a light wallet server. However, if you are the client with the whole view-balance key, you can compute both view tags and check against both before trying self-send tau checks.

@jeffro256

@j-berman Was making the point that without huge increases to transaction volume and the assumption that the third-party daemon and light wallet server are not colluding, the privacy of giving a light wallet server the ability to compute 2-byte view tags is very bad.

@kayabaNerve

Ah, sorry. Thanks for clarifying.

@jeffro256

jeffro256 commented Sep 10, 2023

For the base32 encoding, instead of using a custom alphabet, why not use an existing standard that meets our requirements like Crockford base32? Spec here: https://www.crockford.com/base32.html. There's an existing C++ implementation here: https://github.com/tplgy/cppcodec/blob/master/cppcodec/base32_crockford.hpp.

@UkoeHB

UkoeHB commented Sep 10, 2023

After considering the pros and cons, the biggest concern for me is that combining the view tags gives you a scan tier that can almost definitively identify all owned enotes (normal and self-send). The combined tier would be an ultra-efficient scan tier with high visibility into user transaction graphs. I expect that in the long run, someone will implement that tier to the detriment of user privacy.

So the trade-off is: A) improve privacy for the recommended remote scanning tier, B) expose an unrecommended remote scanning tier that is materially superior to the recommended tier and greatly weakens user privacy.

@jeffro256

Tbf, this was already possible by combining Find-Received + Cipher-Tag. You could give a third-party s_ct and k_fr, and then they could decrypt and decipher address tags, whittling down the probability that a scanned enote is a false positive to 1:16777216.

@UkoeHB

UkoeHB commented Sep 11, 2023

Tbf, this was already possible by combining Find-Received + Cipher-Tag.

Not quite. With k_fr and s_ct you can only identify normal enotes. You still need to send all view tag matches to the client so they can scan for self-sends, which means a remote scanner with k_fr and s_ct is not materially more efficient than one with just k_fr. However, with the dual view tags this changes because now you can rule out many more self-send candidates using the second view tag, greatly reducing the amount of data that needs to be sent to the client.

We can fix this issue by keeping the prior jamtis design (with the address tag hint). The only change is to add the second key derivation to s^sr_1 for normal scans only. This way a remote scanner with k_fr and k_rs (receive-secret key for the second key derivation) is equivalent to the current remote scanner, while a remote scanner with just k_fr has the benefits of your original proposal. This is actually much better overall, because now it is feasible for someone to offload both k_fr and k_rs to a remote scanner in order to offload computation of the second key derivation to that scanner (in your proposal it would not be feasible due to the self-send identification issue), which may be a beneficial trade-off if tx volume becomes very large (e.g. if tx volume increases 256x, then your proposal would leave light wallet clients with the same scanning perf normal clients have today).

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

@tevador
Author

tevador commented Sep 12, 2023

We can fix this issue by keeping the prior jamtis design (with the address tag hint). The only change is to add the second key derivation to s^sr_1 for normal scans only. This way a remote scanner with k_fr and k_rs (receive-secret key for the second key derivation) is equivalent to the current remote scanner, while a remote scanner with just k_fr has the benefits of your original proposal.

I like this solution. The cost would be slightly longer addresses (247 vs 244 characters), but there would be much stronger protection of self-sends from the remote scanning services. See this comment to understand why hiding self-sends is vital to protect the privacy properties of the whole network.

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

If tx volume increases 256x, we'd be at ~40 MB blocks with a blockchain growth of >10 TB/year. If the network can handle that, I think it's safe to assume that CPU performance and network bandwidth have also increased so that light clients can easily keep up using 1/256 view tags.
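The growth figure can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that takes the ~40 MB block size from the comment above as given, and assumes Monero's 2-minute target block time and decimal units (1 TB = 1,000,000 MB).

```python
BLOCK_TIME_SECONDS = 120  # Monero target block time
BLOCK_SIZE_MB = 40        # block size quoted above for 256x tx volume

SECONDS_PER_YEAR = 365 * 24 * 3600
blocks_per_year = SECONDS_PER_YEAR // BLOCK_TIME_SECONDS

# Decimal units: 1 TB = 1_000_000 MB.
growth_tb_per_year = BLOCK_SIZE_MB * blocks_per_year / 1_000_000

print(f"blocks per year: {blocks_per_year}")
print(f"chain growth:    {growth_tb_per_year:.2f} TB/year")
```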

@jeffro256

Not quite. With k_fr and s_ct you can only identify normal enotes.

Fair enough

You still need to send all view tag matches to the client so they can scan for self-sends, which means a remote scanner with k_fr and s_ct is not materially more efficient than one with just k_fr. However, with the dual view tags this changes because now you can rule out many more self-send candidates using the second view tag, greatly reducing the amount of data that needs to be sent to the client.

If the user is this hell-bent on revealing their transaction graph for the sake of efficiency, why doesn't the user also send his self-send TXIDs to the light wallet server? IIRC, current light wallet servers already know which users are tied to which outgoing transactions by virtue of helping them construct that transaction. Heck, all of these changes still don't keep the user from sending their view balance key, which would constitute the most efficient light wallet server. If they wanted to dance around the fact that this isn't private, they could even add some ad-hoc tech to randomly request other data so they can claim it's private, or an infinite amount of other things that degrade privacy but make it more efficient. To me, this argument falls under the same category of criticism at the announcement of view-balance keys, because someone else could force them to reveal their view balance keys. It isn't cryptographically possible to prevent people from revealing secret keys willy-nilly, so I don't know how productive it is to talk about potential future scenarios in which the tier system is willingly abused. What we should design are the tiers that we want to see, because users will use them and gain certain trade-offs, while minimizing risk to the planned tiers.

but there would be much stronger protection of self-sends from the remote scanning services

Same point here: It isn't stronger if we don't assume the user won't abuse the wallet tiers, which is what brought this discussion on.

See this comment to understand why hiding self-sends is vital to protect the privacy properties of the whole network.

I agree that hiding self-sends is important, but unless you have a protocol that forces users' self-send privacy, I think that point is moot here.

After considering the pros and cons, the biggest concern for me is that combining the view tags gives you a scan tier that can almost definitively identify all owned enotes (normal and self-send).

One thing that prevents this using actual incentives is the existence of the 2-byte view tag "sparse" tier in the original proposal. 2 bytes of view tag, for people like us, is complete overkill in efficiency/privacy balance as of current tx volume. But potentially in the future, if there are users who don't want to even scan 1/256 of the enotes on the chain, because they value convenience over privacy 10-fold, they can scan 256x less than that: 1/65536 (about 1 enote every day on mainnet today). I think it's not unreasonable that tx volume could 256x sometime in the distant future, which would mean an enote hit every 10 minutes or so for people using the 2-byte view tag tier. (@j-berman did a great analysis of timing attacks against 2-byte view tags against current tx volume in this thread)
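To put numbers on the match rates, here is a small sketch. The daily enote volume is an assumption chosen so that a 2-byte tag yields about one match per day, as stated above; it is not a measured mainnet figure.

```python
MINUTES_PER_DAY = 24 * 60


def matches_per_day(enotes_per_day: float, tag_bits: int) -> float:
    """Expected view tag matches per day among enotes the user does not own."""
    return enotes_per_day / 2 ** tag_bits


def minutes_between_matches(enotes_per_day: float, tag_bits: int) -> float:
    """Average gap between matches, in minutes."""
    return MINUTES_PER_DAY / matches_per_day(enotes_per_day, tag_bits)


# Assumed volume: implied by "about 1 enote every day" at a 1/65536 filter.
ENOTES_PER_DAY_TODAY = 65_536

today_2_byte = matches_per_day(ENOTES_PER_DAY_TODAY, 16)
today_1_byte = matches_per_day(ENOTES_PER_DAY_TODAY, 8)
future_2_byte = matches_per_day(256 * ENOTES_PER_DAY_TODAY, 16)
future_gap_minutes = minutes_between_matches(256 * ENOTES_PER_DAY_TODAY, 16)

print(f"today, 2-byte tag:       {today_2_byte:.0f} match/day")
print(f"today, 1-byte tag:       {today_1_byte:.0f} matches/day")
print(f"256x volume, 2-byte tag: {future_2_byte:.0f} matches/day "
      f"(one every {future_gap_minutes:.1f} minutes)")
```

Under this assumed volume, a 2-byte tag at 256x volume produces the same match rate as a 1-byte tag today, with a match every few minutes, which is the same order of magnitude as the "every 10 minutes or so" estimate.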

But here's the big thing: this tier doesn't have the deterministic drawbacks of a third-party wallet knowing your nominal address tags: identifying incoming normal enotes to known addresses and incoming normal enotes sent to addresses more than once with ~100% certainty. The privacy of the 2-byte view tag tier scales up with volume, and it is much more detrimental to privacy than the proposed "dense" view tag tier, but if we're planning for very desperate users like we're doing here, we need a bigger jump for light wallet scanning than replacing DH ops with Twofish ops; we need to have the option to cut bandwidth without deterministic attacks.

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

Here again is the beauty of a 2-byte view tag tier being available. Since we're planning for huge tx volume which displaces users who simply can't keep up with chain data, a 2-byte view tag tier will actually cut bandwidth hugely w/o deterministic downsides.

I think it's safe to assume that CPU performance and network bandwidth have also increased so that light clients can easily keep up using 1/256 view tags

If it's safe to assume this, then why have the modifications in the first place? If it's so easy to keep up with bandwidth and computation, why would users feel the need to jump ship to worse privacy trade-offs en masse?

@tevador
Author

tevador commented Sep 12, 2023

why doesn't the user also send his self-send TXIDs to the light wallet server

Rational users have exactly zero incentive to do this.

Here again is the beauty of a 2-byte view tag tier being available. Since we're planning for huge tx volume which displaces users who simply can't keep up with chain data, a 2-byte view tag tier will actually cut bandwidth hugely w/o deterministic downsides.

Do we really need two view tags for this from the start? Why can't the bitsize of the "standard" view tag scale with volume to keep the false positive rate roughly constant? E.g. when tx volume doubles, one bit is added to the view tag deterministically. That would react much more smoothly and provide plausible deniability under all conditions.

@UkoeHB

UkoeHB commented Sep 12, 2023

Why can't the bitsize of the "standard" view tag scale with volume to keep the false positive rate roughly constant? E.g. when tx volume doubles, one bit is added to the view tag deterministically. That would react much more smoothly and provide plausible deniability under all conditions.

This can be abused by a malicious remote scanning service to reduce the anonymity of users by spamming the chain.

@tevador
Author

tevador commented Sep 12, 2023

Malicious actors can reduce the anonymity of users by spamming the chain right now.

@UkoeHB

UkoeHB commented Sep 12, 2023

Yes but a dynamic view tag would make spam more damaging.

@tevador
Author

tevador commented Sep 12, 2023

The options are:

  1. Fixed-size view tag: Either good plausible deniability now and possibly inadequate filtering later, or vice versa (or somewhere in between).
  2. Multiple view tags of various sizes: coarse tuning of the false positive rate; susceptible to spam attacks (an attacker can spam for a while to make users subscribe with the larger tag); retroactive privacy loss when switching to a larger tag.
  3. Dynamic-size view tag: fine tuning of the false positive rate, no retroactive privacy loss, susceptible to spam attacks.

Choose your poison.

@jeffro256

The fourth option, which is what @UkoeHB was proposing, is a fixed-size view tag but optionally enable third parties to compute nominal address tags, which reduces light-side single-core compute time by about 100x, but increases the bandwidth by ~10%.

@tevador
Author

tevador commented Sep 12, 2023

optionally enable third parties to compute nominal address tags

This is orthogonal, can be added to any of the above 3 options. The important point is that it does not reduce the bandwidth requirements for light clients.

@jeffro256

jeffro256 commented Sep 13, 2023

I don't know if this idea has ever been floated before, and I'm making this up right now, but we could do dynamic view tags that 1) aren't susceptible to spam attacks, 2) scale as the receiver wishes and 3) all look uniform on-chain while keeping transaction size the same. They would have an absolute maximum size set by consensus (say 2-bytes, or 3-bytes if we're pushing it, but 2-bytes is probably fine for a maximum). All tags on-chain will show up as this constant length. Let's call this number of bits b_max. The actual length of view tag/the amount of filtering that a receiver desires is encoded in the address. Let's call this value b_addr. We shouldn't give users too many options, else they will partition themselves too finely and addresses could be attempted to be correlated. Let's say that we give the users 8 (could be any number and 4 might be better) choices, which means the size of integer b_addr is fixed at 3 bits (which we could actually fit into an address in my proposal without expanding the address size since there's 4 unused bits). Let's say that our 8 choices for b_addr (the utilized bit length of the view-assist tag) are 1, 2, 4, 6, 8, 10, 12, or 16 bits wide. The higher b_addr is, the more efficient scanning is, but the smaller the anonymity pool is. Full wallets would likely set this value as high as it will go (since they lose no privacy either way if they are giving up that private key). Light wallets would select a good value for them, then send k_va (view-assist) and b_addr to their light wallet server.

Senders, when sending to an address, will extract b_addr from the address and encode b_addr number of bits into the view-assist tag, and whatever bits are left in the view tag space (b_max - b_addr), they will fill with random noise (this part is important to not miss otherwise we might accidentally filter using more bits than intended).

Light wallet servers, who know b_addr for each user, when doing DH exchanges against k_va, will match b_addr bits and send those records to the light wallet client, who scans them as usual.
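The sender and server roles described above could look roughly like the following sketch. It is only an illustration: BLAKE2b stands in for whatever hash the real protocol would use, the DH exchange against k_va is replaced by an opaque `shared_secret` byte string, and all function names are made up for this example.

```python
import hashlib
import secrets

B_MAX = 16                                     # consensus-fixed on-chain tag width (2 bytes)
ALLOWED_B_ADDR = (1, 2, 4, 6, 8, 10, 12, 16)   # the 8 example choices for b_addr


def full_tag(shared_secret: bytes) -> int:
    """Derive a B_MAX-bit tag from the shared secret (hash is a stand-in)."""
    digest = hashlib.blake2b(shared_secret, digest_size=B_MAX // 8).digest()
    return int.from_bytes(digest, "big")


def sender_make_tag(shared_secret: bytes, b_addr: int) -> int:
    """Sender: keep the top b_addr bits, fill the remaining bits with random noise."""
    if b_addr not in ALLOWED_B_ADDR:
        raise ValueError("unsupported b_addr")
    noise_bits = B_MAX - b_addr
    meaningful = full_tag(shared_secret) >> noise_bits
    noise = secrets.randbits(noise_bits) if noise_bits else 0
    return (meaningful << noise_bits) | noise


def server_matches(onchain_tag: int, shared_secret: bytes, b_addr: int) -> bool:
    """Light wallet server: compare only the top b_addr bits of the tag."""
    noise_bits = B_MAX - b_addr
    return (onchain_tag >> noise_bits) == (full_tag(shared_secret) >> noise_bits)


# Usage: an address requesting 6 bits of filtering (roughly a 1:64 view filter).
secret = b"hypothetical DH output"
tag = sender_make_tag(secret, 6)
print(f"on-chain tag: {tag:016b}, server match: {server_matches(tag, secret, 6)}")
```

The random fill is what keeps every tag looking like a uniform `B_MAX`-bit value on-chain; if the sender wrote the full derived tag instead, the server could filter with more bits than the receiver asked for.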

Cons: 1) Partitioning on receive addresses can happen. 2) Malicious senders can use more bits than requested (b_addr) to signal probabilistically to someone's light wallet server that an incoming normal enote belongs to said receiver. 3) If a receiver creates a receive address with b_addr1, then gives that address to a sender, then wants to increase b_addr1 to b_addr2>b_addr1, the receiver might not properly scan a transaction sent with the old b_addr1, and will need the sender to tell the receiver the transaction ID. Not too worried about point 2 since it's already possible to construct a transaction then blab about it. Point 1 is a little trickier.

Pros: 1) Many options for users so they are incentivized to not completely bomb their privacy even with incredibly high transaction volume and bad connectivity, 2) on-chain uniformity, 3) view tags not susceptible to spam attacks 4) no retroactive loss of privacy when increasing b_addr (moving to less private, more efficient tier) 5) There are options to do less than 1:256 view filter, e.g. 1:64 filtering.

What do you think?

@jeffro256

In the context of a Jamtis protocol where we do 2 DH operations for normal enotes anyways (to guard the sender-receiver secret from light wallet servers), all txs could also still include the independent fixed-size view tag for the second key (like the "sparse view tag" in my original proposal). Full wallets could use this fixed-size view tag as the first tag they scan against, allowing them to generate random values of b_addr for their addresses to help mitigate partitioning, while not affecting their scan time by more than fractions of a percent.

@tevador
Author

tevador commented Sep 13, 2023

Interesting idea, but:

  1. Users of remote scanning services might be coerced or tricked into using the longest possible tag, getting zero privacy, while costing the malicious service nothing.
  2. Any address with fewer than the maximum number of bits would immediately leak the fact that the user is using a remote scanner.
  3. Horrible UX when changing the tag size. Payments sent to old addresses would not be recognized and asking the sender for the TXID is not always possible (e.g. donations).

Compare that with a dynamic tag size calculated from the running average tx volume over the last 100 000 blocks so that the mean false positive rate is about 256 tag matches per day:

  1. Malicious services would have to spam constantly at least 50% of the transaction volume to add 1 bit to the tag size (reducing the effective false positive rate to 128 matches per day). This attack is not free, the attacker is paying transaction fees.
  2. Neither addresses nor transactions leak anything.
  3. No UX problems because there is always agreement about the tag size that was used in a transaction.
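A minimal sketch of such a volume-driven tag size. The rounding rule (floor to a whole number of bits) and the example volumes are assumptions; the comment above only fixes the target of about 256 matches per day and the 100 000-block averaging window.

```python
TARGET_MATCHES_PER_DAY = 256


def dynamic_tag_bits(avg_enotes_per_day: int) -> int:
    """Largest tag size that still leaves at least the target matches per day.

    avg_enotes_per_day would be the running average over the last
    100 000 blocks; here it is simply passed in.
    """
    ratio = avg_enotes_per_day // TARGET_MATCHES_PER_DAY
    return max(0, ratio.bit_length() - 1)  # floor(log2(ratio)), clamped at 0


def matches_per_day(enotes_per_day: float, tag_bits: int) -> float:
    return enotes_per_day / 2 ** tag_bits


honest_volume = 65_536                     # hypothetical honest enotes per day
bits_normal = dynamic_tag_bits(honest_volume)

# An attacker who sustains 50% of total volume doubles the enote count.
spammed_volume = 2 * honest_volume
bits_spammed = dynamic_tag_bits(spammed_volume)

# Cover traffic from honest enotes under the spammed tag size.
honest_cover = matches_per_day(honest_volume, bits_spammed)

print(f"tag size without spam: {bits_normal} bits")
print(f"tag size with spam:    {bits_spammed} bits")
print(f"honest matches/day:    {honest_cover:.0f}")
```

With these numbers the attacker has to pay fees for as many enotes as the rest of the network combined, for as long as the averaging window lasts, to gain a single bit.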

@tevador
Author

tevador commented Sep 13, 2023

Full wallets could use this fixed-size view tag as the first tag they scan against, allowing them to generate random values of b_addr for their addresses to help mitigate partitioning, while not affecting their scan time by more than fractions of a percent.

You have to look at it from a game-theoretic perspective. If users can make a choice that improves their experience regardless of what other users do, you have to assume they will make that choice (see prisoner's dilemma). Full wallet users will use the full tag size if they can speed up their scan time by a fraction of a percent.

@jeffro256

Yeah the downsides are kinda weird and hard to reason about/plan OPSEC for. If there was a way to allow someone to encode a certain level of entropy to be received by someone else without the sender knowing what the level of entropy is, I think that'd be the way to go, but I don't know if that's possible.
