@tevador
Last active December 10, 2024 20:03

JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for general audience.

Chapters 3-7 contain technical specifications.

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. An RID consists of 25 alphanumeric characters, split into five groups of five by underscores for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RIDs. RIDs are also suitable for communication via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below view-only wallet. One of the new wallet tiers called "FindReceived" is intended for wallet-scanning and only has the ability to calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

The Janus attack is a targeted attack that aims to determine whether two addresses A and B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that the recipient owns both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serializes a 256-bit integer to a 32-byte little-endian output.
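
As an illustration, the two serialization functions could be sketched in Python as follows. The function names and the range checks are choices made for this example and are not part of the specification.

```python
def bytes_to_int256(x: bytes) -> int:
    """Deserialize a 256-bit little-endian integer from a 32-byte input."""
    if len(x) != 32:
        raise ValueError("expected exactly 32 bytes")
    return int.from_bytes(x, "little")


def int256_to_bytes(n: int) -> bytes:
    """Serialize a 256-bit integer to a 32-byte little-endian output."""
    if not 0 <= n < 2**256:
        raise ValueError("value does not fit in 256 bits")
    return n.to_bytes(32, "little")


# Little-endian: the least significant byte comes first.
assert int256_to_bytes(9).hex() == "09" + "00" * 31
assert bytes_to_int256(int256_to_bytes(2**255 + 1)) == 2**255 + 1
```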

3.2 Hash function

The function Hb(k, x) with parameters b, k, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)
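
A minimal sketch of Hb and SecretDerive using Python's standard `hashlib.blake2b`. The helper names are illustrative, and the example assumes that string inputs such as domain separators are hashed as their ASCII bytes, which this section does not state explicitly.

```python
import hashlib
from typing import Optional


def H(b: int, k: Optional[bytes], x: bytes) -> bytes:
    """Blake2b with a b-byte output, an optional key k and the "Monero" personalization."""
    return hashlib.blake2b(
        x,
        digest_size=b,
        key=k or b"",      # an empty key selects unkeyed hashing
        person=b"Monero",  # hashlib zero-pads the personalization to 16 bytes
    ).digest()             # fanout=1 and depth=1 (the defaults) give sequential mode


def secret_derive(k: bytes, x: bytes) -> bytes:
    """SecretDerive(k, x) = H32(k, x)."""
    return H(32, k, x)


secret = secret_derive(bytes(32), b"example input")  # placeholder key and input
assert len(secret) == 32
```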

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2^252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

| Point | Derivation | Serialized (hex) |
|-------|------------|------------------|
| B | generator of 𝔾1 | 0900000000000000000000000000000000000000000000000000000000000000 |

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo ℓ is calculated as d^-1 = 8*(8*d)^-1 to preserve the aforementioned property.
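
The derivation and the inverse can be sketched as below. This is an illustration only: the helper names and placeholder inputs are hypothetical, and the example assumes that the final multiplication by 8 is not reduced modulo ℓ, so that the inverse stays a multiple of 8.

```python
import hashlib

L = 2**252 + 27742317777372353535851937790883648493  # subgroup order ℓ


def H32(k: bytes, x: bytes) -> bytes:
    return hashlib.blake2b(x, digest_size=32, key=k, person=b"Monero").digest()


def key_derive1(k: bytes, x: bytes) -> bytes:
    d = bytearray(H32(k, x))
    d[31] &= 0x7F  # clear the most significant bit
    d[0] &= 0xF8   # clear the least significant 3 bits
    return bytes(d)


def invert_key(d: int) -> int:
    """d^-1 = 8 * (8*d)^-1, where (8*d)^-1 is the inverse modulo ℓ."""
    return 8 * pow(8 * d, -1, L)  # requires Python 3.8+


# Placeholder inputs, not real wallet keys.
d = int.from_bytes(key_derive1(b"\x01" * 32, b"example"), "little")
d_inv = invert_key(d)

assert d % 8 == 0 and d < 2**255           # clamped as described above
assert d_inv % 8 == 0                      # the inverse is still a multiple of 8
assert (d * d_inv) % L == 1                # and it inverts d modulo ℓ
```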

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

| Point | Derivation | Serialized (hex) |
|-------|------------|------------------|
| G | generator of 𝔾2 | 5866666666666666666666666666666666666666666666666666666666666666 |
| U | Hp("seraphis U") | 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730 |
| X | Hp("seraphis X") | 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613 |

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.
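
The following sketch illustrates KeyDerive2 and the public key serialization. It assumes that the 64-byte hash output is interpreted as a little-endian integer before the reduction, which is not stated above, and the helper names are hypothetical. As a sanity check, serializing the standard Ed25519 generator (y = 4/5 mod p, even x-coordinate) reproduces the hex string listed for G.

```python
import hashlib

L = 2**252 + 27742317777372353535851937790883648493  # subgroup order ℓ
P = 2**255 - 19                                       # field prime of Ed25519


def key_derive2(k: bytes, x: bytes) -> int:
    """KeyDerive2(k, x) = H64(k, x) mod ℓ."""
    h = hashlib.blake2b(x, digest_size=64, key=k, person=b"Monero").digest()
    return int.from_bytes(h, "little") % L  # little-endian is an assumption


def serialize_point(y: int, x_is_odd: bool) -> bytes:
    """Lower 255 bits: y-coordinate. Most significant bit: parity of x."""
    return (y | (int(x_is_odd) << 255)).to_bytes(32, "little")


y_G = 4 * pow(5, -1, P) % P  # y-coordinate of the generator G
assert serialize_point(y_G, x_is_odd=False).hex() == "58" + "66" * 31
assert 0 <= key_derive2(b"\x01" * 32, b"example") < L
```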

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification refers to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The order of the characters has a unique prefix that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. Excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].
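
The properties listed above can be checked with a few lines of Python. The helpers below only map between 5-bit values and characters, in the same way as the reference code in Appendix A. How the address bits are grouped into 5-bit values is not defined in this section, so the sketch does not attempt it.

```python
ALPHABET = "xmrbase32cdfghijknpqtuwy01456789"


def to_chars(values) -> str:
    """Map a sequence of 5-bit values (0..31) to base32 characters."""
    return "".join(ALPHABET[v] for v in values)


def from_chars(text: str) -> list:
    """Map base32 characters back to 5-bit values (ValueError on invalid input)."""
    return [ALPHABET.index(c) for c in text]


assert len(ALPHABET) == 32 and len(set(ALPHABET)) == 32  # 32 distinct symbols
assert set("0123456789") <= set(ALPHABET)                # all digits are present
assert not set("olvz") & set(ALPHABET)                   # o, l, v, z are excluded
assert to_chars(range(8)) == "xmrbase3"                  # the distinguishing prefix
assert from_chars("xmra") == [0, 1, 2, 4]
```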

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

| Field | Type | Description |
|-------|------|-------------|
| km | private key | wallet master key |
| kvb | private key | view-balance key |
| birthday | timestamp | date when the wallet was created |

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

| Field | Derivation |
|-------|------------|
| km | BytesToInt256(polyseed_key) mod ℓ |
| kvb | kvb = KeyDerive2(km, "jamtis_view_balance_key") |
| birthday | from Polyseed |

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

Field Derivation
km setup ceremony
kvb setup ceremony
birthday setup ceremony

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

| Field | Derivation |
|-------|------------|
| km | km = ks |
| kvb | kvb = KeyDerive2(km, "jamtis_view_balance_key") |
| birthday | entered manually |

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = kv
birthday entered manually

4.4 Additional keys

There are additional keys derived from kvb:

| Key | Name | Derivation | Used to |
|-----|------|------------|---------|
| dfr | find-received key | dfr = KeyDerive1(kvb, "jamtis_find_received_key") | scan for received outputs |
| dua | unlock-amounts key | dua = KeyDerive1(kvb, "jamtis_unlock_amounts_key") | decrypt output amounts |
| sga | generate-address secret | sga = SecretDerive(kvb, "jamtis_generate_address_secret") | generate addresses |
| sct | cipher-tag secret | sct = SecretDerive(sga, "jamtis_cipher_tag_secret") | encrypt address tags |

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.
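
The derivation of these four keys can be sketched as follows. The example assumes that kvb is passed to the hash function as its 32-byte serialization and that the domain separators are ASCII strings. The function names and the placeholder key are hypothetical.

```python
import hashlib


def H32(k: bytes, x: bytes) -> bytes:
    return hashlib.blake2b(x, digest_size=32, key=k, person=b"Monero").digest()


def key_derive1(k: bytes, x: bytes) -> bytes:
    d = bytearray(H32(k, x))
    d[31] &= 0x7F  # clear the most significant bit
    d[0] &= 0xF8   # clear the least significant 3 bits
    return bytes(d)


def secret_derive(k: bytes, x: bytes) -> bytes:
    return H32(k, x)


def derive_additional_keys(k_vb: bytes) -> dict:
    """Derive the find-received, unlock-amounts, generate-address and cipher-tag keys."""
    d_fr = key_derive1(k_vb, b"jamtis_find_received_key")
    d_ua = key_derive1(k_vb, b"jamtis_unlock_amounts_key")
    s_ga = secret_derive(k_vb, b"jamtis_generate_address_secret")
    s_ct = secret_derive(s_ga, b"jamtis_cipher_tag_secret")  # child of s_ga, not of k_vb
    return {"d_fr": d_fr, "d_ua": d_ua, "s_ga": s_ga, "s_ct": s_ct}


keys = derive_additional_keys(bytes(range(32)))  # placeholder k_vb, not a real key
assert all(len(v) == 32 for v in keys.values())
```

Because sct depends only on sga, a party holding sga can encrypt address tags without learning kvb.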

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

[Figure: key hierarchy]

4.6 Wallet access tiers

| Tier | Knowledge | Off-chain capabilities | On-chain capabilities |
|------|-----------|------------------------|-----------------------|
| AddrGen | sga | generate public addresses | none |
| FindReceived | dfr | recognize all public wallet addresses | eliminate 99.6% of non-owned outputs (up to § 5.3.5), link output to an address (except for change and self-spends) |
| ViewReceived | dfr, dua, sga | all | view all received except for change and self-spends (up to § 5.3.14) |
| ViewAll | kvb | all | view all |
| Master | km | all | all |

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.
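
The 99.6% and 256x figures are consistent with a one-byte view tag, which is the assumption made in this small calculation: a non-owned output matches the tag by chance with probability 1/256.

```python
VIEW_TAG_BITS = 8  # assumption: a one-byte view tag

false_positive_rate = 1 / 2**VIEW_TAG_BITS  # chance that a non-owned output matches
eliminated = 1 - false_positive_rate        # share of non-owned outputs filtered out
reduction_factor = 2**VIEW_TAG_BITS         # fewer outputs left for the higher tier

print(f"eliminated: {eliminated:.1%}")         # eliminated: 99.6%
print(f"reduction factor: {reduction_factor}x")  # reduction factor: 256x
```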

4.6.3 Payment validator (ViewReceived)

This tier combines the AddrGen and FindReceived tiers and gives the wallet the ability to see all incoming payments, but it cannot see outgoing payments or change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet that can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

| Key | Name | Value |
|-----|------|-------|
| Ks | wallet spend key | Ks = kvb X + km U |
| Dua | unlock-amounts key | Dua = dua B |
| Dfr | find-received key | Dfr = dfr Dua |

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2^128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].
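
As a sketch of stateless index generation, the example below draws j from a random (version 4) UUID using Python's standard `uuid` module. The function name and the little-endian serialization of j are assumptions made for illustration.

```python
import uuid


def new_address_index() -> int:
    """Pick a 128-bit address index j from a random (version 4) UUID."""
    return uuid.uuid4().int


j = new_address_index()
j_bytes = j.to_bytes(16, "little")  # the byte order is an assumption

assert 0 <= j < 2**128
assert len(j_bytes) == 16
```

No counter or database of previously issued indices is needed, since two independently generated random UUIDs collide only with negligible probability.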

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

| Keys | Name | Derivation |
|------|------|------------|
| kuj | spend key extensions | kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" \|\| j) |
| kxj | spend key extensions | kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" \|\| j) |
| kgj | spend key extensions | kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" \|\| j) |
| daj | address keys | daj = KeyDerive1(sga, "jamtis_address_privkey" \|\| j) |

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')
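
The structure of the tag can be sketched as follows. Twofish is not available in the Python standard library, so the block cipher is passed in as a function. The `toy_block_enc` stand-in used to exercise the code is NOT Twofish and offers no security. The little-endian serialization of j and all helper names are assumptions.

```python
import hashlib
from typing import Callable


def H2(k: bytes, x: bytes) -> bytes:
    return hashlib.blake2b(x, digest_size=2, key=k, person=b"Monero").digest()


def make_address_tag(block_enc: Callable[[bytes, bytes], bytes],
                     s_ct: bytes, j: int) -> bytes:
    """Build the 18-byte tag t_j = (j', h_j') from the address index j."""
    j_enc = block_enc(s_ct, j.to_bytes(16, "little"))    # j' = BlockEnc(s_ct, j)
    hint = H2(s_ct, b"jamtis_address_tag_hint" + j_enc)  # 2-byte tag hint
    return j_enc + hint


def toy_block_enc(s: bytes, x: bytes) -> bytes:
    """Stand-in for BlockEnc: XOR with a key-derived pad. NOT Twofish, illustration only."""
    pad = hashlib.blake2b(s, digest_size=16).digest()
    return bytes(a ^ b for a, b in zip(x, pad))


s_ct = bytes(range(32))  # placeholder secret, not a real key
tag = make_address_tag(toy_block_enc, s_ct, j=42)

assert len(tag) == 18
# An owner recognizes the tag by recomputing the hint from the first 16 bytes.
assert H2(s_ct, b"jamtis_address_tag_hint" + tag[:16]) == tag[16:]
```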

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.
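
The byte counts in § 5.5.1 and § 5.5.2 can be reproduced with simple arithmetic. The calculation assumes an 8-byte encrypted payment ID and a 32-byte key De, and ignores any field tags or length prefixes.

```python
KEY_BYTES = 32          # one serialized public key De
ADDRESS_TAG_BYTES = 18  # encrypted address tag (§ 5.1.2)
PAYMENT_ID_BYTES = 8    # assumption: size of the removed encrypted payment ID

# 2 outputs: two address tags are added, the payment ID is removed.
two_output_delta = 2 * ADDRESS_TAG_BYTES - PAYMENT_ID_BYTES
# 3 or more outputs: one (De, tag) tuple per output.
per_output_tuple = KEY_BYTES + ADDRESS_TAG_BYTES

assert two_output_delta == 28
assert per_output_tuple == 50
```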

6. Address encoding

6.1 Address structure

An address has the following overall structure:

| Field | Size (bits) | Description |
|-------|-------------|-------------|
| Header | 30* | human-readable address header (§ 6.2) |
| K1 | 256 | address key 1 |
| D2 | 255 | address key 2 |
| D3 | 255 | address key 3 |
| t | 144 | address tag |
| Checksum | 40* | (§ 6.3) |

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

network char network type
"t" testnet
"s" stagenet
"m" mainnet

The software decoding an address shall abort if an invalid network character is encountered.
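
The header checks described in this section could look like the sketch below. The function name and the use of `ValueError` to signal an abort are choices made for this example.

```python
NETWORKS = {"t": "testnet", "s": "stagenet", "m": "mainnet"}


def parse_header(address: str) -> str:
    """Validate the 6-character header and return the network type."""
    if len(address) < 6:
        raise ValueError("address is too short")
    if address[:4] != "xmra":
        raise ValueError("not a Jamtis address")
    if address[4] != "1":
        raise ValueError("unsupported address version")
    network = NETWORKS.get(address[5])
    if network is None:
        raise ValueError("invalid network type")
    return network


# "x" * 190 is filler for the data and checksum, which are not checked here.
assert parse_header("xmra1m" + "x" * 190) == "mainnet"
```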

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x^8 + 3x^7 + 11x^6 + 18x^5 + 5x^4 + 25x^3 + 21x^2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 10^12 of not being detected. Reference code for calculating the checksum is in Appendix A.
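
The short check below relates the polynomial to the constants in Appendix A. Packing the coefficients of x^7 down to x^0 into 5-bit groups yields the first entry of GEN, and writing them with the RFC 4648 base32hex digits yields the generator name 3BI5PLC1. The base32hex reading is an observation that happens to match, not something the specification states.

```python
COEFFS = [3, 11, 18, 5, 25, 21, 12, 1]  # x^7 ... x^0; the leading x^8 term is implicit
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"

packed = 0
for c in COEFFS:
    packed = (packed << 5) | c  # 5 bits per GF(32) coefficient

assert packed == 0x1AE45CD581                                  # GEN[0] in Appendix A
assert "".join(BASE32HEX[c] for c in COEFFS) == "3BI5PLC1"     # generator name
assert 32**8 > 10**12  # 8 checksum characters give more than 10^12 possible values
```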

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t), encoded in 910 bits, and the checksum is the 8-character checksum (already in base32). The total length of the encoded address is 196 characters (=6+182+8).
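
The length follows directly from the field sizes in § 6.1, as this small calculation shows. The constant names are illustrative.

```python
FIELD_BITS = {"K1": 256, "D2": 255, "D3": 255, "t": 144}
BITS_PER_CHAR = 5  # one base32 character encodes 5 bits
HEADER_CHARS = 6
CHECKSUM_CHARS = 8

data_bits = sum(FIELD_BITS.values())
data_chars = data_bits // BITS_PER_CHAR

assert data_bits == 910 and data_bits % BITS_PER_CHAR == 0  # no padding bits needed
assert data_chars == 182
assert HEADER_CHARS + data_chars + CHECKSUM_CHARS == 196
```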

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.
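
A sketch of the case conversion for QR codes. The QR alphanumeric mode supports digits, upper-case letters and a few symbols, so an upper-cased address fits entirely. The function names are hypothetical.

```python
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def to_qr_payload(address: str) -> str:
    """Upper-case an address so that it can use the QR alphanumeric mode."""
    payload = address.upper()
    if not set(payload) <= QR_ALPHANUMERIC:
        raise ValueError("address contains characters outside the alphanumeric mode")
    return payload


def from_qr_payload(payload: str) -> str:
    """Restore the canonical lower-case form after scanning."""
    return payload.lower()


assert to_qr_payload("xmra1m") == "XMRA1M"
assert from_qr_payload(to_qr_payload("xmra1m")) == "xmra1m"
```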

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

# Jamtis address checksum algorithm

# cyclic code based on the generator 3BI5PLC1
# can detect 5 errors up to the length of 994 characters
GEN=[0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

M = 0xffffffffff

def jamtis_polymod(data):
    c = 1
    for v in data:
        b = (c >> 35)
        c = ((c & 0x07ffffffff) << 5) ^ v
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))
@UkoeHB

UkoeHB commented Jan 10, 2023

I am working on seraphis knowledge/audit proofs with @DangerousFreedom1984 and ran into some issues with enote ownership proofs and address index proofs.

  • enote ownership proof: prove that an enote is owned by a specific user address (transitively, the owner of that address owns the enote)

Any proof method you come up with (A. subtract K_1 from Ko and make a composition proof on the remainder; B. expose the sender-receiver secret q and allow the verifier to recompute Ko from K_1 [only works for non-selfsends]) can be spoofed by the prover if they know the private keys of the real address, since the K_1 used in the proof can be freely defined. Spoofing means making a proof that an enote was sent to a particular address when the original sender sent it to a different address.

To get around that problem, I propose updating the sender extension to include the key K_1 that is being extended (e.g. k_{g, sender} = H_n("..g..", K_1, q, C)). Then you can expose q and K_1 and the verifier can recreate Ko and be confident that K_1 owns the enote. Note that this proof doesn't provide a way for you to prove an address doesn't own an enote, all it says is 'if you make a valid proof, then the K_1 in that proof is accurate'.

Another issue is you can't use that approach to make selfsend enote ownership proofs, because q is used without a secondary secret (the baked key) when constructing amount commitments and encoded amounts (meaning you can't make a selfsend enote ownership proof without exposing the amount). Moreover, any such proof would have to reveal that an enote is a selfsend type (no type-agnostic proof).

To solve that I propose updating selfsend enote construction so it mimics normal enotes more closely. The only changes needed are adding a selfsend baked key to amounts (baked_key_selfsend = H_32[k_vb](q); for consistency, update the normal one to baked_key_plain = H_32(xr xG) so that both baked keys will have the same serialization pattern [random 32 bytes]), and encrypting address tags the same way as normal enotes (instead of encrypting the raw index). Changing those things actually simplifies the protocol a little by isolating per-type customization to just the construction of secrets q and the baked key.

  • address index proof: prove that an address was generated from a particular index

There is currently no way to prove an address was constructed from a particular index without exposing s_ga. I propose changing the address extensions to H_n(K_s, j, H_32[s_ga](j)) where K_s = k_vb X + k_m U (and H_n_x25519(K_s, j, H_32[s_ga](j)) for the xK_2 and xK_3 modifiers). Then an address index proof for {K}_j will expose K_s, j, and secret H_32[s_ga](j). The user can then do another proof on K_s to show the private keys are known, or do a composition proof with the address {K}_j.

EDIT: These changes have been implemented.

@jeffro256

I have a concern with mixing the "find-received" tier (k_fr) and "generate-address" tier (s_ga & K_s). Having access to both these tiers allows more than the sum of these tiers, namely the ability to 100% recognize owned incoming enotes (basically the "payment validator" tier w/o knowing the amounts). If the shared secret used to encrypt the address tag is a function of s_sr2, then the nominal one-time address K_'o would only be calculable on the "payment validator" tier.

I suggest using the following method for creating the encrypted address tag in an enote: addr_tag_enc = addr_tag XOR H_ate(s_sr1 || s_sr2 || Ko). Under this scheme, the "generate-address" tier can still generate any public address with the same information, but can't decrypt the encrypted address tags.

There are two real-life issues that I can imagine this change would fix. Let's say that you wanted to create a social payment app, like Venmo, in which the backend both calculates and filters view tags to speed up scanning, as well as generates new receive addresses for people who want to send money to their users. Without changing the address tiers, this service would be able to identify all owned enotes of their users w/ ease. Another scenario in which this change would increase security is a merchant server system where the find-receive keys and generate address keys are spread across user-facing servers for quick & responsive invoice generation. If a malicious actor gains access to both key tiers, then they can generate addresses and see all incoming transactions whereas under the modified scheme, they can only generate addresses and calculate view tags.

@UkoeHB

UkoeHB commented Aug 14, 2023

@jeffro256

  1. If decrypting the address tag requires s_sr2, that invalidates the performance benefit of k_fr scanning, because clients of a remote scanner now have to compute the baked key (1/(k^j_a ∗ k_ua)) ∗ K_e.
  2. The baked key actually depends on the address index j, so what you describe is logically impossible.

@jeffro256

jeffro256 commented Aug 19, 2023

Okay I've looked deeper into the 3 main privacy issues I've had with Jamtis, and have a proposal. Thanks to @UkoeHB for the guidance thus far! I modified the Jamtis section of Ukoe's "Implementing Seraphis" paper with the details and uploaded it to Ufile since it's a little more fleshed out than this doc. See below for a high-level view of the proposal.

Jamtis Change: Fix F-R Privacy Issues and New View Tag Tier

Pros

  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes to known public addresses.
  • Third-parties who compute view tags on behalf of users can no longer strongly identify incoming normal enotes sent to a public address that is used more than once.
  • Third-parties can now compute view tags and generate public addresses on behalf of users without the ability to learn any additional balance recovery information.
  • There are now two tiers of view tag wallets that users can pick between depending on their desire of balance/privacy: dense (1 byte) and sparse (2 bytes).

Cons

  • Public address raw size is increased by 30 bytes (48 characters if encoded using base32) (Additional +32 bytes for new public key, -2 bytes to remove decipher hint). Transactions remain the same size.
  • Light wallet scanning is slower on the client side (each deciphering op is replaced with DH op)
  • Additional spec complexity
  • Some other things I'm not currently seeing

Description of changes

Account secrets

Instead of one find-receive key k_fr, there are now two keys: the dense view key k_dv and the sparse view key k_sv. Instead of a base pubkey K_fr, there is now the dense view pubkey K_dv = k_dv * K_ua and sparse view pubkey K_sv = k_sv * K_ua.

Public address

The new public address now contains 4 pubkeys instead of 3 and does away with the decipher hint. The four pubkeys are labeled Kj_s, Dj_ua, Dj_dv, and Dj_sv. Kj_s is the same as Kj_1 in the old address scheme, while Dj_ua, Dj_dv, and Dj_sv are equal to their respective base pubkeys multiplied by the address private key. The ciphered address index c^j stays the same and Dj_ua is the same as the old Kj_3. To summarize, the new address tuple is [Kj_s, Dj_ua, Dj_dv, Dj_sv, c^j].

DH exchanges

The ephemeral pubkey K_e is calculated K_e = r * Dj_ua by the sender like normal (remember that Dj_ua is functionally identical to the old Kj_3), but now there are two DH keys: the dense DH key Kdv_d = r * Dj_dv = k_dv * K_e and the sparse DH key Ksv_d = r * Dj_sv = k_sv * K_e.

View tags

There are now two view tags (dense_view_tag and sparse_view_tag) per enote which are functions of their respective DH keys. In keeping with the old scheme, the dense view tag replaces the old view tag at 1 byte in size, and the sparse view tag replaces the decipher hint at 2 bytes in size. These view tags are completely independent of each other so combining the checks multiplies the amount of filtering done to 1:16777216. A user can choose to reveal either k_dv or k_sv (but not both, explained later) to a light wallet server to pre-scan the enotes for them. Unlike the old k_fr, knowing only one of k_dv or k_sv does not allow a third-party to perform any step of the balance recovery process except for recomputing view tags, which is good for privacy (also explained later).

Sender-Receiver Secret s_sr1

The sender-receiver secret s_sr1 is now computed as s_sr1 = Hsr1n(Kdv_d || Ksv_d || K_e || input_context). Notice that both DH keys are needed to compute s_sr1. This point is crucial to the privacy properties of the new scheme.

Wallet Tiers

There are now two tiers for view tag computation: dense and sparse. They can do nothing else besides compute their respective view tags. The find-receive tier can identify all incoming enotes, but not view amounts. The payment-validator tier remains the same in terms of capabilities. There are 2 new "compound tiers" which are the combination of the dense/sparse view tag tier and the generate address tier, which do exactly as expected without additional privacy drawbacks.

How the New Changes Address Privacy Issues

The core of the first two privacy issues mentioned in the "pros" section stem from the fact that the ability to decrypt address tags was tied to the ability to perform view tag computation. Since address tags are 1) public and 2) constant for a given address, third-parties with knowledge of k_fr can make extremely strong guesses about users' ownership of enotes under loose conditions. This new scheme decouples those two things so that a third-party can compute view tags for a user but not learn any additional information about enotes. To decrypt "address tags" (its now just the ciphered address index) under this new scheme, a third-party must know both k_dv and k_sv, since those are needed to compute s_sr1.

The third privacy issue (mixing the find-receive and generate-address tier) is fixed for the same reason: a third-party must now know both k_dv and k_sv to compute s_sr1. However, there was a deeper issue here with the old scheme since third-parties who knew k_fr, s_ga, and K_s (the combination of find-receive and generate-address tiers) could decipher the address indices and recompute the onetime address Ko, proving to themselves that a user owns this enote with 100% certainty (assuming the user did not lose their keys). Since this issue is addressed, this opens up the possibility for Venmo-like applications where s_ga, k_dv, and K_s are given to a single third-party so that the third-party can reduce users' refresh times by ~99.6% using view tags and generate receive addresses for their users while they are offline without compromising privacy.

How the New Changes Affect Scanning Speed

For regular full wallets where both k_dv and k_sv are known, the first view tag check can be against the 2-byte sparse view tag to initially filter out all but 1:65536 enotes with just one DH exchange, as compared to 1:256. After that, the dense view tag can be checked to further refine the enotes in a 1:256 ratio. For owned enotes, the balance recovery process is actually slower since 3 DH operations are needed instead of 2. As for 1-byte-view-tag light wallets (old "find-receive" tier and new "dense-view" tier), the server does the exact same amount of work (1 DH + view tag check), but the client will need to expend more CPU cycles, assuming that a DH exchange is more expensive than symmetrically deciphering the 16-byte address index.
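The filtering order described above can be illustrated with a toy simulation. This is only a sketch: `toy_dh` is a keyed hash standing in for a real X25519 exchange, the tag derivation is hypothetical rather than the one in the Jamtis spec, and all names (`toy_view_tag`, `filter_by_tag`, `make_enote`) are made up for illustration. It only demonstrates the expected pass rates of a 2-byte check followed by a 1-byte check, and of the dense-first order a light wallet server would use.

```python
import hashlib
import random

def rand_bytes(rng: random.Random, n: int) -> bytes:
    return rng.getrandbits(8 * n).to_bytes(n, "big")

def toy_dh(private_key: bytes, ephemeral_pubkey: bytes) -> bytes:
    # Stand-in for an X25519 exchange: NOT real Diffie-Hellman, just a keyed hash.
    return hashlib.blake2b(private_key + ephemeral_pubkey, digest_size=32).digest()

def toy_view_tag(private_key: bytes, enote: dict, size: int) -> bytes:
    # Hypothetical derivation: truncated hash of (shared secret || onetime address).
    secret = toy_dh(private_key, enote["De"])
    return hashlib.blake2b(secret + enote["Ko"], digest_size=size).digest()

def filter_by_tag(enotes, private_key, field, size):
    # Keep only enotes whose stored tag matches the recomputed one.
    return [e for e in enotes if toy_view_tag(private_key, e, size) == e[field]]

def make_enote(rng, k_dv=None, k_sv=None):
    # Foreign enotes carry tags that look random to this wallet.
    e = {"De": rand_bytes(rng, 32), "Ko": rand_bytes(rng, 32),
         "dense_tag": rand_bytes(rng, 1), "sparse_tag": rand_bytes(rng, 2)}
    if k_dv is not None:
        # Owned enote: the sender computed both tags for this wallet.
        e["dense_tag"] = toy_view_tag(k_dv, e, 1)
        e["sparse_tag"] = toy_view_tag(k_sv, e, 2)
    return e

rng = random.Random(1)
k_dv, k_sv = rand_bytes(rng, 32), rand_bytes(rng, 32)
chain = [make_enote(rng) for _ in range(100_000)]
owned = make_enote(rng, k_dv, k_sv)
chain.append(owned)

# Full wallet: sparse (2-byte) check first, then dense (1-byte) check.
after_sparse = filter_by_tag(chain, k_sv, "sparse_tag", 2)
after_both = filter_by_tag(after_sparse, k_dv, "dense_tag", 1)

# Dense-view light wallet: the server checks the dense tag, the client the sparse tag.
server_matches = filter_by_tag(chain, k_dv, "dense_tag", 1)
client_matches = filter_by_tag(server_matches, k_sv, "sparse_tag", 2)

print(len(after_sparse), len(after_both), len(server_matches), len(client_matches))
```

With 100,000 foreign enotes, one would expect only a handful to survive the sparse check and roughly 1 in 256 to survive the dense check alone, which is the set a dense-view server would forward to the client.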

Below, I have provided a quick and dirty comparison of the operations that must be done to scan enotes under different wallet types. I use the terms "Sparse-View Light Wallet" and "Dense-View Light Wallet" to refer to wallet schemes in which the sparse (2-byte) view tag key k_sv and dense (1-byte) view tag key k_dv, respectively, are provided to third parties to initially filter out enotes. The new "Dense-View" wallet tier is the most similar to the old "Find-Received" wallet tier in that both can calculate 1-byte view tags on behalf of users.

Normal Enote Operation Density

| Amortized Period | Old Full/Light Wallet | New Full/Sparse-View Light Wallet | New Dense-View Light Wallet |
|---|---|---|---|
| 1:1 enotes | DH + view tag check* | DH + view tag check* | DH + view tag check* |
| 1:2^8 enotes | 16 byte decipher + decipher hint check | | DH + view tag check |
| 1:2^16 enotes | | DH + view tag check | |
| 1:2^24 enotes | Ko recompute | 16 byte decipher + Ko recompute | 16 byte decipher + Ko recompute |

* = If applicable, a light wallet server would perform this operation on behalf of a user in the background. This is important when considering trade-offs because if you value client scanning time above all else, you can disregard the operations marked by an asterisk when considering light wallet schemes.

Total Notable Operations for an Owned Normal Enote

| Scheme | Total Notable Operations for an Owned Normal Enote |
|---|---|
| Old | DH + view tag check + 16 byte decipher + decipher hint check + Ko recompute |
| New | 2 * (DH + view tag check) + 16 byte decipher + Ko recompute |

There are obviously more operations in the balance recovery than are mentioned here, but these are likely the most expensive. The main performance difference between full wallets is that once every 256 enotes, the old scheme has to decrypt the address tag and decipher the address index j. The new scheme only does this once every 16777216 enotes, but must perform an extra Diffie-Hellman key exchange and view tag check once every 65536 enotes. According to @tevador, DH exchanges are ~100x more expensive than deciphering, so the scanning performance will likely remain more or less the same for full wallets.
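To make the amortized comparison concrete, here is a rough cost model. It is a sketch under stated assumptions: the relative costs (DH = 100, decipher = 1) come from the ~100x estimate above, the amortization periods come from the operation density table, and everything else (view tag hashing, Ko recomputation, owned enotes) is ignored. The function names are hypothetical.

```python
DH_COST = 100.0        # relative cost of one DH exchange (assumed, from the ~100x estimate)
DECIPHER_COST = 1.0    # relative cost of one 16-byte decipher (assumed unit)

def old_full_cost_per_enote() -> float:
    # 1 DH for every enote, then a decipher for the 1-in-2^8 that pass the view tag.
    return DH_COST + DECIPHER_COST / 2**8

def new_full_cost_per_enote() -> float:
    # 1 DH for every enote, a second DH for 1 in 2^16, a decipher for 1 in 2^24.
    return DH_COST + DH_COST / 2**16 + DECIPHER_COST / 2**24

def old_light_client_cost_per_enote() -> float:
    # Server does the first DH; the client deciphers each forwarded enote (1 in 2^8).
    return DECIPHER_COST / 2**8

def new_dense_client_cost_per_enote() -> float:
    # Server does the dense DH; the client does the sparse DH on each forwarded
    # enote (1 in 2^8), then a decipher on 1 in 2^24.
    return DH_COST / 2**8 + DECIPHER_COST / 2**24

full_ratio = new_full_cost_per_enote() / old_full_cost_per_enote()
client_ratio = new_dense_client_cost_per_enote() / old_light_client_cost_per_enote()
print(f"full wallet new/old:  {full_ratio:.6f}")
print(f"light client new/old: {client_ratio:.2f}")
```

Under these assumptions the full-wallet cost per non-owned enote is dominated by the first DH in both schemes, so the ratio stays very close to 1, while the dense-view client does roughly 100x the per-enote work of the old find-receive client.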

On the other hand, the performance for light wallet clients is worse. The work for the server is exactly the same (1 DH + view tag check), but the client must do 1 DH + view tag check instead of a 16 byte decipher for every 1:256 enotes on-chain (every enote the client receives). At any rate, in both the new "dense-view" light wallet and old "find-receive" wallet tiers, the client must download ~65536x more information (less if the user owns a large fraction of on-chain enotes) than is actually needed for balance recovery past view tag/decipher hint checks, so the performance difference here is hard to quantify without real-world testing. It should be noted that the new "sparse-view" wallet tier follows the same recovery path as the full wallet, so it gets the performance benefit on both the server and client of first being able to check against the sparse view tag, filtering out all but 1:65536 enotes for a user as compared to the normal 1:256. This means that a "sparse-view" light wallet client has to download 256x less information than current Jamtis light wallets, obviously at the privacy cost of narrowing down owned enotes probabilistically to 1:65536.

Additional Opinions on Why I Think the Trade-off is Worth it

Gathering from years of forum discussions and IRC/Matrix chats, one of the biggest UX complaints (arguably only beaten by the 10-block-lock) against Monero is the frustratingly long refresh times. This is such an issue that light wallet ecosystems evolved in the very early days of Monero to tackle this problem. The users of these light wallets were willing to completely sacrifice their incoming enote privacy (by revealing private view keys) just to bring refresh times down. There are innumerable posts online about potential users who left Monero completely because of refresh bugs and the corresponding wait times. This is why privacy-preserving light wallet servers are the future for most casual users, and will capture many on-the-fence people who want a better privacy/UX balance. Creating accessible, un-foot-gun-able digital cash is the core value proposition of Monero for me.

The new light wallet scheme under Jamtis is exciting and brings a lot of possibilities. However, its inherent privacy issues would make it hard for me to recommend to anyone except the least privacy-minded people. There are simply too many ways to footgun, the main concern being the passive address tag decryption issues, which mean you can't receive to the same address more than once or let your light wallet server know your public addresses.

Addressing the main downside of this change, the address size, I say: it's not that big of a deal to me. Jamtis addresses are already >3x the size of BTC addresses, so increasing the size by ~25% doesn't matter. The new addresses would still be easily copy-and-paste-able and fit on a medium QR code. I don't know anyone who is typing the addresses out by hand or reading them aloud, even with legacy CryptoNote addresses, which are >2x the length of a BTC address, so I don't believe that use case is affected. For those that have read this far, thank you for your time and consideration. ;)

@tevador
Author

tevador commented Aug 19, 2023

Public address size is increased by 30 bytes

The actual address length would increase from 196 to 244 characters.
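The character count follows from the encoding width. As a quick check, assuming the address is base32-encoded at 5 bits per character and that the 30 extra bytes pack without introducing additional padding (both assumptions for this sketch; `base32_chars` is a hypothetical helper):

```python
import math

def base32_chars(n_bytes: int) -> int:
    # Each base32 character carries 5 bits.
    return math.ceil(n_bytes * 8 / 5)

extra_chars = base32_chars(30)      # 240 bits / 5 bits per character
new_length = 196 + extra_chars
print(extra_chars, new_length)
```

30 bytes is exactly 240 bits, which divides evenly into 48 characters, so the address grows from 196 to 244 characters.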

As you can see, unless deciphering is more than 256x faster than a DH operation

DH is about 100x slower. The performance impact of this change is likely negligible (slightly slower overall).

Nevertheless, the privacy benefits might still be worth it.

@One-horse-wagon

To enhance security by accommodating your new protocol in a 244 character address is a no-brainer to me. Address length would become an issue only if it would limit what you can do, such as making QR codes unusable.

@jeffro256

DH is about 100x slower.

I assume you're talking about X25519 and Twofish here, is that correct? If we move to a curve cycle to prepare for FCMPs, how fast can DH/variable-base-scalar-multiplication be made using your curve cycle? I would assume that it would be slower, so the full wallet scanning performance changes would likely wash out completely.

@tevador
Author

tevador commented Aug 20, 2023

I assume you're talking about X25519 and Twofish here, is that correct?

Correct.

how fast can DH/variable-base-scalar-multiplication be made using your curve cycle?

X25519 has many optimized implementations that would be very hard to beat with a custom curve.

If we switch to the curve cycle, this only affects the "proof" keys (denoted with the letter K in this specification). I strongly recommend to keep X25519 for the key exchange keys (denoted with the letter D in this specification). This should be easy to do because Jamtis never needs any interop between the key exchange keys and the "proof" keys, so these can be completely unrelated elliptic curve groups.

@j-berman

I lean yes on the idea to add an additional pub key to the address for the privacy gain to light wallet users, however, I'm not a hard yes and I think it's an acceptable decision to proceed without it. I'm going to steel man an opposing argument: even with this proposal, a light wallet user should still expect that a 3rd party server is able to trace their transactions using statistical analysis. As such, the addition offers a benefit that light wallet users using 3rd party servers shouldn't consider in their threat model, and therefore is not worth the added UX and complexity burden.

I lean yes (and do not agree with the steel man) because the address length would still fall within an "acceptable" size, and the proposal offers a tangible privacy benefit to light wallet users (and therefore benefits the anonymity set): the server cannot definitively identify a user's received enotes even if the user receives to the same address twice or if the server knows the user's address, which is a strict improvement to a light wallet user's privacy even if there are still potential statistical leaks under certain conditions.

I'll explain why I think a 3rd party server may still be able to trace transactions using statistical analysis under certain conditions.

Assume this proposal is accepted alongside full chain membership proofs. After some discussion with @kayabaNerve, here is what I understand the theoretically optimal privacy profile for light wallets could look like when constructing a tx:

  1. The user opens their light wallet client and requests their wallet's view-tag-matched enotes from the server.
  2. In order to construct a tx, a light wallet client fetches paths in the merkle tree to a set of enotes (1 real path + N decoy paths so whoever is serving the paths does not know which enote the user is spending)1
    • Each single path would be on the order of kilobytes, thus the light wallet client would fetch a subset of paths similar to fetching decoys today (using a decoy selection algo).
    • The light wallet client should request these paths from a 3rd party daemon whose operator is ideally not colluding with the light wallet server. This way the user avoids revealing to the light wallet server that the user is trying to construct a tx.
      • The light wallet client could request paths to view-tag-matched enotes only, just in case the 3rd party daemon is colluding with the light wallet server.
    • The light wallet client should also request fees from and submit the final tx to 1 or more 3rd party daemons ideally not colluding with the light wallet server to avoid revealing the tx was constructed by the user to the server.
  3. Finally, the user's tx will include a view tag match on chain.

If you assume the 3rd party daemon is not colluding with the light wallet server, then the statistical footprint is: user opened their light wallet, shortly thereafter there's a view tag match on chain. This footprint's impact on a user's privacy depends entirely on tx volume. With low volume, the server is able to tell the user likely spent an enote in the tx since the view tag match is likely change. If the server collects these footprints for every tx the user constructs, with low volume, the server can perhaps start to build a user's plausible tx graph.

If you assume the 3rd party daemon is colluding with the light wallet server, which I think should be every user's default assumption (trusted 3rd parties are security holes), then the statistical footprint naturally can have a worse impact on a user's privacy. The light wallet server can definitively tell when the user constructs a tx, and further can narrow in on a subset of plausible spends. Example:

  • The user receives an enoteA in txA, then spends that enoteA in txB and has a change enoteB in txB. The light wallet server knows the user constructed txB and therefore knows the view tag matched enoteB in txB is likely the user's.
  • When spending enoteB in txC, the user requests a set of merkle paths where enoteB is 1 of N path requests.
  • The light wallet server knows the user constructed txC and can make an educated guess that enoteB was spent in txC.

The light wallet server has thus built up evidence the user received enoteB in txB and spent enoteB in txC.

This statistical leak should be considered unavoidable for the light wallet tier imo; this leak can only be mitigated in some capacity. Which is why I would hope that light wallets don't replace full wallets for privacy-conscious users unless they're running their own light wallet servers. I would still argue the single additional key "find-received" tier as currently spec'd is valuable and worth implementing because 1) amounts are unknown to the server (significant privacy benefit), and 2) it offers a tangible privacy benefit since the light wallet server cannot definitively identify all of a user's received enotes under all conditions. But I can understand the argument why two additional keys for the tier is excessive considering the above argument.

Reiterating: I'm still for the proposal to add an additional pub key to the address. I think the tangible privacy benefit the additional key brings to light wallet users is worth ~25% larger addresses and more complexity. But I don't hold a strong yes considering I think the argument against is a strong argument.


I haven't dug deeper into the sparse/dense view side of the proposal yet and will comment on that later.


1: requesting paths in the merkle tree would be unnecessary if the client downloads the entire merkle tree when scanning. However, this downloading could be on the order of gb's, which would then defeat the core benefit of a light wallet: instant wallet open.

@kayabaNerve

kayabaNerve commented Aug 22, 2023

The Merkle tree leaves would be 32 bytes per output, or a few GB @ 100m outputs. If we have branches with no view-tag-matched outputs, they can be dropped for one 32 byte value. If the view tag hit rate is 1/256, I believe more than half of the branches will have at least one leaf. If the view tag hit rate is 1/65536, most won't.

(branch length is currently configured to 167)

If we have a 1:65536 hit rate, only ~1/400 branches will be hit? That means the 3.2 GB leaf set at 100m outputs becomes 10 MB? It seems much saner to just download the tree in this case.
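These branch estimates can be reproduced with a short calculation. This sketch assumes view tag matches are independent and uniformly distributed across leaves, uses the branch length of 167 and the 100m-output figure quoted above, and ignores the 32-byte placeholders left for pruned branches; `branch_hit_fraction` is a hypothetical helper.

```python
def branch_hit_fraction(hit_rate: float, branch_len: int = 167) -> float:
    # Probability that at least one of the branch's leaves is a view tag match.
    return 1.0 - (1.0 - hit_rate) ** branch_len

LEAF_BYTES = 32
N_OUTPUTS = 100_000_000
full_leaf_set = LEAF_BYTES * N_OUTPUTS            # 3.2 GB

dense_fraction = branch_hit_fraction(1 / 256)
sparse_fraction = branch_hit_fraction(1 / 65536)
sparse_leaf_bytes = full_leaf_set * sparse_fraction

print(f"1/256 hit rate:   {dense_fraction:.1%} of branches hit")
print(f"1/65536 hit rate: about 1 in {1 / sparse_fraction:.0f} branches hit")
print(f"retained leaf data at 1/65536: {sparse_leaf_bytes / 1e6:.1f} MB")
```

Under these assumptions, roughly half of the branches are hit at a 1/256 rate, and about 1 in 400 at 1/65536, which puts the retained leaf data in the single-digit to low-double-digit megabyte range.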

@j-berman

j-berman commented Aug 22, 2023

Tx volume is hovering around ~20k txs per day these days, which is a floor of ~40k outputs per day. Let's assume ~65k outputs per day, which is an expected ~1 view tag match per day at a 1:65,536 hit rate. At that rate, any view-tag-matched enotes the server identifies around the time a user opens their wallet would almost certainly be the user's enotes. Further, any clusters of enotes the user spends/receives in a single day would stick out like a sore thumb to the server.
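The expected number of false-positive matches per day can be worked out directly. This is a sketch assuming the ~65k outputs per day figure above and uniformly random view tags; the helper name is hypothetical.

```python
def expected_matches_per_day(outputs_per_day: int, tag_bits: int) -> float:
    # Each foreign output matches a given view tag with probability 2^-tag_bits.
    return outputs_per_day / 2**tag_bits

OUTPUTS_PER_DAY = 65_000  # assumed daily output volume from the comment above

sparse_matches = expected_matches_per_day(OUTPUTS_PER_DAY, 16)  # 2-byte tag
dense_matches = expected_matches_per_day(OUTPUTS_PER_DAY, 8)    # 1-byte tag
print(f"2-byte tag: ~{sparse_matches:.2f} matches/day")
print(f"1-byte tag: ~{dense_matches:.0f} matches/day")
```

At this volume a 2-byte tag leaves about one decoy match per day, while a 1-byte tag leaves a few hundred, which is why the smaller tag gives the server far more cover traffic to reason about.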

Seems at that hit rate and today's volume, the privacy gain of view tags is close to nil.

@kayabaNerve

kayabaNerve commented Aug 22, 2023 via email

@DangerousFreedom1984

  • Is the speed to recover enotes/balances of normal wallets decreasing? If so how much?
  • What is roughly the rate of people that use third-party servers to filter enotes for them?
  • If, let's say, only 1% of people would give their view keys to third parties to scan the blockchain for them, should we trade the recovery speed of 99% of users so that the 1% can benefit from a more private recovery? (I'm unsure of the numbers, just a thought)
  • Giving away your sparse and dense priv keys is the same as giving away the priv find_receive key in the original seraphis, right?
  • (Just a thought) Unlike Bloom filters in Bitcoin, which in reality don't really enhance privacy, I believe that these changes would enhance privacy here due to the different layers of privacy that Monero already has. But what would be nice to see would be less information being communicated to the wallets, but I can't see any improvements here (today I guess we have the public ephemeral key, view tag, and onetime address, right?). Would be nice to somehow get less info to improve recovery speed and privacy. No idea how.
  • It would have been really hard to make these changes if Seraphis were already in use as they are huge. I think we would have needed basically to multiply the seraphis lib by 2 since it touches almost every aspect of it. But I like also the idea of increasing the address for those who want that feature with more privacy. Do you think that these changes could work as an addon? Would the original Seraphis lib offer enough freedom for that? Maybe a good exercise to think about :p
  • I am willing to make the necessary changes in the knowledge proofs if these changes pass.
  • I'm still in the process of understanding and trying to answer these questions that I have so I don't have an opinion now but the efforts are very much appreciated. Thank you!

@UkoeHB

UkoeHB commented Aug 28, 2023

@jeffro256, here is my review of the proposed changes to the document. I will follow-up with an assessment of pros/cons in a later comment.

To summarize the proposal: Do two key derivations instead of one during the 'view tag filtering' piece of balance recovery. If one derivation is offloaded to a third party, then the second derivation gates access to the nominal address tag (and nominal address spend key).

  • deriving s_fr from k_ua

    • It would be better to derive s_fr from k_vb. That way k_dv and k_sv will have the same entropy as k_ua.
  • section 8.2.4 'Optimized Design'

    • Normal enotes: It should be 'three ECDH exchanges'. Also, adding an additional 32 bytes to s^sr_1 means you'll need two blake2b blocks instead of one (a block is 128 bytes, and iirc we only need one block for s^sr_1 currently), so it is technically four hash operations for normal enote secrets.
  • section 8.3.3

    • Formatting is messed up.
  • section 8.3.4 (needs proof-reading)

    • "we include a MAC-like hashes" -> "we include MAC-like hashes"
    • "and check it against" -> "and check them against"
    • "the ECDH exchange" -> "the ECDH exchanges"
    • start a quote with back ticks so they curl properly: ``
    • "ensuring the view tag derivation" -> "ensuring the view tag derivations"
    • Tentative rewrite: "We highlight the advantage of using two view tags, rather than one, in Section 8.5.1".
  • section 8.4.3

    • K_1 -> K_s
  • section 8.5.1

    • Revert section title changes.
    • "by checking view tags" -> "by checking its view tags"
    • "since it tends to be larger, and thus filters out more computation" -> This is introduced with no prior discussion about the recommended size of view tags (other than vaguely implied by the view tag names).
  • section 8.5.2

    • Self-send tau checks are no longer cheap, because there is no longer an address tag hint.
  • Comments

    • I am not entirely in agreement with rolling back the 'address tag' term. I think it is easier to handle than 'ciphered address index'.
    • Considering the self-send tau check issue, it would be better to just retain the address tag hint instead of adding in a separate view tag. (EDIT: the perf diff here is probably non-existent, so I retract this comment)

@jeffro256

@DangerousFreedom1984

Is the speed to recover enotes/balances of normal wallets decreasing? If so how much?

Honestly, this is really hard to say. I wanted to say that normal full wallet scanning was not going to be any slower than before, but @UkoeHB brought up an issue with the self-send tau checks (I haven't looked into it yet).

What is roughly the rate of people that use third-party servers to filter enotes for them?

I think the rate of people using light wallet servers now is very low because of the terrible privacy trade-offs (giving away your private view key). Fixing some of the privacy issues with light wallet servers and advertising those changes amongst the greater community would surely affect the usage rate.

If, let's say, only 1% of people would give their view keys to third parties to scan the blockchain for them, should we trade the recovery speed of 99% of users so that the 1% can benefit from a more private recovery? (I'm unsure of the numbers, just a thought)

If it really was this low then I don't know if the trade-off would be worth it. I suspect it won't be this low, though. Just look at (e.g.) MyMonero downloads vs other apps.

But what would be nice to see would be less information being communicated to the wallets, but I can't see any improvements here (today I guess we have the public ephemeral key, view tag, and onetime address, right?). Would be nice to somehow get less info to improve recovery speed and privacy. No idea how.

Unfortunately, unless some other technique is used to transmit chain data, the fewer enotes/txs the light wallet server associates with you, the smaller your anonymity set is, which means less bandwidth = less private. Idk how to solve that yet either.

I think we would have needed basically to multiply the seraphis lib by 2 since it touches almost every aspect of it

It doesn't really affect any part of Seraphis proper, just the Jamtis addressing layer, and just normal enote balance recovery at that. So it does require basically a complete rewrite of balance recovery code, but shouldn't actually expand it too much hopefully (working on that right now).

Do you think that these changes could work as an addon? Would the original Seraphis lib offer enough freedom for that?

Two problems with making these changes an optional add-on is that 1) you partition yourself to senders by giving away information about your type of wallet and 2) ecosystem developers now have to support 2 types of addresses, and you can see how well that ends up normally (e.g. current light wallets still don't support sub-addresses). I (and I'm sure others) would prefer if there was just one type of address, but it certainly could be done.

I am willing to make the necessary changes in the knowledge proofs if these changes pass

Thank you, I really appreciate it ;)

@jeffro256

@j-berman Thank you for your deep analysis and counter-arguments

To confront the initial steelman:

even with this proposal, a light wallet user should still expect that a 3rd party server is able to trace their transactions using statistical analysis.

Since Monero's conception, the network has never provided perfect privacy, only plausible deniability. There is over 9 years of on-chain data to perform statistical analysis upon, but never has the protocol been designed to allow deterministic de-anonymization. The implications of the ability for a light wallet server to more-or-less ~100% deterministically sense that a user owns an incoming payment in conditions outside of the user's control (public address sharing, multiple receives) are massive, especially in the western legal domain. Downgrading these attacks to statistical, especially where the risks decrease with greater transaction flow, may save people from legal battles in the future.

I agree with almost everything else, although I don't think we should completely consider the statement true in all cases:

The light wallet server can definitively tell when the user constructs a tx, and further can narrow in on a subset of plausible spends

This is a current design choice that is made for light wallet servers because of the convenience, but nothing about the CryptoNote protocol or Seraphis/Jamtis protocol requires this to be true. It is always possible to construct a transaction and broadcast it to the network directly, and even use a Tor tx proxy, bypassing the light wallet server and obfuscating the user's IP address. In this case, the user doesn't have to assume that the 3rd party daemon isn't colluding with the light wallet server; the user knows it to be true, aside from a Sybil/Eclipse attack.

@j-berman

j-berman commented Aug 28, 2023

I'd say my commentary is most relevant toward understanding why a 2 byte view tag would offer basically no privacy advantage at today's tx volume due to its statistical surface, even with Tor and with connecting to 3rd party daemons to submit txs: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024?permalink_comment_id=4668705#gistcomment-4668705

I generally agree the idea to add an additional pub key does provide a stronger level of privacy though, which is the primary reason why I'm a proponent of the idea. I agree that when compared to the Jamtis light wallet tier without the additional pub key, this proposal downgrades the statistical attack surface (and the surface could become virtually non-existent with extremely high tx volume).

Still, it's worth keeping in mind that the statistical analysis surface the light wallet tier brings is more significant than Monero's current full wallets.

A while back someone proposed that full wallets only download data necessary to determine which outputs belong to a user, and then once identified, request the transactions of those outputs along with "chaff" (decoy) transactions, in order to minimize data needed to download when scanning. There was pushback on this idea because of the widened statistical surface enabling a node to potentially pinpoint a user's txs: https://www.reddit.com/r/Monero/comments/5wc2th/a_proposal_to_speed_up_wallet_sync_around_5x/de940mj/

It's worth keeping in mind the light wallet tier introduces a similar surface.

It is always possible to construct a transaction and broadcast it to the network directly, and even use a Tor tx proxy, bypassing the light wallet server and obfuscating the user's IP address.

This is what I was getting at in explaining how the optimal privacy profile of a light wallet client would communicate with a 3rd party daemon ideally not colluding with the server. Even with Tor though, if a 3rd party daemon combines logs with a light wallet server, the logs would show e.g. Bob just opened his light wallet client, then 1 person just requested paths in a merkle tree (1 path included one of Bob's view tag matched enotes)/fees/submitted a tx to the network, and Bob has a view tag match in that tx.

Unless there exists significant cover volume where tons of people are trying to construct txs at a specific point in time, then it's fairly trivial to guess Bob's tx, his spent enote, and his change enote.

However, yes, it's still a "guess" which I agree is stronger privacy than the current Jamtis light wallet tier's "100% certainty in some cases" and would improve with higher tx volume.

@jeffro256

@UkoeHB I've been thinking about the slowness of the self-send tau checks under the new addressing scheme, and yes, you are right: they are slower since there are no address tag hints. However, since you can now do 3 bytes of view tag checks BEFORE doing the self-send tau checks vs 1 byte of view tag checks, under the new scheme the self-send tau checks will be performed ~65536 times less often (the reduction is smaller if one's self-sends make up a larger portion of total on-chain enote volume). Hopefully, this amortizes out to be slightly faster overall for most users.

@kayabaNerve

Too many view tag bytes hurts privacy AFAIUI, @j-berman to properly state what I'm thinking of so we're all on the same page.

@jeffro256

To be clear, I say 3 bytes of view tags, but it is split into two view tags, a 1-byte and a 2-byte tag, which are each computed from two independent DH secrets. You can give access to compute just one view tag (presumably the 1-byte view tag) to a light wallet server. However, if you are the client with the whole view-balance key, you can compute both view tags and check against both before trying self-send tau checks.

@jeffro256

@j-berman Was making the point that without huge increases to transaction volume and the assumption that the third-party daemon and light wallet server are not colluding, the privacy of giving a light wallet server the ability to compute 2-byte view tags is very bad.

@kayabaNerve

Ah, sorry. Thanks for clarifying.

@jeffro256

jeffro256 commented Sep 10, 2023

For the base32 encoding, instead of using a custom alphabet, why not use an existing standard that meets our requirements like Crockford base32? Spec here: https://www.crockford.com/base32.html. There's an existing C++ implementation here: https://github.com/tplgy/cppcodec/blob/master/cppcodec/base32_crockford.hpp.
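For reference, here is a minimal sketch of the Crockford base32 symbol set applied to byte strings. The alphabet and the decoding aliases (`O` to `0`, `I`/`L` to `1`, hyphens ignored) follow the Crockford spec; the bit-packing convention for byte strings (most significant bit first, zero-padded to a multiple of 5 bits) is an assumption of this sketch and is not necessarily what Jamtis would use. Checksums are omitted.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford: no I, L, O, U

def encode(data: bytes) -> str:
    # Concatenate bits MSB-first and zero-pad to a multiple of 5 (assumed convention).
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % 5)
    return "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))

def decode(text: str) -> bytes:
    # Normalize case, drop hyphens, and map visually ambiguous letters per the spec.
    norm = (text.upper().replace("-", "")
            .replace("O", "0").replace("I", "1").replace("L", "1"))
    bits = "".join(f"{ALPHABET.index(c):05b}" for c in norm)
    bits = bits[: len(bits) - len(bits) % 8]  # discard the padding bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

sample = bytes(range(30))
encoded = encode(sample)
print(encoded, len(encoded))
```

The main appeal for addresses is the error tolerance on input: lowercase text and commonly confused characters decode to the same bytes.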

@UkoeHB

UkoeHB commented Sep 10, 2023

After considering the pros and cons, the biggest concern for me is that combining the view tags gives you a scan tier that can almost definitively identify all owned enotes (normal and self-send). The combined tier would be an ultra-efficient scan tier with high visibility into user transaction graphs. I expect that in the long run, someone will implement that tier to the detriment of user privacy.

So the trade-off is: A) improve privacy for the recommended remote scanning tier, B) expose an unrecommended remote scanning tier that is materially superior to the recommended tier and greatly weakens user privacy.

@jeffro256

Tbf, this was already possible by combining Find-Received + Cipher-Tag. You could give a third-party s_ct and k_fr, and then they could decrypt and decipher address tags, whittling down the probability that a scanned enote is a false positive to 1:16777216.

@UkoeHB

UkoeHB commented Sep 11, 2023

Tbf, this was already possible by combining Find-Received + Cipher-Tag.

Not quite. With k_fr and s_ct you can only identify normal enotes. You still need to send all view tag matches to the client so they can scan for self-sends, which means a remote scanner with k_fr and s_ct is not materially more efficient than one with just k_fr. However, with the dual view tags this changes because now you can rule out many more self-send candidates using the second view tag, greatly reducing the amount of data that needs to be sent to the client.

We can fix this issue by keeping the prior jamtis design (with the address tag hint). The only change is to add the second key derivation to s^sr_1 for normal scans only. This way a remote scanner with k_fr and k_rs (receive-secret key for the second key derivation) is equivalent to the current remote scanner, while a remote scanner with just k_fr has the benefits of your original proposal. This is actually much better overall, because now it is feasible for someone to offload both k_fr and k_rs to a remote scanner in order to offload computation of the second key derivation to that scanner (in your proposal it would not be feasible due to the self-send identification issue), which may be a beneficial trade-off if tx volume becomes very large (e.g. if tx volume increases 256x, then your proposal would leave light wallet clients with the same scanning perf normal clients have today).

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

@tevador
Author

tevador commented Sep 12, 2023

We can fix this issue by keeping the prior jamtis design (with the address tag hint). The only change is to add the second key derivation to s^sr_1 for normal scans only. This way a remote scanner with k_fr and k_rs (receive-secret key for the second key derivation) is equivalent to the current remote scanner, while a remote scanner with just k_fr has the benefits of your original proposal.

I like this solution. The cost would be slightly longer addresses (247 vs 244 characters), but there would be much stronger protection of self-sends from the remote scanning services. See this comment to understand why hiding self-sends is vital to protect the privacy properties of the whole network.

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

If tx volume increases 256x, we'd be at ~40 MB blocks with a blockchain growth of >10 TB/year. If the network can handle that, I think it's safe to assume that CPU performance and network bandwidth have also increased so that light clients can easily keep up using 1/256 view tags.
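As a rough sanity check of those figures, here is a back-of-the-envelope sketch. It assumes Monero's 2-minute target block time and takes the ~40 MB block size quoted above as given. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the block size and chain growth figures.
BLOCK_TIME_SECONDS = 120   # assumption: Monero's 2-minute target block time
VOLUME_MULTIPLIER = 256    # the hypothetical increase in tx volume
SCALED_BLOCK_MB = 40.0     # block size quoted above for 256x volume

blocks_per_year = 365 * 24 * 3600 // BLOCK_TIME_SECONDS
# Average block size today that the 40 MB figure implies (in kB).
implied_current_block_kb = SCALED_BLOCK_MB * 1000 / VOLUME_MULTIPLIER
# Yearly chain growth at the scaled block size (in TB, decimal units).
growth_tb_per_year = SCALED_BLOCK_MB * blocks_per_year / 1_000_000

print(blocks_per_year, implied_current_block_kb, growth_tb_per_year)
```

Under these assumptions there are 262,800 blocks per year, so 40 MB blocks give about 10.5 TB/year, consistent with the ">10 TB/year" estimate.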

@jeffro256

Not quite. With k_fr and s_ct you can only identify normal enotes.

Fair enough

You still need to send all view tag matches to the client so they can scan for self-sends, which means a remote scanner with k_fr and s_ct is not materially more efficient than one with just k_fr. However, with the dual view tags this changes because now you can rule out many more self-send candidates using the second view tag, greatly reducing the amount of data that needs to be sent to the client.

If the user is this hell-bent on revealing their transaction graph for the sake of efficiency, why doesn't the user also send their self-send TXIDs to the light wallet server? IIRC, current light wallet servers already know which users are tied to which outgoing transactions by virtue of helping them construct those transactions. Heck, none of these changes keep the user from sending their view-balance key, which would make for the most efficient light wallet server. If they wanted to dance around the fact that this isn't private, they could even add some ad-hoc tech to randomly request other data so they can claim it's private, or do any number of other things that degrade privacy but improve efficiency. To me, this argument falls under the same category as the criticism made when view-balance keys were announced, namely that someone else could force users to reveal their view-balance keys. It isn't cryptographically possible to prevent people from revealing secret keys willy-nilly, so I don't know how productive it is to talk about potential future scenarios in which the tier system is willingly abused. What we should design are the tiers that we want to see, because users will use them and gain certain trade-offs, while minimizing risk to the planned tiers.

but there would be much stronger protection of self-sends from the remote scanning services

Same point here: It isn't stronger unless we assume the user won't abuse the wallet tiers, which is what brought this discussion on.

See this comment to understand why hiding self-sends is vital to protect the privacy properties of the whole network.

I agree that hiding self-sends is important, but unless you have a protocol that forces users' self-send privacy, I think that point is moot here.

After considering the pros and cons, the biggest concern for me is that combining the view tags gives you a scan tier that can almost definitively identify all owned enotes (normal and self-send).

One thing that prevents this using actual incentives is the existence of the 2-byte view tag "sparse" tier in the original proposal. A 2-byte view tag, for people like us, is complete overkill in the efficiency/privacy balance at current tx volume. But potentially in the future, if there are users who don't want to scan even 1/256 of the enotes on the chain, because they value convenience over privacy 10-fold, they can scan 256x less than that: 1/65536 (about 1 enote every day on mainnet today). I think it's not unreasonable that tx volume could grow 256x sometime in the distant future, which would mean an enote hit every 10 minutes or so for people using the 2-byte view tag tier. (@j-berman did a great analysis of timing attacks against 2-byte view tags against current tx volume in this thread)
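To make the arithmetic concrete, here is a small sketch of the expected false-positive match rate for 1-byte and 2-byte view tags. The daily volume of 65,536 enotes is an assumption picked only so that a 2-byte tag gives about one match per day, as stated above. It is not a measured mainnet figure, and the function names are hypothetical.

```python
def expected_matches_per_day(enotes_per_day: float, view_tag_bits: int) -> float:
    """Expected number of unrelated enotes whose view tag matches by chance."""
    return enotes_per_day / 2 ** view_tag_bits


def minutes_between_matches(enotes_per_day: float, view_tag_bits: int) -> float:
    """Average gap between chance matches, in minutes."""
    return 24 * 60 / expected_matches_per_day(enotes_per_day, view_tag_bits)


# Assumption: roughly 65,536 enotes/day today (illustrative, not measured).
ENOTES_PER_DAY_TODAY = 65_536

for multiplier in (1, 256):
    volume = ENOTES_PER_DAY_TODAY * multiplier
    for bits in (8, 16):
        rate = expected_matches_per_day(volume, bits)
        gap = minutes_between_matches(volume, bits)
        print(f"{multiplier}x volume, {bits}-bit tag: "
              f"{rate:g} matches/day, one every {gap:g} min")
```

With this assumed volume, a 2-byte tag at 256x volume yields 256 matches per day, or one every 5.6 minutes or so, which is the same order of magnitude as the "every 10 minutes" estimate.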

But here's the big thing: this tier doesn't have the deterministic drawbacks of a third-party wallet knowing your nominal address tags: identifying incoming normal enotes to known addresses and incoming normal enotes sent to addresses more than once with ~100% certainty. The privacy of the 2-byte view tag tier scales up with volume, and it is much more detrimental to privacy than the proposed "dense" view tag tier, but if we're planning for very desperate users like we're doing here, we need a bigger jump for light wallet scanning than replacing DH ops with Twofish ops; we need to have the option to cut bandwidth without deterministic attacks.

On the other hand, I do wonder if all these scanning optimizations and tweaks would/will make sense in the long run. If there comes a time when remote scanning only makes sense by offloading both derivations, then we are back to the original jamtis proposal at the cost of a uselessly larger jamtis address and bloated spec.

Here again is the beauty of a 2-byte view tag tier being available. Since we're planning for huge tx volume which displaces users who simply can't keep up with chain data, a 2-byte view tag tier will actually cut bandwidth hugely w/o deterministic downsides.

I think it's safe to assume that CPU performance and network bandwidth have also increased so that light clients can easily keep up using 1/256 view tags

If it's safe to assume this, then why have the modifications in the first place? If it's so easy to keep up with bandwidth and computation, why would users feel the need to jump ship to worse privacy trade-offs en masse?

@tevador
Author

tevador commented Sep 12, 2023

why doesn't the user also send his self-send TXIDs to the light wallet server

Rational users have exactly zero incentive to do this.

Here again is the beauty of a 2-byte view tag tier being available. Since we're planning for huge tx volume which displaces users who simply can't keep up with chain data, a 2-byte view tag tier will actually cut bandwidth hugely w/o deterministic downsides.

Do we really need two view tags for this from the start? Why can't the bitsize of the "standard" view tag scale with volume to keep the false positive rate roughly constant? E.g. when tx volume doubles, one bit is added to the view tag deterministically. That would react much more smoothly and provide plausible deniability under all conditions.
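A minimal sketch of what such a rule could look like, assuming an 8-bit tag at some reference volume and one extra bit per doubling. The function names, the reference volume and the rounding rule are all hypothetical and not part of any specification.

```python
import math


def view_tag_bits(volume: float, base_volume: float, base_bits: int = 8) -> int:
    """Hypothetical rule: add one view tag bit each time volume doubles.

    The tag never shrinks below base_bits.
    """
    if volume <= base_volume:
        return base_bits
    return base_bits + math.floor(math.log2(volume / base_volume))


def false_positives_per_day(volume: float, bits: int) -> float:
    """Expected chance matches per day for a tag of the given size."""
    return volume / 2 ** bits


BASE_VOLUME = 65_536  # assumption: illustrative enotes/day at the reference point

for multiplier in (1, 2, 3, 4, 256):
    volume = BASE_VOLUME * multiplier
    bits = view_tag_bits(volume, BASE_VOLUME)
    print(multiplier, bits, false_positives_per_day(volume, bits))
```

Under this rule the expected number of false positives stays between 1x and 2x the reference rate, no matter how far volume grows.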

@UkoeHB

UkoeHB commented Sep 12, 2023

Why can't the bitsize of the "standard" view tag scale with volume to keep the false positive rate roughly constant? E.g. when tx volume doubles, one bit is added to the view tag deterministically. That would react much more smoothly and provide plausible deniability under all conditions.

This can be abused by a malicious remote scanning service to reduce the anonymity of users by spamming the chain.
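One way to read this objection, as a hedged sketch: if the tag size tracks total volume, a scanner that generates the spam itself can discard its own enotes, so only honest traffic provides cover. The scaling rule, the volumes and the assumption that the spammer can recognize its own enotes are all illustrative.

```python
import math


def view_tag_bits(total_volume: float, base_volume: float, base_bits: int = 8) -> int:
    """Hypothetical rule: one extra view tag bit per doubling of total volume."""
    if total_volume <= base_volume:
        return base_bits
    return base_bits + math.floor(math.log2(total_volume / base_volume))


def honest_false_positives(honest_volume: float, spam_volume: float,
                           base_volume: float) -> float:
    """Chance matches per day that remain after the spammer removes its own enotes."""
    bits = view_tag_bits(honest_volume + spam_volume, base_volume)
    return honest_volume / 2 ** bits


HONEST = 65_536  # assumption: illustrative honest enotes/day

print(honest_false_positives(HONEST, 0, HONEST))             # no spam
print(honest_false_positives(HONEST, 255 * HONEST, HONEST))  # spam inflates volume 256x
```

In this toy model the user's enotes hide among 256 honest false positives per day without spam, but only about 1 once the spammer has pushed the tag to 16 bits.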
