Skip to content

Instantly share code, notes, and snippets.

@tevador
Last active December 10, 2024 20:03
Show Gist options
  • Save tevador/50160d160d24cfc6c52ae02eb3d17024 to your computer and use it in GitHub Desktop.
Save tevador/50160d160d24cfc6c52ae02eb3d17024 to your computer and use it in GitHub Desktop.

JAMTIS

This document describes a new addressing scheme for Monero.

Chapters 1-2 are intended for general audience.

Chapters 3-7 contain technical specifications.

Table of Contents

1. Introduction

1.1 Why a new address format?

Sometime in 2024, Monero plans to adopt a new transaction protocol called Seraphis [1], which enables much larger ring sizes than the current RingCT protocol. However, due to a different key image construction, Seraphis is not compatible with CryptoNote addresses. This means that each user will need to generate a new set of addresses from their existing private keys. This provides a unique opportunity to vastly improve the addressing scheme used by Monero.

1.2 Current Monero addresses

The CryptoNote-based addressing scheme [2] currently used by Monero has several issues:

  1. Addresses are not suitable as human-readable identifiers because they are long and case-sensitive.
  2. Too much information about the wallet is leaked when scanning is delegated to a third party.
  3. Generating subaddresses requires view access to the wallet. This is why many merchants prefer integrated addresses [3].
  4. View-only wallets need key images to be imported to detect spent outputs [4].
  5. Subaddresses that belong to the same wallet can be linked via the Janus attack [5].
  6. The detection of outputs received to subaddresses is based on a lookup table, which can sometimes cause the wallet to miss outputs [6].

1.3 Jamtis

Jamtis is a new addressing scheme that was developed specifically for Seraphis and tackles all of the shortcomings of CryptoNote addresses that were mentioned above. Additionally, Jamtis incorporates two other changes related to addresses to take advantage of this large upgrade opportunity:

  • A new 16-word mnemonic scheme called Polyseed [7] that will replace the legacy 25-word seed for new wallets.
  • The removal of integrated addresses and payment IDs [8].

2. Features

2.1 Address format

Jamtis addresses, when encoded as a string, start with the prefix xmra and consist of 196 characters. Example of an address: xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bfyji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wrb5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7whkckh51ik

There is no "main address" anymore - all Jamtis addresses are equivalent to a subaddress.

2.1.1 Recipient IDs

Jamtis introduces a short recipient identifier (RID) that can be calculated for every address. RID consists of 25 alphanumeric characters that are separated by underscores for better readability. The RID for the above address is regne_hwbna_u21gh_b54n0_8x36q. Instead of comparing long addresses, users can compare the much shorter RID. RIDs are also suitable to be communicated via phone calls, text messages or handwriting to confirm a recipient's address. This allows the address itself to be transferred via an insecure channel.

2.2 Light wallet scanning

Jamtis introduces new wallet tiers below view-only wallet. One of the new wallet tiers called "FindReceived" is intended for wallet-scanning and only has the ability to calculate view tags [9]. It cannot generate wallet addresses or decode output amounts.

View tags can be used to eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, this tier can also link outputs to those addresses. Possible use cases are:

2.2.1 Wallet component

A wallet can have a "FindReceived" component that stays connected to the network at all times and filters out outputs in the blockchain. The full wallet can thus be synchronized at least 256x faster when it comes online (it only needs to check outputs with a matching view tag).

2.2.2 Third party services

If the "FindReceived" private key is provided to a 3rd party, it can preprocess the blockchain and provide a list of potential outputs. This reduces the amount of data that a light wallet has to download by a factor of at least 256. The third party will not learn which outputs actually belong to the wallet and will not see output amounts.

2.3 Wallet tiers for merchants

Jamtis introduces new wallet tiers that are useful for merchants.

2.3.1 Address generator

This tier is intended for merchant point-of-sale terminals. It can generate addresses on demand, but otherwise has no access to the wallet (i.e. it cannot recognize any payments in the blockchain).

2.3.2 Payment validator

This wallet tier combines the Address generator tier with the ability to also view received payments (including amounts). It is intended for validating paid orders. It cannot see outgoing payments and received change.

2.4 Full view-only wallets

Jamtis supports full view-only wallets that can identify spent outputs (unlike legacy view-only wallets), so they can display the correct wallet balance and list all incoming and outgoing transactions.

2.5 Janus attack mitigation

Janus attack is a targeted attack that aims to determine if two addresses A, B belong to the same wallet. Janus outputs are crafted in such a way that they appear to the recipient as being received to the wallet address B, while secretly using a key from address A. If the recipient confirms the receipt of the payment, the sender learns that they own both addresses A and B.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

2.6 Robust output detection

Jamtis addresses and outputs contain an encrypted address tag which enables a more robust output detection mechanism that does not need a lookup table and can reliably detect outputs sent to arbitrary wallet addresses.

3. Notation

3.1 Serialization functions

  1. The function BytesToInt256(x) deserializes a 256-bit little-endian integer from a 32-byte input.
  2. The function Int256ToBytes(x) serialized a 256-bit integer to a 32-byte little-endian output.

3.2 Hash function

The function Hb(k, x) with parameters b, k, refers to the Blake2b hash function [10] initialized as follows:

  • The output length is set to b bytes.
  • Hashing is done in sequential mode.
  • The Personalization string is set to the ASCII value "Monero", padded with zero bytes.
  • If the key k is not null, the hash function is initialized using the key k (maximum 64 bytes).
  • The input x is hashed.

The function SecretDerive is defined as:

SecretDerive(k, x) = H32(k, x)

3.3 Elliptic curves

Two elliptic curves are used in this specification:

  1. Curve25519 - a Montgomery curve. Points on this curve include a cyclic subgroup 𝔾1.
  2. Ed25519 - a twisted Edwards curve. Points on this curve include a cyclic subgroup 𝔾2.

Both curves are birationally equivalent, so the subgroups 𝔾1 and 𝔾2 have the same prime order ℓ = 2252 + 27742317777372353535851937790883648493. The total number of points on each curve is 8ℓ.

3.3.1 Curve25519

Curve25519 is used exclusively for the Diffie-Hellman key exchange [11].

Only a single generator point B is used:

Point Derivation Serialized (hex)
B generator of 𝔾1 0900000000000000000000000000000000000000000000000000000000000000

Private keys for Curve25519 are 32-byte integers denoted by a lowercase letter d. They are generated using the following KeyDerive1(k, x) function:

  1. d = H32(k, x)
  2. d[31] &= 0x7f (clear the most significant bit)
  3. d[0] &= 0xf8 (clear the least significant 3 bits)
  4. return d

All Curve25519 private keys are therefore multiples of the cofactor 8, which ensures that all public keys are in the prime-order subgroup. The multiplicative inverse modulo is calculated as d-1 = 8*(8*d)-1 to preserve the aforementioned property.

Public keys (elements of 𝔾1) are denoted by the capital letter D and are serialized as the x-coordinate of the corresponding Curve25519 point. Scalar multiplication is denoted by a space, e.g. D = d B.

3.3.2 Ed25519

The Edwards curve is used for signatures and more complex cryptographic protocols [12]. The following three generators are used:

Point Derivation Serialized (hex)
G generator of 𝔾2 5866666666666666666666666666666666666666666666666666666666666666
U Hp("seraphis U") 126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730
X Hp("seraphis X") 4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613

Here Hp refers to an unspecified hash-to-point function.

Private keys for Ed25519 are 32-byte integers denoted by a lowercase letter k. They are generated using the following function:

KeyDerive2(k, x) = H64(k, x) mod ℓ

Public keys (elements of 𝔾2) are denoted by the capital letter K and are serialized as 256-bit integers, with the lower 255 bits being the y-coordinate of the corresponding Ed25519 point and the most significant bit being the parity of the x-coordinate. Scalar multiplication is denoted by a space, e.g. K = k G.

3.4 Block cipher

The function BlockEnc(s, x) refers to the application of the Twofish [13] permutation using the secret key s on the 16-byte input x. The function BlockDec(s, x) refers to the application of the inverse permutation using the key s.

3.5 Base32 encoding

"Base32" in this specification referes to a binary-to-text encoding using the alphabet xmrbase32cdfghijknpqtuwy01456789. This alphabet was selected for the following reasons:

  1. The order of the characters has a unique prefix that distinguishes the encoding from other variants of "base32".
  2. The alphabet contains all digits 0-9, which allows numeric values to be encoded in a human readable form.
  3. Excludes the letters o, l, v and z for the same reasons as the z-base-32 encoding [14].

4. Wallets

4.1 Wallet parameters

Each wallet consists of two main private keys and a timestamp:

Field Type Description
km private key wallet master key
kvb private key view-balance key
birthday timestamp date when the wallet was created

The master key km is required to spend money in the wallet and the view-balance key kvb provides full view-only access.

The birthday timestamp is important when restoring a wallet and determines the blockchain height where scanning for owned outputs should begin.

4.2 New wallets

4.2.1 Standard wallets

Standard Jamtis wallets are generated as a 16-word Polyseed mnemonic [7], which contains a secret seed value used to derive the wallet master key and also encodes the date when the wallet was created. The key kvb is derived from the master key.

Field Derivation
km BytesToInt256(polyseed_key) mod ℓ
kvb kvb = KeyDerive1(km, "jamtis_view_balance_key")
birthday from Polyseed

4.2.2 Multisignature wallets

Multisignature wallets are generated in a setup ceremony, where all the signers collectively generate the wallet master key km and the view-balance key kvb.

Field Derivation
km setup ceremony
kvb setup ceremony
birthday setup ceremony

4.3 Migration of legacy wallets

Legacy pre-Seraphis wallets define two private keys:

  • private spend key ks
  • private view-key kv

4.3.1 Standard wallets

Legacy standard wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = KeyDerive1(km, "jamtis_view_balance_key")
birthday entered manually

Legacy wallets cannot be migrated to Polyseed and will keep using the legacy 25-word seed.

4.3.2 Multisignature wallets

Legacy multisignature wallets can be migrated to the new scheme based on the following table:

Field Derivation
km km = ks
kvb kvb = kv
birthday entered manually

4.4 Additional keys

There are additional keys derived from kvb:

Key Name Derivation Used to
dfr find-received key kfr = KeyDerive1(kvb, "jamtis_find_received_key") scan for received outputs
dua unlock-amounts key kid = KeyDerive1(kvb, "jamtis_unlock_amounts_key") decrypt output amounts
sga generate-address secret sga = SecretDerive(kvb, "jamtis_generate_address_secret") generate addresses
sct cipher-tag secret ket = SecretDerive(sga, "jamtis_cipher_tag_secret") encrypt address tags

The key dfr provides the ability to calculate the sender-receiver shared secret when scanning for received outputs. The key dua can be used to create a secondary shared secret and is used to decrypt output amounts.

The key sga is used to generate public addresses. It has an additional child key sct, which is used to encrypt the address tag.

4.5 Key hierarchy

The following figure shows the overall hierarchy of wallet keys. Note that the relationship between km and kvb only applies to standard (non-multisignature) wallets.

key hierarchy

4.6 Wallet access tiers

Tier Knowledge Off-chain capabilities On-chain capabilities
AddrGen sga generate public addresses none
FindReceived dfr recognize all public wallet addresses eliminate 99.6% of non-owned outputs (up to § 5.3.5), link output to an address (except of change and self-spends)
ViewReceived dfr, dua, sga all view all received except of change and self-spends (up to § 5.3.14)
ViewAll kvb all view all
Master km all all

4.6.1 Address generator (AddrGen)

This wallet tier can generate public addresses for the wallet. It doesn't provide any blockchain access.

4.6.2 Output scanning wallet (FindReceived)

Thanks to view tags, this tier can eliminate 99.6% of outputs that don't belong to the wallet. If provided with a list of wallet addresses, it can also link outputs to those addresses (but it cannot generate addresses on its own). This tier should provide a noticeable UX improvement with a limited impact on privacy. Possible use cases are:

  1. An always-online wallet component that filters out outputs in the blockchain. A higher-tier wallet can thus be synchronized 256x faster when it comes online.
  2. Third party scanning services. The service can preprocess the blockchain and provide a list of potential outputs with pre-calculated spend keys (up to § 5.2.4). This reduces the amount of data that a light wallet has to download by a factor of at least 256.

4.6.3 Payment validator (ViewReceived)

This level combines the tiers AddrGen and FindReceived and provides the wallet with the ability to see all incoming payments to the wallet, but cannot see any outgoing payments and change outputs. It can be used for payment processing or auditing purposes.

4.6.4 View-balance wallet (ViewAll)

This is a full view-only wallet than can see all incoming and outgoing payments (and thus can calculate the correct wallet balance).

4.6.5 Master wallet (Master)

This tier has full control of the wallet.

4.7 Wallet public keys

There are 3 global wallet public keys. These keys are not usually published, but are needed by lower wallet tiers.

Key Name Value
Ks wallet spend key Ks = kvb X + km U
Dua unlock-amounts key Dua = dua B
Dfr find-received key Dfr = dfr Dua

5. Addresses

5.1 Address generation

Jamtis wallets can generate up to 2128 different addresses. Each address is constructed from a 128-bit index j. The size of the index space allows stateless generation of new addresses without collisions, for example by constructing j as a UUID [15].

Each Jamtis address encodes the tuple (K1j, D2j, D3j, tj). The first three values are public keys, while tj is the "address tag" that contains the encrypted value of j.

5.1.1 Address keys

The three public keys are constructed as:

  • K1j = Ks + kuj U + kxj X + kgj G
  • D2j = daj Dfr
  • D3j = daj Dua

The private keys kuj, kxj, kgj and daj are derived as follows:

Keys Name Derivation
kuj spend key extensions kuj = KeyDerive2(sga, "jamtis_spendkey_extension_u" || j)
kxj spend key extensions kxj = KeyDerive2(sga, "jamtis_spendkey_extension_x" || j)
kgj spend key extensions kgj = KeyDerive2(sga, "jamtis_spendkey_extension_g" || j)
daj address keys daj = KeyDerive1(sga, "jamtis_address_privkey" || j)

5.1.2 Address tag

Each address additionally includes an 18-byte tag tj = (j', hj'), which consists of the encrypted value of j:

  • j' = BlockEnc(sct, j)

and a 2-byte "tag hint", which can be used to quickly recognize owned addresses:

  • hj' = H2(sct, "jamtis_address_tag_hint" || j')

5.2 Sending to an address

TODO

5.3 Receiving an output

TODO

5.4 Change and self-spends

TODO

5.5 Transaction size

Jamtis has a small impact on transaction size.

5.5.1 Transactions with 2 outputs

The size of 2-output transactions is increased by 28 bytes. The encrypted payment ID is removed, but the transaction needs two encrypted address tags t~ (one for the recipient and one for the change). Both outputs can use the same value of De.

5.5.2 Transactions with 3 or more outputs

Since there are no "main" addresses anymore, the TX_EXTRA_TAG_PUBKEY field can be removed from transactions with 3 or more outputs.

Instead, all transactions with 3 or more outputs will require one 50-byte tuple (De, t~) per output.

6. Address encoding

6.1 Address structure

An address has the following overall structure:

Field Size (bits) Description
Header 30* human-readable address header (§ 6.2)
K1 256 address key 1
D2 255 address key 2
D3 255 address key 3
t 144 address tag
Checksum 40* (§ 6.3)

* The header and the checksum are already in base32 format

6.2 Address header

The address starts with a human-readable header, which has the following format consisting of 6 alphanumeric characters:

"xmra" <version char> <network type char>

Unlike the rest of the address, the header is never encoded and is the same for both the binary and textual representations. The string is not null terminated.

The software decoding an address shall abort if the first 4 bytes are not 0x78 0x6d 0x72 0x61 ("xmra").

The "xmra" prefix serves as a disambiguation from legacy addresses that start with "4" or "8". Additionally, base58 strings that start with the character x are invalid due to overflow [16], so legacy Monero software can never accidentally decode a Jamtis address.

6.2.1 Version character

The version character is "1". The software decoding an address shall abort if a different character is encountered.

6.2.2 Network type

network char network type
"t" testnet
"s" stagenet
"m" mainnet

The software decoding an address shall abort if an invalid network character is encountered.

6.3 Checksum

The purpose of the checksum is to detect accidental corruption of the address. The checksum consists of 8 characters and is calculated with a cyclic code over GF(32) using the polynomial:

x8 + 3x7 + 11x6 + 18x5 + 5x4 + 25x3 + 21x2 + 12x + 1

The checksum can detect all errors affecting 5 or fewer characters. Arbitrary corruption of the address has a chance of less than 1 in 1012 of not being detected. The reference code how to calculate the checksum is in Appendix A.

6.4 Binary-to-text encoding

An address can be encoded into a string as follows:

address_string = header + base32(data) + checksum

where header is the 6-character human-readable header string (already in base32), data refers to the address tuple (K1, D2, D3, t), encoded in 910 bits, and the checksum is the 8-character checksum (already in base32). The total length of the encoded address 196 characters (=6+182+8).

6.4.1 QR Codes

While the canonical form of an address is lower case, when encoding an address into a QR code, the address should be converted to upper case to take advantage of the more efficient alphanumeric encoding mode.

6.5 Recipient authentication

TODO

7. Test vectors

TODO

References

  1. https://github.com/UkoeHB/Seraphis
  2. https://github.com/monero-project/research-lab/blob/master/whitepaper/whitepaper.pdf
  3. monero-project/meta#299 (comment)
  4. https://www.getmonero.org/resources/user-guides/view_only.html
  5. https://web.getmonero.org/2019/10/18/subaddress-janus.html
  6. monero-project/monero#8138
  7. https://github.com/tevador/polyseed
  8. monero-project/monero#7889
  9. monero-project/research-lab#73
  10. https://eprint.iacr.org/2013/322.pdf
  11. https://cr.yp.to/ecdh/curve25519-20060209.pdf
  12. https://ed25519.cr.yp.to/ed25519-20110926.pdf
  13. https://www.schneier.com/wp-content/uploads/2016/02/paper-twofish-paper.pdf
  14. http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
  15. https://en.wikipedia.org/wiki/Universally_unique_identifier
  16. https://github.com/monero-project/monero/blob/319b831e65437f1c8e5ff4b4cb9be03f091f6fc6/src/common/base58.cpp#L157

Appendix A: Checksum

# Jamtis address checksum algorithm

# cyclic code based on the generator 3BI5PLC1
# can detect 5 errors up to the length of 994 characters
GEN=[0x1ae45cd581, 0x359aad8f02, 0x61754f9b24, 0xc2ba1bb368, 0xcd2623e3f0]

M = 0xffffffffff

def jamtis_polymod(data):
    c = 1
    for v in data:
        b = (c >> 35)
        c = ((c & 0x07ffffffff) << 5) ^ v
        for i in range(5):
            c ^= GEN[i] if ((b >> i) & 1) else 0
    return c

def jamtis_verify_checksum(data):
    return jamtis_polymod(data) == M

def jamtis_create_checksum(data):
    polymod = jamtis_polymod(data + [0,0,0,0,0,0,0,0]) ^ M
    return [(polymod >> 5 * (7 - i)) & 31 for i in range(8)]

# test/example

CHARSET = "xmrbase32cdfghijknpqtuwy01456789"

addr_test = (
    "xmra1mj0b1977bw3ympyh2yxd7hjymrw8crc9kin0dkm8d3"
    "wdu8jdhf3fkdpmgxfkbywbb9mdwkhkya4jtfn0d5h7s49bf"
    "yji1936w19tyf3906ypj09n64runqjrxwp6k2s3phxwm6wr"
    "b5c0b6c1ntrg2muge0cwdgnnr7u7bgknya9arksrj0re7wh")

addr_data = [CHARSET.find(x) for x in addr_test]
addr_enc = addr_data + jamtis_create_checksum(addr_data)
addr = "".join([CHARSET[x] for x in addr_enc])

print(addr)
print("len =", len(addr))
print("valid =", jamtis_verify_checksum(addr_enc))
@UkoeHB
Copy link

UkoeHB commented Jan 5, 2022

In the collaborative spend case with 0 change, how will I know when I spent that output without revealing my key image to the server?

The server will send you all key images in the ledger, and you will check it directly.

The 'tx id' in the <key image, tx id> table is for finding change/self-spend outputs.

@tevador
Copy link
Author

tevador commented Jan 5, 2022

In the collaborative spend case with 0 change, how will I know when I spent that output without revealing my key image to the server?

I think the only viable option that preserves privacy would be to download all key images since the wallet birthday. The list of transactions that reference any of the outputs with a matching view tag would probably include half of the blockchain (if we consider a ring size of 128), so that's not usable.

@j-berman
Copy link

j-berman commented Jan 5, 2022

Ok, assuming downloading the table is fine, I guess if light wallet clients really wanted to display additional metadata for a tx like timestamp, block, hash, they could either download an even larger <key image, < tx metadata > > table, or maybe could query the server with a ton of decoy tx id's, or just don't display that data?

Perhaps this is a tangent, but considering the design is going to this length to keep known key images from the server, I think it's worth discussing further how light wallets could be designed to avoid timing analysis that reveals key images to the server (e.g. user logs in to light wallet, retrieves their view tag matched outs, gets decoy outputs, gets network fee, submits a tx including decoys + view tag matched outs using server's suggested fee, server sees user received view tag match tx in pool). You guys are of the view that there is a way to keep the server from finding owned key images across all these steps? Even with membership proofs that span the entire chain, this seems challenging

@UkoeHB
Copy link

UkoeHB commented Jan 5, 2022

(e.g. user logs in to light wallet, retrieves their view tag matched outs, gets decoy outputs, gets network fee, submits a tx including decoys + view tag matched outs using server's suggested fee, server sees user received view tag match tx in pool). You guys are of the view that there is a way to keep the server from finding owned key images across all these steps? Even with membership proofs that span the entire chain, this seems challenging

  1. the server should be recommending an algorithmic fee - a fee that all txs submitted at this time should use
  2. not every light wallet login is to make a tx
  3. with large ring sizes, there will be a HUGE rate of false positives on view tag matches appearing in tx rings, even within a single block (maybe only txs with >2 inputs would be affected)

@j-berman
Copy link

j-berman commented Jan 6, 2022

  1. the server should be recommending an algorithmic fee - a fee that all txs submitted at this time should use

Maybe I'm totally off here, but won't the network accept tx's with a higher fee than the prevailing expected fee, or will that only be when blocks fill up? If I'm not off, the only way users would know that the server's recommended fee is correct is by checking their tx in the blockchain, right?

  1. not every light wallet login is to make a tx

Sure, but this doesn't matter if some pattern emerges of login -> get_unspent_outs -> submit tx that has received view tag outs. As currently implemented, the light wallet server offers an endpoint to submit tx's, and that's the default route light wallets use to submit AFAIK. This would mean it's fairly trivial to see when a user submits a tx after logging in.

  1. with large ring sizes, there will be a HUGE rate of false positives on view tag matches appearing in tx rings, even within a single block

Even if we assume membership proofs that span the whole chain: the server will see you logged in, took steps to construct a tx, then you got a view tag matched output.

It's a pattern that seems very hard to avoid leaking some useful data that could be used to guess owned key images. But perhaps some is still better than giving up entirely and leaking the key images.

@UkoeHB
Copy link

UkoeHB commented Jan 6, 2022

If I'm not off, the only way users would know that the server's recommended fee is correct is by checking their tx in the blockchain, right?

The user could query a public node/set of public nodes to get an array of current network fees (the current minimum fee).

As currently implemented, the light wallet server offers an endpoint to submit tx's, and that's the default route light wallets use to submit AFAIK. This would mean it's fairly trivial to see when a user submits a tx after logging in.

This leaks which output is the change output (or self-spend). The server won't know when that change output is spent. A simple solution is to not submit txs through the light wallet server.

It's a pattern that seems very hard to avoid leaking some useful data that could be used to guess owned key images. But perhaps some is still better than giving up entirely and leaking the key images.

Yep this is basically the limit of what we can achieve for remote scanning.

In this case you could let the light wallet client store an entire ledger copy, and let change/self-spend output view tags be invisible to remote scanning. However, this is impractical and hamstrings the entire feature (since ALL users of remote scanning would need their own ledger copy to find change/self-spends).

@j-berman
Copy link

j-berman commented Jan 6, 2022

The user could query a public node/set of public nodes

Fair

This leaks which output is the change output (or self-spend). The server won't know when that change output is spent. A simple solution is to not submit txs through the light wallet server.

I think it's a pretty strong assumption the key images in the tx are yours here too. But, ya, not submitting through the light wallet server seems a reasonable solution (and extending this, not explicitly signalling to the server you're trying to construct a tx is probably the move too, like by not getting decoys from the server either, which is another trivial avenue for the current server implementation to identify real spends).

Yep this is basically the limit of what we can achieve for remote scanning.

I was thinking "is this limit futile", such that light clients should just query the server by key image when needed. But reasonable arguments, a fully functional light client/server design does seem it can be modeled better to gain tangible privacy improvements without revealing key images.

Downloading all key images does still feel like a hard sell for a light wallet though. More number crunching and feedback would be nice on that

@tevador
Copy link
Author

tevador commented Jan 6, 2022

I realized that downloading all key images can be avoided with a bloom filter. It has a configurable false positive rate, which can be adjusted to achieve optimal ratio of privacy vs download size.

@j-berman
Copy link

j-berman commented Jan 6, 2022

EDIT: realized flaw with this logic here

Ok thought on it more.

Technically when a light wallet submits a tx, the wallet is providing owned key images to whomever the tx is submitted to. The design described above is relying on the fact that by submitting a tx and revealing your owned key images to some other location that knows nothing else about you, the light wallet server would not be able to tell that the tx and its key images are yours.

With this in mind, why not just query public nodes (or wherever you're submitting tx's to) for your txid's by key images? The major difference I'm seeing is that this other location would see a larger cluster of key images (some of which are not yet in the chain), perhaps making it marginally easier to collect and fingerprint you. But it seems privacy problems that stem from this can just be avoided by divvying up key image queries across more places. So long as the design is relying on the fundamental assumption that whomever receives the tx submission knows nothing about you and cannot link you to the light wallet, it seems to have roughly equivalent privacy properties to just query whomever you'd submit tx's to to see if your key images are in the chain.

As an additional compromise: could query for key images with just matching first byte :)

EDIT: lol bloom filter works too. I've had this open for an hour didn't see that comment
EDIT2: cleaned it up

@j-berman
Copy link

j-berman commented Jan 6, 2022

Nevermind. Realized flaw in above logic. By revealing a key image not yet included in the chain, you reveal your output must be at least that old when it's ultimately included in the chain. This is a worse privacy implication.

Bloom filter seems more workable.

@UkoeHB
Copy link

UkoeHB commented Jan 6, 2022

I realized that downloading all key images can be avoided with a bloom filter. It has a configurable false positive rate, which can be adjusted to achieve optimal ratio of privacy vs download size.

@tevador Can you elaborate? How does a bloom filter allow you to match key images with tx ids, which is necessary in order to find change and self-spends?

Maybe the scanning service can return {key images, output matches} tuples for every tx in the chain that has an output match. This lets the user find all change/self-spends without needing the entire key image table (in collaborative funding there should not be any change/self-spend outputs, just in case the tx proposer doesn't contribute any inputs). Then, use a bloom filter to check for spends where there is no change/self-spend output.

Is the idea with a bloom filter that you send a small hash of the key image to the service, which returns a set of key images that match that hash? I feel like this opens some heuristics for identifying when an output is spent (and implicitly revealing change outputs). If a tx has all of these: bloom filter key image match, input references output view tag match or nominal-spendkey-duplicate (possible spend), output set includes a view tag match (change). Timing analysis can also increase heuristic strength there.

The bloom-filter request could A) exclude known-spent key images, B) include a large number of fake requests. However, (A) has two problems: 1) it doesn't hide key images that haven't yet been spent, 2) if there are two requests, one before and one after an output is spent, then the bloom-filter request will drop the key image hash of the key image that is spent. If the fake key image hashes change with every request, then the service can get the request intersection to identify which hashes are real (and look at the ones dropped from request-to-request to guess real-spends).

To fully obfuscate requests you would need:

  • for each real key image, a set of deterministic fake key images of semi-random size; this set will be repeated each time a request is made when the real key image has unknown spent-status
  • a random number of pure-random fake sets of semi-random size; these sets are always new with each request, and are meant to mask real key images that appear in one request and disappear in the next
  • a random number of deterministic fake sets of semi-random size; a random number of these are replaced with each request, to mask real key images that stick around for multiple requests
  • a random number of deterministic fake sets of semi-random size; all of these stay the same between requests, to mask real key images that stick around for a long long time

Or, just let light wallets download all key images.

@j-berman
Copy link

j-berman commented Jan 7, 2022

bloom filter that you send a small hash of the key image to the service

This is what I was thinking @tevador meant

I feel like this opens some heuristics for identifying when an output is spent

I don't think there is a way to fully obfuscate requests and no matter what, this route would offer an additional avenue for heuristic analysis to guess at owned key images, but those techniques above are improvements and perhaps could be improved further:

for each real key image, a set of deterministic fake key images of semi-random size; this set will be repeated each time a request is made when the real key image has unknown spent-status

You could continue making requests with the same set until there have been at least x (another deterministic random number) of all your requested fake + real key image sets in the chain after your real has been included. This way whoever is collecting your requests won't glean as much information from the time you stopped making the requests.

a random number of pure-random fake sets of semi-random size; these sets are always new with each request, and are meant to mask real key images that appear in one request and disappear in the next

a random number of deterministic fake sets of semi-random size; a random number of these are replaced with each request, to mask real key images that stick around for multiple requests

a random number of deterministic fake sets of semi-random size; all of these stay the same between requests, to mask real key images that stick around for a long long time

Not sure I'm understanding these 3, on average, wouldn't you expect to download half of all key images with these? Plus, in theory isn't it possible to never find your key image if you get mega unlucky?

EDIT: removed a bunch of stuff that I don't think helps advance the discussion

@tevador
Copy link
Author

tevador commented Jan 7, 2022

Yeah, bloom filters are pretty hard to use without leaking information. Just look at the failure of BIP-37.

There would generally be 2 options:

  1. The client sends bloom filters to the server. This is like BIP-37.
  2. The server sends bloom filters to the client.

In the first case, the light client must never send two different bloom filters that contain the same key image. So that would basically mean we'd need bloom filters with 1 item, AKA hashes. For each unspent key image, the light client would construct a tuple (key, hash), such that Siphash(key, key_image) == hash, where the size of hash short enough to give false positives. To avoid leaking the number of unspent inputs, there would also be decoy tuples. The client would randomly remove decoys and real spends from the list and add new decoys and new inputs. This methods definitely leaks some information and it's questionable whether it would actually save bandwidth.

In the second case, the server could compile a bloom filter containing key images spent in some block range and the client would request the key images from the block range if a match is found in the bloom filter. This also leaks some information about spends. Tf there are too many false positives, the client might end up downloading more data than the size of all key images.

It seems that the simplest method that doesn't leak any information would be to just download all key images starting with the wallet birthday.

This problem is the basic reason I was uncertain about Janus B vs Janus C.

Janus C would be a big mistake. Revealing spends to wallet scanning services weakens the ring signatures of the whole network.

@tevador
Copy link
Author

tevador commented Jan 9, 2022

Major changes to address encoding:

  1. I decided to remove the invoice functionality from this specification. Monero invoices will be specified separately and will be independent of addresses. I think this should alleviate some of the concerns regarding the complexity of this proposal.
  2. The various signature types have been removed. Now there are just two types: anonymous addresses and certified addresses (certified addresses have two subtypes depending on the wallet tier).
  3. The address header is human-readable. The prefix is xmr1 for version 1 XMR address, then t/s/m for the network type and a/c for anonymous or certified address. The total length of the prefix is 6 characters.
  4. I decided to use base32 instead of base58. Base32 allows for smaller QR codes and enables the use of the much more efficient finite-field based checksums. The new checksum will be 8 characters and based on a BCH code similar to bech32 Bitcoin addresses (but stronger). I think base32 is overall a superior encoding and it's worth the small increase in address length (168 vs 147 characters). Also switching to base32 now is better than doing it later because then we'd have to support two different encodings like Bitcoin.

Example address: xmr1ma2qgc3yrp5yp99b6qdy9dstx7wqdyy3g9skore7wb1k1xbtt34mp48h5gmh1y13pispmqco3kujgs7cdpax2r4h5bwjgnbd25bf8x1f4detdeyfrh449upohy28kwroyxdunwxp5ufw9e2btuo5k2mfr5rd3hkbxj3h

@rbrunner7
Copy link

I really like the human-readable address header.

@UkoeHB
Copy link

UkoeHB commented Jan 10, 2022

Adding a comment about @tevador's 'encrypted address index' (maybe enc_addr_index) from here.

The tuple (i,j) would be encrypted twice: once in the address (using a block cipher), and second time by the sender of the tx (using a stream cipher). The recipient would decrypt (i,j) and check if K_s+(q+H(k_vb,i,j))G matches the tx key (the same number of EC ops). No need to cache any addresses in advance.

Let index i be block-ciphered with k_ac, and index j be block-ciphered with k^i_a (each one has its own 1-byte MAC). This way k^i_a can generate new addresses by appending encrypted js to an encrypted i received from the base account creator. Moreover, k^i_a holders won't be able to open the i index, so they can't learn anything extra about outputs received to other accounts (especially change/self-spend outputs).

During scanning, a k^i_a holder will just assume that the i decrypts to their i. A k_ac holder can decrypt both indices.

struct addr_index
{
  uint32_t i;
  uint32_t j;
};

struct enc_index
{
  uint32_t index_enc;
  uint8_t index_MAC;
}

struct enc_addr_index
{
  enc_index i_enc;
  enc_index j_enc;
};

using enc_addr_index_pack = unsigned char[80];

void encrypt_index(
    const uint32_t index,
    const crypto::secret_key &encryption_key,
    enc_index &encrypted_index_out
  )
{
  encrypted_index_out = block_cipher_enc(index, encryption_key);
}

bool decrypt_index(
    const enc_index encrypted_index,
    const crypto::secret_key &encryption_key,
    uint32_t &decrypted_index_out
  )
{
  enc_index temp_dec_index;
  temp_dec_index = block_cipher_dec(encrypted_index.index_enc, encryption_key);

  if (temp_dec_index.index_MAC == encrypted_index.index_MAC)
  {
    decrypted_index_out = temp_dec_index.index_enc;
    return true;
  }

  return false;
}

void encrypt_addr_index(
    const addr_index &index,
    const crypto::secret_key &k_ac,
    const crypto::secret_key &k_a_i,
    enc_addr_index &encrypted_addr_index_out
  )
{
  enc_addr_index enc_index;
  encrypt_index(index.i, k_ac, encrypted_addr_index_out.i_enc);
  encrypt_index(index.j, k_a_i, encrypted_addr_index_out.j_enc);
}

enc_addr_index encrypted_addr_index;
enc_addr_index_pack encrypted_addr_index_pack = stream_cipher(encrypted_addr_index, q);  //q is the sender-receiver shared secret

@tevador
Copy link
Author

tevador commented Jan 10, 2022

Your implementation has one big issue: it leaks that two addresses belong to the same account (because their encrypted_addr_index_out.i_enc will be the same). With a high probability, that also means they belong to the same wallet, making addresses linkable.

I was originally thinking to use a 64-bit block cipher like Blowfish to encrypt (i,j) at the same time, but unfortunately, that's not compatible with the accout-level access tiers.

I don't know if there is a solution for both problems.

@UkoeHB
Copy link

UkoeHB commented Jan 10, 2022

Ah you're right... encrypted indices are not compatible with account-level wallet authority (ViewReceivedAccnt) without leaking output:index mappings for other accounts.

@UkoeHB
Copy link

UkoeHB commented Jan 10, 2022

Ok, what about just encrypting with k^i_a, cache a number of account keys, and try-decrypt with all of them? I suppose, depending on the cipher cost, this could have performance impacts.

@tevador
Copy link
Author

tevador commented Jan 10, 2022

Here is another idea. It's not as good as directly decrypting (i,j), but at least it doesn't require any additional address data and blockchain space.

It would require three small changes:

  1. Define new account-level spend key Ksi = Ks + DeriveKey(kac, "account key extension" || i) X
  2. Redefine address spend key extensions as kxi,j = (kai * (j + 1))-1 mod ℓ
  3. Redefine address spend keys as K1i,j = Ksi + kxi,j X

Consequences:

  1. Account-level tiers can still generate addresses if provided with the public key Ksi instead of Ks.
  2. Addresses from the same account are still unlinkable because there is no simple linear relationship between them. However, someone who knows the public key Ksi can link two addresses from the account i with a work factor of less than 264 (by finding indices m and m' such that (m+1) (K1-Ksi) == (m'+1) (K1'-Ksi). The keys Ksi should be kept secret.
    1. Wallet tier "FindReceivedSimple" does not know Ksi, for any i, so it cannot link blockchain outputs to the wallet this way.
    2. Wallet tiers "AddrGenAccnt" and "ViewReceivedAccnt" know Ksi, but this does not give them any extra abilities. They can still only derive addresses for the account i and cannot recognize spend keys for other accounts.
    3. Higher wallet tiers can already derive all addresses.
  3. Output recognition is somewhat simplified, assuming the number of accounts in use is much smaller than the number of subaddresses used for each account.
    1. For each blockchain output with a matching view tag, derive the nominal spend key K1'
    2. For each account i in use:
      1. Calculate Kj' = kai (K1'-Ksi)
      2. Query a database to get a 32-bit index j such that Kj' = (j + 1)-1 X. This database is the same for everyone, so there could be third party services providing this lookup or even a public node RPC method. It would require roughly a 32 GB database and can be optimized using a ~few GB bloom filter to discard values Kj' that are definitely not in the database. If no database or 3rd party service is available, j can still be calculated from Kj' using the Baby-step giant-step algorithm, which would require about 217 point additions and about 1 MB of memory.

Of course, the normal hash-table based scanning would still work.

Edit: a better definition of kxi,j

@UkoeHB
Copy link

UkoeHB commented Jan 10, 2022

We should compare cost vs benefit. Encrypting (i,j) with k_a^i is more useful, but still requires some brute forcing and has a much higher cost (8 bytes per address, 8 bytes per blockchain output). It could probably be reduced to 5 bytes (encrypt just j + MAC) [@tevador from IRC]

With a 2-byte MAC I think the amortized cost of brute forcing is just ciphering the encrypted indices of view tag matches with each account key (only 1/2^24 outputs unowned by an account would leak through to that account's final nominal spend key check). Most users will have a very small number of accounts, and the API can even require that the accounts to search be user-defined (a reasonable constraint imo). It might be helpful to know ballpark costs for ciphering...

The goal here is that a user can reliably identify all of an account's outputs, even if those outputs are received to an unknown address index j. With index-encrypting, the user-cost is ciphering all view tag matches (with very minor expense to test nominal spend keys when the cipher MAC passes). With a j-database, the user-cost is either storing a huge table or connecting to a third-party (which is very hard to implement in constant time with no/minimal privacy losses). It's possible that additional requests to a third-party to consult a j-table would be slower than ciphering indices (especially for small updates).

@tevador
Copy link
Author

tevador commented Jan 11, 2022

Assuming we will encrypt (i,j) with kai, does it make sense to have 4.3 billion different keys? Even if there was a legitimate use case of having billions of accounts under the same wallet, it would be unusable because checking for output ownership would be too slow even with the fastest encryption algorithms.

What about a 3-level heirarchy?

Wallet -> 65536 branches
Branch -> 65536 accounts
Account -> 232 addresses

There would be a different private key for each branch and all addresses under that branch would be generated from that key. The wallet tier "AddrGenAccnt" would become "AddrGenBranch" or something similar.

The 216 key space should still be large enough to allow for replacements in case some keys are compromised, which was the main idea behind the account-level tiers.

Checking all 216 branch keys would be feasible (roughly 5 ms per output according to benchmarks by @UkoeHB, fast enough to scan a day's worth of outputs in 1-2 seconds). The (i,j) tuple would become (branch, accnt, addr) tuple. When decrypting with a specific branch key, the branch index would become a 16-bit MAC, which seems more reasonable than a 32-bit MAC.

This change would be backwards compatible, just the existing account indices would be grouped under branches. Branch 0 could be transparent in the wallet for better UX (most users will probably not use more than 65K accounts).

Edit: for backwards compatibility, it would still be an (i,j) tuple and the 16-bit "division" index calculated from i would be just for the private key derivation and otherwise hidden from users.

@UkoeHB
Copy link

UkoeHB commented Jan 11, 2022

Summary: approaches to output->address mapping

When view-scanning an output, first you compute a nominal view tag. If that tag matches the output's view tag, then you compute a 'nominal spend key' (K^{i,j}_1_nominal in jamtis). If the nominal spend key matches one of your wallet's actual spend keys, i.e. the spend key of address {i,j} (K^{i,j}_1), then address {i,j} probably owns that output ('probably' because if the amount is malformed then the output is unspendable).

After recent discussion, there are now three methods for figuring out if a nominal spend key corresponds to one of your addresses.

Method 1: address look-up table

The method currently used in Monero is for view-scanning wallets to pre-compute a large number of subaddresses (note: in jamtis, the term 'subaddresses' is deprecated in favor of 'addresses'). To check a nominal spend key, the wallet just looks it up in the subaddress table. If there are no matches, then the wallet assumes the output is not owned by the wallet.

Pros

  • conceptually simple
  • zero-cost lookups once the subaddress table is set up

Cons

  • If an output is owned by the wallet, but the owning subaddress isn't in the lookup table, then the output won't be found. This leads to hacky heuristics around setting and increasing the table size.
  • Setting up the subaddress table is expensive (elliptic curve operations are needed to create each subaddress), and depending on the table's size it may take up a large amount of storage. These storage constraints especially impact third-party view scanners (however would not be an issue for third party scanners with jamtis, since mapping outputs to addresses would no longer be the expected responsibility of third party scanners).

Method 2: embed address index in spend key

@tevador proposed embedding an address's index in its spend key. Technically there are two indices: i (the account index), and j (the address index). In the proposal, j is embedded explicitly and i is embedded implicitly.

Assuming a user wants to check if an output is owned by one of their accounts at an address not in a Method 1 lookup table (either they don't have such a lookup table, or they need to do a special check on an output), they will perform some elliptic curve operations on the output's nominal spend key K^{i,j}_1_nominal. For each account i they want to check, they compute K_j_nominal = k^i_a * (K^{i,j}_1_nominal - K^i_s). The value K_j_nominal can be looked up in a static lookup table which contains all possible values of K_j (there are 2^32 valid values, which are the same for all users). If lookup succeeds, then the table lookup will return j.

Pros

  • Allows a user to reliably identify outputs owned by any address in an account.
  • The K_j lookup table is static, so a third-party could allow lookups for an arbitrary number of users.
  • Addresses and outputs would stay the same size (number of bytes).

Cons

  • If this method is used to map all outputs to addresses, it would be computationally expensive for clients of a third-party scanning service. Computing K_j_nominal is similar in cost to computing nominal spend keys. This isn't relatively expensive for a normal scanner, because most of scanning cost (~99%) is computing view tags. However, if view tag scanning is done by a third party (with FindReceivedSimple), then in Method 1 the client (e.g. a ViewAll wallet) doesn't have to do extra work, but in this method they have to compute K_j_nominal.
  • Anyone using this method (even to figure out just a small number of output->address mappings) would need a very large K_j lookup table (~8-32 GB). Alternatively, they could use baby-step-giant-step to derive j from each K_j directly, which would cost ~20 ms per output according to @tevador (20 seconds for 1k outputs).
  • Implementing a privacy-conscious third-party lookup system for the K_j lookup table is non-trivial. Moreover, K_j lookups can only occur after computing K_j, so a client using a third party K_j looker-upper would need to wait for those requests to succeed before finalizing their balance.
  • Account keys k^i_a must be pre-computed.

Method 3: encrypted account tags

In this method, the index pair {i,j} for an address would be encrypted with k^i_a using a cheap 64-bit block cipher (e.g. Blowfish) and attached to the address. These encrypted indices e_accnt_indices would be further encrypted with the same cipher by a tx author when sending funds to that address, using the output shared secret q (change/self-spends would encrypt {i,j} directly with a hash of k_vb and sender-receiver ephemeral key K_e). The resulting 8-byte encrypted account tag e_accnt_tag would be added to the output.

To identify whether an output is owned, the user will decrypt the account index with q, then with and a set of account keys {k^i_a}. When one of the account keys k^i_a computes an i_nominal == i, then the user further tests if the output is owned by that account (i.e. owned by address {i,j}_nominal).

The view-scanning workflow looks like this:

  1. Use k_fr to compute and check the view tag. If this fails, abort.
  2. Use k_fr to compute the nominal shared secret q_nominal.
  3. Use q_nominal to decrypt e_account_tag and get e_accnt_indices_nominal, and compute the nominal spend key K^{i,j}_1_nominal.
    1. Note: Since e_account_tag is always paired with K^{i,j}_1, but can't be further decrypted with k_fr, a FindReceivedSimple service won't learn anything about the indices {i,j}.
    2. Note2: Since change/self-spends don't use q directly, e_accnt_indices_nominal will be garbage for change/self-spends. Only ViewAll wallets will be able to discern any information from e_account_tag for change/self-spends.
  4. Use a set of account keys k^i_a where the user believes there may be funds to compute {i,j}_nominal.
    1. For each key k^i_a where i == i_nominal (this is effectively a MAC test on decryption), compute the spend key K^{i,j}_1 and test K^{i,j}_1 ?= K^{i,j}_1_nominal. If the test succeeds, then the address {i,j} owns the output.
      1. Note1: For {i,j}s that already have funds, the user can just look up the spend key instead of computing it. This optimization helps users who receive many outputs to a small set of addresses.
      2. Note2: A MAC consisting of padding 0s would also work in this case, but the value i_nominal is needed by change/self-spends, so for consistency it is used here as well.
  5. If the tx where this output was found contains one of the wallet's key images, then...
    1. Use k_vb and K_e to compute {i,j}_nominal directly from e_account_tag.
    2. Use k_vb and {i,j}_nominal to compute the spend key K^{i,j}_1.
    3. Use k_vb and K_e to compute q_nominal_change and q_nominal_self_spend, then compute nominal spend keys K^{i,j}_1_nominal with both of those q_... values. Test K^{i,j}_1 ?= K^{i,j}_1_nominal for both of them. If either test succeeds, then the output is owned by the address {i,j} (and is either a change/self-spend depending which test succeeded). (in practice the change variant would be fully tested first, since change outputs are far more common than self-spends)
  6. Use k^i_a, {i,j}, and q to decrypt the output amount, and to check that the amount commitment can be reconstructed (only q is needed here for change/self-spends). If reconstruction fails, abort.
  7. Use k_vb, {i,j}, and q to compute the output's key image. Check the ledger to see if the output has been spent.

Pros

  • Allows a user to reliably identify outputs owned by any address in an account.
  • While the decision-tree for output scanning appears quite complex, it is highly optimized.
    • Decrypting 8 bytes with a block cipher like Blowfish is extremely fast (~100ns on my machine). Even initializing the cipher context is very fast (~40us on my machine), and a single cipher context can be reused for decrypting unlimited encrypted account tags (one cipher context per account key k^i_a would be needed).
    • For normal outputs, the MAC-like test i ?= i_nominal ensures a very very low rate of false positives after decrypting a e_accnt_indices_nominal. This means a user only very rarely has to compute the spend key K^{i,j}_1 when encountering an unowned output.
    • For change/self-spends, a user only examines outputs that A) are found in txs that contain key images from their wallet, B) pass the view-tag test. Again, this means only very rarely will the spend key K^{i,j}_1 be computed pointlessly (also, computing the hash to get the key for decrypting {i,j}_nominal and setting up a Blowfish context would only be done rarely).
    • Therefore, third-party-based view scanning (FindReceivedSimple) is the same time cost for clients (e.g. ViewReceivedAccnt/ViewReceivedAll/ViewAll) as in Method 1 (practically instantaneous aside from network requests, even for huge numbers of outputs).
  • No pre-computed tables (aside from k^i_a for each active account).

Cons

  • Public address strings and outputs would be 8 bytes larger (~5% and ~8-9% respectively; for a 2-in/2-out tx it would be ~0.5%).
  • Account keys k^i_a must be precomputed for non-change/non-self-spends.
    • This restriction ensures that a ViewReceivedAccnt wallet can't decrypt e_accnt_indices_nominal and learn about output:index mappings for outputs owned by other accounts in the same master wallet.

Addendum: changing the indexing strategy

Currently, i and j are both 32 bits (~4.3 billion). This amount is excessive for accounts, but limiting for addresses (e.g. a merchant might generate many addresses that map to product IDs, or need a non-incrementing ID strategy that would benefit from more bits).

@tevador has floated the idea to change those index sizes from i=32|j=32 bits to i=16|j=48 bits.

Pros

  • Account/address indexing would be more practical and each bit would be relatively more valuable.

Cons

  • Changing the indexing strategy would not be backward compatible. If an old account has an index >= 2^16, then it would not be able to generate jamtis addresses. Users with funds in such accounts would have to sweep (either singly or all-at-once) their funds into a permitted account index (or use custom wallet software to deal with address-generation and output-handling).
    • It is unlikely that more than a handful of users (if that) have accounts like this.

@tevador
Copy link
Author

tevador commented Jan 12, 2022

Another thing to consider would be the removal of the account-level wallet tiers.

Pros

  • Reduced complexity (only 5 wallet tiers instead of 7)
  • Simpler and shorter certified addresses (only one signature is needed)
  • No need for separate (i,j) indices in this specification. Addresses could be identified by one 64-bit* number and it would be up to the wallet how that address space is divided.
  • "Method 3" output recognition would be greatly simplified. Address tags would have to be decrypted only with one of two keys: kac for public addresses or kvb for private addresses (change/self-spends).

Cons

  • The address-generating wallet tier would require the global key kac. In case the key is compromised, the whole wallet would have to be replaced.

* In this case, I would recommend to reduce the address space from 64 bits to 56 bits to allow for an 8-bit MAC. A 56 bit address space still allows over 7.2 x 1016 addresses per wallet, which is more than enough for any imaginable use case. This would only affect legacy wallets that have used account indices greater than ~16.7 million, which is impossible in the official wallets because of the lookup table size. Another option would be to reduce the account index to 16 bits and expand the address index to 40 bits.

@tevador
Copy link
Author

tevador commented Jan 16, 2022

I updated the specification based on the January 12th Monero Research Lab meeting.

The most important changes are:

  • Encrypted address tags have been added to addresses and outputs. Addresses are now slightly longer at 181 characters.
  • Account-specific wallet tiers have been discontinued.
  • Address generation uses two auxiliary keys kid and ket, which are used for elliptic curve operations and symmetric encryption, respectively. Using independent keys for different algorithms is a standard security practice that reduces the attack surface.
  • Updated output construction and detection procedures to use address tags.
  • Added a new self-spend detection method in 5.4.1.
  • Self-spends and change outputs are differentiated using one MAC bit of the encrypted address tag.

@UkoeHB
Copy link

UkoeHB commented Jan 17, 2022

KeyDerive(k, x) = Hs(Pad136(k) || x) is a function to derive elliptic curve private keys.
SecretDerive(k, x) = H(Pad136(k) || x) is a function to derive secret keys for symmetric cryptography.

Where does Pad_136(k) come from? Are these supposed to be H_s(k || Pad32(x))?

@tevador
Copy link
Author

tevador commented Jan 17, 2022

136 bytes = 1088 bits is the bitrate of keccak-256. This padding causes the key to be processed alone by the keccak permutation before any other data. It's a security measure, although it might not be strictly needed in our case since none of the following input can be controlled by an attacker.

It may have a slight security advantage because the key kga does not need to be in memory in order to generate addresses. You just need to cache the intermediate 200-byte keccak state after the first permutation.

@j-berman
Copy link

j-berman commented Jan 26, 2022

I'm growing to feel a lotta love for this. I'm just a fledgling cryptographer, but it seems very elegant to me. I also understand why you'd like to include everything all at once and it's as comprehensive as it is. We leave room e.g. in the headers for cool new stuff. I hope more people take the time to read through this carefully.


Some suggested touch-ups on section 2:

This reduces the amount of data that a light wallet has to download by a factor of at least 256.

This tier doesn't reduce data a light wallet has to download. Light wallets currently only download outputs sent to the user, and the transactions of outputs plausibly spent by the user.

With this change, light wallets will need to download 0.4% of all outputs in the chain, and the transactions (or just key images) of outputs plausibly spent by the user (which will likely just be the entire chain).

The third party will not learn which outputs actually belong to the wallet

I don't think this claim can be relied on. 1, if a user re-uses an address, then the third party will learn that those outputs received to the same address belong to a user. 2, timing analysis based on when the client visits the app and submits a tx that has view tag matched outputs leaks metadata in an unavoidable way that can lead to the third party learning which outputs belong to the user.

I think the claim should be weakened to reflect the above reality. Maybe something like: "The third party will not learn output amounts, and the outputs the user received will be obfuscated. Light wallet users should avoid reusing addresses so the third party does not learn received outputs."

and typically consist of 181 characters

Would be nice to include an old address and its character length here for clearer comparison.

Jamtis prevents this attack by allowing the recipient to recognize a Janus output.

Would be nice to include size impact here, or link to this section.

Jamtis addresses and outputs contain an encrypted address tag

Same here -- size impacts would be nice. Just "8 bytes each" seems it would suffice.


This [FindReceived] tier should provide a noticeable UX improvement with a limited impact on privacy.

I disagree with this framing, unless you mean to suggest that light wallets will replace full scanning wallets, which I would disagree with. Rather they will have an improved impact on privacy, and a limited UX impact.

1, we can have an always-on component with just view keys today so wallets only need received outs (source
2, like mentioned above, third party scanners will need to download more, not less


For better UX when opening or restoring a wallet, the wallet is identified by HashIdent("Monero RID" || Kid)

This is cool. You mean like when I open up my wallet app, I see this RID instead of the primary address today, something like that.

An address is identified by a 64-bit integer j with the most significant 8 bits set to zero.

Can you provide a bit more color here why the most significant 8 bits should be set to zero? I'm not seeing why (EDIT: guessing leaving room for the 8 bit MAC discussed above but I'm not seeing where that is.. I'm a little tired)

Wallet software MUST NOT generate addresses with j >= 2^56.

Just curious, why not? What happens if they do (aside from the receiver aborting and not seeing it), and is this something new? EDIT: I see it would overflow, but just trying to understand why this number in particular

If j' >= 2^57, abort.

2^57 is a typo, right?


On certified addresses. I bring this up as a pre-caution. I get how you make "c" very much so different from "a" (anonymous), but identity is sort of opposite to privacy in my view. I think it is actually a good thing that I get plausible deniability on a payment receipt. It means third parties cannot with certainty determine how much I've received, either as an individual or business, if I'm generating new addresses across merchants. The whole crux of this feature is that it is more beneficial to systems that rely on identity, rather than anonymity which is Monero's goal.

As a more practical example, what if some exchanges start requiring "c" withdrawals? It would ruin the whole point of subaddresses, and harm people who might not realize that even though they're generating new addresses, they're leaking their identity.

The person who I am transacting with does not need to know my identity. If someone claims I did not send them money, I gain defense on my claim because there is an identity tied to their address. I would hope that individuals don't start using this. Maybe huge businesses. But maybe we can just leave this feature to PGP (edit: or open alias), and keep it outside of the Monero address protocol? I don't know. Sorry if this is fud-like.

EDIT: really the only debate over this I guess is whether or not we need the address type header because "c" can be implemented and/or decided upon later, which is a super insignificant thing to debate over in the grand scheme of JAMTIS. I hope this doesn't turn into a distraction from the rest of the scheme. Considering people do seem relatively favorable toward the certified address scheme and I'm the only one with concerns from what I can see, it probably does make sense to keep the address type header so the option is there.


Apologies for not much feedback on the cryptography. I will be studying it further, but generally I lean favorable toward all the ramifications and tradeoffs made (I only feel uneasy about certified addresses).

@Tigerix
Copy link

Tigerix commented Jan 26, 2022

Is the RID still up for discussions?
I am coming from a user perspective and would really love to see human readable words.
It makes communication of the RID so much easier, faster and less prone to errors when reading them on the phone.
And its less effort comparing words instead of characters.

Thus I am strongly voting for this:
"It would be much more human readable if the RID was, say, 4 words from a 2048-word dictionary (like BIP39), so that instead of the difficult-to-read h8eug-w77qs-aaf7m-ww63i-hn33c you would get correct-horse-battery-coffee"

@UkoeHB
Copy link

UkoeHB commented Jan 26, 2022

Wallet software MUST NOT generate addresses with j >= 2^56.

Just curious, why not? What happens if they do (aside from the receiver aborting and not seeing it), and is this something new? EDIT: I see it would overflow, but just trying to understand why this number in particular

If j' >= 2^57, abort.

2^57 is a typo, right?

There are 56 bits for the address index, with 8 bits for a MAC. For normal outputs the MAC is 0x00, for self-send outputs the MAC is either 0x00 (change) or 0x01 (self-spend). Saying 2^57 is just for the case with 0x01.

In practice I am byte-pasting the decrypted tag into separate variables (index and mac), so '> 2^57' isn't really accurate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment