The User Experience of Transaction Confirmation
https://news.ycombinator.com/item?id=35845540
Why the web is friendly and web3 isn't.
Background
Operations in an application generally work like this:

![[Decentralized Consent 2023-05-01 10.04.07.excalidraw|600]]

The user interacts with the digital world through a user agent[^1]. The user agent is typically a web browser, but could also be a native app. In the decentralized web the user agent also has a wallet. The wallet is embedded in, runs alongside, or replaces the web browser. The wallet is necessary to add support for certain cryptographic operations and network protocols.

[^1]: The term user agent is imho underutilized. Official definitions can be found in RFC 9110 and W3C, but the term predates HTTP and likely the internet itself. It refers broadly to any software the user uses to generate network requests and related utilities. This certainly covers wallets, including hardware wallets. I would argue that wallets are particularly strong examples.
A service is any external state that relates to the user. Examples are
An application is a user interface.
Web2
[!tldr]
- tightly integrated application and services,
- hierarchy of trust to authenticate,
- empowered vendors.
In web2 the vendor is in charge. It's the user-focused web. In the web2 world the user agent is a web browser and the application and service are built by the same vendor. The application communicates with the user agent using the extensive and well-known web standards. Should a third-party service be required, this can be implemented by
- linking directly to the application. This is trusted because the third-party application will interact directly with the user agent. Technically this is implemented using a session cookie.
- using OAuth to authorize the first-party service to interact directly with the third-party service on the user's behalf. Note that the direct link method above is used to set up the authorization. The first-party service often also needs to register with, and be approved by, the third-party service.
- using hacky work-around solutions like Plaid.

Authenticity of the application and services is achieved using public key infrastructure.
Web 2 trust model
The user trusts their browser to implement the standards correctly, and therefore to present the applications as intended by the vendor.
Web3
[!tldr]
- decoupling of application and services,
- removal of entry barriers, and
- empowered user (agents).
In web3 the user is in charge. This puts substantially more emphasis on the user agent. It's the user-centered web. In the web3 world a number of things are different:
- we do not want to trust the existing public key infrastructure.
  - It is centralized by design, see criticism and also the needless complexity of the TLS/X.509 protocols.
- anyone should be free to develop a new application on top of existing services without prior approval of the service.
- services and applications can be developed by anonymous vendors.
- User in control, and user as developer.
Consequently
- we cannot trust the application or the service.
References
- Jon Stokes (2022). The Billion User Table
In an ideal decentralized system, all these components can be permissionlessly replaced. Users are free to use a user agent of their choice, and developers are free to build new applications on top of existing services.
WorldID requires users to generate a proof using their identity key, an external nullifier and a signal. The resulting proof authorizes a particular action to be taken. These actions are potentially high value, and it is important that the user understands exactly what they are about to prove.
Phishing. Given a high-value app ‘ImportantApp’, an attacker could phish by creating a ‘FunnyApp’ that uses the same external nullifier. The user would be tricked into generating a proof for FunnyApp that ostensibly authorizes an innocent, desirable action. However, due to the setup this proof also authorizes a malicious action in ImportantApp. The attacker can have the user send the proof directly to ImportantApp, or have it sent to FunnyApp and then replay it in ImportantApp.
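The replay can be illustrated with a toy model. The sketch below is not the WorldID proof system: an HMAC stands in for the zero-knowledge proof, and `make_proof`, `verify`, the key and the nullifier strings are all hypothetical. It only shows that a proof bound to nothing but the external nullifier and the signal verifies equally well in any app that shares that nullifier.

```python
import hashlib
import hmac

# Toy stand-in for a proof. A real WorldID proof is a zero-knowledge proof,
# not an HMAC; every name here is hypothetical and for illustration only.
def make_proof(identity_key: bytes, external_nullifier: bytes, signal: bytes) -> bytes:
    payload = hashlib.sha256(external_nullifier).digest() + hashlib.sha256(signal).digest()
    return hmac.new(identity_key, payload, hashlib.sha256).digest()

def verify(identity_key: bytes, external_nullifier: bytes, signal: bytes, proof: bytes) -> bool:
    # A real verifier would not hold the identity key; this toy one does.
    expected = make_proof(identity_key, external_nullifier, signal)
    return hmac.compare_digest(expected, proof)

identity_key = b"user-identity-key"
signal = b"recipient=attacker"

# FunnyApp reuses ImportantApp's external nullifier.
shared_nullifier = b"important-app-action"
proof_made_in_funny_app = make_proof(identity_key, shared_nullifier, signal)

# ImportantApp accepts the very same proof: the replay succeeds.
replay_accepted = verify(identity_key, shared_nullifier, signal, proof_made_in_funny_app)

# If FunnyApp were forced to use its own nullifier, the replay would fail.
proof_distinct = make_proof(identity_key, b"funny-app-action", signal)
replay_with_distinct = verify(identity_key, b"important-app-action", signal, proof_distinct)
```

Nothing in the proof itself tells the user which application it is for, which is why the wallet has to show that information before the proof is generated.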
Analogy with Signing. This is basically the `eth_sign` problem as discussed in EIP-712, but with identity key, external nullifier, signal and proving substituted for private key, domain separator, message and signing respectively.
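To make the analogy concrete, here is a minimal sketch of domain separation in the shape EIP-712 uses, where the signed digest is a hash over a fixed prefix, the domain separator and the hash of the message. It assumes SHA-256 as a stand-in because keccak256 is not in Python's standard library, and the way the domain separators are derived here (hashing an app name) is a simplification, not the EIP-712 `hashStruct` encoding.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in hash: EIP-712 uses keccak256, which Python's stdlib lacks.
    return hashlib.sha256(data).digest()

def domain_separated_digest(domain_separator: bytes, message: bytes) -> bytes:
    # Same shape as EIP-712: prefix || domain separator || hash of the message.
    return h(b"\x19\x01" + domain_separator + h(message))

def bare_digest(message: bytes) -> bytes:
    # No domain separation: nothing ties the digest to an application.
    return h(message)

message = b"transfer 1 token to 0xabc"
funny_domain = h(b"FunnyApp")          # hypothetical, simplified separators
important_domain = h(b"ImportantApp")

# Without a domain, both apps would ask the user to sign the same digest.
same_without_domain = bare_digest(message) == bare_digest(message)

# With a domain, a signature made for one app does not apply to the other.
differs_with_domain = (
    domain_separated_digest(funny_domain, message)
    != domain_separated_digest(important_domain, message)
)
```

In the WorldID setting the external nullifier plays the role of the domain separator, so the same separation only holds if two apps cannot share one.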
Requirements
- Human readable. Making it machine readable can be done by making the domain separator commit to a schema for the message. This can get us from a binary string to structured data. Structured data (think JSON) is not a format most users can grok.
- Internationalization.
- App implementer freedom of implementation (potentially updatable)
- Multi media.
- Support web wallets, mobile wallets and hardware wallets.
  - Mobile wallets can display interactive UI elements, have graphics, links, etc.
  - Web wallets
  - Ledger Nano can only show a short single-line ASCII string.
- Implementer freedom to optimize user experience.
Ideas
We need a function like `(context, message: bytes, modality) -> user_interface` where

- `context` can be empty, contain the domain separator, or contain further information about the user.
  - Q: What kind of context information would be useful? What are the privacy implications of including it?
- `message` is the actual data to be signed, for which we need a user-readable representation.
- `modality` specifies what kind of UI is expected, i.e. which language and whether we want a short line of text for use on a Ledger Nano or a full mobile
This function is not necessarily a pure function and may do further lookups, like pulling in the proposal to be voted on to get descriptive texts of the proposal and the choices. It should not mutate any state. (Aside: the signed messages themselves should be idempotent.)
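A minimal sketch of what such a function could look like, assuming the user interface is just a string. The `Context` and `Modality` types, the `max_chars` field and the message format are hypothetical choices made for illustration; a real implementation would decode the message according to a schema and might perform read-only lookups.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Context:
    domain_separator: Optional[bytes] = None   # may be empty

@dataclass(frozen=True)
class Modality:
    language: str                      # e.g. "en"
    max_chars: Optional[int] = None    # None: a rich UI is available

# Hypothetical templates; a real system would need proper localization.
TEMPLATES = {
    "en": "You are about to sign: {}",
    "nl": "U staat op het punt te ondertekenen: {}",
}

def render(context: Context, message: bytes, modality: Modality) -> str:
    """Return a user-facing description of `message` without mutating state."""
    text = message.decode("utf-8", errors="replace")
    line = TEMPLATES.get(modality.language, TEMPLATES["en"]).format(text)
    # Constrained displays get a truncated single line (assumes max_chars >= 3).
    if modality.max_chars is not None and len(line) > modality.max_chars:
        line = line[: max(modality.max_chars - 3, 0)] + "..."
    return line

full = render(Context(), b"choice=yes", Modality("en"))
short = render(Context(), b"choice=yes", Modality("en", max_chars=16))
```

The truncated variant shows the core difficulty: on a small display the part that gets cut off is exactly the part the user needs to see, so the short form needs its own wording rather than a clipped long form.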
- Have the domain separator contain one of
  - a contract address with a function `(message: bytes, locality) -> string`.
    - The contract could be an upgradable proxy.
  - a domain name with a record containing a contract address that implements the above.
  - a url to an RPC endpoint implementing `(message: bytes, locality) -> string`.
- Have th
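The three options for the domain separator can be sketched as a small tagged union with one resolver. This is only an illustration: the class names and `resolve` are hypothetical, and plain dictionaries stand in for the chain, the domain name records and the RPC endpoints so that the example runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass(frozen=True)
class ContractSource:
    address: str

@dataclass(frozen=True)
class DomainSource:
    name: str

@dataclass(frozen=True)
class RpcSource:
    url: str

Source = Union[ContractSource, DomainSource, RpcSource]
Describe = Callable[[bytes, str], str]   # (message, locality) -> string

def resolve(source: Source, contracts: dict, records: dict, endpoints: dict) -> Describe:
    """Find the describe function that a domain separator points at."""
    if isinstance(source, DomainSource):
        # The domain record contains a contract address.
        source = ContractSource(records[source.name])
    if isinstance(source, ContractSource):
        return contracts[source.address]
    return endpoints[source.url]

# In-memory stand-ins for on-chain contracts and domain records.
contracts = {"0x1234": lambda message, locality: f"[{locality}] vote: {message.decode()}"}
records = {"vote.example": "0x1234"}

describe = resolve(DomainSource("vote.example"), contracts, records, {})
description = describe(b"yes", "en")
```

Each indirection (domain record, upgradable proxy, RPC endpoint) is another party that can change what the user is shown, so the trust assumptions differ per option even though the interface is the same.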
Issue
https://github.com/worldcoin/world-id-contracts/security/advisories/GHSA-22v7-gcvp-q364
Sub problems
1. How to retrieve required information securely.
2. How to encode the required information.
This requires input
Solutions to problem #1.
1. Hardcoded list
The easiest immediate solution is for the wallet to have a hardcoded list of domain separators and their metadata. This can be fetched from a back end so we can update it without having to go through an app update. This is the process MetaMask uses for tokens (source, main net list). (This is how we do it currently. https://github.com/worldcoin/developer-portal https://docs.worldcoin.org/ Sync with Paolo).
- Q: How does WalletConnect do this curation?

This requires us to approve each new domain separator. We can make this process more transparent by maintaining the list in a public GitHub repo and inviting developers to submit PRs. This is how Uniswap manages the default list of token addresses, see uniswap/default-token-list. Tokens are easy: all that is needed is basic trusted metadata like a name, a description and an icon; everything else is standardized in ERC-20. The WorldID protocol is more complex. A message is nearly always required, if only for replay protection. For airdrops the message contains the token recipient. For voting applications the message would contain the choice made. Other applications will have their own requirements.
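As a rough sketch of the hardcoded-list approach, the wallet could load a list served by its back end and look up the domain separator before showing a request. The JSON field names and the example entry below are assumptions made for illustration, not the schema of the developer portal.

```python
import json

# Hypothetical payload served by the wallet's back end.
REMOTE_LIST_JSON = """
[
  {"domain_separator": "0xaaaa",
   "name": "ExampleAirdrop",
   "description": "Claim an airdrop"}
]
"""

def load_registry(raw: str) -> dict:
    # Index entries by lower-cased domain separator for lookup.
    return {entry["domain_separator"].lower(): entry for entry in json.loads(raw)}

def describe_request(registry: dict, domain_separator: str) -> str:
    entry = registry.get(domain_separator.lower())
    if entry is None:
        # Unknown separators are surfaced rather than silently accepted.
        return "Unknown application: this request is not on the approved list."
    return f'{entry["name"]}: {entry["description"]}'

registry = load_registry(REMOTE_LIST_JSON)
known = describe_request(registry, "0xAAAA")
unknown = describe_request(registry, "0xbbbb")
```

Note that the list only identifies the application; it says nothing yet about how to present the message, which is where WorldID differs from the token case.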
2. Curation lists
The previous approach makes the wallet developer a gatekeeper for applications, which we'd like to avoid. Again following Uniswap's lead, we can set a standard for curated metadata similar to tokenlists. Anyone can then publish their own curated list. This does raise the question of "Who curates the curated lists?". Uniswap 'solved' this with another public GitHub repo: uniswap/tokenlists-org. Though for their swap front-end they opted to hardcode a list of lists (source).
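A minimal sketch of how a wallet might consume several curated lists, assuming a hypothetical list format with a `name` and an `apps` array. It is loosely inspired by the idea of token lists but does not follow that specification.

```python
from typing import Optional

def lookup(lists: list, domain_separator: str) -> Optional[dict]:
    """Return the first matching entry, annotated with the list that vouches for it."""
    for curated in lists:                  # order expresses the user's preference
        for app in curated["apps"]:
            if app["domain_separator"] == domain_separator:
                return {**app, "curated_by": curated["name"]}
    return None

# Hypothetical lists the user has subscribed to.
wallet_default = {
    "name": "Wallet default",
    "apps": [{"domain_separator": "0xaaaa", "name": "ExampleAirdrop"}],
}
community = {
    "name": "Community list",
    "apps": [{"domain_separator": "0xbbbb", "name": "ExampleVote"}],
}
subscribed = [wallet_default, community]

hit = lookup(subscribed, "0xbbbb")
miss = lookup(subscribed, "0xcccc")
```

Keeping track of which list vouched for an entry lets the wallet show the user whose judgment they are relying on, which is the part that a single hardcoded list hides.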
3. Contract supplied metadata
- May want to limit which contract can receive the proof and which websites can request them.
- How many verified external nullifiers: less than a thousand.
- Unverified / experiments / tests: more, maybe 10k.
- External nullifier use-cases.
  - Airdrops,
  - Log in (OAuth),
  - Proof of personhood.
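The first point, limiting which contract can receive a proof and which websites can request one, could be sketched as a policy attached to each external nullifier. The `NullifierPolicy` type, its fields and the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NullifierPolicy:
    name: str
    allowed_contracts: frozenset   # contracts that may receive the proof
    allowed_origins: frozenset     # websites that may request a proof

def may_request(policy: NullifierPolicy, origin: str, target_contract: str) -> bool:
    # Both the requesting website and the receiving contract must be listed.
    return (origin in policy.allowed_origins
            and target_contract.lower() in policy.allowed_contracts)

policy = NullifierPolicy(
    name="ExampleAirdrop",
    allowed_contracts=frozenset({"0xaaaa"}),
    allowed_origins=frozenset({"https://airdrop.example"}),
)

ok = may_request(policy, "https://airdrop.example", "0xAAAA")
blocked = may_request(policy, "https://funny.example", "0xAAAA")
```

With a check like this in the wallet, a FunnyApp served from a different origin could not request a proof for ImportantApp's external nullifier, provided the policy itself comes from a trusted source.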
Do we want alternative metadata?
Solutions to problem #2
Check out existing work.