White Paper on Coblue’s Key Management

Remco Bloemen

Executive Summary

Coblue allows people to work together using secure resilient systems. We assume that every user, device or connection can fail or can become malicious. This includes Coblue itself. We expect malicious actors to have the resources of powerful nation-states. Our software runs on a wide range of hardware, from smartphones to servers. Our software handles various modes of connectivity, if connected at all. These assumptions are harsh, but true for modern information systems. It requires a different approach to system design.

This paper presents Coblue’s approach to secure resilient systems. We demonstrate our approach to be more secure, less complex, faster and more efficient than alternatives. Resilient systems need redundancy and are therefore necessarily networked systems. Networked systems are hindered by eight often overlooked factors. Resilient systems require resilient networks, and the most resilient networks are distributed networks.

Distributed systems cannot guarantee consistency, availability and partition tolerance at the same time. Instead, the Coblue solution offers strong eventual consistency. In order to ensure eventual consistency that cannot be tampered with, we use several techniques.

Our cipher suite is based upon Keccak-f₁₆₀₀ for all symmetric cryptography and Ed448 for all asymmetric cryptography, and we have focused on simplicity for our implementation. The advantage of the algorithms we use is the invulnerability against the side-channel attacks AES implementations suffer from. The key strengths of our implementation of the algorithms are beyond the upper bounds of all evaluation models, and hence it is very likely that breaking the used cryptography remains infeasible. On top of that, our cipher suite offers more security through a number of extra features, not offered by any other cipher suite.

Coblue’s Toolbox consists of cryptographic techniques, Defensive Programming and Secure Memory. Defensive Programming hardening techniques offer extra defence in case a user’s device is under attack. Secure Memory protects application running in untrusted environments. A range of low-level cryptographic primitives ensure secure protocols and secure systems. Our advanced cryptography allows us to fight denial-of-service attacks, enables having key escrow without a trusted third party, and enable secure authentication of users over insecure channels.

Coblue’s networking stack constructs resilient end-to-end secure channels in hostile environments while keeping the communication overhead to only ten bytes per packet. The networking stack can establish secure connections as soon as simple web-browsing is possible or while active in a local area network. The decentral network that is being formed is peer-to-peer and requires no central coordination, unless applications specifically demand so.

Input/output routines and storage are a process’ interface to the world, which makes it where the strongest defences need to be. Since this code is so critical to get correct, we decided to generate our I/O procedures from formal specifications. We specify structures in a domain specific formal language, and our tooling generates most of the data handling code from the formal specification, including all binary encoding/decoding code.

Our cipher suite and toolbox are most recently employed for key management in a concrete application: Storro. Storro is a tool to securely share and collaborate on data on a project basis. Five different user roles are specified: outsider (i.e. not part of a project), facilitator, observer, participant and administrator. A license for a project is necessary for interacting, and a license needs to be renewed periodically to ensure it has not yet expired. Our Distributed Content Addressed Store stores all information related to every project for which the user is at least a facilitator of a project. Version management safeguards the exchange and integrity of the project’s version history. Version management creates a strong eventually consistent latest version. The user management ensures that new users can be added to and removed from a project, while at the same time ensuring that roles cannot be escalated.

Paper overview

In the first chapter we will introduce our basic cipher suite. We then explain our algorithm and key strength choices. The chapter concludes by comparing our transport security protocol with existing offerings. The second chapter showcases the Coblue toolbox. It explains the various techniques used to create secure resilient systems. It starts with a description of the non-cryptographic defences necessary for running secure applications on commodity hardware. It explains secure key generation and basic cryptographic primitives. The next section covers advanced techniques that facilitate decentralized systems. This is followed by our works-everywhere network communication stack. Finally we describe our approach to secure input/output and storage. The last chapter contains a case study of a complex real world application currently in production.

Resilient systems

  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn’t change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous

— The “8 Fallacies of Networking”. Peter Deutsch (Sun Microsystems Labs, 1991)

Resilient systems are necessarily distributed computer systems. Resilient systems cannot have a single point of failure. So a central server is not allowed. Servers can be used, but they need to have sufficient redundancy for the application. The servers would then themselves form a distributed computer system.

Theorem: CAP Theorem. No distributed computer systems can give all three of the following desirable guarantees:

Network failure is unavoidable, so the choice is between consistency and availability. Consensus based protocols like Paxos sacrifice availability. When the network splits in two, only the larger part can continue to operate. If the network splits in more parts, none of them may function. In other words: the systems become unavailable. Most practical implementations use (non-Byzantine) Paxos. Paxos relies on the integrity of devices: devices can fail, but they cannot become malicious.

Coblue cannot afford to assume devices won’t become malicious. We also cannot afford the system becoming unavailable. Thus, we need to sacrifice consistency.

We provide strong eventual consistency instead: the network propagates state changes as updates. Eventually all devices receive all updates. Two devices that have received the same unordered set of updates will be in the same state. Two devices that have reached the same state are said to have converged. The system will always be available on every device for read and write access. In practice, connected devices should always be converged. The only exception is the brief period while an update propagates through the network. Disconnected devices should converge soon after the connection re-establishes. We build strong eventually consistent systems using conflict-free replicated data types. We prefer the state-based variant, convergent replicated data types. This variant makes the weakest assumptions and offers the strongest guarantees. We further extend these with cryptographic guarantees to get to Byzantine fault tolerance.

Security guarantees in distributed systems can be classified in four categories:

Sources:

Cipher Suite

“Cryptography is not broken, it is circumvented”
— Adi Shamir (Weizmann Institute, co-inventor RSA and differential cryptanalysis)

Simplicity is an important goal in our cryptography. Cryptography developed in the past two decades has no significant known weaknesses (as the Snowden revelations support). Instead, weaknesses in the implementation break security. The complexity of standards such as SSL/TLS facilitates this. We have carefully evaluated and selected one suite of algorithms and key lengths. Since the algorithms are known in advance, no negotiation is necessary. Should it be required to change algorithms, a new system will be deployed. Data can be migrated from the old system to the new system.

The essential cryptographic algorithms we use are:

The algorithms have no known weaknesses. They have relatively simple specifications without unexplained constants. A concise specification is included in the appendix. The algorithms all run in constant time and have fixed memory access patterns. This makes them immune to a variety of side-channel attacks. The implementation is simple, readable and high-performance, both in software and in hardware: we use well-known public domain implementations. The implementations have 270 and 1046 lines of code and we audited the implementations. Wrapping classes expose only high-level functionality and enforce ‘sane’ usage. All together the end-to-end network encryption amounts to 2,909 lines of code. In comparison: the OpenSSL TLS implementation has 437,157 lines of code. The low code complexity makes the implementations less error prune and easier to audit.

Why not AES-GCM

We use authenticated encryption with associated data. Daemen and Rijmen created AES almost two decades ago. Back then encryption and authentication where considered separate operations. The cryptographic community has since found this to be error-prone. Authenticated encryption integrates both in one operation. McGrew and Viega retrofitted AES with the GCM construction to give it authenticated encryption. Currently AES-GCM is a popular algorithm. In fact, modern desktop processors have AES-NI instructions to enhance its performance. But, it has problems:

AES-GCM has the following problems:

  • In the case of nonce reuse both integrity and confidentiality properties are violated. If the same nonce is used twice, an adversary can create forged ciphertexts easily.
  • When short tags are used, it is rather easy to produce message forgeries. For instance, if the tag is 32 bits, then after 2¹⁶ forgery attempts and 2¹⁶ encryptions of chosen plaintexts (also of length 2¹⁶), a forged ciphertext can be produced. Creation of forgeries can be instantaneous when enough forgeries have been found.
  • GCM security proof has a flaw. It has been repaired recently, but the new security bounds are far worse for nonces not 12 bytes long;
  • GCM implementations are vulnerable to timing attacks if they do not use special AES instructions. The vulnerability remains even if the AES itself is implemented in constant-time. Constant-time implementations of GCM exist, but they are rather slow.
  • GCM restricts the message length to 68 GBytes, which might be undesirable in the future. The total amount of data allowed to encrypt on a single key is limited by 2⁶⁴ blocks, but this number decreases if long nonces are allowed.
  • Reasonably fast implementations of GCM require specific lookup tables, which do not fit into fast memory (L1 cache or similar) on some architectures.

— Dmitry Khovratovich (University of Luxembourg, Microsoft Research)

AES, like GCM, is also vulnerable to side-channel attacks. In 2005, Osvik, Shamir and Tromer demonstrated a cache-timing attack. They obtained an AES key after only 800 encryptions, totaling 65 milliseconds. In 2010 Bangerter, Gullasch and Krenn extended this. They recovered AES keys without the need for either cipher text or plaintext. Their approach works on real-life implementations such as OpenSSL. The cryptographic community is looking for a successor with the CAESAR competition. The committee has of over 20 leading cryptographic experts, including the original creators of AES.

Keyak is Keccak-f₁₆₀₀ employed as an authenticated encryption algorithm. Keccak-f₁₆₀₀ has received extensive analysis as the winner of the SHA-3 competition. The use as an authenticated encryption algorithm has a mathematical proof of safety. Keyak is a prominent contender in the CAESAR competition. It has received extensive review by the cryptographic community itself.

Like Keccak and our other algorithms, Keyak has no sensitive branches or lookups. This makes it resilient to the side-channel attacks. Described above are side-channel attacks for AES. The authentication tags do not have the weaknesses of GCM described above. The algorithm and implementation are also less complex. This prevents errors and facilitates audits. Moreover, all symmetric algorithms share the same core, the Keccak-f₁₆₀₀ function.

Besides increased security and reduced complexity, Keyak also offer increased throughput. Performance is in clock cycles per byte of data, lower is faster:

Architecture AES-GCM Keyak
x86-64 with AES-NI 4 9
x86-64 11 9
ARM Cortex A8 51 16
FPGA 1 0.08

On smartphones and embedded systems Keyak outperforms AES-GCM. It is on these systems that performance is the most critical because of the scarcity of resources on these devices. On x86-64 systems the cryptography overhead is usually negligible. There the trade-off is best made in favour of security.

Note: The AES-GCM FPGA implementation takes 3781 ALUTs (on Cyclone-IV). A crypto-processor would also require a hashing algorithm. Adding SHA-256 takes another 2416 ALUTs (on Cyclone-III), making the total 6197. The Keccak-f₁₆₀₀ implementation takes 4574 ALUTs (on Artix-7). It provides hashing, authenticated encryption and other symmetric cryptography.

Sources:

Key strength

Our implementations have key strengths with fixed values. This lowers complexity and prevents accidental or malicious changes to key strengths. We made a trade-off between realistic multi-decade security, algorithm availability, complexity and performance. We chose the following keys strengths:

Operation algorithm strength
Hash Keccak 512 bits capacity
Symmetric encryption Keyak 256 bits key
Diffie-Hellman Ed448 448 bits key
Digital signatures Ed448 448 bits key

Key strength evaluation models express the strength of cryptographic keys in terms of years it can secure information. These take double or triple Moore’s laws into consideration. This accounts for combined advances in algorithms processors. Using common evaluation standards we get:

Method hash symmetric elliptic curve
Lenstra year 2282 year 2282 year 2233
IETF year 2245 year 2209
ECRYPT II > year 2041 > year 2041 > year 2041
NIST ≫ year 2030 ≫ year 2030 ≫ year 2030
ANSSI > year 2030 > year 2030 > year 2030
BSI > year 2021 > year 2021 > year 2021
NSA top secret top secret top secret

Our key strengths are beyond the upper bounds of the evaluation models. Thus breaking the cryptography is infeasible for the foreseeable future, with a wide security margin. Remaining risks are mathematical breakthroughs and powerful quantum computers. Save for a very unlikely radical breakthrough, mathematical breakthroughs and advances in quantum computing slowly eat away the security margin. When the margin is insufficient the cryptography will be upgraded. During such an upgrade a parallel upgraded system will be deployed and data migrated to the new system.

Sources:

Comparison of transport security protocols

End-to-end encryption is a key component in our peer-to-peer network. It allows us to establish reliable secure communication over untrusted networks. There are more approaches to securing communication over the internet. We evaluated TLS, SSH, IPSec and CurveCP using OpenSSL-1.0.1f, OpenSSH-6.7p1, StrongSwan-5.3.2 and NaCL-20110221 respectively. Both TLS and SSH have a client-server model that is not appropriate for peer-to-peer networks. None of the protocols support our full cipher suite. For evaluation we substituted AES-GCM-256, SHA256 and P384. None of the existing approaches satisfy our strict requirements, as this table will elaborate:

Feature Ours TLS SSH IPSec CurveCP
Complexity (lines of code) 2,909 437,157 96,378 309,796 34,786
Handshake round trips 1 2 15 3 1
Handshake overhead (bytes) 240 2403 4485 > 650 872
Packet overhead (bytes) 10 29—203 33 36 64—672
Confidentiality yes yes¹ yes¹ yes¹ yes
Integrity yes yes¹ yes¹ yes¹ yes
Authenticity yes optional yes¹ yes¹ yes
Forward secrecy yes optional partial optional yes
Safe curve yes no no no yes
Deniability yes no no no no
Identity hiding yes no no no no
Marker free yes no no no no

Note ¹: Users can disable it in configuration. This allows for mistakes in practice. TLS and SSH support many different algorithms, some of which are not secure. This “flexibility” increases code complexity and adds risk and overhead.

More aspects are important than just the standard confidentiality, integrity and authenticity guarantees. Safe curves are a set of additional mathematical requirements established by elliptic curve experts. Forward secrecy guarantees previous conversations remain confidential, even if keys get compromised. Deniability means that neither party has proof that the conversation took place. Identity hiding means an eavesdropper cannot see the identities being authenticated. Marker free protocols are indistinguishable from random noise.

Coblue’s Toolbox

The cipher suite described above implements a range of cryptographic primitives. In addition to the basic cryptography we use unique techniques that offer extra defences. We call this our toolbox. Our toolbox description starts with two non-cryptographic methods. These methods construct lines of defence for processes running on possibly compromised devices. The non-cryptographic defences are:

Defensive programming: Our applications employ an array of hardening techniques to give the user a line of defence even if his/her device is under attack.

Secure Memory: Our applications run in untrusted environments. Malicious (kernel) processes can try to extract information from our applications process. A perfect solution is only possible with trusted hardware, but there are defences. We use the following methods to store sensitive information:

Cryptographic primitives

We have a full suite of well-established, cryptographic primitives. These are essential for designing secure protocols and systems. They are based on the two algorithms presented in the Cipher Suite chapter. In accordance with defensive programming, the interfaces are carefully designed to prevent mis-use.

Pseudo Random: Our cryptographically secure pseudo random generator is Keccak in SpongePRG mode. The state-recovery complexity is 2¹²⁸ assuming 2²⁰ known blocks. The differentiability complexity is 2⁷⁴ blocks. The block size is 180 bytes. Secure memory is used for the internal state.

Random: Every application instance maintains it’s own entropy pool. The pool implements Ferguson & Schneier’s Fortuna algorithm as improved by Dodis et al. The random generator contains an accumulator pool, a generator pool and 32 staging pools. Fortuna protects against manipulation of entropy sources. Even after a full compromise, it guarantees a return to correct operation. Our implementation makes heavy use of secure memory. No two pools are ever unlocked at the same time. Entropy passes between pools through a short-lived segment of secure memory. The pools and generators use our pseudo random generator.

On application start up the pools are initialized with:

  • operating system supplied cryptography grade entropy
  • processor’s hardware based random number generator (RDRAND or RDSEED)
  • entropy stored before
  • screenshot of the current desktop (if supported)
  • processor’s tick counter (RDTSC)
  • process id
  • stack base address
  • heap base address
  • current time in milliseconds

While running the entropy pools are replenished using:

  • all the sources above
  • application IO-event timing (disk, network, user, OS, etc.)
  • mouse coordinates, key presses, screen touches (if available)
  • processor internal state (adapted HAVEGE)
  • several processor tick measurements through collection

Hash: The Keccak hash function is used. Keccak has a large 1600 bit interal state, of which a part is reserved as capacity. The capacity is not directly exposed to input and output. The function requires keying of every usage with a 256 bit domain separating key. Keys relevant to this document are listed in the appendix. The default configuration is a hash of 256 bit output length and 512 bit capacity. It provides 128 bit collision resistance, 256 bit (second) pre-image resistance and 256 bit resistance against any attack. Keccak is immune to length extension and padding attacks.

Message Authentication: Our hash functions are already keyed and meet the requirements for message authentication. For authentication applications we generate random keys instead of fixed domain separating keys.

Key Derivation: The SCrypt key derivation algorithm is used. Older key derivation methods only need processing time. Attackers use parallel computation to brute-forces these methods. Depending on budget, they use GPUs, FPGAs or ASICs. SCrypt needs both memory and processing time. This makes it expensive to brute-force in parallel. In practice, the bottleneck is processor to memory bandwidth. This resource does not follow Moore’s law. Keys derive from passphrases using a random 256 bit salt, 12 rounds of Keccak, 228+random(224)2^{28} + \mathrm{random}(2^{24}) memory cost and 224+random(220)2^{24} + \mathrm{random}(2^{20}) processor cost. This takes 256 MB of RAM and about 1.5 sec of CPU on a modern processor. Every byte is shuffled twice on average. Our implementation uses Keccak instead of PBKDF2-SHA256 + SalsaMix20/8. This simplifies the amount of crypto-code to audit.

Symmetric Key: Keyak mode with 256 bit keys is used. We always use authenticated encryption. Nonces are 128 random bits. Tags are 128 bit. This adds (only) 32 bytes to the message.

We use the same algorithm in bidirectional streams, for example in network communication. Both directions have a 256 bit key and a 128 bit nonces. A packet’s nonce is the 64 bit sequence number combined with the initial nonce. The nonce is not transmitted. Packet tags are 64 bits. This adds 8 bytes to each packet. We follow the NIST SP 800-38D C recommendation for 64 bit tags. The recommendation limits packet size to 512 kilobytes and key usage to 2²⁶ packets. We expire keys after 2²⁶ packets or 24 hours, whichever comes first.

Signatures: The Edwards-curve Digital Signature Algorithm Ed448 is used. Nonces are deterministic and invulnerable to repeated-nonce weaknesses. The keyed hash function separates the signature domains. Public and private keys are both 448 bit in size. Signature size is 112 bytes.

Shared Secrets: The Elliptic Curve Diffie-Hellman protocol Ed448 creates shared secrets. The private and public keys are 448 bit. Shared secret entropy is 446 bits. We expand/contract secret entropy to the final secret size using a domain specific hash. Public/private key size is 56 bytes.

Advanced cryptography

This section will describe methods beyond basic cryptography. It allows us to fight denial-of-service attack, have key escrow without a trusted third party and authenticate users over insecure channels.

Secure Files: Files have read/write access to arbitrary parts and are resizeable. It is infeasible to encrypt and decrypt the entire file for each access. Files are split in four kilobyte blocks. The blocks are symmetrically encrypted using a 256 bit file master key. The 64 bit block number and 64 random bits form the nonce. Tags are 128 bit. A Merkle tree interleaved between content blocks stores the tags. This increases the file size by 0.5%. The tree’s root guarantees the integrity of the entire file.

The Merkle tree makes our implementation authenticated and tamper-proof. Conventional disk encryption (TrueCrypt, BitLocker, FileVault, etc.) do not have integrity checks. In those systems, an attacker can tamper with the content. An attacker can revert changes to a specific part of a file. This is not possible in our implementation.

Convergent Encryption: Based on symmetric encryption with 256 bit encryption keys the algorithm keys a hash with a 256 bit convergence key. It then derives the encryption key by hashing the plaintext. Tag size is 128 bit and there is no nonce. Nonces are not necessary since the generated keys are single-use. Adding a random nonce would remove the convergence property.

Convergent encryption identifies repeated information in a multi-agent system without revealing the information itself. Identical plaintexts result in identical ciphertexts. Yet every message has its own key which is necessary to decrypt. There are important security properties: everyone with access can see if two ciphertexts match. Possessors of the convergence key can create new ciphertexts. They can therefore also brute-force the content of existing ciphertexts. A possessor of a ciphertext and its encryption key can decrypt the message.

Proof of Work: Coelho’s Hokkaido challenge-response protocol is the basis of our proof of work protocol. Like SCrypt in our key derivation, Hokkaido is a memory-bound algorithm. A pseudo random generator generates Hokkaido’s permutation and branch tables. Its seed is a 128 bit random bitstring. The message authentication code key is 256 random bits. The challenge issuer is regenerated every hour. Challenges have configurable difficulty and time limit for solving.

The challenge contains several problem instances. It has a timestamp and a 256 bit message authentication code. The challenge is less than 100 bytes in size. The response has the solutions and a copy of the challenge. The response is less than 132 bytes.

The algorithm requires 16 megabytes of memory. Generating and verifying challenges takes 168 steps, solving challenges requires 2²⁴ steps. Challenges expire if not solved in configured time. The issuer maintains no state for outstanding challenges.

The proof of work protocol fights off denial-of-services attacks. Attacks normally work if the attacker uses few of his own resources to consume a lot of the victim’s. Our solution reverses this proportion. Proof of work forces the attacker to consume memory and processor resources. The server only consumes a tiny amount of processor time generating challenges. The server load and request size tunes the size of the issued workload. In non-attack situation there is no overhead. The burden on honest clients scales with the sophistication of the attack.

Erasure Code: We use a Reed-Solomon code over GF(216)GF(2^{16}). The code supports kk-out-of-nn schemes up to 2¹⁶ shares. (But performance is cubic in nkn - k). Any kk sized subset can generate more shares, this way lost shares can be recovered.

Secret Sharing: We use Shamir’s Secret Sharing Scheme over GF(216)GF(2^{16}). This supports kk-out-of-nn schemes of up to 2¹⁶ shares (with cubic complexity in nkn-k). Payloads larger than 256 bytes are symmetrically encrypted using a random key. The secret sharing scheme then shares the key; the erasure code shares the payload. Any kk sized subset can generate more shares.

The secret sharing protocol allows escrow of sensitive information. A group of five users can decide that any majority of three should be able to recover lost keys. The key generator distributes shares to the group using a 3-out-of-5 scheme.

Zero-Knowledge Password Proof: We use the Socialist Millionaire Protocol implemented in Ed448.

The proof protocol allows two users to interrogate each other. Questions like “Where did we first meet?” or “What is the agreed upon password?” help to authenticate. Both parties will known if they gave the same answer or not. Yet the protocol prevents a potential man-in-the-middle from learning anything. The other party participates in every guess. So an attacker is unable to brute-force the answers. This way two users sharing a common secret can authenticate, even over insecure channels.

Steganography: Our steganography formats a message as a valid bitmap image file. The encoder zero-pads the data to fit an approximately square image. It then adds a valid bitmap header. The overhead is 58 bytes plus padding

Our steganography is trivial and will not fool any targeted attack. Our applications use it to establish encrypted communication channels that appear unencrypted. Sometimes proxies, firewalls and other men-in-the-middle block encrypted connections. With steganography we can still build end-to-end encryption connections.

Pronouncable Entropy: We try to avoid presenting key material to the user where possible. For humans, cryptographic keys are hard to compare and remember. Pronouncable entropy mimics human language to present cryptographic information in a memorable form. It stores up to 100 bits of entropy in a phrase as “michmeckrojef in beony”. Compare this with 100 bits in base64: bHlN va/T sfEf z5C9 dw==.

Passphrase Entropy Estimation: We try to avoid the use of passphrases where possible. When users do set a passphrase, the quality of their passphrases is estimated. An n-gram model estimates the predictability of the password, expressed in bits of entropy. We trained the model with ten million passwords and equally many names and titles. It models the brute-force complexity of an attacker with this information.

Network communication

Coblue’s networking stack constructs resilient end-to-end secure channels in hostile environments. The communication overhead is minimal (only ten bytes per packet). The stack establishes secure connections wherever simple web-browsing is possible. They are also possible in an offline local area network. The network is peer-to-peer and requires no central coordination. Applications can create centralized services if needed.

Socket: The connection is agnostic over the transport protocol used. We support bidirectional transports as bad as 10% success rate and 10 second latency. Transports are not required to deliver in order.

In a peer-to-peer network there is no distinction between clients and servers. Yet, every communication has an initiating side and a listening side. To the network, listening is the same as being a server. Direct connections between two peers behind NAT routers and firewalls are not always possible. We establish connections using three methods. First, we try different transports. Second, we use NAT and proxy traversal. Finally, we route the traffic over any accessible routing peer. If necessary, we disguise encrypted traffic as normal HTTP traffic.

Currently implemented transports are:

  • Standard TCP/IP.
  • Standard UDP/IP. We distinguish transport channels by host/port. The transport is unreliable, but the connection takes care of this.
  • Plain HTTP or HTTPS. We ignore certificates since the connection guarantees security. A peer accepts incoming HTTP connections if enabled in configuration. It can run its own webserver or export a FastCGI interface. We disguise traffic as normal web traffic. Packets are steganographically hidden in images and sent and received over POST requests. The URL is randomized to prevent caching. The server delays responses until a packet is available (or timed out). The client makes sure there is always an outstanding request. This creates a low latency bi-directional connection that can traverse firewalls and proxy servers.
  • Routed connections. If a direct connection is impossible, an indirect connections is set up. The peer uses peer discovery to find publicly accessible routing peers it can connect to. It then advertises its reachability through these peers. The routing peers will forward traffic. Note that this happens below the connection layer. The router cannot inspect the connection and is not a man-in-the-middle. The worst it can do is bandwidth throttling or denying the connection, like any internet router can do.

The transports and listening ports are all optional and configurable. The least to connect is an outgoing HTTP connection. Proxy servers are no problem. This corresponds with what users consider ‘an internet connection’. In case no internet connection is available, local area peer-to-peer connections are still available. When multiple transports are available between two peers they select the best one. The peers try the different transports using an exponential back-off algorithm.

Connection: A connection takes a socket and uses cryptography to turn it into a secure connection. The connection is reliable, end-to-end encrypted and authenticated. Authentication is between the peer public keys. When a socket establishes a transport both peers engage in a triple Diffie-Hellman handshake. For an initiator with authentication key (i,I)(i, I) and responder with authentication key (r,R)(r, R) this goes as follows:

  1. The initiator generates an ephemeral key pair (a,A)(a, A).
  2. The initiator sends the public key AA to the responder.
  3. The responder generates an ephemeral key pair (b,B)(b, B).
  4. The responder computes the shared secret key K1=bAK_1 = b A.
  5. The responder computes the shared secret key K2=K1rAK_2 = K_1 \parallel r A.
  6. The responder sends BEncrypt[K1](R)Encrypt[K2](message)B \parallel \text{Encrypt}[K_1 ](R) \parallel \text{Encrypt}[K_2 ](\text{message})
  7. The initiator computes the shared secret key K1=aBK_1 = a B.
  8. The initiator computes the shared secret key K2=K1aRK_2 = K_1 \parallel a R.
  9. The initiator computes the shared secret key K3=K2iBK_3 = K_2 \parallel i B.
  10. The initiator sends Encrypt[K2](I)Encrypt[K3](message)\text{Encrypt}[K_2 ](I) \parallel \text{Encrypt}[K_3 ](\text{message})
  11. The responder computes the shared secret key K3=K2bIK_3 = K_2 \parallel b I.
  12. Both set up a bidirectional stream using K3AK_3 \parallel A and K3BK_3 \parallel B as keys.

In step six, the responder has not yet authenticated the initiator. The message can therefore only contain information it would share with anyone. In practice we leave this message empty and only send the tag. In step twelve the keys and nonces of the bidirectional stream are derived from K3K_3 . In the direction from initiator to responder K3AK_3 \parallel A is used and in the reverse direction K3BK_3 \parallel B is used. Each packet contains the lower 16 bits of the package sequence number. The receiver uses a 1024 packet window for out-of-order packets. Together with the tag, the total communication overhead is 10 bytes per packet. Connections have a heartbeat to sense if the connection is still alive.

Peer Info: Peers can announce themselves using a peer info object. The object contains the peer public key, the addresses the peer is reachable on, a timestamp and a signature. The signature authenticates the peer info to the peer public key. The peer private keys are stored in a key vault. The peer key pairs are not security critical. Should an application depend on it, tamper resistant hardware can store the key for enhanced security.

Peer Discovery: The peer-to-peer network self-assembles by discovering other peers. This comes down to collecting peer infos. The implemented optional methods are:

  • Bundling: The software includes peer info objects.
  • UDP Broadcasting: the peer probes the local area network.
  • Peer exchange: connected peers exchange known peer info objects.

Services: Application level services have their API specified in a formal language. A hash of the formal specification identifies the services. The network stack handles service discovery, serialization and message passing.

Storage

Input/output routines and storage are a process’ interface to the world. This is also where the strongest defences need to be. Since this code is so critical to get correct, we decided not to write it at all. Instead we generate our I/O procedures from formal specifications.

Objects: We specify information containing structures in a domain specific formal language. Tooling generates most of the data handling code from the formal specification. This includes all binary encoding/decoding code.

Bix: We serialize objects using a bix, our binary format. The format is a dense and high-performance serialization. It is like Google Protocol Buffers or Apache Thrift. Our format is unique in three ways:

  • It can encode cyclical data structures.
  • It is one-to-one. An object always has one unique valid encoding.
  • Encoded objects include a 128 bit hash of the formal specification for versioning.

Content Addressed Store: A key-value store where the key is a 256 bit hash of the serialized value. This allows the calling side to confirm the correct behaviour of the store. The store is a primitive in Byzantine systems.

Case Study: Storro Key Management

Storro allows users to collaborate on projects. It creates an encrypted drive on the users’ devices. The drive contains a folder for each project the user has joined. Changes made to the folder distribute to the project’s other users. Every device independently validates correct operation. Storro resolves any conflicting changes without user interaction. A revision history logs all changes. All historical versions are retrievable. The integrity is cryptographically guaranteed.

A user can be in one of five different roles in a project. In monotonically increasing order of privilege they are:

  1. Outsider: The user is not in any way part of the project.
  2. Facilitator: The user provides backup and hosting. Other than that, there is no read access. The user is not able to learn anything about the contents of the project: not even a list of files.
  3. Observer: The user has full read-only access to the project. The user is a passive observer and not able to make any changes.
  4. Participant: The user participates in the project. The user has full write access to the project folder. He/she is able to create, change and remove all files and folders.
  5. Administrator: The user administrates the project. Like a participant, he/she has read/write access. Administrators can also change the project metadata. This includes adding or removing users and changing their roles. (With one exception: the last administrator cannot remove him/her self.)

Storro enforces non-mutable privileges by controlling access to read-keys. It forces mutating privileges by having every participant confirm the correctness of the mutation. Storro limits storage of encrypted information to a set of allowed devices.

License Management

LicenseCertificate:
    user: User public key
    device: Device public key
    kind: enum of Server Desktop Laptop Smartphone
    validity: DateTime
    signature: Signature by a license authority

LicenseService:
    refreshLicense: ∅ → LicenseCertificate

A device maintains an up-to-date LicenseCertificate object. It periodically requests a new LicenseCertificate object from a license authority. The license authority’s signature validates the license. Storro has a fixed set of license authorities, compiled into the executable. Initial authentication uses the socialist millionaire protocol on the license key. Later authentications can also use user/device keys.

The full version of the licensing protocol exchanges more details about the licensee. Storro only shares these details between the licensee and the license authority. The authority can hand out LicenseCertificate objects with longer or shorter validity. The validity forces peers to come online and refresh the license. The validity is a trade-off between offline functionality and revoking licenses.

Storro peers will request a valid LicenseCertificate object before they interact. An expired license will lock a user/device out of the network. The user can continue to work locally. A renewed license reconnects the user. Storro then integrates the local and remote changes.

Distributed Content Addressed Store

StoreCertificate:
    projectUuid: Uuid
    users: set of public key
    admins: set of public key
    timestamp: DateTime
    signature: signature by an admin

StoreService:
    exchangeCertificate: set of StoreCertificate → set of StoreCertificate
    get: (Uuid, Hash) → (Uuid, Value)
    put: (Uuid, Value) → ∅
    sync: Binary → Binary

Storro stores all the projects’ state in a content addressed stores. The stores synchronize between all users that are at least facilitator in the project. Facilitators cannot decrypt anything in the store. The facilitators know who to synchronize with and who administrates the project. Storro maintains an up to date StoreCertificate object for every project. This object contains the list of devices that can access it. It also contains a list of administrators that can update the StoreCertificate.

Exchange Certificate: When two Storro peers connect, they check which projects the other peer has access to. They then exchange the latest StoreCertificates for those projects. For every certificate received, the peer verifies and updates. Only StoreCertificates signed by administrators are accepted. The timestamp guarantees that the latest certificates eventually reach every peer.

Get, Put and Sync: The store has the conventional get and put calls. Get retrieves a value based on its hash. The requester verifies correct behaviour by verifying the hash of the returned value. The sync call uses an efficient protocol to discover missing hashes between two stores. The put call is for completeness. Every store can validate its own integrity by checking the hashes. Observers can decrypt the contents and perform full integrity checks and garbage collection.

Version Management

VersionCertificate:
    projectUuid: Uuid
    readKey: set of encrypted keys to Version
    timestamp: DateTime
    signature: signature by an observer

VersionService:
    exchangeVersions: set of VersionCertificate → set of VersionCertificate

The readKey contains the keys to read the latest Version object. The certificate writer encrypts them for each user with at least observer role. Storro stores the Version object and the following related objects convergently encrypted in the content addressed store:

Version:
    metadata: key to Metadata
    previous: set of key to Version
    root: key to Directory
    signature: user signature

Directory:
    name: string
    entries: set of (key to File or Director)

File:
    name: string
    contents: key to binary

The next chapter will describe the Metadata object. The specification leaves out details such as modification times and compression. Observers can create a new version and the corresponding certificate. This new version merges updates from every other version the device is aware of. Devices with observer role have one unique latest version. Facilitators have insufficient permission to create their own versions. Facilitators will store and pass along the latest versions of other users.

The Version objects satisfy invariants. They are a state-based conflict-free replicated data type with strong eventual consistency. The state is monotonically increasing. This is explicitly constructed by the previous hashes. These create a causal lattice of Version objects:

  1. The versions form a directed acyclic graph through their previous fields, the version graph.
  2. The version graph has a least element, the initial version.
  3. The hash of the initial version is the projects’ universally unique identifier.
  4. The version graph has a greatest element, the latest version.
  5. Converged replicas have the same latest version.
  6. Converged replicas have identical states.
  7. Non-initial versions have either one parent, the commits, or more, the merges.
  8. The initial version has one user with the administrator role, the founder.
  9. The initial version is signed by the founder.
  10. Commits differ from their previous commit in either the metadata or root field, not both. These are called meta-commits and root-commits.
  11. Root-commits are signed by a user that has a participant or administrator role in the previous version.
  12. The previous versions in a merge are commits.
  13. The previous versions in a merge are pair-wise unreachable.
  14. Merges are a known deterministic function of their previous versions.
  15. No two elements in a Directory.entries have the same name.

Create Project: The founder creates an initial version that satisfies all invariants. The founder can work in the project on his device and can invite other users/devices.

Exchange Versions: Two devices start with the content addressed store synchronization. Next they exchange VersionCertificates for the latest versions of the project. Storro peers restrict this to versions the other device is also member of. Peers validate the certificate on receipt. Facilitators can stop here. Observers proceed to construct the Version object and verify the invariants. Peers ignore invalid data. Peers integrate new data and update their VersionCertificate.

Add Commit: The device registers local changes to the projects’ directories. Storro peers store these changes as a commit in the Version history. They update their VersionCertificate and distribute it to other devices. Devices validate the new Version on receipt. Storro rejects illegal Versions. Illegal changes will not propagate the network beyond the malicious device(s).

User Management

Invitation:
    projectUuid:
    name: String
    readKey: key to Version
    voucher: private key
    expiry: DateTime

Storro devices store the following three types of information in the content addressed store using convergent encryption:

Metadata:
    name: string
    description: string
    users: set of Credentials
    vouchers: set of Voucher

Credentials:
    user: user public key
    role: enum of Facilitator Observer Participant Administrator

Voucher:
    key: public key
    role: enum of Facilitator Observer Participant Administrator
    expiry: DateTime

The Metadata has extra invariants:

  1. A meta-commit is signed by an administrator, a regular user or a voucher.
  2. A regular user signed commit only affects that user and does not increase the maximal role of that user.
  3. A voucher signed commit removes that voucher and can add one user with the role specified in the voucher.
  4. Meta-commits have no vouchers with expiry dates before the signature date.
  5. There is at least one administrator.

Invite User: An admin can invite a user by adding the user/device keys to the credentials. The admin then distributes the updated StoreCertificate and VersionCertificate. When the user/device connects to an updated peer it can retrieve the latest version. The public keys of the user/device are not always known in advance. In this case, the admin can create a voucher. The admin can then send the user/device an Invitation by any secure channel. The voucher is a single-use right to create an account with a predetermined role. The user/devices can accept the invitation by creating its own account. The user/device can reject the invitation by removing the voucher. This only requires retrieving the latest Version and Metadata objects.

Remove User: The admin or user/device creates a meta-commit that removes the user/device from the project. The device then distributes this update. Later VersionCertificates will not include a key for the user. Later requests to peers will fail for the removed user. The user/device still receives the meta-commit containing its removal, which allows the user/device to detect its removal. When removed, a Storro device will remove encrypted data.

Appendix: Algorithms

“Il faut qu’il n’exige pas le secret, et qu’il puisse sans inconvénient tomber entre les mains de l’ennemi;”
— Auguste Kerckhoffs, La Cryptographie Militaire (1883)

Keccak: The internal state is SGF(2)1600S\in GF(2)^{1600} also represented as AGF(2)5×5×64A \in GF(2)^{5\times 5\times 64} by Ax,y,z=S320x+64y+zA_{x,y,z} = S_{320x + 64y +z}. The permutation for Keccak is f=R23R0f = R_{23}\circ \ldots \circ R_0 and for Keyak it is f=R23R12f = R_{23}\circ \ldots \circ R_{12} where Ri=ιiχπρθR_i = \iota _i \circ \chi \circ \pi \circ \rho \circ \theta . The transform θ\theta is Ax,y,z=+[0,4]u(Ax1,u,z+Ax+1,u,z1)A_{x,y,z} \stackrel+= \sum_{[0,4]}^{u}( A_{x-1,u,z} + A_{x+1,u,z-1}). Steps πρ\pi \circ \rho together are Ay,2x+3y,z=Ax,y,zRx,yA'_{y,2x+3y,z} = A_{x,y,z - \mathrm{R}_{x,y}} where Rx,y\mathrm{R}_{x,y} is computed by setting R0,0=0\mathrm{R}_{0,0}=0 and Rxi,yi=(i+1)(i+2)2\mathrm{R}_{x_i ,y_i }=\frac{(i+1)(i+2)}2 where x0=1x_0 =1, y0=0y_0 =0, xi+1=yix_{i+1}=y_i and yi+1=2xi+3yimod5y_{i+1}=2x_i +3y_i \operatorname{mod} 5. The χ\chi step is Ax,y,z=+(Ax+1,y,z+1)Ax+2,y,zA_{x,y,z} \stackrel+= (A_{x+1,y,z} + 1)A_{x + 2,y,z}. Step ιi\iota _i is identity except A0,0,2j1=+(xj+7imodx8+x6+x5+x4+1)modxA_{0,0,2^j-1} \stackrel+= (x^{j+7i} \operatorname{mod} x^8 + x^6 + x^5 + x^4 + 1) \operatorname{mod} x for j[0,6]j\in [0,6]. The permutation is used in a sponge or duplex constructions to derive all symmetric cryptography.

Ed448: Consider the elliptic curve y2+x3=139081x2+xamod244822241y^2 + x^3 = 1 - 39081 x^2 + x \phantom{a} \operatorname{mod} 2^{448} - 2^{224} - 1 with base point G=(,19)G = (\ldots , 19) of prime order n=244613818066809895115352007386748515426880336692474882178609894547503885n = 2^{446} - 13 818 066 809 895 115 352 007 386 748 515 426 880 336 692 474 882 178 609 894 547 503 885. The private key is a number dd chosen from {1,,n1}\{1,\ldots ,n-1\}. The public key is the point H=dGH = d G encoded as a 448 bit xx-coordinate. From two keypairs (dA,HA)(d_A, H_A) and (dB,HB)(d_B, H_B) the Diffie-Hellman secret is calculated as: S=dAHB=dA(dBG)=dB(dAG)=dBHAS = d_A H_B = d_A ( d_B G ) = d_B ( d_A G ) = d_B H_A.

To sign a message m{1,,n1}m \in \{1,\ldots ,n-1\} we compute r=hash(m)amodnr = \mathrm{hash}(m) \phantom{a} \operatorname{mod} n, R=rGR = r G and s=r+hash(mHR)damodns = r + \mathrm{hash}(m∥H∥R) d \phantom{a} \operatorname{mod} n. The signature is (R,s)(R,s) with RR encoded like the public key and ss as a 56 byte integer. To verify we check sG=R+hash(mHR)Hs G = R + \mathrm{hash}(m∥H∥R) H.

Hokkaido: Let D=GF(2)nD=GF(2)^n and tt be a random permutation on DD. Let vv and ww be vectors of random elements of DlD^l. The domain parameters are (n,l,t,v,w)(n,l,t,v,w). The challenge issuer chooses x0Dx_0 \in D and randomly iterates either xi+1=t(xivi)x_{i+1} = t(x_i \oplus v_i ) or xi+1=t(xiwi)x_{i+1} = t(x_i \oplus w_i ) while calculating a checksum c=0lirot(xi,i)c =\mathrm{\oplus }^{i}_{0\ldots l} \mathrm{rot}(x_i ,i). The challenge is (x0,c)(x_0 , c). The solver brute-forces the choices to produce a path that results in the same cc sqtarting from x0x_0 .

SCrypt: Given a salt SS, passphrase PP and parameters (n,l,r)(n, l, r). Take a random function h:GF(2)lGF(2)l\mathrm{h}:GF(2)^l\rightarrow GF(2)^l and initialize an array of bitvectors VGF(2)n×lV \in GF(2)^{n\times l} by setting V0=hashl(SP)V_0 = \mathrm{hash}_{l}(S ∥ P) and Vi=h(Vi1)V_i = \mathrm{h}(V_{i-1}). A sequence is then started with X0=h(Vn1)X_0 = \mathrm{h}(V_{n-1}) and iterated with Xi=h(Xi1VXimodn)X_i = \mathrm{h}(X_{i-1} \oplus V_{X_i \operatorname{mod} n}). The resulting key K=hash(PXr1)K = \mathrm{hash}(P ∥ X_{r-1}). In our implementation h\mathrm{h} is Keyak-ff.

Socialist Millionaire: Take GG and nn from Ed448. Alice and Bob have secrets sAs_A and sBs_B and want to verify their equality. They each generate three random numbers from {1,,n1}\{1,\ldots ,n-1\}: a1,a2,a3,b1,b2,b3a_1 , a_2 , a_3 , b_1 , b_2 , b_3 . Using Diffie-Hellman they establish S1=a1b1GS_1 = a_1 b_1 G and S2=a2b2GS_2 = a_2 b_2 G. They compute PA=a3S1P_A = a_3 S_1 and PB=b3S1P_B = b_3 S_1 and communicate the result. They compute QA=a3G+sAS2Q_A = a_3 G + s_A S_2 and QB=b3G+sBS2Q_B = b_3 G + s_B S_2 and communicate the result. A final Diffie-Hellman establishes S3=a1b1(QAQB)S_3 = a_1 b_1 (Q_A - Q_B). Both now verify: S3PA+PB=a1b1a2b2(sAsB)G=0S_3 - P_A + P_B = a_1 b_1 a_2 b_2 (s_A - s_B) G = 0 iff sA=sBs_A = s_B.

Sources: