Cweb

Cweb (si-wɛb) is a collection of P2P protocols for private communication, implementing the Open Private Data Architecture. There is no central server and clients communicate directly, but instead of direct TCP/IP connections they use user-managed online object storage as an intermediary. All data is end-to-end encrypted, and sharing and access control are managed via cryptography.

Reliable P2P for mobile clients

Many traditional P2P protocols assume reliable network connectivity and high bandwidth - reasonable assumptions for desktop applications, but not for mobile clients. The basic technical challenge Cweb addresses is reliable P2P communication for peers that are constrained by one or more of: (a) intermittent connectivity, due to (mobile) network unavailability or the device being in sleep mode, (b) inability to receive incoming connections, due to not having a permanent IP address or being behind NAT, (c) low bandwidth and storage, due to hardware or (mobile) network limitations.

Plain TCP/IP is insufficient for reliable P2P between mobile clients, since it cannot buffer messages while the client is unreachable, and it cannot reach clients behind NAT. Traditional SaaS applications overcome these limitations by storing data on a developer-operated backend and having all clients communicate with the server, rather than with each other.

In Cweb, instead of a central server, every client maintains its own off-device online storage that can be thought of as a lightweight home server. A client’s storage serves as its outgoing mailbox, accessible to other clients irrespective of the owner’s momentary connectivity state. The storage server is anything implementing (a small subset of) the Amazon S3 REST API. There is no specialized Cweb server - all application logic and encryption lives on the client. Cweb protocols rely on data envelopes and cryptography to reduce high-level communication protocols, such as messaging or file sharing, to object storage reads and writes.

The approach can be thought of as application-level store and forward or delay-tolerant networking (DTN), with the important difference that messages are buffered in client-controlled online storage instead of in the networking infrastructure.

Storage-augmented P2P communication

Suppose every peer has provisioned online object storage and has two access keys: a write key for full access to its storage, and a read key for read-only access. Write keys are kept secret and are never shared. Read keys are shared with peers. An instance of a Cweb protocol performs two types of operations:

writing files into its own storage, using its write key,
reading files from other peers’ storage, using their read keys.

Consider an example of delivering a message from peer A to B.

# Peer A
send(B, "hello")

# Peer B
message = receive(A)
print(message)  # "hello"

To send the message to B, peer A writes an encrypted file containing the message to its own storage, and shares both its storage read key and the message decryption key with B. If B wants to receive messages from A, B polls A’s storage and checks whether a new message has been published.
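The exchange above can be sketched with an in-memory stand-in for A's storage. This is a minimal illustration, not the Cweb implementation: the `ObjectStorage` class, its `put`/`get` methods, and the `send`/`receive` signatures are hypothetical, a real deployment would talk to an S3-compatible service, and encryption is left out so the payload is treated as opaque ciphertext bytes.

```python
import secrets


class ObjectStorage:
    """Hypothetical in-memory stand-in for one peer's online object storage."""

    def __init__(self):
        # Two access keys, as in the text: write = full access, read = read-only.
        self.write_key = secrets.token_hex(16)
        self.read_key = secrets.token_hex(16)
        self._objects = {}

    def put(self, access_key, name, data):
        if not secrets.compare_digest(access_key, self.write_key):
            raise PermissionError("write key required")
        self._objects[name] = data

    def get(self, access_key, name):
        allowed = (self.read_key, self.write_key)
        if not any(secrets.compare_digest(access_key, k) for k in allowed):
            raise PermissionError("read or write key required")
        return self._objects.get(name)  # None if nothing has been published


def send(storage, write_key, name, ciphertext):
    """Peer A publishes an already-encrypted message in its own storage."""
    storage.put(write_key, name, ciphertext)


def receive(storage, read_key, name):
    """Peer B polls A's storage once; returns None if no message is there yet."""
    return storage.get(read_key, name)


# A owns the storage; B is given only the read key and the file name.
a_storage = ObjectStorage()
name = secrets.token_hex(16)
first_poll = receive(a_storage, a_storage.read_key, name)   # nothing published yet
send(a_storage, a_storage.write_key, name, b"<ciphertext of 'hello'>")
second_poll = receive(a_storage, a_storage.read_key, name)  # message is now visible
print(first_poll, second_poll)
```

Note that B never writes to A's storage and A never contacts B: delivery is reduced to one write by the sender and one or more reads by the recipient.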

Push notifications. Pushing to (or notifying) a peer in an asynchronous P2P system is an inherently hard problem. On mobile, push notifications are notoriously hard outside of the Google and Apple solutions (FCM, APNS). These solutions, and even the independent ones (Pushy), all require the app developer to provision and operate a notification service on behalf of its users - something to be avoided in a P2P system. Therefore, Cweb protocols are asynchronous and pull-based and do not support push. The pull frequency can be tuned to trade delivery latency against resource usage for specific use cases.
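One way to tune the pull frequency is to poll quickly at first and back off while nothing arrives. The sketch below is an assumption for illustration only: the text does not prescribe a polling schedule, and the `poll` function, its parameters, and the doubling strategy are hypothetical.

```python
import time


def poll(fetch, min_interval=1.0, max_interval=60.0, max_attempts=10, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or attempts run out.

    The wait doubles after every empty poll, capped at max_interval, so an
    idle conversation costs few requests while an active one stays responsive.
    """
    interval = min_interval
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(interval)
            interval = min(interval * 2, max_interval)
    return None


# Simulated remote storage: two empty polls, then a message.
responses = iter([None, None, b"hello"])
waits = []  # record the waits instead of actually sleeping
message = poll(lambda: next(responses), sleep=waits.append)
print(message, waits)  # b'hello' [1.0, 2.0]
```

A small `min_interval` favors latency, while a large `max_interval` favors battery and bandwidth; a chat application and a file-sync application would likely choose different values.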

Privacy and security

Besides the obvious requirement of message delivery, we are looking for privacy and security properties such as:

Only B can read the content of the message, but not the intermediaries or any other party.
B can verify the authenticity of the message, i.e., that it was actually sent by A.
The fact that A sent a message to B should not be known to intermediaries or observers (metadata privacy).

End-to-end encryption

Cweb uses end-to-end encryption to ensure that only the intended recipients can read messages. The plaintext never leaves the client, not even to its own online storage (except for data that is intentionally public).

Every file is encrypted with an individual key, which allows fine-grained access control. Read access is granted by sharing specific decryption keys with the intended recipients. Efficient key management is one of the technical challenges in building Cweb protocols.
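The per-file key idea can be illustrated as follows. This is a sketch under loud assumptions: the `Owner` class and its methods are hypothetical, the readable file names are for illustration only, and `keystream_xor` is a toy stand-in built from the standard library; it provides no integrity protection, and a real client would use an authenticated cipher from a vetted cryptography library.

```python
import hashlib
import secrets


def keystream_xor(key, data):
    """Toy cipher (assumption, not Cweb's): XOR data with a SHAKE-256 keystream.

    XOR is symmetric, so the same call both encrypts and decrypts.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class Owner:
    """Hypothetical client that encrypts every file under its own key."""

    def __init__(self):
        self.storage = {}    # file name -> ciphertext (all the storage ever sees)
        self.file_keys = {}  # file name -> per-file key (kept on the client)

    def publish(self, name, plaintext):
        key = secrets.token_bytes(32)  # fresh random key for every file
        self.file_keys[name] = key
        self.storage[name] = keystream_xor(key, plaintext)

    def grant(self, name):
        """Return the key for one file, to be shared with a chosen recipient."""
        return self.file_keys[name]


owner = Owner()
owner.publish("notes", b"for B only")
owner.publish("photo", b"for C only")

key_for_b = owner.grant("notes")  # B receives access to "notes" and nothing else
notes_plain = keystream_xor(key_for_b, owner.storage["notes"])
photo_attempt = keystream_xor(key_for_b, owner.storage["photo"])  # wrong key
print(notes_plain)                     # b'for B only'
print(photo_attempt == b"for C only")  # wrong key does not recover the photo
```

Because keys are per file, granting access is just handing over one key, which also hints at the key-management cost: the number of keys to track grows with the number of files.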

To verify message authenticity, Cweb employs standard public-key signatures.

Metadata privacy

Consider three types of threats:

Peers that have read access to A’s storage could observe files and infer communication patterns with other peers.
Own or others’ storage systems could observe access patterns and network connection metadata.
Other intermediaries, such as mobile network providers or internet routers, could observe access patterns and network connection metadata.

To prevent peers from guessing who A is communicating with by looking at files in A’s storage, Cweb encodes file names such that only the intended recipient(s) can construct them. Names are generated from random numbers or one-way hashes, so guessing a file name intended for a specific peer is infeasible. Cweb protocols fetch remote files by name, and it is advisable to disallow public list access. This makes the first threat difficult or impossible to carry out, although it may still be possible to extract some information from observed file lifetimes and sizes.
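As an illustration of names that only the intended recipient can construct, the sketch below derives each name from a keyed one-way hash. The scheme is an assumption, not Cweb's actual encoding: the `file_name` function, the pairwise secret, and the message counter are hypothetical.

```python
import hashlib
import hmac
import secrets


def file_name(shared_secret, sequence_number):
    """Derive an opaque file name from a pairwise secret and a message counter."""
    counter = sequence_number.to_bytes(8, "big")
    return hmac.new(shared_secret, counter, hashlib.sha256).hexdigest()


secret_ab = secrets.token_bytes(32)  # assumed to be known only to A and B
secret_ac = secrets.token_bytes(32)  # assumed to be known only to A and C

name_written_by_a = file_name(secret_ab, 0)   # A writes message 0 for B
name_computed_by_b = file_name(secret_ab, 0)  # B derives the same name to fetch it
name_computed_by_c = file_name(secret_ac, 0)  # C's secret yields a different name

print(name_written_by_a == name_computed_by_b)  # True
print(len(name_written_by_a))                   # 64
```

With listing disabled, a peer that holds A's read key but not `secret_ab` has no practical way to learn that the file for B exists, since it can only request names it is able to compute.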

A’s storage system observes all writes by its owner A, and all reads by the peers that communicate with A. It can observe metadata such as access timing, file sizes, IP addresses, etc. These are harder to hide, and are also visible to other parties such as network infrastructure providers. A VPN or Tor can obfuscate connection metadata, and it may be possible to enhance Cweb protocols to obfuscate timing and file sizes.
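To make the file-size idea concrete, one possible approach is to pad every plaintext to a fixed bucket size before encryption, so that the stored ciphertext reveals only the bucket. This is purely a sketch of a potential enhancement, not an existing Cweb feature: the `pad`/`unpad` helpers, the 4-byte length prefix, and the power-of-two buckets are all assumptions.

```python
def pad(data, min_bucket=256):
    """Prefix data with its 4-byte length and zero-pad to a power-of-two bucket.

    Assumes min_bucket is a positive power of two and len(data) fits in 4 bytes.
    """
    body = len(data).to_bytes(4, "big") + data
    bucket = min_bucket
    while bucket < len(body):
        bucket *= 2
    return body + bytes(bucket - len(body))


def unpad(padded):
    """Recover the original data from the output of pad()."""
    length = int.from_bytes(padded[:4], "big")
    return padded[4:4 + length]


short = pad(b"hello")
longer = pad(b"x" * 300)
print(len(short), len(longer))  # 256 512
print(unpad(short))             # b'hello'
```

The cost is wasted storage and bandwidth, up to roughly a factor of two with power-of-two buckets, which matters for the bandwidth-constrained clients the protocols target.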

To conclude, Cweb does not guarantee full metadata privacy. Further analysis of potential threats is needed, as well as protocol enhancements to counter them.

Others’ data

Cweb does not cache, replicate, or otherwise serve other peers’ data. There are several reasons:

Serving others’ content comes with liability risks.
Cweb focuses on private communication where replication is less beneficial.
With always-on online storage, data availability does not require caching or replication.