Audit Logging using Trillian

Audit Logs: Why do we need them?

We often need to establish a digital paper trail (a ledger) within a software product to record:

What action was taken?
Who initiated the action?
When did the action occur?

This information might be needed for compliance reasons or to establish operational integrity.
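As a rough illustration, the three questions above map onto a small record type. The sketch below is an assumption for this post, not truyo's actual schema: the names AuditEvent, action, actor and occurred_at are hypothetical. It serializes the record to a stable byte string, which matters later because a verifiable log hashes the exact bytes it is given.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record: what happened, who did it, and when."""
    action: str       # what action was taken
    actor: str        # who initiated the action
    occurred_at: str  # when the action occurred (ISO 8601, UTC)

    def to_bytes(self) -> bytes:
        # Sorted keys and fixed separators give the same bytes for the same
        # event every time, so its hash is reproducible.
        return json.dumps(
            asdict(self), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")


def new_event(action: str, actor: str, now: datetime = None) -> AuditEvent:
    """Build an event stamped with the current UTC time unless one is given."""
    now = now or datetime.now(timezone.utc)
    return AuditEvent(action=action, actor=actor, occurred_at=now.isoformat())
```

Calling new_event("delete_request", "consumer-42").to_bytes() would yield the bytes that get submitted as a log entry.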

We at Intraedge have created a privacy rights management platform called truyo that automates consumer privacy rights. One of our use cases is to establish a ledger that records the different actions carried out by consumers and organizations. This ledger can be provided to an external auditor to prove compliance with regulations such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA) and others. A key requirement for our audit logs is that they be immutable and cryptographically verifiable.

Ledger using Blockchain

Our current solution is blockchain-based. The key driving force behind blockchain is that its records are cryptographically linked and very hard to tamper with. However, scalability is a real concern for a blockchain-based solution. Our current solution reached a particular throughput and will not scale beyond that limit. Moreover, more efficient and scalable options like "Fabric" have a steep learning curve and a support overhead that need to be accounted for.
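To make "cryptographically linked" concrete, here is a minimal sketch of a hash chain, where each entry's hash covers the previous hash. It is a generic illustration under my own assumptions (the function names and the all-zero starting value are made up for this example), not the data model of any particular blockchain.

```python
import hashlib

# Arbitrary starting value for the chain (an assumption for this sketch).
GENESIS = b"\x00" * 32


def chain_hashes(entries):
    """Return one hash per entry, each covering the previous hash."""
    prev = GENESIS
    hashes = []
    for entry in entries:
        prev = hashlib.sha256(prev + entry).digest()
        hashes.append(prev)
    return hashes


def verify_chain(entries, hashes):
    """Recompute the chain and compare it with the recorded hashes."""
    return chain_hashes(entries) == hashes
```

Changing any entry changes its hash and every hash after it, so tampering with an old record is detectable by anyone holding the latest hash.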

After more analysis, we came to the consensus that we do not need the decentralized trust that blockchain solutions provide. However, we do like the underlying techniques, and we want to leverage cryptographic linking and hashing to establish our immutable, cryptographically verifiable logs. That is where one of my colleagues, Kerry Martin, discovered Google's Trillian, which describes itself as "a transparent, highly scalable and cryptographically verifiable data store."

Introducing Trillian

Trillian is an open-source project by Google that evolved from Certificate Transparency. While solving certificate transparency, Google created a generic transparency solution that can be used beyond certificates, for any use case that requires a verifiable data structure. This was a perfect fit for our use case. Just as importantly, it has the backing of Google.

However, the documentation around the project is still evolving, and many of the references revolve around Certificate Transparency. So I decided to start this blog series so that anyone else planning to use and deploy Trillian can benefit from it. In this first post, I want to focus on getting started with Trillian. As we progress with the solution and move towards production deployment, I plan to write more posts about our experience.

At some point, I hope to send a pull request to the Trillian team so that this material can be incorporated into the GitHub repository.

Concepts

I plan to write separate posts on some of the key concepts behind Trillian. In the meantime, the following resources really helped me understand them:

Merkle Trees
What is Certificate Transparency? (It is not our use case, but it really helps to understand the problem Trillian was originally built for)
How Certificate Transparency works
How Log Proofs work
gRPC protocol

Architecture

The client application generates the audit logs and sends them to the Trillian Log Server over gRPC, addressed to a particular tree.

These logs are queued in a MySQL/MariaDB/Cloud Spanner database.

At regular intervals, the Trillian Log Signer takes the queued entries and integrates them into the existing Merkle tree. This produces a new Merkle root hash, which is then signed with the private key for that tree.
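To show how a single root hash can commit to every entry in the log, here is a minimal sketch of the Merkle tree hash defined in RFC 6962 (the Certificate Transparency specification), using SHA-256. It is an illustration only, not Trillian's implementation: the function names are my own, and it recomputes the whole tree on each call instead of updating it incrementally.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962: leaves are hashed with a 0x00 prefix.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: interior nodes are hashed with a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves) -> bytes:
    """Merkle tree hash over a list of byte strings, as in RFC 6962."""
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))
```

Appending an entry always yields a different root, which is why signing the root is enough to vouch for the whole log at that size.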

An auditor application can continuously monitor the root hash and verify that the log stays consistent as new entries are added. It can also check that a given log entry is included in the tree using an audit proof (also called an inclusion proof).

To run multiple Log Signers for better availability and resiliency, we can also add an etcd cluster, which is used for leader election among the Log Signers. If we are using multiple trees, this can also provide better scalability.
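The inclusion check can be sketched as follows. This is a self-contained illustration of audit paths as described in RFC 6962, with verification following the algorithm in its successor, RFC 9162. It is not Trillian's client code, and all function names are my own. In a real deployment the log server supplies the path and the auditor only runs the verification step.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves) -> bytes:
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_path(index: int, leaves) -> list:
    """Sibling hashes from the leaf at `index` up to the root (log side)."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return inclusion_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return inclusion_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def verify_inclusion(index, tree_size, leaf, path, root) -> bool:
    """Recompute the root from one leaf and its path (auditor side)."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(leaf)
    for p in path:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)
            # Skip levels where this node has no right sibling.
            while (fn & 1) == 0 and fn != 0:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

The auditor needs only the entry, its index, the tree size, the path and the signed root; it never has to download the whole log.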

Setting Up Trillian Locally

Pre-Requisites

Setup

To get started running Trillian locally, follow the instructions provided at https://github.com/google/trillian/tree/master/examples/deployment#setup

For local testing, I made a small tweak to the docker-compose file and exposed the MySQL port so that I can connect using MySQL Workbench:

    services:
      mysql:
        .............
        restart: always # keep the MySQL server running
        ports:
          - "3308:3306"

Working with REST API

Note: The REST API is not recommended for production use. It simply gives us an easy way to get started with Trillian. I will work on providing an alternative guide using a gRPC client.

Using the proto specification file, I was able to create a Postman collection covering the following operations:

Adding a tree
Initializing a tree
Getting a tree
Queuing a leaf entry
Getting a leaf by index

The collection can be downloaded from https://www.getpostman.com/collections/a5e51863bdc949c9ebe3
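One detail that is easy to miss when calling the REST API by hand: in the standard proto3 JSON mapping, bytes fields are sent as base64 strings. The sketch below builds a JSON body for queuing a leaf, assuming the request shape follows the QueueLeafRequest and LogLeaf messages in the proto file (a leaf object with leaf_value and optional extra_data). The helper name is hypothetical, and the endpoint URL is deliberately left out; take that from the proto annotations or the Postman collection.

```python
import base64
import json


def queue_leaf_body(leaf_value: bytes, extra_data: bytes = b"") -> str:
    """Build a JSON body for queuing one leaf (hypothetical helper).

    Assumes the body mirrors QueueLeafRequest: {"leaf": {...LogLeaf...}}.
    """
    leaf = {"leaf_value": base64.b64encode(leaf_value).decode("ascii")}
    if extra_data:
        leaf["extra_data"] = base64.b64encode(extra_data).decode("ascii")
    return json.dumps({"leaf": leaf})
```

For example, queue_leaf_body(b"hello") produces a body whose leaf_value is the base64 string "aGVsbG8=".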

gRPC Protocol Specification

That concludes part 1 of this blog series. Watch for part 2, which will cover how we plan to use Trillian in development and what we have found along the way.
