Data Sources

SSL Certificates

Our data is provided by Rapid7, which generously makes its (roughly) weekly full IPv4 HTTPS scans available. The data can be downloaded from the University of Michigan Internet Scans Repository. For this study, we use the scans collected between October 30, 2013 and March 30, 2015. Overall, we observe 38,514,130 unique SSL certificates.

Alexa Domains

In Section 7.2, we compare the revocations in CRLs from Alexa Top 1 Million domains with those covered by CRLSets.

Trusted Root Set

We use the root store in OS X 10.9.2 as our set of trusted roots.

Processed Data

Intermediate Set

We verify all observed certificates by first building the set of all intermediate certificates that can be verified relative to the roots. We discover 1,946 such intermediate certificates, which we refer to as the Intermediate Set.

Leaf Set

We then verify all leaf certificates using this set of intermediate and root certificates. We discover a total of 5,067,476 such leaf certificates, which we refer to as the Leaf Set.
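The two verification passes described above can be sketched as a fixed-point computation. This is only an illustration under stated assumptions: the `Cert` record, `signed_by`, and `build_sets` are hypothetical names, and `signed_by` stands in for real X.509 verification (signature, validity period, and constraints) by simply matching issuer and subject names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    # Hypothetical, simplified certificate record (not the study's data model).
    subject: str
    issuer: str
    is_ca: bool


def signed_by(child, parent):
    # Placeholder check: a real implementation would verify the signature,
    # validity period, and constraints, not just compare names.
    return parent.is_ca and child.issuer == parent.subject


def build_sets(roots, observed):
    """Return (intermediates, leaves) that chain to the given roots."""
    trusted = set(roots)
    intermediates = set()
    changed = True
    # Repeat until no new CA certificate can be attached to the trusted set.
    while changed:
        changed = False
        for cert in observed:
            if cert.is_ca and cert not in trusted and any(
                signed_by(cert, parent) for parent in trusted
            ):
                trusted.add(cert)
                intermediates.add(cert)
                changed = True
    # Second pass: leaves are non-CA certificates issued by a trusted CA.
    leaves = {
        cert
        for cert in observed
        if not cert.is_ca and any(signed_by(cert, parent) for parent in trusted)
    }
    return intermediates, leaves
```

Iterating to a fixed point means the order in which certificates appear in the scans does not matter: an intermediate issued by another intermediate is picked up on a later pass.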

Revocation Status

To determine the revocation status of all valid certificates from these scans, we check the status of the certificates in the Intermediate Set and the Leaf Set every day, starting in October 2014.

CRL

For the certificates that include a CRL distribution point, we use this CRL to obtain revocation information for the certificate. We observe a total of 2,800 unique CRLs, and we configure a crawler to download each of these CRLs once per day between October 2, 2014 and March 31, 2015.
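A once-per-day CRL download of this kind might look roughly like the sketch below. It is not the crawler used in the study: the function names, the per-day directory layout, and the hashed file names are all assumptions made for illustration, and scheduling (for example via cron) is left out.

```python
import hashlib
import urllib.request
from datetime import date
from pathlib import Path


def fetch_url(url, timeout=30):
    # Download the raw bytes behind a CRL distribution point URL.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def crawl_once(crl_urls, out_dir, day, fetch=fetch_url):
    """Download each unique CRL once and store it under out_dir/<YYYY-MM-DD>/.

    Returns a list of (url, error message) pairs for failed downloads.
    """
    day_dir = Path(out_dir) / day.isoformat()
    day_dir.mkdir(parents=True, exist_ok=True)
    failures = []
    for url in sorted(set(crl_urls)):
        # Hash the URL so that any URL maps to a safe, stable file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".crl"
        try:
            (day_dir / name).write_bytes(fetch(url))
        except OSError as exc:  # covers urllib.error.URLError and timeouts
            failures.append((url, str(exc)))
    return failures
```

Calling `crawl_once(urls, "crls", date.today())` once per day would produce one directory of CRL snapshots per day; the `fetch` parameter exists only so the download step can be replaced in tests.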

OCSP

We observe a total of 499 unique OCSP responders across all certificates. We query OCSP responders only for the 642 certificates that list an OCSP responder but no CRL distribution point. This data was collected on March 31, 2015.
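The choice of revocation source per certificate (CRL when a distribution point is present, OCSP only otherwise) can be expressed as a small rule. The dictionary keys `crl_urls` and `ocsp_urls` below are hypothetical field names chosen for this sketch, not the layout of the released data.

```python
def revocation_source(cert):
    """Pick the source used to check a certificate's revocation status."""
    # Prefer the CRL whenever a CRL distribution point is listed.
    if cert.get("crl_urls"):
        return "crl"
    # Fall back to OCSP only when no CRL distribution point exists.
    if cert.get("ocsp_urls"):
        return "ocsp"
    # Neither pointer is present, so the status cannot be checked.
    return "none"


def ocsp_only(certs):
    # Certificates that would be queried via their OCSP responder.
    return [c for c in certs if revocation_source(c) == "ocsp"]
```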

OCSP Stapling

To determine what fraction of certificates are hosted on servers that support OCSP Stapling, we use the IPv4 TLS Handshake scans conducted by the University of Michigan, which can be downloaded from this link. We examine the scan of March 28, 2015, and look for servers that were advertising certificates in the Leaf Set.

CRLSets

To examine Google's approach, CRLSets, we fetched the CRLSet file once per day between September 23, 2014 and March 31, 2015, and crawled 110 historical CRLSets originally published between July 18, 2013 and September 23, 2014. In total, our dataset contains 300 unique CRLSets.

Certificate Database

We place all 5,067,476 valid certificates we find into a SQLite database. This database can be downloaded from this link (2.9 GB in total). There are a number of tables in this database, which are briefly described below.
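As a starting point for exploring the downloaded file, the following sketch lists every table and its row count using Python's standard `sqlite3` module. It makes no assumption about the table names; the helper name `table_row_counts` and the file name in the usage note are hypothetical.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table name: row count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

For example, `table_row_counts("certificates.db")` (with the path replaced by wherever the download was saved) gives a quick overview before writing more specific queries.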

Test Suite

Our test suite is currently offline. However, below are some test examples.

Examples to test how browsers deal with revocations when both a CRL distribution point and an OCSP responder are available:

Examples to test how browsers deal with CRL revocations:

Examples to test how browsers deal with OCSP revocations:

Examples to test how browsers deal with OCSP Stapling revocations: