An End-to-End Measurement of
Certificate Revocation in the Web's PKI

Yabing Liu, Will Tome, Liang Zhang, David Choffnes, Dave Levin,
Bruce Maggs, Alan Mislove, Aaron Schulman, Christo Wilson

Paper Overview

While the overall SSL ecosystem is well-studied, the frequency with which certificates are revoked and the circumstances under which clients (e.g., browsers) check whether certificates are revoked are still not well-understood.
In our IMC'15 paper, we took a close look at certificate revocations in the Web's PKI. Using 74 full IPv4 HTTPS scans, we found that a surprisingly large fraction (8%) of the certificates served have been revoked, and that obtaining certificate revocation information can often be expensive in terms of latency and bandwidth for clients. We then studied the revocation checking behavior of 30 different combinations of web browsers and operating systems; we found that browsers often do not bother to check whether certificates are revoked (including mobile browsers, which uniformly never check). We also examined the CRLSet infrastructure built into Google Chrome for disseminating revocations; we found that CRLSet only covers 0.35% of all revocations. Overall, our results paint a bleak picture of the ability to effectively revoke certificates today.

This paper was published at IMC 2015 (Internet Measurement Conference), and you can download our paper here.

Data Sources

Our data is provided by Rapid7, which generously conducts (roughly) weekly full IPv4 HTTPS scans. The data can be downloaded from the University of Michigan Internet Scans Repository. For this study, we use the scans between October 30, 2013 and March 30, 2015. Overall, we observe 38,514,130 unique SSL certificates.

In Section 7.2 of the paper, we compare the revocations in CRLs for Alexa Top 1 Million domains with those in the CRLSets.

We use the Root Store in OS X 10.9.2 as our set of trusted roots.

Processed Data

Intermediate Set

We verify all observed certificates by first building the set of all intermediate certificates that can be verified relative to the roots. We discover 1,946 such intermediate certificates, which we refer to as the Intermediate Set.

Leaf Set

We then verify all leaf certificates using this set of intermediate and root certificates. We discover a total of 5,067,476 such leaf certificates, which we refer to as the Leaf Set.
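The two-stage verification described above can be pictured as a fixed-point computation: keep promoting CA certificates to "trusted" as long as one of the already trusted certificates signed them, then accept every leaf signed by a trusted certificate. The sketch below is only an illustration under stated assumptions. The `signed_by` callback, the `build_sets` function, and the toy issuer map are hypothetical stand-ins for real signature validation, which would also check validity periods, basic constraints, and so on.

```python
from typing import Callable, Iterable, Set, Tuple


def build_sets(
    roots: Iterable[str],
    ca_certs: Iterable[str],
    leaf_certs: Iterable[str],
    signed_by: Callable[[str, str], bool],
) -> Tuple[Set[str], Set[str]]:
    """Return (intermediate_set, leaf_set) that chain up to the trusted roots."""
    trusted = set(roots)
    candidates = set(ca_certs) - trusted
    intermediates: Set[str] = set()
    changed = True
    # Repeat until a full pass adds no new intermediate (fixed point).
    while changed:
        changed = False
        for cert in list(candidates):
            if any(signed_by(cert, issuer) for issuer in trusted):
                candidates.discard(cert)
                intermediates.add(cert)
                trusted.add(cert)
                changed = True
    # A leaf is accepted if any root or verified intermediate signed it.
    leaves = {c for c in leaf_certs if any(signed_by(c, i) for i in trusted)}
    return intermediates, leaves


# Hypothetical toy data: maps each certificate name to its issuer's name.
ISSUER = {
    "int-A": "root-1",
    "int-B": "int-A",
    "int-X": "unknown-root",
    "leaf-1": "int-B",
    "leaf-2": "int-X",
    "leaf-3": "root-1",
}


def toy_signed_by(cert: str, issuer: str) -> bool:
    # Stand-in for a real cryptographic signature check.
    return ISSUER.get(cert) == issuer


intermediates, leaves = build_sets(
    roots={"root-1"},
    ca_certs={"int-A", "int-B", "int-X"},
    leaf_certs={"leaf-1", "leaf-2", "leaf-3"},
    signed_by=toy_signed_by,
)
print(sorted(intermediates), sorted(leaves))
```

In this toy run, `int-X` and `leaf-2` are dropped because they chain to an issuer outside the trusted roots.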

Revocation Status

To determine the revocation status of all valid certificates from these scans, we check the status of the Intermediate Set and Leaf Set every day, starting in October 2014.

CRL

For the certificates that include a CRL distribution point, we use this CRL to obtain revocation information for the certificate. We observe a total of 2,800 unique CRLs, and we configure a crawler to download each of these CRLs once per day between October 2, 2014 and March 31, 2015.

OCSP

We observe a total of 499 unique OCSP responders across all certificates. We query OCSP responders only for the 642 certificates that list an OCSP responder but no CRL distribution point. This data was collected on March 31, 2015.
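The selection rule described in the CRL and OCSP paragraphs (prefer the CRL when a distribution point exists, fall back to OCSP only otherwise) can be summarized in a few lines. This is a minimal sketch, not the crawler used in the study; the function name `revocation_source` and the example URLs are hypothetical.

```python
from typing import List, Optional, Tuple


def revocation_source(
    crl_urls: List[str], ocsp_urls: List[str]
) -> Optional[Tuple[str, List[str]]]:
    """Pick where to look up a certificate's revocation status."""
    if crl_urls:
        # A CRL distribution point is present: use the CRL.
        return ("crl", crl_urls)
    if ocsp_urls:
        # No CRL distribution point, but an OCSP responder is listed.
        return ("ocsp", ocsp_urls)
    # Neither is present: no revocation information can be obtained.
    return None


both = revocation_source(["http://crl.example/ca.crl"], ["http://ocsp.example"])
ocsp_only = revocation_source([], ["http://ocsp.example"])
neither = revocation_source([], [])
print(both, ocsp_only, neither)
```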

OCSP Stapling

To determine what fraction of certificates are hosted on servers that support OCSP Stapling, we use the IPv4 TLS Handshake scans conducted by the University of Michigan, which can be downloaded from this link. We examine the scan of March 28, 2015, and look for servers that were advertising certificates in the Leaf Set.

CRLSets

To examine Google's CRLSets approach, we fetched the CRLSet files once per day between September 23, 2014 and March 31, 2015, and crawled 110 historical CRLSets originally published between July 18, 2013 and September 23, 2014. In total, our dataset contains 300 unique CRLSets.

Certificate Database

We place all 5,067,476 valid certificates we find into a SQLite database. This database can be downloaded from this link (in total 2.9GB). There are a number of tables in this database, which are briefly described below.

  • domains: This is a table containing one entry for each domain. Included fields are:
    • domainid The index of the domain
    • domain The name of the domain
    • alexa The index of the domain within the Alexa Top 1 Million domains (Empty if not among them)
  • certs: This is a table containing one entry for each valid leaf certificate. Included fields are:
    • certid The index of this certificate
    • fingerprint The "fingerprint" of this certificate
    • version The version of this certificate
    • ca Whether or not this certificate is a CA certificate
    • cn The common name of the certificate
    • sans The subject alternative names of this certificate
    • icn The common name of the issuer of this certificate
    • serial The serial number of this certificate
    • key The public key of this certificate
    • isev Whether or not this certificate is an EV certificate
    • start The beginning date of validity for this certificate
    • end The ending date of validity for this certificate
    • birth This certificate's date of birth (see the paper for the definition)
    • death This certificate's date of death (see the paper for the definition)
    • last The time when this certificate was last seen
    • maxhosts The number of hosts that advertised this certificate
    • revoked This certificate's date of revocation (Empty if not revoked)
    • revokedreason This certificate's Reason Code for revocation (-1 if revoked but no reason provided)
    • skeyid The subject key identifier of this certificate
    • akeyid The authority key identifier of this certificate
    • crlsetparent This certificate is covered by the CRLSet
    • crlsetfirst The first CRLSet in which we saw this certificate
    • crlsetlast The last CRLSet in which we saw this certificate
  • oids: This is a table containing one entry for each OID. Included fields are:
    • oidid The index of this OID
    • oid The name of this OID
    • ev Whether or not this OID is associated with an EV certificate
  • crlsdetail: This is a table containing one entry for each unique CRL. Included fields are:
    • crl The name of this CRL
    • crlhash The hash of this CRL's URL (for referencing purposes)
    • certs The number of certificates in this CRL
    • size The size of this CRL
  • crlentries: This is a table containing the entries corresponding to our certificates in the CRLs. Included fields are:
    • crlid The index of this entry
    • serial The serial number of this entry
    • revokedtime The time when the entry was revoked
    • reason This certificate's Reason Code for revocation (-1 if revoked but no reason provided)
    • firstseen The time when the entry was first seen in a CRL
    • lastseen The time when the entry was last seen in a CRL
  • crls: This is a table containing one entry for each unique CRL. Included fields are:
    • crlid The index of this CRL
    • crl The name of this CRL
    • hash The unique index of this CRL
  • crlsizes: This is a table containing one entry for each unique CRL. Included fields are:
    • size The size of this CRL
    • hash The unique index of this CRL
  • crlnames: This is a table containing one entry for each unique CRL. Included fields are:
    • crl The name of this CRL
    • hash The unique index of this CRL
  • ocsps: This is a table containing one entry for each unique OCSP. Included fields are:
    • ocspid The index of this entry
    • ocsp The name of the OCSP responder
  • ocspstapling: This is a table containing one entry for each certificate examined for OCSP Stapling. Included fields are:
    • certhash The index of this certificate
    • stapled The number of servers we saw serving this certificate with a stapled response
    • notstapled The number of servers we saw serving this certificate without a stapled response
  • certdomains: This is a table containing the mapping between certificates and domains. Included fields are:
    • certid The index of the certificate
    • domainid The index of the domain
  • certcrls: This is a table containing the mapping between certificates and CRLs. Included fields are:
    • certid The index of the certificate
    • crlid The index of the CRL
  • certocsps: This is a table containing the mapping between certificates and OCSP. Included fields are:
    • certid The index of the certificate
    • ocspid The index of the OCSP
  • certoids: This is a table containing the mapping between certificates and OIDs. Included fields are:
    • certid The index of the certificate
    • oidid The index of the OID
  • crlsetcrls: This is a table containing the mapping between CRLSets and CRLs. Included fields are:
    • parent The hash of the public key of the CA that covers the CRL
    • crlid The unique index of the CRL
  • crlsetentries: This is a table containing one entry for each unique CRLSet entry. Included fields are:
    • parent The hash of the public key of the CA that signed this entry
    • serial The serial number of this entry
    • firstseen The time when the entry was first seen in a CRLSet
    • lastseen The time when the entry was last seen in a CRLSet
  • months: This is a table containing one entry for each month. Included fields are:
    • month The month, from 2013/10 to 2015/03.
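As a rough illustration of how the tables above fit together, the sketch below builds a tiny in-memory database that mimics a few of the documented columns and runs two queries: the number of revoked certificates, and the revoked certificates per CRL via the certcrls mapping. The column types, the date format, and all rows are assumptions made for this example; in particular, "Empty" in the revoked field is treated as either NULL or an empty string because the description does not say which is used. To query the real data, connect to the downloaded file instead of `:memory:` and skip the setup statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the downloaded database file

# Minimal subset of the documented schema; column types are assumptions.
conn.executescript("""
CREATE TABLE certs (certid INTEGER PRIMARY KEY, cn TEXT, isev INTEGER,
                    revoked TEXT, revokedreason INTEGER);
CREATE TABLE crls (crlid INTEGER PRIMARY KEY, crl TEXT, hash TEXT);
CREATE TABLE certcrls (certid INTEGER, crlid INTEGER);
""")

# Hypothetical toy rows, not taken from the real dataset.
conn.executemany("INSERT INTO certs VALUES (?, ?, ?, ?, ?)", [
    (1, "a.example", 1, "2014-11-02", 1),
    (2, "b.example", 0, "2015-01-15", -1),  # revoked, no reason provided
    (3, "c.example", 0, None, None),        # not revoked (NULL)
    (4, "d.example", 0, "", None),          # not revoked (empty string)
])
conn.executemany("INSERT INTO crls VALUES (?, ?, ?)", [
    (1, "http://crl.example/one.crl", "h1"),
    (2, "http://crl.example/two.crl", "h2"),
])
conn.executemany("INSERT INTO certcrls VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 1), (4, 2)])

IS_REVOKED = "revoked IS NOT NULL AND revoked != ''"

# Query 1: total certificates and how many of them are revoked.
total, revoked = conn.execute(
    f"SELECT COUNT(*), SUM(CASE WHEN {IS_REVOKED} THEN 1 ELSE 0 END) FROM certs"
).fetchone()

# Query 2: revoked certificates per CRL, using the certcrls mapping table.
per_crl = conn.execute(f"""
    SELECT crls.crl, COUNT(*)
    FROM certs
    JOIN certcrls USING (certid)
    JOIN crls USING (crlid)
    WHERE {IS_REVOKED}
    GROUP BY crls.crl
    ORDER BY crls.crl
""").fetchall()

print(total, revoked, per_crl)
```

The same join pattern should carry over to the other mapping tables (certdomains, certocsps, certoids), since each pairs certid with the index of the related table.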

Test Suite

Our test suite is currently offline. However, below are some test examples.

Examples to test how browsers handle revocations when both a CRL distribution point and an OCSP responder are available:
  • EV certificate, 0 intermediates, leaf revoked (test308) -- This is an EV certificate with 0 intermediates and a revoked leaf.
  • EV certificate, 0 intermediates, leaf revoked, No response CRL failure (test309) -- This is an EV certificate with 0 intermediates and a revoked leaf. The CRL server does not respond.
  • EV certificate, 0 intermediates, leaf revoked, No response OCSP failure (test310) -- This is an EV certificate with 0 intermediates and a revoked leaf. The OCSP server does not respond.
Examples to test how browsers handle CRL revocations:
  • EV certificate, 0 intermediates, leaf revoked (test145) -- This is an EV certificate with 0 intermediates and a revoked leaf.
  • EV certificate, 0 intermediates, leaf revoked, 404 failure (test146) -- This is an EV certificate with 0 intermediates and a revoked leaf. The revocation server returns an HTTP 404 error code.
  • EV certificate, 0 intermediates, leaf revoked, NXDOMAIN failure (test148) -- This is an EV certificate with 0 intermediates and a revoked leaf. The domain name of the revocation server does not exist.
  • EV certificate, 0 intermediates, leaf revoked, No response failure (test147) -- This is an EV certificate with 0 intermediates and a revoked leaf. The CRL revocation server does not respond.
  • EV certificate, 0 intermediates, none revoked (test144) -- This is an EV certificate with 0 intermediates, and nothing is revoked.
  • EV certificate, 2 intermediates, intermediate 1/2 revoked (test159) -- This is an EV certificate with 2 intermediates, and the 1st intermediate is revoked.
  • EV certificate, 2 intermediates, intermediate 2/2 revoked (test160) -- This is an EV certificate with 2 intermediates, and the 2nd intermediate is revoked.
Examples to test how browsers handle OCSP revocations:
  • Non-EV certificate, 1 intermediate, intermediate 1/1 revoked (test195) -- This is a Non-EV certificate with 1 revoked intermediate.
  • Non-EV certificate, 1 intermediate, intermediate 1/1 revoked, 404 failure (test197) -- This is a Non-EV certificate with 1 revoked intermediate. The revocation server returns an HTTP 404 error code.
  • Non-EV certificate, 1 intermediate, intermediate 1/1 revoked, NXDOMAIN failure (test201) -- This is a Non-EV certificate with 1 revoked intermediate. The domain name of the revocation server does not exist.
  • Non-EV certificate, 1 intermediate, intermediate 1/1 revoked, No response failure (test199) -- This is a Non-EV certificate with 1 revoked intermediate. The revocation server does not respond.
  • Non-EV certificate, 1 intermediate, intermediate 1/1 revoked, Unknown response failure (test203) -- This is a Non-EV certificate with 1 revoked intermediate. The OCSP responder generates a response with status unknown.
Examples to test how browsers handle OCSP Stapling revocations:
  • Non-EV certificate, 3 intermediates, leaf revoked (test330) -- This is a Non-EV certificate with 3 intermediates and a revoked leaf.
  • Non-EV certificate, 3 intermediates, leaf revoked, Unknown response failure (test331) -- This is a Non-EV certificate with 3 intermediates and a revoked leaf. The OCSP Stapling responder generates a response with status unknown.
  • Non-EV certificate, 3 intermediates, none revoked (test329) -- This is a Non-EV certificate with 3 intermediates, and nothing is revoked.
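The test titles above follow a regular pattern (certificate type, number of intermediates, which element is revoked, an optional failure mode, and the test name), so they can be turned into structured records when tabulating browser behavior. The sketch below is one possible parser written for this page's title format only. The `parse_title` function and the field names are hypothetical and are not part of the original test suite.

```python
import re
from typing import Dict, Optional, Union

# Pattern for titles such as
# "EV certificate, 0 intermediates, leaf revoked, 404 failure (test146)".
TITLE_RE = re.compile(
    r"^(?P<kind>EV|Non-EV) certificate, "
    r"(?P<n>\d+) intermediates?, "
    r"(?P<revoked>[^,(]+?) revoked"
    r"(?:, (?P<failure>.+?) failure)?"
    r" \((?P<test>test\d+)\)$"
)


def parse_title(title: str) -> Optional[Dict[str, Union[str, int, None]]]:
    """Split a test title into its components; return None if it does not match."""
    m = TITLE_RE.match(title.strip())
    if m is None:
        return None
    return {
        "test": m.group("test"),
        "ev": m.group("kind") == "EV",
        "intermediates": int(m.group("n")),
        "revoked": m.group("revoked"),   # "leaf", "none", or "intermediate i/n"
        "failure": m.group("failure"),   # None when no failure is injected
    }


t146 = parse_title("EV certificate, 0 intermediates, leaf revoked, 404 failure (test146)")
t159 = parse_title("EV certificate, 2 intermediates, intermediate 1/2 revoked (test159)")
t203 = parse_title(
    "Non-EV certificate, 1 intermediate, intermediate 1/1 revoked, "
    "Unknown response failure (test203)"
)
print(t146, t159, t203, sep="\n")
```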