Analysis of SSL Certificate Reissues and Revocations
in the Wake of Heartbleed

Liang Zhang, David Choffnes, Dave Levin,
Tudor Dumitras, Alan Mislove, Aaron Schulman, Christo Wilson

Paper Overview

In April 2014, we conducted a study of how SSL certificates are reissued and revoked in response to Heartbleed, a widespread vulnerability that enabled undetectable key compromise. We conducted large-scale measurements and developed new methodologies and heuristics to determine how the most popular 1 million web sites reacted to this vulnerability in terms of certificate management, and how this affects security for the clients that use them.

We found that the vast majority of vulnerable certificates have not been reissued; further, of those domains that reissued certificates in response to Heartbleed, 60% did not revoke their vulnerable certificates. If they are never revoked, 20% of those certificates will remain valid (not expire) for two or more years. The ramifications of these findings are alarming: modern Web browsers will remain potentially vulnerable to malicious third parties using stolen keys to masquerade as a compromised site for a long time to come. We also analyzed these trends for vulnerable Extended Validation (EV) certificates and found that, while they exhibit better security practices, 67% still had not been reissued and 88% still had not been revoked even weeks after the vulnerability was made public.

This paper was published at IMC 2014 (the Internet Measurement Conference); you can download it here.

Data Sources

Our data is provided by Rapid7, which generously makes its (roughly) weekly full IPv4 HTTPS scans available. The data can be downloaded from the University of Michigan Internet Scans Repository. For this study, we use the scans between October 30, 2013 and April 28, 2014.

We filter the SSL certificates to only consider those that advertise a Common Name in the Alexa Top 1 Million domains.
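The filtering step can be sketched as follows. The exact matching rule is not stated here, so this sketch assumes that a Common Name matches when it equals an Alexa domain or is a subdomain (or wildcard) of one; the function name alexa_domain_for is hypothetical and the paper may apply a different rule.

```python
def alexa_domain_for(common_name, alexa_domains):
    """Return the Alexa domain a certificate Common Name falls under, or None.

    Assumption (not from the paper): a Common Name matches if it equals an
    Alexa domain or is a subdomain or wildcard of one.
    """
    # Normalize case, surrounding whitespace, and a trailing root dot.
    name = common_name.strip().lower().rstrip(".")
    # Treat "*.example.com" like "example.com".
    if name.startswith("*."):
        name = name[2:]
    labels = name.split(".")
    # Try the full name first, then shorter parent domains.
    # The loop stops before the last label, so a bare TLD is never tried.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in alexa_domains:
            return candidate
    return None
```

Passing the Alexa list as a set keeps each lookup constant time, which matters when every certificate in a full IPv4 scan has to be checked.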

To determine if a host is running a version of OpenSSL that was likely vulnerable in the past, we conduct our own scan. Please use the Contact Us link if you need access to this data set.

Processed Data

For each Alexa-domain-advertising certificate we encounter, we validate the certificate's chain using openssl verify. For certificate chains that were advertised in the past, we use the Faketime library in combination with OpenSSL, so that the chain is checked against the time at which it was advertised rather than the current time.
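A minimal sketch of how such a time-shifted validation could be invoked is shown below. It assumes the faketime command-line wrapper that ships with the Faketime library is installed, and that root and intermediate certificates are stored in separate PEM files. The function names and file layout are hypothetical; the actual pipeline may differ.

```python
import subprocess
from datetime import datetime


def build_verify_command(scan_time, roots_pem, intermediates_pem, leaf_pem):
    """Build an argv list that runs `openssl verify` under a faked clock.

    Assumption: the `faketime` wrapper is on PATH and accepts an absolute
    timestamp of the form 'YYYY-MM-DD hh:mm:ss' before the command.
    """
    timestamp = scan_time.strftime("%Y-%m-%d %H:%M:%S")
    return [
        "faketime", timestamp,
        "openssl", "verify",
        "-CAfile", roots_pem,              # trusted root certificates
        "-untrusted", intermediates_pem,   # intermediates sent by the host
        leaf_pem,                          # the leaf certificate to check
    ]


def chain_was_valid(scan_time, roots_pem, intermediates_pem, leaf_pem):
    """Return True if openssl reports the chain as valid at scan_time."""
    cmd = build_verify_command(scan_time, roots_pem, intermediates_pem, leaf_pem)
    result = subprocess.run(cmd, capture_output=True, text=True)
    # Check the printed verdict as well as the exit status, in case an
    # older openssl build does not signal failure through its exit code.
    return result.returncode == 0 and result.stdout.strip().endswith(": OK")
```

For example, chain_was_valid(datetime(2014, 4, 7), "roots.pem", "intermediates.pem", "leaf.pem") would check a chain as of April 7, 2014, where the three file names are placeholders.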

We place all 628,692 valid, Alexa-domain-advertising certificates we find into a SQLite database. This database can be downloaded from this link (552 MB). There are a number of tables in this database, which are briefly described below.

  • scans: This is a table containing one entry for each of Rapid7's scans. Included fields are:
    • scandate The date of the scan
    • scannextscandays The number of days until the next scan
    • scanprevscandays The number of days since the previous scan
  • domains: This is a table containing one entry for each Alexa Top 1 Million domain. Included fields are:
    • domainid The index of the domain
    • domainname The name of the domain
  • certs: This is a table containing one entry for each valid leaf certificate. Included fields are:
    • certid The unique index of the certificate
    • domainid The domain to which this certificate belongs
    • certhash The "fingerprint" of this certificate
    • certseq The sequence number of this certificate, as assigned by the issuing CA
    • certvalidstart The beginning date of validity for this certificate
    • certvalidend The ending date of validity for this certificate
    • certissuername The Authority Name of this certificate
    • certname The Common Name of this certificate
    • certca Whether or not this certificate is a CA certificate
    • certbirth This certificate's date of birth (see the paper for the definition)
    • certreissue This certificate's date of reissue (see the paper for the definition)
    • certhbreissue This certificate's date of Heartbleed-induced reissue (see the paper for the definition)
    • certdeath This certificate's date of death (see the paper for the definition)
    • certrevoked This certificate's date of revocation
    • certrevokedreason This certificate's Reason Code for revocation (-1 if revoked but no reason provided)
    • certnumprehbreissues The number of times a certificate with this Common Name was reissued before April 7, 2014
    • certnumscanhosts The number of hosts that advertised this certificate
    • certwasvulnerable Whether or not this certificate was vulnerable (see the paper for the definition)
    • certisvulnerable Whether or not this certificate was advertised by a host we observed to be vulnerable
  • crls: This is a table containing one entry for each unique CRL. Included fields are:
    • crlid The unique index of the CRL
    • crlurl The URL of this CRL
    • crlhash The hash of this CRL's URL (for referencing purposes)
  • certcrls: This is a table containing the mapping between certificates and CRLs. Included fields are:
    • certcrlid The unique index of this entry
    • certhash The hash of the certificate
    • crlhash The hash of the CRL
  • crlentries: This is a table containing the entries corresponding to our certificates in the CRLs. Included fields are:
    • crlentryid The unique index of this entry
    • certseq The sequence number of this entry
    • certhash The "fingerprint" of the certificate that this entry corresponds to
    • crlhash The CRL hash that this entry belongs to
    • crlentrytime The time of this revocation entry
    • crlentryreason This certificate's Reason Code for revocation (-1 if no reason provided)
  • hostswitch: This is a table containing the entries for hosts that switch certificates between scans (for the same Common Name). Included fields are:
    • olddate The previous scan date
    • newdate The next scan date
    • oldcerthash The previous certificate hash
    • newcerthash The new certificate hash
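To illustrate how the tables described above relate, the sketch below joins certs to domains using Python's built-in sqlite3 module. It builds a small in-memory database with only a few of the listed columns. The column types, the use of 1 to mean "vulnerable", and the use of an empty string for a missing revocation date are assumptions for illustration, and the toy rows are invented.

```python
import sqlite3


def make_toy_database():
    """Create an in-memory database with a subset of the described schema.

    Assumption: types and encodings here are guesses, not the real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table domains (domainid integer primary key, domainname text);
        create table certs (
            certid integer primary key,
            domainid integer,
            certhash text,
            certwasvulnerable integer,
            certrevoked text
        );
        insert into domains values (1, 'example.com'), (2, 'example.org');
        insert into certs values
            (1, 1, 'aaaa', 1, ''),
            (2, 1, 'bbbb', 1, '2014-04-10 00:00:00'),
            (3, 2, 'cccc', 0, '');
    """)
    return conn


def unrevoked_vulnerable_certs(conn):
    """List (domainname, certhash) for vulnerable certificates never revoked."""
    query = """
        select d.domainname, c.certhash
        from certs c
        join domains d on d.domainid = c.domainid
        where c.certwasvulnerable = 1
          and (c.certrevoked is null or c.certrevoked = '')
        order by d.domainname, c.certhash
    """
    return conn.execute(query).fetchall()
```

Against the downloaded database, the same function could be called on a connection opened with sqlite3.connect on the database file, provided the real column encodings match these assumptions.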

Analysis

Most of the analysis in the paper is expressed as queries against the SQLite database described above. For example, Figure 5 is generated using the following queries:
  • Birth: select certbirth, count(*) cnt from certs where certbirth != '' group by certbirth;
  • Death: select certdeath, count(*) cnt from certs where certdeath != '' group by certdeath;
  • Reissue: select certreissue, count(*) cnt from certs where certreissue != '' group by certreissue;
  • Revoke: select date(certrevoked), count(*) cnt from certs where certrevoked != '' group by date(certrevoked);
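The four queries above can also be run together from Python to obtain one date-to-count series per event type. This is a sketch using the standard sqlite3 module, with empty strings written as ''. The names FIGURE5_QUERIES and figure5_series are hypothetical, and plotting is left out.

```python
import sqlite3

# The same four queries, keyed by the event they count.
FIGURE5_QUERIES = {
    "birth": "select certbirth, count(*) cnt from certs "
             "where certbirth != '' group by certbirth",
    "death": "select certdeath, count(*) cnt from certs "
             "where certdeath != '' group by certdeath",
    "reissue": "select certreissue, count(*) cnt from certs "
               "where certreissue != '' group by certreissue",
    "revoke": "select date(certrevoked), count(*) cnt from certs "
              "where certrevoked != '' group by date(certrevoked)",
}


def figure5_series(conn):
    """Return {event: {date_string: count}} for the four event types."""
    return {
        event: dict(conn.execute(sql).fetchall())
        for event, sql in FIGURE5_QUERIES.items()
    }
```

Each inner dictionary maps a date string to the number of certificates with that event on that date, which can be passed to any plotting library.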