Certificate revocation and the performance of OCSP

Certificate revocation is a critical aspect of maintaining the security of the third-party Certificate Authority (CA) infrastructure that underpins secure communication on the internet using SSL/TLS. A certificate may be worth revoking when its private key has been compromised, when its owner no longer controls the domain for which it was issued, or when it was mistakenly signed. Without the ability to revoke certificates, a CA has no direct means of marking a certificate as untrusted before it expires, which could be several years away. In particularly urgent cases a browser vendor may be able to block individual certificates, trusted roots, or intermediate certificates, but this is rarely done and is not suitable for lower-risk issues where revocation is necessary but not urgent.

There are two main technologies for browsers to check the revocation status of a particular certificate: using the Online Certificate Status Protocol (OCSP) or looking up the certificate in a Certificate Revocation List (CRL). OCSP provides real-time revocation information about an individual certificate from an issuing CA, unlike CRLs which provide a list of revoked certificates and may be received by clients less frequently.

The graph below shows a comparison of the time taken for the TLS handshake, both with and without OCSP checking enabled. The data was collected using packet traces taken while using Firefox 20 on Linux from an IP address in the UK. Measurements were taken three times (each time with a fresh cache) after discarding an initial request.

The relationship between whether OCSP checking is enabled and the time taken to complete the TLS handshake is not straightforward. In order for the browser to display the "green bar" that distinguishes an Extended Validation (EV) certificate, OCSP requests must be made for every certificate in the chain, whereas in many browsers, if an OCSP request is made at all, intermediate certificates are not checked. The increased time taken for the TLS handshake when using an EV certificate can be attributed to Firefox's sequential OCSP checking behaviour. However, where an OCSP check can be performed within the round-trip time to the server (for example, if the OCSP responder is served via a content delivery network, or CDN), the check does not dramatically affect the time taken for the TLS handshake. When both the web server and the OCSP responder are topologically close to the client, as is the case with www.globalsign.com, the short round-trip time to the server isn't sufficient to mask the time taken to receive OCSP responses for both the web site's certificate and the intermediate certificate presented. The slight difference between PayPal's and GlobalSign's performance can at least partially be attributed to the additional OCSP request made for GlobalSign: GlobalSign's certificate chain requires three OCSP requests whereas PayPal's requires just two.
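The masking effect described above can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not a description of how Firefox schedules its requests: it assumes the OCSP checks start at the same moment as one server round trip, and that the handshake is only delayed by whatever OCSP time is left over once that round trip completes. The function name and all timings are hypothetical and are not taken from the measurements.

```python
def extra_handshake_delay_ms(server_rtt_ms, ocsp_times_ms, sequential=True):
    """Toy model: delay that OCSP checks add beyond one server round trip.

    Assumption (not from the measurements): the OCSP checks overlap with a
    single round trip to the web server, so only the excess is visible.
    """
    if not ocsp_times_ms:
        return 0.0
    # Sequential checks cost the sum of their times; parallel checks cost
    # only as much as the slowest one.
    if sequential:
        ocsp_total = sum(ocsp_times_ms)
    else:
        ocsp_total = max(ocsp_times_ms)
    return max(0.0, ocsp_total - server_rtt_ms)


# Hypothetical timings in milliseconds, chosen only for illustration.
distant_server = extra_handshake_delay_ms(150, [20, 20])    # fully masked
nearby_two = extra_handshake_delay_ms(10, [20, 20])         # two requests
nearby_three = extra_handshake_delay_ms(10, [20, 20, 20])   # three requests
nearby_parallel = extra_handshake_delay_ms(10, [20, 20, 20], sequential=False)
```

Under these assumptions a distant server hides the OCSP cost entirely, while a nearby server exposes it, and each additional sequential request in the chain adds its full response time to the handshake.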

Reliability of RapidSSL's OCSP responder — December 2012

Netcraft has extracted around 40 OCSP responder URLs from certificates seen in the Netcraft SSL server survey, and has been monitoring them since late November 2012. The performance and reliability of the services vary significantly: Symantec's VeriSign OCSP responder has had consistently solid reliability, with only a handful of connections failing over a four-month period, whereas in the same period more than 6% of requests to one of StartCom's responders failed. The reliability and performance of StartCom's OCSP responders have improved significantly since the end of February 2013, when it switched to using Akamai. GeoTrust, another Symantec brand, did not perform as strongly as either Thawte or VeriSign: all three of GeoTrust's OCSP servers were down for between 48 and 104 minutes in a single event. Performance and reliability are measured from 11 points spread around Europe and North America; an outage is recorded only when every measurement node sees at least one failed response within the 15-minute measurement interval.
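The outage rule can be sketched as follows. This is one possible reading of the rule rather than Netcraft's actual monitoring code, which is not described here: the sample format, the function name, and the alignment of intervals to fixed 15-minute boundaries are all assumptions made for illustration.

```python
from collections import defaultdict

INTERVAL_SECONDS = 15 * 60  # the 15-minute measurement interval


def outage_intervals(samples, nodes):
    """Return start times of intervals in which every node saw a failure.

    samples: iterable of (node, unix_timestamp, ok) tuples (assumed format).
    nodes: names of all measurement nodes.
    """
    failed_nodes = defaultdict(set)
    for node, timestamp, ok in samples:
        if not ok:
            # Bucket the failure into its fixed 15-minute interval.
            start = timestamp - timestamp % INTERVAL_SECONDS
            failed_nodes[start].add(node)
    all_nodes = set(nodes)
    # An interval counts as an outage only if the set of failing nodes
    # covers every measurement node.
    return sorted(
        start for start, failed in failed_nodes.items() if failed >= all_nodes
    )


# Hypothetical samples from two nodes: both fail in the first interval,
# but only one fails in the second.
samples = [
    ("node-a", 0, False),
    ("node-b", 100, False),
    ("node-a", 900, False),
    ("node-b", 950, True),
]
outages = outage_intervals(samples, {"node-a", "node-b"})
```

Requiring a failure from every node filters out problems local to a single vantage point, at the cost of missing outages that affect only part of the network.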

Shift in reliability and performance for StartCom — late February 2013

For those browsers performing a synchronous OCSP request during the TLS handshake, the performance of the OCSP responder is often crucial: any delay in responding to the request may noticeably slow down the handshake. For example, GlobalSign's CloudFlare-accelerated OCSP responder is significantly faster than Entrust's, which uses Akamai's CDN. However, despite GlobalSign's performance advantage, its reliability has been affected by a number of CloudFlare outages: since Netcraft began monitoring OCSP, GlobalSign's responders have had at least 45 minutes of downtime, whereas Entrust's have had none.

GlobalSign (blue) and Entrust (green) OCSP responder performance.

OCSP responses can be stapled to a response from a web server when negotiating the TLS handshake, avoiding the need for the browser to make a secondary request to a third-party server. CloudFlare has claimed that enabling OCSP stapling has led to a 30% speed improvement for HTTPS sites. OCSP stapling support is present in newer versions of nginx, an increasingly popular open source web server, as a result of a development project sponsored by GlobalSign, DigiCert, and Comodo. OCSP stapling is not supported in the most popular version of Apache, 2.2.x, nor is it supported in current versions of Firefox (although support is in the pipeline), so it must remain only part of the solution for the foreseeable future. Frustrated by some of the limitations of OCSP, some CAs have lent support to a proposed alternative revocation method using short-lived certificates.

Browser support for both OCSP and CRLs is mixed: currently, Firefox does not automatically download the CRLs from trusted CAs, so Firefox users must rely on OCSP alone; Google uses a proprietary mechanism that aggregates per-CA CRLs into a single update, which is distributed to Google Chrome users through its automatic update channel. Many browsers default to a "soft-fail" approach, in which a certificate is accepted when no OCSP response can be obtained, leaving users vulnerable to eavesdroppers able to block or tamper with OCSP traffic. For as long as the CAs running OCSP responders do not have a strong record for both the performance and the reliability of their OCSP responders, browsers will find it difficult to justify switching to synchronous "hard-fail" behaviour.
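The difference between the two policies comes down to how a missing OCSP response is treated. The following is a minimal sketch of that decision, assuming a three-way result for each check; the type and function names are hypothetical and do not correspond to any browser's actual code.

```python
from enum import Enum


class OcspResult(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNAVAILABLE = "unavailable"  # timeout, blocked, or unusable response


def accept_certificate(result, hard_fail=False):
    """Decide whether to proceed with the connection (illustrative only)."""
    if result is OcspResult.GOOD:
        return True
    if result is OcspResult.REVOKED:
        return False
    # No usable answer: soft-fail accepts the certificate, hard-fail rejects it.
    return not hard_fail


# An attacker who blocks OCSP traffic forces the UNAVAILABLE case.
soft = accept_certificate(OcspResult.UNAVAILABLE)
hard = accept_certificate(OcspResult.UNAVAILABLE, hard_fail=True)
```

With soft-fail, blocking the OCSP request is enough to have a revoked certificate accepted; with hard-fail, every responder outage becomes a failed connection, which is why responder reliability matters so much to the choice.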

Updated 18/04/2013