OpenSSL CVE-2016-0799: heap corruption via BIO_printf

There are a couple of issues with OpenSSL’s BIO_*printf() functions, defined in crypto/bio/b_print.c, that are set to be fixed in the forthcoming security release.
The function primarily responsible for interpreting the format string and turning it, together with the function’s arguments, into an output string is _dopr().

_dopr() scans the format string in an incremental fashion and employs doapr_outch() for each character it wants to output.

doapr_outch()

doapr_outch()’s first two arguments are a double pointer to a statically allocated buffer (char** sbuffer) and a pointer to a char pointer (char **buffer) whose value will be set to a memory region dynamically allocated by doapr_outch().

The first argument, the static buffer, should always be valid. The number of bytes written so far is pointed to by the third argument to doapr_outch(), size_t *currlen, and the buffer’s capacity by the fourth, size_t *maxlen.

700    static void
701    doapr_outch(char **sbuffer,
702                char **buffer, size_t *currlen, size_t *maxlen, int c)
703    {
704        /* If we haven't at least one buffer, someone has doe a big booboo */
705        assert(*sbuffer != NULL || buffer != NULL);
706
707        /* |currlen| must always be <= |*maxlen| */
708        assert(*currlen <= *maxlen);
709
710        if (buffer && *currlen == *maxlen) {
711            *maxlen += 1024;
712            if (*buffer == NULL) {
713                *buffer = OPENSSL_malloc(*maxlen);
714                if (!*buffer) {
715                    /* Panic! Can't really do anything sensible. Just return */
716                    return;
717                }
718                if (*currlen > 0) {
719                    assert(*sbuffer != NULL);
720                    memcpy(*buffer, *sbuffer, *currlen);
721                }
722                *sbuffer = NULL;
723            } else {
724                *buffer = OPENSSL_realloc(*buffer, *maxlen);
725                if (!*buffer) {
726                    /* Panic! Can't really do anything sensible. Just return */
727                    return;
728                }
729            }
730        }
731
732        if (*currlen < *maxlen) {
733            if (*sbuffer)
734                (*sbuffer)[(*currlen)++] = (char)c;
735            else
736                (*buffer)[(*currlen)++] = (char)c;
737        }
738
739        return;
740    }

The idea here is that doapr_outch() incrementally fills the statically allocated buffer sbuffer until its maximum capacity has been reached. The if on line 732 checks whether there is still room; if there is, a byte is appended to *sbuffer on line 734:

732        if (*currlen < *maxlen) {
733            if (*sbuffer)
734                (*sbuffer)[(*currlen)++] = (char)c;

Once sbuffer is full (at which point *currlen is equal to *maxlen) and the calling function allows dynamic allocation of memory (buffer is non-zero), this condition evaluates as true:

710        if (buffer && *currlen == *maxlen) {

From this point on, an allocation takes place every 1024 bytes. As soon as the first heap allocation succeeds, *sbuffer is zeroed:

713                *buffer = OPENSSL_malloc(*maxlen);
714                if (!*buffer) {
715                    /* Panic! Can't really do anything sensible. Just return */
716                    return;
717                }
718                if (*currlen > 0) {
719                    assert(*sbuffer != NULL);
720                    memcpy(*buffer, *sbuffer, *currlen);
721                }
722                *sbuffer = NULL;

The corollary of sbuffer being zero for the remainder of the BIO_printf() invocation is that from now on, bytes will be appended to the heap-based *buffer rather than the stack-based *sbuffer:

732        if (*currlen < *maxlen) {
733            if (*sbuffer)
734                (*sbuffer)[(*currlen)++] = (char)c;
735            else
736                (*buffer)[(*currlen)++] = (char)c;
737        }
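
The handover from the fixed buffer to the heap can be modelled in isolation. The following is a minimal sketch, not OpenSSL code: struct outbuf and outbuf_putc() are hypothetical names, libc realloc() stands in for OPENSSL_malloc()/OPENSSL_realloc(), and returning a status code is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GROW_STEP 1024 /* same increment as line 711 */

/* Hypothetical model of the state doapr_outch() receives by pointer. */
struct outbuf {
    char *sbuffer;  /* caller's fixed buffer; NULL once the heap takes over */
    char *buffer;   /* heap buffer; NULL until the first allocation */
    size_t currlen; /* bytes written so far */
    size_t maxlen;  /* capacity of whichever buffer is active */
};

/* Append one byte. Returns 1 on success, 0 if growing the buffer failed. */
static int outbuf_putc(struct outbuf *o, int c)
{
    if (o->currlen == o->maxlen) {
        size_t newlen = o->maxlen + GROW_STEP;
        /* realloc(NULL, n) behaves like malloc(n) */
        char *p = realloc(o->buffer, newlen);

        if (p == NULL)
            return 0;
        if (o->buffer == NULL && o->currlen > 0) {
            /* First spill: carry over what the fixed buffer holds. */
            assert(o->sbuffer != NULL);
            memcpy(p, o->sbuffer, o->currlen);
        }
        o->buffer = p;
        o->sbuffer = NULL; /* from now on every byte goes to the heap */
        o->maxlen = newlen;
    }
    if (o->sbuffer != NULL)
        o->sbuffer[o->currlen++] = (char)c;
    else
        o->buffer[o->currlen++] = (char)c;
    return 1;
}
```

Starting with a 4-byte fixed buffer and writing six bytes, the first four land in the fixed buffer; the fifth triggers the allocation, after which sbuffer is NULL and all six bytes live in the heap buffer.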

Differences between BIO_printf/BIO_vprintf and BIO_snprintf/BIO_vsnprintf

The functions BIO_printf() and BIO_vprintf() allow doapr_outch() to dynamically allocate memory by supplying a valid pointer to a char pointer.

744    int BIO_printf(BIO *bio, const char *format, ...)
745    {
746        va_list args;
747        int ret;
748
749        va_start(args, format);
750
751        ret = BIO_vprintf(bio, format, args);
752
753        va_end(args);
754        return (ret);
755    }
756
757    int BIO_vprintf(BIO *bio, const char *format, va_list args)
758    {
759        int ret;
760        size_t retlen;
761        char hugebuf[1024 * 2];     /* Was previously 10k, which is unreasonable
762                                     * in small-stack environments, like threads
763                                     * or DOS programs. */
764        char *hugebufp = hugebuf;
765        size_t hugebufsize = sizeof(hugebuf);
766        char *dynbuf = NULL;
767        int ignored;
768
769        dynbuf = NULL;
770        CRYPTO_push_info("doapr()");
771        _dopr(&hugebufp, &dynbuf, &hugebufsize, &retlen, &ignored, format, args);
772        if (dynbuf) {
773            ret = BIO_write(bio, dynbuf, (int)retlen);
774            OPENSSL_free(dynbuf);
775        } else {
776            ret = BIO_write(bio, hugebuf, (int)retlen);
777        }
778        CRYPTO_pop_info();
779        return (ret);
780    }

BIO_vprintf() supplies both a statically allocated buffer (hugebuf), whose size is stored in hugebufsize, and a pointer to a char pointer (dynbuf). The same applies to BIO_printf() through its use of BIO_vprintf().

By contrast, the other two *printf functions, BIO_vsnprintf() and BIO_snprintf(), only use a statically allocated buffer, which is supplied by the caller:

788    int BIO_snprintf(char *buf, size_t n, const char *format, ...)
789    {
790        va_list args;
791        int ret;
792
793        va_start(args, format);
794
795        ret = BIO_vsnprintf(buf, n, format, args);
796
797        va_end(args);
798        return (ret);
799    }
800
801    int BIO_vsnprintf(char *buf, size_t n, const char *format, va_list args)
802    {
803        size_t retlen;
804        int truncated;
805
806        _dopr(&buf, NULL, &n, &retlen, &truncated, format, args);
807
808        if (truncated)
809            /*
810             * In case of truncation, return -1 like traditional snprintf.
811             * (Current drafts for ISO/IEC 9899 say snprintf should return the
812             * number of characters that would have been written, had the buffer
813             * been large enough.)
814             */
815            return -1;
816        else
817            return (retlen <= INT_MAX) ? (int)retlen : -1;
818    }
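
The return convention described in the comment on lines 809-814 differs from C99 snprintf(), which reports the length the full output would have needed. As an illustration only, the hypothetical wrapper legacy_snprintf() below puts the "-1 on truncation" convention on top of the standard vsnprintf(); it does not use OpenSSL at all.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: C99 vsnprintf() with "-1 on truncation" semantics. */
static int legacy_snprintf(char *buf, size_t n, const char *format, ...)
{
    va_list args;
    int needed;

    assert(buf != NULL && n > 0);

    va_start(args, format);
    /* C99: returns the length the complete output requires. */
    needed = vsnprintf(buf, n, format, args);
    va_end(args);

    if (needed < 0 || (size_t)needed >= n)
        return -1; /* encoding error, or the output did not fit */
    return needed;
}
```

A caller that only checks for a negative result therefore cannot tell how large the buffer should have been, which is the trade-off the quoted comment points at.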

The vulnerability

One of the problems with the doapr_outch() function is that it cannot signal failure to allocate memory to its caller, because it is a void-returning function:

713                *buffer = OPENSSL_malloc(*maxlen);
714                if (!*buffer) {
715                    /* Panic! Can't really do anything sensible. Just return */
716                    return;
717                }
...
724                *buffer = OPENSSL_realloc(*buffer, *maxlen);
725                if (!*buffer) {
726                    /* Panic! Can't really do anything sensible. Just return */
727                    return;
728                }

This lack of error signaling means that _dopr() will continue to call doapr_outch() as long as there are characters left to output.

Moreover, *maxlen is incremented before the allocation. This means that even if the allocation fails, *maxlen holds the size the heap buffer would have had if the allocation had succeeded:

711            *maxlen += 1024;
712            if (*buffer == NULL) {
713                *buffer = OPENSSL_malloc(*maxlen);
714                if (!*buffer) {
715                    /* Panic! Can't really do anything sensible. Just return */
716                    return;
717                }

Thus, upon the first call to doapr_outch() after the failed allocation, the following condition evaluates as false:

710        if (buffer && *currlen == *maxlen) {

The failed allocation caused *buffer (the value) to be zeroed, but buffer (the pointer) is still valid.
However, *currlen no longer equals *maxlen, because *maxlen was incremented by 1024 in the previous call.

Because the condition is false, the entire middle part of the function is skipped, and the following code is executed:

732        if (*currlen < *maxlen) {
733            if (*sbuffer)
734                (*sbuffer)[(*currlen)++] = (char)c;
735            else
736                (*buffer)[(*currlen)++] = (char)c;
737        }

*currlen is now indeed less than *maxlen, and *sbuffer is zero (once at least one OPENSSL_malloc() call has succeeded, *sbuffer is zeroed, as noted earlier). Thus this code is executed:

736                (*buffer)[(*currlen)++] = (char)c;

*buffer is zero, and *currlen might be anything, depending on the point in the process at which the allocation failed. Effectively, *currlen is used as a pointer to write data to.
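
The sequence can be replayed without touching any memory. The sketch below is a hypothetical bookkeeping model (struct trace and trace_outch() are made-up names) that mirrors the control flow of lines 710-737 for the BIO_vprintf() case, where buffer is non-NULL, and records where a write through a NULL *buffer would land. It assumes the failure happens on the realloc path, i.e. after at least one successful allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace of doapr_outch()'s state; nothing is dereferenced. */
struct trace {
    int sbuffer_set;         /* models *sbuffer != NULL */
    int buffer_set;          /* models *buffer != NULL */
    size_t currlen;
    size_t maxlen;
    int null_writes;         /* writes that would go to NULL + currlen */
    size_t first_bad_offset; /* currlen at the first such write */
};

/* One call to doapr_outch(); alloc_ok tells whether an allocation
 * attempted during this call would succeed. */
static void trace_outch(struct trace *t, int alloc_ok)
{
    assert(t != NULL);

    if (t->currlen == t->maxlen) {  /* line 710 */
        t->maxlen += 1024;          /* line 711: grows before allocating */
        if (!alloc_ok) {
            t->buffer_set = 0;      /* lines 713/724 store NULL in *buffer */
            return;                 /* lines 716/727: character is dropped */
        }
        t->buffer_set = 1;
        t->sbuffer_set = 0;         /* line 722 */
    }
    if (t->currlen < t->maxlen) {   /* line 732 */
        if (!t->sbuffer_set && !t->buffer_set) {
            /* line 736 with *buffer == NULL */
            if (t->null_writes++ == 0)
                t->first_bad_offset = t->currlen;
        }
        t->currlen++;               /* lines 734/736 */
    }
}
```

With a 2048-byte fixed buffer, 3072 successful calls followed by one failed allocation leave *maxlen at 4096 and *currlen at 3072; in this model the very next call would write to address 3072.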

*currlen is a 32-bit integer, so when used as a pointer it is bound to point to a byte within the first 4 gigabytes of the virtual address space. On a 64-bit system, a write to this region will most likely cause a page fault. In a 32-bit memory layout, however, the odds are in the attacker’s favor, especially if they have some way of exhausting memory on the target system.

It might seem far-fetched that an attacker could cause an allocation to fail at a very precise moment, namely when *currlen, used as a pointer, points to a memory region that they want to overwrite. However, the amount of memory left to allocate on a system is not determined solely by heap use in OpenSSL or in the application that links it; any other process running on the same machine whose resource consumption the attacker can influence (such as other public-facing network services on a server) can contribute to heap corruption occurring in doapr_outch().

Even if precise memory corruption through memory exhaustion, of the kind that could lead to code execution, is too difficult for the attacker in practice, there’s still the possibility that important data within the program’s heap is corrupted, with consequences that could be nearly as disastrous. Heap vandalism, basically.

And even if you discount malice, genuine, temporary shortages of heap memory could lead to random heap corruption.
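
Both gaps described in this section, the missing error signal and the size that is updated before the allocation, can be closed with a small change in structure. The following is a minimal sketch under stated assumptions and not the upstream patch: grow_buffer() is a hypothetical helper, libc realloc() stands in for OPENSSL_realloc(), and the INT_MAX guard reflects the fact that the OpenSSL allocator shown in this post takes its size as an int.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: grow *buf by step bytes.
 * Returns 1 on success. On failure returns 0 and leaves both *buf and
 * *cap exactly as they were, so the caller's bookkeeping stays valid. */
static int grow_buffer(char **buf, size_t *cap, size_t step)
{
    size_t newcap;
    char *p;

    assert(buf != NULL && cap != NULL);

    if (step > (size_t)INT_MAX || *cap > (size_t)INT_MAX - step)
        return 0; /* new size would not fit in an int */

    newcap = *cap + step;
    p = realloc(*buf, newcap);
    if (p == NULL)
        return 0; /* old block is still valid and still *cap bytes */

    *buf = p;
    *cap = newcap; /* commit the new size only after success */
    return 1;
}
```

A caller built on this would stop formatting as soon as the helper returns 0 instead of continuing to emit characters.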

An alternative approach to triggering the vulnerability

Moreover, there is an interesting, sure-fire way to cause an OPENSSL_realloc() failure.

OPENSSL_realloc() is really just a macro for CRYPTO_realloc():

375    void *CRYPTO_realloc(void *str, int num, const char *file, int line)
376    {
377        void *ret = NULL;
378
379        if (str == NULL)
380            return CRYPTO_malloc(num, file, line);
381
382        if (num <= 0)
383            return NULL;

num is a signed, 32-bit integer. If it is zero or negative, NULL is returned.

Because in doapr_outch() *maxlen is incremented by 1024 for each allocation:

711            *maxlen += 1024;

the size passed as num will eventually no longer fit in a signed 32-bit int and is seen as a negative value. The subsequent OPENSSL_realloc() will then inevitably fail, because CRYPTO_realloc() refuses to do allocations of a non-positive size.

In other words, by supplying a very large string to BIO_printf() (basically one where the formatted output, i.e. the format string combined with its arguments, exceeds 1 << 31 bytes minus the size of the stack-based buffer), the vulnerability is guaranteed to trigger.

Another way, besides using the “%s” format with a very large string, is probably to exploit the padding mechanisms present in the helper functions fmtstr(), fmtint() and fmtfp().
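
To get a feel for how much output padding can generate from a tiny input, the standard C library can serve as a stand-in. This sketch uses C99 snprintf() with a NULL buffer, which only measures the output; it says nothing definite about BIO_printf(), whose handling of a "*" field width would have to be checked against b_print.c. padded_length() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/* Length of the output "%*d" produces for the given field width.
 * snprintf(NULL, 0, ...) formats nothing and just returns that length. */
static int padded_length(int width)
{
    assert(width >= 0);
    return snprintf(NULL, 0, "%*d", width, 1);
}
```

A one-digit integer with a field width of 1000000 already yields a megabyte of output, so the amount of attacker-supplied data needed may be much smaller than the output it produces.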

Affected software

I’ve been able to confirm that PHP’s openssl_pkcs7_encrypt is vulnerable to this attack through its internal use of BIO_printf, if an attacker is able to supply a very large $headers parameter.

Apache httpd also uses BIO_printf: https://github.com/apache/httpd/blob/trunk/modules/ssl/ssl_util_ocsp.c#L46 but I haven’t yet checked to what extent it might be exploitable.

A number of other high-profile applications are also using BIO_printf(): https://codesearch.debian.net/results/BIO_printf/page_0
