PHP/LLVM/MYSQL/BSD regex library Heap Buffer Overflow

CVE: (none listed)
Category: CWE-122
Price: $2500
Severity: Critical
Author: Unknown
Risk: High
Exploitation Type: Local
Date: 2015-02-08
CVSS: CVSS:4.0/AV:L/AC:L/AT:P/PR:L/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N
EPSS: 0.02192
EPSSP: 0.50148

Our sensors found this exploit at: http://cxsecurity.com/ascii/WLB-2015020029

Below is a copy:

Introduction
The following document describes a heap overflow vulnerability in Henry Spencer's regex library that affects 32-bit systems only. This library, or variations on and derivations of it, is used in software such as:

PHP
LLVM
MySQL server
Bionic libc

As well as various other *BSD libc implementations:

FreeBSD
NetBSD

The above applications are listed merely to point out that they include the library. I have NOT tested them for the vulnerability and therefore cannot guarantee that they are affected; they are listed to show that the library has been disseminated widely, that the vulnerability MAY be exploitable beyond 'laboratory setting' cases, and that its danger MAY permeate deeply into software stacks.

The vulnerability requires a significant amount of control over one of the library's functions to be exploited and is unlikely to occur in a general programming context, since it requires a string of ~683 megabytes to be constructed. However, allocations of such a size are, in certain contexts, certainly feasible. An additional factor that limits the overall feasibility of an attack is that the exact data written outside the bounds of the heap can only be controlled by the attacker to a certain extent, as opposed to a fully arbitrary mutation of memory.

Technical description
The source code excerpts that follow are taken from https://codeload.github.com/garyhouston/rxspencer/tar.gz/alpha3.8.g5 (as referenced on http://www.arglist.com/regex/).

The vulnerability is caused inside the regcomp function:

int /* 0 success, otherwise REG_something */
regcomp(preg, pattern, cflags)
regex_t *preg;
const char *pattern;
int cflags;
{
This function compiles the regex as defined in string form by 'const char *pattern'.
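
For orientation, a minimal sketch of how a caller typically reaches regcomp through the POSIX <regex.h> interface that this library implements. The helper name pattern_matches is hypothetical and not part of the library; the point is only that the pattern string is handed to regcomp unchanged, so its length is under the caller's (or an attacker's) control.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the library): returns 1 if text matches
 * pattern, 0 if it does not, and -1 if the pattern fails to compile. */
static int pattern_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* regcomp() is the entry point discussed in this advisory. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

An application that builds the pattern argument from untrusted input of unbounded size is the kind of caller the advisory is concerned with.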

The vulnerable code:

len = strlen((char *)pattern);
...
...
p->ssize = len/(size_t)2*(size_t)3 + (size_t)1; /* ugh */
p->strip = (sop *)malloc(p->ssize * sizeof(sop));
Here the size derived from 'len' is enlarged (by multiplication and addition) to such an extent that the byte count passed to malloc overflows a 32-bit register/variable.

Formally, the smallest value of 'len' that causes an overflow is:

((1<<32) / 4 - 1) / 3 * 2 = 0x2AAAAAAA
Conversely:

(0x2AAAAAAA / 2 * 3 + 1) * 4 = 0x100000000
But since this value is too large for a 32-bit register to hold, it is truncated to:

0x100000000 & 0xFFFFFFFF = 0x00000000
The smallest 'len' value that results in a non-zero value being passed to malloc is:

((0x2AAAAAAC / 2 * 3 + 1) * 4) & 0xFFFFFFFF = 0x0000000C
This corresponds to a pattern of about 0x2AAAAAAC / 1024 / 1024 = 682.7 megabytes.
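
The arithmetic above can be reproduced with a small sketch that models a 32-bit size_t using uint32_t. The function names are hypothetical, and the element size of 4 bytes is an assumption taken from the "* 4" factor in the formulas (sizeof(sop) on a 32-bit target).

```c
#include <stdint.h>

/* Assumption: sizeof(sop) == 4, as implied by the "* 4" in the formulas. */
#define SOP_BYTES 4u

/* Byte count as a 32-bit size_t would compute it (wraps modulo 2^32). */
static uint32_t alloc_bytes32(uint32_t len)
{
    uint32_t ssize = len / 2u * 3u + 1u;   /* mirrors the p->ssize expression */
    return ssize * SOP_BYTES;              /* mirrors p->ssize * sizeof(sop) */
}

/* The same expression evaluated in 64 bits, i.e. the size really needed. */
static uint64_t alloc_bytes_exact(uint32_t len)
{
    uint64_t ssize = (uint64_t)len / 2u * 3u + 1u;
    return ssize * SOP_BYTES;
}
```

Under these assumptions, alloc_bytes32(0x2AAAAAAA) evaluates to 0 and alloc_bytes32(0x2AAAAAAC) to 0xC, while alloc_bytes_exact returns 0x100000000 and 0x10000000C for the same inputs.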

The 'p->ssize' variable itself, however, does not overflow: it holds the number of elements purportedly allocated by malloc and is therefore an unreliable indicator to the library of the real size of the allocated buffer. The check meant to catch an undersized strip compares against it:

/* deal with undersized strip */
if (p->slen >= p->ssize)
    enlarge(p, (p->ssize+1) / 2 * 3); /* +50% */
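
To make the mismatch concrete, here is a minimal model of that check, again using uint32_t to stand in for a 32-bit size_t and assuming a 4-byte sop. All names are hypothetical. It only compares the element count the library believes it owns with the count that fits in the bytes requested from malloc; a real allocator may hand back a slightly larger chunk than requested.

```c
#include <stdint.h>

/* Assumption: 32-bit size_t and a 4-byte sop, modelled with uint32_t. */
#define SOP_BYTES 4u

/* Model of p->ssize: the element count the library believes it owns. */
static uint32_t claimed_elems(uint32_t len)
{
    return len / 2u * 3u + 1u;
}

/* Elements that fit in the (wrapped) byte count requested from malloc. */
static uint32_t requested_elems(uint32_t len)
{
    uint32_t bytes = claimed_elems(len) * SOP_BYTES; /* wraps modulo 2^32 */
    return bytes / SOP_BYTES;
}

/* Returns 1 when index slen passes the "slen < ssize" test in doemit
 * yet lies outside the memory that was requested, 0 otherwise. */
static int write_escapes_buffer(uint32_t len, uint32_t slen)
{
    return slen < claimed_elems(len) && slen >= requested_elems(len);
}
```

For len = 0x2AAAAAAC this model gives a claimed size of 0x40000003 elements against a request for only 3, so write_escapes_buffer(0x2AAAAAAC, 3) returns 1: the fourth emitted element is already out of bounds while the check still considers the strip large enough.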
Having discovered this vulnerability only recently, I have done only limited research into its actual exploitability. At present I am mainly concerned with pointing it out rather than exploiting it. However, mutation of the heap-allocated memory that p->strip points to is mainly performed by the doemit function:

doemit(p, op, opnd)
register struct parse *p;
sop op;
size_t opnd;
{
    /* avoid making error situations worse */
    if (p->error != 0)
        return;

    /* deal with oversize operands ("can't happen", more or less) */
    assert(opnd < 1<<OPSHIFT);

    /* deal with undersized strip */
    if (p->slen >= p->ssize)
        enlarge(p, (p->ssize+1) / 2 * 3); /* +50% */
    assert(p->slen < p->ssize);

    /* finally, it's all reduced to the easy case */
    p->strip[p->slen++] = SOP(op, opnd);
}
A simple grep for the invocations of doemit() in regcomp.c yields:

#define EMIT(op, sopnd) doemit(p, (sop)(op), (size_t)(sopnd))
EMIT(OEND, 0);
EMIT(OEND, 0);
EMIT(OOR2, 0); /* offset is very wrong */
EMIT(OLPAREN, subno);
EMIT(ORPAREN, subno);
EMIT(OBOL, 0);
EMIT(OEOL, 0);
EMIT(OANY, 0);
EMIT(OOR2, 0); /* offset very wrong... */
EMIT(OBOL, 0);
EMIT(OEOL, 0);
EMIT(OANY, 0);
EMIT(OLPAREN, subno);
EMIT(ORPAREN, subno);
EMIT(OBACK_, i);
EMIT(O_BACK, i);
EMIT(OBOW, 0);
EMIT(OEOW, 0);
EMIT(OANYOF, freezeset(p, cs));
EMIT(OCHAR, (unsigned char)ch);
EMIT(OOR2, 0);
EMIT(OOR2, 0); /* offset very wrong... */
EMIT(op, opnd); /* do checks, ensure space */
where (regex2.h):

#define OPSHIFT (26)
#define SOP(op, opnd) ((op)|(opnd))
#define OEND (1<<OPSHIFT) /* endmarker - */
#define OCHAR (2<<OPSHIFT) /* character unsigned char */
#define OBOL (3<<OPSHIFT) /* left anchor - */
#define OEOL (4<<OPSHIFT) /* right anchor - */
#define OANY (5<<OPSHIFT) /* . - */
#define OANYOF (6<<OPSHIFT) /* [...] set number */
#define OBACK_ (7<<OPSHIFT) /* begin d paren number */
#define O_BACK (8<<OPSHIFT) /* end d paren number */
#define OPLUS_ (9<<OPSHIFT) /* + prefix fwd to suffix */
#define O_PLUS (10<<OPSHIFT) /* + suffix back to prefix */
#define OQUEST_ (11<<OPSHIFT) /* ? prefix fwd to suffix */
#define O_QUEST (12<<OPSHIFT) /* ? suffix back to prefix */
#define OLPAREN (13<<OPSHIFT) /* ( fwd to ) */
#define ORPAREN (14<<OPSHIFT) /* ) back to ( */
#define OCH_ (15<<OPSHIFT) /* begin choice fwd to OOR2 */
#define OOR1 (16<<OPSHIFT) /* | pt. 1 back to OOR1 or OCH_ */
#define OOR2 (17<<OPSHIFT) /* | pt. 2 fwd to OOR2 or O_CH */
#define O_CH (18<<OPSHIFT) /* end choice back to OOR1 */
#define OBOW (19<<OPSHIFT) /* begin word - */
#define OEOW (20<<OPSHIFT) /* end word - */

Given the way doemit works (OR-ing the first and second parameters of EMIT and writing the result to p->strip), someone exploiting this has only limited control over the values that are written.
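
As an illustration of how constrained the written words are, the following sketch re-declares a few of the regex2.h macros with unsigned constants (an adaptation made here to keep the shifts well-defined; the originals use plain int constants) and adds two hypothetical helpers that split an emitted word back into its opcode and operand.

```c
#include <stdint.h>

/* Adapted from the regex2.h excerpt; the 'u' suffixes are an addition. */
#define OPSHIFT 26
#define SOP(op, opnd) ((op) | (opnd))
#define OCHAR   (2u << OPSHIFT)
#define OLPAREN (13u << OPSHIFT)

/* Hypothetical helpers, not part of the library. */
static uint32_t sop_opcode(uint32_t word)
{
    return word >> OPSHIFT;                 /* upper 6 bits: opcode number */
}

static uint32_t sop_operand(uint32_t word)
{
    return word & ((1u << OPSHIFT) - 1u);   /* lower 26 bits: operand */
}
```

With these definitions, SOP(OCHAR, (uint32_t)'A') is 0x08000041 and SOP(OLPAREN, 1u) is 0x34000001. Since the opcodes in the excerpt range from 1 to 20, the upper six bits of every written word are restricted to those values, and the lower 26 bits carry operands such as a character code or a subexpression number.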


