FILE Structure Attack: Part 1

While reading through some of the pwn challenges from the recent Hacklu CTF 2022, the byor challenge caught my eye because it is about a FILE structure attack on a recent glibc. I have played CTFs for a while, but I had no idea how this kind of attack works.

So I decided to learn the fundamentals first, and I wrote this note along the way. Some of the attacks and concepts that I explain in this series no longer work on recent glibc versions. Still, I believe we need to understand the older attacks first, because they give us the fundamentals of the FILE structure attack and make the newest variants much easier to follow. This note is a summary of the various resources I read while learning about the topic, and I hope it will help future me and anyone else who is trying to learn about it.

Disclaimer
Feel free to reach out if you find any mistakes in this article. This topic is new to me, and my understanding might be wrong.

FILE Explanation

Intro

FILE is a data type defined in glibc that is typically used when we want to work with a file in C. Note that it is different from the OS file descriptor. Its main purpose is to make file operations faster by using a buffer to reduce the number of IO syscalls (read, write).

The concept (simplified) is as follows. Instead of issuing a write syscall every time you want to write new data to a file (which hands the data to the OS, and ultimately the hard disk, on every call), you use the stdio functions that operate on the FILE data type (fwrite in this case). stdio keeps the data in a buffer that resides in memory first, and only moves it to the hard disk (via the OS syscall) when a certain condition is met, for example when the buffer is full or is flushed.

Implementation in C

After reading through the glibc source code, I summarize below some notable pieces of the implementation related to the FILE structure, so that we can better understand how it works. (We use glibc-2.35 in this section, but keep in mind that the struct definitions may differ slightly between glibc versions.)

FILE

typedef struct _IO_FILE FILE;

It turns out that the FILE data type is a struct called _IO_FILE.

_IO_FILE

/* The tag name of this struct is _IO_FILE to preserve historic
   C++ mangled names for functions taking FILE* arguments.
   That name should not be used in new code.  */
struct _IO_FILE
{
  int _flags;		/* High-order word is _IO_MAGIC; rest is flags. */

  /* The following pointers correspond to the C++ streambuf protocol. */
  char *_IO_read_ptr;	/* Current read pointer */
  char *_IO_read_end;	/* End of get area. */
  char *_IO_read_base;	/* Start of putback+get area. */
  char *_IO_write_base;	/* Start of put area. */
  char *_IO_write_ptr;	/* Current put pointer. */
  char *_IO_write_end;	/* End of put area. */
  char *_IO_buf_base;	/* Start of reserve area. */
  char *_IO_buf_end;	/* End of reserve area. */

  /* The following fields are used to support backing up and undo. */
  char *_IO_save_base; /* Pointer to start of non-current get area. */
  char *_IO_backup_base;  /* Pointer to first valid character of backup area */
  char *_IO_save_end; /* Pointer to end of non-current get area. */

  struct _IO_marker *_markers;

  struct _IO_FILE *_chain;

  int _fileno;
  int _flags2;
  __off_t _old_offset; /* This used to be _offset but it's too small.  */

  /* 1+column number of pbase(); 0 is unknown. */
  unsigned short _cur_column;
  signed char _vtable_offset;
  char _shortbuf[1];

  _IO_lock_t *_lock;
#ifdef _IO_USE_OLD_IO_FILE
};

struct _IO_FILE_complete
{
  struct _IO_FILE _file;
#endif
  __off64_t _offset;
  /* Wide character stream stuff.  */
  struct _IO_codecvt *_codecvt;
  struct _IO_wide_data *_wide_data;
  struct _IO_FILE *_freeres_list;
  void *_freeres_buf;
  size_t __pad5;
  int _mode;
  /* Make sure we don't get into trouble again.  */
  char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)];
};

This is roughly how the FILE struct is implemented. For now, we can skip how the individual fields are used. We will explain more about them later, when we talk about the history of attack scenarios in glibc via the FILE structure.

_IO_FILE_plus

/* We always allocate an extra word following an _IO_FILE.
   This contains a pointer to the function jump table used.
   This is for compatibility with C++ streambuf; the word can
   be used to smash to a pointer to a virtual function table. */

struct _IO_FILE_plus
{
  FILE file;
  const struct _IO_jump_t *vtable;
};

Glibc also has an extended version of the _IO_FILE struct called _IO_FILE_plus, which is _IO_FILE + vtable (vtable = virtual table = array of pointers to the helper functions used while executing IO operations). The default file streams (stdin, stdout, stderr) use this extended version instead of the raw _IO_FILE. A file opened with fopen uses the extended version as well.

Why do we use the extended version (_IO_FILE_plus)? The vtable lets the same generic IO code work with different kinds of streams: each stream carries pointers to its own helper implementations, and the generic code simply calls through them. The data type of the vtable is _IO_jump_t (see the code below), which stores the pointers to the IO helper methods.

_IO_jump_t

struct _IO_jump_t
{
    JUMP_FIELD(size_t, __dummy);
    JUMP_FIELD(size_t, __dummy2);
    JUMP_FIELD(_IO_finish_t, __finish);
    JUMP_FIELD(_IO_overflow_t, __overflow);
    JUMP_FIELD(_IO_underflow_t, __underflow);
    JUMP_FIELD(_IO_underflow_t, __uflow);
    JUMP_FIELD(_IO_pbackfail_t, __pbackfail);
    /* showmany */
    JUMP_FIELD(_IO_xsputn_t, __xsputn);
    JUMP_FIELD(_IO_xsgetn_t, __xsgetn);
    JUMP_FIELD(_IO_seekoff_t, __seekoff);
    JUMP_FIELD(_IO_seekpos_t, __seekpos);
    JUMP_FIELD(_IO_setbuf_t, __setbuf);
    JUMP_FIELD(_IO_sync_t, __sync);
    JUMP_FIELD(_IO_doallocate_t, __doallocate);
    JUMP_FIELD(_IO_read_t, __read);
    JUMP_FIELD(_IO_write_t, __write);
    JUMP_FIELD(_IO_seek_t, __seek);
    JUMP_FIELD(_IO_close_t, __close);
    JUMP_FIELD(_IO_stat_t, __stat);
    JUMP_FIELD(_IO_showmanyc_t, __showmanyc);
    JUMP_FIELD(_IO_imbue_t, __imbue);
};

How is the vtable filled in when a new extended FILE struct is created? It depends on the method that you use.

For example, if you open a new file via fopen, the code below shows that the vtable is initialized with an existing vtable called _IO_file_jumps. There are many other predefined vtables besides _IO_file_jumps (for example, _IO_str_jumps).

_IO_FILE *
__fopen_internal (const char *filename, const char *mode, int is32)
{
...
  _IO_JUMPS (&new_f->fp) = &_IO_file_jumps;
...
}

How are the vtable entries called? It turns out that glibc defines some macros to make the jump calls easier. For example:

# define _IO_JUMPS_FUNC(THIS) (IO_validate_vtable (_IO_JUMPS_FILE_plus (THIS)))
...
...
#define JUMP1(FUNC, THIS, X1) (_IO_JUMPS_FUNC(THIS)->FUNC) (THIS, X1)
...
...
/* The 'finish' function does any final cleaning up of an _IO_FILE object.
   It does not delete (free) it, but does everything else to finalize it.
   It matches the streambuf::~streambuf virtual destructor.  */
typedef void (*_IO_finish_t) (_IO_FILE *, int); /* finalize */
#define _IO_FINISH(FP) JUMP1 (__finish, FP, 0)

The code above shows how glibc defines macros for IO operations. Each macro expands to a call through the function pointer stored in the vtable under the corresponding field (index).

For example, calling _IO_FINISH(FP) calls the function pointer stored in the vtable of the given FILE, specifically the FP.vtable[idx] entry (idx is the index of __finish, and the vtable is _IO_file_jumps in this case).
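The macro-based dispatch can be imitated with a small mock. This is only a sketch: the `mock_*` and `MOCK_*` names are hypothetical stand-ins for the glibc structs and macros, the jump table has two entries instead of the full list, and the vtable validation done by IO_validate_vtable is left out. It shows that the same call site ends up in a different handler depending on which vtable the stream carries.

```c
#include <assert.h>
#include <stddef.h>

struct mock_file {                 /* stands in for struct _IO_FILE */
    int flags;
    int last_op;                   /* records which handler ran */
};

struct mock_jump_t {               /* stands in for struct _IO_jump_t */
    void (*finish)(struct mock_file *, int);
    int  (*overflow)(struct mock_file *, int);
};

struct mock_file_plus {            /* stands in for struct _IO_FILE_plus */
    struct mock_file file;
    const struct mock_jump_t *vtable;
};

/* Same idea as _IO_JUMPS_FUNC and JUMP1, minus the vtable validation. */
#define MOCK_JUMPS(fp)         (((struct mock_file_plus *) (fp))->vtable)
#define MOCK_JUMP1(F, fp, x)   (MOCK_JUMPS(fp)->F) (fp, x)
#define MOCK_FINISH(fp)        MOCK_JUMP1(finish, fp, 0)
#define MOCK_OVERFLOW(fp, ch)  MOCK_JUMP1(overflow, fp, ch)

static void file_finish(struct mock_file *fp, int x)  { (void) x; fp->last_op = 1; }
static int  file_overflow(struct mock_file *fp, int c) { fp->last_op = 2; return c; }
static void str_finish(struct mock_file *fp, int x)   { (void) x; fp->last_op = 3; }
static int  str_overflow(struct mock_file *fp, int c)  { fp->last_op = 4; return c; }

/* Two predefined jump tables, in the spirit of _IO_file_jumps and _IO_str_jumps. */
const struct mock_jump_t mock_file_jumps = { file_finish, file_overflow };
const struct mock_jump_t mock_str_jumps  = { str_finish,  str_overflow  };

/* Builds a stream with the given vtable, runs "finish" on it through the
   macro, and returns the id of the handler that actually ran. */
int finish_and_report(const struct mock_jump_t *vt)
{
    assert(vt != NULL);
    struct mock_file_plus f = { { 0, 0 }, vt };
    MOCK_FINISH(&f.file);
    return f.file.last_op;
}

/* Same as above, but for the "overflow" entry. */
int overflow_and_report(const struct mock_jump_t *vt)
{
    assert(vt != NULL);
    struct mock_file_plus f = { { 0, 0 }, vt };
    MOCK_OVERFLOW(&f.file, 'A');
    return f.file.last_op;
}
```

The cast in `MOCK_JUMPS` works because the FILE part is the first member of the extended struct, so a pointer to one is also a pointer to the other.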

_IO_list_all

struct _IO_FILE_plus *_IO_list_all = &_IO_2_1_stderr_;

Another key point about the FILE structure is that glibc maintains a linked list of all the FILE objects in a process. They are connected to each other via the _chain field (refer to the _IO_FILE struct). By default the head of the linked list is stderr (see the GDB output below), and the head is updated to point to the most recent FILE that you open.

gef➤  print _IO_list_all
$23 = (struct _IO_FILE_plus *) 0x7ffff7dd2520 <_IO_2_1_stderr_>
gef➤  print _IO_2_1_stderr_.file._chain
$24 = (struct _IO_FILE *) 0x7ffff7dd2600 <_IO_2_1_stdout_>
gef➤  print _IO_2_1_stdout_.file._chain
$25 = (struct _IO_FILE *) 0x7ffff7dd18c0 <_IO_2_1_stdin_>
gef➤  print _IO_2_1_stdin_.file._chain
$26 = (struct _IO_FILE *) 0x0
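The "head is updated with the most recent FILE" behavior is plain head insertion into a singly linked list. Below is a small mock of that idea. The `list_file`, `mock_list_all`, `mock_link_in`, and `mock_list_filenos` names are hypothetical; in glibc the real work is done by an internal function (`_IO_link_in`), which also takes care of locking that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

struct list_file {
    int fileno;
    struct list_file *chain;       /* plays the role of _chain */
};

/* Plays the role of _IO_list_all: always points at the newest stream. */
static struct list_file *mock_list_all = NULL;

/* Head insertion: the new stream points at the old head and becomes the head. */
void mock_link_in(struct list_file *fp)
{
    assert(fp != NULL);
    fp->chain = mock_list_all;
    mock_list_all = fp;
}

/* Walks the chain from the head, stores up to max file numbers in out,
   and returns how many were stored. */
size_t mock_list_filenos(int *out, size_t max)
{
    size_t n = 0;
    for (struct list_file *fp = mock_list_all; fp != NULL && n < max; fp = fp->chain)
        out[n++] = fp->fileno;
    return n;
}
```

Linking stdin (0), stdout (1), and stderr (2) in that order gives the same shape as the GDB output: stderr at the head, then stdout, then stdin. Linking one more stream afterwards puts it in front of stderr.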

The usage of vtable in a FILE structure

To give an example of how the IO Operations work and how the vtable will be used, let’s take a look at what will happen when we call exit() in the below C program.

#include <stdio.h>
#include <stdlib.h>

int main() {
    exit(1337);
}

What happens when the binary executes exit? Do IO operations take part in it?

Let’s take a look at the glibc implementation (version 2.35) and try to follow the calls (I will skip some LOCs because I only want to showcase how the vtable will be used).

exit

void
exit (int status)
{
  __run_exit_handlers (status, &__exit_funcs, true, true);
}

Okay, it turns out that exit will call __run_exit_handlers. Let’s move into that method.

__run_exit_handlers

/* Call all functions registered with `atexit' and `on_exit',
   in the reverse of the order in which they were registered
   perform stdio cleanup, and terminate program execution with STATUS.  */
void
attribute_hidden
__run_exit_handlers (int status, struct exit_function_list **listp,
		     bool run_list_atexit, bool run_dtors)
{
...
  if (run_list_atexit)
    RUN_HOOK (__libc_atexit, ());
...
}

Focusing on that line of code, what does it do?

Inspecting the compiled binary with gdb, it turns out that it calls _IO_cleanup:

gef➤  disas __run_exit_handlers
...
0x00007ffff7a4a1fd <+125>:   lea    rbp,[rip+0x383694]        # 0x7ffff7dcd898 <__elf_set___libc_atexit_element__IO_cleanup__>
...
gef➤  disas __elf_set___libc_atexit_element__IO_cleanup__
Dump of assembler code for function _IO_cleanup:

_IO_cleanup

int
_IO_cleanup (void)
{
  /* We do *not* want locking.  Some threads might use streams but
     that is their problem, we flush them underneath them.  */
  int result = _IO_flush_all_lockp (0);

  /* We currently don't have a reliable mechanism for making sure that
     C++ static destructors are executed in the correct order.
     So it is possible that other static destructors might want to
     write to cout - and they're supposed to be able to do so.

     The following will make the standard streambufs be unbuffered,
     which forces any output from late destructors to be written out. */
  _IO_unbuffer_all ();

  return result;
}

What will _IO_flush_all_lockp do?

_IO_flush_all_lockp

int
_IO_flush_all_lockp (int do_lock)
{
...
  last_stamp = _IO_list_all_stamp;
  fp = (_IO_FILE *) _IO_list_all;
  while (fp != NULL)
    {
      run_fp = fp;
      if (do_lock)
	_IO_flockfile (fp);

      if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
	   || (_IO_vtable_offset (fp) == 0
	       && fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
				    > fp->_wide_data->_IO_write_base))
#endif
	   )
	  && _IO_OVERFLOW (fp, EOF) == EOF)
	result = EOF;

      if (do_lock)
	_IO_funlockfile (fp);
      run_fp = NULL;

      if (last_stamp != _IO_list_all_stamp)
	{
	  /* Something was added to the list.  Start all over again.  */
	  fp = (_IO_FILE *) _IO_list_all;
	  last_stamp = _IO_list_all_stamp;
	}
      else
	fp = fp->_chain;
    }
...
}

Some notes that you can take away from reading this code:

  • _IO_flush_all_lockp iterates over every available FILE (starting from the head of the FILE linked list stored in _IO_list_all).
  • If certain conditions are met (for a regular, non-wide stream: _IO_write_ptr > _IO_write_base, which means there is pending data in the write buffer), it calls _IO_OVERFLOW (fp, EOF).

Remember that _IO_OVERFLOW (fp, EOF) means that the call jumps to the pointer stored in fp.vtable[__overflow].

This is one example of how the vtable in a FILE object is used. The same kind of vtable call happens in many other IO functions as well; it is not limited to exit.

N.B. If you explore further on your own, you will notice that _IO_unbuffer_all, which is also called during _IO_cleanup, contains a vtable call as well: _IO_SETBUF (fp, NULL, 0);
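The core of the loop can be condensed into a mock that keeps only the two points from the notes: walk the chain, and call the overflow entry of the vtable when there is pending output. All `sk_*` names are hypothetical, the vtable pointer is stored inside the struct for brevity, and locking, the list stamp, and wide streams are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

struct sk_file;

struct sk_jump_t {
    int (*overflow)(struct sk_file *, int);
};

struct sk_file {
    char *write_base;              /* start of the put area */
    char *write_ptr;               /* current put pointer */
    struct sk_file *chain;         /* next stream in the list */
    const struct sk_jump_t *vtable;
    int flushed;                   /* set by the handler, for inspection only */
};

/* Pretends to write the pending bytes out and resets the put pointer. */
static int sk_overflow(struct sk_file *fp, int ch)
{
    (void) ch;
    fp->write_ptr = fp->write_base;
    fp->flushed = 1;
    return 0;
}

const struct sk_jump_t sk_file_jumps = { sk_overflow };

/* Simplified flush-all loop: for every stream in the chain that has pending
   output, call the overflow entry of its vtable. Returns the number of calls. */
int sk_flush_all(struct sk_file *list_head)
{
    int calls = 0;
    for (struct sk_file *fp = list_head; fp != NULL; fp = fp->chain) {
        if (fp->write_ptr > fp->write_base) {
            assert(fp->vtable != NULL);
            fp->vtable->overflow(fp, -1);      /* -1 stands in for EOF */
            calls++;
        }
    }
    return calls;
}
```

A stream whose put pointer equals its base is skipped, so a second pass over the same list makes no calls at all.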

Possible Attack Scenario

Taking the scenario above as an example of how IO operations work inside the exit call of the C library, there are some possible attack scenarios that abuse the FILE structure:

  • Hijack the vtable of an IO file (for example, stdout).
    • Remember that when we call exit in the example above, it iterates over the FILE linked list, and if some constraints are fulfilled, it calls fp.vtable[__overflow].
    • If we’re able to replace the __overflow entry of the file’s vtable with, let’s say, a pointer to system, then when the binary calls exit(), it will execute a command on its way out. Some possible ways to hijack it:
      • Create a fake vtable and overwrite the vtable pointer stored in the IO file with the address of our fake vtable, so that when the IO operation tries to call __overflow, it jumps to our desired function pointer.
      • Overwrite the vtable pointer with the address of another existing vtable. For example, by default stdout, stdin, and stderr use _IO_file_jumps as their vtable. We can try to overwrite it with _IO_str_jumps, so that when an IO operation wants to call __overflow, it uses the __overflow stored in the str jumps vtable instead of the one in the file jumps vtable (this will be explained more in the next article, but the __overflow in the str jumps can be abused to call our desired function pointer if we’re able to forge some of the file structure metadata).
      • Misalign the vtable pointer, so that when an IO operation tries to call, let’s say, __finish, the misalignment makes it call __overflow instead (and we continue with the scenario from the previous point).
  • Forge a fake FILE structure with a fake vtable, and then somehow trigger _IO_flush_all_lockp.
    • Remember that _IO_flush_all_lockp iterates over each FILE in the linked list, so if we’re able to link in a fake FILE structure and trigger the flush, it will use our fake vtable, which allows us to execute a command as well.
  • Abuse the FILE buffer metadata so that a write operation lands at our desired target address (Arbitrary Address Write).
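To make the first scenario concrete, here is a harmless mock of the fake vtable idea. Everything in it is hypothetical (`hj_*` names, a one-entry jump table, a handler that only bumps a counter instead of being system), and it models the memory corruption as a plain assignment. It also performs no vtable validation at all, unlike the IO_validate_vtable check that appears in the glibc macros shown earlier.

```c
#include <assert.h>
#include <stddef.h>

struct hj_file;

struct hj_jump_t {
    int (*overflow)(struct hj_file *, int);
};

struct hj_file {
    char *write_base;
    char *write_ptr;
    struct hj_file *chain;
    const struct hj_jump_t *vtable;
};

static int hj_hits[2];             /* [0] = legitimate handler, [1] = attacker's target */

static int legit_overflow(struct hj_file *fp, int ch)
{
    (void) fp; (void) ch;
    hj_hits[0]++;
    return 0;
}

/* Stands in for whatever function the attacker wants to reach. */
static int attacker_target(struct hj_file *fp, int ch)
{
    (void) fp; (void) ch;
    hj_hits[1]++;
    return 0;
}

const struct hj_jump_t hj_file_jumps = { legit_overflow };
const struct hj_jump_t hj_fake_jumps = { attacker_target };   /* the fake vtable */

/* Same simplified flush loop as in the exit path: pending output means a
   call through the overflow entry of the stream's vtable. */
static void hj_flush_all(struct hj_file *head)
{
    for (struct hj_file *fp = head; fp != NULL; fp = fp->chain) {
        if (fp->write_ptr > fp->write_base) {
            assert(fp->vtable != NULL);
            fp->vtable->overflow(fp, -1);
        }
    }
}

/* Returns 1 if the flush ended up in the attacker's function, 0 otherwise. */
int hj_demo(int corrupt_vtable)
{
    char buf[8];
    struct hj_file victim = { buf, buf + 1, NULL, &hj_file_jumps };

    if (corrupt_vtable)
        victim.vtable = &hj_fake_jumps;        /* models the overwrite primitive */

    hj_hits[0] = hj_hits[1] = 0;
    hj_flush_all(&victim);
    return hj_hits[1] != 0;
}
```

Note that the flush code itself is unchanged in both runs; only the pointer it trusts is different.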

The details of each attack will be explained in the next part. Spoiler: some of these attacks only work on old glibc versions.
