In this post I’ve documented how I assemble and link assembly code with the nasm assembler on my x86_64 MacBook Pro. While the assembly code itself is basically the same across the Intel x86_64 architecture, differences between operating systems arise in the system call numbers and the linking process.
Calling Convention #
macOS follows the System V ABI (section A.2.1), as Linux does: the system call number goes in rax, the arguments go in rdi, rsi, rdx, r10, r8 and r9, and all system calls enter the kernel via the syscall instruction (not int 0x80, which is the legacy 32-bit mechanism on Linux).
System Call Numbers #
macOS system call numbers are different from those of Linux x86_64 or BSD. In the macOS XNU kernel, system call numbers are tagged with class information:
/*
* Syscall classes for 64-bit system call entry.
* For 64-bit users, the 32-bit syscall number is partitioned
* with the high-order bits representing the class and low-order
* bits being the syscall number within that class.
* The high-order 32-bits of the 64-bit syscall number are unused.
* All system classes enter the kernel via the syscall instruction.
*/
There are five classes: none (invalid), Mach, Unix/BSD, machine-dependent and diagnostics (osfmk/mach/i386/syscall_sw.h):
#define SYSCALL_CLASS_NONE 0 /* Invalid */
#define SYSCALL_CLASS_MACH 1 /* Mach */
#define SYSCALL_CLASS_UNIX 2 /* Unix/BSD */
#define SYSCALL_CLASS_MDEP 3 /* Machine-dependent */
#define SYSCALL_CLASS_DIAG 4 /* Diagnostics */
Here is how a UNIX-class system call number is constructed:
// (2 << 24) | syscall number
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
(SYSCALL_NUMBER_MASK & (syscall_number)))
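To check the arithmetic, here is a small C sketch of the same construction, plus the reverse split into class and number. The shift of 24 follows the comment above the macro; the two mask definitions are my assumption of how the header defines them, and the helper names make_unix_syscall, class_of and number_of are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Class ID as listed above. */
#define SYSCALL_CLASS_UNIX 2

/* Assumed values: the shift matches the "2 << 24" comment; the masks
 * are written out here for illustration and may differ from the header. */
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK  (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK (~SYSCALL_CLASS_MASK)

#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
    ((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
     (SYSCALL_NUMBER_MASK & (syscall_number)))

/* Hypothetical helper: build the value that goes into rax. */
uint32_t make_unix_syscall(uint32_t number)
{
    /* The number must fit in the low 24 bits, otherwise the mask
     * inside the macro would silently truncate it. */
    assert((number & SYSCALL_CLASS_MASK) == 0);
    return SYSCALL_CONSTRUCT_UNIX(number);
}

/* Hypothetical helper: the class stored in the high-order byte. */
uint32_t class_of(uint32_t value)
{
    return (value & SYSCALL_CLASS_MASK) >> SYSCALL_CLASS_SHIFT;
}

/* Hypothetical helper: the syscall number within its class. */
uint32_t number_of(uint32_t value)
{
    return value & SYSCALL_NUMBER_MASK;
}
```

Under these assumptions, make_unix_syscall(4) gives 0x2000004 and make_unix_syscall(1) gives 0x2000001, which are the values used for write and exit in the hello world program later in this post.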
You can find the complete list of system calls and their respective numbers in the XNU kernel source mirror on GitHub: bsd/kern/syscalls.master
Linker Options #
The default entry point on macOS is start, the glue code that sets things up and then calls main. When you’re not linking with the system libraries, it’s perfectly fine to provide the start entry point yourself, or you can pick a different symbol with the linker’s -e option. (Overriding the entry point is not recommended when linking with the system libraries.)
There are also some options you can set that, I think, are used when you link with the system libraries. Since I’m not interested in doing that, I don’t really care about them and I keep my build script pretty simple.
I couldn’t find a complete list of these options, but I found macosx_version_min in some random Russian forum. It sets the earliest version of Mac OS X that the executable will run on.
Putting It Into Action #
Here is a simple hello world program that I wrote:
global _start
%define SYS_WRITE 0x2000004
%define SYS_EXIT 0x2000001
section .text
_start:
; user_ssize_t write(int fd, user_addr_t cbuf, user_size_t nbyte);
mov rax, SYS_WRITE
mov rdi, 1 ; stdout
mov rsi, msg
mov rdx, msg.len
syscall
; void exit(int rval)
mov rax, SYS_EXIT
mov rdi, 0
syscall
section .data
msg: db "Hello, World!", 10 ; 10 is the new line in ASCII
.len: equ $ - msg
Build script:
#!/bin/sh
set -xe
nasm -f macho64 -o "${1%.*}.o" "$1"
ld -e _start -static -o "${1%.*}" "${1%.*}.o"
Compiling the program:
./build.sh main.nasm
./main
It prints Hello, World! followed by a newline, as expected!
Other Notes #
You can also link the object file with gcc -e _start -Wl,-no_pie and the gcc driver will figure out the right linker flags for you. Also, when linking with gcc, _start can be replaced with _main so that you don’t need to specify the entry point for the executable.