
Compiling nasm Assembly code in x86_64 OSX

In this post I document how I compile assembly code with the nasm assembler on my x86_64 MacBook Pro. While the assembly code itself is basically the same across Intel x86_64 systems, the differences lie in the system call numbers and the linking process.

Calling Convention #

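All system calls enter the kernel via the syscall instruction and follow the System V ABI (section A.2.1), just as Linux x86_64 does: the system call number goes in rax and the arguments in rdi, rsi, rdx, r10, r8 and r9. (int 0x80 is the legacy 32-bit Linux mechanism, not the native x86_64 one.) 1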

System Call Numbers #

macOS system call numbers differ from those of Linux x86_64 or BSD. System calls in the macOS XNU kernel are tagged with class information:

/*
 * Syscall classes for 64-bit system call entry.
 * For 64-bit users, the 32-bit syscall number is partitioned
 * with the high-order bits representing the class and low-order
 * bits being the syscall number within that class.
 * The high-order 32-bits of the 64-bit syscall number are unused.
 * All system classes enter the kernel via the syscall instruction.
 */

The classes are NONE (invalid), Mach, Unix/BSD, machine-dependent and diagnostics (osfmk/mach/i386/syscall_sw.h):

#define SYSCALL_CLASS_NONE  0   /* Invalid */
#define SYSCALL_CLASS_MACH  1   /* Mach */
#define SYSCALL_CLASS_UNIX  2   /* Unix/BSD */
#define SYSCALL_CLASS_MDEP  3   /* Machine-dependent */
#define SYSCALL_CLASS_DIAG  4   /* Diagnostics */
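To make the partitioning concrete, here is a minimal sketch that splits a raw system call number into its class and its number within that class. The shift and mask values are what I believe the same header defines (a class shift of 24 with an 8-bit class field), so treat them as assumptions; the helper names syscall_class_of, syscall_number_of and syscall_class_name are hypothetical and not part of XNU.

```c
#include <stdint.h>

/* Assumed to match osfmk/mach/i386/syscall_sw.h; verify against your XNU version. */
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK  (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK (~SYSCALL_CLASS_MASK)

/* Hypothetical helper: extract the class from the high-order bits. */
static uint32_t syscall_class_of(uint32_t raw)
{
    return (raw & SYSCALL_CLASS_MASK) >> SYSCALL_CLASS_SHIFT;
}

/* Hypothetical helper: extract the number within the class from the low-order bits. */
static uint32_t syscall_number_of(uint32_t raw)
{
    return raw & SYSCALL_NUMBER_MASK;
}

/* Hypothetical helper: map a class value to the label used in the header comments. */
static const char *syscall_class_name(uint32_t cls)
{
    switch (cls) {
    case 0:  return "Invalid";
    case 1:  return "Mach";
    case 2:  return "Unix/BSD";
    case 3:  return "Machine-dependent";
    case 4:  return "Diagnostics";
    default: return "Unknown";
    }
}
```

For example, passing 0x2000004 to these helpers yields class 2 (Unix/BSD) and number 4.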

Here is how a system call number in the UNIX class is constructed:

// (2 << 24) | syscall number
#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
            ((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
             (SYSCALL_NUMBER_MASK & (syscall_number)))
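As a quick sanity check of that macro, the sketch below reproduces it in a standalone file and wraps it in a function. The shift and mask definitions are assumed to match the kernel header (shift of 24, 8-bit class field), and unix_syscall is a hypothetical wrapper name used only for illustration.

```c
#include <stdint.h>

/* Assumed to match osfmk/mach/i386/syscall_sw.h; verify against your XNU version. */
#define SYSCALL_CLASS_UNIX  2
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK  (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK (~SYSCALL_CLASS_MASK)

#define SYSCALL_CONSTRUCT_UNIX(syscall_number) \
            ((SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | \
             (SYSCALL_NUMBER_MASK & (syscall_number)))

/* Hypothetical wrapper: the value to load into rax for a BSD system call. */
static uint32_t unix_syscall(uint32_t number_in_class)
{
    return SYSCALL_CONSTRUCT_UNIX(number_in_class);
}
```

With write as number 4 and exit as number 1 in the BSD class, unix_syscall(4) gives 0x2000004 and unix_syscall(1) gives 0x2000001, which are the values to put in rax before the syscall instruction.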

You can find the complete list of system calls and their respective numbers in bsd/kern/syscalls.master in the XNU kernel source mirror on GitHub.

Linker Options #

The default entry point on macOS is start, which lives in the system startup code and in turn calls your program's main function. When you link with the system libraries, overriding it is not recommended. When you’re not linking with the system libraries, it’s perfectly fine either to name your own entry point start or to pick a different symbol with the -e linker option.

There are also variables you can define that, I think, are used when you link with the system libraries. Since I’m not interested in doing that, I don’t really care about them and I keep my build script pretty simple.

I couldn’t find the complete list of these variables, but I found macosx_version_min in some random Russian forum; it sets the earliest version of Mac OS X that the executable will run on. 2

Putting It Into Action #

Here is a simple hello world program that I wrote:

global _start

%define SYS_WRITE 0x2000004
%define SYS_EXIT  0x2000001

section .text

_start:
    ; user_ssize_t write(int fd, user_addr_t cbuf, user_size_t nbyte);
    mov       rax, SYS_WRITE
    mov       rdi, 1             ; stdout
    mov       rsi, msg
    mov       rdx, msg.len
    syscall

    ; void exit(int rval)
    mov       rax, SYS_EXIT
    mov       rdi, 0
    syscall

section .data

msg:  db  "Hello, World!", 10     ; 10 is the new line in ASCII
.len: equ $ - msg

Build script:

#!/bin/sh
set -xe

nasm -f macho64 -o "${1%.*}.o" "$1"
ld -e _start -static -o "${1%.*}" "${1%.*}.o"

Compiling and running the program:

./build.sh main.nasm
./main

And it outputs Hello, World! with a new line as expected!

Other Notes #

You can also link the object file with gcc -e _start -Wl,-no_pie, and gcc will figure out the right linker flags for you. When linking through gcc, _start can also be renamed to _main so that you don’t need to specify the entry point for the executable.