Fuchsia是google开发的全新微内核操作系统,用来替换android。本文根据fuchsia最新的官方代码,来分析它所提供的一些安全功能。
现在的内核地址随机化不止包含内核代码段地址随机化, 还包括了内核自身页表、内核的堆等等。
zircon内核并没有支持内核地址随机化, 以aarch64为例:
zircon/kernel/arch/arm64/start.S:
adr_global tmp, kernel_relocated_base
ldr kernel_vaddr, [tmp]
Kernel_relocated_base符号保存着内核实际的加载地址,定义在:
zircon/kernel/arch/arm64/mmu.cc:
#if DISABLE_KASLR
uint64_t kernel_relocated_base = KERNEL_BASE;
#else
uint64_t kernel_relocated_base = 0xffffffff10000000;
#endif
可以看到,当前是一个固定的值,注释也强调了未来会使用随机化。
// Static relocated base to prepare for KASLR. Used at early boot and by gdb
// script to know the target relocated address.
// TODO(fxbug.dev/24762): Choose it randomly.
X86架构同样没有实现:
zircon/kernel/arch/x86/mmu.cc
// Static relocated base to prepare for KASLR. Used at early boot and by gdb
// script to know the target relocated address.
// TODO(thgarnie): Move to a dynamically generated base address
#if DISABLE_KASLR
uint64_t kernel_relocated_base = KERNEL_BASE - KERNEL_LOAD_OFFSET;
#else
uint64_t kernel_relocated_base = 0xffffffff00000000;
#endif
在商用os系统里了,windows nt内核首先将内核页表做了地址随机化处理,linux和xnu都没有实现。
zircon内核也同样没有实现。
zircon/kernel/arch/arm64/mmu.cc
// The main translation table for the kernel. Globally declared because it's reached
// from assembly.
pte_t arm64_kernel_translation_table[MMU_KERNEL_PAGE_TABLE_ETRIES_TOP] __ALIGNED(
MMU_KERNEL_PAGE_TABLE_ENTRIES_TOP * 8);
Fuchsia的stack、代码段(包含pie)等等都是通过zx_vmar_map函数来建立的,这是一个系统调用, fuchsia的系统调用似乎是自动生成的,笔者对这块的逻辑尚未熟悉,不过在docs文档里有说明是随机化产生的地址。
docs/reference/syscalls/vmar_map.md:
*vmar_offset* must be 0 if *options* does not have **ZX_VM_SPECIFIC** or
**ZX_VM_SPECIFIC_OVERWRITE** set. If neither of those are set, then
the mapping will be assigned an offset at random by the kernel (with an
allocator determined by policy set on the target VMAR).
fuchsia系统没有提供app证书签名和代码完整性校验的功能。
fuchsia系统没有提供系统调用审计和过滤的功能。
zircon/kernel/arch/arm64/exceptions.S:
LOCAL_FUNCTION_LABEL(arm64_syscall_dispatcher)
start_isr_func_cfi
cmp x16, #ZX_SYS_COUNT
bhs .Lunknown_syscall
csel x16, xzr, x16, hs
csdb
adr x12, .Lsyscall_table
add x12, x12, x16, lsl #4
br x12
SPECULATION_POSTFENCE
在判断系统调用号是否超出范围之后, 直接调用了syscall_table对应的函数指针。
没有像linux提供了audit系统调用审计和secomp系统调用过滤的功能。
X86架构同样如此:
zircon/kernel/arch/x86/syscall.S:
.balign 16
FUNCTION_LABEL(x86_syscall)
// swap to the kernel GS register
swapgs
// save the user stack pointer
mov %rsp, %gs:PERCPU_SAVED_USER_SP_OFFSET
// load the kernel stack pointer
mov %gs:PERCPU_KERNEL_SP_OFFSET, %rsp
.cfi_def_cfa %rsp, 0
push_value %gs:PERCPU_SAVED_USER_SP_OFFSET // User stack
push_value %r11 // RFLAGS
push_value %rcx // RIP
push_value %r15
push_value %r14
push_value %r13
push_value %r12
push_value $0 // R11 was trashed by the syscall instruction.
push_value %r10
push_value %r9
push_value %r8
push_value %rbp
push_value %rdi
push_value %rsi
push_value %rdx
push_value $0 // RCX was trashed by the syscall instruction.
push_value %rbx
push_value %rax
cmp $ZX_SYS_COUNT, %rax
jae .Lunknown_syscall
leaq .Lcall_wrapper_table(%rip), %r11
movq (%r11,%rax,8), %r11
lfence
jmp *%r11
fuchsia在提供的mmap接口中,没有禁止映射内存0的限制,而这个功能在linux和NT内核中都做了限制。
zircon/third_party/ulib/musl/src/mman/mmap.c
mmap的主体函数中没有对addr地址做限制。
同上, mmap/mprtect接口中没有对可写、可执行权限做限制, linux、xnu、nt都实现了此保护功能。
Printf未实现%K内核地址保护功能,利用%p可能将内核地址泄露出来。
zircon/third_party/ulib/musl/src/stdio/vfprintf.c
Fuchsia未实现类似linux的引用计数溢出保护。
zircon/kernel/lib/counters/counters.cc
zircon内核在启动之出对内核代码段的属性设置为rwx,在后续的vm_init中,将内核的代码和数据属性进行了正确设置:
zircon/kernel/vm/vm.cc:
namespace {
const ktl::array _kernel_regions = {
kernel_region{
.name = "kernel_code",
.base = (vaddr_t)__code_start,
.size = ROUNDUP((uintptr_t)__code_end - (uintptr_t)__code_start, PAGE_SIZE),
.arch_mmu_flags = ARCH_MMU_FLAG_PERM_READ | ARCH_MMU_FLAG_PERM_EXECUTE,
},
kernel_region{
.name = "kernel_rodata",
.base = (vaddr_t)__rodata_start,
.size = ROUNDUP((uintptr_t)__rodata_end - (uintptr_t)__rodata_start, PAGE_SIZE),
.arch_mmu_flags = ARCH_MMU_FLAG_PERM_READ,
},
kernel_region{
.name = "kernel_data",
.base = (vaddr_t)__data_start,
.size = ROUNDUP((uintptr_t)__data_end - (uintptr_t)__data_start, PAGE_SIZE),
.arch_mmu_flags = ARCH_MMU_FLAG_PERM_READ | ARCH_MMU_FLAG_PERM_WRITE,
},
kernel_region{
.name = "kernel_bss",
.base = (vaddr_t)__bss_start,
.size = ROUNDUP((uintptr_t)_end - (uintptr_t)__bss_start, PAGE_SIZE),
.arch_mmu_flags = ARCH_MMU_FLAG_PERM_READ | ARCH_MMU_FLAG_PERM_WRITE,
},
};
}
fuchsia使用clang编译器,默认开启了fstack-protector参数。
在启动阶段设置thread pointer地址时就开始引入了一个随机化的stack canary值。
zircon/kernel/arch/arm64/start.S
bl choose_stack_guard
zircon/kernel/top/debug.cc
__NO_SAFESTACK uintptr_t choose_stack_guard(void) {
uintptr_t guard;
if (hw_rng_get_entropy(&guard, sizeof(guard)) != sizeof(guard)) {
// We can't get a random value, so use a randomish value.
guard = 0xdeadbeef00ff00ffUL ^ (uintptr_t)&guard;
}
return guard;
}
以后每个线程在创建的时候,都会使用上述初始化选定的canary值。
zircon/kernel/arch/arm64/thread.cc
void arch_thread_initialize(Thread* t, vaddr_t entry_point) {
// compiler ABI (TPIDR_EL1 + ZX_TLS_STACK_GUARD_OFFSET).
t->arch().stack_guard = Thread::Current::Get()->arch().stack_guard;
// set the stack pointer
t->arch().sp = (vaddr_t)frame;
#if __has_feature(safe_stack)
DEBUG_ASSERT(IS_ALIGNED(t->stack_.unsafe_top(), 16));
t->arch().unsafe_sp = t->stack_.unsafe_top();
#endif
#if __has_feature(shadow_call_stack)
// The shadow call stack grows up.
t->arch().shadow_call_sp = reinterpret_cast<uintptr_t*>(t->stack().shadow_call_base());
#endif
}
所以每个线程使用的canary值都是同一个, 较linux每个线程使用不同的值有所弱化。
fuchsia使用的是clang编译器,默认使用了-fsanitize=shadow-call-stack,llvm的官方文档指出仅在aarch64上实现了shadow stack的功能,x86_64上因为一些安全原因将这个功能移除主线代码了。
Aarch64上,fuchsia的abi约定x18寄存器保存的是shadow stack的地址,由于笔者尚未编译完zircon内核二进制,以下示例来自llvm官方文档。
int foo() {
return bar() + 1;
}
不开启shadow stack:
stp x29, x30, [sp, #-16]!
mov x29, sp
bl bar
add w0, w0, #1
ldp x29, x30, [sp], #16
Ret
开启后:
str x30, [x18], #8
stp x29, x30, [sp, #-16]!
mov x29, sp
bl bar
add w0, w0, #1
ldp x29, x30, [sp], #16
ldr x30, [x18, #-8]!
ret
Safe stack与shadow stack功能非常相近,使用-fsanitize=safe-stack参数编译。Shadow stack仅仅保存函数的返回地址,而safe stack除了保存返回地址外,一些寄存器值和局部变量的值也会保存。
在shadow stack中, sp保存的是shadow stack的地址,x18寄存器保存的是正常stack的地址。而在safe stack中,sp保存的是safe stack的地址,而unsafe stack的地址是保存在fs或gs寄存器的某个固定偏移中。
X86_64: %gs:ZX_TLS_UNSAFE_SP_OFFSET
Aarch64: __builtin_thread_pointer()或者TPIDR_EL0/1