bthread Source Code Analysis (Part 3): Context Switching Implemented in Assembly

Published 3 years ago

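Last time we saw that TaskGroup's run_main_task() relies on three key functions, and one of them, sched_to(), was left unexplained. Before we start today's tour of the sched_to() source, a fair warning: this article involves assembly language, so hold on tight!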

TaskGroup::sched_to()

sched_to() performs the context switch. Let's look at its code first and then walk through it:

inline void TaskGroup::sched_to(TaskGroup** pg, bthread_t next_tid) {
    TaskMeta* next_meta = address_meta(next_tid);
    if (next_meta->stack == NULL) {
        ContextualStack* stk = get_stack(next_meta->stack_type(), task_runner);
        if (stk) {
            next_meta->set_stack(stk);
        } else {
            // stack_type is BTHREAD_STACKTYPE_PTHREAD or out of memory,
            // In latter case, attr is forced to be BTHREAD_STACKTYPE_PTHREAD.
            // This basically means that if we can't allocate stack, run
            // the task in pthread directly.
            next_meta->attr.stack_type = BTHREAD_STACKTYPE_PTHREAD;
            next_meta->set_stack((*pg)->_main_stack);
        }
    }
    // Update now_ns only when wait_task did yield.
    sched_to(pg, next_meta);
}

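The function looks up the TaskMeta (TM) next_meta from the next_tid argument. If that TM has no stack yet, it asks get_stack() for a ContextualStack, stk, passing task_runner along.

If a stack is obtained, it is attached to next_meta. Otherwise (the stack type is BTHREAD_STACKTYPE_PTHREAD, or the allocation failed) next_meta falls back to the group's _main_stack, so the task runs directly on the pthread's stack.

Finally it calls the other overload of sched_to(), declared as: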

void TaskGroup::sched_to(TaskGroup** pg, TaskMeta* next_meta);

Its definition:

void TaskGroup::sched_to(TaskGroup** pg, TaskMeta* next_meta) {
    TaskGroup* g = *pg;

    // Save errno so that errno is bthread-specific.
    const int saved_errno = errno;
    void* saved_unique_user_ptr = tls_unique_user_ptr;

    TaskMeta* const cur_meta = g->_cur_meta;
    const int64_t now = butil::cpuwide_time_ns();
    const int64_t elp_ns = now - g->_last_run_ns;
    g->_last_run_ns = now;
    cur_meta->stat.cputime_ns += elp_ns;
    if (cur_meta->tid != g->main_tid()) {
        g->_cumulated_cputime_ns += elp_ns;
    }
    ++cur_meta->stat.nswitch;
    ++ g->_nswitch;

This first part records some statistics. Reading on: if the next TM (next_meta) is not the same as the current TM (cur_meta), the stacks are switched.

    // Switch to the task
    if (__builtin_expect(next_meta != cur_meta, 1)) {
        g->_cur_meta = next_meta;
        // Switch tls_bls
        cur_meta->local_storage = tls_bls;
        tls_bls = next_meta->local_storage;


        if (cur_meta->stack != NULL) {
            if (next_meta->stack != cur_meta->stack) {
                jump_stack(cur_meta->stack, next_meta->stack);
                // probably went to another group, need to assign g again.
                g = tls_task_group;
            }

        }
        // else because of ending_sched(including pthread_task->pthread_task)
    } else {
        LOG(FATAL) << "bthread=" << g->current_tid() << " sched_to itself!";
    }

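tls_bls is the bthread-local storage of the TM (bthread) that is currently running. The current value is first saved back into cur_meta, and tls_bls is then set to the next TM's local storage. After that, provided the current TM has a stack and the two stacks differ, jump_stack() is called to switch stacks.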
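The save-then-load order of the local storage can be tried in isolation. The sketch below is only an illustration: `LocalStorage`, `Task`, `tls_current_storage` and `switch_local_storage` are hypothetical stand-ins, not brpc types.

```cpp
#include <cassert>

// Hypothetical stand-in for bthread's per-task local storage.
struct LocalStorage {
    int value = 0;
};

// Hypothetical stand-in for TaskMeta: each task remembers its own storage.
struct Task {
    LocalStorage local_storage;
};

// Plays the role of tls_bls: the storage of whichever task is currently
// running on this worker thread.
thread_local LocalStorage tls_current_storage;

// Mirrors the two assignments in sched_to(): save first, then load.
inline void switch_local_storage(Task& cur, Task& next) {
    cur.local_storage = tls_current_storage;   // save the outgoing task's view
    tls_current_storage = next.local_storage;  // load the incoming task's view
}

// Scenario: task a writes 42, we switch to b and then back to a.
inline int demo_local_storage_roundtrip() {
    Task a, b;
    tls_current_storage = a.local_storage;
    tls_current_storage.value = 42;          // a modifies its storage while running
    switch_local_storage(a, b);              // b sees its own value
    assert(tls_current_storage.value == 0);
    switch_local_storage(b, a);              // a sees its own value again
    return tls_current_storage.value;
}
```

Calling `demo_local_storage_roundtrip()` returns the value task a wrote, which shows why the save has to happen before the thread-local is overwritten.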

Once the big if block is done, the TaskGroup's remained callback is run, if one has been set.

    while (g->_last_context_remained) {
        RemainedFn fn = g->_last_context_remained;
        g->_last_context_remained = NULL;
        fn(g->_last_context_remained_arg);
        g = tls_task_group;
    }

    // Restore errno
    errno = saved_errno;
    tls_unique_user_ptr = saved_unique_user_ptr;

    *pg = g;
}
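The shape of that while loop (read the callback, clear the slot, then call it) can be sketched on its own. `Group`, `DemoState`, `run_remained` and the two step functions below are hypothetical names for illustration, and the scenario of a callback registering a second one is an assumption made to show why a loop is used instead of a single if.

```cpp
#include <cassert>
#include <vector>

using RemainedFn = void (*)(void*);

// Hypothetical, stripped-down stand-in for the two TaskGroup fields.
struct Group {
    RemainedFn last_context_remained = nullptr;
    void* last_context_remained_arg = nullptr;

    void set_remained(RemainedFn fn, void* arg) {
        last_context_remained = fn;
        last_context_remained_arg = arg;
    }
};

// Same shape as the loop in sched_to(): the slot is cleared before the
// call, so a callback that registers a new one is picked up next round.
inline void run_remained(Group& g) {
    while (g.last_context_remained) {
        RemainedFn fn = g.last_context_remained;
        g.last_context_remained = nullptr;
        fn(g.last_context_remained_arg);
    }
}

struct DemoState {
    Group* group;
    std::vector<int> log;
};

inline void second_step(void* arg) {
    static_cast<DemoState*>(arg)->log.push_back(2);
}

inline void first_step(void* arg) {
    DemoState* s = static_cast<DemoState*>(arg);
    s->log.push_back(1);
    s->group->set_remained(second_step, s);  // chain a follow-up callback
}

inline std::vector<int> demo_remained() {
    Group g;
    DemoState s{&g, {}};
    g.set_remained(first_step, &s);
    run_remained(g);
    return s.log;
}
```

`demo_remained()` returns the order in which the two callbacks ran. Unlike the real code, the sketch does not re-read the group from a thread-local after each call.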

jump_stack()

It is defined in src/bthread/stack_inl.h:

inline void jump_stack(ContextualStack* from, ContextualStack* to) {
    bthread_jump_fcontext(&from->context, to->context, 0/*not skip remained*/);
}

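bthread_jump_fcontext() is actually written in assembly and lives in bthread/context.cpp. Its job is to switch (jump) between stack contexts. Its companion, bthread_make_fcontext(), creates the stack context for a bthread. These two functions are the core of stack context switching. The code is not original to brpc: it comes from the open-source project libcontext, a simplified implementation of boost::context. The copyright notice is visible in bthread/context.h: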

libcontext - a slightly more portable version of boost::context

Copyright Martin Husemann 2013. Copyright Oliver Kowalke 2009. Copyright Sergue E. Leontiev 2013. Copyright Thomas Sailer 2013. Minor modifications by Tomasz Wlostowski 2016.

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Context in libgo, another open-source C++ coroutine project, is derived from the same code.
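For readers who have not used a make/jump pair before, POSIX ucontext offers the same two-step shape: build a context on a fresh stack, then swap into it. The sketch below is not bthread's API and not how brpc switches contexts. It assumes Linux with glibc, and `task_entry`, `demo_ucontext_switch` and the globals are made-up names.

```cpp
#include <ucontext.h>
#include <cassert>
#include <string>
#include <vector>

namespace {

ucontext_t g_main_ctx;                 // context of the code that drives the task
ucontext_t g_task_ctx;                 // context of the task itself
std::vector<std::string> g_trace;      // records the order of execution

void task_entry() {
    g_trace.push_back("task: step 1");
    swapcontext(&g_task_ctx, &g_main_ctx);  // yield back to main
    g_trace.push_back("task: step 2");
    // Returning here resumes uc_link, which is g_main_ctx.
}

std::vector<std::string> demo_ucontext_switch() {
    static char stack[64 * 1024];      // the task's private stack
    g_trace.clear();

    // "make": describe a context that starts task_entry on its own stack.
    getcontext(&g_task_ctx);
    g_task_ctx.uc_stack.ss_sp = stack;
    g_task_ctx.uc_stack.ss_size = sizeof(stack);
    g_task_ctx.uc_link = &g_main_ctx;
    makecontext(&g_task_ctx, task_entry, 0);

    // "jump": save the current context and switch to the task.
    g_trace.push_back("main: start");
    swapcontext(&g_main_ctx, &g_task_ctx);
    g_trace.push_back("main: resumed");
    swapcontext(&g_main_ctx, &g_task_ctx);  // resume the task after its yield
    g_trace.push_back("main: done");
    return g_trace;
}

}  // namespace
```

The returned trace interleaves main and task entries, which is the observable effect of switching stacks twice in each direction.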

context.cpp defines bthread_jump_fcontext() and bthread_make_fcontext() for a range of platforms. The __asm block is how assembly code is written inside a C/C++ source file.

#if defined(BTHREAD_CONTEXT_PLATFORM_linux_x86_64) && defined(BTHREAD_CONTEXT_COMPILER_gcc)
__asm (
".text\n"
".globl bthread_jump_fcontext\n"
".type bthread_jump_fcontext,@function\n"
".align 16\n"
"bthread_jump_fcontext:\n"
"    pushq  %rbp  \n"
"    pushq  %rbx  \n"
"    pushq  %r15  \n"
"    pushq  %r14  \n"
"    pushq  %r13  \n"
"    pushq  %r12  \n"
"    leaq  -0x8(%rsp), %rsp\n"
"    cmp  $0, %rcx\n"
"    je  1f\n"
"    stmxcsr  (%rsp)\n"
"    fnstcw   0x4(%rsp)\n"
"1:\n"
"    movq  %rsp, (%rdi)\n"
"    movq  %rsi, %rsp\n"
"    cmp  $0, %rcx\n"
"    je  2f\n"
"    ldmxcsr  (%rsp)\n"
"    fldcw  0x4(%rsp)\n"
"2:\n"
"    leaq  0x8(%rsp), %rsp\n"
"    popq  %r12  \n"
"    popq  %r13  \n"
"    popq  %r14  \n"
"    popq  %r15  \n"
"    popq  %rbx  \n"
"    popq  %rbp  \n"
"    popq  %r8\n"
"    movq  %rdx, %rax\n"
"    movq  %rdx, %rdi\n"
"    jmp  *%r8\n"
".size bthread_jump_fcontext,.-bthread_jump_fcontext\n"
".section .note.GNU-stack,\"\",%progbits\n"
);
#endif

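The assembly here uses AT&T syntax, which differs from Intel syntax. For example, a mov is read from left to right: the source operand comes first and the destination second. The q suffix in movq and popq means the operand is a quadword (64 bits); 32-bit code would use movl and popl instead.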
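The operand order is easy to check with a single instruction in GCC-style extended inline assembly. This is a minimal sketch that assumes x86-64 and GCC or Clang with the default AT&T dialect; `copy_with_movq` is a made-up helper, and on other targets it simply falls back to a plain copy.

```cpp
#include <cassert>
#include <cstdint>

// Copies src into a fresh variable using one AT&T-syntax instruction.
inline std::uint64_t copy_with_movq(std::uint64_t src) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    std::uint64_t dst;
    // AT&T order is "movq source, destination": %1 is src, %0 is dst.
    // The Intel-syntax equivalent would be written "mov dst, src".
    __asm__("movq %1, %0" : "=r"(dst) : "r"(src));
    return dst;
#else
    return src;  // fallback for targets where the instruction does not exist
#endif
}
```

If the operands were swapped, the compiler-chosen output register would never be written, which is the quickest way to convince yourself of the direction.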

    pushq  %rbp
    pushq  %rbx
    pushq  %r15
    pushq  %r14
    pushq  %r13
    pushq  %r12

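This is routine: the callee-saved registers (%rbp, %rbx and %r12 to %r15 in the x86-64 System V calling convention) are pushed onto the stack, which preserves the caller's execution environment. Before control goes back to a caller, the values must be popped off the stack into the same registers so that the caller can carry on, hence the pops at the end of the function. Because %rsp is replaced in between, those pops read from the target stack, not from the one that was just pushed to.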

After the pushes:

leaq  -0x8(%rsp), %rsp

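This moves the stack pointer %rsp down by 8 bytes, reserving a slot for the floating-point control state (MXCSR and the x87 FPU control word).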

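It is also worth noting that bthread_jump_fcontext() is called with three arguments here, while the assembly implementation can take four. That fourth argument is the reason for the two conditional jumps in the code, to labels 1 and 2.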

First, how the function arguments map to registers:

Register	Argument
%rdi	1st argument
%rsi	2nd argument
%rdx	3rd argument
%rcx	4th argument

After the leaq instruction, the value of the fourth argument is checked.

    cmp  $0, %rcx
    je  1f
    stmxcsr  (%rsp)    // store the current MXCSR at the address in rsp
    fnstcw   0x4(%rsp) // store the current x87 FPU control word at rsp+4
1:

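If the fourth argument is 0, execution jumps straight to label 1, skipping the stmxcsr and fnstcw instructions. (1 is a local label that can be jumped to directly, much like goto in C; the f in 1f means the nearest label 1 further down.) In our scenario no fourth argument is passed, so this part can be ignored. Moving on: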

1:
    movq  %rsp, (%rdi)
    movq  %rsi, %rsp

As we know, %rdi and %rsi hold the first and second arguments, that is, &from->context and to->context.

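These two movq instructions are the heart of the stack switch. The first stores the current stack pointer (%rsp) into the memory the first argument points to. The second loads the value of the second argument into the stack pointer. Changing the stack pointer changes the top of the stack, and that is the actual stack switch.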

Next comes some less important code, again tied to the fourth argument:

    cmp  $0, %rcx
    je  2f
    ldmxcsr  (%rsp)
    fldcw  0x4(%rsp)
2:

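In other words, if the fourth argument is 0, execution jumps to label 2. The two skipped instructions, ldmxcsr and fldcw, are the inverse of the earlier stmxcsr and fnstcw: they restore the saved values.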
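The same save-then-restore pairing exists at the C++ level in `<cfenv>`. The sketch below is a portable analogue, not what bthread does: it saves the floating-point environment, changes the rounding mode, and restores the saved environment. `demo_fp_env_roundtrip` is a made-up name, and the sketch assumes the target defines FE_TOWARDZERO.

```cpp
#include <cassert>
#include <cfenv>

// Returns true if the rounding mode was changed and then brought back
// by restoring the saved floating-point environment.
inline bool demo_fp_env_roundtrip() {
    std::fenv_t saved;
    const int before = std::fegetround();

    // Save the environment (plays the role of stmxcsr/fnstcw).
    if (std::fegetenv(&saved) != 0) return false;

    // Disturb the control state, as code running in between might.
    if (std::fesetround(FE_TOWARDZERO) != 0) return false;
    const bool changed = (std::fegetround() == FE_TOWARDZERO);

    // Restore the environment (plays the role of ldmxcsr/fldcw).
    if (std::fesetenv(&saved) != 0) return false;

    return changed && std::fegetround() == before;
}
```

Skipping the pair, as happens when the fourth argument is 0, means the floating-point control settings are simply left as they are across the switch.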

2:
    leaq  0x8(%rsp), %rsp

%rsp moves back up by 8 bytes, releasing the slot reserved for the floating-point control state.

Next the registers are restored from the stack. Since it is a stack, they are popped in reverse order.

    popq  %r12
    popq  %r13
    popq  %r14
    popq  %r15
    popq  %rbx
    popq  %rbp

These six popq instructions are followed by one more popq that has no matching pushq.

    popq  %r8

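It loads into %r8 the address of the instruction to execute once bthread_jump_fcontext() returns. To expand a little: when function A calls function B, the call instruction pushes the return address onto the stack before B starts running, and B's own pushes land below it. (On x86-64 the first six integer arguments travel in registers, not on the stack.) Reversing that, once the six saved registers have been popped, the next value on the stack is the return address!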
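Putting the pushes, the 8-byte slot and the return address together, the memory above a suspended context's saved %rsp can be pictured as a struct. `SavedFrame` is a hypothetical illustration derived from the instruction order in the listing; brpc does not define such a type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical picture of what bthread_jump_fcontext leaves on the
// suspended stack, starting at the saved %rsp (lowest address first).
struct SavedFrame {
    std::uint32_t mxcsr;        // 0x00: written by stmxcsr when %rcx != 0
    std::uint16_t fpu_control;  // 0x04: written by fnstcw when %rcx != 0
    std::uint16_t padding;      // 0x06: unused rest of the 8-byte slot
    std::uint64_t r12;          // 0x08: pushed last, popped first
    std::uint64_t r13;          // 0x10
    std::uint64_t r14;          // 0x18
    std::uint64_t r15;          // 0x20
    std::uint64_t rbx;          // 0x28
    std::uint64_t rbp;          // 0x30: pushed first, popped last
    std::uint64_t return_addr;  // 0x38: pushed by the caller's call instruction
};

static_assert(offsetof(SavedFrame, fpu_control) == 0x04, "fnstcw uses 0x4(%rsp)");
static_assert(offsetof(SavedFrame, r12) == 0x08, "leaq 0x8(%rsp) skips the slot");
static_assert(offsetof(SavedFrame, return_addr) == 0x38, "popq %r8 reads this");
static_assert(sizeof(SavedFrame) == 0x40, "8 + 6 * 8 + 8 bytes");

// Offset of the value that popq %r8 picks up, relative to the saved %rsp.
inline std::size_t return_addr_offset() {
    return offsetof(SavedFrame, return_addr);
}
```

The static_asserts tie each field back to an offset used in the assembly, so the picture fails to compile if the field order is wrong.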

    movq  %rdx, %rax
    movq  %rdx, %rdi

%rdx holds the third argument, the skip remained flag, which is always 0 here. It is copied into both %rax and %rdi.

%rax holds the return value.

%rdi holds the first argument of a function, so this prepares the argument for the function that will run after the stack switch.

    jmp  *%r8

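This jumps to the return address, which is where the caller continues after its call to bthread_jump_fcontext(). Because the address was popped from the new stack, the caller in question is the code that earlier suspended the target context, not the code that has just called in.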