2021-03-26 11:59 (edited), C++

A Deep Dive into the C++ Thread Library std::thread

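The previous two installments covered a C++ learning path and how to study open-source projects.

Multithreading in C++ comes up frequently in interviews. This installment takes a close look at how to use the C++ thread library std::thread and how it is implemented on Linux.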

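A heads-up before you start: this installment contains a lot of code, so it reads best on a computer.

After a long wait, C++11 finally introduced the thread library std::thread. This installment has two goals:

How to create threads with std::thread
A close look at how std::thread is designed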

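Using std::thread

In the demo below, the main thread uses std::thread to create 3 child threads whose entry function is do_some_work, and waits for the child threads to finish before it exits.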

#include <iostream>
#include <thread>
#include <vector>

void do_some_work(int num) {
    std::cout << "thread: " << num << std::endl;
}

int main(int argc, char const *argv[]) {
    int threadNums = 3;
    std::vector<std::thread> threadList;
    threadList.reserve(threadNums);

    // 1. Create threadNums threads
    for (int idx = 0; idx < threadNums; ++idx) {
        threadList.emplace_back(std::thread{do_some_work, idx});
    }

    std::cout << "work in main thread" << std::endl;

    // 2. Wait for the threadNums threads to finish
    for (int idx = 0; idx < threadNums; ++idx) {
        threadList[idx].join();
    }

    std::cout << "main thread end" << std::endl;
    return 0;
}

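When the demo constructs the thread object std::thread{do_some_work, idx}, it uses {} rather than (). This is the recommended form because it keeps the compiler from resolving the statement in an unintended way; for the reasons, see the earlier post "Stop Wavering Between {} and (): Learn the Correct Way to Initialize".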

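The three child threads share the output stream std::cout, and the demo does nothing to protect this shared resource. The output may therefore not be what you expect; that is, it will most likely not look like this:

thread: 0
thread: 1
thread: 2

The actual output can be quite chaotic: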

$ g++ -g thread_unitest.cc  -o thread -lpthread && ./thread
thread: thread: 12 // the output of two threads is interleaved


work in main thread
thread: 0 // the first thread started prints last
main thread end // all child threads have finished

The output shows two things:

A thread that is created first does not necessarily run first;
The threads compete with one another for CPU time.

Data sharing between threads, and how to deal with it, is left for a later post. Next comes the design of std::thread.

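A closer look at std::thread

In g++ (libstdc++), std::thread is implemented on top of pthread. This section examines std::thread from three angles:

std::thread objects cannot be copied, only moved
Every thread has a unique identifier, the thread id
Creating a child thread

Move semantics

Many books say that ownership of a std::thread object can only be transferred, not copied. Put simply, a std::thread object is movable but not copyable. The relevant part of the std::thread class looks like this: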

class thread {
private:
    id _M_id;
public:
    thread() noexcept = default;

    template<typename _Callable,
             typename... _Args,
             typename = _Require<__not_same<_Callable>>>
    explicit thread(_Callable&& __f, _Args&&... __args) {
        //...
    }

    ~thread() {
        if (joinable())
            std::terminate();
    }
    // copying is disabled
    thread(const thread&) = delete;
    thread& operator=(const thread&) = delete;

    // std::thread can only be moved
    thread(thread&& __t) noexcept
    { swap(__t); }

    thread& operator=(thread&& __t) noexcept {
        if (joinable())
            std::terminate();
        swap(__t);
        return *this;
    }
    //...
};

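As the code shows, std::thread deletes the copy constructor and the copy assignment operator and keeps only the move constructor and move assignment operator, so a std::thread object can be moved but not copied. This is why the opening demo hands emplace_back a temporary std::thread: the element is move-constructed and the copy constructor is never needed.

Here are several ways one might try to add a std::thread object to threadList: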

threadList.emplace_back(std::thread{do_some_work, idx}); // 1) ok

std::thread trd{do_some_work, idx};
threadList.push_back(trd); // 2) error: the copy constructor is deleted
threadList.push_back(std::move(trd)); // 3) ok
threadList.emplace_back(std::move(trd)); // 4) ok, as an alternative to 3)

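Note: when push_back receives an rvalue, libstdc++ implements it by calling emplace_back, so 3) and 4) are effectively equivalent. Use only one of them for a given trd, because once trd has been moved from it no longer owns a thread.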

std::thread::id

Looking closely, a std::thread object has a single data member, _M_id:

id _M_id;

The full name of this id class is std::thread::id. It is implemented as follows:

typedef pthread_t native_handle_type;

class id {
    native_handle_type _M_thread;  // _M_thread is a pthread_t, the unique identifier of the thread
public:
    id() noexcept : _M_thread() { }  // _M_thread is value-initialized, i.e. 0
    explicit id(native_handle_type __id) : _M_thread(__id) { }
private:
    friend class thread;
    friend class hash<thread::id>;

    // comparison operators for std::thread::id
    friend bool operator==(thread::id __x, thread::id __y) noexcept;
    friend bool operator<(thread::id __x,  thread::id __y) noexcept;
    // stream insertion (<<) for std::thread::id
    template<class _CharT, class _Traits>
    friend basic_ostream<_CharT, _Traits>&
    operator<<(basic_ostream<_CharT, _Traits>& __out, thread::id __id);
};

So std::thread::id is simply a wrapper around a pthread_t value that serves as the identifier of each thread.

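When a std::thread object is constructed without a thread entry function, _M_id._M_thread is 0.

In the demo below, for example, trd is given no entry function; it is default-constructed, which initializes trd's _M_id._M_thread to 0.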

#include <iostream>
#include <thread>

int main(int argc, char const *argv[]) {
    std::thread trd;
    std::cout << trd.get_id() << std::endl;
    return 0;
}

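Printing the thread identifier trd.get_id(), however, does not output 0. That is simply how operator<< is defined for std::thread::id: the message tells the caller that no thread has been started.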

$ g++  thread_.cc -o thread_ && ./thread_
thread::id of a non-executing thread

The overloaded operator<< for std::thread::id shows what is going on:

template<class _CharT, class _Traits>
inline basic_ostream<_CharT, _Traits>& operator<<(basic_ostream<_CharT, _Traits>& __out, thread::id __id) {
    // the thread has not been started
    if (__id == thread::id())
        return __out << "thread::id of a non-executing thread";
    // the thread was started successfully
    else
        return __out << __id._M_thread;
}

// equality comparison of ids
inline bool operator==(thread::id __x, thread::id __y) noexcept {
    return __x._M_thread == __y._M_thread;
}

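Consequently, whether a thread has been started can be checked as follows. Note that get_id() also returns the default id again once the thread has been joined or detached.

bool thread_is_active(const std::thread::id& thread_id) {
    return thread_id != std::thread::id();
}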

Only when a thread entry function is supplied does _M_id._M_thread hold a real value.

#include <iostream>
#include <thread>

int main(int argc, char const *argv[]) {
    std::thread trd{[]{ std::cout << "work in sub-thread\n"; }};

    std::cout << trd.get_id() << std::endl;
    trd.join();
    return 0;
}

The output is:

$ g++  thread_.cc -o thread_ -lpthread && ./thread_
139794901763840
work in sub-thread

Only when an entry function is explicitly supplied does _M_id._M_thread hold the thread's tid value, which is filled in by pthread_create(&tid, NULL, ...).

by the way

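When a std::thread object trd is created with a thread entry function, you must call trd.join() or trd.detach() to state how the child thread relates to the main thread. Otherwise the whole program is aborted by std::terminate() when the std::thread object is destroyed.

Without an entry function, trd.joinable() returns false, so std::terminate() is not triggered.

~thread() {
    if (joinable())
        std::terminate();
}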

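Creating the child thread

When a std::thread object is constructed with a thread entry function, the matching constructor calls pthread_create to create the child thread. Here is the overall implementation first: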

// std::thread constructor
template<typename _Callable,
         typename... _Args,
         typename = _Require<__not_same<_Callable>>>
explicit thread(_Callable&& __f, _Args&&... __args)
{
    static_assert( __is_invocable<typename decay<_Callable>::type,
                                  typename decay<_Args>::type...>::value,
                   "std::thread arguments must be invocable after conversion to rvalues");

    // Create a reference to pthread_create, not just the gthr weak symbol.
    auto __depend = reinterpret_cast<void(*)()>(&pthread_create);
    // start the thread
    _M_start_thread(_S_make_state(__make_invoker(std::forward<_Callable>(__f),
                                                 std::forward<_Args>(__args)...)),
                    __depend);
}

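Now for the constructor's execution flow in detail:

At compile time, check whether the thread entry function __f passed to the std::thread constructor can be invoked with its arguments __args.

In the demo below, for example, the entry function thread_func takes one int parameter arg. If the argument passed as __args cannot be implicitly converted to int, or if no argument is passed at all, the static_assert in the std::thread constructor fires with: error: static assertion failed: std::thread arguments must be invocable after conversion to rvalues.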

#include <thread>

void thread_func(int arg) {   }

int main(int argc, char const *argv[]) {
    std::thread trd_1{thread_func, "str"};  // wrong type for arg
    std::thread trd_2{thread_func};         // arg is missing

    // ...
    return 0;
}

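Wrap the thread entry function __f and its arguments __args one step further.

This is done with __make_invoker:

__make_invoker(std::forward<_Callable>(__f), std::forward<_Args>(__args)...)

__make_invoker returns an _Invoker object. _Invoker is a functor: calling _Invoker() runs the thread entry function __f with the given arguments __args, much like std::bind: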

#include <functional>
#include <iostream>

void print_num(int i) {
    std::cout << i << '\n';
}

int main(int argc, const char* argv[]) {
    // wrapper
    auto invoker = std::bind(print_num, -9);
    // calling invoker() invokes print_num with the bound argument -9
    invoker();
}

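Starting the child thread

Before _M_start_thread is called to start the child thread, a _State_ptr object is created to wrap the _Invoker object, and that object is then passed to _M_start_thread. This step is performed by _S_make_state, which returns the _State_ptr object.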

// base class
struct _State {
    virtual ~_State();          // virtual destructor
    virtual void _M_run() = 0;  // runs the thread
};
using _State_ptr = unique_ptr<_State>; // pointer to the base class

// derived class
template<typename _Callable>
struct _State_impl : public _State {
    _Callable _M_func; // thread entry function

    _State_impl(_Callable&& __f) : _M_func(std::forward<_Callable>(__f))
    { }

    void _M_run() { _M_func(); } // invoke the thread entry function
};

// takes an _Invoker object and returns a _State_ptr object
template<typename _Callable>
static _State_ptr _S_make_state(_Callable&& __f)  {
    using _Impl = _State_impl<_Callable>;
    // the base-class pointer takes ownership of a derived-class object
    return _State_ptr{new _Impl{std::forward<_Callable>(__f)}};
}

_S_make_state wraps the thread entry function __f and its arguments __args in a _State_ptr object, here called _State_ptr_obj, so that __f can eventually be invoked through _State_ptr_obj->_M_run().

Next comes _M_start_thread:

void thread::_M_start_thread(_State_ptr state, void (*)())
{
    const int err = __gthread_create(&_M_id._M_thread,
                                     &execute_native_thread_routine, // function run by the new thread
                                     state.get());
    if (err)
        __throw_system_error(err);
    state.release(); // ownership has passed to the new thread
}

// calls pthread_create internally
static inline int __gthread_create(pthread_t *__threadid, void *(*__func) (void*), void *__args)
{
    return pthread_create(__threadid, NULL, __func, __args);
}
// runs the thread entry function
static void* execute_native_thread_routine(void* __p)
{
    thread::_State_ptr __t{static_cast<thread::_State*>(__p)};
    __t->_M_run(); // invoke the thread entry function
    return nullptr;
}

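Therefore _M_id._M_thread != 0 holds only after _M_start_thread has run.

That completes the two goals set out at the start of this post. The next post will look at how to protect data shared between threads.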
