C++后端开发进阶
Chapter03

3.10 线程局部存储

对于一个存在多个线程的进程来说，有时候我们需要有一份数据是每个线程都拥有一份的，也就是说每个线程自己操作自己的这份数据，这有点类似 C++ 类的实例属性，每个实例对象操作的都是自己的属性。我们把这样的数据称之为线程局部存储（Thread Local Storage，TLS），对应的存储区域叫做线程局部存储区。

# 3.10.1 Windows 的线程局部存储

Windows 系统将线程局部存储区分成 TLS_MINIMUM_AVAILABLE 个块，每一块通过一个索引值对外提供访问。

//TLS_MINIMUM_AVAILABLE 的默认是 64，在 winnt.h 中定义：
#define TLS_MINIMUM_AVAILABLE 64

1
2

Windows 中使用函数 TlsAlloc 获得一个线程局部存储块的索引：

DWORD TlsAlloc();

如果这个函数调用失败，返回值是 TLS_OUT_OF_INDEXES（0xFFFFFFFF）；如果函数调用成功，得到一个索引，接下来就可以利用这个索引去往这个内存块中存储数据或者从这个内存块中得到数据，分别使用如下两个 API 函数：

LPVOID TlsGetValue(DWORD dwTlsIndex);
BOOL TlsSetValue(DWORD dwTlsIndex, LPVOID lpTlsValue);

1
2

当你不再需要这个存储区域时，你应该释放它，释放调用函数：

BOOL TlsFree(DWORD dwTlsIndex);

当然，使用线程局部存储除了使用上面介绍的 API 函数外，Microsoft VC++ 编译器还提供了如下方法定义一个线程局部变量：

__declspec(thread) int g_mydata = 1;

我们看一个具体例子：

#include <Windows.h>
#include <iostream>

__declspec(thread) int g_mydata = 1;

DWORD __stdcall WorkerThreadProc1(LPVOID lpThreadParameter)
{
    while (true)
    {
        ++g_mydata;
        
        Sleep(1000);
    }
    return 0;
}

DWORD __stdcall WorkerThreadProc2(LPVOID lpThreadParameter)
{
    while (true)
    {       
        std::cout << "g_mydata = " << g_mydata << ", ThreadID = " << GetCurrentThreadId() << std::endl;
        
        Sleep(1000);
    }
    return 0;
}

int main()
{
    HANDLE hWorkerThreads[2];
    hWorkerThreads[0] = CreateThread(NULL, 0, WorkerThreadProc1, NULL, 0, NULL);
    hWorkerThreads[1] = CreateThread(NULL, 0, WorkerThreadProc2, NULL, 0, NULL);
    
    CloseHandle(hWorkerThreads[0]);
    CloseHandle(hWorkerThreads[1]);

    while (true)
    {
        Sleep(1000);
    }
    
    return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43

上述代码中全局变量 g_mydata 是一个线程局部变量，因此该进程中每一个线程都会拥有这样一个变量副本，由于是不同的副本，WorkerThreadProc1 中将这个变量不断递增，对 WorkerThreadProc2 的 g_mydata 不会造成任何影响，因此其值始终是 1。程序执行结果如下：

需要说明的是，在 Windows 系统中被声明成线程局部变量的对象，在编译器生成可执行文件时，会在最终的 PE 文件中专门生成一个叫 tls 的节，这个节用于存放这些线程局部变量。

# 3.10.2 Linux 的线程局部存储

Linux 系统上的 NPTL （Native POSIX Threads Library，NPTL is the GNU C library POSIX threads implementation that is used on modern Linux systems）提供了一套函数接口来实现线程局部存储的功能：

int pthread_key_create(pthread_key_t* key, void (*destructor)(void*));
int pthread_key_delete(pthread_key_t key);

int pthread_setspecific(pthread_key_t key, const void* value);
void* pthread_getspecific(pthread_key_t key);

1
2
3
4
5

pthread_key_create 函数调用成功会返回 0 值，调用失败返回非 0 值，函数调用成功会为线程局部存储创建一个新键，用户通过参数 key 去设置（调用 pthread_setspecific）和获取（pthread_getspecific）数据，因为进程中的所有线程都可以使用返回的键，所以参数 key 应该指向一个全局变量。

参数 destructor 是一个自定义函数指针，其签名是：

void* destructor(void* value)
{
    /*多是为了释放value指针指向的资源*/
}

1
2
3
4

线程终止时，如果 key 关联的值不是 NULL，那么 NTPL 会自动执行定义的 destructor 函数；如果无须解构，可以将 destructor 设置为 NULL。

我们来看一个具体例子：

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

//线程局部存储key
pthread_key_t thread_log_key;

void write_to_thread_log(const char* message)
{
	if (message == NULL)
		return;
	
	FILE* logfile = (FILE*)pthread_getspecific(thread_log_key);
	fprintf(logfile, "%s\n", message);
	fflush(logfile);
} 

void close_thread_log(void* logfile)
{
	char logfilename[128];
	sprintf(logfilename, "close logfile: thread%ld.log\n", (unsigned long)pthread_self());
	printf(logfilename);
	
	fclose((FILE *)logfile);
} 

void* thread_function(void* args)
{
	char logfilename[128];
	sprintf(logfilename, "thread%ld.log", (unsigned long)pthread_self());
	
	FILE* logfile = fopen(logfilename, "w");
	if (logfile != NULL)
	{
		pthread_setspecific(thread_log_key, logfile);
	
		write_to_thread_log("Thread starting...");
	}
	
	return NULL;
} 

int main()
{
	pthread_t threadIDs[5];	
	pthread_key_create(&thread_log_key, close_thread_log);
	for(int i = 0; i < 5; ++i)
	{
		pthread_create(&threadIDs[i], NULL, thread_function, NULL);
	}
	
	for(int i = 0; i < 5; ++i)
	{
		pthread_join(threadIDs[i], NULL);
	}
	
	return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58

上述程序一共创建 5 个线程，每个线程都会自己生成一个日志文件，每个线程将自己的日志写入自己的文件中，当线程执行结束时，会关闭打开的日志文件句柄。

程序运行结果如下：

生成的 5 个日志文件中，其内容都写入了一行“Thread starting...”。

上面的程序首先调用 pthread_key_create 函数来申请一个槽位。在NPTL实现下，pthread_key_t 是无符号整型，pthread_key_create 调用成功时会将一个小于1024 的值填入第一个入参指向的 pthread_key_t 类型的变量中。之所以小于1024，是因为 NPTL 实现一共提供了1024个槽位。如图8-1所示，记录槽位分配情况的数据结构 pthread_keys 是进程唯一的，pthread_keys 结构示意图如下：

和 Windows 一样 Linux gcc 编译器也提供了一个关键字 __thread 去简化定义线程局部变量。例如：

__thread int val = 0;

我们再来看一个示例：

#include <pthread.h>
#include <iostream>
#include <unistd.h>

//线程局部存储key
__thread int g_mydata = 99;

void* thread_function1(void* args)
{
	while (true)
	{
		g_mydata ++;
	}
	
	return NULL;
} 

void* thread_function2(void* args)
{
	while (true)
	{		
		std::cout << "g_mydata = " << g_mydata << ", ThreadID: " << pthread_self() << std::endl;
		sleep(1);
	}
	
	return NULL;
} 

int main()
{
	pthread_t threadIDs[2];	
	pthread_create(&threadIDs[0], NULL, thread_function1, NULL);
	pthread_create(&threadIDs[1], NULL, thread_function2, NULL);
	
	for(int i = 0; i < 2; ++i)
	{
		pthread_join(threadIDs[i], NULL);
	}
	
	return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41

由于 thread_function1 修改的是自己的 g_mydata，因此 thread_function2 输出 g_mydata 的值始终是 99。

[root@localhost testmultithread]# g++ -g -o linuxtls2 linuxtls2.cpp -lpthread
[root@localhost testmultithread]# ./linuxtls2
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
g_mydata = 99, ThreadID: 140243186276096
...更多输出结果省略...

1
2
3
4
5
6
7
8
9
10
11
12
13
14

# 3.10.3 C++ 11 的 thread_local 关键字

C++ 11 标准提供了一个新的关键字 thread_local 来定义一个线程变量。使用方法如下：

thread_local int g_mydata = 1;

有了这个关键字，使用线程局部存储的代码就可以同时在 Windows 和 Linux 上运行了。示例如下：

#include <thread>
#include <chrono>
#include <iostream>

thread_local int g_mydata = 1;

void thread_func1()
{
	while (true)
	{
		++g_mydata;
	}
}

void thread_func2()
{
	while (true)
	{
		std::cout << "g_mydata = " << g_mydata << ", ThreadID = " << std::this_thread::get_id() << std::endl;
		std::this_thread::sleep_for(std::chrono::seconds(1));
	}
}

int main()
{
	std::thread t1(thread_func1);
	std::thread t2(thread_func2);

	t1.join();
	t2.join();

	return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33

需要注意的是，如果读者是在 Windows 平台下，虽然 thread_local 关键字在 C++ 11 标准中引入，但是 Visual Studio 2013 （支持 C++ 11 语法的最低的一个版本）编译器却并不支持这个关键字，建议在 Visual Studio 2015 及以上版本中测试上述代码。

最后，关于线程局部存储变量，我还需要再强调两点：

对于线程变量，每个线程都会有该变量的一个拷贝，并行不悖，互不干扰，该局部变量一直都在，直到线程退出为止。
系统的线程局部存储区域内存空间并不大，所以尽量不要利用这个空间存储大的数据块，如果不得不使用大的数据块，可以将大的数据块存储在堆内存中，再将该堆内存的地址指针存储在线程局部存储区域。

上次更新: 2024/07/08, 00:14:14

← 3.9 多线程使用锁实践经验总结 3.11 C 库的非线程安全函数→