Advanced C++ Backend Development
Chapter 04

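4.10 How to Find Out How Much Data Is Readable in a Socket's Receive Buffer

# 4.10.1 Analysis

When a non-listening socket becomes readable, we may want to know how many bytes are already waiting in its receive buffer, much like what the java.io.InputStream.available() method in the Java JDK does.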

//class InputStream;
//Returns an estimate of the number of bytes that can be read (or skipped over) from 
//this input stream without blocking by the next invocation of a method for this input stream.
int	available();

Both Windows and Linux provide similar functionality.

On Windows, you can use the ioctlsocket() API function, whose signature is:

int ioctlsocket(SOCKET s, long cmd, u_long* argp);

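The parameter s is the handle of the socket to operate on, cmd is the operation to perform, and argp receives the result of the operation. The function returns 0 on success and a nonzero value (SOCKET_ERROR) on failure.

This function is very powerful; here we only discuss how to get the number of bytes in a socket's receive buffer, which is done by setting cmd to FIONREAD. The code looks like this: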

u_long bytesToRecv;
// clientsock is the handle of the socket to query
if (ioctlsocket(clientsock, FIONREAD, &bytesToRecv) == 0)
{
	// On success, bytesToRecv holds the number of bytes currently in the receive buffer
}

ioctlsocket() requires Windows Vista or Windows Server 2003, or a later version.

On Linux, you can use the ioctl() function, whose signature is:

#include <sys/ioctl.h>

int ioctl(int d, int request, ...);

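Its usage and return value are largely the same as those of the Windows ioctlsocket() function: it returns 0 on success and -1 on failure, with errno set. One difference matters in practice: for FIONREAD, Linux stores the result through a pointer to int, whereas ioctlsocket() takes a u_long*.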
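Since the two calls differ only in name and argument type, they can be hidden behind one small function. The following is a minimal sketch, not part of the original text: the names `socket_t` and `bytesAvailable` are hypothetical, and the function assumes the socket is a connected (non-listening) one.

```cpp
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;   // hypothetical alias, for illustration only
#else
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int socket_t;      // hypothetical alias, for illustration only
#endif

// Hypothetical helper: returns the number of bytes queued in the receive
// buffer of s, or -1 if the query fails.
long bytesAvailable(socket_t s)
{
#ifdef _WIN32
    u_long n = 0;                        // ioctlsocket() expects a u_long*
    if (ioctlsocket(s, FIONREAD, &n) != 0)
        return -1;
    return static_cast<long>(n);
#else
    int n = 0;                           // on Linux, FIONREAD writes an int
    if (ioctl(s, FIONREAD, &n) != 0)
        return -1;
    return n;
#endif
}
```

A caller would typically invoke `bytesAvailable(clientfd)` right after poll reports POLLIN and treat a negative result as "unknown" rather than as zero.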

Let's look at a complete example:

/**
 * Demonstrates how to find out how much data is readable in a socket's receive buffer, linux_ioctl.cpp
 * zhangyl 2019.11.12
 */
#include <sys/types.h> 
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <fcntl.h>
#include <poll.h>
#include <iostream>
#include <string.h>
#include <vector>
#include <errno.h>

// Marker for an invalid fd
#define INVALID_FD  -1

int main(int argc, char* argv[])
{
    // Create a listening socket
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd == INVALID_FD)
    {
        std::cout << "create listen socket error." << std::endl;
        return -1;
    }
	
	// Set the listening socket to non-blocking
	int oldSocketFlag = fcntl(listenfd, F_GETFL, 0);
	int newSocketFlag = oldSocketFlag | O_NONBLOCK;
	if (fcntl(listenfd, F_SETFL,  newSocketFlag) == -1)
	{
		close(listenfd);
		std::cout << "set listenfd to nonblock error." << std::endl;
		return -1;
	}
	
	// Allow address and port reuse
	int on = 1;
	setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, (char *)&on, sizeof(on));
	setsockopt(listenfd, SOL_SOCKET, SO_REUSEPORT, (char *)&on, sizeof(on));
	
	// Initialize the server address (zeroed first so that sin_zero is not left uninitialized)
	struct sockaddr_in bindaddr = {};
	bindaddr.sin_family = AF_INET;
	bindaddr.sin_addr.s_addr = htonl(INADDR_ANY);
	bindaddr.sin_port = htons(3000);
	if (bind(listenfd, (struct sockaddr *)&bindaddr, sizeof(bindaddr)) == -1)
	{
	    std::cout << "bind listen socket error." << std::endl;
		close(listenfd);
	    return -1;
	}
	
	// Start listening
	if (listen(listenfd, SOMAXCONN) == -1)
	{
	    std::cout << "listen error." << std::endl;
		close(listenfd);
	    return -1;
	}	
	
	std::vector<pollfd> fds;
	pollfd listen_fd_info;
	listen_fd_info.fd = listenfd;
	listen_fd_info.events = POLLIN;
	listen_fd_info.revents = 0;
	fds.push_back(listen_fd_info);
	
	// Whether any fd was marked invalid during this round
	bool exist_invalid_fd;
	int n;
	while (true)
	{
		exist_invalid_fd = false;
		n = poll(&fds[0], fds.size(), 1000);
		if (n < 0)
		{
			// Interrupted by a signal
			if (errno == EINTR)
				continue;
			
			// Error: exit the loop
			break;
		}
		else if (n == 0)
		{
			// Timed out: keep waiting
			continue;
		}
		
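		// Capture the size first: fds accepted in this round are not examined until the next poll
		size_t size = fds.size();
		for (size_t i = 0; i < size; ++i)
		{
			// Readable event
			if (fds[i].revents & POLLIN)
			{
				if (fds[i].fd == listenfd)
				{
					// Listening socket: accept the new connection
					struct sockaddr_in clientaddr;
					socklen_t clientaddrlen = sizeof(clientaddr);
					// Accept the client connection and add it to fds
					int clientfd = accept(listenfd, (struct sockaddr *)&clientaddr, &clientaddrlen);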
					if (clientfd != -1)
					{
						// Set the client socket to non-blocking
						int oldSocketFlag = fcntl(clientfd, F_GETFL, 0);
						int newSocketFlag = oldSocketFlag | O_NONBLOCK;
						if (fcntl(clientfd, F_SETFL,  newSocketFlag) == -1)
						{
							close(clientfd);
							std::cout << "set clientfd to nonblock error." << std::endl;						
						} 
						else
						{
							struct pollfd client_fd_info;
							client_fd_info.fd = clientfd;
							client_fd_info.events = POLLIN;
							client_fd_info.revents = 0;
							fds.push_back(client_fd_info);
							std::cout << "new client accepted, clientfd: " << clientfd << std::endl;
						}				
					}
				}
				else 
				{
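					// When the socket is readable, get the number of bytes in its receive buffer.
					// On Linux, FIONREAD stores its result through a pointer to int.
					int bytesToRecv = 0;
					if (ioctl(fds[i].fd, FIONREAD, &bytesToRecv) == 0)
					{
						std::cout << "bytesToRecv: " << bytesToRecv << std::endl;
					}
					
					// Ordinary clientfd: receive data (one byte is kept for the terminating '\0')
					char buf[64] = { 0 };
					int m = recv(fds[i].fd, buf, sizeof(buf) - 1, 0);
					if (m <= 0)
					{
						// m == 0 means the peer closed the connection; errno is only meaningful when m < 0
						if (m == 0 || (errno != EINTR && errno != EWOULDBLOCK))
						{
							// Error or peer closed the connection: close the clientfd and set the invalid flag
							std::cout << "client disconnected, clientfd: " << fds[i].fd << std::endl;
							close(fds[i].fd);
							fds[i].fd = INVALID_FD;
							exist_invalid_fd = true;
						}
					}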
					else
					{
						std::cout << "recv from client: " << buf << ", clientfd: " << fds[i].fd << std::endl;
					}
				}
			}
			else if (fds[i].revents & POLLERR)
			{
				// TODO: not handled for now
			}
			
		}// end  outer-for-loop
		
		if (exist_invalid_fd)
		{
			// Remove all invalid fds in one pass
			for (std::vector<pollfd>::iterator iter = fds.begin(); iter != fds.end(); )
			{
				if (iter->fd == INVALID_FD)
					iter = fds.erase(iter);
				else
					++iter;
			}
		}	
	}// end  while-loop
 
	// Close all sockets
	for (std::vector<pollfd>::iterator iter = fds.begin(); iter != fds.end(); ++ iter)
		close(iter->fd);			
	
	return 0;
}

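The program above listens on port 3000 and uses poll to detect read events on the listening socket and on the client sockets. When a client socket's readable event (POLLIN) fires, there is data to read, so we call ioctl() with FIONREAD to get the number of bytes in that socket's receive buffer and print it. We compile and start the program, then use the nc command to simulate a client for testing.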

Client-side input:

[root@myserver ~]# nc -v 127.0.0.1 3000
Ncat: Version 6.40 ( http://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:3000.
hello
world
xxxx

Server-side output:

[zhangyl@iZ238vnojlyZ test]$ g++ -g -o linux_ioctl linux_ioctl.cpp 
[zhangyl@iZ238vnojlyZ test]$ ./linux_ioctl 
new client accepted, clientfd: 4
bytesToRecv: 6
recv from client: hello
, clientfd: 4
bytesToRecv: 6
recv from client: world
, clientfd: 4
bytesToRecv: 5
recv from client: xxxx
, clientfd: 4

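Note that nc sends each line together with its terminating newline (\n), so both the client and the server show an extra line break after each string, and the byte count the server reports each time (the bytesToRecv value) is the length of the visible characters plus one newline. For example, hello\n is 6 bytes long.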
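The count is a number of bytes, not a number of lines: it matches one line at a time in the run above only because each line was read before the next one arrived. The sketch below illustrates this under some assumptions. It uses a local AF_UNIX socket pair instead of the TCP connection from the example, and the helper names `queuedBytes` and `queuedAfterSending` are hypothetical.

```cpp
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Hypothetical helper: number of bytes queued on fd, or -1 on error.
int queuedBytes(int fd)
{
    int n = 0;
    return ioctl(fd, FIONREAD, &n) == 0 ? n : -1;
}

// Hypothetical helper: writes every line to one end of a local socket pair
// without reading anything, then reports what FIONREAD sees on the other end.
int queuedAfterSending(const char* const lines[], int count)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int result = -1;
    bool ok = true;
    for (int i = 0; i < count && ok; ++i)
    {
        size_t len = strlen(lines[i]);
        ok = (write(sv[1], lines[i], len) == static_cast<ssize_t>(len));
    }
    if (ok)
        result = queuedBytes(sv[0]);

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

With a single line "hello\n" the function reports 6; with "hello\n" followed by "world\n" and no read in between, it reports the combined length of both lines.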

# 4.10.2 Caveats

There are two details in this section worth noting.

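Consider code like this:

   ulong bytesToRecv = 0;
   if (ioctl(fds[i].fd, FIONREAD, &bytesToRecv) == 0)
   {
   	// omitted...
   }

The third argument, bytesToRecv, is an output parameter. For most functions this means the variable does not need an initial value, because the function sets it on success. Here, however, the variable must be initialized to 0 to get a correct result on Linux. The reason is the type: for FIONREAD, Linux stores the result through a pointer to int, while ulong is 8 bytes on 64-bit Linux, so only 4 of its bytes are written and the rest keep whatever they held before. Zero-initializing hides the problem on little-endian machines; the more robust fix is to declare the variable as int. The Windows ioctlsocket() function does not have this issue, because its argp parameter is a u_long*, and u_long is 32 bits wide on Windows, so the whole variable is written.

// Windows: bytesToRecv may be left uninitialized
u_long bytesToRecv;
if (ioctlsocket(clientsock, FIONREAD, &bytesToRecv) == 0)
{
}

// Linux: declare bytesToRecv as int, the type that FIONREAD writes
int bytesToRecv = 0;
if (ioctl(clientsock, FIONREAD, &bytesToRecv) == 0)
{
}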
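One way to see why the initial value matters on 64-bit Linux is to pre-fill a wide variable with a known bit pattern and check what is left after the call. This is only a sketch under assumptions: it relies on FIONREAD writing exactly an int (the Linux behavior), uses a local AF_UNIX socket pair rather than a TCP connection, and the names `FionreadProbe` and `probeFionread` are hypothetical.

```cpp
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Hypothetical result type, for illustration only.
struct FionreadProbe
{
    bool ok;               // true if both ioctl() calls succeeded
    int asInt;             // result read through an int
    unsigned long asWide;  // result read through an unsigned long pre-filled with 0xFF bytes
};

FionreadProbe probeFionread()
{
    FionreadProbe r = { false, -1, 0 };
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return r;

    if (write(sv[1], "xxxx\n", 5) == 5)
    {
        int n = 0;
        unsigned long wide;
        memset(&wide, 0xFF, sizeof(wide));  // stands in for leftover stack contents

        if (ioctl(sv[0], FIONREAD, &n) == 0 &&
            ioctl(sv[0], FIONREAD, &wide) == 0)
        {
            r.ok = true;
            r.asInt = n;
            r.asWide = wide;  // bytes beyond sizeof(int) still hold 0xFF
        }
    }

    close(sv[0]);
    close(sv[1]);
    return r;
}
```

Where unsigned long is wider than int, asInt holds the byte count while asWide holds a value that mixes the count with the untouched 0xFF bytes, which is the kind of wrong result an uninitialized variable can produce.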

You might also think that before calling recv or read to receive data, you could call ioctlsocket or ioctl to get the size of the data and then allocate a buffer of that size. In pseudocode:

```
int bytesToRecv = 0;
if (ioctl(clientsock, FIONREAD, &bytesToRecv) != 0)
{
	// error, bail out
	return;
}

// allocate a buffer sized by bytesToRecv
char* pRecvBuf = new char[bytesToRecv]; 
// call recv
int ret = recv(clientsock, pRecvBuf, bytesToRecv, 0);
```
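This logic is actually flawed: between the call to ioctlsocket or ioctl and the call to recv or read, more data may arrive in the receive buffer, so the amount of data actually available to recv can be larger than bytesToRecv. Do not build logic on the assumption that the two values match, or you may end up writing incorrect code.

Real network programs rarely need to know in advance how much readable data is in the receive buffer; they usually decide how many bytes to read based on the actual requirements of the application protocol.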
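One common alternative that does not depend on FIONREAD at all is to read into a fixed-size buffer in a loop until the non-blocking socket reports that nothing more is available, and let the protocol layer decide what to do with the accumulated bytes. The following is a minimal sketch of that idea, not the author's code: the helper name `drainSocket`, the 4096-byte chunk size, and the return-value convention are all assumptions made for illustration.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <string>

// Hypothetical helper: appends everything currently readable on a
// non-blocking socket to out.
// Returns 1 when the buffer has been drained (recv would block),
//         0 when the peer has closed the connection,
//        -1 on any other error.
int drainSocket(int fd, std::string& out)
{
    char buf[4096];  // assumed chunk size; any reasonable size works
    while (true)
    {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n > 0)
        {
            out.append(buf, static_cast<size_t>(n));
            continue;
        }
        if (n == 0)
            return 0;                      // orderly shutdown by the peer
        if (errno == EINTR)
            continue;                      // interrupted, try again
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 1;                      // nothing more to read for now
        return -1;
    }
}
```

Data that arrives between two calls is simply picked up by the next recv, so the race described above does not affect correctness.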

Last updated: 2024/07/08, 00:14:14
