
Application control of TCP retransmission on Linux

乜建柏
2023-03-14
Question

For the impatient:

How can I change the value of /proc/sys/net/ipv4/tcp_retries2 for a single connection in Linux, using setsockopt(), ioctl() or the like? Is it possible at all?

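Longer description:

I'm developing an application that uses long polling HTTP requests. On the server side, it needs to know when the client has closed the connection. The accuracy is not critical, but the delay certainly cannot be 15 minutes. Something close to a minute would do fine.

For those not familiar with the concept, a long polling HTTP request works like this:

  • The client sends a request.
  • The server responds with HTTP headers but leaves the response open. Chunked transfer encoding is used, which allows the server to send data as it becomes available.
  • When all the data has been sent, the server sends a "closing chunk" to signal that the response is complete.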

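In my application, the server sends "heartbeats" to the client at regular intervals (every 30 seconds by default). A heartbeat is just a newline character sent as a response chunk. This keeps the line busy so that we notice when the connection is lost.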
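As an illustration of what the heartbeat and the closing chunk look like on the wire, here is a minimal sketch of chunked-encoding framing in C. The function names (format_chunk, format_heartbeat, format_last_chunk) are hypothetical and not taken from the application described here; the framing itself follows the HTTP/1.1 chunk layout of a hexadecimal length, CRLF, the data, and CRLF.

```c
#include <stdio.h>
#include <string.h>

/* Frame `len` bytes of `data` as one HTTP/1.1 chunk:
 * <hex length>\r\n<data>\r\n
 * Returns the number of bytes written to `out`, or -1 if `out` is too
 * small. The output is not NUL-terminated; use the returned length. */
int format_chunk(char *out, size_t out_size, const char *data, size_t len)
{
    int header = snprintf(out, out_size, "%zx\r\n", len);
    if (header < 0 || (size_t)header + len + 2 > out_size)
        return -1;
    memcpy(out + header, data, len);
    memcpy(out + header + len, "\r\n", 2);
    return header + (int)len + 2;
}

/* A heartbeat is a one-byte chunk that holds a single newline. */
int format_heartbeat(char *out, size_t out_size)
{
    return format_chunk(out, out_size, "\n", 1);
}

/* The closing chunk has length zero (no trailer fields in this sketch). */
int format_last_chunk(char *out, size_t out_size)
{
    return format_chunk(out, out_size, "", 0);
}
```

With this framing, a heartbeat occupies six bytes on the wire and the closing chunk five.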

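There is no problem when the client shuts down cleanly. But when it is shut down forcibly (for example, the client machine loses power), no TCP reset is sent. In that case the server sends a heartbeat, which the client never ACKs. The server then keeps retransmitting the packet for roughly 15 minutes before giving up and reporting the failure to the application layer (our HTTP server). In my case, 15 minutes is far too long to wait.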
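To make the failure path concrete, here is a rough sketch of how a server might send a heartbeat and classify the result. The function name send_heartbeat and its return convention are assumptions made for illustration. A successful send() only means the data was queued in the socket's send buffer, so after a silent client death the error shows up on a later call, once the kernel has given up retransmitting.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send one heartbeat chunk on a connected socket.
 * Returns  1 if the heartbeat was handed to the kernel,
 *          0 if the connection is considered dead,
 *         -1 on any other error (errno is left set).
 * A partial send is treated as success in this sketch. */
int send_heartbeat(int fd)
{
    static const char chunk[] = "1\r\n\n\r\n";  /* newline as one chunk */

    /* MSG_NOSIGNAL turns SIGPIPE into an EPIPE error return. */
    ssize_t n = send(fd, chunk, sizeof chunk - 1, MSG_NOSIGNAL);
    if (n >= 0)
        return 1;

    /* ETIMEDOUT: retransmissions were exhausted.
     * EPIPE / ECONNRESET: the peer closed or reset the connection. */
    if (errno == ETIMEDOUT || errno == EPIPE || errno == ECONNRESET)
        return 0;
    return -1;
}
```

How quickly the 0 result appears after a power loss is governed by the retransmission timeout discussed in this question, not by the heartbeat interval itself.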

I can control the retransmission time by writing to the following files in /proc/sys/net/ipv4/:

tcp_retries1 - INTEGER
    This value influences the time, after which TCP decides, that
    something is wrong due to unacknowledged RTO retransmissions,
    and reports this suspicion to the network layer.
    See tcp_retries2 for more details.

    RFC 1122 recommends at least 3 retransmissions, which is the
    default.

tcp_retries2 - INTEGER
    This value influences the timeout of an alive TCP connection,
    when RTO retransmissions remain unacknowledged.
    Given a value of N, a hypothetical TCP connection following
    exponential backoff with an initial RTO of TCP_RTO_MIN would
    retransmit N times before killing the connection at the (N+1)th RTO.

    The default value of 15 yields a hypothetical timeout of 924.6
    seconds and is a lower bound for the effective timeout.
    TCP will effectively time out at the first RTO which exceeds the
    hypothetical timeout.

    RFC 1122 recommends at least 100 seconds for the timeout,
    which corresponds to a value of at least 8.

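The default value of tcp_retries2 is indeed 15, and the roughly 15 minutes (900 seconds) of retransmission I observe is consistent with the kernel documentation quoted above.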
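The 924.6-second figure in the kernel documentation can be reproduced with a little arithmetic. The sketch below assumes the kernel's usual constants, which are not stated in the text quoted here: TCP_RTO_MIN of 200 ms and TCP_RTO_MAX of 120 s. The function name hypothetical_timeout_ms is made up for this example. It sums the first N+1 RTOs, doubling each time and capping at the maximum.

```c
/* Assumed kernel constants, expressed in milliseconds. */
#define RTO_MIN_MS 200L      /* TCP_RTO_MIN */
#define RTO_MAX_MS 120000L   /* TCP_RTO_MAX */

/* Hypothetical timeout for tcp_retries2 = n: the connection is killed at
 * the (n+1)th RTO, so add up n+1 backoff intervals, each double the
 * previous one and capped at RTO_MAX_MS. */
long hypothetical_timeout_ms(int n)
{
    long total = 0;
    long rto = RTO_MIN_MS;

    for (int i = 0; i <= n; i++) {
        total += rto;
        rto *= 2;
        if (rto > RTO_MAX_MS)
            rto = RTO_MAX_MS;
    }
    return total;
}
```

Under these assumptions, n = 15 gives 924600 ms (924.6 s), n = 8 gives 102200 ms (just over the 100 seconds RFC 1122 recommends), and n = 5 gives 12600 ms. As the documentation notes, these are lower bounds; the effective timeout depends on the measured RTO of the connection.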

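If I change the value of tcp_retries2 to 5, for example, the connection dies much more quickly. But setting it this way affects every connection on the system, and I would really like to set it only for this one long polling connection.

A quote from RFC 1122: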

4.2.3.5  TCP Connection Failures

   Excessive retransmission of the same segment by TCP
   indicates some failure of the remote host or the Internet
   path.  This failure may be of short or long duration.  The
   following procedure MUST be used to handle excessive
   retransmissions of data segments [IP:11]:

   (a)  There are two thresholds R1 and R2 measuring the amount
        of retransmission that has occurred for the same
        segment.  R1 and R2 might be measured in time units or
        as a count of retransmissions.

   (b)  When the number of transmissions of the same segment
        reaches or exceeds threshold R1, pass negative advice
        (see Section 3.3.1.4) to the IP layer, to trigger
        dead-gateway diagnosis.

   (c)  When the number of transmissions of the same segment
        reaches a threshold R2 greater than R1, close the
        connection.

   (d)  An application MUST be able to set the value for R2 for
        a particular connection.  For example, an interactive
        application might set R2 to "infinity," giving the user
        control over when to disconnect.

   (e)  TCP SHOULD inform the application of the delivery
        problem (unless such information has been disabled by
        the application; see Section 4.2.4.1), when R1 is
        reached and before R2.  This will allow a remote login
        (User Telnet) application program to inform the user,
        for example.

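It seems to me that tcp_retries1 and tcp_retries2 in Linux correspond to R1 and R2 in the RFC. The RFC clearly states (in item d) that a conforming implementation MUST allow the application to set the value of R2, but I have found no way to do it using setsockopt(), ioctl() or the like.

Another option would be to get a notification when R1 is exceeded (item e). This is not as good as setting R2, though, because I think R1 is hit quite soon (within a few seconds), and the value of R1 cannot be set per connection, or at least the RFC does not require that.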


Answer:

Looks like this was added in kernel 2.6.37.
The commit from the kernel Git tree and an excerpt from its change log follow:

commit dca43c75e7e545694a9dd6288553f55c53e2a3a3
Author: Jerry Chu
Date:   Fri Aug 27 19:13:28 2010 +0000

tcp: Add TCP_USER_TIMEOUT socket option.

This patch provides a "user timeout" support as described in RFC793. The
socket option is also needed for the the local half of RFC5482 "TCP User
Timeout Option".

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int,
when > 0, to specify the maximum amount of time in ms that transmitted
data may remain unacknowledged before TCP will forcefully close the
corresponding connection and return ETIMEDOUT to the application. If
0 is given, TCP will continue to use the system default.

Increasing the user timeouts allows a TCP connection to survive extended
periods without end-to-end connectivity. Decreasing the user timeouts
allows applications to "fail fast" if so desired. Otherwise it may take
upto 20 minutes with the current system defaults in a normal WAN
environment.

The socket option can be made during any state of a TCP connection, but
is only effective during the synchronized states of a connection
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK).
Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option,
TCP_USER_TIMEOUT will overtake keepalive to determine when to close a
connection due to keepalive failure.

The option does not change in anyway when TCP retransmits a packet, nor
when a keepalive probe will be sent.

This option, like many others, will be inherited by an acceptor from its
listener.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
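
Based on the commit message above, setting the option for a single connection could look roughly like the following sketch. The helper names set_user_timeout and get_user_timeout are hypothetical; the fallback definition of TCP_USER_TIMEOUT (18, the value in the kernel's linux/tcp.h) is only there for older libc headers that lack the constant.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#ifndef TCP_USER_TIMEOUT
#define TCP_USER_TIMEOUT 18  /* value from linux/tcp.h, for older headers */
#endif

/* Ask TCP to close the connection once transmitted data has remained
 * unacknowledged for timeout_ms milliseconds. Passing 0 goes back to the
 * system default. Returns 0 on success, -1 with errno set on failure. */
int set_user_timeout(int fd, unsigned int timeout_ms)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof timeout_ms);
}

/* Read the current value back into *timeout_ms.
 * Returns 0 on success, -1 with errno set on failure. */
int get_user_timeout(int fd, unsigned int *timeout_ms)
{
    socklen_t len = sizeof *timeout_ms;
    return getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms, &len);
}
```

For the long polling server in the question, calling set_user_timeout(fd, 60000) on the accepted socket (or on the listening socket, since the option is inherited) should make an unacknowledged heartbeat fail with ETIMEDOUT after about a minute instead of about 15. This requires kernel 2.6.37 or later.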

