Tornado async HTTP pitfalls

1. Tornado's async HTTP client returns 599 errors under high QPS

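An async HTTP client built on Tornado coroutines can return the 599 error code under high QPS. The cause is that the IOLoop starts the timeout clock as soon as a request is put on the queue, not when the connection is opened. A request that is still queued after min(connect_timeout, request_timeout) seconds fails with 599 before it is ever sent. If the queue is long, requests spend most of their time stuck in it.
The fetch_impl source in SimpleAsyncHTTPClient is shown below.
The queue deadline is: self.io_loop.time() + min(request.connect_timeout, request.request_timeout)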

def fetch_impl(self, request, callback):
    key = object()
    self.queue.append((key, request, callback))
    if not len(self.active) < self.max_clients:
        timeout_handle = self.io_loop.add_timeout(
            self.io_loop.time() + min(request.connect_timeout,
                                      request.request_timeout),
            functools.partial(self._on_timeout, key, "in request queue"))
    else:
        timeout_handle = None
    self.waiting[key] = (request, callback, timeout_handle)
    self._process_queue()
    if self.queue:
        gen_log.debug("max_clients limit reached, request queued. "
                      "%d active, %d queued requests." % (
                          len(self.active), len(self.queue)))
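
To get a feel for how quickly the queue deadline bites, here is a rough back-of-the-envelope model in plain Python. It is only a sketch under simplifying assumptions that are not in the Tornado source: all requests arrive at the same instant, every request takes exactly service_time seconds, and slots free up in whole batches of max_clients. The function name queued_timeouts is made up for illustration.

```python
def queued_timeouts(n_requests, max_clients, service_time,
                    connect_timeout, request_timeout):
    """Estimate how many requests expire while still waiting in the queue.

    Simplified model (an assumption, not Tornado's scheduler): every request
    arrives at t=0 and each one occupies a client slot for service_time seconds.
    """
    # Same budget that fetch_impl arms for a queued request.
    queue_budget = min(connect_timeout, request_timeout)
    expired = 0
    for i in range(n_requests):
        # Request i can only start after i // max_clients full batches finish.
        wait = (i // max_clients) * service_time
        if wait > queue_budget:
            expired += 1
    return expired


# 100 simultaneous requests, 10 slots, 3 s per request, 20 s timeouts.
print(queued_timeouts(100, 10, 3.0, 20.0, 20.0))
```

In this model the last three batches wait 21, 24 and 27 seconds, so they all exceed the 20 second budget and would come back as 599 even though the server itself is healthy.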

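The fix is simple: define your own HTTP client that subclasses SimpleAsyncHTTPClient and overrides the fetch_impl and _connection_class methods.
How should they be overridden?
With async HTTP, the caller can detect a timeout at the await or yield that waits for the response, so the add_timeout call can simply be dropped.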

response = await http_client.fetch("http://www.google.com")
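
A caller-side deadline might look like the sketch below. It uses only the standard library: fake_fetch is a hypothetical stand-in for http_client.fetch(url), and fetch_with_deadline is an illustrative helper, not part of Tornado. The assumption is that the awaited fetch can be wrapped the same way as any other awaitable.

```python
import asyncio


async def fake_fetch(delay):
    # Hypothetical stand-in for http_client.fetch(url): just sleeps, then answers.
    await asyncio.sleep(delay)
    return "response"


async def fetch_with_deadline(delay, deadline):
    """Await the fetch, but give up once `deadline` seconds have passed."""
    try:
        return await asyncio.wait_for(fake_fetch(delay), timeout=deadline)
    except asyncio.TimeoutError:
        # The caller decides what a timeout means; here we simply return None.
        return None


def demo():
    fast = asyncio.run(fetch_with_deadline(0.01, 0.5))   # finishes in time
    slow = asyncio.run(fetch_with_deadline(0.5, 0.01))   # exceeds the deadline
    return fast, slow
```

The point of the design is that the deadline is owned by the code that awaits the response, so time spent in the client's internal queue is counted only once, by the caller.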

The full implementation:

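from tornado.log import gen_log
from tornado.simple_httpclient import SimpleAsyncHTTPClient


class NoQueueTimeoutHTTPClient(SimpleAsyncHTTPClient):
    # Enqueue the request without arming a queue timeout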
    def fetch_impl(self, request, callback):
        key = object()

        self.queue.append((key, request, callback))
        self.waiting[key] = (request, callback, None)

        self._process_queue()

        if self.queue:
            gen_log.debug("max_clients limit reached, request queued. %d active, %d queued requests." % (
                len(self.active), len(self.queue)))

    # Use the custom connection class (_HTTPConnection1, defined in section 2)
    def _connection_class(self):
        return _HTTPConnection1

2. Tornado strictly validates the 204 status code

Background

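While developing an HTTP feature, the project did not follow the HTTP specification: it returned a 204 status code but still sent body content.
Requests sent with the requests library received the response normally, but Tornado's async HTTP client only got a 599 error code.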

Foreword

When designing HTTP features, always follow the HTTP specification! Otherwise you will run into a lot of pitfalls.

Tornado's strict check on 204 responses

The _read_body method of HTTP1Connection contains this code:

if code == 204:
    # This response code is not allowed to have a non-empty body,
    # and has an implicit length of zero instead of read-until-close.
    # http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.3
    if ("Transfer-Encoding" in headers or
            content_length not in (None, 0)):
        raise httputil.HTTPInputError(
            "Response with code %d should not have body" % code)
    content_length = 0

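When the status code is 204, Tornado requires content_length to be None or 0 and no Transfer-Encoding header to be present; otherwise it raises an exception.
After the exception is caught, _handle_exception in _HTTPConnection handles it by returning the 599 error code.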

def _handle_exception(self, typ, value, tb):
    if self.final_callback:
        self._remove_timeout()
        if isinstance(value, StreamClosedError):
            if value.real_error is None:
                value = HTTPStreamClosedError("Stream closed")
            else:
                value = value.real_error
        self._run_callback(HTTPResponse(self.request, 599, error=value,
                                        request_time=self.io_loop.time() - self.start_time,
                                        start_time=self.start_wall_time,
                                        ))

        if hasattr(self, "stream"):
            # TODO: this may cause a StreamClosedError to be raised
            # by the connection's Future.  Should we cancel the
            # connection more gracefully?
            self.stream.close()
        return True
    else:
        # If our callback has already been called, we are probably
        # catching an exception that is not caused by us but rather
        # some child of our callback. Rather than drop it on the floor,
        # pass it along, unless it's just the stream being closed.
        return isinstance(value, StreamClosedError)
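
Putting the two snippets above together, the status code the caller ends up seeing can be sketched in plain Python. This is a simplified restatement for illustration, not Tornado's code: headers are a plain dict here, and the names violates_204_rule and status_seen_by_caller are made up.

```python
def violates_204_rule(code, headers):
    """Return True if the response would trip the 204 body check quoted above."""
    if code != 204:
        return False
    # HTTP header names are case-insensitive; normalise the plain dict.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_length = lowered.get("content-length")
    content_length = int(raw_length) if raw_length is not None else None
    return "transfer-encoding" in lowered or content_length not in (None, 0)


def status_seen_by_caller(code, headers):
    # A rejected response surfaces as 599, as in _handle_exception.
    return 599 if violates_204_rule(code, headers) else code


print(status_seen_by_caller(204, {"Content-Length": "12"}))
print(status_seen_by_caller(204, {"Content-Length": "0"}))
```

A 204 that declares a non-zero Content-Length, or any Transfer-Encoding, is reported as 599; a 204 with no length or a length of 0 passes through unchanged.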

Workaround:
Define _HTTPConnection1 as a subclass of _HTTPConnection and override headers_received: when a 204 status code arrives, force the Content-Length header to 0.

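from tornado.log import gen_log
from tornado.simple_httpclient import _HTTPConnection


class _HTTPConnection1(_HTTPConnection):
    # Handle Content-Length on 204 responses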
    def headers_received(self, first_line, headers):
        if self.request.expect_100_continue and first_line.code == 100:
            self._write_body(False)
            return
        self.code = first_line.code
        self.reason = first_line.reason
        self.headers = headers

        # Force Content-Length to 0 on a 204
        if self.code == 204:
            length = self.headers.get('content-length', 0)
            self.headers["Content-Length"] = "0"
            gen_log.info("turn headers content-length from {} to 0".format(length))

        if self._should_follow_redirect():
            return

        if self.request.header_callback is not None:
            # Reassemble the start line.
            self.request.header_callback('%s %s %s\r\n' % first_line)
            for k, v in self.headers.get_all():
                self.request.header_callback("%s: %s\r\n" % (k, v))
            self.request.header_callback('\r\n')
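
The header rewrite itself can be tried out on a plain dict, without a running Tornado client. The sketch below is an illustration only: force_empty_204 mirrors the rewrite in headers_received, and still_rejected restates the 204 check from _read_body in simplified form. Both names are hypothetical.

```python
def force_empty_204(code, headers):
    """Return a copy of headers with Content-Length forced to "0" for a 204."""
    fixed = dict(headers)
    if code == 204:
        # Drop any case variant of the header before setting the canonical one.
        for name in [k for k in fixed if k.lower() == "content-length"]:
            del fixed[name]
        fixed["Content-Length"] = "0"
    return fixed


def still_rejected(code, headers):
    """Simplified restatement of the 204 check applied after the rewrite."""
    lowered = {name.lower(): value for name, value in headers.items()}
    length = lowered.get("content-length")
    has_body_length = length is not None and int(length) != 0
    return code == 204 and ("transfer-encoding" in lowered or has_body_length)


print(still_rejected(204, force_empty_204(204, {"content-length": "12"})))
print(still_rejected(204, force_empty_204(204, {"Transfer-Encoding": "chunked"})))
```

Note that, going by the check quoted earlier, rewriting Content-Length alone does not help when the server also sends Transfer-Encoding on a 204; that case would still be rejected.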