Scraping pages served over HTTP/2 in Python with hyper

晋言

2023-12-01

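I recently wanted to study Hong Kong Exchanges and Clearing (HKEX) data on my own and ran into a problem while scraping. The HKEX site is served over HTTP/2, whereas the vast majority of other sites use HTTP/1.1, so the pages could not be scraped. In the end I found that hyper does the job.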

Install it first: pip install hyper

Then import hyper:

from hyper import HTTPConnection

API documentation: https://hyper.readthedocs.io/en/latest/index.html

To scrape with hyper, first append the port, :443, to the host name. The code:

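# Include port 443 in the host string
conn = HTTPConnection('www.hkex.com.hk:443')
conn.request('GET', '/chi/stat/smstat/dayquot/d210219c.htm', None, None)
resp = conn.get_response()
# decode_content=False returns the body bytes as received, without undoing any Content-Encoding; the hyper source has examples, and the default is True
s = resp.read(decode_content=False)
print(s)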
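Because read() returns bytes, the body still has to be decoded into text before it can be parsed. The sketch below shows one way to do that. The helper name decode_body and the candidate encodings (UTF-8 first, then Big5-HKSCS) are assumptions for illustration and are not prescribed by hyper or by the site. It also assumes the bytes are the uncompressed page body.

```python
def decode_body(raw, encodings=("utf-8", "big5hkscs")):
    """Decode response bytes, trying each candidate encoding in order."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Nothing matched: fall back to the first encoding and replace bad bytes.
    return raw.decode(encodings[0], errors="replace")


# Hypothetical sample bytes standing in for the `s` returned by resp.read().
sample = "港交所".encode("big5hkscs")
text = decode_body(sample)
```

With a real response, decode_body(s) would be called in place of printing the raw bytes.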

This is enough for basic use. For more advanced usage, see the API documentation.
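For fetching more than one page, the same calls can be wrapped in a small function. This is a minimal sketch, not a definitive implementation: split_target and fetch are hypothetical helper names, and the code assumes every URL is https, so it falls back to port 443 when none is given. hyper is imported inside fetch so that the URL helper works even without it installed.

```python
from urllib.parse import urlsplit


def split_target(url):
    """Split an https URL into the 'host:port' string and the request path."""
    parts = urlsplit(url)
    port = parts.port or 443  # assumed default: the HTTPS port
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return "{}:{}".format(parts.hostname, port), path


def fetch(url):
    """Fetch url with hyper and return (status, raw body bytes)."""
    # Imported here so split_target stays usable without hyper installed.
    from hyper import HTTPConnection

    host, path = split_target(url)
    conn = HTTPConnection(host)
    try:
        conn.request("GET", path)
        resp = conn.get_response()
        return resp.status, resp.read()
    finally:
        conn.close()


host, path = split_target(
    "https://www.hkex.com.hk/chi/stat/smstat/dayquot/d210219c.htm"
)
```

Calling fetch with the same URL would then perform the request from the earlier snippet and return the status code together with the body bytes.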
