Web Scraping, Tools - Splash

拓拔高畅
2023-12-01

What is it?

Splash is a JavaScript rendering service: a lightweight web browser with an HTTP API.
http://splash.readthedocs.io/en/stable/

Use cases

For web scraping, Splash can fetch pages whose content is rendered by JavaScript (Selenium can also solve this problem).

Usage

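  1. Start the Splash service with Docker (it can be distributed by starting Splash in Docker on several machines).
  2. In Python, build a Lua script as a string and send it to the Splash HTTP API: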
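import requests
from urllib.parse import quote

# Lua script run by Splash; main() is the entry point and its
# return value becomes the HTTP response body.
lua = '''
function main(splash)
    return 'hello'
end
'''

# The script is passed URL-encoded in the lua_source query parameter
# of the /execute endpoint (Splash listens on port 8050 here).
url = 'http://localhost:8050/execute?lua_source=' + quote(lua)
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail loudly on a non-2xx reply
print(response.text)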
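
The script above only returns a constant string. As a rough sketch of the actual use case (fetching a JS-rendered page), the Lua script below loads a URL, waits, and returns the rendered HTML. It assumes a Splash instance on localhost:8050; the names `SPLASH_HOSTS`, `build_request` and `render` are hypothetical helpers invented for this illustration, and the round-robin over hosts is just one simple way to use several Splash instances. The standard library's `urllib` is used here instead of `requests` only to keep the sketch dependency-free.

```python
import itertools
import json
import urllib.request

# Assumption: base URLs of running Splash instances. Add more entries
# if Splash has been started in Docker on several machines.
SPLASH_HOSTS = ["http://localhost:8050"]
_hosts = itertools.cycle(SPLASH_HOSTS)  # naive round-robin over instances

# Lua script: open the page, wait for its JavaScript to run,
# then return the rendered HTML. Extra request fields are
# available inside the script as splash.args.
RENDER_LUA = """
function main(splash)
    assert(splash:go(splash.args.url))
    assert(splash:wait(splash.args.wait))
    return splash:html()
end
"""


def build_request(page_url, wait=1.0, base=None):
    """Build a POST request for Splash's /execute endpoint (hypothetical helper)."""
    base = base or next(_hosts)
    body = json.dumps(
        {"lua_source": RENDER_LUA, "url": page_url, "wait": wait}
    ).encode("utf-8")
    return urllib.request.Request(
        base + "/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def render(page_url, wait=1.0, timeout=30):
    """Send the request to Splash and return the rendered HTML as text."""
    with urllib.request.urlopen(build_request(page_url, wait), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With a Splash instance running, `html = render("http://example.com")` should return the page source after JavaScript has executed. Sending the script in a JSON POST body avoids the manual URL quoting needed for the GET form.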

Reposted from: https://www.cnblogs.com/allen2333/p/9477406.html
