Question:

python - Why can Weibo images be downloaded with wget's UA but not with a browser's UA?

艾翼

2024-06-14

The test image URL comes from: https://weibo.com/3209519182/OixHyvyva

The following code fails to download the image:

import requests
from urllib.parse import urlparse


def extract_domain(url: str) -> str:
    parsed_url = urlparse(url)
    return parsed_url.netloc


# URL of the image
url = "https://wx1.sinaimg.cn/wap360/bf4d604egy1hqlqupxg7jj20yw1db15q.jpg"

# Headers to be included in the request
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:126.0) Gecko/20100101 Firefox/126.0",
    "Accept": "*/*",
    "Accept-Encoding": "identity",
    "Host": extract_domain(url),
    "Connection": "Keep-Alive",
}

# Send a GET request to the URL with the headers
print(headers)
response = requests.get(url, headers=headers, timeout=10)

# Save the image to a file
if response.status_code == 200:
    with open("G18LCI_2023.png", "wb") as file:
        file.write(response.content)
    print("Image downloaded successfully.")
else:
    print(f"Failed to download image. Status code: {response.status_code}")

It fails with:

╰─➤  python -u "/home/pon/code/work/vobile/vobile-it/crawler_console/dev/download_image.py"
{'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:126.0) Gecko/20100101 Firefox/126.0', 'Accept': '*/*', 'Accept-Encoding': 'identity', 'Host': 'wx1.sinaimg.cn', 'Connection': 'Keep-Alive'}
Failed to download image. Status code: 40

However, the following code downloads the image successfully (the only change is that the User-Agent is set to wget's):

import requests
from urllib.parse import urlparse


def extract_domain(url: str) -> str:
    parsed_url = urlparse(url)
    return parsed_url.netloc


# URL of the image
url = "https://wx1.sinaimg.cn/wap360/bf4d604egy1hqlqupxg7jj20yw1db15q.jpg"

# Headers to be included in the request
headers = {
    "User-Agent": "Wget/1.21.2",
    "Accept": "*/*",
    "Accept-Encoding": "identity",
    "Host": extract_domain(url),
    "Connection": "Keep-Alive",
}

# Send a GET request to the URL with the headers
print(headers)
response = requests.get(url, headers=headers, timeout=10)

# Save the image to a file
if response.status_code == 200:
    with open("G18LCI_2023.png", "wb") as file:
        file.write(response.content)
    print("Image downloaded successfully.")
else:
    print(f"Failed to download image. Status code: {response.status_code}")

The output is:

╰─➤  python -u "/home/pon/code/work/vobile/vobile-it/crawler_console/dev/download_image.py"
{'User-Agent': 'Wget/1.21.2', 'Accept': '*/*', 'Accept-Encoding': 'identity', 'Host': 'wx1.sinaimg.cn', 'Connection': 'Keep-Alive'}
Image downloaded successfully.


1 answer

濮冠宇

2024-06-14

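Add "Referer": "https://weibo.com/" to the headers and both User-Agents work.

So my guess is that this is hotlink protection: the server stops other websites from embedding the image when it is loaded in a browser, while wget is not a browser and so is not subject to the check.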
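The suggestion above can be sketched as follows. This is a minimal illustration, not a confirmed description of how the server behaves: the helper names `build_headers` and `download_image` and the output file name are hypothetical, and it uses the standard-library `urllib.request` rather than `requests` only to avoid a dependency. The same headers dictionary can be passed to `requests.get(url, headers=...)`.

```python
import urllib.request

# Referer value suggested in the answer; whether the server keeps accepting it
# is an assumption, since its hotlink rules are not documented here.
WEIBO_REFERER = "https://weibo.com/"


def build_headers(user_agent: str, referer: str = WEIBO_REFERER) -> dict[str, str]:
    """Return request headers that carry a Referer alongside the User-Agent."""
    return {
        "User-Agent": user_agent,
        "Accept": "*/*",
        "Referer": referer,
    }


def download_image(url: str, path: str, user_agent: str, timeout: float = 10.0) -> int:
    """Fetch url with the headers above, write the body to path, return the status.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a rejected
    request surfaces as an exception rather than a returned status code.
    """
    request = urllib.request.Request(url, headers=build_headers(user_agent))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()
        status = response.status
    with open(path, "wb") as file:
        file.write(data)
    return status
```

Calling `download_image(url, "weibo_image.jpg", user_agent)` with the image URL and the Firefox User-Agent from the question would exercise the fix. The Host and Connection headers from the original script are left out here because the HTTP client sets them itself.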
