Question:

I'm trying to get BeautifulSoup to open Wikipedia, but I get a lot of errors

杜凯
2023-03-14

I'm running bs4 in PyCharm, and as soon as I run the code below it just throws errors.

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.newegg.com/'

uClient = uReq(my_url)

page_html = uClient.read()

uClient.close()

page_soup = soup(page_html, 'html.parser')

/users/alirahman/pycharmprojects/scraper/venv/bin/python /users/alirahman/pycharmprojects/scraper/app.py
Traceback (most recent call last):
  File "/library/framework/python.framework/versions/3.7/lib/python3.7/urllib/request.py", line 1317, in do_open
    encode_chunked=req.has_header('transfer-encoding'))
  File "3.7/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/library/frameworks/python.framework/versions/3.7/lib/python3.7/http/client.py", line 1414, in connect
    server_hostname=server_hostname)
  File "/library/frameworks/python.framework/versions/3.7/lib/python3.7/ssl.py", line 423, in wrap_socket
    session=session

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/users/alirahman/pycharmprojects/scraper/app.py", line 7, in <module>
    uClient = uReq(my_url)
  File "/library/frameworks/python.framework/versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "...
Error:

Process finished with exit code 1

1 answer

祝英博
2023-03-14

Use the requests library instead of urllib.request. The following should work.

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.newegg.com/")
soup = BeautifulSoup(response.content, "html.parser")
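
The traceback in the question passes through ssl.py's wrap_socket, which suggests the failure happens during the TLS handshake. The final error message is cut off, so this is only a guess. requests verifies certificates against the CA bundle from the certifi package rather than the interpreter's own store, which may be why switching libraries helps. Below is a slightly more defensive sketch of the same approach. The HEADERS and TIMEOUT_SECONDS values and the helper names fetch_html and page_title are assumptions made for illustration, not part of the original answer.

```python
import requests
from bs4 import BeautifulSoup

# Assumed values for illustration; neither appears in the original answer.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}
TIMEOUT_SECONDS = 10


def fetch_html(url: str) -> bytes:
    """Download a page, raising requests.HTTPError on a 4xx/5xx response."""
    response = requests.get(url, headers=HEADERS, timeout=TIMEOUT_SECONDS)
    response.raise_for_status()
    return response.content


def page_title(html: bytes) -> str:
    """Return the text of the <title> tag, or an empty string if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.get_text(strip=True) if soup.title else ""


if __name__ == "__main__":
    try:
        print(page_title(fetch_html("https://www.newegg.com/")))
    except requests.exceptions.SSLError as exc:
        # Certificate problems are reported here as a single exception.
        print(f"TLS/certificate error: {exc}")
    except requests.exceptions.RequestException as exc:
        # Any other network or HTTP failure (timeout, DNS, 4xx/5xx, ...).
        print(f"Request failed: {exc}")
```

Keeping the download and the parsing in separate functions makes it possible to try the parsing step on a saved HTML string without touching the network, which helps narrow down whether a failure comes from the connection or from the page content.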