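Hi, I am trying to scrape the table from this website: https://vcx-forum.org/score. When I try to scrape it with Beautiful Soup, I get the error 'NoneType' object has no attribute 'find'.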
from bs4 import BeautifulSoup
import requests
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)
driver.get("https://vcx-forum.org/score")
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
key = {}
data = []
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
for tag in soup.find_all('div', class_="vcx-ranking__body js-vcx-ranking-body"):
    for span in tag.find_all('div', class_="t-row"):
        for row in span:
            model = row.find("div", class_="t_cell colCamera").find("a").text
            rating = row.find("div", class_="t_cell colScore colVCX active").find("span",
                              class_="score_numeric").text
            image_quality = row.find("div", class_="t_cell colScore colImageQuality").text
            sunny = row.find("div", class_="t_cell colScore colBright").text
            indoor = row.find("div", class_="t_cell colScore colMid").text
            night = row.find("div", class_="t_cell colScore colImageLow").text
            flash = row.find("div", class_="t_cell colScore colFlash").text
            zoom = row.find("div", class_="t_cell colScore colZoom").text
            perform = row.find("div", class_="t_cell colScore colHandling").text
            key = {'model':[model],
                   'image_quality':[image_quality],
                   'sunny':[sunny],
                   'indoor':[indoor],
                   'night':[night],
                   'flash':[flash],
                   'zoom':[zoom],
                   'perform':[perform]
                   }
df = pd.DataFrame(key, columns = ['model', 'rating','image_quality', 'sunny',
                                  'indoor', 'night', 'flash', 'zoom', 'perform'])
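I tried adding print(span.text) after the `for span` line, but it only prints everything inside the t-row div as one block. I want all of the content neatly separated into named columns.
Edit: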
AttributeError Traceback (most recent call last)
<ipython-input-63-f1da6a7e61dd> in <module>
16 for span in tag.find_all('div', class_="t-row"):
17 for row in span:
---> 18 model = row.find("div", class_="t_cell colCamera").find("a").text
19 rating = row.find("div", class_="t_cell colScore colVCX active").find("span",
20 class_="score_numeric").text
AttributeError: 'NoneType' object has no attribute 'find'
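For context, here is a minimal sketch of how a traceback like the one above can arise. `find()` returns `None` when nothing matches, so chaining `.find("a")` onto it fails. The HTML below is made up for illustration (it only borrows the `t-row`, `t-cell` and `colCamera` class names that appear in this thread), and `text_or_none` is a hypothetical helper, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page may differ.
SAMPLE_HTML = """
<div class="t-row">
  <div class="t-cell colCamera"><a href="#">Example Phone</a></div>
  <div class="t-cell colScore colZoom">42</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
row = soup.find("div", class_="t-row")

# "t_cell" (underscore) does not match "t-cell" (hyphen), so find() returns None.
missing = row.find("div", class_="t_cell colCamera")
print(missing)  # None
# missing.find("a") would raise:
# AttributeError: 'NoneType' object has no attribute 'find'


def text_or_none(node, selector):
    """Hypothetical helper: text of the first CSS match, or None if absent."""
    match = node.select_one(selector)
    return match.get_text(strip=True) if match is not None else None


model = text_or_none(row, ".colCamera > a")  # matches on a single class
flash = text_or_none(row, ".colFlash")       # not in this sample row
print(model, flash)
```

Guarding against `None` like this is only one option; it keeps a single missing cell from aborting the whole loop.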
I made a few changes to your code. It works fine now.
from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver
import time
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)
driver.get("https://vcx-forum.org/score")
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(5)
key = {}
data = []
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
for row in soup.find_all('div', class_="t-row")[1:]:
    model = row.select_one('.colCamera>a').text
    rating = row.select_one(".t-cell.colScore.colVCX.active>.score-numeric").text
    image_quality = row.select_one(".colImageQuality").text
    sunny = row.select_one(".colBright").text
    indoor = row.select_one(".colMid").text
    night = row.select_one(".colLow").text
    flash = row.select_one(".colFlash").text
    zoom = row.select_one(".colZoom").text
    perform = row.select_one(".colHandling").text
    key = {'model':[model],
           'rating':[rating],
           'image_quality':[image_quality],
           'sunny':[sunny],
           'indoor':[indoor],
           'night':[night],
           'flash':[flash],
           'zoom':[zoom],
           'perform':[perform]
           }
    data.append(key)
df = pd.DataFrame(data, columns = ['model', 'rating','image_quality', 'sunny',
                                   'indoor', 'night', 'flash', 'zoom', 'perform'])
print(df)
Output:
model rating image_quality ... flash zoom perform
0 [Xiaomi Mi 10 Pro] [73] [69] ... [68] [71] [80]
1 [Samsung Galaxy S20 Ultra] [77] [76] ... [74] [74] [78]
2 [Samsung Galaxy S20] [75] [74] ... [74] [51] [78]
3 [Huawei Mate 30 Pro] [77] [73] ... [76] [63] [87]
4 [Xiaomi MI note 10 pro] [75] [72] ... [71] [78] [82]
5 [LG G8S ThinQ] [77] [74] ... [71] [42] [82]
6 [LG V50 ThinQ] [76] [75] ... [74] [42] [79]
7 [LG G8 ThinQ] [77] [75] ... [72] [43] [81]
8 [Huawei Mate 20] [73] [71] ... [68] [36] [76]
9 [Huawei Mate 20 Pro] [75] [72] ... [62] [45] [81]
10 [Huawei P20 Pro] [74] [70] ... [67] [52] [83]
11 [Oppo Find X2 Pro] [71] [69] ... [63] [61] [73]
12 [Apple iPhone 11 Pro] [72] [71] ... [73] [41] [74]
13 [Oppo Reno2] [69] [67] ... [65] [42] [75]
14 [Samsung Galaxy Note10] [71] [68] ... [61] [44] [77]
15 [Xiaomi MI 9] [70] [70] ... [70] [48] [71]
16 [Huawei P30 Pro] [72] [68] ... [71] [51] [79]
17 [Huawei P30] [69] [68] ... [70] [50] [71]
18 [LG V40] [72] [71] ... [72] [42] [74]
19 [Huawei P20] [71] [66] ... [65] [34] [83]
20 [HTC U11] [70] [65] ... [69] [15] [82]
21 [Realme 5 Pro] [66] [64] ... [65] [10] [72]
22 [Fairphone 3] [64] [63] ... [72] [25] [65]
23 [Google Pixel 4] [66] [68] ... [65] [43] [63]
24 [Apple iPhone 11 Pro Max] [68] [70] ... [72] [31] [64]
25 [Oneplus 7 Pro] [67] [66] ... [62] [55] [68]
26 [Samsung S10] [68] [66] ... [62] [41] [73]
27 [Samsung Galaxy Note 9] [66] [65] ... [64] [42] [68]
28 [Google Pixel 3] [65] [60] ... [63] [13] [75]
29 [Red Hydrogen One] [68] [63] ... [61] [12] [78]
.. ... ... ... ... ... ... ...
85 [Blackberry Priv] [52] [55] ... [61] [10] [45]
86 [Apple iPhone SE] [52] [51] ... [54] [1] [54]
87 [Apple iPhone 7] [52] [49] ... [50] [7] [60]
88 [Vodafone Smart N10] [49] [44] ... [40] [-8] [61]
89 [Vodafone Smart N8] [48] [45] ... [43] [9] [55]
90 [Vodafone Smart N9] [46] [42] ... [37] [0] [56]
91 [Huawei P Smart] [49] [46] ... [43] [15] [57]
92 [Huawei P20 Lite] [50] [56] ... [57] [16] [37]
93 [Sony Xperia Z3] [46] [44] ... [43] [4] [52]
94 [Microsoft Lumia 650] [47] [44] ... [41] [13] [53]
95 [LG G3] [48] [42] ... [42] [0] [62]
96 [Huawei GX8 (G8)] [50] [45] ... [54] [0] [63]
97 [HTC One M8] [45] [43] ... [45] [0] [52]
98 [Apple iPhone 6S] [47] [46] ... [56] [5] [47]
99 [Apple iPhone 6 Plus] [49] [45] ... [52] [0] [58]
100 [Alcatel (TCT) Idol 3] [43] [46] ... [40] [26] [35]
101 [Sony M4 Aqua] [42] [43] ... [45] [6] [38]
102 [Motorola Moto G 3. Generation] [43] [41] ... [36] [1] [49]
103 [Huawei P8] [43] [43] ... [49] [0] [42]
104 [Huawei P8 lite] [42] [42] ... [47] [13] [40]
105 [Vodafone Smart N9 lite] [39] [39] ... [37] [2] [37]
106 [Vodafone Smart Ultra 7] [40] [39] ... [48] [0] [44]
107 [Vodafone Smart Prime 7] [38] [33] ... [30] [0] [50]
108 [Vodafone Smart Mini 7] [37] [20] ... [0] [0] [77]
109 [Samsung Galaxy J5] [39] [40] ... [46] [0] [37]
110 [Samsung Core prime] [36] [34] ... [36] [0] [41]
111 [Microsoft Lumia 640 XL] [40] [39] ... [38] [0] [41]
112 [LG G4c] [40] [37] ... [33] [0] [48]
113 [HTC Desire 626] [38] [39] ... [30] [0] [35]
114 [LG K4] [33] [24] ... [15] [0] [53]
[115 rows x 9 columns]
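The square brackets in the output above appear because each value is stored as a one-element list. A possible variation, sketched below under stated assumptions, stores plain strings and converts the score column to numbers. The HTML is a made-up stand-in for the real page, and only two of the nine columns are shown to keep it short.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical markup: one header row followed by two data rows.
SAMPLE_HTML = """
<div class="t-row"><div class="t-cell colCamera">Camera</div><div class="t-cell colScore colZoom">Zoom</div></div>
<div class="t-row"><div class="t-cell colCamera"><a>Phone A</a></div><div class="t-cell colScore colZoom">71</div></div>
<div class="t-row"><div class="t-cell colCamera"><a>Phone B</a></div><div class="t-cell colScore colZoom">-8</div></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

records = []
for row in soup.find_all("div", class_="t-row")[1:]:  # skip the header row
    records.append({
        # Scalars instead of one-element lists, so no brackets in the frame.
        "model": row.select_one(".colCamera > a").get_text(strip=True),
        "zoom": row.select_one(".colZoom").get_text(strip=True),
    })

df = pd.DataFrame(records, columns=["model", "zoom"])
df["zoom"] = pd.to_numeric(df["zoom"])  # scraped text to numeric dtype
print(df)
```

With numeric columns, sorting and filtering (for example `df.sort_values("zoom")`) work on the values rather than on strings.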