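I want to scrape all the company information under "Symbol", "Company", and "Earnings Call Time" from the following page: https://finance.yahoo.com/calendar/earnings
So far I am only trying to get the company names, but I get an error:
NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id='cal-res-table']/div[1]/table/tbody/tr[1]/td[2]"} (Session info: chrome=86.0.4240.198)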
from selenium import webdriver
import datetime
tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() #get tomorrow in iso format as needed
url = "https://finance.yahoo.com/calendar/earnings?day="+tomorrow
print ("url: " + url)
driver = webdriver.Chrome("C:/Users/jrod94/Downloads/chromedriver_win32/chromedriver.exe")
driver.get(url)
element = driver.find_element_by_xpath("//*[@id='cal-res-table']")
Companies = [a.get_attribute("Company") for a in element]
driver.close()
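Since your question is about Selenium, you should read the Selenium documentation on waits.
Because you need to wait until all the target elements are present in the HTML source, use the following code: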
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
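def main(url):
    driver = webdriver.Firefox()
    driver.get(url)
    try:
        # Wait up to 10 seconds for the "Company" cells to be present in the DOM
        cnames = [x.text for x in WebDriverWait(driver, 10).until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, "td[aria-label='Company']"))
        )]
        # Print inside the try block: if the wait times out, cnames is never assigned
        print(cnames)
    finally:
        # Always close the browser, even when the wait fails
        driver.quit()


main("https://finance.yahoo.com/calendar/earnings")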
Output:
['111 Inc', '360 DigiTech Inc', 'American Software Inc', 'American Software Inc', 'Corporacion America Airports SA', 'Atkore International Group Inc', 'Atkore International Group Inc', 'Helmerich and Payne Inc', 'Amtech Systems Inc', 'Amtech Systems Inc', 'Delta Apparel Inc', 'Delta Apparel Inc', 'Bellring Brands Inc', 'Berry Global Group Inc', 'Beacon Roofing Supply Inc', 'Natural Grocers By Vitamin Cottage Inc', "BJ's Wholesale Club Holdings Inc", 'Entera Bio Ltd', 'SG Blocks Inc', 'SG Blocks Inc', 'BEST Inc', 'Brady Corp', 'BioHiTech Global Inc', 'BioHiTech Global Inc', 'Oaktree Strategic Income Corporation', 'Caleres Inc', 'Pennantpark Investment Corp', 'Geospace Technologies Corp', 'Canadian Solar Inc', 'Oaktree Specialty Lending Corp', 'Matthews International Corp', 'Clearsign Technologies Corp', "Children's Place Inc", 'Elys Game Technology Corp', 'Dada Nexus Ltd', 'ESCO Technologies Inc', 'Euroseas Ltd', 'Fangdd Network Group Ltd', 'Fangdd Network Group Ltd', 'Golden Ocean Group Ltd', 'Hoegh LNG Partners LP', 'Post Holdings Inc', 'Huize Holding Ltd', 'Haynes International Inc', "Macy's Inc", 'OneWater Marine Inc', 'OneWater Marine Inc', 'Woodward Inc', 'StealthGas Inc', 'Maximus Inc', 'Ross Stores Inc', 'Intuit Inc', 'Ooma Inc', 'Williams-Sonoma Inc', 'Precipio Inc', 'NetEase Inc', 'Workday Inc', 'i3 Verticals Inc', 'Knot Offshore Partners LP', 'Maxeon Solar Technologies Ltd', 'Opera Ltd', 'Puxin Ltd', 'Puxin Ltd']
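The list above contains repeated names (for example 'American Software Inc' appears twice). If only unique names are wanted, one option is an order-preserving de-duplication. This is a minimal sketch; `unique_in_order` is a hypothetical helper name, and the sample list is just the first few entries of the output above.

```python
def unique_in_order(names):
    """Drop repeated names while keeping the first occurrence of each."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(names))


# First few entries of the output above, used as sample data
cnames = ['111 Inc', '360 DigiTech Inc',
          'American Software Inc', 'American Software Inc']
print(unique_in_order(cnames))
```

Whether duplicates should be dropped depends on the goal: a repeated name may correspond to a separate row in the table, so keep them if every row matters.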
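Note: you do not need Selenium for this, as it will slow your task down.
Also, I see no reason to import a huge library such as pandas just to read one HTML table.
You can simply pick out the target data with the following code, which also gives you the exact call date: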
import requests
import re
import json
import csv
keys = ['ticker', 'companyshortname', 'startdatetime']
def main(url):
    r = requests.get(url)
    # The page embeds its data as a JSON object assigned to App.main in a <script> tag
    goal = json.loads(re.search(r"App\.main.*?({.+})", r.text).group(1))
    # Keep only the wanted keys from every row of the results store
    target = [[item[k] for k in keys] for item in goal['context']
              ['dispatcher']['stores']['ScreenerResultsStore']['results']['rows']]
    with open("result.csv", 'w', newline="") as f:
        writer = csv.writer(f)
        writer.writerow(keys)  # header row
        writer.writerows(target)


main("https://finance.yahoo.com/calendar/earnings")
Output: view online
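To see what the regular expression in the code above is doing without hitting the network, here is a minimal offline sketch. The HTML string is a made-up stand-in for the real page: it only assumes, as the answer's code does, that the data sits in a one-line `App.main = {...};` assignment under `context > dispatcher > stores > ScreenerResultsStore > results > rows`. The ticker, company, and date values are invented, `extract_rows` is a hypothetical helper name, and the real page layout may have changed since.

```python
import json
import re

KEYS = ['ticker', 'companyshortname', 'startdatetime']

# Invented sample payload, shaped like the structure the answer's code reads
payload = {"context": {"dispatcher": {"stores": {"ScreenerResultsStore": {
    "results": {"rows": [
        {"ticker": "ABC", "companyshortname": "Example Corp",
         "startdatetime": "2020-11-20T21:00:00.000Z", "epsestimate": 0.5},
    ]}}}}}}

# json.dumps emits a single line, like the one-line assignment the regex relies on
SAMPLE_HTML = "<script>\nroot.App.main = " + json.dumps(payload) + ";\n</script>"


def extract_rows(html, keys=KEYS):
    """Pull the embedded JSON out of the page text and keep the wanted keys."""
    # '.' does not match newlines, so '{.+}' spans from the first '{' after
    # 'App.main' to the last '}' on that same line
    match = re.search(r"App\.main.*?({.+})", html)
    if match is None:
        raise ValueError("App.main JSON not found; the page layout may differ")
    data = json.loads(match.group(1))
    rows = (data['context']['dispatcher']['stores']
            ['ScreenerResultsStore']['results']['rows'])
    return [[row[k] for k in keys] for row in rows]


print(extract_rows(SAMPLE_HTML))
```

Raising an explicit error when the pattern is missing makes a layout change easier to diagnose than the `AttributeError` that `.group(1)` on `None` would produce.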
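Actually, your code does raise an error, though not on the line you expect but on a later one. The problem is probably that the page has not finished loading when you try to access the element. A short delay before the line where the error occurs might solve it.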
from selenium import webdriver
import datetime
import time
tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() #get tomorrow in iso format as needed
url = "https://finance.yahoo.com/calendar/earnings?day="+tomorrow
print ("url: " + url)
driver = webdriver.Chrome("C:/Users/jrod94/Downloads/chromedriver_win32/chromedriver.exe")
driver.get(url)
time.sleep(1) # you can increase 1 if it still does not work
element = driver.find_element_by_xpath("//*[@id='cal-res-table']")
Companies = [a.get_attribute("Company") for a in element]
driver.close()
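A fixed `time.sleep(1)` is a guess: it is too short on a slow connection and wasted time on a fast one. An explicit wait (such as Selenium's `WebDriverWait`) instead polls until a condition holds or a timeout expires. The sketch below shows only that polling idea in plain Python so it runs without a browser; `wait_until` and its parameters are hypothetical names for illustration, not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.2f seconds" % timeout)
        time.sleep(poll)


# Stand-in for a page that needs a few polls before the table shows up
attempts = {"count": 0}


def table_rows():
    attempts["count"] += 1
    return ["row 1", "row 2"] if attempts["count"] >= 3 else []


rows = wait_until(table_rows, timeout=2.0, poll=0.05)
print(rows, "after", attempts["count"], "attempts")
```

With a real driver the condition would be something like a lambda that calls `driver.find_elements(...)`, which returns an empty list until the elements exist.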
How about using pandas?
import datetime
import pandas as pd
pd.set_option('display.max_columns', None)  # show every column when printing
tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() #get tomorrow in iso format as needed
# read_html returns a list of DataFrames, one per <table> found on the page
tables = pd.read_html("https://finance.yahoo.com/calendar/earnings?day="+tomorrow, header=0)
table = tables[0]
print(table)
Output:
    Symbol  Company                         Earnings Call Time  EPS Estimate  \
0   WBAI    500.Com Ltd                     After Market Close  -
1   BRBR    Bellring Brands Inc             TAS                 0.19
2   BKE     Buckle Inc                      Before Market Open  0.54
3   BNR     Burning Rock Biotech Ltd        TAS                 -0.12
4   IEC     IEC Electronics Corp            TAS                 -
5   GEOS    Geospace Technologies Corp      TAS                 -
6   DREM    Dream Homes & Development Corp  Time Not Supplied   -
7   DXLG    Destination XL Group Inc        Before Market Open  -
8   FL      Foot Locker Inc                 Before Market Open  0.61
9   HHR     HeadHunter Group PLC            TAS                 0.14
10  HHR     HeadHunter Group PLC            Before Market Open  0.14
11  RMR     RMR Group Inc                   Before Market Open  0.39
12  GSX     GSX Techedu Inc                 Before Market Open  -0.31
13  GSX     GSX Techedu Inc                 TAS                 -0.31
14  HIBB    Hibbett Sports Inc              Before Market Open  0.45
15  HAYN    Haynes International Inc        TAS                 -0.7
16  IIIV    i3 Verticals Inc                TAS                 0.18
17  AIHS    Senmiao Technology Ltd          Before Market Open
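The question only asks for three of these columns, and once the table is a DataFrame they can be selected by name. The following is a small sketch that builds a two-row DataFrame by hand from the first rows of the output above, so it runs without a network request; it assumes the column headers are exactly those shown in the printout.

```python
import pandas as pd

# Two rows copied from the printed table above, standing in for tables[0]
table = pd.DataFrame({
    "Symbol": ["WBAI", "BRBR"],
    "Company": ["500.Com Ltd", "Bellring Brands Inc"],
    "Earnings Call Time": ["After Market Close", "TAS"],
    "EPS Estimate": ["-", "0.19"],
})

# Keep only the columns the question asks about
wanted = ["Symbol", "Company", "Earnings Call Time"]
subset = table[wanted]

# Plain Python lists, one per row, if a DataFrame is not needed downstream
records = subset.values.tolist()
print(records)
```

Selecting by header name rather than by position keeps the code working if the site adds or reorders columns, as long as the three headers keep their names.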