Question:

R: Using rvest to scrape Google reviews

鄂和璧

2023-03-14

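As part of a project, I am trying to scrape the full text of reviews from Google (in earlier attempts on other sites, my reviews were truncated by a "More" link that hides the full review unless you click it).

I chose the rvest package for this. However, I don't seem to be getting the results I want.

Here are my steps: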

library(rvest)
library(xml2)
library(RSelenium)

queens <- read_html("https://www.google.co.uk/search?q=queen%27s+hospital+romford&oq=queen%27s+hospitql+&aqs=chrome.1.69i57j0l5.5843j0j4&sourceid=chrome&ie=UTF-8#lrd=0x47d8a4ce4aaaba81:0xf1185c71ae14d00,1,,,")

#Here I use the selectorgadget tool to identify the user review part that I wish to scrape

reviews <- queens %>%
    html_nodes(".review-snippet") %>%
    html_text()

However, this doesn't seem to work. I get no output here.

I am very new to this package and to web scraping, so any input would be greatly appreciated.

1 answer

戴正阳

2023-03-14

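Here is a workflow using RSelenium and rvest:
1. Scroll down as many times as needed to load the content you want, pausing occasionally so the content has time to load.
2. Click every "More" link to expand the full reviews.
3. Get the page source and use rvest to extract all the reviews into a list.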
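
To see why step 3 parses the browser's page source rather than fetching the URL directly, here is a minimal sketch. Both HTML strings are made-up stand-ins (assumptions, not Google's real markup): one for what a plain fetch might return before any script runs, one for the same page after the browser has rendered the reviews. Only the selector `.review-full-text` is taken from this thread.

```r
library(rvest)

# Hypothetical markup: the page as served, before scripts insert the reviews.
served_html <- '<html><body><div id="reviews"></div></body></html>'

# Hypothetical markup: the same page after the browser has rendered the reviews.
rendered_html <- '<html><body><div id="reviews">
  <span class="review-full-text">Great staff.</span>
  <span class="review-full-text">Long wait, but good care.</span>
</div></body></html>'

# Parse an HTML string and return the text of every matching review node.
extract_reviews <- function(html) {
  read_html(html) %>%
    html_nodes(".review-full-text") %>%
    html_text()
}

static_reviews <- extract_reviews(served_html)      # no matching nodes
rendered_reviews <- extract_reviews(rendered_html)  # two review strings
```

An empty result from the first call is the same symptom as in the question: the selector is fine, but the nodes are not in the HTML that was parsed.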

What you want to scrape is not static, so you need Selenium's help. This should work:

library(rvest)
library(xml2)
library(RSelenium)

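# start a Selenium server and a Chrome session; chromever has to be compatible
# with the Chrome installed locally (the version below is from the original answer)
rmDr <- rsDriver(browser = "chrome", chromever = "73.0.3683.68")
myclient <- rmDr$client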
myclient$navigate("https://www.google.co.uk/search?q=queen%27s+hospital+romford&oq=queen%27s+hospitql+&aqs=chrome.1.69i57j0l5.5843j0j4&sourceid=chrome&ie=UTF-8#lrd=0x47d8a4ce4aaaba81:0xf1185c71ae14d00,1,,,")
#click on the snippet to switch focus----------
webEle <- myclient$findElement(using = "css",value = ".review-snippet")
webEle$clickElement()
#simulate scroll down for several times-------------
scroll_down_times=20
for(i in 1 :scroll_down_times){
    webEle$sendKeysToActiveElement(sendKeys = list(key="page_down"))
    #the content needs time to load,wait 1 second every 5 scroll downs
    if(i%%5==0){
        Sys.sleep(1)
    }
}
#loop and simulate clicking on all "click on more" elements-------------
webEles <- myclient$findElements(using = "css",value = ".review-more-link")
for(webEle in webEles){
    tryCatch(webEle$clickElement(),error=function(e){print(e)}) # trycatch to prevent any error from stopping the loop
}
pagesource= myclient$getPageSource()[[1]]
#this should get you the full review, including translation and original text-------------
reviews=read_html(pagesource) %>%
    html_nodes(".review-full-text") %>%
    html_text()

#number of stars
stars <- read_html(pagesource) %>%
    html_node(".review-dialog-list") %>%
    html_nodes("g-review-stars > span") %>%
    html_attr("aria-label")


#time posted
post_time <- read_html(pagesource) %>%
    html_node(".review-dialog-list") %>%
    html_nodes(".dehysf") %>%
    html_text()
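
If you want the three results in one table, a possible sketch follows. It parses the source once and assumes that every review yields exactly one star label, one timestamp, and one full-text node, which may not hold on the real page (for example, for reviews without text). The `pagesource` string here is a made-up stand-in for `myclient$getPageSource()[[1]]`, and the `aria-label` values are invented for illustration.

```r
library(rvest)

# Hypothetical stand-in for the string returned by myclient$getPageSource()[[1]].
pagesource <- '<html><body><div class="review-dialog-list">
  <div>
    <g-review-stars><span aria-label="Rated 5.0 out of 5,"></span></g-review-stars>
    <span class="dehysf">2 weeks ago</span>
    <span class="review-full-text">Great staff.</span>
  </div>
  <div>
    <g-review-stars><span aria-label="Rated 3.0 out of 5,"></span></g-review-stars>
    <span class="dehysf">a month ago</span>
    <span class="review-full-text">Long wait, but good care.</span>
  </div>
</div></body></html>'

# Parse once and reuse the review list node for all three fields.
page <- read_html(pagesource)
dialog <- html_node(page, ".review-dialog-list")

stars <- dialog %>% html_nodes("g-review-stars > span") %>% html_attr("aria-label")
post_time <- dialog %>% html_nodes(".dehysf") %>% html_text()
reviews <- dialog %>% html_nodes(".review-full-text") %>% html_text()

# Fail early if the fields do not line up one-to-one (assumption stated above).
stopifnot(length(stars) == length(post_time),
          length(post_time) == length(reviews))

reviews_df <- data.frame(stars = stars, post_time = post_time,
                         review = reviews, stringsAsFactors = FALSE)
```

If the length check fails on real data, the fields would need to be extracted per review container instead, so that missing values can be filled with NA.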
