我正在AWS(Ubunut)EC2实例上运行一个脚本。它是一款网页刮板,使用Selenium/ChromeDriver和无头chrome来刮一些网页。我以前运行这个脚本没有任何问题,但是今天我遇到了一个错误。下面是脚本:
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--window-size=1420,1080')
options.add_argument('--headless')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
options.add_argument("--disable-notifications")
options.binary_location='/usr/bin/chromium-browser'
driver = webdriver.Chrome(chrome_options=options)
#Set base url (SAN FRANCISCO)
base_url = 'https://www.bandsintown.com/en/c/san-francisco-ca?page='
events = []
for i in range(1,90):
#cycle through pages in range
driver.get(base_url + str(i))
pageURL = base_url + str(i)
print(pageURL)
当我从ubuntu运行这个脚本时,我得到这个错误:
Traceback (most recent call last):
File "BandsInTown_Scraper_SF.py", line 91, in <module>
driver = webdriver.Chrome(chrome_options=options)
File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
self.service.start()
File "/home/ubuntu/.local/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 76, in start
stdin=PIPE)
File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
我确认我正在运行相同版本的ChromeDriver/Chromium浏览器:
ChromeDriver 79.0.3945.130 (e22de67c28798d98833a7137c0e22876237fc40a-refs/branch-heads/3945@{#1047})
Chromium 79.0.3945.130 Built on Ubuntu , running on Ubuntu 18.04
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 346, in _make_request
self._validate_conn(conn)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 852, in _validate_conn
conn.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 284, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
^[[B File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
^[[B^[[A^[[A _stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.bandsintown.com', port=443): Max retries exceeded with url: /en/c/san-francisco-ca?page=6 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "BandsInTown_Scraper_SF.py", line 39, in <module>
res = requests.get(url)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.bandsintown.com', port=443): Max retries exceeded with url: /en/c/san-francisco-ca?page=6 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
此错误消息...
restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory
...暗示操作系统无法分配内存来启动/生成新会话。
此外,此错误消息...
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.bandsintown.com', port=443): Max retries exceeded with url: /en/c/san-francisco-ca?page=6 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f90945757f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
base_url = 'https://www.bandsintown.com/en/c/san-francisco-ca?page='
for i in range(1,10):
#cycle through pages in range
driver.get(base_url + str(i))
pageURL = base_url + str(i)
print(pageURL)
https://www.bandsintown.com/en/c/san-francisco-ca?page=1
https://www.bandsintown.com/en/c/san-francisco-ca?page=2
https://www.bandsintown.com/en/c/san-francisco-ca?page=3
https://www.bandsintown.com/en/c/san-francisco-ca?page=4
https://www.bandsintown.com/en/c/san-francisco-ca?page=5
https://www.bandsintown.com/en/c/san-francisco-ca?page=6
https://www.bandsintown.com/en/c/san-francisco-ca?page=7
https://www.bandsintown.com/en/c/san-francisco-ca?page=8
https://www.bandsintown.com/en/c/san-francisco-ca?page=9
self.pid = _posixsubprocess.fork_exec(
args, executable_list,
close_fds, tuple(sorted(map(int, fds_to_keep))),
cwd, env_list,
p2cread, p2cwrite, c2pread, c2pwrite,
errread, errwrite,
errpipe_read, errpipe_write,
restore_signals, start_new_session, preexec_fn)
要检查系统是否已经有一些可用的交换空间,您需要执行以下命令:
$ sudo swapon --show
如果您没有得到任何输出,这意味着您的系统当前没有可用的交换空间。您还可以使用free实用程序验证没有活动交换,如下所示:
$ free -h
如果系统中没有活动交换,您将看到如下输出:
Output
total used free shared buff/cache available
Mem: 488M 36M 104M 652K 348M 426M
Swap: 0B 0B 0B
$ sudo fallocate -l 1G /swapfile
$ ls -lh /swapfile
#Output
$ -rw-r--r-- 1 root root 1.0G Mar 08 10:30 /swapfile
$ sudo chmod 600 /swapfile
$ ls -lh /swapfile
#Output
-rw------- 1 root root 1.0G Apr 25 11:14 /swapfile
这确认只有根用户启用了读和写标志。
现在需要通过执行以下命令将文件标记为交换空间:
$ sudo mkswap /swapfile
#Sample Output
Setting up swapspace version 1, size = 1024 MiB (1073737728 bytes)
no label, UUID=6e965805-2ab9-450f-aed6-577e74089dbf
接下来,您需要启用交换文件,允许系统开始使用它执行以下命令:
$ sudo swapon /swapfile
$ sudo swapon --show
#Sample Output
NAME TYPE SIZE USED PRIO
/swapfile file 1024M 0B -1
$ free -h
#Sample Output
total used free shared buff/cache available
Mem: 488M 37M 96M 652K 354M 425M
Swap: 1.0G 0B 1.0G
问题内容: 我是python的新手,已经学习了几周。但是,现在我刚刚更改了操作系统,现在正在使用ubuntu,并且无法在终端上运行任何脚本。 我确定有, 但是当我去终端输入时,例如 终端显示了这样的错误消息 python:无法打开文件“ test.py”:[Errno 2]没有这样的文件或目录 我该怎么办? 我必须将文件保存在任何特定的文件夹中以使其在终端上运行吗? 问题答案: 这个错误: pyt
问题内容: 我刚从python移植了我的应用程序,所以Go有点新。看来我遇到了记忆问题。 它在ubuntu机器上运行。通过主管。 编辑: 设置解决问题 问题答案: 对于遇到此问题的其他人,这是golang问题中的相关近期问题 对于所有受影响的人,在Linux上得到适当修复之前的临时替代方法可以是以下之一: 启用无条件过量使用: 能够无条件过载:添加交换到你的主机,用它几乎永远不会被使用,但在计算参
问题内容: 我从PHP脚本执行Python脚本时遇到问题。我的客户端使用Bluehost,因此我使用在此描述的easy_install方法为Python安装了第三方模块(numpy):https ://my.bluehost.com/cgi/help/530?step = 530 为了演示我的问题,我创建了两个python脚本和一个PHP脚本。 hello.py包含: hello-numpy.py
对此有什么建议吗?
我在我的Ubuntu VM上安装了Hive和Hadoop。 当我在终端上启动时,我会得到以下信息: SLF4J:类路径包含多个SLF4J绑定。slf4j:在[jar:file:/opt/apache-hive-2.3.5-bin/lib/log4j-Slf4j-impl-2.6.2.jar!/org/slf4j/impl/staticloggerbinder.class]中找到绑定slf4j:在[