Question:

Selenium Docker container runs on EC2 but not on AWS Lambda

郁高韵
2023-03-14

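I want to run a Selenium script on AWS Lambda using a Docker container.

I build the container on an AWS EC2 instance and then test it locally with the AWS Lambda RIE (Runtime Interface Emulator). Once the test succeeds, I push the container image to ECR so that AWS Lambda can use it.

Although the local RIE test on EC2 always succeeds, I cannot get the function to work on Lambda. The Lambda test currently always fails with the following error message: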

{
  "errorMessage": "Message: session not created\nfrom tab crashed\n  (Session info: headless chrome=93.0.4577.63)\n",
  "errorType": "SessionNotCreatedException",
  "stackTrace": [
    "  File \"/var/task/app.py\", line 32, in handler\n    driver = webdriver.Chrome(\n",
    "  File \"/var/task/selenium/webdriver/chrome/webdriver.py\", line 76, in __init__\n    RemoteWebDriver.__init__(\n",
    "  File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 157, in __init__\n    self.start_session(capabilities, browser_profile)\n",
    "  File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 252, in start_session\n    response = self.execute(Command.NEW_SESSION, parameters)\n",
    "  File \"/var/task/selenium/webdriver/remote/webdriver.py\", line 321, in execute\n    self.error_handler.check_response(response)\n",
    "  File \"/var/task/selenium/webdriver/remote/errorhandler.py\", line 242, in check_response\n    raise exception_class(message, screen, stacktrace)\n"
  ]
}
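
Because the same image behaves differently under RIE on EC2 and on Lambda, it may help to log what the runtime environment looks like from inside the handler before Chrome starts. The following is a minimal diagnostic sketch using only the standard library. `environment_report` is a hypothetical helper, not part of the code in this question, and the choice of checks (writable temp directory, writable home directory, presence of `/dev/shm`) is an assumption about what might differ between the two environments.

```python
import os
import tempfile


def environment_report():
    """Collect a few facts about the runtime that headless Chrome depends on."""
    tmp_dir = tempfile.gettempdir()
    home = os.environ.get("HOME", "")
    return {
        # Chrome needs somewhere writable for its profile and cache.
        "tmp_dir": tmp_dir,
        "tmp_writable": os.access(tmp_dir, os.W_OK),
        # By default Chrome keeps its profile under the home directory.
        "home": home,
        "home_writable": bool(home) and os.access(home, os.W_OK),
        # Shared memory is what --disable-dev-shm-usage works around.
        "dev_shm_exists": os.path.isdir("/dev/shm"),
        # Set by the Lambda runtime; may be None elsewhere.
        "memory_mb": os.environ.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE"),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Printing this report at the top of `handler` in both environments makes the differences visible in the RIE console output and in the Lambda logs.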

Here is all the code I am actually using:

Dockerfile

FROM public.ecr.aws/lambda/python:3.8

#Download and install Chrome
RUN curl https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm > ./google-chrome-stable_current_x86_64.rpm
RUN yum install -y ./google-chrome-stable_current_x86_64.rpm
RUN rm ./google-chrome-stable_current_x86_64.rpm

#Download and install chromedriver
RUN yum install -y unzip
RUN curl http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip > /tmp/chromedriver.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
RUN rm /tmp/chromedriver.zip
RUN yum remove -y unzip

#Upgrade pip and install python dependencies
RUN pip3 install --upgrade pip
RUN pip3 install selenium --target "${LAMBDA_TASK_ROOT}"

#Copy app.py
COPY app.py ${LAMBDA_TASK_ROOT}

CMD ["app.handler"]

app.py


import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

def handler(event, context):

    chrome_options = Options()
    
    chrome_options.add_argument("--allow-running-insecure-content")
    chrome_options.add_argument("--ignore-certificate-errors")
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--disable-dev-tools")
    chrome_options.add_argument("--no-zygote")
    chrome_options.add_argument("--v=99")
    chrome_options.add_argument("--single-process")

    chrome_options.binary_location = '/usr/bin/google-chrome-stable'

    capabilities = webdriver.DesiredCapabilities().CHROME
    capabilities['acceptSslCerts'] = True
    capabilities['acceptInsecureCerts'] = True
        
    driver = webdriver.Chrome(
        executable_path='/usr/local/bin/chromedriver',
        options=chrome_options,
        desired_capabilities=capabilities)

    if driver:
    
        response = {
            "statusCode": 200,
            "body": json.dumps("Selenium Driver Initiated")
        }
    
        return response

Local container test with RIE

$ docker run -p 9000:8080 aws-scraper
  results in > time="2021-09-03T15:24:13.269" level=info msg="exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)"

$ curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'
  results in > {"statusCode": 200, "body": "\"Selenium Driver Initiated\""}
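
The `curl` call above can also be scripted. This is a minimal sketch using only the Python standard library, assuming the container is still published on port 9000 as in the `docker run` command. `build_invocation_request` and `invoke_local` are hypothetical helper names.

```python
import json
import urllib.request

# Invocation path used by the curl command, on the port published by docker run.
RIE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invocation_request(event, url=RIE_URL):
    """Build the POST request that delivers `event` to the emulated function."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke_local(event, url=RIE_URL, timeout=60):
    """Send `event` to the running container and return the decoded JSON response."""
    request = build_invocation_request(event, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `invoke_local({})` sends the same empty event as the `curl` command and returns the handler's response as a dictionary.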

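I really can't figure this out. I also tried following the thread "Selenium works on AWS EC2 but not on AWS Lambda", but to no avail.

Any help would be much appreciated. Thank you in advance.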

1 answer

微生俊健
2023-03-14

Solved by borrowing the Dockerfile and the Selenium WebDriver Chrome options from this repo: https://github.com/rchauhan9/image-scraper-lambda-container.git

The Dockerfile now looks like this:

# Define global args
ARG FUNCTION_DIR="/home/app/"
ARG RUNTIME_VERSION="3.9"
ARG DISTRO_VERSION="3.12"


# Stage 1
FROM python:${RUNTIME_VERSION}-alpine${DISTRO_VERSION} AS python-alpine

RUN apk add --no-cache \
    libstdc++

# Stage 2
FROM python-alpine AS build-image

RUN apk add --no-cache \
    build-base \
    libtool \
    autoconf \
    automake \
    libexecinfo-dev \
    make \
    cmake \
    libcurl

ARG FUNCTION_DIR
ARG RUNTIME_VERSION

RUN mkdir -p ${FUNCTION_DIR}

RUN python${RUNTIME_VERSION} -m pip install awslambdaric --target ${FUNCTION_DIR}


# Stage 3
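FROM python-alpine AS build-image2

ARG FUNCTION_DIR
# Re-declare the global ARG; otherwise ${RUNTIME_VERSION} expands to an empty string in this stage
ARG RUNTIME_VERSION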

WORKDIR ${FUNCTION_DIR}

COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}

RUN apk update \
    && apk add gcc python3-dev musl-dev \
    && apk add jpeg-dev zlib-dev libjpeg-turbo-dev

COPY requirements.txt .

RUN python${RUNTIME_VERSION} -m pip install -r requirements.txt --target ${FUNCTION_DIR}

# Stage 4
FROM python-alpine

ARG FUNCTION_DIR

WORKDIR ${FUNCTION_DIR}

COPY --from=build-image2 ${FUNCTION_DIR} ${FUNCTION_DIR}

RUN apk add jpeg-dev zlib-dev libjpeg-turbo-dev \
    && apk add chromium chromium-chromedriver

ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie

RUN chmod 755 /usr/bin/aws-lambda-rie

COPY app/* ${FUNCTION_DIR}
COPY entry.sh /

ENTRYPOINT [ "/entry.sh" ]

CMD [ "app.handler" ]

And app.py now looks like this:

import json

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

def handler(event, context):

    chrome_options = Options()
    
    chrome_options.add_argument('--autoplay-policy=user-gesture-required')
    chrome_options.add_argument('--disable-background-networking')
    chrome_options.add_argument('--disable-background-timer-throttling')
    chrome_options.add_argument('--disable-backgrounding-occluded-windows')
    chrome_options.add_argument('--disable-breakpad')
    chrome_options.add_argument('--disable-client-side-phishing-detection')
    chrome_options.add_argument('--disable-component-update')
    chrome_options.add_argument('--disable-default-apps')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--disable-domain-reliability')
    chrome_options.add_argument('--disable-extensions')
    chrome_options.add_argument('--disable-features=AudioServiceOutOfProcess')
    chrome_options.add_argument('--disable-hang-monitor')
    chrome_options.add_argument('--disable-ipc-flooding-protection')
    chrome_options.add_argument('--disable-notifications')
    chrome_options.add_argument('--disable-offer-store-unmasked-wallet-cards')
    chrome_options.add_argument('--disable-popup-blocking')
    chrome_options.add_argument('--disable-print-preview')
    chrome_options.add_argument('--disable-prompt-on-repost')
    chrome_options.add_argument('--disable-renderer-backgrounding')
    chrome_options.add_argument('--disable-setuid-sandbox')
    chrome_options.add_argument('--disable-speech-api')
    chrome_options.add_argument('--disable-sync')
    chrome_options.add_argument('--disk-cache-size=33554432')
    chrome_options.add_argument('--hide-scrollbars')
    chrome_options.add_argument('--ignore-gpu-blacklist')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--metrics-recording-only')
    chrome_options.add_argument('--mute-audio')
    chrome_options.add_argument('--no-default-browser-check')
    chrome_options.add_argument('--no-first-run')
    chrome_options.add_argument('--no-pings')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--no-zygote')
    chrome_options.add_argument('--password-store=basic')
    chrome_options.add_argument('--use-gl=swiftshader')
    chrome_options.add_argument('--use-mock-keychain')
    chrome_options.add_argument('--single-process')
    chrome_options.add_argument('--headless')

    chrome_options.add_argument('--user-data-dir={}'.format('/tmp/user-data'))
    chrome_options.add_argument('--data-path={}'.format('/tmp/data-path'))
    chrome_options.add_argument('--homedir={}'.format('/tmp'))
    chrome_options.add_argument('--disk-cache-dir={}'.format('/tmp/cache-dir'))
        
    driver = webdriver.Chrome(
        executable_path='/usr/bin/chromedriver',
        options=chrome_options)

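    if driver:
        print("Selenium Driver Initiated")

    # NOTE: `html` was not defined in the snippet as posted. Assumption: it was
    # meant to hold the page source of whatever page the driver loaded.
    html = driver.page_source

    response = {
        "statusCode": 200,
        "body": json.dumps(html, ensure_ascii=False)
    }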

    return response

To be honest, I still don't understand why these changes did the trick. Any thoughts on this are very welcome!
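
One possible explanation, offered as an assumption rather than something this answer confirms: on Lambda the container's filesystem is read-only apart from `/tmp`, whereas a plain `docker run` on EC2 gives Chrome a writable filesystem. The `--user-data-dir`, `--data-path`, `--homedir` and `--disk-cache-dir` options in the working version all point Chrome at `/tmp`. A minimal sketch of that idea follows. `writable_chrome_args` is a hypothetical helper that only builds the argument strings, so it runs without Selenium installed.

```python
import os
import tempfile


def writable_chrome_args(base_dir=None):
    """Return Chrome flags that keep Chrome's profile, data and cache under base_dir."""
    # Default to the system temp directory (/tmp on most Linux systems).
    base_dir = base_dir or tempfile.gettempdir()

    # Same flag-to-subdirectory layout as in the working app.py.
    dir_flags = {
        "--user-data-dir": os.path.join(base_dir, "user-data"),
        "--data-path": os.path.join(base_dir, "data-path"),
        "--disk-cache-dir": os.path.join(base_dir, "cache-dir"),
    }

    # Create the directories up front so a permission problem surfaces here
    # as an OSError instead of as a Chrome crash later.
    for path in dir_flags.values():
        os.makedirs(path, exist_ok=True)

    args = [f"{flag}={path}" for flag, path in dir_flags.items()]
    args.append(f"--homedir={base_dir}")
    return args
```

Each returned string can be passed to `chrome_options.add_argument(...)` in the handler. The other changes (Alpine's `chromium` and `chromium-chromedriver` packages, the longer flag list) may matter as well; this sketch isolates only the writable-directory part.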

Thanks again to everyone for the help and support.
