chrome-aws-lambda

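License: MIT License

Language: JavaScript

Category: Cloud Computing, Serverless Systems

Software type: Open source

Region: Unknown

Submitted by: 宦琪

Operating system: Cross-platform

Open-source organization: None

Target audience: Unknown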

Software Overview

chrome-aws-lambda

Chromium Binary for AWS Lambda and Google Cloud Functions

Install

npm install chrome-aws-lambda --save-prod

This ships with the appropriate binary for the latest stable release of puppeteer (usually updated within a few days).

You also need to install the corresponding version of puppeteer-core (or puppeteer):

npm install puppeteer-core --save-prod

If you wish to install an older version of Chromium, take a look at Versioning.

Usage

This package works with all the currently supported AWS Lambda Node.js runtimes out of the box.

const chromium = require('chrome-aws-lambda');

exports.handler = async (event, context) => {
  let browser = null;

  try {
    // Launch the bundled Chromium with the flags recommended by this package.
    browser = await chromium.puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    });

    const page = await browser.newPage();

    await page.goto(event.url || 'https://example.com');

    // The value an async handler resolves to becomes the invocation result;
    // a thrown error is reported as the invocation error.
    return await page.title();
  } finally {
    // Always close the browser, even when navigation fails.
    if (browser !== null) {
      await browser.close();
    }
  }
};

Usage with Playwright

const chromium = require('chrome-aws-lambda');
const playwright = require('playwright-core');

(async () => {
  const browser = await playwright.chromium.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath,
    headless: chromium.headless,
  });

  // ...

  await browser.close();
})();

Allocate at least 512 MB of RAM to your Lambda function; 1600 MB (or more) is recommended.

Running Locally

Please refer to the Local Development Wiki page for instructions and troubleshooting.

API

Method / Property	Returns	Description
`font(url)`	`{?Promise<string>}`	Provisions a custom font and returns its basename.
`args`	`{!Array<string>}`	Provides a list of recommended additional Chromium flags.
`defaultViewport`	`{!Object}`	Returns more sensible default viewport settings.
`executablePath`	`{?Promise<string>}`	Returns the path the Chromium binary was extracted to.
`headless`	`{!boolean}`	Returns `true` if we are running on AWS Lambda or GCF.
`puppeteer`	`{!Object}`	Overloads `puppeteer` and returns the resolved package.
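
The properties in the table above are usually combined into a single launch-options object. The following is a minimal sketch of that idea; `buildLaunchOptions` and its `overrides` parameter are hypothetical names introduced here for illustration and are not part of the package. The helper accepts any object exposing the documented properties, so it can be exercised without a real Chromium binary.

```javascript
// Hypothetical helper (not part of chrome-aws-lambda): collects the documented
// properties into one options object suitable for puppeteer's launch().
async function buildLaunchOptions(chromium, overrides = {}) {
  return {
    args: chromium.args,                            // recommended Chromium flags
    defaultViewport: chromium.defaultViewport,      // default viewport settings
    executablePath: await chromium.executablePath,  // a Promise, per the table above
    headless: chromium.headless,
    ...overrides,                                   // caller-supplied values win
  };
}
```

In a handler this could be used as `chromium.puppeteer.launch(await buildLaunchOptions(chromium, { ignoreHTTPSErrors: true }))`.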

Fonts

The Amazon Linux 2 AWS Lambda runtime is no longer provisioned with any font faces.

Because of this, this package ships with Open Sans, which supports the following scripts:

Latin
Greek
Cyrillic

To provision additional fonts, simply call the font() method with an absolute path or URL:

await chromium.font('/var/task/fonts/NotoColorEmoji.ttf');
// or
await chromium.font('https://raw.githack.com/googlei18n/noto-emoji/master/fonts/NotoColorEmoji.ttf');

Noto Color Emoji (or similar) is needed if you want to render emojis.

For URLs, it's recommended that you use a CDN, like raw.githack.com or gitcdn.xyz.

This method should be invoked before launching Chromium.

On non-serverless environments, the font() method is a no-op to avoid polluting the user space.
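
Because font() has to finish before Chromium starts, the ordering can be made explicit in a small wrapper. This is only a sketch: `launchWithFonts` and `fontSources` are hypothetical names, and `chromium` is assumed to be the object exported by this package.

```javascript
// Hypothetical helper: provisions each font, then launches Chromium.
// `chromium` is assumed to be the object exported by chrome-aws-lambda.
async function launchWithFonts(chromium, fontSources) {
  // font() must complete before launch, so await every call first.
  for (const source of fontSources) {
    await chromium.font(source);
  }

  return chromium.puppeteer.launch({
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    executablePath: await chromium.executablePath,
    headless: chromium.headless,
  });
}
```

For example, `await launchWithFonts(chromium, ['/var/task/fonts/NotoColorEmoji.ttf'])` would provision the emoji font from the example above and then return the browser.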

Alternatively, it's also possible to provision fonts via AWS Lambda Layers.

Simply create a directory named .fonts and place any font faces you want there:

.fonts
├── NotoColorEmoji.ttf
└── Roboto.ttf

Afterwards, you just need to ZIP the directory and upload it as an AWS Lambda Layer:

zip -9 --filesync --move --recurse-paths .fonts.zip .fonts/

Overloading

Since version 8.0.0, it's possible to overload puppeteer with the following convenient API:

interface Browser {
  defaultPage(...hooks: ((page: Page) => Promise<Page>)[])
  newPage(...hooks: ((page: Page) => Promise<Page>)[])
}

interface BrowserContext {
  defaultPage(...hooks: ((page: Page) => Promise<Page>)[])
  newPage(...hooks: ((page: Page) => Promise<Page>)[])
}

interface Page {
  block(patterns: string[])
  clear(selector: string)
  clickAndWaitForNavigation(selector: string, options?: WaitForOptions)
  clickAndWaitForRequest(selector: string, predicate: string | RegExp, options?: WaitTimeoutOptions)
  clickAndWaitForRequest(selector: string, predicate: ((request: HTTPRequest) => boolean | Promise<boolean>), options?: WaitTimeoutOptions)
  clickAndWaitForResponse(selector: string, predicate: string | RegExp, options?: WaitTimeoutOptions)
  clickAndWaitForResponse(selector: string, predicate: ((request: HTTPResponse) => boolean | Promise<boolean>), options?: WaitTimeoutOptions)
  count(selector: string)
  exists(selector: string)
  fillFormByLabel(selector: string, data: Record<string, boolean | string | string[]>)
  fillFormByName(selector: string, data: Record<string, boolean | string | string[]>)
  fillFormBySelector(selector: string, data: Record<string, boolean | string | string[]>)
  fillFormByXPath(selector: string, data: Record<string, boolean | string | string[]>)
  number(selector: string, decimal?: string, property?: string)
  selectByLabel(selector: string, ...values: string[])
  string(selector: string, property?: string)
  waitForInflightRequests(requests?: number, alpha?: number, omega?: number, options?: WaitTimeoutOptions)
  waitForText(predicate: string, options?: WaitTimeoutOptions)
  waitUntilVisible(selector: string, options?: WaitTimeoutOptions)
  waitWhileVisible(selector: string, options?: WaitTimeoutOptions)
  withTracing(options: TracingOptions, callback: (page: Page) => Promise<any>)
}

interface Frame {
  clear(selector: string)
  clickAndWaitForNavigation(selector: string, options?: WaitForOptions)
  clickAndWaitForRequest(selector: string, predicate: string | RegExp, options?: WaitTimeoutOptions)
  clickAndWaitForRequest(selector: string, predicate: ((request: HTTPRequest) => boolean | Promise<boolean>), options?: WaitTimeoutOptions)
  clickAndWaitForResponse(selector: string, predicate: string | RegExp, options?: WaitTimeoutOptions)
  clickAndWaitForResponse(selector: string, predicate: ((request: HTTPResponse) => boolean | Promise<boolean>), options?: WaitTimeoutOptions)
  count(selector: string)
  exists(selector: string)
  fillFormByLabel(selector: string, data: Record<string, boolean | string | string[]>)
  fillFormByName(selector: string, data: Record<string, boolean | string | string[]>)
  fillFormBySelector(selector: string, data: Record<string, boolean | string | string[]>)
  fillFormByXPath(selector: string, data: Record<string, boolean | string | string[]>)
  number(selector: string, decimal?: string, property?: string)
  selectByLabel(selector: string, ...values: string[])
  string(selector: string, property?: string)
  waitForText(predicate: string, options?: WaitTimeoutOptions)
  waitUntilVisible(selector: string, options?: WaitTimeoutOptions)
  waitWhileVisible(selector: string, options?: WaitTimeoutOptions)
}

interface ElementHandle {
  clear()
  clickAndWaitForNavigation(options?: WaitForOptions)
  clickAndWaitForRequest(predicate: string | RegExp, options?: WaitTimeoutOptions)
  clickAndWaitForRequest(predicate: ((request: HTTPRequest) => boolean | Promise<boolean>), options?: WaitTimeoutOptions)
  clickAndWaitForResponse(predicate: string | RegExp, options?: WaitTimeoutOptions)
  clickAndWaitForResponse(predicate: ((request: HTTPResponse) => boolean | Promise<boolean>), options?: WaitTimeoutOptions)
  fillFormByLabel(data: Record<string, boolean | string | string[]>)
  fillFormByName(data: Record<string, boolean | string | string[]>)
  fillFormBySelector(data: Record<string, boolean | string | string[]>)
  fillFormByXPath(data: Record<string, boolean | string | string[]>)
  getInnerHTML()
  getInnerText()
  number(decimal?: string, property?: string)
  selectByLabel(...values: string[])
  string(property?: string)
}

To enable this behavior, simply access the puppeteer property exposed by this package.

Refer to the TypeScript typings for general documentation.
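
As an illustration of how a few of the overloaded methods might be combined, here is a short sketch. The listing above omits return types, so the resolved values assumed here (a boolean from exists(), a number from count(), a string from string()) are inferred from the method names and should be checked against the typings. `summarizeArticle` and the selectors are hypothetical.

```javascript
// Sketch: reads a few values from a page using the overloaded helpers.
// Assumes `page` was created via chromium.puppeteer, so the extra methods exist.
async function summarizeArticle(page) {
  // exists(selector): assumed to resolve to a boolean.
  if (!(await page.exists('article'))) {
    return null;
  }

  return {
    paragraphs: await page.count('article p'),  // assumed to resolve to a number
    heading: await page.string('article h1'),   // assumed to resolve to a string
  };
}
```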

Page Hooks

When overloaded, you can specify a list of hooks to automatically apply to pages.

For instance, to remove the Headless substring from the user agent:

async function replaceUserAgent(page: Page): Promise<Page> {
  let value = await page.browser().userAgent();

  if (value.includes('Headless') === true) {
    await page.setUserAgent(value.replace('Headless', ''));
  }

  return page;
}

And then simply pass that page hook to defaultPage() or newPage():

let page = await browser.defaultPage(replaceUserAgent);

Additional bundled page hooks can be found in /build/hooks.
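
A hook is just an async function that receives a page and returns it, so custom hooks can be written in plain JavaScript as well. The sketch below is a hypothetical hook named `blockImages`; it relies only on standard puppeteer calls (setRequestInterception() and the request event) and is not one of the bundled hooks.

```javascript
// Hypothetical page hook: aborts image requests to save bandwidth.
// Uses only standard puppeteer calls (request interception and request events).
async function blockImages(page) {
  await page.setRequestInterception(true);

  page.on('request', (request) => {
    if (request.resourceType() === 'image') {
      request.abort();
    } else {
      request.continue();
    }
  });

  // Hooks hand the page back so that several of them can be applied in turn.
  return page;
}
```

Since defaultPage() and newPage() accept a list of hooks, several can be passed in one call, for example `await browser.newPage(replaceUserAgent, blockImages)`.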

Versioning

This package is versioned based on the underlying puppeteer minor version:

`puppeteer` Version	`chrome-aws-lambda` Version	Chromium Revision
`10.1.*`	`npm i chrome-aws-lambda@~10.1.0`	`884014` (`92.0.4512.0`)
`10.0.*`	`npm i chrome-aws-lambda@~10.0.0`	`884014` (`92.0.4512.0`)
`9.1.*`	`npm i chrome-aws-lambda@~9.1.0`	`869685` (`91.0.4469.0`)
`9.0.*`	`npm i chrome-aws-lambda@~9.0.0`	`869685` (`91.0.4469.0`)
`8.0.*`	`npm i chrome-aws-lambda@~8.0.2`	`856583` (`90.0.4427.0`)
`7.0.*`	`npm i chrome-aws-lambda@~7.0.0`	`848005` (`90.0.4403.0`)
`6.0.*`	`npm i chrome-aws-lambda@~6.0.0`	`843427` (`89.0.4389.0`)
`5.5.*`	`npm i chrome-aws-lambda@~5.5.0`	`818858` (`88.0.4298.0`)
`5.4.*`	`npm i chrome-aws-lambda@~5.4.0`	`809590` (`87.0.4272.0`)
`5.3.*`	`npm i chrome-aws-lambda@~5.3.1`	`800071` (`86.0.4240.0`)
`5.2.*`	`npm i chrome-aws-lambda@~5.2.1`	`782078` (`85.0.4182.0`)
`5.1.*`	`npm i chrome-aws-lambda@~5.1.0`	`768783` (`84.0.4147.0`)
`5.0.*`	`npm i chrome-aws-lambda@~5.0.0`	`756035` (`83.0.4103.0`)
`3.1.*`	`npm i chrome-aws-lambda@~3.1.1`	`756035` (`83.0.4103.0`)
`3.0.*`	`npm i chrome-aws-lambda@~3.0.4`	`737027` (`81.0.4044.0`)
`2.1.*`	`npm i chrome-aws-lambda@~2.1.1`	`722234` (`80.0.3987.0`)
`2.0.*`	`npm i chrome-aws-lambda@~2.0.2`	`705776` (`79.0.3945.0`)
`1.20.*`	`npm i chrome-aws-lambda@~1.20.4`	`686378` (`78.0.3882.0`)
`1.19.*`	`npm i chrome-aws-lambda@~1.19.0`	`674921` (`77.0.3844.0`)
`1.18.*`	`npm i chrome-aws-lambda@~1.18.1`	`672088` (`77.0.3835.0`)
`1.18.*`	`npm i chrome-aws-lambda@~1.18.0`	`669486` (`77.0.3827.0`)
`1.17.*`	`npm i chrome-aws-lambda@~1.17.1`	`662092` (`76.0.3803.0`)
`1.16.*`	`npm i chrome-aws-lambda@~1.16.1`	`656675` (`76.0.3786.0`)
`1.15.*`	`npm i chrome-aws-lambda@~1.15.1`	`650583` (`75.0.3765.0`)
`1.14.*`	`npm i chrome-aws-lambda@~1.14.0`	`641577` (`75.0.3738.0`)
`1.13.*`	`npm i chrome-aws-lambda@~1.13.0`	`637110` (`74.0.3723.0`)
`1.12.*`	`npm i chrome-aws-lambda@~1.12.2`	`624492` (`73.0.3679.0`)
`1.11.*`	`npm i chrome-aws-lambda@~1.11.2`	`609904` (`72.0.3618.0`)
`1.10.*`	`npm i chrome-aws-lambda@~1.10.1`	`604907` (`72.0.3582.0`)
`1.9.*`	`npm i chrome-aws-lambda@~1.9.1`	`594312` (`71.0.3563.0`)
`1.8.*`	`npm i chrome-aws-lambda@~1.8.0`	`588429` (`71.0.3542.0`)
`1.7.*`	`npm i chrome-aws-lambda@~1.7.0`	`579032` (`70.0.3508.0`)
`1.6.*`	`npm i chrome-aws-lambda@~1.6.3`	`575458` (`69.0.3494.0`)
`1.5.*`	`npm i chrome-aws-lambda@~1.5.0`	`564778` (`69.0.3452.0`)
`1.4.*`	`npm i chrome-aws-lambda@~1.4.0`	`555668` (`68.0.3419.0`)
`1.3.*`	`npm i chrome-aws-lambda@~1.3.0`	`549031` (`67.0.3391.0`)
`1.2.*`	`npm i chrome-aws-lambda@~1.2.0`	`543305` (`67.0.3372.0`)
`1.1.*`	`npm i chrome-aws-lambda@~1.1.0`	`536395` (`66.0.3347.0`)
`1.0.*`	`npm i chrome-aws-lambda@~1.0.0`	`526987` (`65.0.3312.0`)
`0.13.*`	`npm i chrome-aws-lambda@~0.13.0`	`515411` (`64.0.3264.0`)

Patch versions are reserved for bug fixes in chrome-aws-lambda and general maintenance.
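
The versioning rule can be expressed as a small function that derives a tilde range from an installed puppeteer version. This is a sketch under stated assumptions: `matchingRange` is a hypothetical helper, and it does not check that a release actually exists for the given minor version (the table above remains the reference).

```javascript
// Hypothetical helper: maps a puppeteer version to the tilde range of
// chrome-aws-lambda that tracks the same minor release.
function matchingRange(puppeteerVersion) {
  const match = /^(\d+)\.(\d+)\.\d+/.exec(puppeteerVersion);

  if (match === null) {
    throw new Error(`Not a semver version: ${puppeteerVersion}`);
  }

  const [, major, minor] = match;

  // "~X.Y.0" accepts any patch release of X.Y, e.g. ~8.0.0 also covers 8.0.2.
  return `~${major}.${minor}.0`;
}
```

With puppeteer 9.1.1 installed, `matchingRange('9.1.1')` yields `~9.1.0`, which corresponds to the `9.1.*` row of the table.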

Compiling

To compile your own version of Chromium, check the Ansible playbook instructions.

AWS Lambda Layer

Lambda Layers is a new convenient way to manage common dependencies between different Lambda Functions.

The following set of (Linux) commands will create a layer of this package alongside puppeteer-core:

git clone --depth=1 https://github.com/alixaxel/chrome-aws-lambda.git && \
cd chrome-aws-lambda && \
make chrome_aws_lambda.zip

The above will create a chrome_aws_lambda.zip file (named after the make target), which can be uploaded to your Layers console.

Alternatively, you can also download the layer artifact from one of our CI workflow runs.

Google Cloud Functions

Since version 1.11.2, it's also possible to use this package on Google/Firebase Cloud Functions.

According to our benchmarks, it's 40% to 50% faster than using the off-the-shelf puppeteer bundle.
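
For reference, an HTTP-triggered Cloud Function using this package might look like the sketch below. `fetchTitle`, `createTitleFunction` and the exported name `title` are hypothetical; the `(req, res)` pair is the Express-style signature that HTTP functions receive, and the launch options mirror the AWS Lambda example in the Usage section.

```javascript
// Launches Chromium, reads the page title, and always closes the browser.
async function fetchTitle(chromium, url) {
  const browser = await chromium.puppeteer.launch({
    args: chromium.args,
    defaultViewport: chromium.defaultViewport,
    executablePath: await chromium.executablePath,
    headless: chromium.headless,
  });

  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.title();
  } finally {
    await browser.close();
  }
}

// Hypothetical factory: returns an Express-style HTTP function.
function createTitleFunction(chromium) {
  return async (req, res) => {
    try {
      // The browser is already closed by the time the response is sent.
      const title = await fetchTitle(chromium, req.query.url || 'https://example.com');
      res.status(200).send(title);
    } catch (error) {
      res.status(500).send(error.message);
    }
  };
}

// In index.js (assumed entry point):
// exports.title = createTitleFunction(require('chrome-aws-lambda'));
```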

Compression

The Chromium binary is compressed using the Brotli algorithm.

This allows us to get the best compression ratio and faster decompression times.

File	Algorithm	Level	Bytes	MiB	Size reduction	Inflation time
`chromium`	-	-	136964856	130.62	-	-
`chromium.gz`	Gzip	1	51662087	49.27	62.28%	1.035s
`chromium.gz`	Gzip	2	50438352	48.10	63.17%	1.016s
`chromium.gz`	Gzip	3	49428459	47.14	63.91%	0.968s
`chromium.gz`	Gzip	4	47873978	45.66	65.05%	0.950s
`chromium.gz`	Gzip	5	46929422	44.76	65.74%	0.938s
`chromium.gz`	Gzip	6	46522529	44.37	66.03%	0.919s
`chromium.gz`	Gzip	7	46406406	44.26	66.12%	0.917s
`chromium.gz`	Gzip	8	46297917	44.15	66.20%	0.916s
`chromium.gz`	Gzip	9	46270972	44.13	66.22%	0.968s
`chromium.gz`	Zopfli	10	45089161	43.00	67.08%	0.919s
`chromium.gz`	Zopfli	20	45085868	43.00	67.08%	0.919s
`chromium.gz`	Zopfli	30	45085003	43.00	67.08%	0.925s
`chromium.gz`	Zopfli	40	45084328	43.00	67.08%	0.921s
`chromium.gz`	Zopfli	50	45084098	43.00	67.08%	0.935s
`chromium.br`	Brotli	0	55401211	52.83	59.55%	0.778s
`chromium.br`	Brotli	1	54429523	51.91	60.26%	0.757s
`chromium.br`	Brotli	2	46436126	44.28	66.10%	0.659s
`chromium.br`	Brotli	3	46122033	43.99	66.33%	0.616s
`chromium.br`	Brotli	4	45050239	42.96	67.11%	0.692s
`chromium.br`	Brotli	5	40813510	38.92	70.20%	0.598s
`chromium.br`	Brotli	6	40116951	38.26	70.71%	0.601s
`chromium.br`	Brotli	7	39302281	37.48	71.30%	0.615s
`chromium.br`	Brotli	8	39038303	37.23	71.50%	0.668s
`chromium.br`	Brotli	9	38853994	37.05	71.63%	0.673s
`chromium.br`	Brotli	10	36090087	34.42	73.65%	0.765s
`chromium.br`	Brotli	11	34820408	33.21	74.58%	0.712s
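
The percentage column in the table is the size reduction relative to the uncompressed binary, that is, 1 minus the compressed size divided by the original size. The sketch below reproduces that arithmetic and shows a Brotli round trip with Node's built-in zlib module. The function names are hypothetical, and this is not necessarily how the package inflates its own binary.

```javascript
const zlib = require('zlib');

// Size reduction in percent, rounded to two decimals:
// (1 - compressed / original) * 100.
function reductionPercent(originalBytes, compressedBytes) {
  return Number(((1 - compressedBytes / originalBytes) * 100).toFixed(2));
}

// Compresses a buffer with Brotli at the given quality level (0 to 11).
function brotliCompress(buffer, quality) {
  return zlib.brotliCompressSync(buffer, {
    params: { [zlib.constants.BROTLI_PARAM_QUALITY]: quality },
  });
}

// Inflates a Brotli-compressed buffer back to the original bytes.
function brotliInflate(buffer) {
  return zlib.brotliDecompressSync(buffer);
}
```

For instance, `reductionPercent(136964856, 34820408)` evaluates to 74.58, matching the Brotli level 11 row.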

License

MIT
