Commonly Used Agents in Huginn

祁凯泽
2023-12-01

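An Agent is the basic working unit in Huginn; you can think of it as a small robot sent out to do a job. Agents can be triggered at a set frequency or at scheduled times. The most commonly used Agents are the following: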

  1. Website Agent

The Website Agent scrapes a website, XML document, or JSON feed and creates Events based on the results.

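This is the scraping tool: it can parse nodes in web pages, XML documents, and JSON data.

The following is excerpted from the official documentation: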

Specify a url and select a mode for when to create Events based on the scraped data, either all, on_change, or merge (if fetching based on an Event, see below).

The url option can be a single url, or an array of urls (for example, for multiple pages with the exact same structure but different content to scrape).

The WebsiteAgent can also scrape based on incoming events.

  • Set the url_from_event option to a Liquid template to generate the url to access based on the Event. (To fetch the url in the Event’s url key, for example, set url_from_event to {{ url }}.)

  • Alternatively, set data_from_event to a Liquid template to use data directly without fetching any URL. (For example, set it to {{ html }} to use HTML contained in the html key of the incoming Event.)

  • If you specify merge for the mode option, Huginn will retain the old payload and update it with new values.
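As a rough sketch of how these options fit together, the Python snippet below assembles a Website Agent options hash and prints it as JSON, ready to paste into the agent's options editor. The `url`, `mode`, `url_from_event`, and `data_from_event` keys come from the excerpt above. The helper name `website_agent_options`, the example URLs, and the `type` and `extract` keys (including the CSS selector) are assumptions for illustration and should be checked against your Huginn version.

```python
import json

VALID_MODES = {"all", "on_change", "merge"}

def website_agent_options(url=None, url_from_event=None,
                          data_from_event=None, mode="on_change"):
    """Build a Website Agent options hash (hypothetical helper, not a Huginn API)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    sources = {
        "url": url,                        # a single url or a list of urls
        "url_from_event": url_from_event,  # Liquid template, e.g. "{{ url }}"
        "data_from_event": data_from_event,
    }
    chosen = {key: value for key, value in sources.items() if value is not None}
    if not chosen:
        raise ValueError("give url, url_from_event, or data_from_event")
    options = {
        "type": "html",  # assumed key: the kind of document to parse
        "mode": mode,
        # assumed extraction rules: one output field per entry
        "extract": {"title": {"css": "h1", "value": "string(.)"}},
    }
    options.update(chosen)
    return options

# Scheduled scrape of several pages that share the same structure.
scheduled = website_agent_options(
    url=["https://example.com/page/1", "https://example.com/page/2"])

# Event-driven scrape: take the url from the incoming Event's url key.
event_driven = website_agent_options(url_from_event="{{ url }}", mode="merge")

print(json.dumps(event_driven, indent=2))
```

With `mode` set to `merge`, the second configuration would keep the incoming payload and add the scraped fields to it, as the last bullet describes.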

  2. Phantom Js Cloud Agent

This Agent generates PhantomJs Cloud URLs that can be used to render JavaScript-heavy webpages for content extraction.

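This Agent generates PhantomJs Cloud URLs that are used to fetch the fully rendered page, but it has to be used together with a PhantomJs Cloud account.

You can apply for an account at the following site; the free tier allows 500 requests per day.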

https://dashboard.phantomjscloud.com/

The following is excerpted from the official documentation:

URLs generated by this Agent are formulated in accordance with the PhantomJs Cloud API. The generated URLs can then be supplied to a Website Agent to fetch and parse the content.
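To make the idea of a "generated URL" concrete, here is a minimal Python sketch that builds one by hand. The endpoint layout (`/api/browser/v2/<api key>/?request=<URL-encoded JSON>`), the `renderType` field, and the helper name `phantomjs_cloud_url` are assumptions recalled from the PhantomJs Cloud API rather than taken from this article, so verify them against the service's documentation before relying on them.

```python
import json
from urllib.parse import quote

# Assumed endpoint layout of the PhantomJs Cloud API; verify before use.
API_BASE = "https://phantomjscloud.com/api/browser/v2"

def phantomjs_cloud_url(api_key, page_url, render_type="html"):
    """Build a render URL (hypothetical helper approximating the Agent's output)."""
    page_request = {"url": page_url, "renderType": render_type}
    encoded = quote(json.dumps(page_request), safe="")
    return f"{API_BASE}/{api_key}/?request={encoded}"

# "YOUR_API_KEY" is a placeholder for the key shown in the dashboard.
rendered_url = phantomjs_cloud_url("YOUR_API_KEY", "https://example.com/app")
print(rendered_url)
```

A URL like this would then be handed to a Website Agent, which fetches the rendered HTML and parses it as usual.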

  3. Csv Agent

The CsvAgent parses or serializes CSV data. When parsing, events can either be emitted for the entire CSV, or one per row.

Parses CSV files.

The following is excerpted from the official documentation:

Set mode to parse to parse CSV from incoming events; when set to serialize, the agent serializes the data of events to CSV.
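The difference between the two modes, and between one event per row and one event for the whole file, can be illustrated outside Huginn with Python's standard `csv` module. This is only a conceptual sketch, not the CsvAgent's implementation; the function names and the `data` key that wraps the rows are assumptions chosen for the example.

```python
import csv
import io

def parse_csv(text, per_row=True):
    """Return a list of event payloads parsed from CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # per_row=True: one event per row; otherwise a single event holding all rows.
    return rows if per_row else [{"data": rows}]

def serialize_csv(payloads):
    """Serialize a list of flat event payloads (dicts) back into CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(payloads[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(payloads)
    return out.getvalue()

text = "name,price\napple,3\npear,5\n"
row_events = parse_csv(text)                  # two events, one per row
whole_event = parse_csv(text, per_row=False)  # one event containing both rows
round_trip = serialize_csv(row_events)        # back to CSV text
print(len(row_events), len(whole_event))
```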

  4. Trigger Agent

The Trigger Agent will watch for a specific value in an Event payload.

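Used for monitoring. It is similar to the Change Detector Agent, except that the check is quantitative: it fires only when a given condition is met.

The following is excerpted from the official documentation: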

The rules array contains a mixture of strings and hashes.

A string rule is a Liquid template and counts as a match when it expands to true.

A hash rule consists of the following keys: path, value, and type.

The path value is a dotted path through a hash in JSONPaths syntax. For simple events, this is usually just the name of the field you want, like ‘text’ for the text key of the event.
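A hash rule can be pictured as a small predicate over the event payload. The Python sketch below evaluates rules of that shape. It is a simplified model under stated assumptions: the type names `regex`, `field==value`, and `field>value` are recalled from Huginn's documentation and not listed in this article, the dotted-path lookup handles only plain nested keys rather than full JSONPath, and requiring every rule to match is assumed to be the default behaviour.

```python
import re

def lookup(payload, path):
    """Follow a dotted path such as 'stats.percent' through nested dicts."""
    value = payload
    for key in path.split("."):
        value = value[key]
    return value

def rule_matches(rule, payload):
    """Evaluate one hash rule (path, value, type) against an event payload."""
    actual = lookup(payload, rule["path"])
    expected = rule["value"]
    kind = rule["type"]
    if kind == "regex":
        return re.search(expected, str(actual)) is not None
    if kind == "field==value":
        return str(actual) == str(expected)
    if kind == "field>value":
        return float(actual) > float(expected)
    raise ValueError(f"unsupported rule type in this sketch: {kind}")

event = {"text": "disk usage high", "stats": {"percent": "93"}}
rules = [
    {"type": "regex", "path": "text", "value": "usage"},
    {"type": "field>value", "path": "stats.percent", "value": "90"},
]

# Assumption: the agent fires only when every rule matches.
triggered = all(rule_matches(rule, event) for rule in rules)
print(triggered)
```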

  5. Event Formatting Agent

The Event Formatting Agent allows you to format incoming Events, adding new fields as needed.

One of the most commonly used Agents. It modifies or transforms a field in an event.
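Conceptually, formatting an event means rendering templates against the incoming payload and storing the results as fields. The Python sketch below imitates that with a tiny `{{ key }}` substitution. It is a stand-in for Liquid, not Liquid itself, and the names `render`, `format_event`, and `instructions` are assumptions for illustration. It also assumes the original fields are kept alongside the new ones.

```python
import re

def render(template, payload):
    """Expand {{ key }} placeholders; a tiny stand-in for Liquid templates."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}",
                  lambda match: str(payload.get(match.group(1), "")),
                  template)

def format_event(payload, instructions):
    """Return a copy of the payload with each instructed field added or replaced."""
    formatted = dict(payload)
    for field, template in instructions.items():
        formatted[field] = render(template, payload)
    return formatted

incoming = {"title": "Huginn 2023", "url": "https://example.com/post"}
instructions = {"message": "New post: {{ title }} ({{url}})"}
outgoing = format_event(incoming, instructions)
print(outgoing["message"])
```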

  6. Post Agent

A Post Agent receives events from other agents (or runs periodically), merges those events with the Liquid-interpolated contents of payload, and sends the results as POST (or GET) requests to a specified url. To skip merging in the incoming event, but still send the interpolated payload, set no_merge to true.

A powerful Agent that sends HTTP POST requests.

The following is excerpted from the official documentation:

The post_url field must specify where you would like to send requests. Please include the URI scheme (http or https).

The method used can be any of get, post, put, patch, and delete.
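The merge and `no_merge` behaviour can be sketched in plain Python by preparing (not sending) a request with `urllib`. This is an illustrative model only: Liquid interpolation is skipped, the JSON body and the rule that the event's value wins on a key collision are assumptions, and `build_request` and the example URL are hypothetical names.

```python
import json
import urllib.request
from urllib.parse import urlencode

METHODS = ("get", "post", "put", "patch", "delete")

def build_request(post_url, payload, event=None, no_merge=False, method="post"):
    """Prepare, without sending, the kind of request a Post Agent would make."""
    if not post_url.startswith(("http://", "https://")):
        raise ValueError("post_url must include the URI scheme (http or https)")
    if method not in METHODS:
        raise ValueError(f"method must be one of {METHODS}")
    # Merge the incoming event into the payload unless no_merge is set.
    # Assumption: on a key collision the event's value wins.
    body = dict(payload) if no_merge or event is None else {**payload, **event}
    if method == "get":
        # GET carries the fields in the query string instead of a body.
        return urllib.request.Request(f"{post_url}?{urlencode(body)}", method="GET")
    return urllib.request.Request(
        post_url,
        data=json.dumps(body).encode("utf-8"),  # assumption: JSON-encoded body
        headers={"Content-Type": "application/json"},
        method=method.upper(),
    )

event = {"message": "build finished", "status": "ok"}
payload = {"source": "huginn"}
merged = build_request("https://example.com/hook", payload, event)
payload_only = build_request("https://example.com/hook", payload, event,
                             no_merge=True)
print(merged.get_method(), merged.data)
```

Passing either request to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be run safely.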

  7. Data Output Agent

The Data Output Agent outputs received events as either RSS or JSON. Use it to output a public or private stream of Huginn data.

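Another of the most commonly used Agents; it needs little explanation.

The following is excerpted from the official documentation: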

This Agent will output data at:

https://localhost:3000/users/1/web_requests/:id/:secret.xml

where :secret is one of the allowed secrets specified in your options and the extension can be xml or json.

You can set up multiple secrets so that you can individually authorize external systems to access your Huginn data.
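The URL pattern above can be filled in mechanically. The short Python sketch below builds one feed URL per consumer so that each external system gets its own secret. The helper name `feed_url`, the agent id `42`, and the secret values are placeholders invented for the example.

```python
def feed_url(base_url, user_id, agent_id, secret, extension="xml"):
    """Build the URL at which a Data Output Agent serves its feed."""
    if extension not in ("xml", "json"):
        raise ValueError("extension must be 'xml' or 'json'")
    base = base_url.rstrip("/")
    return f"{base}/users/{user_id}/web_requests/{agent_id}/{secret}.{extension}"

# One secret per consumer, so access can be granted or revoked individually.
secrets = {"reader-app": "secret-for-reader", "dashboard": "secret-for-dashboard"}
urls = {
    name: feed_url("https://localhost:3000", 1, 42, secret, "json")
    for name, secret in secrets.items()
}
for name, url in urls.items():
    print(name, url)
```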