How to automatically deliver customized newsletters from RSS feeds with Platypush

傅长恨
2023-12-01


I’ve always been a supporter of well-curated newsletters. They give me an opportunity to get a good overview of what happened in the fields I follow within a span of a day, a week or a month. However, not all newsletters fit this category. Some don’t think twice before selling email addresses to third parties, and in the blink of an eye your mailbox can get flooded with messages that you didn’t request. Others may sign up your address for other services or newsletters as well, and they often don’t offer much granularity to configure which communications you want to receive.

Even in the best-case scenario, the most privacy-savvy user may still think twice before signing up for a newsletter: you’re giving your personal email address to someone you don’t necessarily trust, implying “yes, this is my address and I’m interested in this subject”. Additionally, most newsletters spice up their URLs with tracking parameters so that they can easily measure user engagement, something you may not necessarily be happy with.

Moreover, the customization junkie may also have a valid use case for a more finely tuned selection of content in their newsletter: you may want to group some sources together into the same daily or weekly email, you may be interested only in a particular subset of the subjects covered by a newsletter and filter out those that aren’t relevant, or you may want to customize the style of the digest that gets delivered. Finally, a fully automated way to deliver newsletters through five lines of code and the tuning of a couple of parameters is nirvana for many companies of every size out there.

Feed up the newsletter

Those who have read my articles in the past may know that I’m an avid consumer of RSS feeds. Despite being a 21-year-old technology, they do their job very well when it comes to delivering the information that matters without all the noise and trackers, and, being simple XML documents, they are very easy to integrate with other tools. However, in spite of all the effort I put into staying up to date with all my sources, a lot of potentially interesting content inevitably slips through. That’s where newsletters step in, as they filter and group together all the content that was generated in a given time frame and periodically deliver it to your inbox.

My ideal solution would be something that combines the best aspects of both worlds: the flexibility of an RSS subscription, a versatile way of filtering and aggregating content and sources, and the full package delivered to my door in whichever format I like (HTML, PDF, MOBI…). In this article I’m going to show how to achieve this goal with a few tools:

  • One or more sources that you want to track and that support RSS feeds (in this example I’ll use the MIT Technology Review RSS feed, but the procedure works for any RSS feed).

  • An email address.

  • Platypush to do the heavy lifting: monitor the RSS sources at custom intervals, trigger events when a source has some new content, create a digest out of the new content, and deliver the full package to a list of email addresses.

Let’s cover these points step by step.

Installing and configuring Platypush

Those who have already read my previous articles may have heard of Platypush, the automation platform I’ve been building over the past few years. For those who aren’t familiar with it, I recommend reading my first Medium post, which illustrates some of its capabilities and the paradigm behind it.

We’ll be using the http.poll backend configured with one or more RssUpdates objects to poll our RSS sources at regular intervals and create the digests, and either the mail.smtp plugin or the google.mail plugin to send the digests to our email.

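To make the polling step less abstract, here is a minimal standard-library sketch of what monitoring an RSS source boils down to: parse the XML, extract the items, and keep only the ones not seen in a previous poll. This is not Platypush's implementation; the function names, the `seen_links` set and the inline sample feed are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Inline sample so the sketch runs offline; a real poller would download the XML.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First article</title>
      <link>https://example.com/first</link>
      <description>Summary of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/second</link>
      <description>Summary of the second article.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return the <item> entries of an RSS 2.0 document as a list of dicts."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def new_items(items, seen_links):
    """Keep only the entries whose link was not seen in a previous poll (hypothetical helper)."""
    return [item for item in items if item["link"] not in seen_links]
```

Calling `parse_rss_items` on each poll and passing the result through `new_items` gives the batch of articles that a digest would be built from.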

You can install Platypush on any device where you want to run your logic: a Raspberry Pi, an old laptop, a cloud node, and so on. We will install the base package with the rss module. Optionally, you can also install it with the pdf module (if you also want to export your digests to PDF) or the google module (if you want to send the newsletter from a GMail address instead of through an SMTP server).

The first option is to install the latest stable version through pip:

The other option is to install the latest git version:

Monitoring your RSS feeds

Once the software is installed, create the configuration file ~/.config/platypush/config.yaml if it doesn't exist already and add the configuration for the RSS monitor:

You can also add more sources to the http.poll requests object, each with its own configuration. Also, you can customize the style of your digest by passing some valid CSS to these configuration attributes:

The digest_format attribute determines the output format of your digest. Choose html if you want to deliver a summary of the articles in a newsletter, or pdf if you instead want to deliver the full content of each item as an email attachment. Bonus point: since a Kindle can receive PDFs through its dedicated email address once that is configured, this mechanism also allows you to deliver the full digest of your RSS feeds to your Kindle.
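As a rough illustration of what an html digest with custom styling amounts to, the sketch below renders a list of parsed items into a small HTML page and applies CSS strings passed in by the caller. It is a simplified stand-in and not the markup Platypush generates; the function name and the `body_style` and `title_style` parameters are chosen here only for the example.

```python
from html import escape


def render_digest(feed_title, items, body_style="", title_style=""):
    """Build a small HTML digest from a list of {'title', 'link', 'description'} dicts."""
    parts = [
        # The CSS strings are injected as inline style attributes.
        f'<html><body style="{escape(body_style, quote=True)}">',
        f'<h1 style="{escape(title_style, quote=True)}">{escape(feed_title)}</h1>',
    ]
    for item in items:
        # Escape every feed-provided value so stray markup cannot break the page.
        parts.append(
            f'<h2><a href="{escape(item["link"], quote=True)}">{escape(item["title"])}</a></h2>'
        )
        parts.append(f'<p>{escape(item["description"])}</p>')
    parts.append("</body></html>")
    return "\n".join(parts)
```

For instance, `render_digest("MIT Technology Review", items, body_style="font-family: serif")` returns a single string that can be written to disk or used as the body of an email.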

The RssUpdates object also provides native integration with the Mercury Parser API to automatically scrape the content of a web page - I covered some of these concepts in my past article on how to parse RSS feeds and send the PDF digest to your e-reader. The same mechanism works well for newsletters too. If you want to parse the content of the newsletter as well, all you have to do is configure the http.webpage Platypush plugin. Since the Mercury API doesn't provide a Python binding, this requires a couple of JavaScript dependencies:

Then, if you want to parse the full content of the items and generate a PDF digest out of them, change your http.poll configuration to something like this:

WARNING: Extracting the full content of the articles in an RSS feed has two limitations — a practical one and a legal one:

  • Some websites may require user login before displaying the full content of an article. Some perform such checks client-side, and the parser API can usually circumvent them, especially if the full content of an article is just hidden behind a client-side paywall. Others, however, also perform their user checks server-side before sending the content to the client, and in those cases the parser API may return only part of the content or no content at all.

  • Always keep in mind that parsing the full content of an article behind a paywall may represent a violation of intellectual property in some jurisdictions.

Configuring the mail delivery

When new content is published on a subscribed RSS feed, Platypush will generate a NewFeedEvent and should create a copy of the digest under ~/.local/share/platypush/feeds/cache/{date:time}_{feed-title}.[html|pdf]. The NewFeedEvent in particular is the entry point you need for the custom logic that sends an email to a list of addresses when new content is available.

First, configure the Platypush mail plugin you prefer. When it comes to sending emails you primarily have two options:

  • The mail.smtp plugin, if you want to send emails directly through an SMTP server. Platypush configuration:

  • The google.mail plugin, if you want to use the native GMail API to send emails. In that case, first make sure that you have the dependencies for the Platypush Google module installed:

In this case you’ll also have to create a project on the Google Developers console and download the OAuth credentials:

  • Click on “Credentials” from the context menu > OAuth Client ID.

  • Once generated, you can see your new credentials in the “OAuth 2.0 client IDs” section. Click on the “Download” icon to save them to a JSON file.

  • Copy the file to your Platypush device/server under e.g. ~/.credentials/google/client_secret.json.

  • Run the following command on the device to authorize the application:

At this point the GMail delivery is ready to be used by your Platypush automation.

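For readers curious about what an SMTP-based delivery involves under the hood, the following standard-library sketch builds an HTML email and sends it over an implicit-TLS SMTP connection. It is not the code of the mail.smtp plugin; the function names, the default port 465 and the plain-text fallback are assumptions made for this example.

```python
import smtplib
from email.message import EmailMessage


def build_html_message(sender, recipients, subject, html_body):
    """Create a multipart message with a plain-text fallback and an HTML body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    # Plain-text part for clients that cannot render HTML.
    msg.set_content("This digest is best viewed in an HTML-capable mail client.")
    # The HTML digest becomes the preferred alternative.
    msg.add_alternative(html_body, subtype="html")
    return msg


def send_message(msg, host, port=465, username=None, password=None):
    """Send the message through an SMTP server using implicit TLS (assumed setup)."""
    with smtplib.SMTP_SSL(host, port) as server:
        if username and password:
            server.login(username, password)
        server.send_message(msg)
```

Keeping message construction separate from sending makes it possible to inspect the message, or reuse it with a different transport, without opening a network connection.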

Connecting the dots

Now that both the RSS parsing logic and the mail integration are in place, we can glue them together through the NewFeedEvent event. The recommended way to configure events in Platypush is now through native Python scripts: the custom YAML-based syntax for events and procedures was becoming too cumbersome to write and maintain (although it’s still supported), and I feel that going back to a clean and simple Python API is the better option.

Create and initialize the Platypush scripts directory, if it doesn’t exist already:

Then, create a new hook on NewFeedEvent:

If you opted for the native GMail plugin you may want to go for:

If instead you want to send the digest in PDF format as an attachment:

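As a generic illustration of the attachment case, the sketch below uses only Python's standard library to wrap a PDF file into an email message. It does not go through the Platypush mail plugins; the function name and its arguments are hypothetical, and the file path is expected to point at a digest that has already been generated.

```python
from email.message import EmailMessage
from pathlib import Path


def build_pdf_message(sender, recipients, subject, pdf_path):
    """Create a message with a short text body and the PDF digest attached."""
    pdf_path = Path(pdf_path)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content("The digest is attached as a PDF.")
    # Attach the raw bytes and declare the MIME type explicitly.
    msg.add_attachment(
        pdf_path.read_bytes(),
        maintype="application",
        subtype="pdf",
        filename=pdf_path.name,
    )
    return msg
```

The resulting message can be handed to any SMTP client; the attachment keeps the original file name so that e-readers and mail clients display it sensibly.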

Finally, create your ~/.mail.list file with one destination email address per line and start platypush either from the command line or as a service. You should receive an email with the first batch of articles shortly after startup, and you’ll receive more whenever a new batch is available after the configured poll_seconds interval.
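Reading a recipients file with one address per line can be sketched as follows. The helper name is hypothetical, and skipping blank lines and lines starting with `#` is a convenience assumed for this example, not something the format above requires.

```python
from pathlib import Path


def read_recipients(path="~/.mail.list"):
    """Return the email addresses listed one per line in the given file."""
    lines = Path(path).expanduser().read_text(encoding="utf-8").splitlines()
    recipients = []
    for line in lines:
        address = line.strip()
        # Assumption: ignore empty lines and '#' comments.
        if address and not address.startswith("#"):
            recipients.append(address)
    return recipients
```

The returned list can be passed directly as the recipients of the digest email.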

Translated from: https://medium.com/swlh/how-to-automatically-deliver-customized-newsletters-from-rss-feeds-with-platypush-8c540a557fa
