
Gather Platform

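Data collection platform

License: GPL

Languages: Java, JavaScript, HTML/CSS

Categories: Application tools, web crawlers

Software type: Open source

Region: China (domestic)

Submitted by: 澹台鸿熙

Operating system: Cross-platform

Open source organization: None

Target audience: Unknown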


Software Overview

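Gather Platform is a data collection platform built on the Webmagic core, with a web interface for configuring and managing tasks. It also serves as a lightweight search engine system. It provides the following features:

Data collection driven by configured templates
NLP processing of the collected data, including keyword extraction, summary extraction, and named-entity extraction
Customizable recurring schedule for tasks: define once, then collect automatically and unattended
Automatic detection of a web page's main text and extraction of the article's publication time, even when no collection template is configured
Dynamic field extraction and static field injection
Management of crawled data, including search, create/read/update/delete, and re-extraction with a new data template
Multiple output options: Elasticsearch, JSON text, and Redis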
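
To make the idea of "dynamic field extraction and static field injection" more concrete, here is a minimal sketch in plain Java. It is not Gather Platform's actual template format or API: the class `TemplateSketch`, the method `extract`, the regex rules, and the field names are all hypothetical, and a real template would more likely use the selectors that Webmagic provides. The sketch only assumes that a dynamic field is a value pulled out of each page by a rule, while a static field is a fixed value attached to every record.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Gather Platform's real template model.
class TemplateSketch {

    // Applies each dynamic rule (field name -> regex with one capture group)
    // to the page, then copies the static fields into the record unchanged.
    static Map<String, String> extract(String html,
                                       Map<String, Pattern> dynamicRules,
                                       Map<String, String> staticFields) {
        Map<String, String> record = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> rule : dynamicRules.entrySet()) {
            Matcher m = rule.getValue().matcher(html);
            if (m.find()) {
                // Dynamic field: the value depends on the page content.
                record.put(rule.getKey(), m.group(1).trim());
            }
        }
        // Static fields: the same value is injected into every record.
        record.putAll(staticFields);
        return record;
    }

    public static void main(String[] args) {
        String html = "<html><head><title> Sample news </title></head>"
                + "<body><span class=\"date\">2017-05-01</span></body></html>";

        Map<String, Pattern> dynamicRules = new LinkedHashMap<>();
        dynamicRules.put("title", Pattern.compile("<title>(.*?)</title>", Pattern.DOTALL));
        dynamicRules.put("publishTime", Pattern.compile("class=\"date\">(.*?)<"));

        Map<String, String> staticFields = new LinkedHashMap<>();
        staticFields.put("source", "example-site");

        System.out.println(extract(html, dynamicRules, staticFields));
    }
}
```

In the platform itself these rules are entered through the web configuration page rather than written as code, so the sketch is only meant to show what the two kinds of field mean.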

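Deployment takes about 5 minutes, and a crawler can be set up in about half a minute to start collecting data. A full-featured crawler can be built without writing any code.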

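Crawler template configuration page

Sample of crawled data

Crawler management page

Recurring task monitoring

Data search and management page

Web page information view

Related information page

Per-domain data statistics page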

For detailed deployment instructions, see the README on the project home page.

Baidu Cloud download link password: v3jm

