
paperless-ng

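License: GPL-3.0 License
Development language: Python
Category: Software development, search engines
Software type: Open source
Region: Unknown
Submitted by: 吕永嘉
Operating system: Cross-platform
Open-source organization:
Intended audience: Unknown
 Software overview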


Paperless-ng

Paperless is an application by Daniel Quinn and contributors that indexes your scanned documents and lets you easily search them and store metadata alongside them.

Paperless-ng is a fork of the original project, adding a new interface and many other changes under the hood. These key points should help you decide whether Paperless-ng is something you would prefer over Paperless:

  • Interface: The new front end is the main interface for Paperless-ng. The old interface still exists, but most customizations (such as thumbnails for the document list) have been removed.
  • Encryption: Paperless-ng does not support GnuPG anymore, since storing your data on encrypted file systems (that you optionally mount on demand) achieves about the same result.
  • Resource usage: Paperless-ng does use a bit more resources than Paperless. Running the web server requires about 300MB of RAM or more, depending on the configuration. While adding documents, it requires about 300MB additional RAM, depending on the document. It still runs on Raspberry Pi (many users do that), but it has been generally geared to better use the resources of more powerful systems.
  • API changes: If you rely on the REST API of paperless, some of its functionality has been changed.

For a detailed list of changes, have a look at the change log in the documentation, especially the section about the 0.9.0 release.

How it Works

Paperless does not control your scanner; it only helps you deal with what your scanner produces.

  1. Buy a document scanner that can write to a place on your network. If you need some inspiration, have a look at the scanner recommendations page. Set it up to "scan to FTP" or something similar. It should be able to push scanned images to a server without you having to do anything. Of course if your scanner doesn't know how to automatically upload the file somewhere, you can always do that manually. Paperless doesn't care how the documents get into its local consumption directory.

    • Alternatively, you can use any of the mobile scanning apps out there. We have an app that allows you to share documents with paperless, if you're on Android. See the section on affiliated projects below.
  2. Wait for paperless to process your files. OCR is expensive, and depending on the power of your machine, this might take a bit of time.

  3. Use the web frontend to sift through the database and find what you want.

  4. Download the PDF you need/want via the web interface and do whatever you like with it. You can even print it and send it as if it's the original. In most cases, no one will care or notice.
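
Because Paperless only cares that files show up in its consumption directory, a small script can stand in for "scan to FTP". The following is a minimal Python sketch and not part of Paperless itself: the function name `submit_to_consume_dir`, the staging directory, and every path are hypothetical, and it assumes the staging directory sits on the same filesystem as the consumption directory so that the final move is atomic.

```python
import os
import shutil
import tempfile
from pathlib import Path


def submit_to_consume_dir(scan: Path, consume_dir: Path, staging_dir: Path) -> Path:
    """Copy `scan` into `consume_dir` and return the final path.

    The file is first copied into `staging_dir` (assumed to be on the same
    filesystem as `consume_dir`) and then moved into place with os.replace,
    so a process watching `consume_dir` never sees a half-written file.
    """
    consume_dir.mkdir(parents=True, exist_ok=True)
    staging_dir.mkdir(parents=True, exist_ok=True)

    staged = staging_dir / scan.name
    shutil.copy2(scan, staged)      # slow part happens outside consume_dir

    target = consume_dir / scan.name
    os.replace(staged, target)      # single rename into the watched folder
    return target


if __name__ == "__main__":
    # Self-contained demo using throwaway directories instead of real paths.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        scan = root / "scan.pdf"
        scan.write_bytes(b"%PDF-1.4 demo")
        print(submit_to_consume_dir(scan, root / "consume", root / "staging"))
```

The original scan is left untouched, so it can be archived or deleted separately once the document appears in the web frontend.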

Here's what you get:

[Screenshot: Dashboard]

If you want to see paperless-ng in action, more screenshots are available in the documentation.

Features

  • Performs OCR on your documents, adds selectable text to image-only documents, and adds tags, correspondents and document types to your documents.
  • Supports PDF documents, images, plain text files, and Office documents (Word, Excel, Powerpoint, and LibreOffice equivalents).
    • Office document support is optional and provided by Apache Tika (see configuration)
  • Paperless stores your documents plain on disk. Filenames and folders are managed by paperless and their format can be configured freely.
  • Single page application front end.
    • Includes a dashboard that shows basic statistics and has document upload.
    • Filtering by tags, correspondents, types, and more.
    • Customizable views can be saved and displayed on the dashboard.
  • Full text search helps you find what you need.
    • Auto completion suggests relevant words from your documents.
    • Results are sorted by relevance to your search query.
    • Highlighting shows you which parts of the document matched the query.
    • Searching for similar documents ("More like this")
  • Email processing: Paperless adds documents from your email accounts.
    • Configure multiple accounts and filters for each account.
    • When adding documents from mail, paperless can move these mails to a new folder, mark them as read, flag them as important or delete them.
  • Machine learning powered document matching.
    • Paperless learns from your documents and will be able to automatically assign tags, correspondents and types to documents once you've stored a few documents in paperless.
  • Optimized for multi core systems: Paperless-ng consumes multiple documents in parallel.
  • The integrated sanity checker makes sure that your document archive is in good health.
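
The full text search is also reachable from scripts through the REST API. The sketch below is a hedged illustration using only the Python standard library. It assumes that the installed version exposes search on `/api/documents/` with a `query` parameter and accepts an `Authorization: Token ...` header; both assumptions should be checked against the API documentation for your version. The helper names `build_search_url` and `search_documents`, the host, and the token are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build the search URL (endpoint and parameter name are assumptions)."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/documents/?{params}"


def search_documents(base_url: str, token: str, query: str) -> dict:
    """Send the search request and return the decoded JSON response."""
    request = urllib.request.Request(
        build_search_url(base_url, query),
        headers={
            "Authorization": f"Token {token}",  # assumed token auth scheme
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds and prints the URL; no network request is made here.
    print(build_search_url("http://localhost:8000", "tax 2020"))
```

Calling `search_documents("http://localhost:8000", "<your token>", "tax 2020")` would perform the actual request against a running instance.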

Getting started

The recommended way to deploy paperless is docker-compose. The files in the /docker/compose directory are configured to pull the image from Docker Hub.

Read the documentation on how to get started.

Alternatively, you can install the dependencies and set up Apache and a database server yourself. The documentation has a step-by-step guide on how to do it. Consider giving the Ansible role a shot; it essentially automates the entire bare metal installation process.

Migrating from Paperless to Paperless-ng

Read the section about migration in the documentation. It's also entirely possible to go back to Paperless by reverting the database migrations.

Documentation

The documentation for Paperless-ng is available on ReadTheDocs.

Translation

Paperless is available in many different languages. Translation is coordinated at Crowdin. If you want to help out by translating paperless into your language, please head over to https://github.com/jonaswinkler/paperless-ng/issues/212 for details!

Feature Requests

Feature requests can be submitted via GitHub Discussions, where you can search for existing ideas, add your own and vote for the ones you care about! Note that some older feature requests can also be found under issues.

Questions? Something not working?

For bugs, please open an issue. If you have questions, start a discussion.

Feel like helping out?

There are still lots of things to be done; just have a look at open issues & discussions. If you feel like contributing to the project, please do! Bug fixes and improvements to the front end (I just can't seem to get some of these CSS things right) are always welcome. The documentation has some basic information on how to get started.

If you want to implement something big: Please start a discussion about that! Maybe I've already had something similar in mind and we can make it happen together. However, keep in mind that the general roadmap is to make the existing features stable and get them tested.

Affiliated Projects

Paperless has been around a while now, and people are starting to build stuff on top of it. If you're one of those people, we can add your project to this list:

  • Paperless App: An Android/iOS app for Paperless. Updated to work with paperless-ng.
  • Paperless Share. Share any files from your Android application with paperless. Very simple, but works with all of the mobile scanning apps out there that allow you to share scanned documents.
  • Scan to Paperless: Scan and prepare (crop, deskew, OCR, ...) your documents for Paperless.

These projects also exist, but their status and compatibility with paperless-ng are unknown.

  • paperless-cli: A golang command line binary to interact with a Paperless instance.

This project also exists, but needs updates to be compatible with paperless-ng.

  • Paperless Desktop: A desktop UI for your Paperless installation. Runs on Mac, Linux, and Windows. Known issues on Mac: could not load reminders and documents.

Important Note

Document scanners are typically used to scan sensitive documents: things like your social insurance number, tax records, invoices, etc. Everything is stored in the clear without encryption. This means that Paperless should never be run on an untrusted host. Instead, I recommend that if you do want to use it, you run it locally on a server in your own home.

