modify scripts
63
docker/paperless/plugins/paperless.txt
Normal file
@@ -0,0 +1,63 @@
The file I provided contains the key tables of the paperless SQLite database. We now need to write its PAPERLESS_POST_CONSUME_SCRIPT. The requirements are as follows:

1. The PDF files we provide are named in the format {publish_date}_{report_type}_{org_sname}_{industry_name}_{stock_name}_{title}.pdf
2. Extract each of the fields above, then:
1) report_type maps to documents_documenttype.name, so query the documents_documenttype table; if no row with that name exists, insert one; then obtain the corresponding documents_documenttype.id
2) org_sname maps to documents_correspondent.name, so query the documents_correspondent table; if no row with that name exists, insert one; then obtain the corresponding documents_correspondent.id
3) Check whether the documents_customfield table contains the fields '行业' (industry) and '股票名称' (stock name); create whichever is missing. Look up their documents_customfield.id values: call the id of '股票名称' stockname_id and the id of '行业' industry_id
3. Then update the tables:
1) Update the corresponding row in documents_document: created = publish_date, correspondent_id = documents_correspondent.id, document_type_id = documents_documenttype.id, title = {title}
2) Insert two rows into documents_customfieldinstance: (document_id, stockname_id, stock_name) and (document_id, industry_id, industry_name)

Please write this Python script based on the requirements above. Handle error cases and emit log output. If a filename does not match the format above, skip it without processing.
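As a starting point for step 1 and the "skip if it does not match" rule, the filename parsing could be sketched as below. The names ReportName and parse_report_filename are made up for illustration, and two things are assumptions rather than part of the requirements: that publish_date is written as YYYY-MM-DD or YYYYMMDD, and that only the title (the last field) may itself contain underscores.

```python
import logging
from datetime import date, datetime
from pathlib import Path
from typing import NamedTuple, Optional

log = logging.getLogger("post_consume")


class ReportName(NamedTuple):
    """Fields encoded in the report filename (hypothetical helper type)."""
    publish_date: date
    report_type: str
    org_sname: str
    industry_name: str
    stock_name: str
    title: str


# Assumption: the requirements do not state the date format.
_DATE_FORMATS = ("%Y-%m-%d", "%Y%m%d")


def parse_report_filename(filename: str) -> Optional[ReportName]:
    """Return the parsed fields, or None if the name does not match."""
    path = Path(filename)
    if path.suffix.lower() != ".pdf":
        log.info("skipping %s: not a .pdf file", filename)
        return None

    # Split into at most six parts so underscores inside the title survive.
    parts = [p.strip() for p in path.stem.split("_", 5)]
    if len(parts) != 6 or not all(parts):
        log.info("skipping %s: expected six non-empty fields", filename)
        return None

    publish_date = None
    for fmt in _DATE_FORMATS:
        try:
            publish_date = datetime.strptime(parts[0], fmt).date()
            break
        except ValueError:
            continue
    if publish_date is None:
        log.info("skipping %s: unrecognized publish_date %r", filename, parts[0])
        return None

    return ReportName(publish_date, *parts[1:])
```

In a post-consume script the filename would typically be read from the environment that Paperless passes to the script; check the variable names (for example one carrying the original filename and one carrying the document id) against the documentation of the installed version.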
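Steps 2 and 3 of the requirements amount to a "look up by name, insert if missing" helper plus one transactional update. The sketch below shows that shape only. It assumes simplified tables: the real Paperless tables have additional required columns that the INSERT statements would have to populate, and the column names field_id and value_text on documents_customfieldinstance are assumptions to verify against the actual schema. The function names are hypothetical.

```python
import logging
import sqlite3

log = logging.getLogger("post_consume")

# Table names cannot be bound as SQL parameters, so restrict them to a fixed set.
_NAME_TABLES = {
    "documents_documenttype",
    "documents_correspondent",
    "documents_customfield",
}


def get_or_create_id(conn: sqlite3.Connection, table: str, name: str) -> int:
    """Return the id of the row with this name, inserting it if absent."""
    if table not in _NAME_TABLES:
        raise ValueError(f"unexpected table: {table}")
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0]
    # Assumption: 'name' is the only column that must be supplied.
    cur = conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,))
    log.info("created %s row %r with id %s", table, name, cur.lastrowid)
    return cur.lastrowid


def apply_metadata(conn, document_id, *, created, title, report_type,
                   org_sname, stock_name, industry_name) -> bool:
    """Apply all updates in one transaction; return False on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            type_id = get_or_create_id(conn, "documents_documenttype", report_type)
            corr_id = get_or_create_id(conn, "documents_correspondent", org_sname)
            stockname_id = get_or_create_id(conn, "documents_customfield", "股票名称")
            industry_id = get_or_create_id(conn, "documents_customfield", "行业")

            cur = conn.execute(
                "UPDATE documents_document SET created = ?, correspondent_id = ?, "
                "document_type_id = ?, title = ? WHERE id = ?",
                (created, corr_id, type_id, title, document_id),
            )
            if cur.rowcount != 1:
                raise LookupError(f"document {document_id} not found")

            # Assumed column names; see the note above.
            conn.executemany(
                "INSERT INTO documents_customfieldinstance "
                "(document_id, field_id, value_text) VALUES (?, ?, ?)",
                [(document_id, stockname_id, stock_name),
                 (document_id, industry_id, industry_name)],
            )
    except (sqlite3.Error, LookupError):
        log.exception("failed to update document %s", document_id)
        return False
    log.info("updated document %s", document_id)
    return True
```

Running everything inside one transaction means a failure part-way through (for example a missing document row) leaves no half-created document types or correspondents behind.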
Paperless makes use of the Django REST Framework standard API interface. It provides a browsable API for most of its endpoints, which you can inspect at http://<paperless-host>:<port>/api/. This also documents most of the available filters and ordering fields.

The API provides the following main endpoints:

/api/correspondents/: Full CRUD support.
/api/custom_fields/: Full CRUD support.
/api/documents/: Full CRUD support, except POSTing new documents. See below.
/api/document_types/: Full CRUD support.
/api/groups/: Full CRUD support.
/api/logs/: Read-only.
/api/mail_accounts/: Full CRUD support.
/api/mail_rules/: Full CRUD support.
/api/profile/: GET, PATCH.
/api/share_links/: Full CRUD support.
/api/storage_paths/: Full CRUD support.
/api/tags/: Full CRUD support.
/api/tasks/: Read-only.
/api/users/: Full CRUD support.
/api/workflows/: Full CRUD support.
/api/search/: GET, see below.

All of these endpoints except for the logging endpoint allow you to fetch (and edit and delete where appropriate) individual objects by appending their primary key to the path, e.g. /api/documents/454/.
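The URL scheme described above can be captured in a small helper. This is only an illustration: api_url and ENDPOINTS are hypothetical names, and the endpoint set is simply copied from the list above.

```python
from typing import Optional

# Endpoint names copied from the list above.
ENDPOINTS = frozenset({
    "correspondents", "custom_fields", "documents", "document_types",
    "groups", "logs", "mail_accounts", "mail_rules", "profile",
    "share_links", "storage_paths", "tags", "tasks", "users",
    "workflows", "search",
})


def api_url(base_url: str, endpoint: str, pk: Optional[int] = None) -> str:
    """Build a collection URL, or an object URL when pk is given."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    url = f"{base_url.rstrip('/')}/api/{endpoint}/"
    if pk is None:
        return url
    if endpoint == "logs":
        # The text above excludes the logging endpoint from per-object access.
        raise ValueError("the logging endpoint has no per-object URLs")
    return f"{url}{int(pk)}/"
```

For example, api_url("http://localhost:8000", "documents", 454) yields http://localhost:8000/api/documents/454/.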

The objects served by the document endpoint contain the following fields:

id: ID of the document. Read-only.
title: Title of the document.
content: Plain text content of the document.
tags: List of IDs of tags assigned to this document, or empty list.
document_type: Document type of this document, or null.
correspondent: Correspondent of this document, or null.
created: The date time at which this document was created.
created_date: The date (YYYY-MM-DD) at which this document was created. Optional. If also passed with created, this is ignored.
modified: The date at which this document was last edited in paperless. Read-only.
added: The date at which this document was added to paperless. Read-only.
archive_serial_number: The identifier of this document in a physical document archive.
original_file_name: Verbose filename of the original document. Read-only.
archived_file_name: Verbose filename of the archived document. Read-only. Null if no archived document is available.
notes: Array of notes associated with the document.
page_count: Number of pages.
set_permissions: Allows setting document permissions. Optional, write-only. See below.
custom_fields: Array of custom fields & values, specified as { field: CUSTOM_FIELD_ID, value: VALUE }
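
To make the field list concrete, here is a hedged sketch of assembling an update payload from the writable fields, including the custom_fields array shape given above. build_document_patch is a hypothetical helper, and the custom field ids in the example are placeholders.

```python
from typing import Any, Dict, Optional


def build_document_patch(
    *,
    title: Optional[str] = None,
    created_date: Optional[str] = None,       # YYYY-MM-DD
    correspondent: Optional[int] = None,      # correspondent id
    document_type: Optional[int] = None,      # document type id
    custom_fields: Optional[Dict[int, Any]] = None,  # {custom field id: value}
) -> Dict[str, Any]:
    """Collect only the fields that were actually supplied."""
    payload: Dict[str, Any] = {}
    if title is not None:
        payload["title"] = title
    if created_date is not None:
        payload["created_date"] = created_date
    if correspondent is not None:
        payload["correspondent"] = correspondent
    if document_type is not None:
        payload["document_type"] = document_type
    if custom_fields:
        # Shape from the field list: { field: CUSTOM_FIELD_ID, value: VALUE }
        payload["custom_fields"] = [
            {"field": field_id, "value": value}
            for field_id, value in custom_fields.items()
        ]
    return payload
```

Read-only fields such as id, modified, and added are deliberately left out. The list above does not say whether custom fields omitted from the array are kept or removed on update, so that is worth checking before relying on a partial array.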

The above is the API provided by paperless. We access it at http://localhost:8000. How should I write the Python code to query and update the document with ID 19?
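
One possible answer, using only the standard library, is sketched below. It assumes token authentication with an "Authorization: Token ..." header and a placeholder token value; adjust this to however the instance is actually secured. The helper names are hypothetical, and PATCH is used so that only the supplied fields are sent.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"
API_TOKEN = "replace-with-your-token"  # assumption: token auth is enabled


def build_request(method: str, path: str,
                  payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Create an authenticated JSON request without sending it."""
    headers = {
        "Authorization": f"Token {API_TOKEN}",
        "Accept": "application/json",
    }
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}{path}", data=data, headers=headers, method=method
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        # Include the server's error body; it usually names the offending field.
        detail = exc.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"{request.get_method()} {request.full_url} "
                           f"failed with {exc.code}: {detail}") from exc


def get_document(doc_id: int) -> Dict[str, Any]:
    """Query a single document by its primary key."""
    return send(build_request("GET", f"/api/documents/{doc_id}/"))


def update_document(doc_id: int, **fields: Any) -> Dict[str, Any]:
    """Partially update a document; only the given fields are sent."""
    return send(build_request("PATCH", f"/api/documents/{doc_id}/", fields))
```

With a running instance, get_document(19) returns the document as a dict, and update_document(19, title="New title", correspondent=2) changes just those two fields; the id 2 is a placeholder for an existing correspondent.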