Prevent updating Tags & Correspondents #248

Closed
winnieXY opened this issue Jan 27, 2025 · 7 comments

@winnieXY

Describe the bug
Even though I explicitly state in the prompt that I do not want the model to change the tags or the correspondent, new correspondents are created and the tags on the document are modified.

Paperless-AI currently assigns all available tags to the document.

This is my (slightly modified) prompt:

You are a personalized document analyzer. Your task is to analyze documents and extract relevant information.

Analyze the document content and extract the following information into a structured JSON object:

1. title: Create a concise, meaningful title for the document
2. correspondent: Do not mofiy the correspondent - just analyse them.
3. tags: Do not modify the tags - just analyse them.
4. document_date: Extract the document date (format: DD-MM-YYYY)
5. language: Determine the document language (e.g. "de" or "en")
      
Important rules for the analysis:

For tags:
- Just analyse tags - do not add, delete or modify tags on the document. IMPORTANT!

For the title:
- Short and concise, NO ADDRESSES
- Contains the most important identification features
- For invoices/orders, mention invoice/order number if available
- The output language is the one used in the document! IMPORTANT!

For the correspondent:
- Do not modify the correspondent! IMPORTANT!
- Do not add new correspondents. IMPORTANT!

For the document date:
- Extract the date of the document
- Use the format YYYY-MM-DD
- If multiple dates are present, use the most relevant one

For the language:
- Determine the document language
- Use language codes like "de" for German or "en" for English
- If the language is not clear, use "und" as a placeholder
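
Prompt rules like the ones above are only a request to the model, so they may not be followed reliably. A minimal sketch of enforcing them in code instead, assuming the model returns the JSON object described in the prompt: drop the protected keys before anything is written back to the document. The names `PROTECTED_FIELDS` and `strip_protected_fields` are hypothetical and not part of Paperless-AI, and the sample reply is made-up illustration data.

```python
import json

# Hypothetical: fields the model may report but must never change on the document.
PROTECTED_FIELDS = {"tags", "correspondent"}


def strip_protected_fields(model_reply: str, protected=PROTECTED_FIELDS) -> dict:
    """Parse the model's JSON reply and drop every protected key."""
    analysis = json.loads(model_reply)
    return {key: value for key, value in analysis.items() if key not in protected}


# Made-up sample reply in the shape the prompt asks for.
reply = json.dumps({
    "title": "Invoice 4711",
    "correspondent": "ACME GmbH",
    "tags": ["invoice", "2025"],
    "document_date": "2025-01-27",
    "language": "en",
})

update = strip_protected_fields(reply)
print(update)  # only title, document_date and language remain
```

Filtering after the model call means the result no longer depends on how well the model obeys the "do not modify" rules.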
@clusterzx
Owner

Hey Winnie,

With the next release there will be a fine tune option to set what will be created, updated or touched.
Stay tuned and thank you for your valuable inputs here.

Greetings from Cologne

@clusterzx
Owner

This issue has been marked as stale due to inactivity. Please respond to keep it open.

clusterzx added the stale label Feb 4, 2025
@dadino

dadino commented Feb 4, 2025

> Hey Winnie,
>
> With the next release there will be a fine tune option to set what will be created, updated or touched. Stay tuned and thank you for your valuable inputs here.
>
> Greetings from Cologne

Is the "update" present? I can only choose to set the fields or not, I can't find the option to only create missing fields, but disallow updates of existing ones

@clusterzx
Owner

> > Hey Winnie,
> > With the next release there will be a fine tune option to set what will be created, updated or touched. Stay tuned and thank you for your valuable inputs here.
> > Greetings from Cologne
>
> Is the "update" present? I can only choose to set the fields or not, I can't find the option to only create missing fields, but disallow updates of existing ones

That's not what the issue was about. Can you explain more precisely what you want?

@clusterzx
Owner

@winnieXY your wish is fulfilled.

@dadino

dadino commented Feb 4, 2025

> > > Hey Winnie,
> > > With the next release there will be a fine tune option to set what will be created, updated or touched. Stay tuned and thank you for your valuable inputs here.
> > > Greetings from Cologne
> >
> > Is the "update" present? I can only choose to set the fields or not, I can't find the option to only create missing fields, but disallow updates of existing ones
>
> That's not what the issue was about. Can you explain more precisely what you want?

A selector for each of the fields (title, tags, etc) with: "never create", "always create" and "create only if not already existing". This way when I have rules to auto-add tags to documents or when I upload a file with the correct name, PaperlessAI won't override the previously set value for that field.

EDIT: right now I've only set PaperlessAI to analyze files uploaded via the "consume" directory, which I'm sure have no correct title.
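
The selector described above could be sketched roughly like this. It is only an illustration of the requested behaviour, not how Paperless-AI works: `FieldPolicy`, `merge_fields` and the sample values are all hypothetical, and treating unlisted fields as "never create" is an assumption made for safety.

```python
from enum import Enum


class FieldPolicy(Enum):
    """Hypothetical per-field setting, named after the three options requested."""
    NEVER = "never create"
    ALWAYS = "always create"
    IF_MISSING = "create only if not already existing"


def merge_fields(existing: dict, suggested: dict, policies: dict) -> dict:
    """Return the document fields after applying each field's policy."""
    result = dict(existing)
    for field, value in suggested.items():
        # Assumption: a field without an explicit policy is never touched.
        policy = policies.get(field, FieldPolicy.NEVER)
        if policy is FieldPolicy.ALWAYS:
            result[field] = value
        elif policy is FieldPolicy.IF_MISSING and not existing.get(field):
            # Only fill the field when it is absent, None, or empty.
            result[field] = value
        # FieldPolicy.NEVER: keep whatever the document already has.
    return result


# Made-up sample data: a document with a rule-assigned tag but no correspondent.
existing = {"title": "scan_0001.pdf", "tags": ["tax"], "correspondent": None}
suggested = {"title": "Invoice 4711", "tags": ["invoice"], "correspondent": "ACME GmbH"}
policies = {
    "title": FieldPolicy.ALWAYS,
    "tags": FieldPolicy.IF_MISSING,
    "correspondent": FieldPolicy.IF_MISSING,
}

merged = merge_fields(existing, suggested, policies)
print(merged)
```

With these policies the title is replaced, the existing tag set by a rule is kept, and the missing correspondent is filled in.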

@clusterzx
Owner

> > > > Hey Winnie,
> > > > With the next release there will be a fine tune option to set what will be created, updated or touched. Stay tuned and thank you for your valuable inputs here.
> > > > Greetings from Cologne
> > >
> > > Is the "update" present? I can only choose to set the fields or not, I can't find the option to only create missing fields, but disallow updates of existing ones
> >
> > That's not what the issue was about. Can you explain more precisely what you want?
>
> A selector for each of the fields (title, tags, etc) with: "never create", "always create" and "create only if not already existing". This way when I have rules to auto-add tags to documents or when I upload a file with the correct name, PaperlessAI won't override the previously set value for that field.
>
> EDIT: right now I've only set PaperlessAI to analyze files uploaded via the "consume" directory, which I'm sure have no correct title.

That is not planned for now, sorry.
