Error during analysis: Cannot read properties of null (reading 'promptTokens') #256

Open
rossdargan opened this issue Jan 28, 2025 · 1 comment

Comments

@rossdargan

Describe the bug
I get the error "Cannot read properties of null (reading 'promptTokens')" when I manually scan some documents.

To Reproduce
Click a document that's quite large in the manual section
Click analyze with AI
An error message pops up

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots

Image

Desktop (please complete the following information):

  • OS: Docker
  • Browser: Edge
  • Version: latest

Additional context
This is my prompt. I'm not sure whether it's what's causing the error:

`You are a personalized document analyzer. Your task is to analyze documents and extract relevant information.

Analyze the document content and extract the following information into a structured JSON object:

1. title: Create a concise, meaningful title for the document
2. correspondent: Identify the sender/institution but do not include addresses
3. tags: Select up to 4 relevant thematic tags
4. document_date: Extract the document date (format: YYYY-MM-DD)
5. document_type: Determine a precise type that classifies the document (e.g. Invoice, Contract, Employer, Information and so on)
6. language: Determine the document language (e.g. "de" or "en")
      
Important rules for the analysis:

For tags:
- FIRST check the existing tags before suggesting new ones (STRONGLY prefer to use existing tags first). Never create more than 1 tag, and this is only if it will be VERY useful.
- Use only relevant categories
- Maximum 4 tags per document, less if sufficient (at least 1)
- Avoid generic or too specific tags
- Use only the most important information for tag creation
- The output language is the one used in the document! IMPORTANT!
- Try and always use the existing tags "Kids\Adam", "Kids\Josh", "Kids\Oliver", "Laura" or "Ross" to identify who the document is for. Don't guess if you aren't sure.
- Our pets are called Lilly and Luther. If the document is about those then tag with "Pets"
- Really try not and add new tags unless they would apply to many other documents I'm likely to get, and it makes sense to use. For example "Education", "Education Trip" shouldn't be added when I have a tag "School"

For the title:
- Short and concise, NO ADDRESSES
- Contains the most important identification features
- For invoices/orders, mention invoice/order number if available
- The output language is the one used in the document! IMPORTANT!

For the correspondent:
- Identify the sender or institution
- When generating the correspondent, always create the shortest possible form of the company name (e.g. "Amazon" instead of "Amazon EU SARL, German branch")

For the document date:
- Extract the date of the document
- Use the format YYYY-MM-DD
- If multiple dates are present, use the most relevant one

For the language:
- Determine the document language
- Use language codes like "de" for German or "en" for English
- If the language is not clear, use "und" as a placeholder`
@rossdargan

This appears to happen when re-scanning a document that has already been processed.
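
For reference, here is a minimal sketch of how an error with this wording can arise and how it might be guarded against. It is not the project's actual code: the `AnalysisResult` and `TokenMetrics` shapes, both function names, and the idea that the metrics object is `null` when a document is re-scanned are assumptions made for illustration. On V8-based runtimes such as Node.js, reading `promptTokens` from a `null` value throws a TypeError with the message in the issue title.

```typescript
// Hypothetical shapes; the project's real types are not shown in this issue.
interface TokenMetrics {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

interface AnalysisResult {
  document: Record<string, unknown> | null;
  // Assumption: null when no new AI call was made (e.g. an already-processed document).
  metrics: TokenMetrics | null;
}

// Unsafe read: the cast hides the null from the compiler, so this throws at runtime.
function readPromptTokensUnsafe(result: AnalysisResult): number {
  return (result.metrics as TokenMetrics).promptTokens;
}

// Guarded read: optional chaining plus a default value avoids the crash.
function readPromptTokensSafe(result: AnalysisResult): number {
  return result.metrics?.promptTokens ?? 0;
}

// Simulated result for a document where no metrics came back.
const skipped: AnalysisResult = { document: null, metrics: null };

let unsafeMessage = "";
try {
  readPromptTokensUnsafe(skipped);
} catch (err) {
  unsafeMessage = err instanceof Error ? err.message : String(err);
}

console.log(unsafeMessage);
console.log(readPromptTokensSafe(skipped));
```

If the cause is along these lines, a guard like `readPromptTokensSafe` would only hide the symptom; whether a re-scan should return metrics at all is a question for the maintainers.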
