Lucene index different behavior on save / rebuild #8784

Closed
MikeKry opened this issue Mar 8, 2021 · 5 comments

@MikeKry
Contributor

MikeKry commented Mar 8, 2021

Hi,

I have a problem with Lucene, and I am not sure whether it is caused by my implementation or by the way OC updates indexes.

In short:
I needed to build a faceted search. Normally I would use Elasticsearch, but since OC already ships with Lucene.Net, I decided to give it a try (to save the time I would otherwise spend on an admin UI etc.). I found an old project that does exactly what I need but targets an older version of Lucene.Net, so I updated it => aspcodenet/MultiFacetLuceneNet#4

Problem:
Everything works perfectly when content items are created after the index is created. But for older items, it throws a null reference exception (if you want to look at the code, it is in the method GetFacetCountFromMultipleIndices and fails when creating the DISI).

Question:
I do not expect anyone to look at the code and try to fix it. Rather, I would like to know whether there is any difference between reindexing the whole index and saving a single content item.

If I save all content items one by one, it starts to work. But if I reset/rebuild the index, it is still broken.

I have tried to compare the code myself, but I do not see anything that leads me anywhere.

@Skrypt
Contributor

Skrypt commented Mar 8, 2021

First I'd need to understand what you are trying to achieve. I probably need to read more about the MultiFacetLuceneNet specifications. When we rebuild an index, it removes the index files entirely. When we do a reset, it reindexes everything without removing any files.

So the issue you are experiencing is that your content items were not added to the IndexingTask table because you enabled it after creating some content items. There is already an issue about this. We would basically need a method to reimport all current "published" or "draft" content items into the IndexingTask table. Otherwise, you can do it manually with a simple SQL query if they are missing.

-- The column list is an assumption; check it against your IndexingTask schema.
INSERT INTO IndexingTask (ContentItemId, CreatedUtc, Type)
SELECT ContentItemId, CreatedUtc, 0 FROM ContentItemIndex
WHERE ContentItemId NOT IN (SELECT ContentItemId FROM IndexingTask)
  AND (ContentItemIndex.Published = 1 OR ContentItemIndex.Latest = 1)

This is untested; I just put it together quickly.
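
One way to sanity-check a backfill query like this before touching a real database is to run it against a throwaway in-memory SQLite database. The sketch below is only an illustration: it uses an `INSERT ... SELECT` variant of the query, both tables are reduced to the columns the query touches, and the column names as well as the `make_demo_db` and `backfill_indexing_tasks` helpers are assumptions, not the actual Orchard Core schema or API.

```python
import sqlite3

# INSERT ... SELECT form of the backfill; column names are assumed.
BACKFILL_SQL = """
INSERT INTO IndexingTask (ContentItemId, CreatedUtc, Type)
SELECT DISTINCT ContentItemId, CreatedUtc, 0
FROM ContentItemIndex
WHERE ContentItemId NOT IN (SELECT ContentItemId FROM IndexingTask)
  AND (Published = 1 OR Latest = 1)
"""


def make_demo_db():
    """Build an in-memory database with simplified, assumed table shapes."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE ContentItemIndex (
            ContentItemId TEXT, CreatedUtc TEXT,
            Published INTEGER, Latest INTEGER);
        CREATE TABLE IndexingTask (
            Id INTEGER PRIMARY KEY AUTOINCREMENT,
            ContentItemId TEXT, CreatedUtc TEXT, Type INTEGER);
    """)
    db.executemany(
        "INSERT INTO ContentItemIndex VALUES (?, ?, ?, ?)",
        [
            ("old-item", "2021-01-01", 1, 1),    # created before indexing was enabled
            ("new-item", "2021-03-01", 1, 1),    # already has an indexing task
            ("stale-item", "2020-12-01", 0, 0),  # neither published nor latest
        ],
    )
    db.execute(
        "INSERT INTO IndexingTask (ContentItemId, CreatedUtc, Type) VALUES (?, ?, 0)",
        ("new-item", "2021-03-01"),
    )
    return db


def backfill_indexing_tasks(db):
    """Add a task row for every published/latest item that has none yet."""
    db.execute(BACKFILL_SQL)
    db.commit()


if __name__ == "__main__":
    db = make_demo_db()
    backfill_indexing_tasks(db)
    for row in db.execute("SELECT ContentItemId, Type FROM IndexingTask ORDER BY Id"):
        print(row)
```

Because of the `NOT IN` condition, running the backfill a second time should not add further rows, which makes it reasonably safe to retry.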

@MikeKry
Contributor Author

MikeKry commented Mar 8, 2021

@Skrypt
Thanks, I will check whether that helps. Do you know the number of the existing issue, so I can watch it in case that is the problem?

@MikeKry
Contributor Author

MikeKry commented Mar 9, 2021

@Skrypt

OK, apparently all content items are in this table correctly. The only thing I have noticed is that the correctly stored index (after manually resaving the items) has slightly fewer files than the broken one:

image

I am not sure whether that helps with anything, though.

@MikeKry
Contributor Author

MikeKry commented Mar 9, 2021

Also, the index is reported as invalid when I try to open it with Luke (but that applies to both of them).

@MikeKry
Contributor Author

MikeKry commented Mar 9, 2021

I have found my problem and solved it: the faceted search failed when documents were split across different segments.
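
For readers wondering why segments matter here: a Lucene index consists of segments, each numbering its documents from zero, and a per-segment filter can yield no document set at all for a segment without matches. The following is a toy Python model of facet counting that tolerates both. It is not Lucene.Net code and not necessarily the exact fix made in MultiFacetLuceneNet; `Segment`, `matching_docs`, and `count_facet` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Segment:
    """Toy stand-in for one index segment; doc IDs are local and start at 0."""
    doc_base: int             # offset of this segment within the whole index
    facet_values: List[str]   # facet value of each local document


def matching_docs(segment: Segment, value: str) -> Optional[Set[int]]:
    """Return local IDs of docs with the given facet value, or None if empty.

    Returning None for "no matches" mimics the per-segment contract this
    sketch assumes, so callers have to check for it.
    """
    hits = {i for i, v in enumerate(segment.facet_values) if v == value}
    return hits or None


def count_facet(segments: List[Segment], query_hits: Set[int], value: str) -> int:
    """Count query hits (global doc IDs) that carry the facet value."""
    total = 0
    for segment in segments:
        local = matching_docs(segment, value)
        if local is None:     # no doc with this value in this segment: skip it
            continue
        # Translate local IDs to global IDs before comparing with the hits.
        total += sum(1 for doc in local if segment.doc_base + doc in query_hits)
    return total


if __name__ == "__main__":
    segments = [
        Segment(doc_base=0, facet_values=["red", "blue", "red"]),
        Segment(doc_base=3, facet_values=["blue", "blue"]),
    ]
    hits = {0, 2, 3, 4}  # global IDs returned by some query
    print("red:", count_facet(segments, hits, "red"))
    print("blue:", count_facet(segments, hits, "blue"))
```

With a single segment neither the `None` check nor the `doc_base` offset makes a difference, which would be consistent with a problem that only shows up once documents end up in more than one segment.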

But there are still remaining issues:

  • documents are not added to the index if they were created before the index
  • Luke cannot open the indices (I am sure it worked before)

Closing this issue, as it is being solved in another issue, #5466.

@MikeKry MikeKry closed this as completed Mar 9, 2021