[hibernate-search] Introduce Hibernate Search framework and implement indexing page #6218
@matthias-ronge: a hopefully short general question: is it possible to use different indices with Hibernate Search? Currently this is possible through different values with the
The index names for the individual objects are contained in the annotations as strings. I cannot say for certain whether variables can be used here or whether these have to be hard-coded strings at compile time, but I suspect the latter. Index access is controlled via properties such as the port. You could install several index services on different ports and set the port before the program starts, or change the index data directory (as a symbolic link). Such a feature is currently not in the scope of our development.
Thank you, @matthias-ronge, for the explanation. I understand, and I did not expect the use of different Hibernate Search indices to be part of the current development. Edit: Maybe indexlayout-strategy-custom is a way to achieve this, but that is not something for now.
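On the question of variables in annotations: Java requires annotation element values to be compile-time constant expressions, so a `static final String` constant (or a concatenation of such constants) is accepted, while a value computed at runtime is not. The sketch below illustrates this with a locally declared stand-in for `@Indexed`, so it compiles without Hibernate Search on the classpath. The `IndexNames` holder and the `Folder` class are hypothetical and not taken from the pull request.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in for Hibernate Search's @Indexed, declared locally (assumption:
// only the "index" element matters for this illustration).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Indexed {
    String index() default "";
}

// Hypothetical holder for index names; not part of the pull request.
final class IndexNames {
    static final String PREFIX = "kitodo-";
    // Concatenating constants is still a constant expression, so this is allowed.
    static final String FOLDER = PREFIX + "folder";

    private IndexNames() { }
}

// A constant may be referenced instead of repeating the string literal.
@Indexed(index = IndexNames.FOLDER)
class Folder { }

class IndexNameSketch {
    /** Reads the index name from the annotation, or returns null if the type is not annotated. */
    static String indexNameOf(Class<?> type) {
        Indexed indexed = type.getAnnotation(Indexed.class);
        return indexed == null ? null : indexed.index();
    }

    public static void main(String[] args) {
        System.out.println(indexNameOf(Folder.class)); // prints "kitodo-folder"
        // @Indexed(index = System.getProperty("index")) would NOT compile:
        // a method call is not a compile-time constant.
    }
}
```

So the name can be centralized in a constant, but it cannot be chosen at runtime through the annotation alone. Selecting an index at runtime would need a configuration-level mechanism such as the index layout strategy mentioned above.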
Kitodo-DataManagement/src/main/java/org/kitodo/data/database/beans/Property.java
Is an @Indexed(index = "kitodo-folder") annotation missing here? The other bean files have one.
Kitodo-DataManagement/src/main/java/org/kitodo/data/database/beans/User.java
hibernate.search.enabled=true
hibernate.search.backend.hosts=localhost:9205
hibernate.search.backend.protocol=http
See my comment above on the first hibernate.properties file.
Kitodo/src/main/java/org/kitodo/production/services/index/ServerConnectionChecker.java
@@ -37,6 +37,9 @@
<property name="hibernate.connection.verifyServerCertificate">false</property>
<property name="hibernate.connection.useSSL">false</property>

<!-- Hibernate search -->
Are the other Hibernate Search parameters missing here, such as the URI, port, ... that are set in the hibernate.properties file?
I can confirm that I also notice the similarity. Needs testing.
It would be nice if the Hibernate Search properties were stored in one place (one file). If that is not possible, it would be a drawback, at least for me.
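To illustrate the "one place" idea, here is a minimal sketch that reads the three settings shown in this review from a single properties source and derives the backend address from them. The class and method names are hypothetical and not part of the pull request. The content is passed in as a string so the sketch runs on its own, whereas a real application would load the file from the classpath. The fallback values are assumptions for the sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper; not part of the pull request.
class SearchBackendSettings {

    /** Parses properties-file content (the same syntax as hibernate.properties). */
    static Properties load(String content) throws IOException {
        Properties properties = new Properties();
        properties.load(new StringReader(content));
        return properties;
    }

    /** Whether search is switched on; the default of "true" is an assumption for this sketch. */
    static boolean isEnabled(Properties properties) {
        return Boolean.parseBoolean(properties.getProperty("hibernate.search.enabled", "true"));
    }

    /** Builds protocol://host:port for the first configured host. */
    static String backendUri(Properties properties) {
        // Fallback values are assumptions for the sketch.
        String protocol = properties.getProperty("hibernate.search.backend.protocol", "http");
        String hosts = properties.getProperty("hibernate.search.backend.hosts", "localhost:9200");
        String firstHost = hosts.split(",")[0].trim();
        return protocol + "://" + firstHost;
    }

    public static void main(String[] args) throws IOException {
        Properties properties = load(
                "hibernate.search.enabled=true\n"
                + "hibernate.search.backend.hosts=localhost:9205\n"
                + "hibernate.search.backend.protocol=http\n");
        System.out.println(isEnabled(properties));  // prints "true"
        System.out.println(backendUri(properties)); // prints "http://localhost:9205"
    }
}
```

With a single source like this, a connection check and the test configurations could read the same keys instead of repeating the values in several files.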
Kitodo/src/test/java/org/kitodo/production/services/command/CommandServiceTest.java
hibernate.search.enabled=true
hibernate.search.backend.hosts=localhost:9205
hibernate.search.backend.protocol=http
See my comment on the first hibernate.properties file.
hibernate.search.enabled=true
hibernate.search.backend.hosts=localhost:9205
hibernate.search.backend.protocol=http
See my comment on the first hibernate.properties file.
@matthias-ronge I checked out your branch, and took notes of my testing experience.
Unfortunately, in this state it is not possible to do further testing. @matthias-ronge In case you have not done this yet, please test your branch with a large amount of test data. Otherwise, let me know, and I will try to figure out why pages are loading so slowly on my machine.
I tried to start the indexing. Some entities were indexed within a few seconds. The remaining entities (processes, projects, tasks, templates) stayed at 0% for at least 5 minutes. After ~10 minutes, all entities except processes and tasks were indexed at 100%. Processes and tasks have only 60 indexed entities each (of 80,000 and 4,000 respectively).
Thank you for testing and for your insights. This is not what I expected. I have not yet tested with such a large data set, so I will have to inspect it myself first. My general assumption is that the framework works reasonably well, so the cause could be something small. If I can confirm that it works for large data, I will let you know. The code does not manually create an index at startup; I also saw a delay there, but only of a few seconds. Clearly I have to check this.
I logged the SQL statements after checking out the branch. Just scrolling through the list of processes (10 per page) floods my database with queries. I have around 1000 processes in my database, and it takes very long to jump to the next 10 entries. Hundreds of requests are made for one page:
From time to time (while many smaller queries are issued as well), really complex queries are fired.
I can reproduce the error: for me it does not start with a larger database (8000 processes) either, or rather, it is still taking a while and I am waiting. I don't know why that is; it must be coming from the framework, because none of the code that I wrote is being executed at that point. I don't think it's good that it takes so long to start.
Issue #5760 2a) and 2b)
Follow-up pull request to #6209 (immediate diff)
The three numbers before the slash in “Indexed entries” represent the number of objects that Hibernate has already loaded from the database, the number of objects that have been prepared as indexable documents (JSONs), and finally the number of indexed documents.
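The three counters described above can be sketched as a small progress tracker. The names of the three `add...` methods loosely mirror the callbacks of Hibernate Search's `MassIndexingMonitor` (`entitiesLoaded`, `documentsBuilt`, `documentsAdded`), but the class below is a self-contained stand-in that does not implement that interface. The display format with " / " separators is an assumption and may differ from what the indexing page shows.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-in progress tracker; hypothetical, not the class used in the pull request.
class IndexingProgress {
    private final AtomicLong entitiesLoaded = new AtomicLong();  // loaded from the database
    private final AtomicLong documentsBuilt = new AtomicLong();  // prepared as indexable documents
    private final AtomicLong documentsAdded = new AtomicLong();  // actually written to the index
    private final long total;

    IndexingProgress(long total) {
        this.total = total;
    }

    void addEntitiesLoaded(long increment) { entitiesLoaded.addAndGet(increment); }
    void addDocumentsBuilt(long increment) { documentsBuilt.addAndGet(increment); }
    void addDocumentsAdded(long increment) { documentsAdded.addAndGet(increment); }

    /** Whole-number percentage of documents already in the index (rounded down). */
    int percent() {
        return total == 0 ? 100 : (int) (documentsAdded.get() * 100 / total);
    }

    /** Assumed display format: loaded / built / added / total. */
    String display() {
        return entitiesLoaded.get() + " / " + documentsBuilt.get() + " / "
                + documentsAdded.get() + " / " + total;
    }

    public static void main(String[] args) {
        IndexingProgress processes = new IndexingProgress(80_000);
        processes.addEntitiesLoaded(60);
        processes.addDocumentsBuilt(60);
        processes.addDocumentsAdded(60);
        System.out.println(processes.display());       // prints "60 / 60 / 60 / 80000"
        System.out.println(processes.percent() + "%"); // prints "0%"
    }
}
```

Keeping the three stages apart makes it visible where a run stalls: a loaded count that stops growing points at the database side, while a gap between built and added documents points at the index backend.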
Basic experience: Hibernate Search and lazy loading don't mix. It looks like we have to accept that. As a result, I have deactivated lazy loading wherever the number of members of a set is typically small (< 25). This affects most sets, e.g. the projects of a template, or the tasks, users or properties of a template or a process, etc. If the set can typically be large (> 1000), the elements of the set are not indexed. Example: the processes of a batch. Consideration: if the number of subelements indexed with an object is very large, the object matches almost everything (it becomes increasingly likely that it will be found with any search query), so a hit on it carries little information. Such indexing also makes the index enormously large. Therefore, it can be considered justifiable not to index these fields.
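The findability consideration can be made concrete with a toy model that does not use Hibernate Search at all. Assume, purely for illustration, that every process title consists of two words from a 100-word vocabulary and that a batch is indexed together with the title words of its processes. The share of single-word queries that hit the batch then grows with the number of processes. All names and numbers below are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of "findability"; hypothetical, unrelated to the real index layout.
class FindabilitySketch {
    static final int VOCABULARY_SIZE = 100;

    /** Hypothetical process title made of two words from the vocabulary w0..w99. */
    static String processTitle(int i) {
        return "w" + (i % VOCABULARY_SIZE) + " w" + ((i * 7) % VOCABULARY_SIZE);
    }

    /** Terms a batch would contain if the titles of its first processCount processes were indexed with it. */
    static Set<String> batchTerms(int processCount) {
        Set<String> terms = new HashSet<>();
        for (int i = 0; i < processCount; i++) {
            for (String word : processTitle(i).split(" ")) {
                terms.add(word);
            }
        }
        return terms;
    }

    /** Share of all single-word queries over the vocabulary that match the batch. */
    static double hitRate(Set<String> terms) {
        int hits = 0;
        for (int w = 0; w < VOCABULARY_SIZE; w++) {
            if (terms.contains("w" + w)) {
                hits++;
            }
        }
        return (double) hits / VOCABULARY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(hitRate(batchTerms(3)));    // prints "0.05"
        System.out.println(hitRate(batchTerms(1000))); // prints "1.0"
    }
}
```

In this model a batch with three processes matches 5% of the queries, while a batch with 1000 processes matches every query, which is the effect described above: the large batch shows up in every result list regardless of what was searched for.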