On some pages it worked fine for me, but then I ran into the same problem you described with https://hudoc.echr.coe.int. To see whether there are similarities in the structure, it might be helpful to name your pages.
Hello,
I am trying to crawl a website (full domain), but nothing beyond the start page is ever crawled. In the Datasources UI I tried http and https, with and without www, and with and without a trailing slash. It never works. I would expect the crawler to follow the links found on the start page. I have no idea why it does not work as expected.
(The whole installation was done on bullseye with the "one command" method documented in https://opensemanticsearch.org/doc/admin/install/search_server/ )
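One way to narrow this down is to check whether the raw HTML of the start page contains any same-domain `<a href>` links at all. The following is a minimal sketch using only the Python standard library; the names `LinkCollector` and `same_domain_links` and the sample HTML are hypothetical, and this is not how the Open Semantic Search crawler itself is implemented. If the list comes back empty for the real page, the links are probably generated by JavaScript, which a crawler that only reads static HTML would not see.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href values of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _host(url):
    """Return the lower-cased host of a URL, ignoring a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def same_domain_links(html, base_url):
    """Return absolute http(s) links in `html` that point to the host of `base_url`."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = _host(base_url)
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        scheme = urlparse(absolute).scheme
        if scheme in ("http", "https") and _host(absolute) == base_host:
            if absolute not in links:  # keep first occurrence only
                links.append(absolute)
    return links


if __name__ == "__main__":
    # Hypothetical sample page, used instead of a real network request.
    sample = (
        '<a href="/about">About</a> '
        '<a href="https://example.org/x">External</a> '
        '<a href="mailto:someone@example.com">Mail</a>'
    )
    print(same_domain_links(sample, "https://example.com/"))
```

To run this against a real start page, the HTML could be fetched first, for example with `urllib.request.urlopen(url).read().decode("utf-8", errors="replace")`, and passed in as `html`.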