Commit
Add bwc layer for 'romanian' analyzer
The 'romanian' language analyzer has been improved in Lucene 10 in two important ways. First, the snowball stemmer has been modified to work with s-comma and t-comma characters, rather than only with the cedilla forms that were used when Romanian didn't have full Unicode support (snowballstem/snowball#177). Second, the analyzer now contains a normalization step that maps cedilla forms to their comma forms.

In order to maintain backwards compatibility with existing indices, this change moves the Lucene 9 stemmer over to the analysis module as a deprecated variant, and creates the analyzer for existing indices with the "old" stemmer and without the normalization step. New indices automatically run with the improved behaviour.
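As a rough illustration of the behaviour described above, the following Python sketch maps the cedilla forms to their comma forms and applies that mapping only for newly created indices. It is not the Lucene or Elasticsearch code: the names `normalize_romanian`, `analyze` and `created_on_lucene_10_or_later` are hypothetical, and stemming is left out entirely.

```python
# Hypothetical sketch; not the Lucene or Elasticsearch implementation.

# Legacy cedilla code points mapped to the comma-below code points.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla -> S with comma below
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T with cedilla -> T with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})


def normalize_romanian(text: str) -> str:
    """Replace cedilla forms with their comma-below equivalents."""
    return text.translate(CEDILLA_TO_COMMA)


def analyze(text: str, created_on_lucene_10_or_later: bool) -> str:
    """Lowercase the text, normalizing only for newly created indices.

    Existing indices skip the normalization step so that query-time
    analysis keeps matching the terms that were already indexed.
    """
    lowered = text.lower()
    if created_on_lucene_10_or_later:
        return normalize_romanian(lowered)
    return lowered
```

The point of the version check is that an index must be analyzed the same way at search time as it was at index time, so the improved pipeline is only safe for indices that never saw the old one.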