Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve pre-existing index templates (#1900). #1912

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fressi-elastic
Copy link
Contributor

Preserve pre-existing index templates (#1900).

When opening an EsMetricsStore, it creates the index template if any of following:

  • the index template doesn't exist
  • reporting/overwrite_existing_templates option is true

It will preserve existing template on all the other cases.

@gareth-ellis gareth-ellis requested a review from a team February 5, 2025 14:02
@fressi-elastic fressi-elastic changed the title Issue/1900 Preserve pre-existing index templates (#1900). Feb 5, 2025
@gareth-ellis
Copy link
Member

Could we get an update to the docs done too, to explain this feature? Obviously we want to have it tested too before we merge

@gareth-ellis gareth-ellis added enhancement Improves the status quo highlight A substantial improvement that is worth mentioning separately in release notes labels Feb 6, 2025
@fressi-elastic
Copy link
Contributor Author

Could we get an update to the docs done too, to explain this feature? Obviously we want to have it tested too before we merge

I did tested and it works as expected in all cases except one: one the overwrite_existing_templates is false it looks like it act as it is true. I guess the code is parsing it as string instead of a boolean. I am investigating on it.

@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 5 times, most recently from 6e1cffa to 2ed976a Compare February 7, 2025 15:32
@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 4 times, most recently from 9ed48b4 to c0c373a Compare February 13, 2025 09:29
@gareth-ellis gareth-ellis requested a review from a team February 13, 2025 09:47
@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 2 times, most recently from 1fa76d8 to d1e7004 Compare February 13, 2025 14:47
@gbanasiak
Copy link
Contributor

In addition to documentation I would also try adding new modules in here to enable full scope of mypy checks:

rally/pyproject.toml

Lines 200 to 206 in 2ef33eb

[[tool.mypy.overrides]]
module = [
"esrally.mechanic.team",
"esrally.utils.modules",
"esrally.utils.io",
"esrally.utils.process",
]

Context: Until #1798, so recently on Rally timescale, we had no annotations. They were introduced very surgically to reduce the scope. Then in #1859 we introduced mypy overrides that restore full set of checks for selected modules - either new ones, or the ones we are revisiting. As we're planning to broaden annotations it would be best if all new modules were added there. Ultimately the overrides should be removed.

@gbanasiak
Copy link
Contributor

CI failure will be addressed by elastic/rally-tracks#748.

@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 4 times, most recently from 95de3dd to fef436a Compare February 23, 2025 05:27
@fressi-elastic
Copy link
Contributor Author

I enabled some more typing check in more modules (including config.py) and improved unit tests. While doing that I made some cleanup in the Config class to simplify it. Its unit test is still a mess on my opinion but I would avoid refactoring it in this change because it is out of scope and this is getting too big. I am considering splitting this PR in more parts to speed up the review process. Do you think it would help?

@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 2 times, most recently from 3ef0c5d to e07cf14 Compare February 23, 2025 05:58
@gbanasiak
Copy link
Contributor

I am considering splitting this PR in more parts to speed up the review process. Do you think it would help?

I think config.py simplification and its merge with types.py should go into a separate PR with a more representative title.

@gbanasiak
Copy link
Contributor

For example, it seems we're dropping some useful tests from types_test.py? I think making sure the list of literals is sorted, or making sure that every literal is actually present somewhere in the code are useful. But this discussion should not block index template change I think.

@fressi-elastic
Copy link
Contributor Author

fressi-elastic commented Feb 27, 2025

For example, it seems we're dropping some useful tests from types_test.py? I think making sure the list of literals is sorted, or making sure that every literal is actually present somewhere in the code are useful. But this discussion should not block index template change I think.

@gbanasiak @gareth-ellis @favilo I think using Literals for defining sections and keys in the config file has been kind of naive and incomplete solution. A better one could be adopted (like for example using dataclasses and ad hoc section importers from a generic ini file parser).

We could have for example a loader for every section implemented near the code that is going to use it (for preserving code modularity). The ini file is first converted to a dict of dict and then passed to a dataclass before being used in the code. In this way we do have regular linters protecting us from refractory errors. There are no strong reasons why not to have independent conversion (dict to object) functions for every individual section, given a common dict of dict made from the original file.

For the purpose of scopes it is enough to made up the final objects from multiple set of dictionaries (one for every scope) and in the right order to keep scope priority.

I tried to reduce the complexity of the code, but this is still a naive solution, this testing I removed is very weak and it doesn't allow easy further refractory of the code.

Could we please forget about this test module and plan a true refactor of this part of the project after this change?

@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 3 times, most recently from dba9e78 to 13a3fa4 Compare February 28, 2025 12:26
Copy link
Contributor

@gbanasiak gbanasiak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works great, thank you. I left some comments inline.

I think the same logic should apply to remaining templates - rally-results, rally-races, and rally-annotations, for consistency. WDYT?

Regarding config refactor my personal preference would be to simply avoid it in this PR due to its size and the fact the PR title does not match the content. I would simply use convert.to_bool() like in all other places (datastore.overwrite_existing_templates is not the first true/false setting in the INI file) and work on the refactor in another PR. Note we are note quite there yet, as now we should convert all the boolean config reads to not use convert.to_bool() which will extend the scope further.

I don't have strong feelings about types_test.py. I think its form is due to the context in which it came to be. There were no annotations at all and they were inserted surgically. Some of these tests were just to confirm that code edits were complete.

Comment on lines 313 to 311
if v is None or v.scope.value <= scope.value:
self._opts[section][key] = _V(value, scope)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice and important simplification. We lose configuration values with broder scope (lower numerical value), but we don't seem to need them? That's the type of thing I would prefer to see in a PR with a different title. Do we need scopes at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized the lower scope value where stored but never read. Therefore I preferred to filter out lower scope value when putting new values instead when retrieving them. This should open to many more simplifications later on. For example at this point because we are using a dictionary of dictionaries we could probably use 3rd party code for the inner implementation as it is the way typically ini file are being loaded (I.E https://docs.python.org/3/library/configparser.html).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how lower scope could be useful if we say iterated through multiple benchmarks with slightly different settings, and wanted to override settings per-benchmark, then remove this override, then work with another override. But we're not doing this today. I think I'm OK with it unless @favilo has different opinion? We can return to this discussion in the expected new incoming PR.

@fressi-elastic
Copy link
Contributor Author

fressi-elastic commented Mar 1, 2025

I am still processing pending comments. I am going to split this PR.

@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 8 times, most recently from 5ea8073 to 7eb55d2 Compare March 3, 2025 13:50
@fressi-elastic fressi-elastic force-pushed the issue/1900 branch 5 times, most recently from 8e8fbb2 to 79b3907 Compare March 4, 2025 14:03
When opening an `EsMetricsStore`, it creates the index template if any of following:
 - the index template doesn't exist
 - `reporting/datastore.overwrite_existing_templates` option is `true`

It will preserve existing template on all the other cases.
It adds a new method for getting boolean configuration options.
It logs a warning when an existing index template is being replaced.
It highlights index template differences between the existing one (if any) and the
configured one (according to rally.ini).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improves the status quo highlight A substantial improvement that is worth mentioning separately in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants