The `Dotty.__getitem__(self, item)` method is decorated with `@lru_cache(maxsize=32)`. When applied to a method, the `lru_cache` implementation uses the `self` argument as part of the cache key, which results in at least one call to its `__hash__` method, and possibly some `__eq__` calls too. The implementation of `__hash__` for `Dotty` is `hash(str(self))`, and that `str(self)` ends up computing `str(self._data)`, which is rather expensive for a large dict; in fact, it is likely to be more expensive than running the `__getitem__` code without caching.
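To make the hidden cost visible, here is a minimal stand-in class (a sketch, not the real dotty_dict code) that counts `__hash__` calls; the `hash_calls` counter is purely for illustration:

```python
from functools import lru_cache

class Dotty:
    """Hypothetical stand-in that mimics dotty_dict's hashing scheme."""

    def __init__(self, data):
        self._data = data
        self.hash_calls = 0  # instrumentation, not in the real class

    def __str__(self):
        # Stringifies the whole underlying dict: O(size of self._data).
        return str(self._data)

    def __hash__(self):
        self.hash_calls += 1
        return hash(str(self))

    def __eq__(self, other):
        return isinstance(other, Dotty) and self._data == other._data

    @lru_cache(maxsize=32)
    def __getitem__(self, item):
        return self._data[item]

d = Dotty({'a': 1, 'b': 2})
d['a']
d['a']
print(d.hash_calls)  # at least one __hash__ call per lookup, even on cache hits
```

Every lookup builds a cache key containing `self`, and hashing that key stringifies the entire dict, so the "cached" path does linear work on each call.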
For some real-world data: in QMK firmware, the caching of `Dotty.__getitem__` actually slows down CLI commands like `qmk find -f 'split.enabled=true'` by more than 50% (25 vs. 39 seconds on my machine).
Should the `__getitem__` caching be removed entirely, or are there important cases where it provides a meaningful speedup? If the latter, maybe the caching could be made optional somehow.
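One possible shape for opt-in caching (purely a sketch; the `cache_size` argument and `_getitem_uncached` helper are hypothetical, not the real dotty_dict API). Wrapping the *bound* method also keeps `self` out of the cache key, so the expensive `__hash__` is never consulted:

```python
from functools import lru_cache

class Dotty:
    """Hypothetical sketch of opt-in, per-instance caching."""

    def __init__(self, data, cache_size=None):
        self._data = data
        if cache_size:
            # Per-instance cache keyed only by `item`, not by `self`.
            self._lookup = lru_cache(maxsize=cache_size)(self._getitem_uncached)
        else:
            self._lookup = self._getitem_uncached

    def _getitem_uncached(self, item):
        return self._data[item]

    def __getitem__(self, item):
        return self._lookup(item)

cached = Dotty({'split': {'enabled': True}}, cache_size=32)
plain = Dotty({'split': {'enabled': True}})
assert cached['split'] == plain['split'] == {'enabled': True}
```

One caveat with this shape: a per-instance `lru_cache` holds a reference to the bound method, which keeps the instance alive until the cache itself is collected, so a real implementation might prefer a plain per-instance dict or weak references.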
#45, which added that caching, claims that it's “up to 3-4 times faster”… I guess that it highly depends on the particular use case (with small dicts and lots of repeated lookups you will get completely different results than with large dicts and mostly unique lookups). So this kind of caching really needs to be tunable, but the existing API does not provide any way to reconfigure the cache, apart from accessing `Dotty.__getitem__.__wrapped__` directly.
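For reference, `functools.lru_cache` exposes the undecorated function as `__wrapped__`, so bypassing the cache today could look like the following (again using a minimal stand-in class, since the real one lives in dotty_dict):

```python
from functools import lru_cache

class Dotty:
    """Hypothetical stand-in with a cached __getitem__, as in dotty_dict."""

    def __init__(self, data):
        self._data = data

    def __hash__(self):
        return hash(str(self._data))

    def __eq__(self, other):
        return isinstance(other, Dotty) and self._data == other._data

    @lru_cache(maxsize=32)
    def __getitem__(self, item):
        return self._data[item]

d = Dotty({'x': 1})
# Normal lookup goes through the cache (and hashes self):
assert d['x'] == 1
# __wrapped__ is the original undecorated function; calling it
# skips the cache entirely, so __hash__ is never invoked:
assert Dotty.__getitem__.__wrapped__(d, 'x') == 1
```

This works as an escape hatch, but relying on `__wrapped__` ties callers to an implementation detail of the decorator, which is another argument for an explicit configuration API.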