
Read out the supported and unsupported attribute of a device at configure #101

Draft: wants to merge 10 commits into base: dev

Conversation

@elupus (Contributor) commented Jul 26, 2024

  • Read out the supported and unsupported attributes of a device at configure time, so that only available attributes can be listed in the GUI.
  • Don't swallow exceptions into debug logs during cluster configuration.

@@ -624,6 +631,46 @@ async def write_attributes_safe(
f"Failed to write attribute {name}={value}: {record.status}",
)

async def _discover_attributes_all(
@elupus (Contributor Author) commented:
Wasn't sure if this should go here in ZHA or in zigpy. The cache of unsupported attributes does exist in zigpy.


async def discover_unsupported_attributes(self):
    """Discover the list of unsupported attributes from the device."""
    attribute_info = await self._discover_attributes_all()
@elupus (Contributor Author) commented:

Also, I'm not fully sure what types of errors devices might respond with here. We may want to gracefully handle some of them. For example, the GeneralCommand.Default_Response is sent by some quirks here, which seems utterly wrong.
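For context, the ZCL Discover Attributes command returns attribute ids in pages, together with a flag saying whether discovery is complete. A minimal sketch of the paging loop such a `_discover_attributes_all` helper would run; the `FakeCluster` stand-in and its method names are illustrative, not the real zigpy API:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AttributeRecord:
    attrid: int


class FakeCluster:
    """Stand-in for a ZCL cluster; the real zigpy API differs."""

    def __init__(self, attr_ids, page_size=3):
        self._ids = sorted(attr_ids)
        self._page = page_size  # device-side limit on records per response

    async def discover_attributes(self, start, count):
        remaining = [a for a in self._ids if a >= start]
        records = [AttributeRecord(a) for a in remaining[: min(count, self._page)]]
        complete = len(records) == len(remaining)
        return complete, records


async def discover_all_attribute_ids(cluster, page_size=16):
    """Page through Discover Attributes until the device reports completion."""
    found = []
    start = 0
    while True:
        complete, records = await cluster.discover_attributes(start, page_size)
        found.extend(r.attrid for r in records)
        if complete or not records:
            break
        start = records[-1].attrid + 1  # resume after the last id we saw
    return found
```

One call per page rather than one per attribute is what makes this cheaper than reading every attribute individually.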

@@ -441,6 +444,10 @@ async def async_configure(self) -> None:
        if ch_specific_cfg:
            self.debug("Performing cluster handler specific configuration")
            await ch_specific_cfg()

        self.debug("Discovering available attributes")
        await self.discover_unsupported_attributes()
A Contributor commented:

nit: the name of the method and the comment as written are opposites (available vs unsupported)

@@ -204,7 +204,7 @@ async def _execute_handler_tasks(
    results = await gather(*tasks, return_exceptions=True)
    for cluster_handler, outcome in zip(cluster_handlers, results):
        if isinstance(outcome, Exception):
-           cluster_handler.debug(
+           cluster_handler.warning(
A Contributor commented:

Unrelated change? I assume this was for quick visual scanning of the log during development?

@elupus (Contributor Author) commented:

Yes, unrelated. But valid: we should not hide exceptions of the general base type.

A Contributor commented:

In the reconfigure dialog we show the results of the calls with success/failure. I need to update the joining UI to do something similar. It has the capability to flag failures with color and messaging; it just isn't working atm.

@elupus (Contributor Author) commented:

@TheJulianJES we should never generically hide base Exception types. That hides programming errors (as I found out the hard way): a programming error caused an AttributeError that was hard to track down due to this.

It's fine to swallow known errors if we know their cause.

@elupus (Contributor Author) commented:

PS: I'll move this out separately.

@puddly (Contributor) commented Jul 26, 2024

I believe this is already handled by zigpy, no? When we read an attribute, it is recorded as unsupported if the device responds as such (https://github.com/zigpy/zigpy/blob/37f4ee65ec19f7b8af318d4ed2dc5325029020d9/zigpy/zcl/__init__.py#L553-L554). ZHA cluster handlers list out the attributes to be read and react appropriately.

Or do you mean for the frontend's cluster configuration interface? If so, this change would probably need to be moved into a special debug mode, as every extra attribute read increases the chances of initialization failing, especially for battery-powered devices.
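The zigpy behavior linked above amounts to inspecting the status of each read record and caching rejected ids. A simplified, self-contained sketch; the class names are illustrative, and only the 0x86 UNSUPPORTED_ATTRIBUTE status code comes from the ZCL spec:

```python
from enum import IntEnum


class Status(IntEnum):
    SUCCESS = 0x00
    UNSUPPORTED_ATTRIBUTE = 0x86  # ZCL status code for an unknown attribute


class ClusterCache:
    """Caches attribute values and remembers ids the device rejected."""

    def __init__(self):
        self.values = {}
        self.unsupported = set()

    def handle_read_records(self, records):
        """records: iterable of (attrid, status, value) tuples."""
        for attrid, status, value in records:
            if status == Status.SUCCESS:
                self.values[attrid] = value
            elif status == Status.UNSUPPORTED_ATTRIBUTE:
                self.unsupported.add(attrid)
```

The trade-off under discussion is that this only learns about attributes that something actually tries to read, whereas Discover Attributes asks the device up front.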

@TheJulianJES (Contributor) commented Jul 26, 2024

Some devices, like Hue motion sensors, can report an attribute as "unsupported" during and shortly after pairing. This affects the occupancy timeout attribute they use, for example. It should not be hidden.

The clusters menu is also a debug UI. Personally, I wouldn't hide any attributes. They can also become available after a firmware update. Although we could add a symbol or change the color depending on if an attribute is known available or unsupported.

Reading out all attributes on a cluster during pairing also seems like it could cause issues with battery-powered devices that go to sleep fast. We could miss the endpoints/attributes we actually care about.
EDIT: I guess this is improved by using the Discover Attributes command instead of reading them all? I'm not sure it works correctly for all devices, though.

Furthermore, quirks can define new attributes after a device is already paired. We don't reinterview the device then, so whether the added attribute is shown or not might be inconsistent depending on when a device was paired.

Lastly, many devices fail on cluster configuration, as they simply don't support some things. That's why it's a debug message at the moment. I'd keep it that way.

@dmulcahey (Contributor) commented Jul 26, 2024

> I believe this is already handled by zigpy, no? When we read an attribute, it is recorded as unsupported if the device responds as such (https://github.com/zigpy/zigpy/blob/37f4ee65ec19f7b8af318d4ed2dc5325029020d9/zigpy/zcl/__init__.py#L553-L554). ZHA cluster handlers list out the attributes to be read and react appropriately.
>
> Or do you mean for the frontend's cluster configuration interface? If so, this change would probably need to be moved into a special debug mode, as every extra attribute read increases the chances of initialization failing, especially for battery-powered devices.

Would this be a good use for the scan device functionality? We could schedule it post-join and attempt it n times until it succeeds, or we give up, and make it manually triggered in the UI?

@elupus (Contributor Author) commented Jul 26, 2024

Yes, it's mostly for the frontend. But I don't see why we read each attribute to find out if it's supported instead of asking the device what is supported.

@TheJulianJES (Contributor) commented Jul 26, 2024

> But I don't see why we read each attribute to find out if it's supported instead of asking the device what is supported.

It would probably require some extra support for quirked devices with fake ZCL attributes: the device doesn't know about the attributes that quirks expose to zigpy/ZHA.
zha-quirks tries to be "invisible" for the most part, but we can't really (and shouldn't, IMO) modify the "discover attributes" response for all quirks there, so it would require some special handling in zigpy/ZHA.

@dmulcahey (Contributor) commented:

> Yes, it's mostly for the frontend. But I don't see why we read each attribute to find out if it's supported instead of asking the device what is supported.

I agree with this. We just need to figure out where it makes sense to put it. @puddly's point on device pairing is spot on, especially for battery-powered devices.

@elupus (Contributor Author) commented Jul 26, 2024

@TheJulianJES I think it is unlikely Hue would skip reporting those attributes in the discover call, even at start. That list is very likely a static list in the firmware, so using this service is better than looking at the response value of the attribute read.

It is also much faster to use this service, since it reports many attributes in one call, compared to trying to read them individually.

We could wrap the discover call in zigpy to take quirked attributes into account to force them to be included in the list.

@puddly I could not find a place where we iterate over ALL attributes in the clusters to figure out what the device supports. Doing that would also be really bad due to the number of requests.
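A hedged sketch of what wrapping discovery to include quirk-defined attributes could look like; `FakeCluster`, its method names, and the attribute ids are illustrative, not the zigpy API:

```python
import asyncio


class FakeCluster:
    """Stand-in for a quirked cluster."""

    # ids the quirk fakes on top of the real device (illustrative values)
    quirk_attribute_ids = {0x4010, 0x4011}

    async def discover_attribute_ids(self):
        # what the physical device would report via Discover Attributes
        return [0x0000, 0x0001]


async def discover_with_quirks(cluster):
    """Union of device-reported attribute ids and quirk-defined ones."""
    device_ids = set(await cluster.discover_attribute_ids())
    quirk_ids = set(getattr(cluster, "quirk_attribute_ids", ()))
    return device_ids | quirk_ids
```

The point of the union is that quirked attributes are always treated as supported, regardless of what the physical device reports.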

@elupus (Contributor Author) commented Jul 30, 2024

Still some issues with tests. Some tests seem to be using the same mock on multiple clusters. Either that, or we are configuring the same cluster multiple times for some reason.

@dmulcahey (Contributor) commented:

https://github.com/mdeweerd/zha-toolkit/blob/main/custom_components/zha_toolkit/scan_device.py

We should implement something like this to handle this functionality in ZHA. We can potentially implement this either before or after the device has completed initial configuration and initialization. I'd suggest running it in a background task here if we want it before config / init:


or we can implement it here in a chained task if we think after makes more sense:
self._device_init_tasks[device.ieee] = init_task = self.async_create_task(
Maybe we can make the initial task set an event that allows the scan task to run next, after init is done, if we want to do it post config/init.

We also need to differentiate between a new join of a new device and a device just returning to the network (Zigbee devices can leave and rejoin the network at will). The way this is currently implemented will affect both new joins and rejoins; on rejoins there is no need to do this again. We also need to figure out how to handle rescans post firmware upgrade, where we should force a complete rediscovery and reconfiguration of the device, as signatures (clusters, attrs, etc.) often change with firmware upgrades.

Changing where this is implemented will also give us finer-grained control over logging, so we can avoid the flying-blind issue you experienced and suppress logs where we deem them not helpful.
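The "set an event so the scan runs after init" idea can be sketched with a plain asyncio.Event; all names here are illustrative placeholders for the real init and scan tasks:

```python
import asyncio


async def initialize_device(init_done: asyncio.Event, log: list):
    """Placeholder for the real config/init work."""
    log.append("initialized")
    init_done.set()  # signal that init finished


async def scan_device(init_done: asyncio.Event, log: list):
    """Runs as its own task but waits until init has completed."""
    await init_done.wait()
    log.append("scanned")


async def main():
    done = asyncio.Event()
    log = []
    # both tasks are started together; the event enforces the ordering
    await asyncio.gather(initialize_device(done, log), scan_device(done, log))
    return log
```

Because the scan task only blocks on the event, a scan failure cannot stall initialization, which is the decoupling dmulcahey argues for.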

@dmulcahey (Contributor) commented:

> I'd suggest running it in a background task here if we want it before config / init:

If we want to do this before the device is fully initialized, we may want to move this into zigpy to avoid race conditions between events. I'm not sure if this quoted suggestion would cause a race or not, but it may be possible. Would really need to poke at it to be sure.

@dmulcahey (Contributor) commented Jul 30, 2024

> Still some issues with tests. Some tests seem to be using the same mock on multiple clusters. Either that, or we are configuring the same cluster multiple times for some reason.

Debugged this... it's failing because @patch is patching request on the cluster class, not the instances, and as a result devices with identify clusters on multiple endpoints end up with more calls in the list.

Context from the docs:
> Additionally, mock provides a patch() decorator that handles patching module and class level attributes within the scope of a test, along with sentinel for creating unique objects. See the quick guide for some examples of how to use Mock, MagicMock and patch().

[screenshot omitted]

With this change disco tests pass:

[screenshot omitted]

Anyways, I think we need to discuss where this functionality belongs as outlined above.
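The class-vs-instance distinction described above is easy to reproduce with unittest.mock; the `Cluster` class here is a minimal stand-in, not the zigpy one:

```python
from unittest import mock


class Cluster:
    """Minimal stand-in for a zigpy cluster."""

    def request(self):
        return "real"


a, b = Cluster(), Cluster()

# Patching on the class replaces the method for EVERY instance,
# which is why identify clusters on multiple endpoints ended up
# sharing one mock's call list.
with mock.patch.object(Cluster, "request", return_value="patched") as shared:
    a.request()
    b.request()
    class_call_count = shared.call_count  # both calls hit the same mock

# Patching on one instance leaves the others untouched.
with mock.patch.object(a, "request", return_value="patched") as single:
    patched_result = a.request()
    untouched_result = b.request()  # still the real method
```

Patching per-instance (or asserting per-cluster mocks) keeps call counts from bleeding across endpoints.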

@elupus (Contributor Author) commented Jul 30, 2024

So I've moved this to async_initialize; as far as I can tell that is only done when we initially join. I have some tests I need to figure out. I don't think we need to chain it to any other task?

@dmulcahey (Contributor) commented:

> So I've moved this to async_initialize; as far as I can tell that is only done when we initially join. I have some tests I need to figure out. I don't think we need to chain it to any other task?

Initialize is called at startup, integration reload, and device pairing. For example: https://github.com/elupus/zha/blob/744a031a47694c426b78bd467b63b373b3066a0b/zha/application/gateway.py#L363

I think we want this completely separate from device joining, because this can fail and be retried in the background until it works, without impacting device functionality.

@elupus (Contributor Author) commented Jul 30, 2024

> So I've moved this to async_initialize; as far as I can tell that is only done when we initially join. I have some tests I need to figure out. I don't think we need to chain it to any other task?

> Initialize is called at startup, integration reload, and device pairing. For example: https://github.com/elupus/zha/blob/744a031a47694c426b78bd467b63b373b3066a0b/zha/application/gateway.py#L363
>
> I think we want this completely separate from device joining because this can fail and be run in the background until it works without impacting device functionality.

In theory it can affect function, since some entities depend on which attributes are marked as supported (i.e. the inverse of unsupported). I don't think it should be in the background; it's a fast and quick thing as soon as you have connectivity.

During pairing we should be in quick poll (are we?) on the devices, so this should not have any major effect on startup.

codecov bot commented Jul 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.66%. Comparing base (dfe5ae7) to head (c16d356).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #101      +/-   ##
==========================================
+ Coverage   95.65%   95.66%   +0.01%     
==========================================
  Files          61       61              
  Lines        9255     9283      +28     
==========================================
+ Hits         8853     8881      +28     
  Misses        402      402              


@puddly (Contributor) commented Aug 1, 2024

> During pairing we should be in quick poll (are we?) on the devices so this should not have any major effect on startup.

Fast polling can only be enabled for devices that have a Poll Control cluster. I don't remember off the top of my head how common this is, but it's definitely something that only started appearing in newer Zigbee 3.0 devices. Many devices don't support it, and once joined, they enter long polling mode and are difficult to reach.

We definitely should enable fast polling for a minute or two for any joining devices that support the cluster (and potentially every time we want to communicate with the device?). This needs to be done within zigpy, as it needs to happen as early as possible (possibly before this).

This sort of requires a communication overhaul for zigpy: we need a more reliable way to actually communicate with sleepy end devices. Right now, we rely on radio retries and it is not reliable for devices that poll infrequently.
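For reference, the Poll Control handshake (ZCL cluster 0x0020) works by the sleepy device checking in and the coordinator answering whether to start fast polling and for how long (in quarter-seconds). A self-contained sketch with the device side faked; the class and handler names are illustrative, not the zigpy API:

```python
import asyncio
from enum import IntEnum

POLL_CONTROL_CLUSTER_ID = 0x0020  # ZCL Poll Control cluster id


class CheckInResponse(IntEnum):
    STOP = 0              # let the device go back to long polling
    START_FAST_POLL = 1   # keep the device reachable for a while


class FakeDevice:
    """Stand-in for the sleepy device side of the handshake."""

    def __init__(self):
        self.fast_polling = False

    async def check_in(self, coordinator_handler):
        # The device wakes up, checks in, and obeys the coordinator's answer.
        response, timeout_qs = await coordinator_handler()
        if response == CheckInResponse.START_FAST_POLL:
            self.fast_polling = True


async def coordinator_handler():
    # Ask for ~2 minutes of fast polling (480 quarter-seconds) so queued
    # commands can be delivered while the device is reachable.
    return CheckInResponse.START_FAST_POLL, 480


async def main():
    device = FakeDevice()
    await device.check_in(coordinator_handler)
    return device.fast_polling
```

The catch puddly describes is that the coordinator can only answer a check-in the device initiates, which is why this has to be wired up as early as possible after joining.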
