failing to reach ring server #1506
Comments
Please upgrade to a supported version of NodeJS, ideally latest v20.18.0, and reupload logs. However, I will note that there is clearly a network issue going on here as every single fetch fails, even those not made by homebridge-ring itself, for example, the message "Failed to connect push notification receiver", which means push-receiver could also not connect, and it also uses fetch. Something is blocking HTTPS network connections from this container. I've had one report of this with a pfSense firewall that was running Snort, which somehow identified the traffic as bad and added a dynamic blocking rule. |
@tsightler I updated to the latest 20.18 and the issue remains. I just downgraded to 12.1 and now everything works after setting up all cameras again. |
Downgrading to an old version doesn't really help to troubleshoot the problem. Older versions use an external library for making HTTP requests, because versions of Node prior to v18 did not have native fetch. Version 13 and later use native fetch, which is built into NodeJS v18 and later, but for some reason that is failing for you. The only way to solve the problem is to troubleshoot why fetch is failing for you, otherwise no future version will ever work again. Also, old versions are unlikely to stay working with ding and push notifications because the old push notification format is deprecated. I don't know when Ring will completely kill it, but it will happen at some point (at least for some people it seems to have already happened). If you are unable/unwilling to work on troubleshooting the issue, I will close it. |
@tsightler Don't get me wrong, of course I want to help solve the issue, but as a first step everything should run like before. What can I do? Please advise; I am not a specialist in networking and the like. |
@tsightler I can set up a second Docker container with only the Ring plugin for testing and evaluation, because it will have the same issue. |
@pcziesch Thanks. Can you share some additional details of your setup, for example, above you state Ubuntu 20.04.5 (pretty old), what version of Docker are you running? Does that system run as a VM, if so, what is the hypervisor host and version? Also, what is the networking configuration, what hardware, firewall/gateway, etc? |
OK, Docker is the QNAP Docker Station, version 3.0.8.981 (2024/09/13). The image is the latest oznu/homebridge image. The Homebridge container is bridged to the IP of the NAS. I hope this info helps. |
I’m having the exact same issue; everything has been down for me for a few days now… Docker 24.0.7 Networking config: AT&T fiber modem/router combo in bridge mode with an Eero router as the main network. Raspi has a static IP and no other plugins have issues. I can provide more details on the OS setup if you provide commands to run, but it’s just a basic install of HomeBridge on a raspi. |
@tsightler Today I tried another Docker Homebridge with Ubuntu 22.04.5 and Ring plugin 13.1 and had the same issue, so it does not seem to be an Ubuntu issue. After downgrading to 12.1 everything is fine. |
While I appreciate you sharing your details, you did not provide any logs, so I will not assume that you have the exact same issue. "Everything has been down" can be caused by many different issues, so you may have the exact same symptom but not the same issue, and that just causes confusion. Please post logs showing you have the exact same issue or your comment will be marked as off-topic. |
@pcziesch I've been doing some research to try to understand what might cause this issue since ring-client-api and push-receiver are just using native fetch in NodeJS so there's not really much that can be done from a code perspective as the error is coming from the fetch function call itself. Underneath the covers, native fetch in NodeJS is based on a minimal version of Undici, and there are a few similar reports on that project. The vast majority of these reports boil down to DNS or IPv6 issues, or some combination of interaction between those two. It seems that native fetch is not as robust as some of the other http clients out there, and will default to trying IPv6. If IPv6 is enabled, but not configured to actually work, it takes too long to timeout and will never fall back to IPv4, ending up with the connection error. Are you comfortable enough with the command line to get access to the running container to run some commands in it? |
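One way to test the IPv6 theory from inside Node itself is to change the order in which resolved addresses are handed back. The following is only a minimal sketch, assuming a Node version that provides `dns.setDefaultResultOrder`; `preferIPv4` is a hypothetical helper name, and nothing in this thread suggests the plugin does this itself.

```javascript
const dns = require('node:dns');

// Hypothetical helper: ask Node to return IPv4 addresses before IPv6 ones
// for every dns.lookup() call made by this process. Sockets opened by
// hostname resolve through dns.lookup() by default.
function preferIPv4() {
  dns.setDefaultResultOrder('ipv4first');
  return 'ipv4first';
}
```

The same preference can usually be set without touching code by adding `--dns-result-order=ipv4first` to the `NODE_OPTIONS` environment variable of the container. If the failures stay the same with IPv4 preferred, a broken IPv6 path becomes a less likely explanation.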
@tsightler So you want me to enter some telnet commands on the NAS, right? I will give it a try if I get proper instructions for advanced users. It may take some time in some cases because I have to read first and then try things out. |
@pcziesch I'm trying to understand your setup clearly in the hope that maybe I can somewhat reproduce this issue. You say you are using QNAP Docker Station, but then there's also an Ubuntu VM? Are you running Ubuntu VM on QNAP then Docker inside Ubuntu VM? I was under the impression that Docker Station would run the docker containers directly. |
@pcziesch Maybe another question, do you have Discord or some other way to direct message? Would you be willing to set up a time for a remote session so that I can remotely see and understand your setup and perhaps try to troubleshoot interactively? Note that I understand if you are not willing to do this, but it might save some time on both sides. I'm very interested in getting to the bottom of this issue as it just makes very little sense to me. |
@tsightler What I see is an app on the QNAP called Docker Station where you can install Docker containers like Homebridge, all in WYSIWYG fashion. There is also a VM station on the QNAP, which is not installed in my case. |
@tsightler Of course we can do an online session, and I am willing to get rid of this issue. But I am not familiar with Discord; MS Teams is available. I am on a business trip next week, so I cannot participate until the end of next week, maybe at the weekend. |
@pcziesch OK, thanks for being willing to share. I have some ideas to look into, maybe a remote session would not be needed, but it's good to have that as a possible option. I'm also travelling for work next week, so I understand the challenge there. You've provided me some clues. For example, when you are stating the Ubuntu version, I believe you are referring to the version inside the container (i.e. what is shown in Homebridge Web UI), so that is more clear now and it sounds like you are using 'host' network mode. The oznu/homebridge container is outdated so it would be good to use the official homebridge/homebridge container for all testing. I don't have a QNAP, but I can try to emulate this setup as closely as possible and see if I can figure out any way to reproduce. |
@tsightler You are right with your assumptions. I have already set up a Docker container with the official Homebridge image and one Ring device. So if there is something to test, go for it. |
HomeBridge makes it fairly difficult to copy logs when on mobile, but it's the same style of errors in logs; failed to reach Ring (on the same subscribe route as the description of the issue), fetch failed, trying again in 5 seconds. There's literally no other details in the logs aside from that error over and over again with different numbers for the path param following Any other specifics you need? |
@SpencerKaiser Yes, I need the full logs from plugin startup until the first error, nothing else is really useful. |
And redact that path param I assume? Anything else non-obvious I should redact related to ring? |
You are free to redact whatever you feel is required, personally, I don't think there is anything of concern in the paths, they are just ids (I wouldn't paste, for example, my token, but people do that sometimes too). Ring clearly doesn't see device/location IDs as sensitive information or they wouldn't be in the URL to begin with (some are even visible in the web based Ring console). If you have serious concerns you can feel free to send logs directly to my email, same username as here, but gmail. |
@pcziesch I have something I would like you to try when you get a chance. It's a bit of a long shot, but it will help me to remove one possible cause that keeps coming up in searches on this issue. Below are the steps:
Almost every single case I can find of the 'UND_ERR_CONNECT_TIMEOUT' indicates that it is an issue with IPv6 being misconfigured, perhaps something like IPv6 being available but the firewall configured to drop all IPv6 traffic, just as an example (which is a common default in some devices). I actually don't see these Ring hosts resolve to IPv6 addresses in my environment, so I'm not convinced that is the issue here, but perhaps there are cases where they do as they appear to be using AWS load-balancing services. This should make the system prefer IPv4 addresses in all cases. |
Here is the log after reboot [10/14/2024, 5:21:59 AM] [HB Supervisor] Restarting Homebridge...
[10/14/2024, 5:22:00 AM] Homebridge v1.8.4 (HAP v0.12.2) (Homebridge xxxx) is running on port 51792. NOTICE TO USERS AND PLUGIN DEVELOPERS
[10/14/2024, 5:22:13 AM] [Ring] Found the following locations: Ring camera works. After installing the 13.1 plugin version I had the same fetch failures. |
@pcziesch Sorry for not being clear, I was looking for logs of the 13.1 version. Mainly I want to see if push-receiver also fails in 13.1 after making the change, or just Ring API (basically, is there any change in behavior at all from that setting). |
@SpencerKaiser Can you tell me a bit more about your install? You state you have static IP, do you have both IPv4 and IPv6 set up? What are the DNS server settings? What specific OS is installed and what model RPi? Which docker image are you using and what instructions did you follow to install docker? I have both a spare RPi3 and RPi4, so I'm thinking I might be able to duplicate your setup more closely as I don't own a QNAP. That being said, I don't think these are the issue, I think it is something firewall related instead as the failures are happening at the TCP connection level, way lower than any code written here. |
@tsightler Sorry, I will add the log with 13.1 later today. |
@tsightler sorry for the delay, I had a crazy weekend and haven't had time to sit down with the laptop. I'll dig in this afternoon and get logs and all the environmental stuff you asked about above. Quick unexpected update: everything works again....... I didn't change anything, didn't restart, and did nothing on the HB/software side of things. I feel like there's zero chance this is relevant, but just in case: I replaced batteries in my smart lock (which was dead) and replaced the battery in my doorbell (which was ~10%) and immediately after that everything was working again 🫣 |
[10/28/2024, 6:29:14 PM] Loaded config.json with 0 accessories and 2 platforms.
[10/28/2024, 6:29:16 PM] Homebridge v1.8.5 (HAP v0.12.3) (Homebridge 8601) is running on port 51563. NOTICE TO USERS AND PLUGIN DEVELOPERS
[10/28/2024, 6:29:38 PM] [Ring] Found the following locations: |
@pcziesch, thanks for that. Did you happen to cut anything out of this log? Specifically, on the "[cause]:" line I would expect there to be a list of IP addresses that were tried. |
@tsightler Nothing was cut out except the location ID. |
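For anyone collecting similar logs, the detail behind a "fetch failed" message lives on the error's `cause` property rather than in the message itself. The helper below is an illustrative sketch only; `describeFetchError` is a hypothetical name, and which fields are present depends on the NodeJS version.

```javascript
// Walk the `cause` chain of a failed fetch() and collect what each layer reports.
// Native fetch typically throws a generic TypeError("fetch failed") and attaches
// the underlying network error (with a `code`) as `cause`.
function describeFetchError(err) {
  const lines = [];
  let current = err;
  let depth = 0;
  while (current && depth < 5) {
    const code = current.code ? ` [${current.code}]` : '';
    lines.push(`${'  '.repeat(depth)}${current.name}: ${current.message}${code}`);
    // Some Node versions report one nested error per attempted address
    // in an `errors` array; list those as well when present.
    if (Array.isArray(current.errors)) {
      for (const inner of current.errors) {
        lines.push(`${'  '.repeat(depth + 1)}${inner.code || inner.name}: ${inner.message}`);
      }
    }
    current = current.cause;
    depth += 1;
  }
  return lines.join('\n');
}
```

A possible use is `fetch(url).catch((err) => console.error(describeFetchError(err)))`, which prints the outer error on the first line and each nested cause indented below it.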
Thanks @pcziesch. Guess I need to check that, maybe it only logs IP on newer NodeJS versions. |
@pcziesch I have been doing a lot of research on this issue and, unfortunately, I have to ask a few more questions.
|
@tsightler |
@pcziesch Thanks. To clarify point 3, I wasn't asking you to change anything or go through any effort, I was just asking if you were using something additional for mesh wifi, like the Eero, as I found some documents describing issues there when they have their advanced app security features turned on. The fact that your standard installation works fine would imply that the firewall is seemingly not the issue. I found one matching issue on the NodeJS project page which was fixed in later versions of Undici, but those are not yet merged into most released versions of NodeJS (only NodeJS 22.10.0 has the fix so far). However, the fact that your container instance also can't reach registry.npmjs.org, which has nothing to do with homebridge-ring, indicates to me that something is more widely impacting network connectivity within that container, I'm just not sure what. |
@tsightler I assume the same, that something is wrong with the container app. I'll check if I can download a previous version to test. |
Yeah, I doubt that it is the container itself, my guess is something networking wise is causing it, I'm just not sure what as I'm not that familiar with the QNAP. One thing you can do is run with NODE_DEBUG=net. Just add NODE_DEBUG to your Docker environment for the container, set the value to "net" and restart the container. This will dramatically change the debug logging output, but will show the actual TCP level connection attempts. This might tell us something. |
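To get a feel for what `NODE_DEBUG=net` produces before changing the real container, the sketch below runs a throwaway child process with that variable set and captures its stderr. Everything here is illustrative: `captureNetDebug` and the small child script are hypothetical stand-ins, and the child only talks to a local loopback server rather than to Ring.

```javascript
const { spawnSync } = require('node:child_process');

// Script run in the child process: open a local TCP server, connect to it once,
// then shut down. It stands in for the real workload.
const childScript = `
  const net = require('node:net');
  const server = net.createServer((socket) => socket.end());
  server.listen(0, '127.0.0.1', () => {
    const client = net.connect(server.address().port, '127.0.0.1');
    client.on('close', () => server.close());
    client.resume();
  });
`;

// Run the child with NODE_DEBUG=net and return what it wrote to stderr,
// which is where Node prints its internal debug lines.
function captureNetDebug() {
  const result = spawnSync(process.execPath, ['-e', childScript], {
    env: { ...process.env, NODE_DEBUG: 'net' },
    encoding: 'utf8',
    timeout: 10000,
  });
  return result.stderr;
}
```

Printing the result with `console.log(captureNetDebug())` shows lines prefixed with `NET` and the process ID, the same kind of lines that would appear in the Homebridge log once the variable is set on the container.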
@tsightler a lot to read , sorry <-- Snipped logs --> |
Hi @pcziesch, thanks so much for all of your efforts here. I removed the logs from the post (saved locally) just because they were so long and make it difficult to read this thread, but they are very interesting. Initially, there are clearly a few successful connections to the Ring API, but then, after 3-4 successful connections, it appears to stop immediately after a request to prd-api-us.prd.rings.solutions, which never appears to resolve, and then no other requests after that successfully connect. What is weird is there's never even an actual connection attempt logged. I think it's because DNS stops responding (NodeJS doesn't cache any DNS requests by default so every request has to be resolved by the local DNS server). Does the QNAP have any firewall services on it? Are you using any unusual/custom DNS configuration, for example something like PiHole? I believe that QNAP allows you to set custom DNS for docker0 network. Do you have anything configured there? Could you perhaps try using something like Google DNS there (8.8.8.8 and 8.8.4.4) and see what happens? |
@tsightler You're welcome. I hope in the end we'll find a solution which may help others too. QNAP has its own firewall, the QUFirewall app, which I already uninstalled; that did not change the behaviour of the homebridge-ring plugin, so still the same issue. |
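Since every request triggers a fresh lookup, repeated resolution of the same hostname is a cheap way to see whether DNS starts failing or slowing down after a few rounds. This is a rough sketch under that assumption; `timedLookup` and `repeatLookup` are hypothetical helper names.

```javascript
const dns = require('node:dns').promises;

// Resolve a hostname through dns.lookup (the path Node's sockets use by default)
// and report how long it took and which addresses came back.
async function timedLookup(hostname) {
  const started = process.hrtime.bigint();
  try {
    const addresses = await dns.lookup(hostname, { all: true });
    const ms = Number(process.hrtime.bigint() - started) / 1e6;
    return { hostname, ok: true, ms, addresses };
  } catch (err) {
    const ms = Number(process.hrtime.bigint() - started) / 1e6;
    return { hostname, ok: false, ms, error: err.code || err.message };
  }
}

// Repeat the lookup several times in sequence and keep every result,
// so a change in behaviour between rounds is visible.
async function repeatLookup(hostname, rounds = 5) {
  const results = [];
  for (let i = 0; i < rounds; i += 1) {
    results.push(await timedLookup(hostname));
  }
  return results;
}
```

Running `repeatLookup('prd-api-us.prd.rings.solutions').then(console.log)` inside the container would show whether later rounds fail or take much longer than the first ones.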
Just for reference in case anyone else has clues here are some small snippets from the network debug logs. Initially, you can see a few connections:
The log above shows what you would expect, NodeJS issues a low-level connect request for api.ring.com, DNS resolve, the connection shows as "isConnecting" to show that it is in progress, the code attempts to connect to the address, the connection attempt is successful (status 0), read starts once socket goes into connecting state. There's ~4 of these successful connections, and then a different pattern emerges:
At this stage it never even attempts to connect, but I have no idea why. I'm assuming it could be because DNS is not able to resolve the address, but I don't know this for a fact, nor do I understand why it would work a few times and then stop working. This is a very low-level function in NodeJS itself. I'm really struggling for ideas here. |
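To separate the two suspects named here, name resolution and the TCP connection, the stages can be run one after the other with the failing stage reported explicitly. The sketch below is an assumption-laden illustration, not code from the plugin; `tcpConnect` and `checkStages` are hypothetical names, and it stops before TLS.

```javascript
const dns = require('node:dns').promises;
const net = require('node:net');

// Try a plain TCP connection to an already-resolved address, with a timeout.
function tcpConnect(address, port, timeoutMs) {
  return new Promise((resolve) => {
    const socket = net.connect({ host: address, port });
    const finish = (result) => {
      socket.destroy();
      resolve(result);
    };
    socket.setTimeout(timeoutMs, () => finish({ ok: false, error: 'timeout' }));
    socket.once('connect', () => finish({ ok: true }));
    socket.once('error', (err) => finish({ ok: false, error: err.code || err.message }));
  });
}

// Run the two stages separately so a failure can be attributed to
// name resolution or to the TCP connection itself.
async function checkStages(hostname, port = 443, timeoutMs = 5000) {
  let addresses;
  try {
    addresses = await dns.lookup(hostname, { all: true });
  } catch (err) {
    return { stage: 'dns', ok: false, error: err.code || err.message };
  }
  const attempts = [];
  for (const { address, family } of addresses) {
    attempts.push({ address, family, ...(await tcpConnect(address, port, timeoutMs)) });
  }
  return { stage: 'tcp', ok: attempts.some((a) => a.ok), attempts };
}
```

A result with `stage: 'dns'` would support the DNS theory, while `stage: 'tcp'` with failed attempts would point back at the connection path, including which address family was tried.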
@pcziesch Yet another request, while running the container with the 13.1.0 plugin in the non-working state, can you please open the terminal via the Homebridge Web UI and run the following commands, and post the results:
and then
I would expect both of these to return a 404, but any results would be helpful. Then I would ask that you run the following commands as well:
and then
These commands should produce no output if they are successful, but I'd be very interested in any errors. |
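A slightly more talkative variant of the one-line fetch test can make the outcome explicit instead of relying on silence. This is a minimal sketch, assuming a NodeJS version with native `fetch` and `AbortSignal.timeout`; `probe` is a hypothetical helper name.

```javascript
// Issue one request with native fetch and a hard timeout, and summarise the
// outcome instead of throwing. Any HTTP status (even 404) means the request
// reached a server; only a thrown error points at the network path.
async function probe(url, timeoutMs = 10000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    // Read the body so the response is fully consumed.
    await res.arrayBuffer();
    return { url, reached: true, status: res.status };
  } catch (err) {
    const cause = err.cause || {};
    return { url, reached: false, error: err.message, code: cause.code || err.name };
  }
}
```

For example, `probe('https://api.ring.com').then(console.log)` prints either the status code or the error message together with the underlying error code.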
@tsightler see output below root@xxx:/var/lib/homebridge $ wget https://api.ring.com root@xxx:/var/lib/homebridge $ wget https://prd-api-us.prd.rings.solutions root@xxx:/var/lib/homebridge $ node -e 'fetch ("https://api.ring.com");' |
@pcziesch Hmm...more interesting as that all shows success so it doesn't look like a network or firewall issue, it's like something in NodeJS itself is failing. Can you post the output of:
and
|
root@845eea546da1:/var/lib/homebridge $ uname -a |
@tsightler But why does it run with 12.1? That is the only difference. |
@pcziesch As mentioned previously, versions of the plugin <13 use an external library for HTTPS requests (specifically, an older, outdated version of the got library). This was done because, prior to NodeJS 18, NodeJS had no native fetch() implementation like the one common in browsers. However, over the years we've had plenty of similar issues with got, especially due to conflicts with older plugins that used different client libraries, etc. It was always a bit of a pain. Node introduced the native fetch() API in Node 16 behind an experimental flag, and it was available by default in Node 18 and later. Many packages had already switched to it, including some upstream dependencies like push-receiver, so we decided v13 was a good opportunity to switch as well. Ideally, it would lead to fewer conflicts since it is a native feature. So far, success is high in most cases. To give an example, the project I maintain (ring-mqtt) which depends on ring-client-api has over 10,000 installs of which >95% have been running v13 API for almost 2 months, and zero reports of this problem. However, clearly there are some subtle cases where problems occur, I just can't wrap my brain around how. The code is quite simple here, actually the commands I had you run above are basically all the code is doing, but for some reason, it worked when you ran them individually, but fails in the main code after a period of time. At this point it feels like it has to be some kind of NodeJS issue, but I've not been able to find anything that really matches in the project GitHub, although there are a few reported issues that are somewhat similar. What is more confusing is why it works for 99% of cases, and what the environmental trigger is. I just don't know. |
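When comparing a working and a non-working system, it can help to record exactly which runtime pieces are involved. The snippet below is a small sketch for that purpose; `fetchSupport` is a hypothetical helper name, and the `undici` entry may be missing on some NodeJS releases.

```javascript
// Report whether this runtime provides native fetch, and which versions of
// Node and its bundled Undici are in use.
function fetchSupport() {
  const major = Number(process.versions.node.split('.')[0]);
  return {
    node: process.versions.node,
    // Not every Node release exposes the bundled Undici version here.
    undici: process.versions.undici || 'not reported',
    hasNativeFetch: typeof globalThis.fetch === 'function',
    // Native fetch is available by default from Node 18 onward.
    expectedByVersion: major >= 18,
  };
}
```

Running `console.log(fetchSupport())` in the Homebridge terminal on both systems gives a quick side-by-side of the Node and Undici versions behind each result.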
@pcziesch I would like to request that you try NodeJS 22, which just went LTS a few days ago. You should be able to upgrade easily by running this from the terminal:
Then restart homebridge using the UI. These changes are not persistent, if you restart the container they will be lost, so make sure to only use the restart option in the UI. I'm curious if the Undici updates in NodeJS 22 might address this issue. |
@tsightler Updated to v22.11, the issue remains 🤷‍♂️ |
@pcziesch Thanks for testing. I didn't really expect it to change, but since it was easy to test, thought it would be worth the try just in case as fetch was marked fully stable in NodeJS 22. I've still got a few more things from the terminal when you get a chance.
Maybe even run this 3-4 times if you can, then a simple ping:
I'm really running out of ideas. Well, actually, I have quite a few more ideas of the potential cause, but I'd really like to figure out how to reproduce the issue because the other stuff I need to try requires modifying code to test things and that's quite hard to do without access to the system in question. |
root@xxx:/var/lib/homebridge $ netstat -ant4 --- a92f83bba7a172eaf.awsglobalaccelerator.com ping statistics --- |
@pcziesch Just so I'm 100% clear, the system you installed on standalone Ubuntu works fine with 13.1 plugin, correct? |
Yes absolutely no errors at all on Raspberry pi with 13.1 |
OK, I'm going to stop after this one:
|
root@xxx:/var/lib/homebridge $ node -e 'const res = await fetch ("https://raw.githubusercontent.com/tsi |
@pcziesch Hmm...somehow in the copy and paste the URL was partially changed, so this didn't test exactly what I was trying to test. The path is wrong and there are periods instead of dashes in the filename. I was trying to get that specific URL because I'm trying to simulate a larger response as, from the logs, I can tell that the first few requests that work have very small response sizes, but the call to ring_devices is a much longer response and it seems like it's after that point that it hangs. So the test above is pulling a small image. |
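One caveat with this kind of test is that `fetch` resolves as soon as the response headers arrive, so logging the Response object alone does not show whether a large body can be transferred. The sketch below reads the whole body and reports its size; it assumes native `fetch` and `AbortSignal.timeout` are available, and `fetchBodySize` is a hypothetical helper name.

```javascript
// Fetch a URL and read the entire body, reporting how many bytes arrived
// and how long the headers and the full transfer took.
async function fetchBodySize(url, timeoutMs = 15000) {
  const started = Date.now();
  const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
  const headersMs = Date.now() - started;
  const body = await res.arrayBuffer();
  return {
    status: res.status,
    bytes: body.byteLength,
    headersMs,
    totalMs: Date.now() - started,
  };
}
```

If `headersMs` is small but the call then stalls or times out while reading the body, that would fit the observation that things hang after the first larger response.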
@tsightler tried it again with this result root@xxx:/var/lib/homebridge $ node -e 'const res = await fetch ("https://raw.githubusercontent.com/tsightler/ring-mqtt/dev/images/ring-mqtt-logo.png");console.log(res)' |
Unfortunately, I don't think I'm going to be able to solve this issue unless someone can give me access to a system that is experiencing the problem. I know that is a big ask, but the next level of troubleshooting will require experimenting with various code tweaks to force some behaviors and there's no way I will be able to come up with one-liners for all of that. Is there any chance that someone that is having an issue can provide me remote access to a system with the problem? We can do this a couple of ways, either something like Tailscale VPN, or even just a port forward that goes to the Homebridge web UI on a test system (I have a static IP so it can be overall quite secure). I can even use a test system without an active token as I can try with my own account. |
Is there an existing issue for this?
Describe The Bug
For some days now I have noticed that my cams and all other Ring devices are not shown in my Apple Home, which was never an issue in the past. Homebridge runs on a QNAP NAS via Docker, which also hasn't changed in years. I do not have any clue why this happens now, and rebooting the NAS, Homebridge, router, etc. doesn't fix anything.
To Reproduce
No response
Expected behavior
plugin should connect to ring api server
Relevant log output
Screenshots
No response
Additional context
No response
OS
Ubuntu Focal (20.04.5 LTS)
Node.js Version
18.13.0
NPM Version
where to find ?
ring-client-api
13.1.0
Operating System
Docker