Integration doesn't always properly reconnect to the bridge #43
Comments
Thank you very much for your work! I updated it and now it connects properly. I will let you know if there are any issues, but it will probably be fine now! Again, nice job @litinoveweedle and @michaelarnauts ! |
Thank you. I will start testing the updated HACS integration today. |
For now it looks rock solid! Very nice job guys!! [edit 03-04-2024 07:40] |
Having some issues here. |
Hello all,
As you are probably aware, the fix provided solves the situation where your HA loses connection to the ComfoConnect gateway. In the past, the integration was not able to handle such a state gracefully without raising a bunch of errors and warnings. As for the root cause, I was finally able to find and fix it, also based on information provided by @michaelarnauts.
Disclaimer: I have never used the phone app with my gateway, as I had no reason to do so. Also, my gateway is not connected to a network with internet access, so there is no cloud connection. Because of this I was not aware that there is new, updated firmware, as you get the firmware update prompt while registering your gateway with the mobile application.
As I discovered, the originally shipped firmware version U1.1.10 had the following issue: My standard DHCP server has a lease time of 10 minutes. 5 minutes after getting an IP (DHCP ACK), the gateway always asked for a DHCP renewal (DHCP RENEW). At this moment, even though the gateway kept responding to ping, it immediately ceased all data communication with ANY client (both Home Assistant and the mobile app). After 5-20 seconds the ComfoConnect HA integration timed out, raised an exception and restarted communication. By that time the gateway was already able to reestablish communication, but only to repeat the whole story 5 minutes later. I validated this behavior with tcpdump and also by increasing the DHCP lease time.
I was able to solve this DHCP renew related disconnect by upgrading my gateway to firmware U1.5.1. This can be done from the mobile app. Please see the firmware release notes:
So anyone who previously had disconnect/reconnect issues should check their firmware version and also their network stability. After the firmware upgrade and the update of the integration, everything has worked perfectly for the last few days. I am also planning to simulate some network instability artificially to test the improvements in the integration, as right now my connection is rock stable. Again, huge thanks to @michaelarnauts! |
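The timing reported above (a 10-minute lease with a renewal request after 5 minutes) matches the default DHCP renewal timer T1, which RFC 2131 sets to half the lease time. Below is a minimal sketch of that arithmetic. The function names are hypothetical, and the per-day count assumes every renewal succeeds and restarts the lease, which may not hold on every network.

```python
from typing import Tuple

SECONDS_PER_DAY = 24 * 60 * 60


def dhcp_timers(lease_seconds: float) -> Tuple[float, float]:
    """Return the default (T1, T2) timers in seconds for a given lease time.

    RFC 2131 defaults: T1 (renew) = 0.5 * lease, T2 (rebind) = 0.875 * lease.
    """
    t1 = 0.5 * lease_seconds
    t2 = 0.875 * lease_seconds
    return t1, t2


def renewals_per_day(lease_seconds: float) -> int:
    """Estimate how many renewals happen per day (assumes each one succeeds)."""
    t1, _ = dhcp_timers(lease_seconds)
    return int(SECONDS_PER_DAY // t1)


if __name__ == "__main__":
    # 10-minute lease, as in the report above
    print(dhcp_timers(600))       # (300.0, 525.0)
    print(renewals_per_day(600))  # 288
```

With a 10-minute lease this would mean a few hundred short interruptions per day on the affected firmware, which is why increasing the lease time makes the problem less frequent without removing it.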
This is not stable at all for me. After one day it is no longer possible to control the ComfoAirQ. |
OK, after several more days of using the integration I had almost no issues. That is probably mostly due to removing the root cause of the instability (by the firmware upgrade). But I observed a really crazy chain of exceptions being triggered, which required an HA restart. Something went really wrong, and I did not try to trigger this; it simply happened somehow... I am attaching the filtered HA log: |
Would it be possible to describe your problem in more technical terms? Did you check your FW version? Do you have errors in the Home Assistant log? Please try to collect more information first, then let's see what we can do to help. |
My integration is running fine. Fingers crossed. Some "specs": Home Assistant (docker) Zehnder ComfoAirQ integration via HACS ComfoAir Q350 B R ST ERV Prem @litinoveweedle Hope this helps? |
v0.1.9 is still the old version. You'll need to select master to have the latest changes. |
And that unfortunately goes for me as well, mea culpa. I have just installed the 'master' branch, so please disregard my previous report. |
Here it's working flawlessly.
I added a +/- 5h log from yesterday. |
I just switched to the 'master' version. Thanks for the hint! |
I'll try to create a 0.1.10 version or something to avoid more confusion. |
I will try... :) What I see in the logs now (still on the default integration): I will switch back now to the HACS integration and try to collect some logs. Should I turn on 'enable debug logging' and collect these logs? Here are some details about my configuration: Home Assistant (Synology VM) |
I did not have the master version either, so I am now starting to test it. Edit: I noticed that when installing the master version from scratch, it is not possible to connect to the ComfoAir system (using the GUI). So I first downloaded 0.1.9 again, connected it, and after that did a 'redownload' to the master version plus a restart. |
To avoid all the confusion, I've released 0.1.10. It's easy to go back to the previous release anyway. |
The Home Assistant log is filled with errors...
Logger: homeassistant
Error doing job: Exception in callback ComfoConnect._unhold_sensors()
Logger: homeassistant
Error doing job: Task exception was never retrieved
Logger: homeassistant
Error doing job: Task exception was never retrieved
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Logger: aiocomfoconnect.bridge
Timeout while connecting to bridge 192.168.0.35
Logger: homeassistant
Error doing job: Task exception was never retrieved
Logger: homeassistant.helpers.entity
Update for select.comfoairq_balance_mode fails |
Can you please check/share/confirm the versions you use? |
I have never seen this: ConnectionRefusedError: [Errno 111] Connect call failed ('192.168.0.35', 56747) I would guess your bridge is running old firmware, or you have some odd network issues going on? |
No it is not old firmware.... Home Assistant (Synology VM) |
#9 (comment) |
Thanks! But anyway, I have no clue what could be going on. I understand this can be frustrating, but remember I'm just another user like all of you. I don't work for Zehnder and I need to carefully plan the free time I have. Feel free to experiment with network settings, cable lengths or any other changes you can try to pinpoint the cause. It might even be an idea to contact Zehnder and ask them why the bridge is disconnecting? |
What I tried today is replacing the interface cable to the LAN C module. I noticed that Zehnder has a specific recommendation on which cable to use, and the cable that I used was not the same. There might have been an instability because of that, resulting in unusual errors... Fingers crossed. |
@joshuavandermeulen would you be able to capture the traffic to/from the gateway on your HA interface using tcpdump? It would be great to see what is going on. |
No clue how to do this... I am first testing the default Home Assistant integration with the new cable. If that is stable (so far so good), I will try the HACS integration again and check the logs. |
How did you update the Q350 firmware? I'm using Q450 SI TR ERV but the latest firmware is R1.10.0 |
I have a personal Comfoconnect Cloud account to use in the App |
You can request an account with an e-mail to |
I would not expect the unit FW to have a direct impact on the issue. Regarding how to use tcpdump, it depends on the type of HA installation. As I would expect you are using Home Assistant OS, you can find out how to install it here: https://community.home-assistant.io/t/tcpdump-installation/250287/4 Tcpdump should help to observe and understand the network traffic between HA and the gateway. At least in my case it was an eye opener. If anyone needs help, we can do it together remotely. |
Is this for an installer account so you can do firmware updates? |
Yes |
Here are some details about my configuration: Home Assistant (HP ProDesk 400 G4, Proxmox, HA OS VM). As far as I can see, I have had no issues in the last two weeks, since I updated to the "master" version... I hope it helps. One more thing regarding the network: I use many VLANs. The Proxmox host has only one physical network card, so I use a trunk port on my managed switch for it. As my network expanded, I added more and more VLANs, each of which appeared as a new virtual interface in HA OS... The interesting thing was that every network interface also had a default route, which resulted in multiple "default" routes (HA OS should have only one default route, i.e. gateway). This caused many network anomalies: HA sometimes sent a packet out in one direction, where e.g. the firewall blocked it, while other times, when the packet happened to go in the right direction, it managed to reach its destination. When restarting HA, its default route sometimes changed arbitrarily... I had to manually delete the unnecessary default routes, so that the route table finally works well based on the network address of each interface. If someone else is using several VLANs in a similar way, I recommend looking into it! If you need more info about this issue and its solution, let me know, I'm happy to give more details. |
All problems with my installation were resolved by replacing the cable between my ComfoConnect LAN C module and the Zehnder ventilation system with a single core cable (instead of a multi core one). There must have been connection issues between the two devices... It has been stable for 2 weeks now. |
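A minimal sketch of how the duplicate default routes described above could be spotted. It assumes the routing table was captured as text from `ip -4 route show` (iproute2). The function names and the sample addresses and interface names are hypothetical and only for illustration.

```python
from typing import Dict, List, Optional


def _value_after(fields: List[str], key: str) -> Optional[str]:
    """Return the token that follows `key` in a route line, if present."""
    if key in fields:
        idx = fields.index(key)
        if idx + 1 < len(fields):
            return fields[idx + 1]
    return None


def default_routes(route_table: str) -> List[Dict[str, Optional[str]]]:
    """Collect gateway and interface for every 'default' line in the table."""
    routes = []
    for line in route_table.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        routes.append(
            {"via": _value_after(fields, "via"), "dev": _value_after(fields, "dev")}
        )
    return routes


# Hypothetical sample output with one default route per VLAN interface
SAMPLE = """\
default via 192.168.10.1 dev eth0 proto dhcp metric 100
default via 192.168.20.1 dev eth0.20 proto dhcp metric 101
192.168.10.0/24 dev eth0 proto kernel scope link src 192.168.10.5
192.168.20.0/24 dev eth0.20 proto kernel scope link src 192.168.20.5
"""

if __name__ == "__main__":
    found = default_routes(SAMPLE)
    if len(found) > 1:
        print(f"Warning: {len(found)} default routes found")
    for route in found:
        print(f"  via {route['via']} on {route['dev']}")
```

More than one entry in the result is a hint that traffic to the bridge may leave through an unintended interface, as in the case described above.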
Hello guys, has anyone faced this message in HA recently?
home-assistant/core#73679 (comment) It appears after an HA restart, but only if HA had been running for at least a few hours (I need to test more to find a more precise timeframe). The working hypothesis is that some integration is blocking the main loop, so the recorder cannot correctly close the open SQLite database. As there was a lot of work around threading in comfoconnect recently, I just want to check with you first. Unfortunately, I am not sure exactly when this appeared... |
I am still troubleshooting my HA shutdown issues, and I observed random exceptions raised by the integration during HA shutdown. On Shutdown
|
Since there are a lot of similar issues, I'll create a new issue and link them all to this one.
This integration sometimes doesn't reconnect properly after the connection to the bridge was dropped. The exact reason for these disconnects is unknown, but it might be related to one of the following:
Since these disconnects happen, it is important that we have proper reconnect logic in place. Unfortunately, when the bridge disconnects (maybe it just crashes and restarts?), it seems it doesn't properly close the TCP connection, so we don't know that we got disconnected until we make a request and don't get a reply in due time. That's why there is a keepalive that is sent every 30 seconds.
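The keepalive idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual aiocomfoconnect code: `keepalive_loop`, `send_keepalive`, `reconnect`, the 5-second reply timeout and `max_rounds` are all hypothetical. Only the 30-second interval comes from the description above.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def keepalive_loop(
    send_keepalive: Callable[[], Awaitable[None]],  # hypothetical: sends a request and awaits the reply
    reconnect: Callable[[], Awaitable[None]],       # hypothetical: tears down and reopens the connection
    interval: float = 30.0,                         # keepalive period from the description above
    reply_timeout: float = 5.0,                     # assumed value, not taken from the integration
    max_rounds: Optional[int] = None,               # only here so the loop can be bounded in tests
) -> int:
    """Send periodic keepalives; reconnect when no reply arrives in time.

    Returns the number of reconnects that were triggered.
    """
    rounds = 0
    reconnects = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        try:
            # A half-open TCP connection gives no error by itself, so the
            # only signal is that the reply does not arrive in due time.
            await asyncio.wait_for(send_keepalive(), timeout=reply_timeout)
        except (asyncio.TimeoutError, OSError):
            reconnects += 1
            await reconnect()
        await asyncio.sleep(interval)
    return reconnects
```

Catching `OSError` as well covers cases such as `ConnectionRefusedError`, so a refused connection would take the same reconnect path as a silent timeout.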
More information about the new reconnect logic is written here: michaelarnauts/aiocomfoconnect#23
Please try the latest code in Home Assistant with HACS. I've created release 0.1.9 for the code before these changes, in case you want to roll back. If you want to test the new reconnect logic, please select the "master" branch in HACS when (re)installing.