Forum Replies Created
-
AuthorPosts
-
jagosParticipantI originated this thread but don’t recall getting an email about @Lostsoulfly’s post even though I have notify checked.
I very much like the idea of getting the MQTT OPEN/CLOSED status upon detected change. This would be better than being able to change the heartbeat time.
But I also would very much not want the heartbeat version to go away, at least as an on/off option if not always on. The reason is that a change can be missed if, for example, the MQTT server is down when it happens like during a reboot of the MQTT server. Having to handle this scenario almost makes the auto-generated version worthless. My code is very simple with the current version.
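To illustrate why the heartbeat keeps client code simple, here is a minimal sketch of a consumer that just remembers the last status it saw. The name `DoorTracker` and the plain "OPEN"/"CLOSED" payload strings are assumptions for illustration only; this is not OG firmware or MQTT library code. Because every heartbeat carries the full status, a change missed while the MQTT server was down is picked up at the next heartbeat.

```cpp
#include <cassert>
#include <string>

// Hypothetical consumer of periodic status messages; not OG or MQTT library code.
struct DoorTracker {
    std::string last;  // last status seen, empty until the first message arrives

    // Handle one status payload; returns true if the stored state changed.
    bool onStatus(const std::string& status) {
        if (status == last) return false;  // repeated heartbeat, nothing new
        last = status;
        return true;
    }
};

// The OPEN transition itself was missed (server down), but the next
// heartbeat still carries OPEN, so the tracker catches up.
void exampleMissedChange() {
    DoorTracker tracker;
    assert(tracker.onStatus("CLOSED"));   // first message sets the state
    assert(!tracker.onStatus("CLOSED"));  // heartbeat, no change
    assert(tracker.onStatus("OPEN"));     // first heartbeat after the outage
}
```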
jagosParticipantToday I updated to firmware v1.1.2 mostly because I wanted MQTT username/password. The update was successful.
After the reboot I noticed that under Options/Advanced the Host Name: field was not retained and had been replaced with OG_XXXXXX, where formerly I had it set to simply og. I reset it to og and rebooted, and the change appears to have taken effect.
However, when the device connects to my MQTT server, it still uses OG_XXXXXX as the clientID rather than the Host Name I specified. Before this update, the clientID was my previously specified Host Name. Bug?
jagosParticipantSounds to me like you have diagnosed it pretty well. If touching the wires together triggers the door but the device does not, there is a problem on the device side.
Now it could be the relay or it could be the connectors or their connection to the relay. So there is more than one possible reason but they all would appear to be related to the device or you are not getting good connection of the wires to the device.
jagosParticipantYou should not set the threshold right at the distance you sometimes measure. You say roughly 91 to open door and 193 to car.
You should set door thresh to halfway between open and car; in your case 140 will do.
And set car thresh to halfway between car and no car (i.e., the floor). I will guess the floor reads about 360 for you, so 270 would be a fine setting.
The idea is that if you get a distance less than door thresh, it will deduce the door is open. If not, and the distance is less than car thresh, then the door is closed with a car present. Otherwise it is closed with no car.
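The rule above can be sketched as a small function. This is only an illustration: the names `DoorState` and `classify` are made up here and are not taken from the OG firmware, and the numbers are the ones from this thread.

```cpp
#include <cassert>

// Hypothetical names; not taken from the OG firmware.
enum class DoorState { Open, ClosedWithCar, ClosedNoCar };

// Classify one distance reading using the two thresholds.
DoorState classify(int distance, int doorThresh, int carThresh) {
    if (distance < doorThresh) return DoorState::Open;           // open door is nearest the sensor
    if (distance < carThresh)  return DoorState::ClosedWithCar;  // car roof is next
    return DoorState::ClosedNoCar;                               // otherwise the sensor sees the floor
}

// Readings from this thread with thresholds 140 and 270.
void exampleReadings() {
    assert(classify(91, 140, 270) == DoorState::Open);
    assert(classify(193, 140, 270) == DoorState::ClosedWithCar);
    assert(classify(360, 140, 270) == DoorState::ClosedNoCar);  // 360 is the guessed floor distance
}
```

Putting each threshold halfway between two typical readings leaves room for the readings to wander a bit without flipping the result.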
jagosParticipantYes, the only hassle is trying to get 2 wires into those tiny holes. But it can be done and it works.
The OG just has a relay that closes the connection in parallel with the wall button's connection. There is no polarity to the OG wires, so it does not matter which of its wires goes into which hole (among the two proper holes, of course).
jagosParticipantGood that you found the problem. But the OG should not mess up like that when the server rejects the connection. It should display a useful error message in the Integration page and retry periodically.
jagosParticipantI have MQTT working though I am still on 1.1.0. I just put in the IP addr and that is it. I run the MQTT server on a local linux box on a different subnet. I have Door Open and Close checked as well as Notify me for both Open longer than 10 and at UTC time 0. I have witnessed messages for all of those cases.
I do not use IFTTT nor Blynk though the Blynk domain and port were set by default and I did not change that.
No idea what your problem is but I have included an image of my integration page.
jagosParticipantAt https://github.com/OpenGarage/OpenGarage-Firmware/tree/master/Compiled change the branch to dev-1.1.1 to have access to this version.
I do not understand why 1.1.1 has not been properly released.
jagosParticipantSometimes this happens with a bad or inadequate power supply/adapter. Do you have another you can try to see if it makes a difference?
Less likely, but if the power supply/adapter does not do the trick, maybe the power cable is at fault. Try changing it.
jagosParticipantThat is indeed bad. You could try to prove whether the OG is doing the opening by disconnecting one of the relay wires and seeing if you still get spontaneous openings. The log would still record the opening, as it depends on the distance sensor. How practical this test is depends a bit on how often it happens.
jagosParticipantDo you know for certain that the door has opened or maybe it was just a bad distance read? How does the distance read look in the log on open at such times compared to what its value is when you know for sure the door is open?
If it is a bad read, try power cycling the device and if that does not work, try a different power adapter.
There are other threads on flapping distance values.
jagosParticipantidxman01, there is already a watchdog feature in the OG which checks WiFi.status. My point is that there are cases where it fails to detect the lack of connection. Whether my case is unusual is not the point. As I pointed out in a post above, if there is one such case, who knows what other cases may not have been found or overlooked. I am proposing an optional feature which I have boiled down to a fairly simple change that has no impact if not used.
I have an EdgeRouter Lite and, as backup, an EdgeRouter X. I am quite familiar with Ubiquiti’s routers, though I have not used their APs, and I am one of the top 40 solution providers on their forums (different username there). My ASUS works just fine and does what I want. I am not going to buy new hardware which would at best fix the single case I have identified but would do nothing for any other case where a disconnect goes undetected.
No, I do not change SSIDs often but again what I am after is a more robust way to ensure auto reconnection.
jagosParticipantOn github, change the branch to dev-1.1.1
- This reply was modified 5 years, 5 months ago by jagos.
jagosParticipantIf you have not read the post immediately before this one, please read that one first.
I have thought about this a bit more and I realized that I do not know if a ping with timeout is even available on this platform. If it is not, then I guess end of discussion. So for now I will assume it is.
I thought about the code change and I think it would be really simple. There is of course the need for the UI to get the pingIP. Other than that, the following pseudo-code should work, a one line change:
replace line 1448
if(WiFi.status() == WL_CONNECTED) {
with
if((WiFi.status() == WL_CONNECTED) && (!havePingIP || pingSucceeds()) ) {
and that is it. If havePingIP is false then !havePingIP is true, so the || is true and the test reduces to the WiFi.status check as it is now. If havePingIP is true then !havePingIP is false, and the overall test requires both WiFi.status and pingSucceeds to be true or else it starts the failure timer. Generally both will be true. If one of the two fails, it is very likely to be the same one that continues to fail. But even if one fails and then after 60 seconds it is the other that fails, it is still worth a reboot. Also, if one fails and then later both succeed, then there is no need for a reboot. I really do not think the logic needs to be any more complicated.
It is probably worth having a boolean variable havePingIP as to whether the pingIP has been defined since this is in the main loop. I have no idea how on this platform you can test a ping with timeout so as a placeholder I merely used pingSucceeds() for that.
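The behavior of that combined condition can be checked with plain booleans. In this sketch the function name `connectionLooksOk` is made up. Its three parameters stand in for `WiFi.status() == WL_CONNECTED`, the proposed `havePingIP` flag, and the placeholder `pingSucceeds()`. In the real one-line change, `||` short-circuits, so no ping would be issued when no ping IP is set. Passing booleans in, as done here, does not show that.

```cpp
#include <cassert>

// Hypothetical helper mirroring the proposed condition; not firmware code.
bool connectionLooksOk(bool wifiConnected, bool havePingIP, bool pingOk) {
    return wifiConnected && (!havePingIP || pingOk);
}

void exampleTruthTable() {
    // No ping IP configured: the result follows the WiFi status alone.
    assert(connectionLooksOk(true,  false, false) == true);
    assert(connectionLooksOk(false, false, true)  == false);
    // Ping IP configured: both checks must pass.
    assert(connectionLooksOk(true,  true,  true)  == true);
    assert(connectionLooksOk(true,  true,  false) == false);
    assert(connectionLooksOk(false, true,  true)  == false);
}
```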
I trust this makes sense and you see it is a rather straightforward modification. Thanks again for your consideration. I am trying to make this as easy and well thought out as I can. If you see any issues, I would love to try to resolve them.
ETA: A way to make this safer still: when the user submits a ping IP address, a test ping is issued before it is saved. If it succeeds, the address is saved. If it fails, the address is not saved and the user is informed.
- This reply was modified 5 years, 6 months ago by jagos.
jagosParticipantFirst, I would like to emphasize that the ping test would be optional contingent on a user supplied IP address. If no such address is supplied, there is no change to the current behavior.
Before, I suggested an optional interval setting but I no longer think this is necessary. I would set the ping IP address to the IP address of my router in the same subnet as the OG. I tested the speed of this from my desktop and a successful ping takes 2 ms (ping -c 1 -W 1000 192.168.193.1). Other environments will differ but choosing a local ip in the subnet should be quite fast.
I would suggest putting the test right after line 1448 WiFi.status test. If the WiFi.status test fails then the code branches and the ping test is not considered. If WiFi.status succeeds, then the ping test is only done if an ip address for it has been provided. If it succeeds, then the code proceeds as before. If it fails, then it can wait and try again just like WiFi.status failure. There would need to be a way to mark which kind of failure occurred and which to retest. A second failure would again cause a reboot.
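As a rough sketch of the wait-and-retry idea, the check result can be fed into a small timer object. This assumes the two tests are already combined into a single pass/fail result, so it does not track which kind of failure occurred. The name `ConnectionWatchdog` and the 60 second timeout are chosen for illustration and are not taken from the firmware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch; not the firmware's actual watchdog code.
struct ConnectionWatchdog {
    std::uint32_t timeoutMs;        // how long a failure may persist before rebooting
    bool failing = false;           // true while a failure is being timed
    std::uint32_t failStartMs = 0;  // when the current failure was first seen

    explicit ConnectionWatchdog(std::uint32_t timeout) : timeoutMs(timeout) {}

    // Feed one check result; returns true when a reboot should be triggered.
    bool update(bool ok, std::uint32_t nowMs) {
        if (ok) { failing = false; return false; }  // recovered: cancel the timer
        if (!failing) {                             // first failure: start the timer
            failing = true;
            failStartMs = nowMs;
            return false;
        }
        return (nowMs - failStartMs) >= timeoutMs;  // unsigned math tolerates counter wraparound
    }
};

void exampleWatchdog() {
    ConnectionWatchdog wd(60000);
    assert(!wd.update(true, 0));       // healthy
    assert(!wd.update(false, 1000));   // first failure starts the timer
    assert(!wd.update(false, 30000));  // still failing, but under 60 s
    assert(wd.update(false, 61000));   // failed for a full 60 s: reboot
}
```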
I do not think the ping test ever needs to be done during boot. It is meant to catch conditions during normal running where connectivity has been lost but not detected by WiFi.status. I do not understand why this happens in my case, but it clearly does. There have been a few other times (no more than 5 and maybe fewer) over the 11 months or so I have had the OG where I have found it non-responsive, but I cannot remember the circumstances, and some or all may have been on older firmware. But on v1.1.0 I can reliably reproduce the problem I have described. If there is one such case, who is to say there are not other as yet unidentified scenarios? And it is bad if you are away from home, want to check the OG, and find it non-responsive. I think this approach can improve such cases, making the OG more robust.
There might be a bit of an issue if the user enters a bad IP address and does not have much time after a reboot to correct it. But I am not sure this is any worse than if the user enters a bad static Device IP, Gateway IP, subnet, or DNS. In such a case the user would need to reset to defaults and start over.
I understand your reluctance when you cannot generate a test case. But you could simulate the ping failing by having the OG ping another host on your network and then taking that host down for a desired amount of time. This way you could verify the code. If you made such a test firmware that passed such a test properly and you believed would not brick my OG, I would be willing to give it a try in my situation and give you feedback.
I understand this takes effort on your part for a case you have never seen so it might seem not to be worth it. I do believe it would be a good addition to make things more robust and should fix the problem I know I have. It is certainly your choice. I just hope I have made a compelling case.
Thanks for taking the time to consider this.
jagosParticipantFirst, I use an ASUS router in AP mode. My router is an EdgeRouter Lite from Ubiquiti which has no WiFi.
I am on the 1.1.0 firmware and have been since shortly after it came out.
As I stated earlier, my AP is not powered down. This problem occurs when I enable or disable one of my 6 guest SSIDs – not the one that the OG is connected to. In this case the router has to do some reconfiguration of its bridge and then I run a script to reconfigure the bridges and VLANs so that each enabled guest SSID is on its own VLAN.
Connectivity is lost as soon as I click the Enable button on the web page for the guest SSID. And it remains lost indefinitely after I run the script. Note the AP does not reboot during this change. During this I never see the OG blue led flash quickly indicating a reboot. Yet once I power cycle the OG forcing the reboot, it connects and works just fine with the new AP guest SSID configuration.
I thought it might be related to my VLAN stuff. I separate all guest SSIDs to individual VLANs so that I can use firewalls to isolate each one. When the ASUS enables or disables a guest SSID, it reconfigures bridges and VLANs but puts all the guest SSIDs in the same VLAN as the regular SSIDs. That is why I have to run a script to rebuild the bridges and VLANs to separate them. I have a reserved IP address for the OG within the VLAN subnet it is connected to. I was thinking during the reconfiguration from enable / disable of a guest SSID maybe the OG got a different IP address from the wrong VLAN and then once I ran the script to fix the VLANs, the OG was simply on the wrong IP address. But this would require the OG to have rebooted and I have seen no evidence of that. So I dismissed this idea but explained it for completeness.
When I have lost connectivity to the OG, I can neither get a web page nor ping it successfully. And I have waited at least 10 minutes in some tests.
I do not believe when the OG is in this disconnected state that a ping from the OG will succeed. But I have no way to prove that short of new firmware.
jagosParticipantOn the test, yes I accidentally copied the code 10 lines up (line 1438) from where I meant (line 1448). The line I meant just tests the status and not the localIP.
Your test case is clearly detected and the code recovers appropriately.
But my case is a clear example, I think, that there are cases where connectivity is lost and it is not detected. I did not see any evidence of reboot yet in every case, a power cycle regains connectivity immediately suggesting that the AP is fine at that time. And I have waited as much as 10 minutes without regaining connectivity.
This is why I think an optional ping test is a more reliable test. If the user provides a ping IP address, then ping it with timeout every interval number of seconds where that can be user specified as well but defaults to something reasonable say between 60 and 300 seconds. Doing a ping test with 1 second wait every 60 seconds should not compromise other functionality.
What do you think about this idea? It is really unfortunate when you are away from home and find you have lost connectivity.
I appreciate your excellent customer service and being so responsive.
jagosParticipantI understand that logic for connection testing. But there is still a failure.
I used my phone to reboot the OG and watched while it rebooted. I saw the blue LED flash quickly a number of times and then settle into one blink every 5 seconds. I did not hear any sounds – I thought I should. Anyway, this was a way I could identify the reboot by the quick flashes. And here I still had WiFi access to the OG once rebooted.
Next I enabled one of my disabled guest SSIDs which reconfigures the ASUS AP and causes a loss of connection with the OG. I continuously watched the OG for more than 2 minutes looking for the quick flashes indicating reboot. I did not see this. It continued to flash once every 5 seconds. I could not get the OG web page. I power cycled the OG, saw the quick LED boot flashes and then connectivity was restored. I tried this several times.
So it would appear to me that the OG test “if(WiFi.status() == WL_CONNECTED && WiFi.localIP())” may be failing to detect the disconnected status.
Have you tried disabling your WiFi AP while the OG is up and running and accessible to see if the disconnection is in fact detected and causing reboot? And if once the AP is restored the OG connects again?
Maybe there should be a periodic ping test with a timeout, say every 60 or 120 seconds, to make sure there is actual connectivity. This could even be optional, with a user-provided IP address to ping and possibly a user-provided interval.
jagosParticipantOK then, that demonstrates my musical ignorance. What were those musical designers thinking? They sure were not software engineers.
jagosParticipantHow about also adding a field for NTP IP address that is settable independent of static or DHCP? My router acts as an NTP server and I might point the OG to that. May be too late for 1.1.1 but maybe in future?
ETA: Are the new fields also added to the http API for reading and setting?
- This reply was modified 5 years, 6 months ago by jagos.
jagosParticipantThe closer you put it to the closed garage door, the sooner part of that door will be below the sensor when it begins to open.
jagosParticipantI second this suggestion. I have often thought the same.
jagosParticipantI strongly support keeping the device as independent as possible with all resources hosted locally. I have my device completely walled off from the internet with only access from my local network. It is even on its own VLAN. For remote access, I have my phone connected 24/7 via the WireGuard VPN to my home network and I can do the same on demand for my MacBook. For automation, I run an MQTT server on an SBC linux box (on which I run many services) so an MQTT client on my phone is always available to get updates thru the VPN. The only packets allowed to leave my device are responses to local requests, NTP, and MQTT only to the MQTT server. This gives me ultimate security. I do not want to be loading resources from the internet as a security hole.
jagosParticipantThe new OGAPI1.1.0.pdf still states 1.0.9 API at the top with a date of Sep 10, 2018. The contents seem to be updated. So the in-document title is confusing and should be corrected.
jagosParticipantI understand GMT+0 time. Of course, that means you need to adjust for daylight saving time twice a year. No biggy.
Still not clear on the input field format. If I want it to close at 5:30pm EST (-5), then that would be 22:30 GMT+0 so do I put in 22:30 or 2230 or what?
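For reference, the offset arithmetic in that example can be written out as a small helper. The name `localToUtcMinutes` is made up for illustration. It only covers the conversion and says nothing about which input format the field expects.

```cpp
#include <cassert>

// Hypothetical helper: convert a local wall-clock time to UTC, returned as
// minutes since midnight. utcOffsetHours is the zone's offset from UTC (-5 for EST).
int localToUtcMinutes(int hour, int minute, int utcOffsetHours) {
    const int minutesPerDay = 24 * 60;
    int utc = hour * 60 + minute - utcOffsetHours * 60;
    return ((utc % minutesPerDay) + minutesPerDay) % minutesPerDay;  // wrap across midnight
}

void exampleConversion() {
    // 5:30pm EST (UTC-5) is 22:30 UTC.
    assert(localToUtcMinutes(17, 30, -5) == 22 * 60 + 30);
    // 8:00pm EST wraps past midnight to 01:00 UTC.
    assert(localToUtcMinutes(20, 0, -5) == 1 * 60);
}
```

During daylight saving time the offset would be -4 instead of -5, which is the twice-a-year adjustment mentioned above.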