Wednesday, October 30, 2019

Waking up devices in Hass.io (Home Assistant)


This is a quick post on a challenge I had to overcome while integrating my SmartTV (an LG TV 55UJ620V) with Hass.io.

I wanted Hass.io to be able to turn on the TV, so that automations could be built on top of it (for example, turning it on through a voice command via Google Assistant). I first resorted to the built-in Home Assistant Wake on LAN integration (https://www.home-assistant.io/integrations/wake_on_lan/). It kind of worked, but was not very reliable (it would work perhaps 2 out of 5 times).

I knew that Wake-on-LAN (WOL) is inherently unreliable by design: the dormant target device is waiting for a frame containing a specific pattern of bytes (the "magic packet"). If it receives that frame, it wakes up the host; otherwise nothing happens. The device normally scans for that pattern anywhere in the frame, regardless of the transport-level protocol carrying it. There is no acknowledgement or retransmission, so on WiFi in particular there is some probability (high or low, depending on the network conditions) that the single frame never reaches its destination. This probability increases with the number of hops in between.
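To make that "pattern of bytes" concrete, here is a minimal Python sketch of how a magic packet is built: six 0xFF bytes followed by the target MAC address repeated sixteen times. The function names, the placeholder MAC address in the example and the choice of UDP port 9 are my own assumptions for illustration; etherwake, used later in this post, sends the same payload as a raw Ethernet frame rather than over UDP.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte WOL payload for a MAC such as 'AA:BB:CC:DD:EE:FF'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes followed by the target MAC repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast_ip: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the packet over UDP (port 9 is a common convention, assumed here)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast_ip, port))


if __name__ == "__main__":
    packet = build_magic_packet("AA:BB:CC:DD:EE:FF")  # placeholder MAC
    print(len(packet))
```

Note that nothing in this exchange tells the sender whether the packet arrived, which is why a single lost frame simply means the TV stays off.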

With the Hass.io integration, the magic packet had to be sent from a host (the Raspberry Pi where I keep Hass.io running) that sits in a different router VLAN from the TV (these are connected via Ethernet and WiFi, respectively):



To improve reliability I scrapped this approach altogether. As both my router and repeater are OpenWRT based, I started by installing the etherwake package onto the repeater, given that this is the closest node to the TV set:


I tested the tool manually and confirmed that it worked:


It was now a matter of scripting it and, for improved reliability, making sure that the script tests the connection after it tries to turn on the TV.

As a quick and dirty approach, and knowing that the OpenWRT repeater runs a web server, I went for a simple cgi-bin shell script that can be called from another host (making a very basic REST web service). The script just sends the WOL packet and then checks for a while whether the TV comes online. If it doesn't, it sends the WOL packet again and tests the TV again, up to a maximum number of retries:

root@griffinnet-zh-repeater:/www/cgi-bin# cat wol.sh
#!/bin/sh
# CGI script: sends a WOL packet and pings the target until it answers.
# Usage: /cgi-bin/wol.sh?mac=<MAC address>&ip=<IP address>

MAX_RETRIES=10   # maximum number of wake-up attempts
NUM_PINGS=20     # pings sent after each attempt

echo "Content-type: application/json"
echo ""

# Extract the mac and ip parameters from the query string.
MAC_ADDR=$(echo "$QUERY_STRING" | sed -n 's/^.*mac=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")
IP_ADDR=$(echo "$QUERY_STRING" | sed -n 's/^.*ip=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")

if [ -z "$MAC_ADDR" ] || [ -z "$IP_ADDR" ]; then
    echo "{\"status\":\"error\", \"message\":\"MAC and IP addresses not specified.\"}"
    exit 1
fi

i=0

while [ "$i" -lt "$MAX_RETRIES" ]; do
    WOL_RESULT=$(etherwake -D -i br-lan "$MAC_ADDR")

    # $? holds the exit status of etherwake (the command substitution).
    if [ $? -ne 0 ]; then
        echo "{\"status\":\"error\", \"message\":\"Wake-up On LAN command failed.\"}"
        exit 1
    fi

    # ping exits with 0 if at least one reply was received.
    ping -c "$NUM_PINGS" "$IP_ADDR" > /dev/null
    PING_RESULT=$?

    if [ "$PING_RESULT" -eq 0 ]; then
        echo "{\"status\":\"success\", \"message\":\"WOL: $WOL_RESULT; Ping Result: $PING_RESULT\"}"
        exit 0
    fi

    i=$((i + 1))
done

echo "{\"status\":\"error\", \"message\":\"Exhausted retries. Unable to wake up device.\"}"
exit 1
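
The retry logic of the script above, stripped of the CGI and shell details, can be sketched in Python as follows. This is only an illustration and not what runs on the repeater: wake_with_retries, send_wol and is_online are hypothetical names, and the two callables stand in for the etherwake and ping commands.

```python
from typing import Callable


def wake_with_retries(
    send_wol: Callable[[], bool],
    is_online: Callable[[], bool],
    max_retries: int = 10,
) -> dict:
    """Send a wake-up request and verify it, retrying up to max_retries times."""
    for attempt in range(1, max_retries + 1):
        if not send_wol():
            # Sending failed outright: retrying would not help.
            return {"status": "error", "message": "Wake-up On LAN command failed."}
        if is_online():
            return {"status": "success", "message": f"Device online after {attempt} attempt(s)."}
    return {"status": "error", "message": "Exhausted retries. Unable to wake up device."}


if __name__ == "__main__":
    # Simulated device that only answers on the third connectivity check.
    answers = iter([False, False, True])
    result = wake_with_retries(lambda: True, lambda: next(answers))
    print(result["status"], "-", result["message"])
```

The point of the design is that the sender no longer trusts a single packet: every attempt is verified, and a lost frame only costs one more iteration.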

On the Hass.io side, it was a matter of registering this REST endpoint and using it as the service that turns on the TV, instead of the built-in Wake on LAN integration:

rest_command:
  wake_up_device:
    url: 'http://192.168.1.200/cgi-bin/wol.sh?mac={{tv_mac}}&ip={{tv_ip}}'
    method: GET
    headers:
      accept: 'application/json, text/html'
    content_type: 'application/json; charset=utf-8'
    timeout: 60

media_player:
  - platform: webostv
    host: 192.168.1.30
    name: TV Sala
    filename: webostv.conf
    timeout: 5
    turn_on_action:
      - service: rest_command.wake_up_device
        data:
          tv_mac: 'B4:E6:2A:??:??:??'
          tv_ip: '192.168.1.30'
    customize:
      sources:
        - livetv
        - youtube
        - netflix

Given that this REST endpoint takes some time to respond (due to the number of pings it performs to test the connection to the TV), the response timeout was increased to 60 seconds.
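
As a rough sanity check on that timeout, the running time of the script can be estimated from its two constants. The sketch below assumes that ping sends one probe per second (the usual default, not something I verified on this particular repeater) and ignores the time taken by etherwake, so treat the numbers as ballpark figures:

```python
MAX_RETRIES = 10        # from wol.sh
NUM_PINGS = 20          # from wol.sh
PING_INTERVAL_S = 1.0   # assumption: default interval between probes
HA_TIMEOUT_S = 60       # timeout configured in the rest_command


def attempt_duration(num_pings: int, interval_s: float) -> float:
    """Approximate number of seconds spent in one 'ping -c num_pings' run."""
    return num_pings * interval_s


per_attempt = attempt_duration(NUM_PINGS, PING_INTERVAL_S)
worst_case = MAX_RETRIES * per_attempt
attempts_within_timeout = int(HA_TIMEOUT_S // per_attempt)

print(f"one attempt: ~{per_attempt:.0f} s")
print(f"worst case (all retries): ~{worst_case:.0f} s")
print(f"attempts that fit in the timeout: {attempts_within_timeout}")
```

Under these assumptions about three attempts fit inside the 60-second window. If the TV ever needed more than that, Hass.io may stop waiting for the response while the script is still retrying on the repeater.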

This is all that it was needed, and so far the feature have been working flawlessly.

Monday, October 28, 2019

Building a kick-ass home automation by reflashing the Sonoff devices with Tasmota and getting it all working with hass.io

For some time I have been gradually bringing more devices into my house that are either designed for, or have features that allow, integration with a home automation system.

In spite of all the concerns that can arise from bringing smart/connected devices into the place where you expect personal privacy to exist, the convenience of having these ends up speaking louder overall.

It all started with having a set of unrelated devices in the house, each featuring connectivity and some cloud-based features provided by the vendor. This is the case for the Xiaomi Rockrobo vacuum cleaner, the Sonoff switches, the multimedia devices such as the TV set (an LG smartTV), and also the Google Chromecast and Assistant devices.


I wanted to integrate these and gain the potential for doing some home automation, even if to a modest extent. As such, and given that I already had some hardware lying around, notably a Raspberry Pi, I decided to give the Home Assistant project a try in its easier-to-set-up form, the Hass.io package (https://www.home-assistant.io/hassio/).

All it took was an available Raspberry Pi 2 device, and a MicroSD card with sufficient space (a 64 GB one did fine).

Then it was a matter of installing Hass.io according to the documented procedure and configuring the integrations. This is mostly a walk in the park for someone used to integrating systems, and the most popular integrations are well documented. For most of the issues that can appear there is also good help in the forums, as other people have already walked these paths.

When I reached the point of integrating the couple of Sonoff switches (one Sonoff TH10 and one Sonoff Dual R2), I first started with the stock firmware and went for an integration that would sit on top of it. I tried this project:

https://github.com/peterbuga/HASS-sonoff-ewelink

It is a great solution for starters, and it provides some level of functionality in Home Assistant, most notably being able to turn the relays on and off.

But for the heater in my living room in particular (where I replaced the original faulty control board with the Sonoff TH10), I wanted the thermostat feature to be controllable via Hass.io, which wasn't possible using this "over-the-top" integration (unless I defined a "soft thermostat" in Hass.io itself, but I didn't want to go with that approach due to the network dependency: if the network breaks, you cannot be sure whether the heater will stay on or off. Better safe than sorry).


That's when I found that the only option was to reflash the Sonoffs, completely removing the dependency on the Chinese cloud (the original app and the Sonoff devices communicate with a cloud-based backend) and no longer requiring these indirect integrations.

The Tasmota project (https://github.com/arendst/Tasmota) was the obvious way to go. Aimed at running on most ESP8266-based Sonoff devices, this firmware has a number of powerful features, making it a good option for those who want full control of their IoT devices and of the integration at both ends (Hass.io and the device).

Flashing the devices was not complicated. First I had to open the case and add the 4-pin header for the serial port. These ESP microcontrollers have the advantage of not requiring special programming hardware; a regular serial (UART) connection at 3.3 V logic levels (not true RS-232 voltage levels) is all that is required:



For flashing the image, I used this tool:

https://github.com/marcelstoer/nodemcu-pyflasher

There is a self-contained Windows binary that makes it easy to proceed with the flashing.

As the USB to serial converter I used an FTDI FT232RL based board. You can get these anywhere, and they have the advantage of offering both 3.3 V and 5 V for power and logic levels (the ESP8266 in the Sonoff needs the 3.3 V setting):


For both Sonoff modules (the TH10 and the Dual R2) I chose the "Dual Output - DOUT" flash mode and made sure to select "yes - wipes all data" (on the TH10 I forgot to do that at first, and the device wouldn't boot).


The rest was a matter of proceeding with the well-documented initial configuration steps:

https://github.com/arendst/Tasmota/wiki/Initial-Configuration

Once it was up, it was nice to see that the firmware boots quite quickly and that a simple yet functional Web UI can be accessed:


The relevant challenge for me (and probably the best take away of this blog post) was the setup of the thermostat:

Tasmota doesn't have a ready-to-use thermostat feature. Rather, it has a relatively powerful rules engine which can be used to define some automations in the device itself.

Each rule is a set of "ON trigger DO command [ENDON | BREAK]" clauses, and several rules can be configured, as long as there is free flash memory to do so.

I started from an example in the rules cookbook (https://github.com/arendst/Tasmota/wiki/Rule-Cookbook), and made the needed adaptations in order to have the thermostat working in my setup.

I made a first iteration based on the example in the cookbook, but as I was integrating with Hass.io, I realized that the latter is currently unable (in its climate MQTT component) to map the predefined climate control modes such as 'off' and 'heat' to the '0' and '1' values that the rule was initially expecting.

As such I ended up having to modify the rule accordingly. In my particular setup I am using an AM2301 sensor (it has both a temperature and a humidity sensor built in):



so the rule had to also be adjusted to point to the correct sensor type.

The rule turns out to be the following:

Rule1
on system#boot do RuleTimer1 70 endon
on Switch1#State do event toggling1=%mem1% endon 
on event#toggling1=off do mem1 heat endon 
on event#toggling1=heat do mem1 off endon 
on Rules#Timer=1 do backlog var1 0; RuleTimer1 70; power1 0 endon 
on tele-AM2301#Temperature do backlog var1 0; RuleTimer1 70; event ctrl_ready=heat; event temp_demand=%value% endon 
on event#ctrl_ready=%mem1% do var1 1 endon 
on event#temp_demand<%mem2% do power1 %var1% endon
on event#temp_demand>%mem3% do power1 0 endon

After the rule is set (it can be added to Tasmota through the Web UI console or via the appropriate MQTT topic, e.g. "cmnd/sonoff/Rule1"), the one-off initialization procedure must be sent:

backlog SwitchMode1 3; Rule 1; Rule 4; TelePeriod 60; SetOption26 1; SetOption0 0; poweronstate 0; mem1 off; mem2 10; mem3 10; var1 0

This resets the configurations and variables to the correct initial state.

For this device I also chose to have the blue status LED always on, allowing me to know if the heater is energized:

LedPower 1

Basically, this rule starts by initializing the timer RuleTimer1 when the device boots up. The timer expires after 70 seconds unless it is restarted first. Every time it expires, the following rule entry is evaluated:

on Rules#Timer=1 do backlog var1 0; RuleTimer1 70; power1 0 endon

It restarts the timer and resets the control variable and the power status. It acts like a watchdog: if nothing else is triggered in the meantime (for example because the sensor stopped reporting), the heater is turned off.

The second rule entry causes the physical switch in the Sonoff to create an event that will be used to toggle the thermostat through the next two entries:

on event#toggling1=off do mem1 heat endon
on event#toggling1=heat do mem1 off endon

Every time the temperature sensor is read (every 60 seconds, as set by the TelePeriod 60 command), the corresponding rule entry is evaluated:

on tele-AM2301#Temperature do backlog var1 0; RuleTimer1 70; event ctrl_ready=heat; event temp_demand=%value% endon

This raises a temp_demand event carrying the current temperature (it is used later to decide if the heater should be turned on or off). Before that, the watchdog timer is restarted, the control variable var1 is cleared and a ctrl_ready event is raised.

The next rule entry:

on event#ctrl_ready=%mem1% do var1 1 endon

is used to decide if the thermostat should stay in operation: var1 is set to 1 only if mem1 (which is set from MQTT to enable or disable the thermostat) matches the intended operational state (which is "heat" whenever a temperature event was received).

The last two rule entries are the control loop itself:

on event#temp_demand<%mem2% do power1 %var1% endon
on event#temp_demand>%mem3% do power1 0 endon

Based on the temperature being below the lower setpoint or above the upper setpoint (mem2 and mem3 respectively, both also exposed via MQTT), it decides if the heater must be turned on or off. Between the two setpoints the relay keeps its current state.

In sum:

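mem1 - controls the operational state of the thermostat. Setting it to 'off' disables the thermostat, so the heater is not turned on. Setting it to 'heat' enables the thermostat, which then turns the heater on and off according to the setpoints.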

mem2 - controls the lower temperature setpoint in ºC.

mem3 - controls the upper temperature setpoint in ºC.
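
For readers who find the rule syntax hard to follow, the same control logic can be modelled in a few lines of Python. This is only a sketch of my reading of the rule, not code that runs on the Sonoff: the Thermostat class and its method names are made up for illustration, and it assumes the lower setpoint is not above the upper one.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    mode: str = "off"        # plays the role of mem1: 'off' or 'heat'
    low: float = 10.0        # plays the role of mem2 (lower setpoint, in °C)
    high: float = 10.0       # plays the role of mem3 (upper setpoint, in °C)
    heater_on: bool = False  # plays the role of the power1 relay

    def on_temperature(self, temperature: float) -> bool:
        """Process one sensor reading and return the new relay state."""
        enabled = self.mode == "heat"    # what var1 ends up holding
        if temperature < self.low:
            self.heater_on = enabled     # power1 %var1%
        elif temperature > self.high:
            self.heater_on = False       # power1 0
        # Between the setpoints the relay keeps its previous state.
        return self.heater_on

    def on_watchdog_timeout(self) -> bool:
        """No reading arrived in time: fail safe by switching the heater off."""
        self.heater_on = False
        return self.heater_on


if __name__ == "__main__":
    thermostat = Thermostat(mode="heat", low=19.0, high=21.0)
    readings = [18.5, 20.0, 21.5, 20.0]
    print([thermostat.on_temperature(r) for r in readings])
    # prints [True, True, False, False]
```

The gap between the two setpoints gives the hysteresis: the heater does not toggle on every reading while the temperature sits between them.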

The other important aspect in this work was defining how to integrate with Hass.io. Again, the Tasmota firmware is comprehensive in this respect, as it provides more than a single option:


At first I attempted the "Belkin WeMo" approach, but having had no success with it, I soon turned to MQTT which, in spite of its complexity, seemed like the more promising option.

The MQTT protocol is a messaging protocol that dates back to well before the home automation hype. It was designed as a more lightweight alternative to traditional messaging protocols, making it usable in low-bandwidth scenarios (its first use was to monitor an oil pipeline in the desert).

Enabling MQTT in Tasmota allows the device to communicate with other agents through a broker.

On the Hass.io side, I installed the Mosquitto broker (https://home-assistant.io/addons/mosquitto/). Internally (like many modules in Hass.io) it runs in its own Docker container:


With this type of architecture, the central system (Hass.io) doesn't need to be aware of the network details of the individual devices in order to publish messages to them or receive messages from them. This makes it ideal for home automation devices, as there can be many of them and an easy setup is desired.

I will not go into detail regarding the MQTT setup, as it is well documented in the Hass.io pages. In the end I defined a user in Hass.io and configured it with read/write access to the topics in the MQTT broker:

/share/mosquitto/acl.conf:

acl_file /share/mosquitto/accesscontrollist

/share/mosquitto/accesscontrollist:

user homeassistant
topic readwrite #

user sonoff-device
topic readwrite #

This user (sonoff-device) is created in Home Assistant (under Configuration > Users).
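
The "#" in the access control list above is the MQTT multi-level wildcard, which is why both users can reach every topic. The following Python sketch shows how MQTT topic filters are matched against topic names; topic_matches is a helper written for this post (it is not part of Mosquitto or Hass.io), and it leaves out the special handling of topics starting with "$".

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a filter using '+' and '#' wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches all remaining levels (even none).
            # It is only valid as the last level of a filter.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False  # the filter is longer than the topic
        if level != "+" and level != topic_levels[index]:
            return False  # literal level differs ('+' matches any single level)
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    print(topic_matches("#", "cmnd/living-room-heater/mem1"))
    print(topic_matches("cmnd/+/mem1", "cmnd/living-room-heater/mem1"))
    print(topic_matches("stat/#", "cmnd/living-room-heater/mem1"))
```

A narrower filter than "#" (for example one restricted to the topics of a single device) would be a way to tighten the ACL, at the cost of having to maintain it per device.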


In each device (in the Tasmota Web UI), one has to define where the MQTT broker is (host and port) and the credentials of the user that has been created in Hass.io. The topic name should also be defined, to distinguish this device from other devices in the network:


Once this is all set, the device should be able to communicate with the broker:


On the broker side it is also possible to confirm that the connection is correctly established:


Finally it is time to proceed with the actual integration of the component (the heater in this case). Here we want to use the "climate" component for an MQTT source (https://www.home-assistant.io/integrations/climate.mqtt/).

In order to add this component, we have to:

  • open the configuration file - /config/configuration.yaml
  • add the entry for the mqtt climate:

climate:
  - platform: mqtt
    name: living-room-heater
    modes:
      - 'off'
      - 'heat'
    mode_command_topic: 'cmnd/living-room-heater/mem1'
    mode_state_topic: 'stat/living-room-heater/RESULT'
    mode_state_template: '{{ value_json["Mem1"] }}'
    temperature_low_command_topic: 'cmnd/living-room-heater/mem2'
#    temperature_low_state_topic: 'stat/living-room-heater/RESULT'
#    temperature_low_state_template: '{{ value_json["Mem2"] }}'
    temperature_high_command_topic: 'cmnd/living-room-heater/mem3'
#    temperature_high_state_topic: 'stat/living-room-heater/RESULT'
#    temperature_high_state_template: '{{ value_json["Mem3"] }}'
    current_temperature_topic: 'tele/living-room-heater/SENSOR'
    current_temperature_template: '{{ value_json.AM2301.Temperature }}'
    action_topic: 'stat/living-room-heater/RESULT'
    action_template: "{{ 'cooling' if value_json['POWER1'] == 'OFF' else 'heating' if value_json['POWER1'] == 'ON' }}"
    qos: 1
    payload_on: 'heat'
    payload_off: 'off'
    payload_available: 'Online'
    payload_not_available: 'Offline'


Here we need to tell Hass.io three types of information: i) where to obtain the current state of the component; ii) where to send commands; and iii) what fields and values in the data map to values that Hass.io expects.

The first thing is the modes that our component supports. This is detailed in the documentation, but our heater/thermostat supports the 'off' and 'heat' modes (complete HVAC systems have more possible modes).

The next aspect is the topic used to tell the device to change mode. In this case it is the cmnd/living-room-heater/mem1 MQTT topic (mem1 is the variable we have set in our rule). The line:

mode_command_topic: 'cmnd/living-room-heater/mem1'

defines just that.

The two other aspects, namely:


temperature_low_command_topic: 'cmnd/living-room-heater/mem2'
temperature_high_command_topic: 'cmnd/living-room-heater/mem3'

allow controlling the low and high setpoints of the thermostat, as mentioned in the Tasmota rule part.

Then we also want the component to know the current temperature as reported by the device, and for that we define:

current_temperature_topic: 'tele/living-room-heater/SENSOR'
current_temperature_template: '{{ value_json.AM2301.Temperature }}'

Lastly, we want to know what the heater is actually doing (i.e. whether the heating element is turned on or not). For this purpose, we define:

action_topic: 'stat/living-room-heater/RESULT'
action_template: "{{ 'cooling' if value_json['POWER1'] == 'OFF' else 'heating' if value_json['POWER1'] == 'ON' }}"

in order to parse the power status from the messages sent to the topic.
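
To make it clearer what these templates extract, here is a small Python sketch that performs the same mappings on sample messages. The function names are made up for this post, and the JSON payloads in the example are illustrative samples in the shape the templates expect, not captures from my device.

```python
import json
from typing import Optional


def mode_from_result(payload: str) -> Optional[str]:
    """Counterpart of mode_state_template: read Mem1 if present."""
    return json.loads(payload).get("Mem1")


def action_from_result(payload: str) -> Optional[str]:
    """Counterpart of action_template: map the relay state to an action."""
    power = json.loads(payload).get("POWER1")
    if power == "OFF":
        return "cooling"
    if power == "ON":
        return "heating"
    return None  # message without POWER1: the template yields nothing


def temperature_from_sensor(payload: str) -> float:
    """Counterpart of current_temperature_template."""
    return float(json.loads(payload)["AM2301"]["Temperature"])


if __name__ == "__main__":
    # Illustrative sample messages (assumed shape, not real captures).
    print(mode_from_result('{"Mem1": "heat"}'))
    print(action_from_result('{"POWER1": "ON"}'))
    print(temperature_from_sensor('{"AM2301": {"Temperature": 20.5, "Humidity": 48.0}, "TempUnit": "C"}'))
```

Since the mode and the relay state are both published on the same RESULT topic, each template has to cope with messages that carry only the other field, which is why the missing-key case matters.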

At the end we should have a card in our dashboard that looks like this:



Clicking in the top right corner of the component we obtain more detailed information:


There is still room for improvement in the future. For instance, the setpoints are currently defined optimistically in Hass.io. Even though the device reports back that a setpoint has been changed, by publishing to the 'stat/living-room-heater/RESULT' topic:


this information is not interpreted by the climate component, because the latter still does not support templates for all the state topics. Ideally the commented lines should be supported, allowing this feedback to be mapped:

    temperature_low_command_topic: 'cmnd/living-room-heater/mem2'
#    temperature_low_state_topic: 'stat/living-room-heater/RESULT'
#    temperature_low_state_template: '{{ value_json["Mem2"] }}'
    temperature_high_command_topic: 'cmnd/living-room-heater/mem3'
#    temperature_high_state_topic: 'stat/living-room-heater/RESULT'
#    temperature_high_state_template: '{{ value_json["Mem3"] }}'

Until then we only have this optimistic approach. E.g. if setpoints are changed outside Hass.io, its UI will be out of sync with the device.

Sunday, October 20, 2019

Consumer grade WiFi gear - when fixing the root cause is not at reach

Some time ago, I had to improve the performance and coverage of my home network, so as to be able to use the several devices around the house flawlessly, regardless of the location. Some of these devices have a certain demand for consistent bandwidth, as is the case of the SmartTV for watching IPTV and Netflix, and others such as the smartphones and tablets.

As always I tend to be frugal when spending money on hardware, trying to go with what performs well and is just about enough for the job.

This led me to aim for WiFi gear that would both be somewhat popular and low cost, while at the same time having some hope of being hackable and reflashed to OpenWRT in the future. This was the reasoning when I decided to buy a couple of TP-LINK TL-WR841N routers (with v9 hardware at the time).

At first I set these up and played with the stock firmware, configuring one to play the roles of NAT, DHCP, DNS, firewall and so on, and the other to act solely as a WDS repeater, allowing WiFi coverage to be extended to the rest of the house.



But later I wanted to play with home automation, and more and more felt the need for this equipment to be more manageable. As such I became adventurous and engaged in a twofold task: carrying out the technical change itself, and managing the frustration of the household users during the periods of service interruption while modifying the routers.

The flashing of these devices was a simple step, as the OpenWRT project provides packages which can readily be uploaded and flashed to the devices using the original Web UI. There is no need for setting up any serial connections or anything as such.

Details can be found in the official page:

https://openwrt.org/toh/tp-link/tl-wr841nd

Given the hardware limitations of these devices (4 MB of Flash and 32 MB of RAM), the sweet spot between features and stability was version 15.05.1.

The devices work correctly and the firmware performs quite well in this version, but I found that after a few days of operation, and depending on network usage, both the router and the repeater get to a point where the network performance drops dramatically. CPU and memory are not the problem: during the degradation it is still possible to (patiently) open an ssh session, monitor these (e.g. using the "top" and "free" commands) and confirm that they are nominal.

Once the degradation sets in, it is no longer possible to recover, except by forcing a device restart.

With this in mind, and after exhausting the research on corrective solutions, I decided to write a rather generic watchdog script. All it does is measure the ping time to a list of destinations; if none of the destinations is reachable, or the minimum ping time is above a given threshold for all destinations, it restarts the device. This minimizes the impact and reduces the need for manual intervention.

I made this script available on GitHub. It is generic enough to be used in other systems where the same kind of problem needs to be dealt with (instead of the reboot command you may configure the appropriate action for your case):

https://github.com/teixeluis/network-watchdog

Once the script is copied to the device, you only need to adjust the environment variables to your scenario:

RTT_THRESHOLD - the minimum (ICMP) ping RTT, in milliseconds, for the connection to a given node to be considered degraded;

IP_LIST - the list of IP addresses to be tested by the script;

MIN_DEGRADED - the minimum number of nodes from the list that must have a degraded connection in order to assume that the problem lies in the device where the script is being executed.
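
The way these three variables interact can be sketched in Python as below. This is an illustration of the decision rule as described here, not the actual code of the script on GitHub: should_restart is a hypothetical name, the addresses in the example are placeholders, and treating an RTT equal to the threshold as degraded is my assumption.

```python
from typing import Dict, Optional


def should_restart(
    rtts_ms: Dict[str, Optional[float]],
    rtt_threshold_ms: float,
    min_degraded: int,
) -> bool:
    """Decide whether enough destinations look degraded to justify a restart.

    rtts_ms maps each destination to its measured RTT in milliseconds,
    or to None when the destination did not answer at all.
    """
    degraded = sum(
        1
        for rtt in rtts_ms.values()
        if rtt is None or rtt >= rtt_threshold_ms
    )
    return degraded >= min_degraded


if __name__ == "__main__":
    # Placeholder measurements: one slow destination and one unreachable.
    measurements = {"192.168.1.1": 850.0, "8.8.8.8": None}
    print(should_restart(measurements, rtt_threshold_ms=500, min_degraded=2))
```

Requiring several destinations to be degraded at once is what keeps a single slow or offline remote host from rebooting an otherwise healthy router.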

Lastly you need to configure the crontab to run the script at the desired frequency (in this case once every minute):

root@griffinnet-zh-router:~/scripts# crontab -l
* * * * * /root/scripts/network_watchdog.sh >/dev/null 2>&1