
Sunday, October 18, 2020

Improving an RPI 2 based Home Assistant server for reliability and performance

For some time I have been using the same Raspberry Pi 2 v1.1 as the infrastructure for my Hass.io instance. It has performed quite reliably over the approximately 18 months I have been running it 24x7. For approximately one year of that time I used the same SanDisk Ultra XC I 64 GB MicroSD card:


As a precaution, I later switched to a similar card, a SanDisk Ultra HC I 32 GB, and moved my Hass.io installation onto it:


But now, as I assign more functions to this platform, and given that it has a few relevant roles such as hazard and intrusion detection, I decided to go one step further in making the platform less likely to fail. Such failures range from a MicroSD card becoming corrupted or worn out to a power outage causing loss of access to the instance and eventual data loss.

I haven't gone as far as building a redundant setup, although this would be a very interesting challenge to pursue, as it seems to be rather uncharted territory, at least as far as Hass.io and HassOS on a Raspberry Pi are concerned.

But for now, my ambition was to take the storage fragility out of the way, survive power outages and be able to perform a clean and automated shutdown if needed.

The first point raises the question: why is the default storage option, a MicroSD card, a fragile one? Well, the reliability of storing data on a MicroSD card depends on how the card is used.

MicroSD cards (and other similar consumer-oriented storage media) have been designed around a few predominant use cases, which allows certain heuristics to be applied and so reduces costs. For instance, when these storage devices appeared in the market, the most prevalent applications were digital cameras, personal digital assistants (PDAs), and later smartphones. As you probably know, the type of memory these devices are based on (called flash memory) is a type of solid state storage (silicon chips, just as with RAM and conventional ROM) that has some limitations. Its building block is the MOSFET transistor, which we usually think of as a "forever lasting" device in the world of electronics, but in this case it is not quite like that: the special type of MOSFET (the floating-gate MOSFET) that constitutes each bit of flash memory degrades over time, requiring more time to change state until it eventually fails to retain a consistent state. This degradation is caused for the most part by the erase cycles, and there are only so many cycles a flash memory cell can endure.

MicroSD card manufacturers implement some form of wear leveling in their devices, and the exact algorithms vary by make and model. It is generally assumed, however, that the sophistication of these algorithms is limited and that they cannot provide optimal wear leveling: given the current state of silicon chip integration and cost reduction, a sufficiently capable microcontroller would likely not fit in the limited room left in a MicroSD card, which as expected is mostly occupied by the flash memory itself. Besides that, good wear leveling algorithms tend to require substantial amounts of RAM, which would again compete for the available physical space.

Considering again the predominant use cases, such as digital cameras and smartphones, it is easy to understand how manufacturers can "get away" with heuristic wear leveling methods: in the digital camera case, files are normally written incrementally to the SD card until it is full. The user then transfers the files and either erases them from the card or performs a quick format, which normally does not erase the entire card. This type of pattern is not particularly demanding from an erase cycle point of view. Assuming the wear leveling algorithm is at least capable of relocating the most frequently written logical areas of the volume, such as the file allocation table and directory entries, the rest is naturally wear leveled by the usage pattern described.
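To make the difference between usage patterns more concrete, here is a toy simulation. It is only a sketch under my own assumptions: the block count, the number of writes and the share of "hot" writes are made-up figures, and the wear leveling modelled here is an idealised one (always pick the least worn block) that ignores the cost of relocating data. It is not how any particular card controller works.

```python
import random


def simulate(num_blocks, writes, hot_fraction, wear_level, seed=0):
    """Return the erase count of each physical block after `writes` rewrites.

    hot_fraction: share of writes that hit logical block 0 (think of a
        file allocation table); the rest are spread uniformly at random.
    wear_level: if True, every write goes to the least worn physical block
        (idealised, hypothetical scheme); if False, logical block i always
        lives in physical block i.
    """
    rng = random.Random(seed)
    erases = [0] * num_blocks
    for _ in range(writes):
        if rng.random() < hot_fraction:
            logical = 0
        else:
            logical = rng.randrange(num_blocks)
        if wear_level:
            # Choose the physical block with the fewest erases so far.
            physical = min(range(num_blocks), key=erases.__getitem__)
        else:
            physical = logical
        erases[physical] += 1
    return erases


# Hypothetical figures: 64 blocks, 10000 rewrites, half of them "hot".
plain = simulate(64, 10_000, hot_fraction=0.5, wear_level=False)
leveled = simulate(64, 10_000, hot_fraction=0.5, wear_level=True)

print("most worn block without wear leveling:", max(plain))
print("most worn block with ideal wear leveling:", max(leveled))
```

Without wear leveling, the hot block absorbs roughly half of all erases by itself, while the idealised scheme spreads them almost evenly over all blocks.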

However, my usage scenario with the Raspberry Pi and Hass.io will hardly be gentle on the MicroSD card: a fully fledged OS is present, and on top of it Docker containers are spun up (supporting the core components as well as add-ons). Each Docker container has its own volumes, and each will eventually require data to be written to disk. There is also a SQLite database keeping track of the sensor data and events. If no wear leveling were present, it is not difficult to foresee that any flash storage media would be toasted quite quickly.
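A back-of-the-envelope calculation shows how quickly "toasted" could be. All the numbers below are assumptions chosen for illustration (an endurance of 3000 erase cycles per block, one rewrite of the same block every 10 seconds, 1000 blocks available to share the load); they are not measurements from my setup or figures from any datasheet.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_worn(pe_cycles, erases_per_day, blocks_sharing_load=1):
    """Days until the erase budget is used up.

    pe_cycles: erase cycles one block can endure (assumed value).
    erases_per_day: erases caused by the workload each day.
    blocks_sharing_load: how many blocks the erases are spread over;
        1 means no wear leveling at all.
    """
    return pe_cycles * blocks_sharing_load / erases_per_day


# Hypothetical workload: the same block is rewritten once every 10 seconds.
erases_per_day = SECONDS_PER_DAY / 10

days_without = days_until_worn(3000, erases_per_day)
days_with = days_until_worn(3000, erases_per_day, blocks_sharing_load=1000)

hours_without = days_without * 24
print(f"no wear leveling: about {hours_without:.1f} hours")
print(f"ideal leveling over 1000 blocks: about {days_with:.0f} days")
```

The point is only the ratio: spreading the same erases over N blocks stretches the lifetime by a factor of N, which is why both the quality of the wear leveling and the amount of spare space matter.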

In order to be less exposed to this problem, I decided to replace the MicroSD card with the next best solution. A traditional spinning hard drive doesn't have the problem of areas of the media becoming damaged by write cycles, but it fails for other reasons: being made up of moving parts, these eventually wear out or get damaged by physical forces. Its longevity depends in great part on how long the disk is kept spinning. While the energy management software may be able to spin down a drive after a period of inactivity, in a real world situation this will hardly ever happen, as disk access is requested frequently.

With SSD drives becoming cheap, one will hardly have to think twice before purchasing one instead of a traditional HDD, except where storage density is really important and performance is secondary.

I managed to get a promotion for a 480 GB unit selling for 38 Euros. 

The drive was an Asenno AS25 480 GB SATA3 SSD, from a Chinese brand that is mostly unknown (at least in the western world):


After the weekend sale, the price returned to a more regular market price of 51.89 €:


My reasoning was that even if this turned out to be a lousy unit, it would likely still beat the SD card because: i) it has a lot more storage space (480 GB vs 64 GB), hence the wear leveling algorithm will be happy with plenty of free space; ii) being an SSD, it would likely have a more sophisticated wear leveling strategy, given the room for a large SSD controller; iii) even if the performance drops from the initial 500 MB/s (as several users have reported), it should not make much difference, because on my Raspberry Pi 2 I will at most be able to squeeze 60 MB/s (the 480 Mbps of the USB 2.0 bus) out of it.

Regarding the cost, a similarly sized MicroSD card would still cost me a lot more than 38 Euros at today's prices.
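The arithmetic behind point iii) is a plain unit conversion, sketched below. The function name is just an illustrative helper, and the result is the raw signalling rate of the bus, so it should be read as an upper bound: protocol overhead means real transfers will be lower.

```python
def megabits_to_megabytes_per_s(megabits_per_second):
    """Convert a rate in Mbit/s to MB/s (8 bits per byte)."""
    return megabits_per_second / 8


usb2_limit = megabits_to_megabytes_per_s(480)  # USB 2.0 high-speed rate
ssd_rated = 500                                # MB/s, the drive's initial figure

share = usb2_limit / ssd_rated
print(f"USB 2.0 ceiling: {usb2_limit:.0f} MB/s")
print(f"share of the SSD's rated speed the bus can carry: {share:.0%}")
```

In other words, the bus is the bottleneck by a wide margin, so even a substantial drop in the SSD's own speed would go unnoticed on this board.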

After having the unit delivered, I couldn't resist the curiosity of looking inside it to really understand what was there before putting it into daily use. The first thing I noticed with this product was the plastic case, instead of the metal case that many SATA SSD drives still have (I do not think that metal or plastic makes much difference technically, but I do believe that a metal case is a way of gaining the consumer's confidence by preserving some similarity to the traditional spinning hard drive):


After removing the cover, I was somewhat surprised to find just a small PCB compared to the area of the complete 2.5" drive:


I removed the board, and found that there really wasn't much going on inside it: a single flash chip, the controller chip, and some extra components such as a 50 MHz clock oscillator, and what appeared to be DC/DC converter components:


Looking closer at the flash memory chip, I noticed that it was SanDisk branded and marked with the following codes, which I was unable to look up online: 60363 512G / MALAYSIA / 9452YXCYV07J / CSS997-N53QS3MM-080. Either it is not a very popular memory chip, or (most likely) the OEM provider of this SSD drive had an agreement with SanDisk not to disclose the exact memory chip in use. There is also the possibility that it is not a true SanDisk chip, but I would say that is less likely.


Regarding the SSD controller, it is the relatively popular Silicon Motion SM2258XT:


It is present in many common SSD drives, such as the Crucial BX500. It is in the entry level range of these flash memory controllers, as it doesn't use SDRAM for the wear leveling algorithm. Instead it uses a flash memory based strategy. It penalizes performance and flash memory longevity to some extent, compared to the more expensive SDRAM based controllers.

Overall I was satisfied, and under the impression that it could not be much worse than the Crucial BX500 drive, as it shares much of the same design and components. Even the plastic case is a choice that Crucial has made for its low end drives.

The next topic will be how I managed to use this SSD for running Hass.io and no longer depend on the MicroSD card for writing data, all while working around a limitation of the old Raspberry Pi 2 v1.1: it does not support booting from an external USB drive.

