When Machines Fix Themselves: Alibaba’s Self-healing Hardware Solution
▻https://hackernoon.com/when-machines-fix-themselves-alibabas-self-healing-hardware-solution-d13
Alibaba is implementing a closed-loop self-healing hardware strategy to automatically resolve failures. In big data systems, operation and maintenance work plays a crucial role in ensuring that service interruptions from hardware and software failures do not threaten the overall stability of platforms. Given the challenges of doing so in massive data environments, groups such as Alibaba are increasingly seeking automated solutions that simplify the response work of the personnel responsible, with one recent effort being self-healing hardware systems. For Alibaba, the offline computing platform MaxCompute handles a massive 95 percent of all data storage and computing needs, which at the growing scale of Alibaba’s business already encompasses hundreds of thousands of server units. Among these (...)
#big-data #cloud-computing #system-architecture #machines-fix-themselves #devops