Monitoring in the time of Cloud Native – Cindy Sridharan – Medium


  • Monitoring in the time of Cloud Native

    While we’re still at the stage of early adoption, with the failure modes of these new paradigms still nebulous and not widely advertised, these tools are only going to get better with time. Soon enough, if not already, we’ll be at the point where network and underlying hardware failures have been robustly abstracted away from us, leaving us with the sole responsibility of ensuring our application is good enough to piggyback on top of the latest and greatest in networking and scheduling abstractions.
    No amount of GIFEE (Google Infrastructure for Everyone Else) or industrial-grade service mesh is going to fix the software we write. Better resilience and failure-tolerant paradigms from off-the-shelf components now mean that, assuming said off-the-shelf components have been configured correctly, most failures will arise from the application layer or from the complex interactions between different applications. Trusting Kubernetes and friends to do their job makes it more important than ever for us to focus on the vagaries of our application’s performance characteristics and business logic. Application developers now have one job. We’re at a time when it has never been easier for application developers to focus on just making their service more robust and trust that, if they do so, the open source software they are building on top of will pay the concomitant dividends.