You just deployed a new version of your application or micro-service; how do you know everything works as expected? You run a comprehensive test suite to verify functional correctness for known scenarios, plus performance tests, before deploying. But does your application really work right now, or is it just responding with error messages to all incoming requests?

I’m part of the team that runs a huge infrastructure for SAP HANA development. This infrastructure is vital for nearly all development and testing activities of SAP HANA. As it is powered by multiple applications developed in-house, we want to know immediately when an application starts to fail, and we need to be able to quickly diagnose what caused the failure.

This talk will give you an overview of how we monitor our full stack, from the 2000 physical machines up to the 10,000 Python application processes, micro-service instances and batch processing jobs running in parallel. It includes a review of the tools used, bad and good examples of instrumentation in Python code, the resulting visualisation and an outlook on upcoming improvements.

