AI Powered Root Cause Analysis for Production Alerts


3rd December 2020
Attendee Time: IST: 12:30-13:00 | SGT: 15:00-15:30 | AEDT: 18:00-18:30
Duration: 25 mins
Deepa Elumalai
Site Reliability Engineer, PayPal

At PayPal, the SRE team troubleshoots production alerts from roughly 2,500 applications and services. Resolving an alert is always urgent, and at times we are swamped with alerts that all require attention at once.

In this talk, we will share how we started employing machine learning from the ground up to give our platform the ability to predict the probable root cause of an alert.

We will also explain how we feed existing troubleshooting results (from traditional programming) into machine learning to improve prediction accuracy, and cover the design, the workings, and the methodology we followed in experimental trials to identify the best model. The model we built is integrated with our platform and reports the root cause in real time. It has been showing promising results and is a game changer for SREs.

This presentation will mainly walk you through how we built the machine learning models and put them to work in production.
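
To make the idea of feeding existing troubleshooting results into a model more concrete, here is a minimal sketch. The abstract does not say which model or features PayPal uses, so everything in it is an assumption: a small categorical naive Bayes classifier stands in for the real model, and the check names (`recent_deploy`, `db_latency`, `dependency_errors`) and root-cause labels are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Hypothetical training data (not from the talk): each alert is described by
# the outcomes of rule-based troubleshooting checks, labeled with the root
# cause that was eventually confirmed.
TRAINING = [
    ({"recent_deploy": "yes", "db_latency": "normal", "dependency_errors": "no"}, "bad_deploy"),
    ({"recent_deploy": "yes", "db_latency": "normal", "dependency_errors": "no"}, "bad_deploy"),
    ({"recent_deploy": "no", "db_latency": "high", "dependency_errors": "no"}, "database"),
    ({"recent_deploy": "no", "db_latency": "high", "dependency_errors": "yes"}, "database"),
    ({"recent_deploy": "no", "db_latency": "normal", "dependency_errors": "yes"}, "downstream_dependency"),
    ({"recent_deploy": "yes", "db_latency": "normal", "dependency_errors": "yes"}, "downstream_dependency"),
]


def train(examples):
    """Count label frequencies and per-label feature-value frequencies."""
    label_counts = Counter()
    value_counts = defaultdict(Counter)  # (label, feature) -> Counter of values
    vocab = defaultdict(set)             # feature -> set of values seen in training
    for features, label in examples:
        label_counts[label] += 1
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
            vocab[name].add(value)
    return label_counts, value_counts, vocab


def predict(model, features):
    """Return (root_cause, probability) pairs, most probable first."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)  # prior for this root cause
        for name, value in features.items():
            seen = value_counts[(label, name)][value]
            # Laplace smoothing so an unseen value does not zero out a label.
            score += math.log((seen + 1) / (count + len(vocab[name])))
        log_scores[label] = score
    # Turn log scores into normalized probabilities.
    peak = max(log_scores.values())
    exp_scores = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    ranked = [(label, value / norm) for label, value in exp_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    model = train(TRAINING)
    new_alert = {"recent_deploy": "no", "db_latency": "high", "dependency_errors": "no"}
    for cause, probability in predict(model, new_alert):
        print(f"{cause}: {probability:.2f}")
```

Returning a ranked list rather than a single answer is a deliberate choice in this sketch: an SRE can see the runner-up causes and how confident the model is before acting on the top prediction.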
