How to be a well-grounded Site Reliability Engineer ⚒️

A short summary of the skills needed to be a well-grounded Site Reliability Engineer

Mar 15, 2023

People often ask what SRE is and what you need to learn to become a good Site Reliability Engineer.

Site reliability engineering (SRE) is a software engineering approach to IT operations. SRE teams use software as a tool to manage systems, solve problems, and automate operations tasks. (Red Hat)

In this post, I will share some of the fundamental and crucial skills required to be a well-grounded Site Reliability Engineer.

Skills & Knowledge Required:

1. Coding skills - a problem-solving mindset for working through coding problems. Any language will do.

2. Computer networking - you need a good understanding of the OSI model, TCP/IP, the common networking protocols and how they fit together. You should also be familiar with everyday Linux networking tools like ssh, ping, curl, dig, netstat etc.

3. Linux fundamentals - a good understanding of basic Linux commands and the internals of the Linux operating system. This will be useful when troubleshooting Linux problems.

4. System design fundamentals - one of my favourite topics. A good understanding of how to design highly available and scalable systems. You should be aware of fundamental concepts like proxies (forward & reverse), load balancing, DNS, CDN, the client/web server model, caching, message queues, event-driven architecture and more. I go through all these on my blog, CoderCo.

5. Technologies - AWS/Azure/GCP, Git, Linux, Kubernetes, Chef/Puppet/Ansible, Terraform, Docker, GitHub Actions, Golang/Python etc.

6. SRE processes - you should have a basic understanding and awareness of good monitoring/alerting practices, SLOs/SLAs/SLIs, error budgets, release engineering (staging/canary), on-call, postmortems, capacity planning etc.
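
To make the SLO and error budget idea from point 6 a bit more concrete, here is a minimal Python sketch. The function names, the 30-day window and the example numbers are my own assumptions for illustration, not a standard API or figures from a real service. It shows how an availability target turns into allowed downtime, and how much of a request-based budget is left after some failures.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines".
    The 30-day default window is an assumption for this example.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    1.0 means nothing has been spent, 0.0 means the budget is used up,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical service with a 99.9% availability SLO.
    print(f"Downtime budget: {error_budget_minutes(0.999):.1f} minutes per 30 days")
    print(f"Budget remaining: {budget_remaining(0.999, 1_000_000, 250):.0%}")
```

With a 99.9% target, the budget works out to about 43.2 minutes of downtime in a 30-day window. Teams often use the remaining budget as a signal for how much release risk they can take on.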

I write about software engineering, DevOps, SRE and system design on my blog here at CoderCo.

#sre #devops #softwareengineering #aws #azure #kubernetes #architecture
