Site Reliability Engineer
Taboola
Realize your potential by joining the leading performance-driven advertising company!
As a Site Reliability Engineer on our Infrastructure team at the TLV office, you will play a key role in ensuring the reliability, scalability, and performance of our critical systems. You will be responsible for managing and improving our core infrastructure, with a focus on automation, monitoring, and incident response. You will work with a wide range of technologies, including Kubernetes, monitoring and observability tools, configuration management systems, and core networking services.
To thrive in this role, you'll need:
- 5+ years of experience in a Site Reliability Engineering, Systems Engineering, or similar role.
- Deep understanding of Site Reliability Engineering principles and practices.
- Extensive experience with Kubernetes, including deployment, management, and troubleshooting.
- Strong experience with monitoring and observability tools such as SensuGo, Zabbix, VictoriaMetrics, Prometheus, and ELK.
- Proficiency in configuration management tools such as Puppet and Ansible.
- Solid understanding of Linux internals and networking.
- Experience with managing and maintaining core services such as DNS and networking.
- Strong programming skills in Python and/or Go.
- Experience with both on-premises and cloud environments.
- Experience with KubeVirt.
- Excellent troubleshooting and problem-solving skills.
- Strong communication and collaboration skills.
- Ability to work in a fast-paced, dynamic environment.
- Ability to participate in on-call rotations, including weekends.
Preferred Qualifications:
- Experience with large-scale, distributed systems.
- Experience with other cloud providers (e.g., AWS, Azure, GCP).
- Contributions to open-source projects.
How you’ll make an impact:
As a Senior Site Reliability Engineer, you will:
- Ensure the reliability, availability, and performance of our infrastructure services.
- Manage and maintain our Kubernetes infrastructure, including KubeVirt.
- Design, implement, and maintain our monitoring and observability stack (SensuGo, VictoriaMetrics, Prometheus, ELK).
- Automate infrastructure provisioning, configuration, and deployment processes using Puppet and Ansible.
- Manage and maintain core services such as DNS and networking.
- Troubleshoot and resolve complex infrastructure issues in a timely and efficient manner.
- Participate in on-call rotations and incident response.
- Develop and maintain infrastructure-as-code (IaC).
- Identify and implement proactive measures to prevent incidents and improve system reliability.
- Collaborate with development teams to ensure smooth and reliable deployments.
- Contribute to the design and implementation of new infrastructure solutions.
- Drive improvements in system architecture, processes, and tools.
- Mentor and coach other team members.
Why Taboola?
If you ask Taboolars what they love about working here, they’ll tell you that they’ve been empowered to realize their full potential while growing and learning from and with smart and talented people. They’ll also share more about:
- Culture: As Adam Singolda, Taboola Founder and CEO, says: “You can copy anything from another business, but you can’t copy a company’s culture.”
- Well-being: Enjoy comprehensive benefits (health, 401k, etc.), a fully stocked kitchen, and location-specific perks (gym partnerships, parking).
- Flexibility: We offer a hybrid work schedule with 3 days in-office, with the option to come in more often if desired.
- Work with some of the biggest names: Our publisher partners include Yahoo, Conde Nast, Fox Sports, NBCU, ESPN, CBS, and E! Online. Our advertiser clients include Wells Fargo, Honda, Pinterest, and Expedia.
Ready to realize your potential?