Site Reliability Engineer
Location: Dublin, Ireland or London, UK
Duration: 6 months - 1 year CTH
Who we are:
X serves our community of users and customers by working tirelessly to preserve free expression and choice, create limitless interactivity, and create a marketplace that enables the economic success of all its participants.
What you'll do:
X’s infrastructure team operates a number of internally developed foundational services that almost every engineer at X uses to build higher-level distributed applications. Services such as observability, distributed coordination, service discovery, and others have been purpose-built to meet our demanding scale, latency, and reliability requirements. These services are mission-critical to X, so we can’t afford any downtime. Operating our own systems at hyper-growth levels is a rewarding challenge, and it is part of what makes us great. We are a tight-knit, passionate team that values a fast pace, innovation, creativity, and a strong commitment to our mission. Come help us keep X online by ensuring our critical infrastructure services are always up.
Your day-to-day responsibilities will include:
- Monitoring operational health of foundational infrastructure services
- Ensuring the performance, reliability, and scalability of our backend systems
- Troubleshooting and debugging
- Collaborating with other X engineering teams to mitigate and resolve operational issues and to improve service monitoring and alarming
- Developing the team's operational processes and tooling
- Increasing team effectiveness through automation
Who you are:
We're looking for engineers who are passionate about operating mission-critical infrastructure at massive scale:
- 4+ years of site reliability, DevOps, or software engineering experience, preferably in operational roles
- Experience with developer workflows and software engineering best practices (e.g. unit testing, code reviews)
- Experience with tools like Git and Puppet
- Familiarity with the basics of the Linux operating system and ecosystem (cron, systemd, etc.)
- Proficiency in one or more object-oriented programming languages (Python, Java, Scala, C++, etc.)
- Experience with business-critical, large-scale distributed systems