Senior Site Reliability Software Developer
About The Job
We at Tempo are looking for a Senior Software Developer for Site Reliability Engineering to find innovative ways to optimize the development pipeline, the runtime performance, and the availability and efficiency of all our cloud applications and services. If you find cloud security, reliability, performance, and cost optimization projects exciting and you have hands-on experience operating large-scale cloud services, we may well have the job for you!
The role involves
- Developing solutions for scalability and performance challenges to keep our high-availability products up and running.
- Supporting our developers with continuously delivering their software to the Tempo cloud platform.
- Ensuring reliability, responding to outages and supporting the team in resolving pressing and complex technical issues.
- Proactively finding ways to optimize our platform to ensure effective scalability and reduce operational costs.
- Supporting various initiatives regarding cloud security, including identity verification, access controls, and permissions.
- Developing and maintaining our tools for deployment and cluster management.
- Debugging complex production issues using our observability tools, logging, metrics and APM.
The ideal candidate
- BS or MS degree in Computer Science or a related technical field.
- 5+ years of experience with the AWS Cloud / AWS CLI.
- Familiarity with the AWS Well-Architected Framework and cloud security methodologies.
- 2+ years of Kubernetes experience.
- 2+ years of Terraform 0.11+ experience.
- Experience with Terragrunt is a plus.
- 2+ years of Go (Golang) experience.
- A good understanding of Python and the ability to modify Python code.
- Has experience developing in environments with microservices, SOA, or other distributed computing architectures.
- Experience with DevOps, various deployment strategies, and maintaining large-scale, multi-tenant applications.
- Has an impressive track record in managing and working with cloud platforms and cloud automation and monitoring tools. Our stack includes AWS, GCP, Kubernetes, Docker, RabbitMQ, and Datadog. Experience working with these tools or alternative tools in a production setting is a must.
- Has a solid understanding of configuration management and engineering for large-scale websites and/or products, including networking, databases, and operating systems.
- Has a deep understanding of distributed version control systems like Git, including branching and merging strategies.
- Has solid working knowledge of software build tools (e.g. Gradle and Maven) and continuous integration tools.
- Is proactive and creative in identifying ways to improve systems and their reliability.
- Is passionate about automation: we strongly believe in the benefits that repeatable environments bring to a software organization.
What's in it for you?
- Remote work!
- Unlimited vacation!!
- Great benefits plan including health, dental, vision and more
- Great office spaces in Canada & Iceland
- Diverse and dynamic teams
- Challenging and exciting work
- An opportunity to have a real impact on our business
- Free breakfast and snacks
- A great range of social activities
- And so much more!!
Note: As our hiring teams are global, please submit your resume in English only!