Site Reliability Engineer

United States

At Moove It we design, develop, and deploy custom software solutions for organizations that want to make an impact through technology.

Officially recognized as a Great Place To Work in LATAM, we offer the perfect balance between work and a fulfilling life. Here you can develop your passions and form strong relationships. Moove It is your opportunity to leave your mark in creating a better world for everyone, working on international projects that have a positive impact on our society.

Our growing SRE team needs a new teammate! We are looking for a service-oriented person, with programming experience in at least one language, strong communication skills, and enthusiasm for helping others.

You are excellent at doing the following:

  • Design and implement tooling to improve the availability, scalability, observability, and latency of our services, which are used by developers and customers to deploy and operate their services.
  • Share an on-call schedule for the platform services you own and respond to incidents alongside engineering teams.
  • Take a lead role in implementation and maintenance of system health monitoring and alerting.
  • Collaborate closely with architects, developers, and database administrators to ensure the reliability and scalability of the infrastructure
  • Ensure application performance, uptime, and scale, maintaining high standards of code quality and thoughtful design
  • Create a DevOps culture of communication and support between product engineering and our SRE Studio

What will make you a perfect fit

  • Over 4 years of experience contributing to the architecture and design (including reliability and scaling) of new and existing systems.
  • Solid understanding of Linux containerization with Docker.
  • 2+ years of production experience with AWS or other providers.
  • 2+ years of experience with Kubernetes, ECS or similar orchestration frameworks.
  • Programming skills in any language, preferably Python, Go, Node.js, or Ruby.
  • Experience implementing and maintaining observability tools (e.g. DataDog, New Relic, Prometheus, Grafana)
  • Ability to identify root-cause sources of instability in high-traffic, large-scale distributed systems.
  • Experience with configuration management and orchestration (e.g. Terraform, CloudFormation)
  • Good written and verbal communication skills in English
  • Experience working as a software engineer
  • Familiarity with software change management and software development processes
  • Experience working in an English speaking environment
You don't have to meet all requirements to be able to apply.
Apply if you think you are a good fit and we will take it from there.

The benefits that make us a great fit:

Work-life balance

Growing together

Diverse and healthy environment

What are you waiting for? Apply now or refer a friend!

Apply here

Or drop us an email to