Senior Site Reliability Engineer

Date: Jun 20, 2022

Location: Hanoi, VN

Company: Optimizely

Optimizely is focused on unlocking digital potential. We are the recognized category leader in Digital Experience Platforms (DXP), and we created the category for A/B testing and experimentation software. We have incredible customers; isn’t that one of the most important aspects of looking for your next job? Optimizely serves over 9,000 brands, from global organizations such as Visa, Sky, Yamaha, and the Wall Street Journal to tech innovators like Atlassian, DocuSign, FitBit, and Zillow.


Not only are we financially sound and growing, but we also have unicorn status: we exceeded $300M in revenue in 2020, are already profitable, and have all strategic options ahead of us. Optimizely continues to invest and addresses a market opportunity north of $30 billion, providing significant personal career growth opportunities.


We have an inclusive culture and a global team of 1200+ people across the US, Europe, Australia, and Vietnam. We blend European and American business culture with an emphasis on teamwork, inclusion, and moving fast. People make the difference!


If you are looking to work on the next generation of digital technologies in a fast-paced, hyper-growth environment, apply! We’re just getting started...


SREs at Optimizely are focused on making us the most reliable, performant, and trustworthy Digital Experience Optimization platform ever. Our engineering teams have built data pipelines that process 10 billion events daily and applications that support powerful experimentation and collaboration workflows at scale. Our platforms are built on AWS and GCP. We use technologies such as Kafka, Samza, HBase, MySQL, and Postgres. We build and manage our systems using TravisCI, Jenkins, Docker, Kubernetes, Terraform, and Chef. We use a combination of managed and self-hosted approaches. This is a unique opportunity to lead the engineering organization in areas of standardized automated infrastructure and service provisioning and orchestration, service-oriented architectural excellence, and forward-looking planning and execution of large technical projects.

Job Responsibilities

  • Help build a Site Reliability Engineering culture across the organization by sharing your best practices, approaches, documentation, and code with other engineering teams
  • Apply automation and software to any tasks or parts of the system that would benefit from it or are performed manually
  • Ensure effective performance and 24x7 availability of all production systems
  • Monitor alerts coming out of all of Optimizely’s platforms, and coordinate with Operations/SRE/TSS/Engineering teams as necessary to take preventive or corrective action to resolve any incidents, with the goal of minimizing MTTA/R
  • Put in place and manage an effective on-call rotation within the team
  • Work with engineering teams to set up proper monitoring and alerting thresholds across all Optimizely services and applications so the SRE team can focus on the key areas needed to stabilize the platforms
  • Document your system knowledge as you acquire it over time, create runbooks, and ensure critical system information is readily available to those who need it
  • Take accountability for platform uptime SLAs.

Knowledge and Experience

  • Proven experience with AWS/Azure cloud infrastructure and DevOps
  • Experience using Kubernetes to build containerized applications
  • Good understanding of the Identity Governance catalog
  • Experience building secure multi-tier web applications
  • Experience configuring continuous integration and continuous delivery (CI/CD) systems such as TeamCity
  • Proficiency with databases such as SQL Server, Postgres, and MongoDB
  • Proficiency with ELK
  • Strong desire to learn and collaborate with the team
  • Must have a strong passion for continuous improvement
  • Ability to work with remote coworkers in other time zones
  • Familiarity with Agile development methodologies such as Scrum
  • Fluent in English, both written and spoken.

Bonus points:

  • Experience building scalable multi-region applications
  • Proficiency in scripting/programming languages such as PowerShell, Bash, or Python
  • Experience configuring software monitoring tools such as DataDog, Kibana, ELK, etc.
  • Proficiency using configuration management tools such as Terraform.


Bachelor of Computer Science or equivalent industry experience


Displaying Technical Expertise
Critical Thinking
Testing and Troubleshooting
Demonstrating Initiative
Utilizing Feedback

Optimizely is committed to a diverse and inclusive workplace. Optimizely is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.