Senior Site Reliability Engineer

Date: Apr 2, 2024

Location: Stockholm, SE, 111 23

Company: Optimizely

At Optimizely, we're on a mission to help people unlock their digital potential. We do that by reinventing how marketing and product teams work to create and optimize digital experiences across all channels. With Optimizely One, our industry-first operating system for marketers, we offer teams flexibility and choice to build their stack their way with our fully SaaS, fully decoupled, and highly composable solution.  


We are proud to help more than 10,000 businesses, including H&M, PayPal, Zoom, and Toyota, enrich their customer lifetime value, increase revenue and grow their brands. Our innovation and excellence have earned us numerous recognitions as a leader by industry analysts such as Gartner, Forrester, and IDC, reinforcing our role as a trailblazer in MarTech. 

 

At our core, we believe work is about more than just numbers -- it's about the people. Our culture is dynamic and constantly evolving, shaped by every employee, their actions and their stories. With over 1500 Optimizers spread across 12 global locations, our diverse team embodies the "One Optimizely" spirit, emphasizing collaboration and continuous improvement, while fostering a culture where every voice is heard and valued. 

 

Join us and become part of a company that's empowering people to unlock their digital potential! 

Introduction

Optimizely is looking for its next SRE superstar in Sweden!

 

Every unicorn company wants you to look at their shiny new stuff - but we know you’re smarter than that! You look at how well the gears turn. You listen for any noise coming from the engine, so to speak.

 

You have a strong urge to understand and learn how systems work (or shouldn’t work). You’re happy to step up and share what you know and drive the change needed to make systems and people live up to their full potential, if just “tweaked a little.”

 

You are vocal - when you see something wrong, you tell the people who depend on it.

 

You explain that it needs fixing - then you help fix it or suggest how they do it.

 

You understand the value of simple and solid - not over-engineered.

 

If this resonates with you, keep on reading to see how we get things done in the team...

 

Job Responsibilities

As an SRE in the team you will:

  • Design, deploy and maintain cloud infrastructure
    • For design and deployment we like Terraform (but as with any other tech company, we have lots of options).
    • We maintain a multi-cloud footprint (Azure/GCP/AWS), but we try to go with Azure whenever possible.
    • For our edge tech we’re very good friends with Cloudflare.
    • Our systems are engineered across IaaS, PaaS and SaaS components (k8s, VMSS, AppServices, SQL, Mongo, Elasticsearch, DNS and a lot more).
  • Work with monitoring, alerts & logs
    • We are consolidating our observability platform on Datadog, but we run ELK, Azure Application Insights and others as well.
  • Be on call, respond to incidents and conduct post-mortems.
    • When stuff breaks - we fix it. Simple as that.
    • Then we do retros and let people know why stuff broke. The outcome is improvement stories that lead to less on-call - everyone wins!
  • Build tools and automations
    • No one likes toil, but we really hate it (we still have it though…). So whenever possible - we automate it. This means you need to know some scripting/programming like PowerShell, Bash or Python.
    • Then you do ‘git request-pull…’ to make sure we don’t have to do it again.
  • Be scalable
    • We like to plan our work, not react to it.
    • We work according to process to cut down on overhead.
    • We establish repeatable patterns (so you don’t have to reinvent the wheel all the time).
    • We don't stick to a specific framework, but we borrow from ITIL, SAFe and others when it brings us value.
    • Whenever possible we delete or simplify - the best part is no part.
  • Most important - make sure you have fun!
    • You will spend a lot of your daytime (CEST) with us, so we want you to have a great time!
    • Learn new tech! We get to play with a lot of cool new stuff, and if you want, we’ll sponsor your certification!
    • Bring your meme A-game.

Knowledge and Experience

  • 5+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, System/Network Administration or similar roles.
  • 3+ years of documented experience in operating large distributed systems. 
  • Strong experience with cloud platforms, preferably Azure. 
  • In-depth knowledge of technologies such as K8s, Docker, Elasticsearch, .NET, SQL, MongoDB, GitHub, Datadog, Cloudflare, IAM, Terraform, and various CI/CD tools.
  • Experience in scripting/automation/development, preferably with Python, PowerShell and Bash.
  • Excellent problem-solving skills and the ability to troubleshoot complex infrastructure & application issues. 
  • Strong communication and collaboration skills, with the ability to work effectively across teams and with stakeholders at all levels. 
  • A commitment to continuous improvement and knowledge sharing through documentation, demos, mentoring etc.
  • Proficient in both spoken and written English, with the ability to speak and write Swedish at a sufficient level.

Optimizely is committed to a diverse and inclusive workplace. Optimizely is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.