Site Reliability Engineer (m/f/d) – Gigafactory Berlin-Brandenburg

Tesla

Gigafactory Berlin-Brandenburg marks a defining moment in Europe for Tesla’s mission of accelerating the world’s transition to sustainable energy. Here we will launch revolutionary products on a massive scale, using extraordinary speed, innovation and efficiency. Our employees will ensure that we achieve our goals, and we welcome you to help us write the next chapter of our success story.

Role

The Core Automation Services (CAS) team at Tesla is building applications to enable manufacturing, with an eye towards reliability, availability, scalability, speed and security. We’re a diverse team composed of Controls Automation Engineers, Software Engineers, and various other disciplines that help facilitate automated manufacturing processes.

As an SRE on the CAS team you’ll be working with the infrastructure, systems and applications that act as the middleware layer between Programmable Logic Controllers (PLCs) and the outside world, such as databases, MES systems and other services.


Responsibilities

  • Support the interim HMI/SCADA vendor application (Ignition from Inductive Automation)
  • Build tooling around Ignition, evaluate its usage, and help ensure its reliability, availability and security
  • Design systems and infrastructure that enable automated manufacturing at Tesla
  • Assist Software, Controls, Manufacturing and other Engineers with onboarding and integrating services into the Tesla stack (Kubernetes/VMware/bare metal)
  • Ensure best practices and observability for services, including metrics, logging, tracing, and alerting
  • Automate configuration and deployment of services
  • Consult on and design infrastructure, systems and application architecture

Requirements

  • Experience with Virtualization (vSphere) and/or Containerization (Kubernetes)
  • Expert skills in Linux and its administration (Ubuntu 18.04/20.04)
  • Understanding of networking (Routing/Switching, VLANs, Firewalls, and Load Balancers)
  • Experience in a high-level language such as Go, Python and/or Java
  • Observability (Prometheus, Alertmanager, Grafana, Jaeger, and Splunk)
  • Infrastructure as Code (Ansible / Terraform)
  • CI/CD pipeline experience (GitHub Enterprise)
  • Artifact management (Artifactory)
  • Strong bias for action over endless planning; willing to get your hands dirty, make mistakes sometimes, and learn from failure
  • Several years of solid hands-on experience as a DevOps or Site Reliability Engineer
  • Habitual documenter and spreader of knowledge
  • Willing to mentor other team members and engineers with less software engineering or IT knowledge
  • Comfortable on an on-call rotation
  • Comfortable doing live troubleshooting of issues on NOC bridges/outage calls

What we offer

You will be working in our state-of-the-art Gigafactory, where you’ll solve the world’s most interesting problems with the best and brightest people who share a passion to change the world. Tesla’s compensation package includes a competitive salary and Tesla shares or bonuses. Typical benefits include a pension program, 30 vacation days, employee insurance, and relocation and commuting support.