Founding Site Reliability Engineer (Remote - US) Job at Jobgether, San Francisco, CA

  • Jobgether
  • San Francisco, CA

Job Description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Founding Site Reliability Engineer in the United States.

This is a unique opportunity to join a rapidly growing AI company as the first SRE hire in the San Francisco office. In this role, you will define and scale the Site Reliability Engineering discipline, ensuring the platform is reliable, secure, and performant at enterprise scale. You will work closely with engineering leads, product teams, and company founders to build infrastructure, establish best practices, and drive the organization’s reliability culture. The role involves hands‑on system design, automation, and observability work, while providing leadership and strategic input to shape long‑term operational excellence. Ideal candidates are technically strong, highly collaborative, and motivated by building world‑class systems from the ground up.

Accountabilities

  • Establish and scale the SRE discipline, including best practices, tooling, and culture.
  • Ensure 99.9% uptime of production systems and maintain global platform reliability.
  • Architect, automate, and manage AWS infrastructure using Terraform, CI/CD pipelines, and Infrastructure as Code.
  • Design and implement observability systems across microservices, APIs, and vector workloads, including metrics, tracing, and logging.
  • Lead incident management, reducing MTTR through runbooks, alerts, and postmortems.
  • Collaborate with engineering teams to embed reliability principles into the software development lifecycle.
  • Influence organizational strategy and culture as a founding voice in the engineering team.

Qualifications

  • 5+ years of experience in SRE, DevOps, or infrastructure roles, ideally in enterprise SaaS environments.
  • Expertise in AWS services (EC2, ECS/EKS, Lambda, RDS, VPC, IAM).
  • Proven experience with Infrastructure as Code (Terraform, Kubernetes/EKS, CDK, or CloudFormation).
  • Hands‑on experience with observability and monitoring stacks (CloudWatch, Grafana, Prometheus, Datadog).
  • Experience in incident management, on‑call responsibilities, and postmortem‑driven reliability improvements.
  • Bonus: exposure to AI/ML platforms, data‑heavy systems, or multi‑agent workloads.
  • Strong problem‑solving, communication, and collaboration skills.

Benefits

  • Competitive salary and equity options.
  • Health, dental, and vision insurance, including coverage for dependents.
  • Paid time off and holidays, with parental leave benefits.
  • 401(k) plan and other financial perks.
  • Opportunity to shape company culture and systems at a high‑growth AI startup.

Thank you for your interest!

Job Tags

Work at office, Remote work
