Site Reliability Engineer Job at ITR, Oak Ridge, TN

  • ITR
  • Oak Ridge, TN

Job Description

Senior Site Reliability Engineer

  • Must be able to travel onsite periodically (Oak Ridge, TN)
  • Must be eligible for a Federal Security Clearance (US Citizen)

Major Duties/Responsibilities:

  • Lead ongoing improvements in reliability and scalability for our Kubernetes- and Linux-based applications and services.
  • Serve as a senior technical resource to define and implement best practices and standards for the center.
  • Provide primary operational support and engineering for production applications.
  • Define and implement KPIs and processes, and drive continuous improvement.
  • Influence the architecture and implementation of solutions.
  • Tune operating systems and applications to increase performance and reliability of services.
  • Mentor junior staff and enable them for success.
  • Diagnose system operational problems quickly and effectively.
  • Participate in an on-call rotation providing 24-hour, 7-day support, and cover off-hours maintenance windows.
  • Coordinate with vendors to resolve hardware and software problems.
  • Deliver mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service. Promote diversity, equity, inclusion, and accessibility by fostering a respectful workplace – in how we treat one another, work together, and measure success.

Basic Qualifications:

Bachelor’s degree in computer science or a closely related field and a minimum of 8 years of experience as an SRE/Systems Engineer. An equivalent combination of education and experience may be considered.

Preferred Qualifications:

  • Excellent interpersonal/communication skills, and the ability to work as part of a team.
  • Strong working knowledge of Unix system fundamentals and common network protocols.
  • Experience managing Linux/UNIX operating systems in a heterogeneous environment.
  • Solid understanding of networked computing environment concepts.
  • Ability to develop and maintain programs and scripts that aid in operations and automation, using shell (primarily bash) and high-level languages (Python or Go).
  • Ability to proactively identify performance issues, problems, and areas for improvement.
  • Ability to identify requirements and to define, plan, and implement requisite solutions.
  • Ability to plan, organize, prioritize tasks, and complete assigned projects with minimal supervision.
  • Experience with continuous integration and continuous deployment software methodologies and how they apply to SRE/systems engineering.
  • An understanding of code review and familiarity with tools like GitHub and GitLab.
  • Experience using tools such as Nagios, Grafana, and Prometheus to monitor systems and metrics and to create dashboards.
  • Experience designing and implementing highly available systems/services utilizing virtual machines and Kubernetes resources.
  • Experience participating in an open-source community with patches accepted upstream.
  • Experience deploying and maintaining automated configuration management software such as Puppet or Ansible.
  • Experience implementing systems-level security technologies like SELinux and following security best practices.
