Site Reliability Engineer

Job Location	London
Education	Not Mentioned
Salary	Salary negotiable
Industry	Not Mentioned
Functional Area	Not Mentioned
Job Type	Permanent, full-time

Job Description

SITE RELIABILITY ENGINEER JOB OPPORTUNITY - REMOTE WORK

ROLE - SITE RELIABILITY ENGINEER
LOCATION - REMOTE WORK
DURATION - 3 MONTHS ROLLING CONTRACT
RATE - COMPETITIVE

Our client is an international energy and services company providing energy services to consumers across the globe and developing innovative products and solutions to deliver power to the customer in more ways than one.

New Energy Platform is a new business unit, building the future energy supply platform for our UK customers. As a key part of the Technology function, we are creating a Site Reliability Engineering (SRE) team. The SRE team will work with our squad-based engineering teams, global security and networks team and key vendors to determine which aspects of applications should be monitored, which alerts should be raised and what tooling or automation should be put in place to aid issue resolution and capacity planning.

The engineering teams already follow a DevOps approach, owning both the code and production, and we consult with them to adopt more SRE concepts and practices, helping them develop and meet their SLOs, learn from outages, and navigate a path from an honest assessment of where they are to where they would like to be.

Duties

Deep understanding of SRE philosophy, technologies, platforms and tools, SLA management, incident resolution, and automation

Focus on system reliability, performance, and supportability by balancing feature development velocity and reliability with well-defined SLOs

You'll improve CI/CD pipelines to increase development squads' velocity and confidence while automating provisioning, quality controls, security auditing and maintenance

Establish, manage and optimise our monitoring solutions to achieve observability

Support squads with best practices in monitoring and improving alert thresholds

Design monitoring systems that prioritize the customer perspective and experience

Contribute to architectural and design principles to drive reliability, scalability and reusability for a large-scale distributed platform

Work with development squads to implement automation opportunities to drive down toil and reduce technical debt

Carrying out end-to-end stability inspections to take a holistic view of system health and proactively mitigate customer impacts

Firefighting stability problems with business teams and engaging in troubleshooting, service capacity planning and demand forecasting, platform performance analysis and system tuning

Conducting post-incident reviews and trend analysis and owning the learning loop back to the development squads

Providing reports on system health built around the service level indicators (SLIs)

The role requires flexibility to participate in rotating on-call duties and timely post-mortems of production incidents.

Key competencies required

BS degree in Computer Science or related technical field involving coding or systems engineering

5+ years working in an SRE or DevOps team supporting a scaled production platform

Certification(s) within Cloud Architecture and/or AWS

Experience of implementing, maintaining and optimising a CI/CD pipeline

Real-world coding experience, whether that's with traditional compiled languages, scripting languages, or both.

Experience of working within Cloud Computing and familiarity with Infrastructure as Code.

Candidates will ideally show evidence of the above in their CV in order to be considered.

Please be advised that if you haven't heard from us within 48 hours, then unfortunately your application has not been successful on this occasion. We may, however, keep your details on file for any suitable future vacancies and contact you accordingly.

Pontoon is an employment consultancy and operates as an equal opportunities employer.

Please email me

Required skills

software reliability skills

