Naukrijobs UK

Head of Site Reliability £80k + 10% Bonus

Job Location: Leeds
Education: Not Mentioned
Salary: £70,000 - £80,000 per annum
Industry: Not Mentioned
Functional Area: Not Mentioned
Job Type: Permanent, full-time

Job Description

Our client is on the lookout for a Head of Site Reliability Engineering to join their team!

Although this role is remote, the right individual needs to be within an accessible distance of their Leeds offices; flexibility to attend the offices every so often will be essential.

In response to Coronavirus, their employees are currently 100% homeworking. Applicants must be in a position to work from home and have a secure workstation appropriate for home working for an extended period.

Our client provides 24/7/365 early-warning risk intelligence as a service for leading brands, global enterprises and social media platforms, providing customers with a real-time risk defense, with intelligence and compliance solutions guaranteed to ensure their customers are always the first to know and act.

These risks can take many forms, including activist attacks, hate speech, threats, fake news, false rumours, illegal content, compliance failures and many more. Fuelled by the increased popularity of closed social media groups and messaging apps, this harmful content can now spread virally, at scale, before it reaches mainstream media channels. Unfortunately, anyone has the power to create and share harmful content, especially instigators and influencers who distribute millions of new types every day.

Established in 2005 by a social media entrepreneur, the company began by protecting children and teenagers using online games and social networks from abuse, sexual exploitation, cyberbullying and other online harms. This relentless focus on helping to create a digital world that is safe for everyone has been their mission from day one. Today that passion extends to working with leading brands, global enterprises and social media platforms.

Our client currently protects over $4.5 trillion of aggregate market capitalisation across their current customer base. This demonstrates both the value and uniqueness of their service and the trust their customers place in them to protect against reputational risk.

This new role, Head of Site Reliability Engineering, is a great opportunity for someone who wants to take ownership of and responsibility for this growing area.

You will be responsible for the design and implementation of their runbook/strategy for their Cloud hosted platforms, including their approach to 24x7 monitoring, reporting and delivery support.

REQUIREMENTS

Defining, scoping, implementing and maintaining an appropriate digital / site reliability operation, including escalation, communication and service restoration processes and procedures, will be your day-to-day responsibility.

Working with your team, you will develop a range of measures to proactively monitor their platforms to ensure continuous and robust platform operations and adherence to predefined platform uptime SLAs. These measures would, for example, consider:

  • Overall platform health
  • API quota usage
  • Platform error curation
  • Anomaly detection

Developing and maintaining release and update processes, you will ensure monitoring, runbooks and release procedures are kept up to date. Regular platform resiliency testing will be your priority in order to ensure maximum platform uptime in the event of the loss of a data centre availability zone or nodes in the cluster.

Managing the risk register for the Digital Operations Centre will include:

  • Monitoring gaps and improvements, and formally reviewing these on at least a quarterly basis with the wider Digital Delivery team
  • Following-up on agreed risk mitigation and incident actions, ensuring that they are successfully implemented, and the risk assessment adjusted accordingly

Documenting and implementing the business' Incident Response Plan, including root cause investigation and adoption of any appropriate mitigating actions, will be one of your first projects. Following the implementation, you and your team will ensure that the plan is regularly tested, reviewed and updated as appropriate.

You will be responsible for managing a team. This will include developing their capabilities, measuring their effectiveness and implementing a set of KPIs to monitor performance.

Key Experience

  • Experience in the creation, implementation and ongoing management of a digital operations / site reliability function and team
  • Experience managing an office-based 24/7 Operations team
  • Experience managing complex proprietary software
  • Proficiency in SQL
  • Experience implementing monitoring software, such as Zabbix or Nagios
  • Experience of REST APIs for monitoring purposes
  • Ability to write and maintain monitoring scripts
  • Experience of working in a cloud technology environment, ideally Google Cloud
  • Able to manage incidents with a calm and confident approach

BENEFITS

  • 33 days holiday including Bank Holidays
  • Critical Illness
  • Life Insurance Cover
  • Healthcare Cash Plan
  • An attractive pension / 401k retirement plan scheme
  • Cycle to Work Scheme and Commuter Loans to help your journey to work
  • Employee perks with freebies, discounts, rewards and more
  • Free yoga classes twice a week
  • Subsidised gym membership

To be considered for this job, please apply via this advert.

Required skills

  • Cloud Computing
  • REST
  • SQL
  • Nagios
  • Zabbix

© 2019 Naukrijobs All Rights Reserved