Consultant - Platform Site Reliability Engineer

Consultant - Infrastructure Senior Engineer - SRE

  • Build and maintain the deployment architecture to meet the development and maintenance requirements of systems/platforms.
  • Work with development teams to evaluate the health, stability, and reliability of applications.
  • Identify and act on opportunities to take the deployment architecture to the next level in reliability, cost-effectiveness, and ease of use.
  • Empower engineers to build, test, deploy, and monitor services by themselves.
  • Define and implement solutions that eliminate repeating escalations.
  • Toil Reduction: Research and analyse trends and behavioural data to identify opportunities for improvements and new initiatives. Use monitoring, alerts, dashboards, and management tools to ensure the availability, reliability, and performance of applications and services.
  • Continuously improve and implement automation of application tasks.
  • Error Budgeting: Provide technical support for systems/platforms according to application SLAs.
  • Design and develop resiliency in the application code, troubleshoot incidents, engage with squads to address failure patterns, and participate in incident management.
  • Site Reliability Engineering: Knowledge of the theories and methodologies of reliability engineering; ability to design, develop and support various tools, services and applications to maintain a reliable application environment.
  • Performance Measurement and Tuning: Knowledge of system performance, testing and programming; ability to monitor, measure, and optimize system performance and network communication.
  • CI/CD Pipeline: Knowledge of the concepts, values, and tools applied in building Continuous Integration (CI), Continuous Delivery, and Continuous Deployment (CD) pipelines; ability to design, build, implement, and maintain CI/CD pipelines to automate the software delivery process (AWS, Azure, Git).
  • IT Release Management: Knowledge of strategies, practices, and tools for managing versions and distribution of software products and enhancements; ability to evaluate and improve release management practices and tools; hands-on expertise with Python, Ansible, and shell scripting.
  • Agile Development: Knowledge of agile methodologies and the agile development lifecycle; ability to utilize formal agile methodologies, disciplines, practices and techniques for the delivery of new and enhanced applications.
  • Containers: Knowledge of the concepts, functions, and capabilities of container tools and techniques; ability to effectively apply containers in various IT business environments.
  • Good knowledge of, and considerable hands-on experience with, HP-UX and Red Hat Linux OS administration, web tier, databases (Oracle/SQL), BCP, DR services, and associated technologies.
  • Cloud Platform: Knowledge of the products and services regarding cloud platforms; ability to utilize related tools and technologies to develop cloud solutions and deploy applications on cloud platforms. 
  • Understanding of Quality Assurance processes and assessment techniques for operations and infrastructure.

      The following infrastructure knowledge is essential to perform the role in the capacity of Lead:

  • Red Hat Enterprise Linux and HP-UX System Administration
  • Windows Systems Administration
  • Virtualisation technologies
  • Scripting technologies
  • IT System Performance Monitoring/Tuning and Capacity management
  • Server hardware support
  • Enterprise Storage Solutions
  • Storage replication technologies
  • SAN technologies
  • System management and monitoring tools
  • Exposure to Oracle middleware products
  • A high-level understanding of Oracle Database technologies
  • Some knowledge of SQL databases
  • Exposure to Micro Focus COBOL and C/C++
  • IT Business Continuity and Disaster Recovery service provision
  • TCP/IP networking and troubleshooting
  • Hyperconverged technologies
  • Software defined architectures
  • DevOps working practices
  • Containerised workloads

Mandatory Skills:

  • 12 or more years of experience as an application developer or working with SRE concepts
  • 6 or more years of experience with ops automation using a scripting language such as Shell/PowerShell, Python, or Ansible.

  • Proven experience leading a small team of SRE engineers or an automation squad (4-6 members).
  • Able to produce SLA/SLI/SLO and observability-related reports and act as the SRE focal point for the team.
  • SRE Foundation / Practitioner certification preferred.

 
