Job Description

Site Reliability Engineer



Date Posted:


Employment Type:



Christopher Stile

Recruiter Email:

Job ID:

JN -012019-13008

Job Description

  •  Hands-on experience tuning Linux (CentOS) performance, including identifying disk I/O, memory, CPU and network bottlenecks.
  •  Extensive experience with at least one scripting language other than Bash (for example Ruby, Perl or Python).
  •  Strong understanding of TCP/IP networking, including familiarity with concepts such as the OSI model.
  •  Ability to analyze network behaviour, performance and application issues using standard diagnostic tools.
  •  Hands-on experience automating server provisioning at large scale (using tools such as Kickstart or Foreman).
  •  Hands-on experience with configuration management of server farms (using tools such as MCollective, Puppet, Chef or Ansible).
  •  Hands-on experience with open-source monitoring and graphing solutions such as Nagios, Zabbix, Sensu and Graphite.
  •  Strong understanding of common Internet protocols and applications such as SMTP, DNS, HTTP, SSH and SNMP.
  •  Experience running server farms (200 or more physical servers) and the associated networking infrastructure in a production environment.
  •  Hands-on experience working with server hardware such as HP ProLiant, Dell PowerEdge or equivalent.
  •  Comfortable working on-call rotas and out of hours, as and when required, to meet the service's uptime requirements.