About this role
Responsibilities
• Design, develop, and implement advanced SRE tooling and automation solutions using Python and Java to improve system reliability and operational efficiency across the infrastructure lifecycle
• Lead the adoption and continuous improvement of CI/CD pipelines by applying DevOps best practices and tools to enable rapid, secure, and reliable software delivery and infrastructure provisioning
• Proactively monitor, troubleshoot, and optimize system performance and reliability by identifying and resolving complex incidents and implementing preventative automation and root cause analysis
• Collaborate with development and operations teams to embed SRE principles, define and maintain Service Level Objectives (SLOs), and ensure scalable, resilient system architecture and disaster recovery readiness
• Provide secondary support and technical expertise for messaging and middleware platforms, including Kafka and MQ, by assisting with administration, configuration, performance tuning, and incident resolution
• Conduct architectural reviews to identify infrastructure gaps, remediate network vulnerabilities, and advise application teams on operational excellence, security, and compliance best practices

Required competencies and certifications
• Demonstrated expertise in Python and Java for automation, scripting, and SRE tool development
• Hands-on experience in Linux system administration
• Proven experience with CI/CD practices and DevOps toolchains such as Jenkins and GitLab CI
• Proficiency in version control systems (GIT) and agile project management tools (Jira)
• Strong understanding and application of SRE principles including SLOs, error budgets, monitoring, alerting, and incident management
• Foundational to intermediate knowledge and practical experience with Kafka and MQ messaging and middleware technologies, including administration and troubleshooting
• Exceptional written and verbal communication skills with a solid understanding of ITIL processes
• Minimum of 3 to 6 years of progressive experience in DevOps, SRE, or technical infrastructure roles, preferably supporting mission-critical banking systems