About this role
Responsibilities
• Manage the full OS lifecycle including installation, configuration, patching, and upgrades across HPC environments
• Monitor system performance across compute nodes, login nodes, and HCI infrastructure, ensuring reliability and uptime
• Perform troubleshooting and root cause analysis for system-level issues and incidents
• Support configuration management and automation using scripting and relevant tools
• Work with internal and external teams to resolve incidents and maintain service continuity
• Participate in on-call rotation to support critical escalations
• Ensure adherence to operational standards, documentation practices, and ITIL processes

Requirements
• At least 3 years of experience in HPC, system administration, or infrastructure operations
• Strong hands-on experience with Linux (preferably Red Hat Enterprise Linux)
• Familiar with cluster management, job scheduling, and monitoring tools
• Proficiency in scripting (e.g. Bash, Python) and configuration management tools (e.g. Ansible, Puppet, Chef)

Tyson Jay Management Pte Ltd | EA License No.: 24C2479
Ivan Lim | EA Personnel No.: R1109856