HPC / Linux / Cloud Engineer (GPU, Kubernetes, AWS/Azure) – 1 Year Renewable Contract

About the Role
We are seeking motivated HPC / Linux / Cloud Engineers (Junior to Lead level) to support and scale high-performance computing (HPC) environments powering advanced analytics, AI/ML, and GPU-intensive workloads. This role offers a rare opportunity to work across Linux infrastructure, GPU clusters, Kubernetes platforms, and cloud-native environments supporting enterprise-scale compute platforms. Candidates with strengths in Linux systems, cloud engineering, platform engineering, containers, or HPC environments are encouraged to apply.

Senior Engineer
• 5+ years of relevant experience

Key Responsibilities
• Manage and support Linux-based compute infrastructure and HPC clusters
• Administer CPU/GPU environments for AI/ML and high-performance workloads
• Monitor infrastructure performance, availability, and system health
• Troubleshoot Linux, storage, networking, and cluster-related issues
• Support and optimize workload schedulers such as PBS Professional or Slurm
• Deploy and manage containerized workloads using Docker and Kubernetes
• Support cloud-based HPC deployments across AWS, Azure, or GCP
• Assist AI/ML teams with GPU utilization, workload tuning, and performance optimization
• Automate infrastructure provisioning and configuration using Terraform, Ansible, or similar tools
• Maintain system documentation, SOPs, and operational procedures
• Collaborate with engineering, data science, and infrastructure teams to improve platform reliability and scalability
• Participate in incident response, root cause analysis, and system recovery activities

Required Skills & Experience
• Hands-on Linux administration experience (RHEL/CentOS/Ubuntu or similar)
• Experience supporting infrastructure, cloud, or platform environments
• Exposure to AWS, Azure, or Google Cloud Platform
• Familiarity with containers and orchestration technologies
• Basic scripting or automation experience (Bash, Python, Shell, etc.)
• Understanding of infrastructure monitoring and troubleshooting

Preferred / Good-to-Have Skills
• HPC or cluster computing environments
• Job scheduling platforms such as: PBS Professional, Slurm
• Kubernetes, Docker, or Singularity
• GPU computing technologies: NVIDIA CUDA, PyTorch, TensorFlow
• Infrastructure as Code: Terraform, CloudFormation, Ansible
• Monitoring and observability tools: Grafana, Prometheus, ELK Stack
• Experience supporting AI/ML workloads
• Exposure to hybrid cloud infrastructure
• Understanding of networking, storage, and security concepts in HPC environments

Ideal Candidate Profile
You are someone who:
• Enjoys solving infrastructure and performance challenges
• Has strong Linux troubleshooting skills
• Is interested in GPU, AI/ML, or large-scale compute systems
• Can work across cloud, infrastructure, and platform engineering domains
• Is eager to learn and grow within HPC and modern compute environments

We value depth in a few technical areas rather than expecting expertise across every technology listed.

Why Join
• Work on cutting-edge GPU and AI/ML platforms
• Gain exposure to both cloud and HPC technologies, a highly sought-after combination
• Opportunity to work on enterprise-scale compute environments
• Build expertise in modern platform engineering and high-performance infrastructure
• Grow into specialized HPC, Kubernetes, GPU, or cloud engineering roles
