ABOUT THE ROLE
As our HPC AI Engineer, you will be a key expert supporting researchers in using our new supercomputer system for large-scale artificial intelligence. You will support and optimise massive AI application workloads, working with performance engineers to profile AI applications and establish best practices. Your work will directly enable national-scale projects in multimodal AI, healthcare, and AI for Science.

RESPONSIBILITIES
• Provide HPC and scientific domain advice to users of NSCC systems.
• Engage and collaborate with new researchers, communities, and disciplines with computationally intensive requirements.
• Support and optimise large-scale AI application workloads.
• Work with HPC performance engineers to profile and build performance models of AI applications and workflows.
• Design, develop and implement HPC software best practices for AI applications and workflows.
• Assist in the planning and design of future HPC systems, including benchmarking AI workloads on various platforms and recommending the most suitable architecture for the research community.
• Analyse system and user job data for efficient resource allocation and management.
• Develop HPC utilities, dashboards and automated testing tools for NSCC HPC systems.
• Develop HPC user and best practice guides for NSCC HPC systems.
• Stay up to date with developments in scientific domain research and in HPC system and software technology.

QUALIFICATIONS
• Bachelor’s degree in computer science, computer engineering, or another relevant field.
• Proven working knowledge of models and algorithms in at least one of the following areas: generative models, computer vision, graph neural networks, or AI for Science applications.
• Ideally, 3 years of experience developing code for AI training and inference.
• Experience in setting up AI software stacks and familiarity with a diverse range of them.
• Good knowledge of AI application performance optimisation and troubleshooting.
• Strong programming skills in Python; familiarity with C/C++ programming is a plus.
• Familiarity with the workings and use of AI frameworks (e.g. PyTorch, TensorFlow, JAX) for research.
• Familiarity with GPU architectures and programming is highly desired.
• Familiarity with the Linux environment, scripting languages, and profiler and debugger tools.
• Familiarity with HPC job schedulers and container technologies.
• Familiarity with object storage (S3); familiarity with HPC storage (Lustre) is a plus.
• Demonstrated team player with strong problem-solving skills.
• Demonstrated effective communication skills, including the ability to articulate technical concepts to a diverse range of audiences.
• Demonstrated ability and willingness to contribute novel ideas and approaches in support of the research community.
• Demonstrated passion for continuous learning and exploring new technologies or domains.