GECO ASIA PTE. LTD. is hiring for a Machine Learning Engineer internship — a 12-month, on-site Software Engineering role based in Singapore. It is an unpaid internship. It is open to university students, typically in Year 2–4. Applicants with experience in Cloudera, Machine Learning, PySpark, Big Data Framework, and Hadoop Database are a strong fit.
About this role
Job Summary
Role: Machine Learning Engineer
Start: ASAP
Duration: 12 Months
Location: Singapore

We are seeking an experienced Machine Learning Engineer with 10+ years of overall experience and at least 6 years of relevant experience in Machine Learning Engineering, Data Engineering, and Big Data platforms. The ideal candidate will be responsible for designing, developing, and optimizing large-scale data and machine learning systems that support real-time and batch processing workloads across enterprise environments.

This role requires strong expertise in the Hadoop ecosystem, distributed data processing, machine learning operationalization, and scalable data platform development. The successful candidate will work closely with Data Scientists, Data Engineers, and Platform Teams to build robust data ingestion frameworks, operationalize machine learning models, and deliver high-performance AI-driven solutions capable of processing structured, unstructured, and multi-modal data.

Requirements:
- Bachelor's degree in Computer Science, Information Technology, Data Science, Engineering, or a related field.
- Minimum 10 years of overall software, data engineering, or platform engineering experience.
- At least 6 years of hands-on experience in Machine Learning Engineering, Big Data Engineering, or Data Platform Development.
- Strong expertise in Hadoop ecosystem technologies (Apache Spark, Apache Hive, Apache Kafka, Apache Flink, Apache NiFi, Apache Iceberg, etc.).
- Strong programming experience in Java and/or Python.
- Extensive experience developing data ingestion, transformation, and processing frameworks.
- Hands-on experience with batch and real-time data processing architectures.
- Strong understanding of distributed computing, scalable data platforms, and high-volume data processing systems.
- Experience operationalizing machine learning models using platforms such as Cloudera Machine Learning (CML), Spark MLlib, or equivalent ML platforms.
- Experience with machine learning libraries such as scikit-learn, XGBoost, and related Python-based ML frameworks.
- Strong proficiency in shell scripting and automation.
- Experience performing performance tuning and optimization of Hadoop-based applications and clusters.
- Strong problem-solving, analytical, and troubleshooting skills.
- Experience working within Agile development environments.

Roles and Responsibilities:
- Design, develop, and maintain highly scalable real-time and batch processing systems using Hadoop ecosystem technologies including Spark, Kafka, Flink, Hive, Iceberg, Trino, NiFi, Ranger, and Ozone.
- Build and enhance robust data ingestion, transformation, and processing frameworks capable of handling structured, semi-structured, and multi-modal data sources including image, audio, video, and unstructured documents.
- Develop scalable data pipelines using Java, Python, Spark, and shell scripting to support enterprise data and AI workloads.
- Collaborate closely with Data Scientists to deploy, operationalize, monitor, and maintain machine learning models in production environments.
- Utilize Cloudera Machine Learning (CML) and related ML platforms to support model lifecycle management and deployment.
- Design and implement data engineering solutions that support both real-time streaming and large-scale batch processing requirements.
- Develop internal engineering tools, automation solutions, and full-stack applications using Python and modern web frameworks such as Flask and React.
- Perform performance tuning, optimization, and troubleshooting of Hadoop-based applications and distributed data processing systems.
- Monitor system performance, resource utilization, and platform stability while implementing improvements to maximize efficiency and scalability.
- Ensure adherence to data governance, security, and access control standards using tools such as Apache Ranger.
- Participate in architecture reviews, technical design discussions, and platform modernization initiatives.
- Support CI/CD implementation, deployment automation, and operational excellence practices for machine learning and data engineering workloads.
- Stay current with emerging technologies, machine learning frameworks, and big data platform innovations.

Please send your application highlighting:
- Your relevant experience
- Current/expected salary
- Availability information
- Your latest resume in MS Word format

**We regret that only short-listed applicants will be contacted.**

GECO Asia values the data privacy rights of our customers, associates, partners and prospective applicants. We have a privacy policy in place that governs our collection and use of personal data. In line with Singapore's PDPA, we have updated our Privacy Policy and Terms of Use to better clarify our collection and use of your personal information. The updated policy is available at https://www.geco.asia/about/privacy-policy

Note: GECO Asia is an Information Technology Consulting Services provider. We provide specialist IT and Digital Transformation resources on a project (SOW) and/or permanent basis. We operate under a Comprehensive License offered by the Ministry of Manpower, Singapore.

[GECO Asia Pte Ltd, License No. 07C4453]
[2 Venture Drive, #10-18 Vision Exchange, Singapore 608526]