DIGIWORLD TECHNOLOGIES PTE. LTD. is hiring for a Machine Learning Engineer internship: a 12-month, on-site Data Science role based in UBI CRESCENT, Singapore. The internship is unpaid and open to university students, typically in Years 2–4. Applicants with experience in TensorFlow, Apache Spark, scientific software, Scala, and scikit-learn are a strong fit.
About this role
Job Objective: Design and deliver scalable real-time data and machine learning solutions by building robust ingestion and transformation frameworks across the Hadoop ecosystem. Enable end-to-end ML model operationalization and performance optimization, while supporting multimodal data processing and the development of engineering tools and applications.
Responsibilities:
• Design and develop highly scalable, real-time systems using Hadoop ecosystem components (Iceberg, Spark, Ozone, Trino, Hive, Ranger, Kafka, Flink, and NiFi).
• Build robust data ingestion and transformation frameworks using Java, Spark, Python, and shell scripting to ingest multimodal data (image, audio, video, unstructured documents) in both batch and real-time modes.
• Develop full-stack applications and internal engineering tools using Python, shell scripting, and modern web frameworks (e.g., Flask, React).
• Collaborate closely with data scientists to operationalize machine learning models using Cloudera Machine Learning (CML).
• Tune and optimize the performance of data applications on Hadoop to ensure optimal resource utilization.
Requirements:
• Experience working with ML platforms such as CML, Spark MLlib, and Python ML libraries (scikit-learn, XGBoost), including model deployment.