KNOWLEDGESG GLOBAL PTE. LTD. is hiring for a Data Engineer internship — a 12-month, on-site Data Science role based in Singapore. It is an unpaid internship. It is open to university students, typically in Year 2–4. Applicants with experience in PySpark, Design, Apache Spark, Data Modelling, and ETL Tools are a strong fit.
About this role
Key Responsibilities

Data Engineering & Integration
• Design, build, and optimize ETL/ELT pipelines using Apache Spark, PySpark, Databricks, Azure Synapse, or equivalent platforms.
• Develop scalable batch and real-time data processing solutions.
• Integrate data from Core Banking, Payments, Treasury, Trade Finance, CRM, Compliance, and Risk systems.
• Develop and maintain enterprise data models including 3NF, Dimensional Modeling, and Data Vault 2.0.

Streaming & Modern Data Platforms
• Build and operationalize real-time streaming pipelines using Kafka, Confluent, or Azure Event Hubs.
• Support data platform modernization initiatives, including migration from legacy platforms (e.g., Teradata, DB2) to cloud-native environments such as Snowflake, Databricks, or Azure Synapse.
• Implement scalable cloud-based data lake and data warehouse architectures.

Data Quality & Governance
• Implement data quality, validation, lineage, and observability frameworks using tools such as Great Expectations, Deequ, or dbt.
• Collaborate with Governance and Security teams to ensure compliance with enterprise data standards.
• Support metadata management, cataloging, and lineage initiatives using Azure Purview, Apache Atlas, or Collibra.

Regulatory & Compliance Support
• Support regulatory reporting and risk data flows including MAS 610, MAS 649, Basel III / Basel IV, IFRS 9 / IFRS 17, and BCBS 239.
• Ensure data security controls including encryption, tokenization, masking, RBAC, and audit logging are implemented.

DevOps & MLOps
• Develop CI/CD pipelines using Azure DevOps, GitHub Actions, or Terraform.
• Collaborate with Data Scientists and AI teams to deploy ML feature stores and model-serving pipelines.
• Support automation and Infrastructure-as-Code (IaC) initiatives.
Required Technical Skills

Programming Languages
• Python
• PySpark
• SQL
• Scala

Data Platforms
• Azure Data Lake
• Azure Synapse Analytics
• Databricks
• Snowflake

Data Orchestration
• Apache Airflow
• Azure Data Factory (ADF)
• dbt

Streaming Technologies
• Apache Kafka
• Confluent Platform
• Azure Event Hubs

Data Governance
• Azure Purview
• Apache Atlas
• Collibra

Security & Compliance
• Encryption
• Tokenization
• Role-Based Access Control (RBAC)
• Audit Logging

DevOps & Infrastructure
• Terraform
• Azure DevOps
• GitHub Actions

Qualifications
• Bachelor's or Master's Degree in Computer Science, Data Engineering, Information Technology, or related discipline.
• 6–10 years of Data Engineering experience.
• Minimum 3 years of experience within Banking, Financial Services, Insurance, or Capital Markets environments.
• Strong experience designing and implementing cloud-based data platforms on Azure and/or AWS.
• Hands-on experience with batch and real-time data processing frameworks.
• Understanding of regulatory reporting and risk data management frameworks.