RAPSYS TECHNOLOGIES PTE. LTD. is hiring for a Senior Data Engineer internship: a 12-month, on-site Government Policy role based in Singapore. The internship is unpaid and open to university students, typically in Year 2–4. Applicants with experience in Terraform, Airflow, Azure, and data pipelines are a strong fit.
About this role
Responsibility

Data Pipeline Development and Architecture
a) Translate data requirements from business users into technical specifications.
b) Build out data products as part of a data team.
c) Architect and build ingestion pipelines to collect, clean, merge, and harmonise data from different source systems.
d) Construct, test, and update useful and reusable data models based on data needs of end users.
e) Design and build secure mechanisms for end users and systems to access data in the data warehouse.
f) Research, propose, and develop new technologies and processes to improve agency data infrastructure.

Data Quality, Testing and Validation
a) Implement data quality checks and validation processes to ensure data accuracy and consistency across all pipeline stages.
b) Design and execute test cases for data pipelines, covering functional, regression, and exploratory testing to validate data transformations and business logic.
c) Develop and maintain automated test coverage for data pipelines to prevent bugs and defects from reaching production.
d) Perform data reconciliation and integrity testing to validate end-to-end data flows across source and target systems.
e) Support User Acceptance Testing (UAT) by coordinating with stakeholders to validate that data products and pipelines meet business requirements.

Operations, Governance and Continuous Improvement
a) Conduct day-to-day monitoring of databases and ETL systems, including capacity planning, maintenance, performance tuning, and issue diagnosis.
b) Collaborate with partner agency IT teams on technology stack, infrastructure, and security alignment.
c) Collaborate with data stewards to establish and enforce data governance policies, best practices, and procedures.
d) Maintain a data catalogue to document data assets, metadata, and lineage.
e) Implement and enforce data security best practices, including access control, encryption, and data masking, to safeguard sensitive data.
f) Drive continuous improvement in CI/CD and Site Reliability Engineering (SRE) practices.

Experience and Skills Needed

Qualifications and Experience
a) Bachelor's Degree, preferably in Computer Science, Software Engineering, Information Technology, or related disciplines.
b) Deep understanding of system design, data structures and algorithms, data modelling, data access, and data storage.
c) Experience in designing, building, and maintaining batch and real-time data pipelines.
d) Experience with orchestration frameworks such as Airflow and Azure Data Factory.
e) Proficiency in working with Python, Shell Scripts, and SQL.
f) Demonstrated ability in using cloud technologies such as AWS, Azure, and Google Cloud.
g) Experience with Databricks.

Data Quality and Testing
a) Experience designing and executing data quality checks, validation processes, and test cases for data pipelines.
b) Familiarity with test automation approaches for data pipelines, including integration with CI/CD pipelines.
c) Experience with defect tracking and test management tools such as JIRA or equivalent.

DevOps and CI/CD Practices
a) Familiarity with building and using CI/CD pipelines.
b) Familiarity with DevOps tools such as Docker, Git, and Terraform.
c) Experience implementing technical processes to enforce data security, data quality, and data governance.

Domain Knowledge (Added Advantage)
a) Familiarity with government systems and policies relating to data governance, data management, data infrastructure, and data security.
b) Experience in Climate and Weather domains will be an advantage.