About this role
Key Responsibilities
• Design, develop, and maintain data pipelines, ETL/ELT processes, and data integration workflows.
• Architect and optimize data lakes, data warehouses, and streaming platforms.
• Work with structured, semi-structured, and unstructured data at scale.
• Implement real-time and batch data processing solutions.
• Collaborate with data scientists, analysts, and business stakeholders to deliver high-quality data solutions.
• Ensure data security, lineage, governance, and compliance across platforms.
• Optimize queries, data models, and storage for performance and cost efficiency.
• Automate processes and adopt DevOps/DataOps practices for CI/CD in data engineering.
• Troubleshoot complex data-related issues and resolve production incidents.
• Mentor junior engineers and contribute to technical strategy and best practices.

Technical Skills (Must-Have Requirements)

Programming & Scripting
• Proficiency in Python, Scala, or Java for data engineering.
• Strong SQL skills (query optimization, tuning, advanced joins, window functions).

Big Data & Distributed Systems
• Expertise with Apache Spark, Hadoop, Hive, HBase, Flink, and Kafka.
• Hands-on experience with streaming frameworks (Kafka Streams, Spark Streaming, Flink).

Cloud & Data Platforms
• Deep knowledge of AWS (Redshift, Glue, EMR, Athena, S3, Kinesis), Azure (Synapse, Data Factory, Databricks, ADLS), or GCP (BigQuery, Dataflow, Pub/Sub, Dataproc).
• Experience with Snowflake, Databricks, or Teradata.

ETL/ELT & Orchestration
• Strong experience with Airflow, Luigi, Azkaban, or Prefect.
• Experience with ETL tools such as Informatica, Talend, or SSIS.

Data Modeling & Storage
• Experience with data lake, data warehouse, and lakehouse architectures.
• Knowledge of star schema, snowflake schema, and normalization/denormalization.

DevOps & Automation
• Proficiency in CI/CD (Jenkins, GitLab, Azure DevOps) for data pipelines.
• Experience with Docker, Kubernetes, Terraform, and Ansible for infrastructure automation.

Other Technical Skills
• Strong knowledge of data governance, MDM, data quality, and metadata management.
• Familiarity with graph databases (Neo4j) and time-series databases (InfluxDB, TimescaleDB).
• Understanding of machine learning data pipelines (feature engineering, model serving).

Qualifications
• Bachelor's/Master's degree in Computer Science, Data Engineering, or a related field.
• 7–10 years of experience in data engineering or big data development.
• At least 2–3 large-scale, end-to-end data platform implementations.
• Preferred certifications: AWS Certified Data Analytics – Specialty, Google Professional Data Engineer, Databricks Certified Data Engineer.