About this role
- Design reusable patterns for Agentic AI systems including RAG, Multi-Agent Orchestration, and Human-in-the-loop systems
- Define how different agents communicate, share state, and hand off tasks to one another
- Architect long-term and episodic memory layers using Vector Databases, embedding pipelines, and knowledge graphs
- Decide when to use high-reasoning models vs. worker models to optimise cost and performance
- Predict and control token usage; architect systems with semantic caching to prevent redundant LLM spend
- Set architectural standards for explainability, auditability, and guardrails to prevent hallucinations and bias
- Ensure data governance, privacy compliance, and responsible AI practices across all systems

AI Infrastructure & MLOps
- Design scalable AI infrastructure including model serving, inference architecture, AI microservices, and APIs
- Architect distributed systems supporting AI workloads
- Define MLOps and CI/CD pipelines for AI systems
- Architect containerised and cloud-native deployments; design monitoring and observability for AI services
- Optimise for cost, performance, and scalability across the AI stack

Enterprise AI & Agentic Architecture
- Architect enterprise-scale Agentic AI frameworks using LangGraph, Model Context Protocol (MCP), multi-agent orchestration frameworks, and memory-driven AI systems
- Design and implement RAG pipelines (Hybrid RAG, Graph-RAG), embeddings pipelines (open-source and enterprise models), prompt orchestration, guardrails, and fine-tuning pipelines (PEFT, LoRA, domain adaptation)
- Build secure LLM deployments across on-prem, air-gapped, and cloud-agnostic environments
- Define LLMOps lifecycle covering evaluation harness, hallucination detection, observability (tracing, telemetry), and model governance
- Hands-on experience with agentic AI frameworks: LangChain, LlamaIndex, AutoGen, CrewAI

Data Platform & Lakehouse Engineering
- Design and govern modern data platforms built on Medallion (Bronze-Silver-Gold) architecture with Delta tables and ACID transactional layers
- Architect multi-tenant platforms with cost governance and data mesh or federated data architecture patterns
- Work across the core stack: Databricks, Apache Spark (batch & streaming), Delta Live Tables, Apache Druid, Dremio, Kubeflow Pipelines, Airflow
- Drive schema evolution and versioning, metadata and lineage management, data quality frameworks, dimensional modelling for analytics, and Kafka-based streaming ingestion

Advanced AI/ML & Deep Learning
- Architect ML systems using TensorFlow, PyTorch, Scikit-Learn, XGBoost, LSTM, CNN, Transformer models, and Vision-Language Models (VLMs)
- Design time-series forecasting and anomaly detection solutions for industrial telemetry

Cloud, Infrastructure & DevOps
- Cloud-native AI architecture on Azure and AWS
- Containerisation using Docker and Kubernetes (Helm, Operators)
- Infrastructure as Code using Terraform
- CI/CD for ML pipelines with secure DevSecOps integration
- Hybrid and on-prem deployments under compliance constraints

Databases, Graph & Vector Systems
- RDBMS: PostgreSQL; NoSQL: MongoDB
- Graph Databases: Neo4j for ontology and knowledge graph modelling
- Vector Databases: Pinecone, FAISS, Milvus, and enterprise vector DB solutions
- Context modelling and semantic search frameworks

Required Experience
- 10+ years in Data, AI, and Platform Engineering
- 5+ years in an AI Architecture leadership role
- Proven delivery of enterprise-scale AI platforms in production environments
- Experience in industrial or engineering AI ecosystems
- Strong background in distributed systems and scalable data processing