As a Lead Data Engineer, you are the technical authority responsible for designing and scaling high-performance data architectures on AWS. You bridge traditional Big Data (ETL and data warehousing) and Agentic AI. Your primary mission is to build robust, event-driven pipelines that feed both analytical dashboards and autonomous AI agents built on Amazon Bedrock and AgentCore.
Key Responsibilities
- Architectural Leadership: Design end-to-end serverless data lakes and lakehouses using S3, Glue, and Redshift.
- AI Integration: Develop and maintain the data infrastructure for Agentic AI, including Knowledge Bases for Amazon Bedrock and tool-calling interfaces for AgentCore.
- Pipeline Engineering: Build scalable PySpark jobs and Lambda functions for real-time and batch data processing.
- Performance Tuning: Optimize SQL queries, Redshift distribution styles, and Glue DPU allocation to balance performance with cost.
- Governance & Security: Implement strict IAM policies, data encryption, and monitoring to ensure data privacy and compliance.
- Mentorship: Lead code reviews, define engineering standards, and mentor junior engineers in modern AI-driven data patterns.
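The event-driven pipeline pattern described above can be sketched as a minimal Lambda handler. This is an illustrative sketch, not a reference implementation: the `raw/` and `stream/` prefix convention and bucket names are hypothetical, and a production handler would hand off to Glue or Kinesis via boto3 rather than just returning the routed keys.

```python
import json
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Route S3 object-created events to the appropriate pipeline stage.

    Hypothetical convention: objects under 'raw/' are queued for a Glue
    batch job; objects under 'stream/' go to near-real-time processing.
    """
    routed = {"batch": [], "realtime": [], "ignored": []}
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 event keys are URL-encoded (e.g. spaces arrive as '+')
        key = unquote_plus(record["s3"]["object"]["key"])
        uri = f"s3://{bucket}/{key}"
        if key.startswith("raw/"):
            routed["batch"].append(uri)
        elif key.startswith("stream/"):
            routed["realtime"].append(uri)
        else:
            routed["ignored"].append(uri)
    # In production this would call boto3 here, e.g. glue.start_job_run(...)
    return {"statusCode": 200, "body": json.dumps(routed)}
```

Keeping the routing logic free of boto3 calls makes the handler unit-testable without AWS credentials.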
Mandatory Skills
- Core AWS: Mastery of S3, Glue (ETL and Data Catalog), Redshift, and Lambda.
- Languages: Expert-level Python, PySpark, and advanced SQL.
- Agentic AI: Hands-on experience with Amazon Bedrock (foundation models, RAG) and AI orchestration frameworks such as AgentCore.
- Data Modeling: Deep understanding of star/snowflake schemas and data lakehouse design patterns.
- DevOps: Experience with CI/CD and infrastructure as code (Terraform or AWS CDK).
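As a concrete example of the Bedrock RAG skill above, here is a minimal sketch of querying a Knowledge Base through the `bedrock-agent-runtime` `retrieve` API. The knowledge-base ID and query text are placeholders; the request is assembled as a plain dict so it can be inspected and tested before any AWS call is made.

```python
def build_retrieve_request(kb_id: str, query: str, top_k: int = 5) -> dict:
    """Assemble keyword arguments for bedrock-agent-runtime's retrieve().

    Separating request construction from the boto3 call keeps this
    logic unit-testable without AWS credentials.
    """
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }


# Usage (requires AWS credentials; the KB id below is a placeholder):
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve(**build_retrieve_request("KB123EXAMPLE", "refund policy"))
# chunks = [r["content"]["text"] for r in response["retrievalResults"]]
```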