Design, build, and maintain ETL/ELT data pipelines and data lake solutions to support analytics and AI/ML use cases. Ensure data quality, performance, and reliability across enterprise data platforms.
Key Responsibilities
• Pipeline Development
• Data Lake Engineering
• Performance & Optimization
• Collaboration & Support
Required Skills & Experience
• 4+ years of experience in data engineering or ETL development.
• Proficiency in SQL and Python (or Scala/Java) for data transformations.
• Hands-on experience with ETL tools (Informatica, Talend, dbt, SSIS, Glue, or similar).
• Exposure to big data technologies (Hadoop, Spark, Hive, Delta Lake).
• Familiarity with cloud data platforms (AWS Glue/Redshift, Azure Data Factory/Synapse, GCP Dataflow/BigQuery).
• Understanding of workflow orchestration (Airflow, Oozie, Prefect, or Temporal).
Preferred Knowledge
• Experience with real-time data pipelines using Kafka, Kinesis, or Pub/Sub.
• Basic understanding of data warehousing and dimensional modeling.
• Exposure to containerization and CI/CD pipelines for data engineering.
• Knowledge of data security practices (masking, encryption, RBAC).
Education & Certifications
• Bachelor's degree in Computer Science, IT, or a related field.
Preferred certifications:
o AWS Data Analytics – Specialty / Azure Data Engineer Associate / GCP Data Engineer.
o dbt or Informatica/Talend certifications.