Experience and Skills:
● 3–8 years of relevant experience
● Expert-level user of Python and Presto SQL
● Working experience in the Hadoop ecosystem, Hive, and Kubernetes
● Experience with machine learning and statistical libraries and frameworks such as PySpark
Roles & Responsibilities:
● Data Engineering and Technical Delivery
● Prepare data for analysis using Presto SQL or a domain-specific tool (e.g., Omniture for Digital), visualize the data, and execute to specifications
● Web scraping with Python to gather basic datasets from popular websites (e.g., LinkedIn) as required; parsing JSON objects to get the data into tabular format
● Good knowledge of databases/SQL and relevant tools such as R or Python; Omniture (if Digital)
● Experience with frameworks such as PySpark for handling large data
● Show drive to increase the breadth and depth of tools and systems: creating data schemas, building pipelines, collecting data, and moving it into storage
● Prepare the data as part of ETL or ELT processes
● Stitch the data together with scripting languages, often working with DBAs to construct datastores or data models
● Ensure data is ready to use, employing frameworks and microservices to serve up the data
● Design, build, and optimize application containerization and orchestration with Docker and Kubernetes
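The JSON-parsing, ETL, and datastore responsibilities above can be sketched in plain Python. This is a minimal illustration only: the field names, sample payload, and in-memory SQLite target are hypothetical assumptions, not part of the role description.

```python
import json
import sqlite3

# Illustrative JSON payload, e.g. as returned by a scraped page
# (field names here are hypothetical examples)
raw = (
    '[{"name": "Acme Corp", "industry": "Retail", "employees": 1200},'
    ' {"name": "Globex", "industry": "Energy", "employees": 800}]'
)

# Parse the JSON objects into a tabular (list-of-rows) format
records = json.loads(raw)
rows = [(r["name"], r["industry"], r["employees"]) for r in records]

# Load step of a minimal ETL: move the rows into a queryable datastore
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT, industry TEXT, employees INTEGER)")
conn.executemany("INSERT INTO companies VALUES (?, ?, ?)", rows)

# The data is now ready for SQL-based analysis
total = conn.execute("SELECT SUM(employees) FROM companies").fetchone()[0]
print(total)  # 2000
```

At larger scale the same parse-then-load shape would typically move to PySpark DataFrames rather than a local SQLite file.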
● Stakeholder Engagement
● Grasp requirements on a call and deliver to specification; present to senior management and leadership
● Present findings to team leads/managers and to external stakeholders
● Drive stakeholder engagements by leading complex analytical projects, including bottom-up projects
● Develop executive presentations with guidance