Experience and Skills:
● 3-8 years of relevant experience
● Expert user of Python & Presto SQL
● Working experience with the Hadoop ecosystem, Hive, and Kubernetes
● Experience using machine learning or statistical libraries and frameworks such as PySpark
Roles & Responsibilities:
● Data engineering and technical delivery
● Prepare data for analysis using Presto SQL or a domain-specific tool (for example, Omniture for digital), visualize the data, and execute to specifications
● Scrape the web with Python to collect basic datasets from popular websites (e.g., LinkedIn) as required, parsing JSON objects into tabular format (see the first sketch after this list)
● Good knowledge of databases/SQL, relevant tools such as R or Python, and Omniture (if digital)
● Experience with frameworks such as PySpark to handle large datasets (see the PySpark sketch after this list)
● Show drive to increase the breadth & depth of tools and systems: creating data schemas, building pipelines, collecting data, and moving it into storage
● Prepare data as part of ETL or ELT processes (a minimal ETL sketch follows this list)
● Stitch data together with scripting languages, often working with DBAs to construct datastores or data models
● Ensure data is available and ready to use, and use frameworks and microservices to serve it (see the serving sketch after this list)
● Design, build, and optimize application containerization and orchestration with Docker and Kubernetes
● Stakeholder engagement
● Grasp requirements on calls and deliver to specification; present to senior management & leadership
● Present findings to team leads/managers and to external stakeholders
● Drive stakeholder engagement by leading complex analytical projects, including bottom-up projects
● Develop executive presentations with guidance
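
Illustrative sketches (Python):

First, a minimal sketch of the web-scraping and JSON-parsing task above. The endpoint URL and response shape are assumptions; real sites such as LinkedIn gate data behind authenticated, terms-compliant APIs, so this only shows the flattening step.

import requests
import pandas as pd

# Hypothetical endpoint returning a list of JSON objects.
URL = "https://example.com/api/postings"

def fetch_table(url: str) -> pd.DataFrame:
    """Fetch a JSON payload and flatten it into a tabular DataFrame."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    records = response.json()          # assumed: a list of JSON objects
    return pd.json_normalize(records)  # nested keys become dotted columns

if __name__ == "__main__":
    print(fetch_table(URL).head())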
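
Next, a sketch of handling large data with PySpark, assuming a working Spark installation and a hypothetical events.csv with event_date and user_id columns.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("large-data-example").getOrCreate()

# Read a (hypothetically large) CSV; Spark partitions the work.
events = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("events.csv")
)

# Aggregate in parallel across executors instead of in local memory.
daily_counts = (
    events
    .groupBy("event_date")
    .agg(F.count("*").alias("events"),
         F.countDistinct("user_id").alias("users"))
    .orderBy("event_date")
)

daily_counts.show()
spark.stop()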
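
A minimal ETL sketch of the stitching and loading responsibilities: the file names, column names, and SQLite target are assumptions chosen for brevity, standing in for whatever datastore a DBA would provision.

import sqlite3
import pandas as pd

# Extract: two hypothetical exported sources to be stitched together.
users = pd.read_csv("users.csv")    # assumed columns: user_id, country
orders = pd.read_csv("orders.csv")  # assumed columns: order_id, user_id, amount

# Transform: join the sources and derive per-user revenue.
stitched = orders.merge(users, on="user_id", how="left")
revenue = (
    stitched.groupby(["user_id", "country"], as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_revenue"})
)

# Load: write the modeled table into the datastore.
with sqlite3.connect("warehouse.db") as conn:
    revenue.to_sql("user_revenue", conn, if_exists="replace", index=False)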
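
Finally, a sketch of serving prepared data through a microservice. Flask is one assumed choice of framework, and the user_revenue table comes from the hypothetical ETL sketch above.

import sqlite3
from flask import Flask, jsonify

app = Flask(__name__)
DB_PATH = "warehouse.db"  # hypothetical datastore from the ETL sketch

@app.route("/user-revenue/<int:user_id>")
def user_revenue(user_id: int):
    """Serve one user's modeled revenue row as JSON."""
    with sqlite3.connect(DB_PATH) as conn:
        row = conn.execute(
            "SELECT user_id, country, total_revenue "
            "FROM user_revenue WHERE user_id = ?",
            (user_id,),
        ).fetchone()
    if row is None:
        return jsonify({"error": "user not found"}), 404
    return jsonify({"user_id": row[0], "country": row[1],
                    "total_revenue": row[2]})

if __name__ == "__main__":
    app.run(port=8000)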