Role Overview
We’re hiring a Senior Data Scientist who can own end-to-end problem solving, from
business discovery and hypothesis design to model deployment and post-production
monitoring. You will partner with product, engineering, and client stakeholders to build
production-grade ML/AI and GenAI solutions on AWS/Azure/GCP and mentor a small
pod (2-5) of data scientists/ML engineers.
Key Responsibilities
• Business & Problem Framing: Engage with client stakeholders to translate
objectives into measurable DS/ML use cases, define success metrics (ROI,
adoption, accuracy, latency), and create experiment plans.
• Data Strategy & Feature Engineering: Own data acquisition, quality checks, EDA,
and feature pipelines across SQL/Spark/Databricks; collaborate with Data
Engineering for robust ingestion and transformation (Airflow/dbt).
• Modeling: Build, tune, and compare models for supervised/unsupervised
learning, time-series forecasting, NLP/CV, and GenAI (RAG, fine-tuning,
prompt engineering) using Python (pandas, NumPy, scikit-learn,
XGBoost/LightGBM), PyTorch/TensorFlow, and Hugging Face.
• MLOps & Deployment: Productionize via MLflow/DVC, model registry, CI/CD
(GitHub/GitLab), containers (Docker/Kubernetes), and cloud ML platforms
(SageMaker/Azure ML/Vertex AI). Expose services via FastAPI/Flask; implement
monitoring for drift, data quality, and model performance.
• Experimentation & Causality: Design and analyze A/B tests, and apply causal
inference techniques (e.g., propensity scoring, DiD) to measure true impact.
• Explainability, Fairness & Compliance: Apply model cards, SHAP/LIME, bias
checks, PII handling, and anonymization/pseudonymization, and align with
applicable data privacy regulations (e.g., GDPR/DPDP).
• Visualization & Storytelling: Build insights dashboards (Tableau/Power
BI/Plotly) and communicate recommendations to senior business and technical
stakeholders.
• Collaboration & Leadership: Mentor juniors, conduct code and research
reviews, contribute to standards, and support solutioning during
pre-sales/POCs.
Required Skills & Experience
• Experience: 7-10 years overall, with 5+ years in applied ML/Data Science
delivering models to production for enterprise clients.
• Programming & Data: Expert Python, advanced SQL, and hands-on experience with
Spark/Databricks. Strong software practices (testing, typing, packaging).
• ML/AI Stack: scikit-learn, XGBoost/LightGBM; PyTorch or TensorFlow; NLP
(spaCy, Transformers, embeddings), vector DBs (FAISS/Pinecone),
LangChain/LlamaIndex for RAG.
• Cloud & MLOps: Real-world deployments on AWS/Azure/GCP using
SageMaker/Azure ML/Vertex AI; MLflow, model registry, feature store,
Docker/K8s, and CI/CD.
• Experimentation & Analytics: A/B testing, Bayesian/frequentist methods, causal
inference, statistical rigor.
• Visualization & Communication: Storytelling with data; Tableau/Power BI/Plotly,
executive-ready presentations.
• Domain Exposure (nice-to-have): BFSI risk/collections/CLV, retail
demand/personalization, healthcare claims/clinical NLP, manufacturing
quality/predictive maintenance.
• Bonus: Recommenders, time-series, graph ML, optimization (OR),
reinforcement learning, geospatial analytics.
Education & Certifications
• Bachelor’s/Master’s in Computer Science, Data Science, Statistics, Applied Math,
or related field.
• Preferred certifications: AWS/Azure/GCP ML, Databricks, TensorFlow or
PyTorch.