Data Scientist – Mount Sinai Health System
Jul 2024 – Present • New York, NY
- Owned an end-to-end symptom extraction pipeline (Naive Bayes + multi-label fine-tuned Clinical BERT), deployed with FastAPI on Kubernetes; made EHR symptom entry 8x faster across 1,000+ processed notes.
- Implemented an LLM chat agent using Ollama and MCP that queries PostgreSQL and data folders; 6 physicians use it daily, cutting chart lookup time by 65%.
- Set up real-time Grafana dashboards inside a Linux container using Docker Compose, monitoring servers and infra via Prometheus and InfluxDB; gave 24/7 visibility and cut troubleshooting time.
- Architected an internal Kubernetes platform running ML models, Label Studio, MLflow, and doctor dashboards; reduced new app deployment time from weeks to under one day.
- Built a dual-stream LSTM fusing airflow features, boosting 3-way breathing pattern detection F1 from 0.85 to 0.91.
Data Science Intern – LOCOMeX
Jan 2024 – May 2024 • New York, NY
- Designed a two-tower RFP-to-company recommendation system (bi-encoder with FC layers + BERT embeddings) that was rated ‘useful’ by 75% of initial customers.
- Deployed a metadata extraction service on AWS Lambda; slashed document parsing time by 80%.
- Migrated Python scripts to Airflow ETL; saved analysts 7 hours every week.
Research Assistant – NYU Stern
Sep 2023 – Dec 2023 • New York, NY
- Fine-tuned Llama-2-13B to achieve target structure in 90% of generations; added Retrieval-Augmented Generation for zoning docs.
- Applied prompt engineering and few-shot learning to eliminate hallucinations in the remaining 10% of outlier generations.
- Integrated ExLlamaV2 kernels on NYU HPC, cutting inference latency by 57.7% (26 s → 11 s).