Data Scientist – Mount Sinai Health System
Jul 2024 – Present • New York, NY
- Deployed a fine-tuned BERT model for real-time symptom detection on Triton Inference Server and Docker, reducing latency from 387 ms to 71 ms per clinical note across more than 9,500 notes.
- Implemented an LLM chat agent using Ollama and MCP servers, enabling natural-language querying of PostgreSQL and data folders.
- Architected a full-stack patient dashboard in a local Linux container, integrating ML models, chatbots, and an exploratory data interface; 6 physicians use it daily, cutting chart lookup time by 65%.
- Set up real-time Grafana dashboards in a Linux container to monitor servers and infrastructure via Prometheus and InfluxDB, providing 24-hour visibility and cutting troubleshooting time.
- Built a dual-stream LSTM fusing airflow features, boosting 3-way breathing-pattern detection F1 from 0.85 to 0.91.
Data Science Intern – LOCOMeX
Jan 2024 – May 2024 • New York, NY
- Designed a two-tower RFP recommendation system using bi-encoder and BERT embeddings; rated beneficial by 75% of initial customers.
- Deployed a metadata extraction service on AWS Lambda, cutting document parsing time by 80%.
- Migrated Python scripts to Airflow ETL pipelines, saving analysts 7 hours per week.
Research Assistant – NYU Stern
Sep 2023 – Dec 2023 • New York, NY
- Fine-tuned Llama-2-13B to produce the target structure in 90% of generations; added Retrieval-Augmented Generation for zoning docs.
- Applied prompt engineering and few-shot learning to eliminate hallucinations in the remaining 10% of outlier generations.
- Integrated ExLlama V2 kernels on NYU HPC, cutting inference latency by 57.7% (26 s → 11 s).