Senior Data Scientist with 6+ years of experience in natural language processing (NLP), statistical modeling, and machine and deep learning (ML/DL). Proven ability to design scalable machine learning pipelines, optimize algorithms for real-world applications, and lead cross-functional teams to meet business-critical goals. Expertise in Python, R, TensorFlow, PyTorch, and AWS, with strong problem-solving and communication skills. Passionate about delivering data-driven insights that drive innovation in healthcare and technology.
Overview
8 years of professional experience
1 Certification
Work History
Sr. Machine Learning Engineer
Sourcescrub
San Francisco, USA (Remote)
01.2022 - 09.2024
Led communication and organization within the team and with other teams for 18 months after the team lead departed, resulting in improved performance and collaboration.
Collaborated with multi-disciplinary product development teams to identify ML opportunities and integrate trained models.
Prototyped machine learning applications and quickly determined application viability.
Automated company classification and description generation, scaling coverage from under 5 million to 20+ million companies and significantly increasing deals by replacing manual processes with automated ones.
Implemented an automatic re-training pipeline in collaboration with DevOps, an architect, and humans in the loop, covering model re-training, comparison, deployment, and performance monitoring. This drastically reduced manual workload and introduced MLOps practices.
Composed production-grade code to convert machine learning models into services and pipelines to be consumed at web-scale with optimal performance and load capabilities.
Trained and mentored a new machine learning engineer, enhancing their independence and project delivery by conducting frequent peer reviews, one-on-one coaching, and providing infrastructure guidance.
Supported AI-first transformation efforts by assisting 8 engineers (including 3 lead engineers and 3 architects) and collaborating with DevOps to ensure accessible infrastructure for out-of-the-box model deployment and usage.
Delivered state-of-the-art models tailored to company needs, including revenue estimation, valuation, and text quality assessment across diverse data sources like Glassdoor, LinkedIn, and human annotations.
Full Stack, Data Scientist & More
Personal Project - BiosignalsIT
Zapopan, Mexico
05.2021 - Current
Developed a prototype in 4 months, integrating hardware selection, connectivity to a custom iOS/Android app, and AWS servers and databases for analyzing and storing biomedical signals and clinical data to assess risk factors.
Consulted after the PRECISE project on the analysis of PPG signals in pregnant women, with the aim of validating diagnostic capabilities in hypertensive cases against Doppler as the gold standard.
Provided knowledge transfer after the PRECISE project, supporting a researcher and an undergraduate student from the University of Strathclyde in PPG signal analysis methods and applications.
Machine Learning Engineer
Espressive
Zapopan, Mexico
02.2020 - 03.2021
Developed a new paraphrase model in 3 months, the heart of the then-current intent matching. The new model improved the most relevant pain points: 1.8x better with partial matches, 2x more consistent with different entity values, and 3x better at distinguishing between positive and negative phrases.
Developed a text classifier, a named entity recognition model, a phrase generation model, and a couple of tools for data cleaning.
Algorithms Specialist
Soluciones Kenko
Zapopan, Mexico
04.2019 - 01.2020
Developed a model in 3 months to approximate standard electrocardiograms (ECG) using data from the ECGlove, a glove worn by a doctor and positioned on the patient's chest.
Validated the model on public databases prior to clinical trials.
Designed the clinical trial protocol in 2 months, in accordance with international and national regulations.
Research Trainee
BC Children's Hospital Research Institute
Vancouver, Canada
09.2016 - 12.2018
Sponsored by the Bill and Melinda Gates Foundation, CONACYT-COECYTJAL, and MITACS.
Designed the clinical protocol for a PPG study in pregnant women, collaborating with a global health initiative (PRE-EMPT) with team members in Canada, England, Pakistan, and Mozambique.
Implemented an algorithm to automatically assess the quality of 17,000+ photoplethysmography (PPG) signals.
Developed algorithms to extract PPG features from 11,000+ pregnant women and performed clinical data cleaning and selection.
Increased understanding of gestational age and blood pressure effects on PPG signals.
Identified a 3x higher risk of perinatal death in hypertensive pregnancies using the extracted PPG features.
Developed strong working relationships with fellow researchers, fostering a positive and collaborative work environment.
Consultant
Pathonix
Vancouver, Canada (Remote)
11.2018 - 11.2019
Evaluated client needs and expectations, defining clear goals for consulting engagements and maintaining strong relationships through regular progress updates.
Performed research reviews and preprocessed data by cleaning and aligning PPG and blood pressure signals for analysis.
Implemented an algorithm to estimate PPG signal quality and tested associations between blood pressure changes and PPG features in an animal model.
Education
Master of Science - Biomedical Engineering
University of British Columbia
Vancouver, Canada
12.2018
Bachelor of Science - Biomedical Engineering
Universidad de Guadalajara
Guadalajara, Mexico
08.2015
Skills
Machine learning, model development & feature engineering: Bayesian inference, probabilistic models, semi-supervised learning, anomaly detection, neural networks, ensemble methods, support vector machines, reinforcement learning, dimensionality reduction, transfer learning, random forests, clustering algorithms, k-nearest neighbors, time series analysis
Soft skills: attention to detail, problem-solving abilities, teamwork and collaboration, organizational skills, data-driven decision making
Natural Language Processing: Transformers, LangChain, GPT, OpenAI, Llama, LLMs, Hugging Face
ML Frameworks: PyTorch, TensorFlow, Trax, scikit-learn, ONNX