Barbara Macias

Zapopan, Jalisco

Summary

Senior Data Scientist with 6+ years of experience in natural language processing (NLP), statistical modeling, and machine learning and deep learning (ML/DL). Proven ability to design scalable machine learning pipelines, optimize algorithms for real-world applications, and lead cross-functional teams to meet business-critical goals. Expertise in Python, R, TensorFlow, PyTorch, and AWS, with strong problem-solving and communication skills. Passionate about delivering data-driven insights that drive innovation in healthcare and technology.

Overview

8 years of professional experience
1 Certification

Work History

Sr. Machine Learning Engineer

Sourcescrub
01.2022 - 09.2024
  • Led communication and organization within the team and with other teams for 18 months after the team lead departed, resulting in improved performance and collaboration.
  • Collaborated with multi-disciplinary product development teams to identify ML opportunities and integrate trained models.
  • Prototyped machine learning applications and quickly determined application viability.
  • Automated company classifications and description generation to scale from under 5 to 20+ million companies, significantly increasing deals by transitioning from manual to automated processes.
  • Implemented an automated re-training pipeline in collaboration with DevOps, an architect, and humans-in-the-loop, covering model re-training, comparison, deployment, and performance monitoring; this applied MLOps practices and drastically reduced manual workload.
  • Composed production-grade code to convert machine learning models into services and pipelines to be consumed at web-scale with optimal performance and load capabilities.
  • Trained and mentored a new machine learning engineer, enhancing their independence and project delivery by conducting frequent peer reviews, one-on-one coaching, and providing infrastructure guidance.
  • Supported AI-first transformation efforts by assisting 8 engineers (including 3 lead engineers and 3 architects) and collaborating with DevOps to ensure accessible infrastructure for out-of-the-box model deployment and usage.
  • Delivered state-of-the-art models tailored to company needs, including revenue estimation, valuation, and text quality assessment across diverse data sources like Glassdoor, LinkedIn, and human annotations.

Full Stack, Data Scientist & +

Personal Project - BiosignalsIT
05.2021 - Current
  • Developed a prototype in 4 months, integrating hardware selection, connectivity to a custom iOS/Android app, and AWS servers and databases for analyzing and storing biomedical signals and clinical data to assess risk factors.
  • Consulted post-PRECISE project on the analysis of PPG signals in pregnant women, with the aim of validating diagnostic capabilities in hypertensive cases against Doppler as the gold standard.
  • Provided knowledge transfer post-PRECISE project, supporting a researcher and an undergraduate student from the University of Strathclyde on PPG signal analysis methods and applications.

Machine Learning Engineer

Espressive
02.2020 - 03.2021
  • Developed a new paraphrase model in 3 months, the heart of the then-current intent matching. The new model improved the most relevant pain points: 1.8x better with partial matches, 2x more consistent with different entity values, and 3x better at distinguishing between positive and negative phrases.
  • Developed a text classifier, named entity recognition model, a model to generate phrases, and a couple of tools for data cleaning.

Algorithms Specialist

Soluciones Kenko
04.2019 - 01.2020
  • Developed a model in 3 months to approximate standard electrocardiograms (ECG) using data from the ECGlove. The ECGlove is a glove used by a doctor and must be positioned on the patient's chest.
  • Validated the model on public databases prior to clinical trials.
  • Designed the clinical trial protocol in 2 months, in accordance with international and national regulations.

Research Trainee

BC Children's Hospital Research Institute
09.2016 - 12.2018
  • Sponsored by the Bill and Melinda Gates Foundation, CONACYT-COECYTJAL, and MITACS.
  • Designed the clinical protocol for a PPG study in pregnant women, collaborating with a global health initiative (PRE-EMPT) with teams in Canada, England, Pakistan, and Mozambique.
  • Implemented an algorithm to automatically assess the quality of 17,000+ photoplethysmography (PPG) signals.
  • Developed algorithms to extract PPG features from 11,000+ pregnant women and performed clinical data cleaning and selection.
  • Increased understanding of gestational age and blood pressure effects on PPG signals.
  • Identified a 3x higher risk of perinatal death in hypertensive pregnancies using the extracted PPG features.
  • Developed strong working relationships with fellow researchers, fostering a positive and collaborative work environment.

Consultant

Pathonix
11.2018 - 11.2019
  • Evaluated client needs and expectations, defining clear goals for consulting engagements and maintaining strong relationships through regular progress updates.
  • Performed research reviews and preprocessed data by cleaning and aligning PPG and blood pressure signals for analysis.
  • Implemented an algorithm to estimate PPG signal quality and tested associations between blood pressure changes and PPG features in an animal model.

Education

Master of Science - Biomedical Engineering

University of British Columbia
Vancouver, Canada
12-2018

Bachelor of Science - Biomedical Engineering

Universidad de Guadalajara
Guadalajara, Mexico
08-2015

Skills

  • Machine learning, model development & feature engineering: Bayesian inference, probabilistic models, semi-supervised learning, anomaly detection, neural networks, ensemble methods, support vector machines, reinforcement learning, dimensionality reduction, transfer learning, random forests, clustering algorithms, k-nearest neighbors, time series analysis
  • Soft skills: attention to detail, problem-solving abilities, teamwork and collaboration, organizational skills, data-driven decision making
  • Natural Language Processing: Transformers, LangChain, GPT, OpenAI, Llama, LLM, Hugging Face
  • ML Frameworks: PyTorch, TensorFlow, Trax, scikit-learn, ONNX
  • Data Analysis: pandas, NumPy, Spark
  • Experiment tracking: MLflow, Weights & Biases, TensorBoard
  • Hyperparameter optimization: Weights & Biases, Ray Tune
  • Hosted notebooks management: Databricks, Google Colab, Amazon Cluster
  • Model deployment and serving: Flask, Azure Machine Learning, REST APIs, httpx
  • Model performance: Locust, Postman
  • Data version control: Databricks Feature Store & Delta Lake, DVC
  • Development: Git, Visual Studio, agile methodologies
  • Visualization: Altair, Seaborn, Matplotlib
  • Data streaming: Kafka
  • Model explainability: SHAP
  • R & MATLAB
  • Databases: SQL, MariaDB
  • App development: Kotlin, Xcode

Certification

Natural Language Processing Specialization - DeepLearning.AI - Coursera

Applied Data Science Specialization - IBM - Coursera

IBM Data Science Professional Certificate - IBM - Coursera

Languages

Spanish
Bilingual or Proficient (C2)
English
Advanced (C1)
