Sareh Soltani Nejad

Data Scientist / Machine Learning Researcher

BrainsCAN

About

I’m a Data Scientist and Machine Learning Specialist with over 5 years of hands-on experience developing scalable ML solutions and extracting insights from complex, high-dimensional datasets. I am passionate about creating impactful solutions, especially in the healthcare and finance sectors, and I am always keen to learn and grow in areas that merge ML and real-world applications.

I completed my M.Sc. in Computer Science at Western University, where my thesis developed a weakly supervised anomaly detection model for surveillance videos using a Two-Stream I3D Convolutional Network. I completed my B.Sc. in Computer Engineering at Amirkabir University, focusing on real-time tracking systems.

My research interests include ML fundamentals, Computer Vision, Biomedical Computing, and Natural Language Processing. On the applied side, I’m interested in ML problems at scale: modelling, training, deployment, and evaluation. I am currently looking for my next job opportunity in machine learning.

Technical Skills:

  • Programming Languages: Python, SQL, C, JavaScript
  • Frameworks/Libraries: PyTorch, TensorFlow, Keras, Scikit-learn, NumPy, Pandas, Matplotlib, Seaborn, Tableau
  • NLP Tools & Models: HuggingFace, spaCy, NLTK, BERT, scGPT, scVI, SciPy, Scanpy
  • Other Skills: GCP, AWS EC2, Docker, Kubernetes, PySpark, FastAPI, Streamlit, Git, MLflow, Optuna, W&B, Slurm
Interests
  • Machine Learning
  • Applied ML in Healthcare
  • Computer Vision
  • Biomedical Computing
  • Natural Language Processing (NLP)
Education
  • MSc in Computer Science

    Western University, Canada

  • BSc in Computer Engineering

    Amirkabir University

Experience

BrainsCAN
Data Scientist - AI Engineer
February 2024 – Present London, ON
  • Collaborated on the large-scale OMMABA project to explore the impact of music perception on brain functionality and develop a multimodal dataset integrating behavioral, EEG, and fMRI data from 60 participants.
  • Developed a robust data preprocessing pipeline to enhance data quality, consistency, and usability for downstream analysis.
  • Achieved 94% accuracy in ECG arrhythmia classification by developing deep learning models (1D CNNs, RNNs) and traditional ML algorithms (SVM, XGBoost), with statistical insights into model performance.
  • Deployed a production-ready pipeline via Dockerized FastAPI, enabling scalable, real-time arrhythmia detection.
  • Annotated immune cell types from 20K single-cell RNA-seq blood samples using generative variational autoencoders (VAE) and scGPT (transformer-based model), leading to improved accuracy and interpretability.
  • Tools: Pandas, Scikit-learn, TensorFlow, HuggingFace, Scanpy, SciPy, Seaborn, Optuna, MLflow, FastAPI, Docker, AWS

Vector Institute for AI
Machine Learning Engineer Technical Facilitator
September 2023 – December 2023 Toronto, Canada
  • Facilitated two cohorts of the Anomaly Detection Bootcamp as part of the Vector ML Experts team.
  • Deployed tailored ML solutions to address anomaly detection use cases for 12 companies by collaborating with stakeholders.
  • Developed an ML-based fraud detection framework using ensemble methods (LightGBM), boosting model accuracy by 27% and reducing false positives by 15%.
  • Implemented a DL-based TabNet model for financial fraud detection, achieving 88% accuracy on transaction data.
  • Built a scalable video anomaly detection framework using multiple instance ranking, reaching an AUC of 85%.
  • Tools: Docker, GCP, Pandas, SQL, PySpark, Scikit-learn, PyTorch, Streamlit, T-Test, Matplotlib, Tableau, W&B, Git, Slurm

Vector Institute for AI
Data Scientist Intern
May 2023 – August 2023 Toronto, Canada
  • Led an Anomaly Detection Workshop for 50+ professionals, delivering hands-on training on advanced techniques.
  • Delivered a reference fraud detection demo using a credit card dataset, achieving 93% AUC with an AutoEncoder.
  • Conducted an analysis for a pharma company, integrating lab data from 1,000 patients to assess the drug's impact on BMI and blood pressure, and applying statistical, subgroup, and outlier analyses with external data to optimize clinical trial design.
  • Leveraged supervised models (Random Forest, XGBoost) to estimate treatment effects, predicting a 12% reduction in BMI and a 3% reduction in blood pressure.

Western University
Machine Learning Research Engineer
September 2021 – August 2023 London, Canada
  • Designed a novel weakly-supervised video-anomaly detection system built on a two-stream I3D ConvNet.
  • Built a data pipeline to process 1TB+ of video data from UCF-Crime benchmark, leveraging Multiple Instance Learning.
  • Automated extraction of appearance (RGB) and motion (optical-flow) embeddings through parallel two-stream I3D encoders.
  • Devised a late-fusion strategy that improved accuracy by 20%, achieving an 85% AUC and surpassing published baselines.
  • Tools: Python, PyTorch, OpenCV, Weights & Biases, Matplotlib, Git

IPM & Sharif Brain Center
Machine Learning Engineer
July 2020 – July 2021
  • Applied NER and topic modeling to extract structured insights from unstructured EHR doctors’ notes, streamlining patient records and reducing manual chart review time by 30%.
  • Used large language models like BERT for clinical treatment categorization.
  • Developed 3D medical imaging visualizations from CBCT data, improving diagnostic accuracy for 50 patients.
  • Tools: Python, Pandas, LLM, HuggingFace, BERT, PyTorch, NLTK, spaCy
