Bradfo's Projects

TODAY • October 29

Today's App

Gnosis: The Radiologist's Trainer

Developed during the 'AI for Health' hackathon, Gnosis is a training platform for radiology students that uses Google's MedGemma model to interpret medical images (X-rays, CT scans) and generate interactive True/False flashcards. The application, built with Streamlit and deployed on Google App Engine, provides AI-generated, personalized feedback to help students refine their diagnostic skills.

MedGemma • Vertex AI • Streamlit • Python • Google Cloud • Generative AI • HealthTech

Popular Apps

YouTube Sentiment Analysis

End-to-End MLOps Pipeline

Built a complete MLOps pipeline for sentiment analysis on YouTube comments. This project integrates CI/CD with GitHub Actions for automation, data versioning (DVC), and experiment tracking (MLflow) to ensure reproducibility and model quality.

MLOps • CI/CD • GitHub Actions • DVC • Python • MLflow

AI AgentOps Replay

AI Agent Visualization

An agent-agnostic tool that traces, visualizes, and replays AI agent interactions, making agent behavior easier to debug and analyze.

AI Agents • Visualization • Debugging • Software Engineering • Python

Mean Arterial Pressure Prediction

Machine Learning with Domain Adaptation

Ranked 1st out of 178 in the Inria challenge on Mean Arterial Pressure (MAP) prediction. The winning solution focused on domain adaptation techniques to generalize predictions on unseen data, a key challenge in medical machine learning.

Machine Learning • Competition • Domain Adaptation • Python • Scikit-learn

Efficient Checkpointing for Fine-Tuning

25x Model Compression

Implemented an efficient checkpointing scheme for fine-tuning with Delta-LoRA and LC-checkpoint, achieving 25x model compression with no loss in accuracy. Validated its robustness across five vision architectures on a supercomputer.

Deep Learning • Research • Computer Vision • PEFT • LoRA • HPC

HumanAI: Humanitarian Aid Chatbot

RAG Chatbot for Humanitarian Info

Developed a chatbot based on the Retrieval-Augmented Generation (RAG) architecture to provide accurate, context-aware responses from a large knowledge base of humanitarian information.

Generative AI • RAG • LLM • LangChain • Python • NLP
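
The retrieve-then-generate flow behind a RAG chatbot can be sketched roughly as follows. This is a toy illustration, not the project's implementation: the bag-of-words `embed` function, the `retrieve` and `build_prompt` helpers, and the sample documents are all assumptions made for the example. A real system would more likely use dense embeddings with a vector store and send the resulting prompt to an LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base entries, invented for this example.
docs = [
    "Shelters open at 8 am in the northern district",
    "Water distribution points are listed by region",
    "Vaccination campaigns start next month",
]
prompt = build_prompt("where are water distribution points", docs, k=1)
```

Grounding the answer in retrieved passages, rather than relying on the model's parameters alone, is what makes the responses context-aware.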

HYGENE: Hypergraph Generation using Diffusion Models

Extension of the first Diffusion Model for Hypergraph Generation

The first diffusion-based generative model specifically designed for hypergraphs. To overcome the challenges of modeling higher-order relationships, HYGENE employs an innovative iterative expansion mechanism that reverses a spectral coarsening process. The method successfully captures both global structure and local details, outperforming existing baselines in generating structurally valid and realistic hypergraphs.

Research • Deep Learning • Diffusion Models • Generative Models • GNNs • Hypergraphs

Improving Vision-Language Models on Discriminative Tasks

Enhancing Few-Shot VLM Classification

Explores enhancements to the Sparse Attention Vectors (SAVs) method, aiming to improve how Large Vision-Language Models (VLMs) perform on discriminative tasks without costly fine-tuning. The work investigates several novel approaches, including penalty-based scoring for better class separation, non-linear `artanh` transformations to amplify decisive attention heads, and alternative kernel-based similarity metrics. The findings offer insights into designing data-efficient classifiers for low-data domains such as medical or satellite imagery.

Research • Vision-Language Models (VLM) • Few-Shot Learning • Attention Mechanisms • Deep Learning • Data-Efficient

CoVR-2: Automatic Data Construction for Composed Video Retrieval

Extension of Scalable Dataset Creation for Video Search

This research introduces CoVR-2, a framework that pioneers a scalable method for automatically constructing datasets for Composed Video Retrieval (CoVR)—the task of searching for videos using a reference video and a text modifier. By leveraging the BLIP-2 architecture, this work eliminates the need for costly manual annotations. My contributions involved exploring novel strategies to enhance retrieval, including dynamic embedding balancing with an MLP and integrating a consistency loss function to improve model generalization.

Research • Multimodal AI • Video Retrieval • BLIP-2 • Contrastive Learning • Data Construction • Computer Vision

Sketch Image Classification with EVA-CLIP

Achieving 93% Accuracy on ImageNet-Sketch

This project tackled the image classification challenge on the ImageNet-Sketch dataset. Starting with EfficientNet, I explored more advanced models like CLIP and EVA-CLIP. The key contribution lies in the analysis of training strategies: rather than full fine-tuning, the most effective method was to pre-compute features using a frozen foundation model and train only a simple classifier. This feature extraction approach led to a 93% test accuracy with EVA-CLIP while significantly optimizing computation time.

Computer Vision • Image Classification • CLIP • Fine-Tuning • Feature Extraction • PyTorch • Deep Learning
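
The feature-extraction strategy (compute features once with a frozen model, then train only a small classifier on them) might look like the sketch below. Everything here is illustrative: `frozen_encoder` is a hypothetical stand-in for a frozen foundation model such as EVA-CLIP, the inputs are tiny lists rather than images, and a nearest-centroid classifier is used only because it is short. The project's actual classifier is not specified here.

```python
def frozen_encoder(image):
    """Hypothetical stand-in for a frozen backbone: input -> feature vector."""
    return [sum(image) / len(image), max(image) - min(image)]


def precompute_features(images):
    """Run the frozen encoder once; the results can be cached and reused."""
    return [frozen_encoder(img) for img in images]


def fit_centroids(features, labels):
    """Train the 'simple classifier': one mean feature vector per class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, feature):
    """Assign the class whose centroid is closest in squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], feature))


# Toy data invented for the example.
train_images = [[0, 0, 0, 0], [0.1, 0, 0, 0.1], [1, 1, 1, 1], [0.9, 1, 1, 0.9]]
train_labels = ["dark", "dark", "bright", "bright"]

features = precompute_features(train_images)   # backbone runs only once
centroids = fit_centroids(features, train_labels)
prediction = predict(centroids, frozen_encoder([0.8, 0.9, 1.0, 0.9]))
```

Because the backbone never receives gradient updates, its forward pass is paid once per image, which is where the savings in computation time come from.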

Listen To The Wild: Predicting Ecosystem Health

Eco-Acoustic Analysis with Machine Learning

This project addresses the ecological challenge of analyzing massive soundscape datasets. I developed methods to predict the naturalness and richness of ecosystems using acoustic indices. The approach involved extensive supervised and unsupervised learning, training classifiers and regressors on features extracted from audio. We compared traditional acoustic indices from scikit-maad with deep learning embeddings from VGGish to classify environmental sites and predict biodiversity metrics, providing a valuable tool for environmental monitoring.

Bioacoustics • Machine Learning • Audio Processing • Data Science • scikit-maad • VGGish • Python • Ecology

Recommender Systems with Generative Retrieval

Analyzing the TIGER Model (NeurIPS 2023)

This project involved an in-depth presentation and analysis of TIGER, a state-of-the-art model from NeurIPS 2023 that reframes recommendation as a generation task. I detailed how this approach moves beyond traditional retrieve-and-rank methods by generatively retrieving item identifiers ('Semantic IDs') using a Transformer. The presentation covered how TIGER's core component, an RQ-VAE, learns a hierarchical representation of items, which helps address major industry challenges such as the cold-start problem and the retrieval bottleneck.

Recommender Systems • Generative AI • Transformers • Information Retrieval • Deep Learning
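
The hierarchical item representation can be illustrated with the residual-quantization step on its own. The sketch below is a simplified assumption-laden example: it omits the encoder and decoder of an RQ-VAE, uses small hand-picked codebooks instead of learned ones, and the function names `nearest_code` and `semantic_id` are invented for illustration.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def semantic_id(embedding, codebooks):
    """Quantize an embedding level by level into a tuple of code indices.

    Each level quantizes the residual left over by the previous level,
    so earlier indices are coarse and later ones refine them.
    """
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        ids.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(ids)


# Hand-picked two-level codebooks (assumption; a real model learns these).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],     # coarse level
    [[0.0, 0.0], [0.1, -0.1]],    # fine level
]
item_a = semantic_id([1.1, 0.9], codebooks)
item_b = semantic_id([0.05, 0.0], codebooks)
```

Items with similar content share a prefix of their Semantic ID, which is one reason a new item with no interaction history can still be placed near related items.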

Classifier-Free Diffusion Guidance

Implementing a Core Technique for Generative AI (NeurIPS 2021)

This project provides a deep dive into Classifier-Free Diffusion Guidance, a foundational technique from NeurIPS that simplifies high-fidelity image generation. I reviewed the paper and implemented the method, which removes the need for an external classifier by jointly training conditional and unconditional models. The implementation was validated on CIFAR-10, producing high-quality images, and experiments on ImageNet successfully reproduced the key fidelity-diversity trade-off curves from the original research, demonstrating the method's robustness.

Generative Models • Diffusion Models • Computer Vision • Deep Learning • Image Generation
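
The two ingredients of classifier-free guidance can be sketched in a few lines: randomly dropping the conditioning label during training, and mixing the conditional and unconditional noise predictions at sampling time. This is a minimal sketch rather than the project's code. The function names, the use of `None` as the null label, and plain Python lists in place of tensors are assumptions; the mixing rule follows the commonly cited form (1 + w) * eps_cond - w * eps_uncond.

```python
import random


def maybe_drop_label(label, p_uncond, rng):
    """Training time: replace the label with a null token with prob. p_uncond.

    Training one network on both labeled and null-labeled inputs is what
    lets it act as a conditional and an unconditional model at once.
    """
    return None if rng.random() < p_uncond else label


def guided_noise(eps_cond, eps_uncond, w):
    """Sampling time: push the prediction away from the unconditional one.

    w = 0 recovers the purely conditional prediction; larger w trades
    sample diversity for fidelity to the condition.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


rng = random.Random(0)
label_for_step = maybe_drop_label("cat", p_uncond=0.1, rng=rng)
mixed = guided_noise([1.0, 2.0], [0.5, 1.0], w=1.0)
```

Sweeping `w` over a range of values is how a fidelity-diversity trade-off curve is typically traced out.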

Unifying GANs and Score-Based Diffusion Models

A Deep Dive into Generative Particle Models (NeurIPS 2023)

Explained how both GANs and score-based diffusion models can be viewed as 'Generative Particle Models' (GPMs), where samples evolve over time according to a differential equation. The talk detailed this unifying perspective and explored the novel hybrid models it introduces, such as Score GANs (diffusion with a generator) and Discriminator Flows (GANs without a generator), bridging the gap between two major classes of generative AI.

Generative Models • GANs • Diffusion Models • NeurIPS • Theoretical ML • Deep Learning

Interpretability with Knowledge Distillation

Exploring the Hidden Power of KD for Explainable AI

Authored a comprehensive blog post analyzing how knowledge distillation (KD) enhances model interpretability, based on the ICML 2023 paper by Han et al. I detailed how KD transfers rich class-similarity information through soft targets, guiding student models to develop more object-centric and human-aligned features. The post contrasts this with Label Smoothing and explains the 'Network Dissection' methodology for quantifying interpretability. The work also included a hands-on reproduction of the paper's key experiments to validate the findings.

Explainable AI (XAI) • Interpretability • Knowledge Distillation • Deep Learning • ICML • Model Compression
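
How soft targets carry class-similarity information can be seen in a small sketch of a temperature-scaled distillation loss. This follows the widely used formulation (KL divergence between softened teacher and student distributions, scaled by the squared temperature) and is not necessarily the exact loss used in the paper or the reproduction. The logits and the default temperature are made up for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer targets."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, times T squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return temperature ** 2 * kl


# Hypothetical logits for the classes (cat, dog, truck).
teacher = [6.0, 4.0, -2.0]
student = [3.0, 0.5, 0.0]

hard = softmax(teacher, temperature=1.0)   # nearly one-hot
soft = softmax(teacher, temperature=4.0)   # reveals that "dog" resembles "cat"
loss = distillation_loss(student, teacher)
```

A one-hot label says only that the image is a cat, while the softened teacher output also says a dog is a far more plausible confusion than a truck. Label smoothing, by contrast, spreads probability uniformly and carries no such structure.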