Data wrangling
Handle real-world data shape issues: join constraints, anonymized rows, nested JSON, and aggregation strategy.
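As a rough illustration of these data shape issues, the sketch below flattens nested JSON records and aggregates them per page while keeping anonymized rows visible instead of silently dropping them. The field names (`page`, `query`, `metrics.clicks`, `metrics.impressions`), the sample records, and the convention that an anonymized row has a null `query` are all assumptions made for this example, not a schema the track prescribes.

```python
import json
from collections import defaultdict

# Hypothetical nested export; field names and values are illustrative only.
RAW = """
[
  {"page": "/pricing", "query": "pricing plans", "metrics": {"clicks": 12, "impressions": 300}},
  {"page": "/pricing", "query": null, "metrics": {"clicks": 3, "impressions": 90}},
  {"page": "/docs", "query": "api docs", "metrics": {"clicks": 7, "impressions": 70}}
]
"""


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def aggregate_by_page(rows):
    """Sum clicks and impressions per page; count anonymized clicks separately."""
    totals = defaultdict(
        lambda: {"clicks": 0, "impressions": 0, "anonymized_clicks": 0}
    )
    for row in rows:
        bucket = totals[row["page"]]
        bucket["clicks"] += row["metrics.clicks"]
        bucket["impressions"] += row["metrics.impressions"]
        # Assumption: a missing query marks an anonymized row.
        if row["query"] is None:
            bucket["anonymized_clicks"] += row["metrics.clicks"]
    return dict(totals)


rows = [flatten(record) for record in json.loads(RAW)]
page_totals = aggregate_by_page(rows)
```

Tracking anonymized clicks as their own column is one possible aggregation strategy: page totals stay correct, and the share of traffic that cannot be attributed to a query remains measurable. In a pandas workflow, `pandas.json_normalize` produces a similar flattened table with dotted column names.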
Build intelligence layers on messy data with a measurable quality bar and clear business impact.
This track is for people who want to practice applied machine learning on real-world search and analytics data. You will work with embeddings, clustering, intent classification, opportunity scoring, and insight-to-action reporting. The emphasis is on connecting numbers to decisions, not just building charts.
Time commitment: 6-10 hours per week.
Prerequisites: basic coding and spreadsheet comfort. No ML degree required.
Deliverable: a reproducible notebook or script with insights, recommendations, and a presentation outline.
Review: async feedback on data handling, model approach, insight quality, and actionability.
The goal is proof of work, not passive course completion.
Group queries or pages by meaning and map clusters to content coverage gaps or cannibalization.
Classify intent beyond generic buckets and score where the biggest wins are hiding in existing data.
Connect model output to specific recommendations and define what counts as a non-obvious finding.
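One possible way to score where the wins are hiding is to estimate the clicks a query would gain if its click-through rate reached a target value. The sketch below is an assumption-laden illustration, not the track's scoring method: the `opportunity_score` function, the single `TARGET_CTR` benchmark, and the sample rows are all hypothetical. A real analysis would likely use a position-dependent benchmark and combine the score with intent labels.

```python
def opportunity_score(impressions, clicks, target_ctr):
    """Estimated extra clicks if the row reached target_ctr (never negative)."""
    if impressions <= 0:
        return 0.0
    actual_ctr = clicks / impressions
    return max(0.0, (target_ctr - actual_ctr) * impressions)


# Hypothetical query-level rows; values are made up for illustration.
rows = [
    {"query": "a", "impressions": 1000, "clicks": 10},
    {"query": "b", "impressions": 200, "clicks": 30},
    {"query": "c", "impressions": 5000, "clicks": 200},
]

# Assumption: one flat benchmark CTR for every row.
TARGET_CTR = 0.05

# Rank rows from largest to smallest estimated gain.
ranked = sorted(
    rows,
    key=lambda r: opportunity_score(r["impressions"], r["clicks"], TARGET_CTR),
    reverse=True,
)
```

Here query `c` ranks first even though its click-through rate is closer to the target than that of `a`, because its impression volume is much larger. A ranking like this only becomes a recommendation once it is tied to a specific action, such as rewriting a title or consolidating competing pages.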
Builder means you have shipped a reproducible ML analysis with semantic methods, non-obvious insights, and recommendations that connect data to a specific action.
The track is designed around accessible tools and clear alternatives. Use the list below as a practical setup check before applying.
Local Jupyter, Google Colab, or another notebook runtime
Your prototype must be reproducible from a script or notebook.
Public datasets, sample CSV exports, BigQuery sandbox, or synthetic search analytics data
Do not share private client data. Use public or synthetic data for submissions.
pandas, scikit-learn, sentence-transformers, HDBSCAN, or equivalent open-source stack
Pick tools that fit your time budget; depth of thinking matters more than stack polish.