World models & simulation @ Waymo · NeurIPS 2024 first-author · ex-Google Brain

Miles Hutson

Currently building a production-focused world model at Waymo, expanding it to represent high-dimensional outputs while keeping inference costs down. Previously model lead on DermAssist at Google Health. First author of a NeurIPS 2024 paper.

Experience

Where I've worked

  1. Sep 2022 – present

    Waymo

    Senior Software Engineer

    Now: Production-focused world model

    Expanding the world model to represent high-dimensional outputs while keeping inference costs down.

    • Rearchitected the transformer from a behavior-prediction model into a world-model architecture, cutting overall parameter count by folding specialized modules into the decoder.
    • Implemented a VQ-VAE that enables high-dimensional outputs and replaces bespoke, param-heavy per-output vocabularies with one compressed output space.
    • Serve as TL for ~5 ML engineers working on the model.

    Prev: ML for Road Understanding

    Improved the perception model's ability to understand the semantics of construction zones.

    • Foundational model loss & architecture improvements.
    • Evaluation methods that raised the quality of the deployed model.

    Prev: ML for Behavior Prediction

    Predicted the actions of cars, cyclists, and pedestrians so the car could safely share the road.

    • Designed and ran ablations to reduce inputs and model complexity.
  2. Feb 2017 – Sep 2022

    Google

    Senior Software Engineer

    DermAssist · Google Health Dermatology

    Model lead. Identified the most promising research and shepherded it into the commercial product.

    • Trained the majority of models in the production classification ensemble; led ensemble distillation to cut resource use.
    • Built continual-model-update infrastructure and new performance metrics for differential-diagnosis models.
    • Also frontend TL for the CE-Mark-approved product (demoed at Google I/O); shipped the on-device TensorFlow.js image-quality checks.
    • Onboarded new team members.

    Medical labeling infrastructure · Google Health

    Applied-ML collaborations across Google

    • CNN for endotracheal and nasogastric tube placement detection.
    • T5-based tool to help job applicants formulate interview responses.

    Earlier

    • Google Brain.
    • Ranking team within Google Cloud.
  3. Jan – Dec 2016

    University of Texas at Austin

    Undergraduate Research Assistant

  4. May – Aug 2016

    Fitbit

    Software Engineering Intern

    • Sleep and wellness algorithms. Built tooling to compare research models against production behavior on the same inputs; contributed to the work behind Fitbit Sleep Stages.
  5. Aug – Dec 2015

    Texas Tribune

    Digital Media Intern

    • Data visualization for news stories.
    • Internal tool that let journalists explore the Tribune's structured datasets.
  6. May – Aug 2015

    Blackbaud

    Software Engineering Intern

    • Debugged and shipped features on a platform for non-profit fundraising.
  7. 2012 – 2015

    The Daily Texan

    Digital Projects Lead → Senior Staff → Reporter

    • Led a 5-person interactives team. D3, Drupal, Python.

Education

School

Selected work

Things I've built

Production-focused world model — Waymo

Waymo · 2024 – present · Senior SWE

Current focus: expanding a production-focused world model to represent high-dimensional outputs while keeping inference costs down.

Figure: architecture diagram from the Policy-Shaped Prediction paper.

Policy-Shaped Prediction

NeurIPS 2024 · 2024 · First author

Reconstruction-based world models (DreamerV3, DreamerPro) waste capacity modeling pixel detail that's irrelevant to the task. We use a pretrained segmentation model, a task-aware reconstruction loss, and adversarial learning to focus the world model on what matters for control — recovering performance under intricate, predictable, but useless distractors.

DermAssist

Google Health · 2019 – 2022 · Model lead · Frontend TL

Consumer dermatology tool — computer vision to suggest possible matches for skin, hair, and nail conditions. CE-Mark approved, demoed at Google I/O.

As model lead I trained the majority of the production classification ensemble, designed the differential-diagnosis metric, and led the ensemble distillation that shrank the model's footprint.

As frontend TL I shipped the on-device TensorFlow.js image-quality checks. I also built the continual-update pipeline and the Post-Market Monitoring system that tracks live model performance in the wild.

Drone depth from a single camera

2021 · Monocular depth · VR

Monocular depth prediction trained on drone footage, then re-projected into a 3D point cloud you could walk through in VR. The fun part: depth-from-motion gives you most of the signal without needing stereo rigs or LiDAR.
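For readers unfamiliar with the re-projection step, here is a minimal sketch of how a per-pixel depth map can be turned into a point cloud with a pinhole camera model. It is an illustration under stated assumptions, not the project's actual code: the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, and the depth values would come from whatever depth model is in use.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-frame 3D points (pinhole model).

    depth[v][u] is the predicted distance along the optical axis for the
    pixel in row v, column u. fx, fy are focal lengths in pixels and
    cx, cy is the principal point; all four are assumed, not taken from
    the original project.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                # Skip pixels with missing or invalid depth.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    tiny_depth = [[2.0, 2.0], [0.0, 4.0]]
    for point in depth_to_points(tiny_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(point)
```

The resulting list of `(x, y, z)` tuples is what a VR viewer would render as a point cloud; a real pipeline would likely vectorize this and attach per-point color.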

Transcribing screenshots of Reddit posts

2020 · OCR · Transformers

An OCR + Transformer baseline for extracting text from Reddit posts that get shared around as screenshots, plus the dataset I built and trained it on.

Catbot

2019 · YOLO · Pi · OpenCV

YOLO-based pursuit robot, originally designed to chase a cat. Reprogrammed mid-demo to chase water bottles for safety reasons.

Hardware

  • Raspberry Pi 3
  • Raspberry Pi Camera
  • Adafruit motor controller
  • Ultrasonic rangefinder
  • 2× OSEPP tank platform kits

DIY Robocar

2019 · Particle filter · Lane detection

Autonomous RC car for DIY Robocar races. Iterated through two approaches — a simulator-trained particle filter, then a perspective-transform + hue-based lane detector.

Approach 1 — particle filter

Built a track simulator and trained particle-filter localization with online path planning against it. Worked well in sim; transferred poorly to the physical track.
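As a rough illustration of particle-filter localization, the sketch below runs the predict, weight, and resample cycle on a toy 1D track. It is not the simulator or filter from the project: the wall-distance sensor, the noise levels, and the names `particle_filter_step` and `estimate` are all assumptions made for the example.

```python
import math
import random
from typing import List


def gaussian_likelihood(error: float, sigma: float) -> float:
    """Unnormalized Gaussian likelihood of a measurement error."""
    return math.exp(-0.5 * (error / sigma) ** 2)


def particle_filter_step(
    particles: List[float],
    control: float,
    measurement: float,
    rng: random.Random,
    motion_sigma: float = 0.1,
    sensor_sigma: float = 0.5,
    wall: float = 10.0,
) -> List[float]:
    """One predict / weight / resample cycle on a 1D track.

    Assumed toy setup: the car moves by `control` each step and a range
    sensor reports the distance to a wall at position `wall`.
    """
    # 1. Predict: apply the control to every particle, with motion noise.
    moved = [p + control + rng.gauss(0.0, motion_sigma) for p in particles]
    # 2. Weight: score each particle by how well it explains the measurement.
    weights = [
        gaussian_likelihood(measurement - (wall - p), sensor_sigma)
        for p in moved
    ]
    if sum(weights) == 0.0:
        # Every particle is implausible; fall back to uniform weights.
        weights = [1.0] * len(moved)
    # 3. Resample: draw a new particle set in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles: List[float]) -> float:
    """Position estimate: the mean of the particle set."""
    return sum(particles) / len(particles)


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    true_x = 2.0
    for _ in range(10):
        true_x += 0.5
        noisy_range = 10.0 - true_x + rng.gauss(0.0, 0.5)
        particles = particle_filter_step(particles, 0.5, noisy_range, rng)
    print("true:", true_x, "estimated:", estimate(particles))
```

A filter like this depends on the motion and sensor models matching reality, which is one plausible reason an approach that works in simulation can transfer poorly to a physical track.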

Approach 2 — lane detection

Simpler and more robust: perspective transform from the onboard camera, hue-based segmentation of the painted lane lines, and a steering controller driven by the detected lane geometry.
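A minimal sketch of the hue-segmentation and steering steps, assuming the image row has already been perspective-corrected (the perspective transform itself is omitted here). This is not the car's actual code: the yellow hue band, the saturation and brightness thresholds, the proportional gain, and the function names are all hypothetical choices for illustration.

```python
import colorsys
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


def is_lane_pixel(
    rgb: RGB,
    hue_lo: float = 0.10,
    hue_hi: float = 0.20,
    min_sat: float = 0.4,
    min_val: float = 0.4,
) -> bool:
    """True if the pixel falls in an assumed yellow hue band and is vivid enough."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return hue_lo <= h <= hue_hi and s >= min_sat and v >= min_val


def lane_center(row: List[RGB]) -> Optional[float]:
    """Mean column index of lane-colored pixels in one image row, if any."""
    cols = [u for u, px in enumerate(row) if is_lane_pixel(px)]
    if not cols:
        return None
    return sum(cols) / len(cols)


def steering_command(row: List[RGB], gain: float = 1.0) -> float:
    """Proportional steering in [-1, 1]; negative steers left, positive right."""
    center = lane_center(row)
    if center is None or len(row) < 2:
        # No lane paint visible (or a degenerate row): hold the wheel straight.
        return 0.0
    half_width = (len(row) - 1) / 2.0
    offset = (center - half_width) / half_width
    return max(-1.0, min(1.0, gain * offset))


if __name__ == "__main__":
    gray, yellow = (90, 90, 90), (255, 220, 0)
    print(steering_command([gray, gray, gray, yellow, yellow]))
```

Thresholding on hue rather than raw RGB tends to be less sensitive to lighting changes, which fits the "simpler and more robust" framing, though the right band depends on the actual paint color and camera.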

Photography

Through the lens

Contact

Reach out

Working on something at the intersection of ML, simulation, or robotics? Happy to talk.

GitHub
@CuriousG102
LinkedIn
/in/mileshutson
Medium (archive)
mileshutson.medium.com