Outline

HSTU

Most Deep Learning Recommendation Models (DLRMs) in industry fail to scale with compute. We reformulate recommendation problems as sequential transduction tasks within a generative modeling framework (“Generative Recommenders”).

We observe that any alternative formulation at billion-user scale must overcome three challenges.

  • First, features in recommendation systems lack an explicit structure.
  • Second, recommendation systems use billion-scale vocabularies that change continuously.
  • Finally, computational cost represents the main bottleneck in enabling large-scale sequential models.

In this work, we treat user actions as a new modality in generative modeling, and show that:

  • core ranking and retrieval tasks in industrial-scale recommenders can be cast as generative modeling problems given an appropriate new feature space;
  • this paradigm enables us to systematically leverage redundancies in features, training, and inference to improve efficiency.
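The casting of ranking/retrieval as generative modeling can be sketched concretely. Below is a minimal, hypothetical illustration (the token scheme and function names are assumptions, not the paper's exact feature space): a user's chronological engagement history is flattened into one token sequence, so retrieval reduces to next-token prediction over that sequence.

```python
# Hypothetical sketch: flatten (item, action) events into a single
# chronological token sequence, then derive autoregressive training pairs.

def to_sequence(events):
    """events: list of (item_id, action) tuples in chronological order.
    Interleave item and action tokens: [item_0, act_0, item_1, act_1, ...]."""
    seq = []
    for item_id, action in events:
        seq.append(("item", item_id))
        seq.append(("action", action))
    return seq

def training_pairs(seq):
    """Autoregressive targets: each prefix predicts the next token."""
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]

events = [(101, "click"), (57, "like"), (9, "skip")]
seq = to_sequence(events)
pairs = training_pairs(seq)
```

Under this framing, predicting the next item token is retrieval, while predicting the action token that follows a candidate item is ranking.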

Sequential Transduction Tasks

Generative training
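Generative training here means the standard autoregressive objective: at each position, predict the next token of the action sequence under a causal model. A minimal sketch of the per-sequence loss, in plain Python (a real system would use a causal transformer over learned embeddings; all names and numbers below are illustrative, not from the paper):

```python
import math

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting targets[t] from position t.
    logits: list of per-position score lists over the vocabulary.
    targets: the sequence shifted by one position (next-token labels)."""
    total = 0.0
    for scores, target in zip(logits, targets):
        z = math.log(sum(math.exp(s) for s in scores))  # log-partition
        total += z - scores[target]                     # -log softmax prob
    return total / len(targets)

# Three positions, vocab of 4 tokens; each position's target is the
# token that actually came next in the sequence.
logits = [[2.0, 0.1, 0.0, 0.0],
          [0.0, 3.0, 0.0, 0.0],
          [0.0, 0.0, 1.5, 0.0]]
targets = [0, 1, 2]
loss = next_token_loss(logits, targets)
```

The shift between inputs and targets is what makes the training generative: every prefix of the user's history supervises the prediction of the event that followed it.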