RCMD LLM Paper
Outline
HSTU
Most Deep Learning Recommendation Models (DLRMs) in industry fail to scale with compute. We reformulate recommendation problems as sequential transduction tasks within a generative modeling framework (“Generative Recommenders”).
We observe that any alternative formulation deployed at billion-user scale must overcome three challenges.
- First, features in recommendation systems lack explicit structure.
- Second, recommendation systems use billion-scale vocabularies that change continuously.
- Finally, computational cost represents the main bottleneck in enabling large-scale sequential models.
In this work, we treat user actions as a new modality in generative modeling and show that:
- core ranking and retrieval tasks in industrial-scale recommenders can be cast as generative modeling problems given an appropriate new feature space (see the sketch after this list);
- this paradigm enables us to systematically leverage redundancies in features, training, and inference to improve efficiency.
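To make the transduction framing concrete, here is a minimal sketch: a user's history is flattened into interleaved item/action tokens and modeled autoregressively, so both ranking and retrieval reduce to next-item prediction. A standard causal Transformer stands in for HSTU here, and all names, shapes, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of recommendation as sequential transduction.
# Assumptions: a generic causal Transformer is used in place of HSTU;
# vocabulary sizes, dimensions, and the class name are hypothetical.
import torch
import torch.nn as nn


class GenerativeRecommenderSketch(nn.Module):
    """Autoregressive model over a user's interleaved (item, action) tokens.

    Trained to predict the next item given all prior items and actions,
    so ranking and retrieval both become next-token generation.
    """

    def __init__(self, num_items: int, num_actions: int, d_model: int = 64):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, d_model)
        self.action_emb = nn.Embedding(num_actions, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_items)  # next-item logits

    def forward(self, items: torch.Tensor, actions: torch.Tensor):
        # items, actions: (batch, seq_len) integer ids for one user history.
        x = self.item_emb(items) + self.action_emb(actions)
        # Causal mask: each position attends only to earlier positions.
        causal = nn.Transformer.generate_square_subsequent_mask(items.size(1))
        h = self.encoder(x, mask=causal)
        return self.head(h)  # (batch, seq_len, num_items)


# Usage: next-item loss, predicting position t+1 from positions <= t.
model = GenerativeRecommenderSketch(num_items=1000, num_actions=4)
items = torch.randint(0, 1000, (2, 8))
actions = torch.randint(0, 4, (2, 8))
logits = model(items, actions)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, 1000),  # predictions for steps 1..T
    items[:, 1:].reshape(-1))          # next-item targets
loss.backward()
```

Under this framing, retrieval corresponds to sampling or top-k decoding over the next-item distribution, while ranking scores a candidate set with the same logits.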