Mabble Rabble
random ramblings & thunderous tidbits
11 May 2022
Fundamental Methods of Prediction Speed-Ups
There are four fundamental ways to speed up prediction and reduce the memory footprint of transformer models:
Knowledge Distillation
Quantization
Pruning
Graph Optimization
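To make two of these ideas concrete, here is a minimal sketch of quantization and pruning applied to a toy list of weights, using only the Python standard library. The function names (`quantize_int8`, `dequantize`, `magnitude_prune`) and the sample weights are made up for illustration. Real toolkits operate on whole tensors and are considerably more sophisticated, so treat this as a picture of the principle rather than a recipe.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights)
    # One float scale for the whole tensor; guard against an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]


def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices of the k weights closest to zero.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -0.05, 0.31, -1.27, 0.02, 0.64]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
pruned = magnitude_prune(weights, sparsity=0.5)

print(codes)
print(pruned)
```

Quantization stores each weight in one byte instead of four at the cost of a small rounding error (at most half of `scale` per weight here), while pruning removes the weights that contribute least so they can be skipped or stored sparsely. Knowledge distillation and graph optimization work at the level of the whole model and do not reduce to a few lines as easily.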