There are many comparisons out there between various transformer models. However, such comparisons are often flawed, amounting to evaluations of apples against oranges. Before transformers can be compared, they must first be narrowed down by architecture and use case. For example, a head-to-head comparison of BERT vs. GPT vs. T5 is pointless. BERT has an encoder-only architecture, GPT has a decoder-only architecture, and T5 has a balanced encoder-decoder architecture. Each has its own context and use cases: the encoder-only approach is better suited to NLU (natural language understanding) tasks, the decoder-only approach is better suited to NLG (natural language generation) tasks, and the encoder-decoder approach can be used interchangeably for NLU and NLG. There are cases where that balance is needed and cases where a particular architecture is required. Evaluation should therefore be made across similar use cases and context-specific architectures in order to provide a meaningful comparison. Context is very important.
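
To make the architectural distinction concrete, here is a minimal sketch, assuming the Hugging Face transformers library is installed; the specific checkpoints (bert-base-uncased, gpt2, t5-small) and the classification framing are illustrative assumptions, not prescriptions. It simply loads each model through the task-specific head its architecture is typically paired with.

```python
# Minimal sketch (assumes the Hugging Face `transformers` library is installed).
# Each model is loaded through the head that matches its architecture's typical
# use case, illustrating why a direct BERT-vs-GPT-vs-T5 score comparison mixes
# apples and oranges.
from transformers import (
    AutoModelForSequenceClassification,  # encoder-only head: NLU (e.g., classification)
    AutoModelForCausalLM,                # decoder-only head: NLG (free-form generation)
    AutoModelForSeq2SeqLM,               # encoder-decoder head: NLU and NLG as text-to-text
)

# BERT: encoder-only, typically fine-tuned for understanding tasks.
nlu_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# GPT-2: decoder-only, typically used for generation tasks.
nlg_model = AutoModelForCausalLM.from_pretrained("gpt2")

# T5: encoder-decoder, casts both NLU and NLG as text-to-text problems.
text2text_model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```

A meaningful benchmark would pit the encoder-only model against another encoder-style model on an NLU dataset, or the decoder-only model against another generative model on an NLG task, rather than scoring all three against each other on a single metric.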