- ChatEval
- Acute-Eval
- SSA
- NUC
- SASSI
- WER
- DSTC
- DSTC2
- BLEU
- PARADISE
- QoE
- IQ
- Perplexity
- F1
- Hits@k
- Average Utterance Length
- Ratio of Rare Words
- Number of Repetitions
- Number of System Questions
- Comprehensible
- Interesting
- Topical Relevance
- Response Incorrectness
- Conversation Continuity
- Engagement
- Conversational Depth
- Coherence
- Domain Coverage
- Conversational Diversity and Breadth
- Naturalness
- Informativeness
- Unification Quality
- User Satisfaction (Questionnaires)
- User Simulations
13 January 2021
Chatbot Evaluations
Labels: big data, data science, deep learning, machine learning, natural language processing, text analytics