Evaluation of Response Generation Models: Shouldn’t It Be Shareable and Replicable?

Seyed Mahed Mousavi, Gabriel Roccabruna, Michela Lorandi, Simone Caldarella, Giuseppe Riccardi. Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM). 2022.