A team is comparing two summarization models. One model shows a significantly higher ROUGE-L score. What can they conclude?
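For context on what the metric in this question measures, here is a minimal sketch of sentence-level ROUGE-L, assuming lowercased whitespace tokenization and a single reference summary. The function names `lcs_length` and `rouge_l` and the `beta` parameter are illustrative choices, not taken from any particular library. Production implementations typically add stemming, multi-reference handling, and summary-level variants.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Tokens match: extend the LCS found for the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the better of the two neighbouring cells.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-score between one reference and one candidate summary.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of reference tokens covered by the LCS
    precision = lcs / len(cand)   # share of candidate tokens that lie on the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

With the reference "the cat sat on the mat", the candidate "the cat is on the mat" scores 5/6 (about 0.83), while "the cat sat not on the mat", which negates the reference, scores 12/13 (about 0.92). Under these assumptions the score reflects in-order word overlap with the reference, so a higher value does not by itself establish that a summary is more faithful or more useful.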