A hospital is deploying a summarization model to generate clinical summaries from physician notes. The deployment team is focused on ensuring the outputs are factually correct. Which evaluation metric should they prioritize?