A Generative AI Engineer is designing a GenAI application for a retail company using third-party datasets scraped from the web. Some documents contain licensing terms restricting commercial redistribution. The legal team has raised concerns about potential intellectual property violations. What should the engineer do to reduce the risk of legal exposure while maintaining application functionality?