A publishing company is integrating a data feed of book features from multiple sources into Amazon S3. Because the sources overlap, duplicate records must be removed from the S3 data lake before further processing. Which approach eliminates the duplicate records with the least development and maintenance effort?
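The question does not name a specific service, but for context, the sketch below shows what record-level deduplication over S3 data can look like in PySpark using `dropDuplicates`. The bucket paths and the `isbn` key column are illustrative assumptions, not details from the question, and a managed option may require even less custom code than this.

```python
# Minimal PySpark sketch: exact-duplicate removal over book-feature records in S3.
# Bucket names, prefixes, and the "isbn" key column are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dedupe-book-features").getOrCreate()

# Read the raw feed from the landing prefix of the data lake.
raw = spark.read.parquet("s3://example-publisher-lake/raw/book_features/")

# Keep one record per business key; calling dropDuplicates() with no arguments
# would instead require every column to match exactly.
deduped = raw.dropDuplicates(["isbn"])

# Write the cleaned data to a curated prefix for downstream processing.
deduped.write.mode("overwrite").parquet("s3://example-publisher-lake/curated/book_features/")
```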