A Generative AI Engineer is working with a government contractor to build a RAG-based assistant that references classified policy documents. The system must prevent sensitive fields like citizen_id and medical_history from being exposed in model outputs. Additionally, the model is expected to return useful responses even when masking is applied. Which approach best balances privacy and output usefulness?