Your organization has developed a pre-trained foundation model for language translation but needs it to translate industry-specific jargon more accurately. You plan to fine-tune the model on documents from your domain. Which data preparation strategy is essential for this fine-tuning process?