A media company needs to automatically transcribe podcast episodes stored in an Amazon S3 bucket so that the transcripts can be used for natural language processing (NLP) tasks such as sentiment detection and keyword extraction. The company requires a cost-efficient, scalable solution that supports custom vocabularies to improve transcription accuracy for industry-specific terminology. Which solution should you recommend?