A Generative AI Engineer is tasked with deploying an LLM application that uses a vector store for document retrieval. The application must ensure low latency and scalability. What infrastructure should they prioritize?