
Super charge your LLMs with RAG at scale using AWS Glue for Apache Spark

AWS Machine Learning Blog

This enables you to preprocess your external data in phases: cleaning, sanitization, document chunking, generating vector embeddings for each chunk, and loading the results into a vector store.

About the Authors

Noritaka Sekiyama is a Principal Big Data Architect on the AWS Glue team.
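The preprocessing phases named in the excerpt (cleaning, chunking, embedding, loading into a vector store) can be illustrated with a minimal plain-Python sketch. This is not the article's AWS Glue for Apache Spark code. Every name here (`clean`, `chunk`, `toy_embed`, `preprocess`) is hypothetical, the hash-based `toy_embed` is only a stand-in for a real embedding model, and the in-memory list is a stand-in for a real vector store.

```python
import hashlib
import re
from typing import Dict, List


def clean(text: str) -> str:
    """Cleaning/sanitization: drop tag-like markup and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Chunking: fixed-size character windows that overlap by `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def toy_embed(piece: str, dim: int = 8) -> List[float]:
    """Placeholder embedding (assumption): maps SHA-256 bytes to floats in [0, 1].

    A real pipeline would call an embedding model here instead.
    """
    digest = hashlib.sha256(piece.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def preprocess(doc_id: str, raw: str, store: List[Dict]) -> int:
    """Run clean -> chunk -> embed -> load; returns the number of chunks stored."""
    pieces = chunk(clean(raw))
    for n, piece in enumerate(pieces):
        # "Loading": append one record per chunk to the stand-in vector store.
        store.append({"id": f"{doc_id}-{n}", "text": piece, "vector": toy_embed(piece)})
    return len(pieces)


if __name__ == "__main__":
    vector_store: List[Dict] = []
    count = preprocess("doc1", "<p>Hello   world</p>", vector_store)
    print(count, vector_store[0]["text"])
```

In a Spark job, each of these functions would typically be applied per record across a distributed dataset rather than in a local loop; the phase ordering is the part this sketch is meant to show.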
