Build a ChatGPT RAG Data Pipeline with RisingWave Stream Processor and Vector Store
Enter the exciting brave new world of GenAI, by building a ChatGPT Data Pipeline that leverages on RisingWave's efficient stream processing jobs for real-time data that we draw from an X (or Twitter feed) that's been enriched with vector data and similarity search.
We'll explore the exciting ChatGPT world, building an efficient data pipeline that's enriched with vector embeddings as stored in a vector DB (PgVector) , and how it can pair with the performant RisingWave cloud-based stream processor for its write job. We will illustrate a sample use case with live coding, as follows:
* Simulate a streaming data feed from X (or Twitter), we'll be using Kafka as the message broker for data ingestion
* RisingWave will consume the data stream, and perform data analysis
* Construct prompts based on the top 3 hashtags identified by RisingWave
* Prompts will be used for inferencing against a RAG-based BOT built with PgVectorLevel:Intermediate