
Getting started with Retrieval Augmented Generation (RAG) and grounding AI in your data

Project
Artificial Intelligence
Getting Started
Beginner

This is a 'Project' template, which means that it contains a group of workflows that work together to achieve a particular aim.

Overview

This template comes ready to test with either uploaded files or scraped websites. You can put questions to the RAG pipeline through a Tray form and receive the answers by email. The template is meant to help you run a proof of concept (POC), and its components can be repurposed for your production use cases.

Prerequisites

To deploy this template you will need to have:

  • An Amazon Bedrock account (you can replace this with your AI vendor of choice)

  • If you wish to use the web-scraping option, you will need to set up an Apify credential

Getting Live

After deploying the template, you will need to take the following steps:

1 - Add authentications

  • Add your Bedrock credentials for the P04 workflow

  • If you want to run the inference workflow, add credentials for at least one of the vendors (Bedrock is an option)

2 - Create a new native Vector Table in your project

The table's dimensions depend on the model you are using, and we provide a quick guide. It is always best to double-check the model card information for the service you are using. Make sure to update the vector table in workflows P04 and R01.
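
As a rough illustration of why the dimensions matter, the sketch below checks that an embedding's length matches the dimension the Vector Table was created with. The function name and the example dimension of 1024 are assumptions made for illustration and are not part of the template; take the real value from the model card of the model you are using.

```python
from typing import Sequence


def check_embedding_dimension(embedding: Sequence[float], table_dimension: int) -> None:
    """Raise ValueError if the embedding length differs from the table dimension."""
    if len(embedding) != table_dimension:
        raise ValueError(
            f"Embedding has {len(embedding)} dimensions, "
            f"but the vector table expects {table_dimension}."
        )


# Hypothetical value: a table created with 1024 dimensions.
TABLE_DIMENSION = 1024

# An embedding of the right length passes silently.
check_embedding_dimension([0.0] * TABLE_DIMENSION, TABLE_DIMENSION)
```

A check like this, run before storing vectors, surfaces a model/table mismatch early instead of partway through ingestion.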

3 - Run the Ingestion process

First, enable the Tray form(s). Then navigate to them using the public URL (right-click on the trigger).

You can use either the file upload form or the scraping form. You will receive emails letting you know when the ingestion has started and when it is complete.

4 - Test your RAG System

The email notifying you that the ingestion is complete includes a link to the next form (make sure you have enabled it within Tray). You can now ask a question and see how your RAG system responds (again by email). We suggest asking one question that you know is relevant to the content you ingested and one that is irrelevant.
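
To see why a relevant and an irrelevant question behave differently, here is a minimal sketch of similarity scoring. It assumes the retrieval step ranks chunks by cosine similarity, which is a common choice but is not confirmed for this template; the three-dimensional vectors are toy values, not real embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption for illustration).
stored_chunk = [1.0, 0.0, 0.0]
relevant_question = [0.9, 0.1, 0.0]
irrelevant_question = [0.0, 0.0, 1.0]

relevant_score = cosine_similarity(stored_chunk, relevant_question)
irrelevant_score = cosine_similarity(stored_chunk, irrelevant_question)
print(relevant_score > irrelevant_score)
```

With real embeddings the gap is rarely this clean, which is why testing with both kinds of question is a useful sanity check.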

Next steps

To bring this to a production state, you can switch your ingestion workflows to point to the systems where your knowledge lives. You can then use the same callable workflows to process that data and get it into vector storage. You can use Tray to give AI access to this knowledge so it can answer support tickets, respond as a Slack bot, and draft email responses.

Updating your Native Vector Table

When new or updated content is added to your docs repo you will need a process to update the Vector Table.

One simple approach is to periodically erase the Vector Table completely and run the crawl pipeline again. (You can use the X01 workflow to do this; just update your table and apply your Bedrock auth.)

More likely, you will want to build a system which is triggered by updates to the repo.

In this case, it is recommended that you:

  1. Attach a unique identifier to each markdown file in GitHub

  2. Store this identifier as metadata for each chunk in Vector Tables

  3. When a page is updated, delete all Vector Table entries that have that page's identifier

  4. Store the chunks for the updated page
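
The four steps above can be sketched as a delete-then-insert refresh keyed on the page identifier. The `InMemoryVectorTable` class below is a toy stand-in written for illustration; it is not the Tray Vector Tables API, and all names in it (`Chunk`, `delete_by_page`, `refresh_page`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    page_id: str            # unique identifier of the source page (steps 1 and 2)
    text: str
    embedding: List[float]


@dataclass
class InMemoryVectorTable:
    """Toy stand-in for a vector table; NOT the Tray Vector Tables API."""
    rows: List[Chunk] = field(default_factory=list)

    def delete_by_page(self, page_id: str) -> int:
        """Remove every chunk tagged with page_id and return how many were removed."""
        before = len(self.rows)
        self.rows = [row for row in self.rows if row.page_id != page_id]
        return before - len(self.rows)

    def insert(self, chunks: List[Chunk]) -> None:
        """Append new chunks to the table."""
        self.rows.extend(chunks)


def refresh_page(table: InMemoryVectorTable, page_id: str, new_chunks: List[Chunk]) -> int:
    """Replace all chunks for one page: delete the old entries (step 3), store the new ones (step 4)."""
    removed = table.delete_by_page(page_id)
    table.insert(new_chunks)
    return removed


# Usage: one page is re-chunked after an edit, another page is left untouched.
table = InMemoryVectorTable()
table.insert([
    Chunk("docs/intro.md", "old intro text", [0.1, 0.2]),
    Chunk("docs/setup.md", "setup text", [0.3, 0.4]),
])
removed_count = refresh_page(
    table,
    "docs/intro.md",
    [Chunk("docs/intro.md", "new intro text", [0.5, 0.6])],
)
```

Deleting by identifier before inserting keeps stale chunks from lingering when an updated page produces fewer chunks than the previous version.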

NOTE: Workflows