Knowledge (RAG)

Give your agents knowledge by creating collections, ingesting content, and enabling retrieval-augmented generation.

How It Works

  1. Create a collection — a named container for documents
  2. Ingest content — text, transcripts, or audio are chunked and embedded into vectors
  3. Attach to an agent — set the agent's collectionId
  4. Automatic retrieval — during conversations, the agent queries the collection for relevant context before responding

The RAG pipeline uses hybrid search that weights vector similarity (pgvector) at 80% and keyword matching (PostgreSQL tsvector) at 20%.
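
As a rough illustration of the 80/20 weighting, the sketch below combines two scores into one. It assumes both scores are already normalized to the 0-1 range and that the final score is a plain weighted sum; the pipeline's actual normalization and combination logic are not documented here, and `hybrid_score` is a hypothetical helper, not part of the API.

```shell
# Hypothetical helper (not part of the API): weighted sum of two scores.
# Assumes both inputs are already normalized to the range 0..1.
# Usage: hybrid_score VECTOR_SCORE KEYWORD_SCORE
hybrid_score() {
  # LC_ALL=C keeps the decimal separator a dot regardless of locale.
  LC_ALL=C awk -v v="$1" -v k="$2" \
    'BEGIN { printf "%.3f\n", 0.8 * v + 0.2 * k }'
}

hybrid_score 0.90 0.50   # prints 0.820
```

Under these assumptions, a chunk that matches strongly on meaning but weakly on exact keywords still ranks high, because the vector score carries four times the weight of the keyword score.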

Create a Collection

bash
curl -X POST https://persona-labsvoice-api-production.up.railway.app/v1/collections \
  -H "Authorization: Bearer $PH0NY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Product Knowledge Base",
    "description": "Product docs, FAQs, and support articles"
  }'
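
The endpoints later in this guide contain placeholders such as `:id` and `:collectionId`, which stand for real identifiers and should not be sent literally. As a minimal sketch, the helper below substitutes a collection ID into the path. The `API_BASE` variable and the `collection_url` function are hypothetical names introduced for illustration, and `coll_xyz789` is the sample ID used elsewhere on this page.

```shell
# Base URL copied from the examples in this guide.
API_BASE="https://persona-labsvoice-api-production.up.railway.app/v1"

# Hypothetical helper: builds a collection endpoint, optionally with a
# trailing action such as "ingest" or "search".
# Usage: collection_url COLLECTION_ID [ACTION]
collection_url() {
  if [ -z "${1:-}" ]; then
    echo "collection_url: collection ID is required" >&2
    return 1
  fi
  if [ -n "${2:-}" ]; then
    printf '%s/collections/%s/%s\n' "$API_BASE" "$1" "$2"
  else
    printf '%s/collections/%s\n' "$API_BASE" "$1"
  fi
}

collection_url coll_xyz789 ingest
# prints https://persona-labsvoice-api-production.up.railway.app/v1/collections/coll_xyz789/ingest
```

The output can be passed to curl in place of a literal URL, for example as `"$(collection_url "$COLLECTION_ID" search)"`, where `COLLECTION_ID` is a variable you set yourself.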

Ingest Content

Text

bash
curl -X POST https://persona-labsvoice-api-production.up.railway.app/v1/collections/:id/ingest \
  -H "Authorization: Bearer $PH0NY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Our premium plan includes unlimited API calls, priority support, and custom voice cloning. Pricing starts at $99/month...",
    "contentType": "text",
    "chunkSize": 1000,
    "chunkOverlap": 200,
    "metadata": {
      "source": "pricing-page",
      "category": "billing"
    }
  }'
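
To show how `chunkSize` and `chunkOverlap` interact, the sketch below splits a single line of text into fixed-size character windows. This is an assumption about the strategy: the service may instead split on sentence or token boundaries, which this page does not specify. `chunk_text` is a hypothetical helper, and the tiny sizes are chosen only to keep the output readable.

```shell
# Hypothetical helper: prints fixed-size character chunks, one per line.
# Each chunk starts SIZE - OVERLAP characters after the previous one.
# Assumes TEXT is a single line; character counting follows awk's locale rules.
# Usage: chunk_text TEXT SIZE OVERLAP
chunk_text() {
  if [ "$3" -ge "$2" ]; then
    echo "chunk_text: overlap must be smaller than size" >&2
    return 1
  fi
  printf '%s\n' "$1" | awk -v size="$2" -v overlap="$3" '{
    step = size - overlap
    for (start = 1; start <= length($0); start += step) {
      print substr($0, start, size)
      # Stop once the current window has reached the end of the text.
      if (start + size > length($0)) break
    }
  }'
}

chunk_text "abcdefghij" 4 2   # prints abcd, cdef, efgh, ghij (one per line)
```

With the values from the request above (1000 and 200), the same arithmetic would start each chunk 800 characters after the previous one, so neighbouring chunks share 200 characters.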

Transcript

bash
curl -X POST https://persona-labsvoice-api-production.up.railway.app/v1/collections/:id/ingest \
  -H "Authorization: Bearer $PH0NY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Speaker 1: Welcome to the show...",
    "contentType": "transcript"
  }'

Audio (auto-transcribed)

bash
curl -X POST https://persona-labsvoice-api-production.up.railway.app/v1/collections/:id/ingest \
  -H "Authorization: Bearer $PH0NY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sourceUrl": "https://example.com/podcast-episode.mp3",
    "contentType": "audio"
  }'

Audio is transcribed with speech-to-text (STT) before it is chunked and embedded.

Ingest Options

| Field | Type | Default | Description |
|---|---|---|---|
| content | string | - | Raw text content (up to 1 MB) |
| sourceUrl | string | - | URL to fetch content from |
| contentType | string | "text" | text, transcript, audio, or url |
| chunkSize | number | 1000 | Characters per chunk (100-4000) |
| chunkOverlap | number | 200 | Overlap between chunks (0-500) |
| metadata | object | - | Arbitrary metadata attached to chunks |
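
Checking the chunking options locally before sending a request can catch out-of-range values early. The sketch below applies the documented ranges for `chunkSize` (100-4000) and `chunkOverlap` (0-500). `validate_chunk_options` is a hypothetical helper, and it assumes both values are whole numbers, which the API may not require.

```shell
# Hypothetical client-side check against the documented ranges.
# Assumes whole-number values. Returns 0 when both values are acceptable.
# Usage: validate_chunk_options CHUNK_SIZE CHUNK_OVERLAP
validate_chunk_options() {
  for value in "${1:-}" "${2:-}"; do
    case "$value" in
      ''|*[!0-9]*)
        echo "chunkSize and chunkOverlap must be non-negative integers" >&2
        return 1
        ;;
    esac
  done
  if [ "$1" -lt 100 ] || [ "$1" -gt 4000 ]; then
    echo "chunkSize must be between 100 and 4000" >&2
    return 1
  fi
  if [ "$2" -gt 500 ]; then
    echo "chunkOverlap must be between 0 and 500" >&2
    return 1
  fi
}

validate_chunk_options 1000 200 && echo "options ok"   # prints options ok
```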

Search a Collection

Query a collection directly:

bash
curl -X POST https://persona-labsvoice-api-production.up.railway.app/v1/collections/:id/search \
  -H "Authorization: Bearer $PH0NY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what is the return policy?",
    "limit": 5,
    "threshold": 0.7,
    "includeMetadata": true
  }'

Response:

json
{
  "results": [
    {
      "content": "Returns are accepted within 30 days of purchase...",
      "score": 0.89,
      "metadata": { "source": "faq", "category": "returns" }
    }
  ]
}
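
The exact semantics of `threshold` and `limit` are not spelled out on this page. The sketch below shows one plausible reading, stated here as an assumption: results scoring below `threshold` are dropped, and at most `limit` results are returned, best first. `filter_results` is a hypothetical helper that works on tab-separated `score<TAB>content` lines rather than the JSON response, and the 0.65 sample line is made up for illustration.

```shell
# Hypothetical helper illustrating assumed threshold/limit semantics.
# Reads "score<TAB>content" lines on stdin and prints, highest score first,
# at most LIMIT lines whose score is at least THRESHOLD.
# Usage: filter_results THRESHOLD LIMIT
filter_results() {
  tab=$(printf '\t')
  LC_ALL=C sort -t "$tab" -k1,1nr |
    LC_ALL=C awk -F '\t' -v min="$1" -v max="$2" \
      '$1 + 0 >= min + 0 && kept < max + 0 { print; kept++ }'
}

printf '0.65\tShipping takes 3-5 business days\n0.89\tReturns are accepted within 30 days of purchase...\n' |
  filter_results 0.7 5
# prints only the 0.89 line; the 0.65 line falls below the 0.7 threshold
```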

Attach to an Agent

When creating or updating an agent, set the collectionId:

bash
curl -X PUT https://persona-labsvoice-api-production.up.railway.app/v1/agents/agent_abc123 \
  -H "Authorization: Bearer $PH0NY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "collectionId": "coll_xyz789"
  }'

Once attached, the agent automatically searches the collection for relevant context on every conversational turn, before it responds.

Delete a Document

bash
curl -X DELETE https://persona-labsvoice-api-production.up.railway.app/v1/collections/:collectionId/documents/:documentId \
  -H "Authorization: Bearer $PH0NY_API_KEY"

Delete a Collection

bash
curl -X DELETE https://persona-labsvoice-api-production.up.railway.app/v1/collections/:id \
  -H "Authorization: Bearer $PH0NY_API_KEY"

Deleting a collection removes all of its documents and chunks. Agents that reference the collection lose their knowledge source.

Built by Persona Labs.