Use a Postgres image that already ships with pgvector in the container.
Change the host port to a safe one in the dynamic/private range, from 49152 to 65535.
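Before picking a host port for the container, it helps to confirm nothing is already listening on it. This is a small hypothetical helper (not part of the original setup) that does the same check as the `netstat` commands below, by trying to bind the port:

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    # Try to bind the candidate port; if the bind succeeds, nothing is
    # currently listening on it, so it is safe to map with docker run -p.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True   # bind succeeded: port is free
        except OSError:
            return False  # already bound by another process

# Example: check the port we plan to map for pgvector
port_is_free(45432)
```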
C:\Users\Retno>netstat -ano | findstr :5432
TCP 0.0.0.0:5432 0.0.0.0:0 LISTENING 5664
TCP [::]:5432 [::]:0 LISTENING 5664
TCP [::1]:1105 [::1]:5432 ESTABLISHED 8596
TCP [::1]:1106 [::1]:5432 ESTABLISHED 8596
TCP [::1]:5432 [::1]:1105 ESTABLISHED 5664
TCP [::1]:5432 [::1]:1106 ESTABLISHED 5664
C:\Users\Retno>netstat -ano | findstr :45432
(no output: port 45432 is free)
C:\Users\Retno>tasklist | findstr 5664
postgres.exe 5664 Services 0 3,144 K
terminal.exe 15664 Console 1 35,396 K
C:\Users\Retno>tasklist | findstr 8596
python.exe 8596 Services 0 2,760 K
C:\Users\Retno>docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
rubythalib/pgvector latest ed8fa9c916d6 18 months ago 591MB
C:\Users\Retno>docker run -p 45432:5432 -e POSTGRES_PASSWORD=example --name pgvector_1 -d rubythalib/pgvector
a8647f0ab8b6e934000a9635d346fc9768e6f3a900483801f8360de4e11e0dd2
C:\Users\Retno>psql -U postgres -d postgres -p 45432
Password for user postgres: *example
psql (12.4, server 15.4 (Debian 15.4-1.pgdg120+1))
WARNING: psql major version 12, server major version 15.
 Some psql features might not work.
WARNING: Console code page (437) differs from Windows code page (1252)
 8-bit characters might not work correctly. See psql reference
 page "Notes for Windows users" for details.
Type "help" for help.
postgres=# CREATE DATABASE rag;
CREATE DATABASE
postgres=# SELECT 1 FROM pg_database WHERE datname = 'rag';
 ?column?
----------
        1
(1 row)
postgres=# \c rag
You are now connected to database "rag" as user "postgres".
rag=# CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION
rag=# SELECT extversion FROM pg_extension WHERE extname = 'vector';
 extversion
------------
 0.5.0
(1 row)
rag_1.py
from fastapi import FastAPI
app = FastAPI()
@app.get("/")
def read_root():
 return {"Hi": "Rulita"}
FastAPI runs on an ASGI (Asynchronous Server Gateway Interface) server such as uvicorn.
PS D:\Belajar ML\RT> uvicorn rag_1:app --reload --port 48000
http://127.0.0.1:48000/
{"Hi":"Rulita"}
from pydantic import BaseModel
from typing import Optional

# Example data model using Pydantic
class Item(BaseModel):
  name: str
  description: Optional[str] = None
  price: float
  tax: Optional[float] = None
# Global dictionary to store items
item_dict = {}
# Counter for generating unique IDs
next_id = 1
# Endpoint with a request body
@app.post("/items/")
async def create_item(item: Item):
  global next_id
  item_id = next_id
  next_id += 1
  item_data = item.dict()
  if item.tax is not None:
    total = item.price + item.tax
    item_data.update({"total_price": total})
  item_dict[item_id] = item_data
  return {"item_id": item_id, "item_data": item_data}
{
 "name": "Sample Item",
 "description": "This is a sample",
 "price": 100.0,
 "tax": 10.0
}
D:\Belajar ML\RT>curl -X POST -H "Content-Type: application/json" -d @Item.json http://127.0.0.1:48000/items/
{"item_id":1,"item_data":{"name":"Sample Item","description":"This is a sample","price":100.0,"tax":10.0,"total_price":110.0}}
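The total-price calculation in `create_item` can be checked standalone, without running the server. This is a small sketch using plain dicts (the helper name `with_total` is hypothetical, not part of the API above):

```python
# Standalone check of the total-price logic from create_item:
# when tax is present, total_price = price + tax is added to the item data.
def with_total(item_data):
    item_data = dict(item_data)  # copy so the input is not mutated
    if item_data.get("tax") is not None:
        item_data["total_price"] = item_data["price"] + item_data["tax"]
    return item_data

result = with_total({"name": "Sample Item", "price": 100.0, "tax": 10.0})
print(result["total_price"])  # -> 110.0
```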
# Endpoint with a path parameter
@app.get("/items/{item_id}")
async def read_item(item_id: int):
  if item_id in item_dict:
    return {"item_id": item_id, "item_data": item_dict[item_id]}
  else:
    return {"error": "Item not found"}
D:\Belajar ML\RT>curl http://127.0.0.1:48000/items/1
{"item_id":1,"item_data":{"name":"Sample Item","description":"This is a sample","price":100.0,"tax":10.0,"total_price":110.0}}
Alternatively, use ReDoc at http://127.0.0.1:48000/redoc or Swagger UI at http://127.0.0.1:48000/docs
from fastapi import FastAPI, File, UploadFile
from dotenv import load_dotenv
import os
import requests
import psycopg2
from psycopg2.extras import Json
from datetime import datetime

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
CHUNK_SIZE = 200    # words per chunk (example values; tune for your data)
OVERLAP_SIZE = 50   # words shared between consecutive chunks

@app.post("/upload-knowledge/")
async def upload_knowledge(file: UploadFile = File(...)):
  content = await file.read()
  content = content.decode("utf-8")
  chunks = split_text_into_chunks(content, CHUNK_SIZE, OVERLAP_SIZE)
  for chunk in chunks:
    embedding = get_embedding(chunk)
    vector_store.store_embedding(embedding, chunk, tokens_used=None)
  return {"message": "Knowledge uploaded successfully"}
# Helper to split text into chunks with overlap
def split_text_into_chunks(text, chunk_size, overlap_size):
  words = text.split()
  chunks = []
  for i in range(0, len(words), chunk_size - overlap_size):
    chunk = ' '.join(words[i:i + chunk_size])
    chunks.append(chunk)
    if i + chunk_size >= len(words):
      break
  return chunks
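The chunking behaviour is easiest to see with small numbers. With `chunk_size=4` and `overlap_size=1`, the window advances 3 words per step, so consecutive chunks share exactly one word (the function is repeated here so the snippet is self-contained):

```python
# Sanity check of split_text_into_chunks with small numbers.
def split_text_into_chunks(text, chunk_size, overlap_size):
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size - overlap_size):
        chunk = ' '.join(words[i:i + chunk_size])
        chunks.append(chunk)
        if i + chunk_size >= len(words):
            break
    return chunks

text = "satu dua tiga empat lima enam tujuh delapan sembilan sepuluh"
chunks = split_text_into_chunks(text, chunk_size=4, overlap_size=1)
print(chunks)
# -> ['satu dua tiga empat', 'empat lima enam tujuh', 'tujuh delapan sembilan sepuluh']
```

Each chunk repeats the last word of the previous one, so no sentence boundary is lost between chunks.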
def get_embedding(text):
  url = "https://api.openai.com/v1/embeddings"
  headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {OPENAI_API_KEY}"
  }
  data = {
    "model": "text-embedding-ada-002",
    "input": text
  }
  response = requests.post(url, headers=headers, json=data)
  response_data = response.json()
  tokens_used = response_data['usage']['total_tokens']
  vector_store.store_token_count('embedding_input', tokens_used)
  return response_data['data'][0]['embedding']
First, prepare the VectorStore object and the token_counter table.
class VectorStore:
  def __init__(self):
    # Connect to the pgvector database created above (host port 45432)
    self.connection = psycopg2.connect(
      host="localhost", port=45432,
      dbname="rag", user="postgres", password="example"
    )
    self.cursor = self.connection.cursor()

  def store_token_count(self, token_type, tokens_used):
    self.cursor.execute("""
      INSERT INTO token_counter (token_type, tokens_used, timestamp)
      VALUES (%s, %s, %s)
    """, (token_type, tokens_used, datetime.now()))
    self.connection.commit()

  def store_embedding(self, embedding, content, tokens_used):
    embedding = str(embedding)
    self.cursor.execute("""
      INSERT INTO knowledge (embedding, content)
      VALUES (%s, %s)
    """, (embedding, content))
    self.connection.commit()
    # get_embedding() already records the token count, so only store it
    # here when a count is actually passed in (avoids double counting)
    if tokens_used is not None:
      self.store_token_count('embedding_input', tokens_used)

vector_store = VectorStore()
Before that, we need to create the token_counter table:
(C:\Users\Retno\AppData\Roaming\jupyterlab-desktop\jlab_server) C:\Users\Retno>psql -U postgres -d rag -p 45432
Password for user postgres: *example
psql (12.4, server 15.4 (Debian 15.4-1.pgdg120+1))
WARNING: psql major version 12, server major version 15.
     Some psql features might not work.
WARNING: Console code page (437) differs from Windows code page (1252)
     8-bit characters might not work correctly. See psql reference
     page "Notes for Windows users" for details.
Type "help" for help.
rag=#
rag=# CREATE TABLE IF NOT EXISTS token_counter (id SERIAL PRIMARY KEY, token_type VARCHAR(50), tokens_used INT, timestamp TIMESTAMP);
CREATE TABLE
rag=#
rag=# CREATE TABLE IF NOT EXISTS knowledge (id SERIAL PRIMARY KEY, embedding VECTOR(1536), content TEXT);
CREATE TABLE
rag=# \d+ knowledge
                                     Table "public.knowledge"
  Column   |     Type     | Collation | Nullable |                Default                | Storage
-----------+--------------+-----------+----------+---------------------------------------+----------
 id        | integer      |           | not null | nextval('knowledge_id_seq'::regclass) | plain
 embedding | vector(1536) |           |          |                                       | extended
 content   | text         |           |          |                                       | extended
Indexes:
    "knowledge_pkey" PRIMARY KEY, btree (id)
Access method: heap
rag=# \dt
            List of relations
 Schema |     Name      | Type  |  Owner
--------+---------------+-------+----------
 public | knowledge     | table | postgres
 public | token_counter | table | postgres
(2 rows)
PS D:\Belajar ML\RT> uvicorn rag_1:app --reload --port 48000
INFO: Will watch for changes in these directories: ['D:\\Belajar ML\\RT']
INFO: Uvicorn running on http://127.0.0.1:48000 (Press CTRL+C to quit)
INFO: Started reloader process [5456] using WatchFiles
INFO: Started server process [13476]
INFO: Waiting for application startup.
INFO: Application startup complete.
An alternative is to use conda:
(C:\Users\Retno\AppData\Roaming\jupyterlab-desktop\jlab_server) C:\Users\Retno>conda list
If the package is not in the list, search the channel at anaconda.org/search?q=pyngrok
(C:\Users\Retno\AppData\Roaming\jupyterlab-desktop\jlab_server) C:\Users\Retno>conda activate C:\Users\Retno\AppData\Roaming\jupyterlab-desktop\jlab_server
(C:\Users\Retno\AppData\Roaming\jupyterlab-desktop\jlab_server) C:\Users\Retno>conda install -c conda-forge ollama --force-reinstall -y
(C:\Users\Retno\AppData\Roaming\jupyterlab-desktop\jlab_server) C:\Users\Retno>conda install -c conda-forge llama-cpp-python --force-reinstall -y
D:\Belajar ML\RT\ngrok>ngrok config add-authtoken <YOUR_AUTHTOKEN>
Authtoken saved to configuration file: C:\Users\Retno\AppData\Local/ngrok/ngrok.yml
ngrok http --url=ultimate-bug-randomly.ngrok-free.app 80
rag=# CREATE TABLE IF NOT EXISTS chat_sessions (session_id SERIAL PRIMARY KEY, history JSONB);
CREATE TABLE
rag=# \d+ chat_sessions
                                        Table "public.chat_sessions"
   Column   |  Type   | Collation | Nullable |                      Default                      | Storage
------------+---------+-----------+----------+---------------------------------------------------+----------
 session_id | integer |           | not null | nextval('chat_sessions_session_id_seq'::regclass) | plain
 history    | jsonb   |           |          |                                                   | extended
Indexes:
    "chat_sessions_pkey" PRIMARY KEY, btree (session_id)
Referenced by:
    TABLE "chat_history" CONSTRAINT "chat_history_chat_session_id_fkey" FOREIGN KEY (chat_session_id) REFERENCES chat_sessions(session_id)
Access method: heap
CHATBOT_PREPROMPT = ("Kamu adalah customer service chatbot untuk perusahaan jual beli "
           "perabotan rumah tangga bernama Emerald Mabel. Jika pertanyaan di luar "
           "konteks, maka kamu jawab tidak tahu")

@app.post("/newchat/")
async def newchat():
  messages = [{"role": "system", "content": CHATBOT_PREPROMPT}]
  session_id = vector_store.store_session(messages)
  return {"session_id": session_id}
class VectorStore:
  # ...
  def store_session(self, history):
    # Json (from psycopg2.extras) adapts the Python list to a JSONB value
    self.cursor.execute("""
      INSERT INTO chat_sessions (history)
      VALUES (%s) RETURNING session_id
    """, [Json(history)])
    session_id = self.cursor.fetchone()[0]
    self.connection.commit()
    return session_id
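Because the history column is JSONB, whatever psycopg2's Json adapter writes comes back as plain Python lists and dicts on SELECT. A quick round-trip check of the message structure used by /newchat/ (standard json module only, no database needed):

```python
import json

# The chat history is a list of {"role", "content"} dicts; Json() serializes
# it the same way json.dumps does before it lands in the JSONB column.
messages = [{"role": "system", "content": "Kamu adalah customer service chatbot"}]

stored = json.dumps(messages)   # what ends up in the history column
restored = json.loads(stored)   # what a later SELECT hands back

print(restored[0]["role"])  # -> system
```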
rag=# select * from chat_sessions;
 session_id |                                  history
------------+-----------------------------------------------------------------------------
          1 | [{"role": "system", "content": "Kamu adalah customer service chatbot untuk perusahaan jual beli perabotan rumah tangga bernama Emerald Mabel. Jika pertanyaan di luar konteks, maka kamu jawab tidak tahu"}]
(1 row)
<-> is the distance operator for the vector data type in PostgreSQL when using the pgvector extension. The <-> operator computes the Euclidean (L2) distance between two vectors, which lets you find the most relevant rows by embedding proximity. Vector (semantic) search is a common technique in applications such as semantic document search, recommender systems, and RAG.
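What `<->` computes can be illustrated in plain Python. This sketch mimics `ORDER BY embedding <-> query LIMIT 1` on two toy 2-dimensional vectors (the names `query` and `docs` are illustrative only):

```python
import math

# Euclidean (L2) distance: what pgvector's <-> operator computes per row.
def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0]
docs = {"close": [0.9, 0.1], "far": [0.0, 1.0]}

# ORDER BY embedding <-> query LIMIT 1 == take the row with minimum distance
nearest = min(docs, key=lambda k: l2_distance(query, docs[k]))
print(nearest)  # -> close
```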
class VectorStore:
# ..
  def query_similar(self, embedding, limit=1):
    embedding = str(embedding)
    self.cursor.execute("""
      SELECT content FROM knowledge
      ORDER BY embedding <-> %s
      LIMIT %s
    """, (embedding, limit))
    return self.cursor.fetchall()
The ORDER BY embedding <-> %s clause sorts the rows by the distance between the embedding column in the table and the embedding vector supplied as the first %s parameter.
rag=# CREATE TABLE IF NOT EXISTS chat_history (id SERIAL PRIMARY KEY, chat_session_id INT REFERENCES chat_sessions(session_id), chat TEXT, ai_answer TEXT, chat_embedding VECTOR(1536), ai_answer_embedding VECTOR(1536));
CREATE TABLE
rag=#
curl -X 'POST' \
 'http://localhost:48000/chat/?session_id=1&text=Apa%20kode%20pos%20emerald%20mable%20%3F' \
 -H 'accept: application/json' \
 -d ''
Response body
{
 "response": "Kode pos untuk Emerald Mable adalah 12430."
}
An alternative model for embeddings:
import fasttext
model = fasttext.load_model('cc.id.300.bin')  # Pre-trained model for Indonesian
embedding = model.get_sentence_vector("Teks contoh")
Note that fastText sentence vectors from cc.id.300.bin are 300-dimensional, so the knowledge table would need VECTOR(300) instead of VECTOR(1536).
API keys can be created at https://platform.openai.com/api-keys
curl https://api.openai.com/v1/chat/completions \
 -H "Content-Type: application/json" \
 -H "Authorization: Bearer $OPENAI_API_KEY" \
 -d '{
  "model": "gpt-4o-mini",
  "store": true,
  "messages": [
   {"role": "user", "content": "Write a haiku"}
  ]
 }'
Add some API credits
Add a payment method to buy credits and unlock the full power of the API.
$5 can cover about 2 million input or 500k output tokens. Plenty to start building your idea. View pricing