Building a Postgres Vector Database for AI: From LocalStack to AWS RDS

Tags: PostgreSQL, pgvector, Vector Database, AWS RDS, LocalStack, RAG, Embeddings

2025-04-01

Introduction

Welcome to the second installment in our vector database series! In our previous post, Building a Vector Database with MongoDB Atlas, we explored how to leverage MongoDB Atlas to create a powerful vector search system for our coffee shop example "Bean There, Done That."

This time, we're taking a different approach. We'll build a PostgreSQL-based vector search system with the pgvector extension—starting with a local development environment using LocalStack, then deploying to production with AWS RDS. This self-hosted alternative offers greater control, potentially lower costs, and a powerful path to production.

If you're not familiar with vector databases or embeddings, I recommend starting with our previous article for a foundational understanding. We'll focus here on the key differences, implementation details, and trade-offs between MongoDB Atlas and PostgreSQL for vector search.

By the end of this tutorial, you'll have a fully functional vector search system for both development and production environments, offering the best of both worlds: easy local development and robust cloud deployment.

Why PostgreSQL for Vector Search?

PostgreSQL has emerged as a powerful contender in the vector database landscape, especially with the pgvector extension. But why might you choose Postgres over a dedicated vector database or MongoDB Atlas? Let's explore some compelling reasons:

  • Self-hosted control: Run everything locally or on your own infrastructure without dependency on external cloud services.
  • Familiar SQL interface: Leverage your existing SQL knowledge instead of learning new query languages or APIs.
  • Cost efficiency: Potentially significant cost savings compared to specialized vector database services, especially at scale.
  • All-in-one solution: Store traditional relational data alongside vector embeddings in the same database.
  • Mature ecosystem: Benefit from PostgreSQL's robust tooling, backups, monitoring, and extensive community support.
  • Advanced indexing: pgvector supports multiple indexing methods including IVFFlat (inverted file with flat compression) for faster searches at scale.
  • AWS RDS compatibility: Deploy to production on AWS RDS with native pgvector support for a fully managed solution.

The pgvector extension adds three critical capabilities to PostgreSQL, all shown in the short sketch after this list:

  1. A new vector data type for storing embedding vectors
  2. Vector similarity functions (cosine distance, Euclidean distance, and inner product)
  3. Vector indexing for efficient similarity searches
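Here's a minimal SQL sketch of all three capabilities in action (the tiny table and 3-dimensional vectors are illustrative only; real embeddings are much larger):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

-- 1. The vector data type (3 dimensions here for brevity)
CREATE TABLE items (id SERIAL PRIMARY KEY, embedding vector(3));
INSERT INTO items (embedding) VALUES ('[1, 2, 3]'), ('[4, 5, 6]');

-- 2. Similarity operators: <=> cosine distance, <-> Euclidean (L2), <#> negative inner product
SELECT id, embedding <=> '[3, 1, 2]' AS cosine_distance
FROM items
ORDER BY embedding <=> '[3, 1, 2]'
LIMIT 5;

-- 3. An approximate index over the cosine distance operator
CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```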

Let's see how PostgreSQL with pgvector compares to other vector database solutions:

| Feature | PostgreSQL + pgvector | MongoDB Atlas | Pinecone | Weaviate | Qdrant |
|---|---|---|---|---|---|
| Deployment Options | Self-hosted, Cloud (RDS, Azure, GCP) | Atlas (fully managed) | Fully managed only | Self-hosted, Cloud (SaaS) | Self-hosted, Cloud (SaaS) |
| Cost | $$ (Low to Medium); RDS pricing + storage | $$$ (Medium to High); Atlas pricing tiers | $$$$ (High); per-operation pricing | $$$ (Medium to High); resource-based pricing | $$ (Low to Medium); CPU/RAM-based pricing |
| SQL Integration | Native SQL; full SQL capabilities | Limited (MQL); MongoDB query language | None; REST API only | GraphQL + REST API | REST API + gRPC |
| Scalability | Medium; limited by RDS instance size | High; horizontal sharding | Very High; auto-scaling clusters | High; distributed architecture | High; distributed architecture |
| Query Performance | Good; depends on index configuration | Very Good; optimized for vectors | Excellent; purpose-built for vectors | Very Good; HNSW-optimized | Very Good; HNSW-optimized |
| Search Algorithms | HNSW, IVF, L2, cosine; multiple distance metrics | HNSW, BF, cosine; limited metrics | HNSW, ANN, hybrid; advanced hybrid search | HNSW, BM25, hybrid; multi-modal search | HNSW, exact, quantized; memory-optimized |
| Learning Curve | Low (for SQL devs); familiar SQL syntax | Medium; MongoDB-specific | Low; simple REST API | Medium; GraphQL complexity | Low; simple API design |
| Data Integration | Excellent; join with any SQL table | Very Good; document-oriented | Limited; vector-focused only | Good; schema-based | Good; payload-based |
| Operational Complexity | Medium; RDS management required | Low; fully managed service | Low; fully managed service | Medium; choice of deployment | Medium; choice of deployment |
| Ecosystem Maturity | Very High; decades of PostgreSQL development | High; established NoSQL database | Medium; vector DB pioneer | Medium; growing community | Medium; growing rapidly |
| Language Support | Extensive; all major languages | Extensive; all major languages | Good; multiple SDKs | Good; growing SDKs | Good; multiple SDKs |
| Transaction Support | Full ACID; complete transaction support | ACID for documents; multi-document transactions | Limited; not transaction-focused | Limited; batch operations | Limited; batch operations |

This comparison is based on general characteristics and may vary depending on specific use cases and configurations.

Why Choose PostgreSQL Despite Trade-offs?

While our comparison table highlights several areas where specialized vector databases outperform PostgreSQL + pgvector, we're still confidently choosing PostgreSQL for our production vector search implementation. Here's why this trade-off makes strategic sense for many organizations:

Key Advantages

  1. Existing Infrastructure: We already use RDS PostgreSQL in production, making integration seamless and reducing adoption barriers.
  2. Team Expertise: Our team already has SQL expertise, eliminating the learning curve associated with specialized vector databases.
  3. Simplified Architecture: Consolidating both traditional data and vector embeddings in one database significantly reduces system complexity.
  4. Cost Efficiency: Leveraging existing database infrastructure avoids additional service costs and overhead.

Recognized Trade-offs

  1. Scaling Limits: We acknowledge pgvector's scaling limitations compared to specialized vector DBs, but our current workload remains well within PostgreSQL's capabilities.
  2. Search Performance: While specialized databases offer better raw vector search performance, PostgreSQL's performance is sufficient for our use cases with proper indexing.
  3. Operational Overhead: Managing PostgreSQL configuration requires more expertise than fully managed solutions, but we already have this expertise in-house.
  4. Future-proofing: If our vector search needs grow beyond PostgreSQL's capabilities, we've designed our application with a clean separation that will facilitate migration.

Migration Path If Needed

By implementing a clean repository pattern and abstraction layer in our code, we can switch from PostgreSQL to a specialized vector database in the future with minimal disruption if our needs change. This future-proofing strategy gives us the best of both worlds: immediate integration with existing infrastructure now, with a clear path forward if we need specialized scaling later.
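To make that concrete, here's a minimal sketch of such a repository abstraction (the interface and names are illustrative assumptions, not code from our production system):

```typescript
// A minimal sketch of the repository abstraction; names are illustrative.
export interface VectorRepository {
  upsert(id: string, embedding: number[], metadata?: Record<string, unknown>): Promise<void>;
  search(embedding: number[], limit: number): Promise<Array<{ id: string; score: number }>>;
}

// Today the application wires in a pgvector-backed implementation of this
// interface; a future Pinecone or Qdrant adapter implements the same contract,
// so calling code never changes.
```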

Optimizing PostgreSQL for Production Vector Workloads

While PostgreSQL with pgvector may not match the raw performance of specialized vector databases, we can significantly improve its performance and reliability for production workloads with the following techniques:

Instance Sizing Recommendations

  • Memory-optimized instances: Choose r5/r6g RDS instances that provide higher memory-to-vCPU ratios for vector operations
  • Vertical scaling: Start with a moderate instance size and scale up as vector workload grows, monitoring memory usage
  • Benchmarking: Test with realistic vector data volumes to determine optimal instance size before production deployment

PostgreSQL Configuration

  • shared_buffers: Increase to 25-40% of total RAM for vector operations (higher than standard OLTP workloads)
  • work_mem: Set to 32-64MB to improve vector search performance
  • maintenance_work_mem: Increase to 256-512MB to speed up index creation
  • effective_cache_size: Set to ~70% of available system memory to help query planning (see the sketch after this list)
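For self-managed PostgreSQL, these settings can be applied roughly as follows (the values assume a 16 GB instance and are illustrative; on RDS you would use a parameter group instead, as shown later in this post):

```sql
-- Illustrative values for a 16 GB instance; tune for your workload
ALTER SYSTEM SET shared_buffers = '4GB';          -- ~25% of RAM; requires a restart
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';  -- speeds up index builds
ALTER SYSTEM SET effective_cache_size = '11GB';   -- ~70% of RAM; planner hint only
SELECT pg_reload_conf();                          -- applies the reloadable settings
```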

Index Optimization Strategies

  • HNSW vs IVFFlat: Choose HNSW for faster queries but slower index builds; IVFFlat for balanced performance
  • Probes configuration: Adjust IVFFlat probes (10-100) to balance speed vs. accuracy based on use case
  • Partitioning: For large tables (>1M vectors), consider partitioning by date or category to improve performance
  • Periodic reindexing: Schedule maintenance windows for rebuilding indexes as vector data grows (a partitioning sketch follows this list)
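As one example of the partitioning idea, here is a hypothetical sketch that splits product embeddings by category; the table and partition names are ours, not part of the tutorial schema:

```sql
-- Hypothetical sketch: partition embeddings by category to keep indexes smaller
CREATE TABLE product_embeddings_part (
    id BIGSERIAL,
    product_id INTEGER NOT NULL,
    category VARCHAR(100) NOT NULL,
    embedding vector(1536),
    PRIMARY KEY (id, category)
) PARTITION BY LIST (category);

CREATE TABLE product_embeddings_coffee PARTITION OF product_embeddings_part
    FOR VALUES IN ('Coffee');
CREATE TABLE product_embeddings_bakery PARTITION OF product_embeddings_part
    FOR VALUES IN ('Bakery');

-- Each partition gets its own, smaller IVFFlat index
CREATE INDEX ON product_embeddings_coffee USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
CREATE INDEX ON product_embeddings_bakery USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```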

Connection Management

  • Connection pooling: Implement pgBouncer or RDS Proxy to manage database connections efficiently
  • Read replicas: Offload vector search queries to read replicas for scalability
  • Application-level pooling: Configure Node.js pool settings to prevent connection exhaustion
  • Query timeouts: Implement application-level timeouts for vector queries to prevent hanging operations (pool settings and timeouts are sketched below)
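A hedged sketch of application-level pooling and timeouts with node-postgres; the pool sizes and timeout values here are illustrative assumptions to tune for your workload:

```typescript
import { Pool } from 'pg';

const pool = new Pool({
  host: process.env.DB_HOST || 'localhost',
  max: 20,                        // cap concurrent connections to avoid exhaustion
  idleTimeoutMillis: 30_000,      // recycle clients idle for 30s
  connectionTimeoutMillis: 5_000, // fail fast when the pool is saturated
  statement_timeout: 10_000,      // abort vector queries running longer than 10s
});
```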

Scaling Strategy: The 3-Tiered Approach

For organizations using PostgreSQL + pgvector in production, we recommend this pragmatic scaling strategy that aligns with data volume growth:

Tier 1: Single PostgreSQL Instance (Up to 1M vectors)

Start with a single optimized RDS instance using HNSW or IVFFlat indexes. Most applications will find this sufficient for initial production deployment. Focus on proper indexing and query optimization.

Tier 2: Multiple PostgreSQL Instances (1-10M vectors)

As volume grows, implement read replicas and connection pooling. Consider application-level sharding by categorizing vector data across multiple tables or databases. Use pgvector's approximate search capabilities with appropriate probe settings.

Tier 3: Hybrid Approach (10M+ vectors)

At this scale, consider a hybrid approach: keep PostgreSQL for relational data and frequently accessed vectors, while migrating historical or specialized vector search to a dedicated vector database through the abstraction layer you've built. This allows you to leverage PostgreSQL's strengths while addressing its limitations at scale.

With these optimization strategies, PostgreSQL with pgvector can efficiently handle vector search workloads for most business applications, while providing a clear path to scale as your needs grow. The key is understanding the performance characteristics and making informed architectural decisions based on your specific requirements.

Setting Up LocalStack and PostgreSQL for Development

For our development environment, we'll set up everything locally using Docker and LocalStack. This approach has several advantages:

  • No cloud account or costs required for development
  • Complete development environment that can run offline
  • Faster iteration without network latency
  • Easy teardown and recreation of the entire environment
  • Parity with production environment for seamless transitions

Let's start by setting up our local development environment:

1. Docker Compose Setup

First, create a docker-compose.yml file to orchestrate our services:

```yaml
version: '3.8'

services:
  postgres:
    image: ankane/pgvector:latest
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: vectordb
    ports:
      - "5432:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./init-scripts:/docker-entrypoint-initdb.d
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  localstack:
    image: localstack/localstack:latest
    ports:
      - "4566:4566"
    environment:
      - SERVICES=s3,lambda,sqs
      - DEBUG=1
      - DATA_DIR=/tmp/localstack/data
    volumes:
      - ./localstack:/var/lib/localstack
      - /var/run/docker.sock:/var/run/docker.sock

  pgadmin:
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@example.com
      PGADMIN_DEFAULT_PASSWORD: admin
    ports:
      - "5050:80"
    depends_on:
      - postgres

volumes:
  postgres-data:
```

This compose file sets up three services:

  • PostgreSQL with pgvector: We're using a pre-built image that includes the pgvector extension
  • LocalStack: Provides local AWS cloud service emulation (S3, Lambda, SQS)
  • pgAdmin: A helpful GUI for PostgreSQL management

2. Initialize Database Schema

Create an initialization script that will run when the Postgres container starts:

```sql
-- File: init-scripts/01-init-schema.sql
CREATE EXTENSION IF NOT EXISTS vector;

-- Create products table
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    category VARCHAR(100),
    price DECIMAL(10, 2),
    image_url VARCHAR(255),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create product embeddings table
CREATE TABLE product_embeddings (
    id SERIAL PRIMARY KEY,
    product_id INTEGER REFERENCES products(id) ON DELETE CASCADE,
    embedding vector(1536),
    embedding_type VARCHAR(50) DEFAULT 'text',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create customers table
CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    email VARCHAR(255) UNIQUE,
    preferences JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create customer embeddings table
CREATE TABLE customer_embeddings (
    id SERIAL PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id) ON DELETE CASCADE,
    embedding vector(512),
    embedding_type VARCHAR(50) DEFAULT 'face',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create indexes for vector search
CREATE INDEX ON product_embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
CREATE INDEX ON customer_embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Insert sample coffee products
INSERT INTO products (name, description, category, price) VALUES
    ('Espresso', 'Strong black coffee made by forcing steam through ground coffee beans', 'Coffee', 3.50),
    ('Cappuccino', 'Coffee with steamed milk foam', 'Coffee', 4.50),
    ('Latte', 'Coffee with steamed milk', 'Coffee', 4.75),
    ('Cold Brew', 'Coffee brewed with cold water over long period', 'Coffee', 5.00),
    ('Croissant', 'Buttery, flaky pastry', 'Bakery', 3.75),
    ('Blueberry Muffin', 'Sweet breakfast muffin with blueberries', 'Bakery', 3.50);
```

Note the key differences in our Postgres schema compared to MongoDB:

  • We use traditional relational tables with foreign key constraints
  • The vector type is explicitly defined with dimension size (1536 for text, 512 for face embeddings)
  • We create IVFFlat indexes for efficient vector similarity search

3. Start the Environment

Launch the entire development environment with:

```bash
docker-compose up -d
```

Verify that everything is running:

```bash
docker-compose ps
```

You should see all three services running. You can access pgAdmin at http://localhost:5050 to explore your PostgreSQL database.
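To confirm pgvector itself is available, a quick sanity check from psql (for example via docker-compose exec postgres psql -U postgres -d vectordb) might look like this:

```sql
-- Verify the extension is installed and the distance operators work
SELECT extversion FROM pg_extension WHERE extname = 'vector';
SELECT '[1,2,3]'::vector <=> '[3,2,1]'::vector AS cosine_distance;
```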

Using pgvector for Vector Search

Now that our PostgreSQL database with pgvector is set up, let's explore how to use it for vector search operations. We'll implement similar functionality to our MongoDB Atlas example, but with SQL queries instead of MongoDB queries.

Creating the Database Access Layer

Let's create a database access module using Node.js with the pg library:

```typescript
// db.ts
import { Pool } from 'pg';

// Initialize a connection pool
const pool = new Pool({
  host: process.env.DB_HOST || 'localhost',
  port: parseInt(process.env.DB_PORT || '5432'),
  user: process.env.DB_USER || 'postgres',
  password: process.env.DB_PASSWORD || 'postgres',
  database: process.env.DB_NAME || 'vectordb',
  ssl: process.env.DB_SSL === 'true' ? { rejectUnauthorized: false } : undefined
});

// pgvector expects vectors as '[1,2,3]' literals; node-postgres would otherwise
// serialize a JS array as a Postgres array ('{1,2,3}'), which the vector type rejects.
function toVectorLiteral(embedding: number[]): string {
  return JSON.stringify(embedding);
}

// Test connection
async function testConnection() {
  try {
    const client = await pool.connect();
    console.log('Connected to PostgreSQL');
    client.release();
    return true;
  } catch (err) {
    console.error('Database connection error:', err);
    return false;
  }
}

// Add a product with its embedding
async function addProductWithEmbedding(
  name: string,
  description: string,
  category: string,
  price: number,
  embedding: number[]
) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    // Insert product
    const productResult = await client.query(
      'INSERT INTO products (name, description, category, price) VALUES ($1, $2, $3, $4) RETURNING id',
      [name, description, category, price]
    );
    const productId = productResult.rows[0].id;

    // Insert embedding
    await client.query(
      'INSERT INTO product_embeddings (product_id, embedding) VALUES ($1, $2)',
      [productId, toVectorLiteral(embedding)]
    );

    await client.query('COMMIT');
    return productId;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

// Perform vector similarity search
async function searchProductsByEmbedding(
  queryEmbedding: number[],
  limit: number = 5
) {
  const client = await pool.connect();
  try {
    // Ordering by the raw distance expression (rather than the derived
    // similarity) lets PostgreSQL use the IVFFlat/HNSW index.
    const result = await client.query(
      `SELECT p.*,
              1 - (pe.embedding <=> $1::vector) AS similarity
       FROM products p
       JOIN product_embeddings pe ON p.id = pe.product_id
       ORDER BY pe.embedding <=> $1::vector
       LIMIT $2`,
      [toVectorLiteral(queryEmbedding), limit]
    );
    return result.rows;
  } finally {
    client.release();
  }
}

// Export our database functions
export default {
  testConnection,
  addProductWithEmbedding,
  searchProductsByEmbedding,
};
```

Note the key differences from our MongoDB implementation:

  1. We use SQL queries with parameters instead of MongoDB query objects
  2. The <=> operator in pgvector represents cosine distance between vectors
  3. We use transactions with BEGIN/COMMIT/ROLLBACK for data integrity when inserting related records
  4. Our configuration supports both local development and production deployment with environment variables (a short usage sketch follows)
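Here's a hypothetical usage of the module above; the placeholder embedding is for illustration only, since real embeddings come from the embedding API in the next section:

```typescript
import db from './db';

async function demo() {
  const fakeEmbedding = new Array(1536).fill(0.01); // placeholder, not a real embedding
  const id = await db.addProductWithEmbedding(
    'Flat White', 'Espresso with velvety steamed milk', 'Coffee', 4.25, fakeEmbedding
  );
  const matches = await db.searchProductsByEmbedding(fakeEmbedding, 3);
  console.log(id, matches.map(m => `${m.name} (${Number(m.similarity).toFixed(3)})`));
}

demo().catch(console.error);
```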

Vector Search Operation Diagram

(Diagram: PostgreSQL vector search flow)

Generating Embeddings with OpenAI

The process of generating embeddings remains the same as in our MongoDB implementation, since we're still using OpenAI's embedding API. However, we'll modify our code to store these embeddings in PostgreSQL:

```typescript
// embeddings.ts
import { OpenAI } from 'openai';
import db from './db';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Generate text embeddings using OpenAI
async function generateEmbedding(text: string) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return response.data[0].embedding;
}

// Add a product with an automatically generated embedding
async function addProductWithTextEmbedding(
  name: string,
  description: string,
  category: string,
  price: number
) {
  // Generate embedding from product details
  const textToEmbed = `${name}. ${description}. Category: ${category}`;
  const embedding = await generateEmbedding(textToEmbed);

  // Store product and embedding
  return db.addProductWithEmbedding(name, description, category, price, embedding);
}

// Search products by text query
async function searchProductsByText(query: string, limit: number = 5) {
  // Generate embedding for the search query
  const queryEmbedding = await generateEmbedding(query);

  // Perform vector search
  return db.searchProductsByEmbedding(queryEmbedding, limit);
}

export default {
  generateEmbedding,
  addProductWithTextEmbedding,
  searchProductsByText,
};
```

The embedding generation process is identical to our MongoDB implementation, showcasing one of the advantages of vector databases: the embedding generation is independent of the storage technology.

Building the API Layer

Now, let's create a simple Express.js API that exposes our vector search capabilities:

```typescript
// app.ts
import express from 'express';
import cors from 'cors';
import db from './db';
import embeddings from './embeddings';

const app = express();
const port = 3001;

// Middleware
app.use(cors());
app.use(express.json());

// Health check endpoint
app.get('/health', async (req, res) => {
  const dbConnected = await db.testConnection();
  res.json({
    status: 'ok',
    database: dbConnected ? 'connected' : 'disconnected'
  });
});

// Add a product
app.post('/products', async (req, res) => {
  try {
    const { name, description, category, price } = req.body;

    if (!name || !description || !category || !price) {
      return res.status(400).json({ error: 'Missing required fields' });
    }

    const productId = await embeddings.addProductWithTextEmbedding(
      name,
      description,
      category,
      parseFloat(price)
    );

    res.status(201).json({
      success: true,
      productId,
      message: 'Product created with embedding'
    });
  } catch (err) {
    console.error('Error creating product:', err);
    res.status(500).json({ error: 'Failed to create product' });
  }
});

// Search products by text
app.get('/products/search', async (req, res) => {
  try {
    const { q, limit } = req.query;

    if (!q) {
      return res.status(400).json({ error: 'Query parameter required' });
    }

    const results = await embeddings.searchProductsByText(
      q.toString(),
      limit ? parseInt(limit.toString()) : 5
    );

    res.json({ query: q, results });
  } catch (err) {
    console.error('Error searching products:', err);
    res.status(500).json({ error: 'Search failed' });
  }
});

app.listen(port, () => {
  console.log(`API server running at http://localhost:${port}`);
});
```

This API provides endpoints to:

  • Add products with automatically generated embeddings
  • Search products using semantic vector search
  • Check the health of the API and database connection (a quick client-side check follows)
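A quick smoke test of these endpoints might look like this (assumes Node 18+ with the global fetch API; the query text is arbitrary):

```typescript
async function smokeTest() {
  const health = await fetch('http://localhost:3001/health');
  console.log(await health.json());

  const search = await fetch('http://localhost:3001/products/search?q=sweet%20pastry&limit=3');
  console.log(await search.json());
}

smokeTest().catch(console.error);
```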

Deployment to AWS RDS

Now that we have our local development environment working, let's explore how to deploy this solution to AWS RDS for production use. This gives you the best of both worlds: local development with no cloud costs, and a fully managed production environment.

1. Setting Up AWS RDS with pgvector

AWS RDS supports PostgreSQL with the pgvector extension (on PostgreSQL 15.2 and later, with support backported to recent 13.x and 14.x releases), making it easy to deploy our solution to production. You can create a new RDS instance either through the AWS console or using infrastructure-as-code (IaC) tools like Terraform or AWS CDK.

Here's a Terraform example for creating an RDS instance with pgvector:

resource "aws_db_parameter_group" "postgres_pgvector" { name = "postgres-pgvector" family = "postgres14" parameter { name = "shared_preload_libraries" value = "pg_stat_statements,pgvector" } } resource "aws_db_instance" "postgres_vector_db" { identifier = "postgres-vector-db" engine = "postgres" engine_version = "14.6" instance_class = "db.t3.micro" allocated_storage = 20 max_allocated_storage = 100 storage_type = "gp2" db_name = "vectordb" username = "postgres" password = var.db_password parameter_group_name = aws_db_parameter_group.postgres_pgvector.name skip_final_snapshot = true publicly_accessible = false vpc_security_group_ids = [aws_security_group.postgres_sg.id] db_subnet_group_name = aws_db_subnet_group.postgres_subnet_group.name tags = { Environment = "production" Project = "vector-search" } }

After creating the RDS instance, you'll need to connect to it and create the pgvector extension and initialize your schema:

```sql
-- Connect to your RDS instance and run:
CREATE EXTENSION IF NOT EXISTS vector;

-- Then run the same schema creation script we used locally
-- (a database migration tool like Flyway or Liquibase works well for this)
```

2. Configuring for Production

Update your application's configuration to connect to RDS in production. Remember our database module was already designed to use environment variables, making this transition seamless:

```typescript
// For local development, no changes are needed.
// For production, set these environment variables (values shown are placeholders):
process.env.DB_HOST = 'postgres-vector-db.xxxxxxxxxx.us-east-1.rds.amazonaws.com';
process.env.DB_PORT = '5432';
process.env.DB_USER = 'postgres';
process.env.DB_PASSWORD = 'your-secure-password';
process.env.DB_NAME = 'vectordb';
process.env.DB_SSL = 'true';
```

3. Production Architecture

For a complete production deployment, you'll want to set up a secure, scalable architecture:

(Diagram: AWS production architecture)

This architecture includes:

  • RDS with pgvector: Our fully managed vector database
  • ECS Fargate: Serverless container service for running our API
  • Application Load Balancer: For routing traffic and SSL termination
  • Private and Public Subnets: Security best practice to isolate the database

4. Performance Considerations for RDS

When running pgvector on RDS, consider these performance optimizations:

  1. Instance Sizing: Choose an instance with adequate memory for your vector indexes (r5 or r6g instances work well)
  2. Parameter Tuning: Configure shared_buffers, work_mem, and maintenance_work_mem for vector workloads
  3. Index Choice: Use HNSW indexes for best query performance, or IVFFlat for balanced build/query times
  4. Connection Pooling: Implement pgBouncer or RDS Proxy for managing database connections
  5. Read Replicas: For read-heavy workloads, consider adding RDS read replicas

Configure these parameters in your RDS parameter group:

resource "aws_db_parameter_group" "postgres_pgvector_optimized" { name = "postgres-pgvector-optimized" family = "postgres14" parameter { name = "shared_preload_libraries" value = "pg_stat_statements,pgvector" } parameter { name = "shared_buffers" value = "{DBInstanceClassMemory/4}" } parameter { name = "work_mem" value = "32768" # 32MB, adjust based on your workload } parameter { name = "maintenance_work_mem" value = "512000" # 512MB, helps with index creation } parameter { name = "max_connections" value = "200" } }

Integration with RAG Applications

Our PostgreSQL vector database integrates seamlessly with Retrieval-Augmented Generation (RAG) applications, similar to the MongoDB Atlas implementation. The key difference is in the database query layer.

RAG Implementation

Here's how to integrate our PostgreSQL vector search with an AI assistant using a RAG approach:

```typescript
// rag-assistant.ts
import { OpenAI } from 'openai';
import db from './db';
import embeddings from './embeddings';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function createAssistantResponse(query: string) {
  // Step 1: Generate embedding for the user query
  const queryEmbedding = await embeddings.generateEmbedding(query);

  // Step 2: Retrieve relevant products using vector search
  const relevantProducts = await db.searchProductsByEmbedding(queryEmbedding, 3);

  // Step 3: Format product information as context
  const productContext = relevantProducts.map(p =>
    `Product: ${p.name}\nDescription: ${p.description}\nCategory: ${p.category}\nPrice: $${p.price}`
  ).join('\n\n');

  // Step 4: Generate response with a context-enhanced prompt
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content: `You are a helpful barista assistant for "Bean There, Done That" coffee shop.
Use the provided product information to answer customer questions accurately.
If you don't know or can't find relevant information, be honest about it.`
      },
      {
        role: "user",
        content: `Based on the following products in our inventory, please answer this customer question: "${query}"

PRODUCTS:
${productContext}`
      }
    ],
    temperature: 0.7,
    max_tokens: 500
  });

  return response.choices[0].message.content;
}

export default { createAssistantResponse };
```

This implementation follows the same RAG pattern as our MongoDB Atlas solution:

  1. Generate an embedding for the user's query
  2. Retrieve relevant information using vector similarity search
  3. Enhance the prompt with the retrieved information as context
  4. Generate a response using the context-enhanced prompt (a short usage sketch follows)
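Calling the assistant is then a one-liner; the question text here is just an example:

```typescript
import assistant from './rag-assistant';

assistant
  .createAssistantResponse('Which of your drinks is the strongest?')
  .then(console.log)
  .catch(console.error);
```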

System Architecture

(Diagram: end-to-end system architecture)

Performance Considerations

When using PostgreSQL with pgvector for vector search, several performance considerations should be kept in mind:

Indexing Strategies

pgvector supports two main indexing methods:

  1. IVFFlat: Inverted file with flat compression

     ```sql
     -- Create an IVFFlat index
     CREATE INDEX ON product_embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
     ```

     The lists parameter controls the number of partitions. More lists slow down index builds but keep each partition smaller, which speeds up searches at a given probe count. pgvector's rule of thumb is lists = rows / 1000 for up to about 1M rows and lists = sqrt(rows) beyond that.

  2. HNSW: Hierarchical Navigable Small World graphs (available in pgvector 0.5.0+)

     ```sql
     -- Create an HNSW index
     CREATE INDEX ON product_embeddings USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
     ```

     HNSW indexes provide faster search but use more memory and are slower to build. The m parameter controls the maximum number of connections per node, and ef_construction controls index quality vs. build speed. The query-time counterpart, hnsw.ef_search, is sketched below.
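With an HNSW index in place, recall can be tuned per session much like IVFFlat probes. A brief sketch (parameter binding is left to the client):

```sql
-- hnsw.ef_search sets the size of the candidate list scanned at query time
-- (pgvector's default is 40); higher values improve recall but slow queries
SET hnsw.ef_search = 100;

SELECT product_id
FROM product_embeddings
ORDER BY embedding <=> $1  -- $1 is bound to the query embedding by the client
LIMIT 5;
```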

Query Optimization

For faster queries, especially with approximate nearest neighbor search:

```sql
-- Use the IVFFlat index with a specified probe count
SET ivfflat.probes = 10;

-- Query with approximate search
SELECT p.*,
       1 - (pe.embedding <=> query_embedding) AS similarity
FROM products p
JOIN product_embeddings pe ON p.id = pe.product_id
ORDER BY pe.embedding <=> query_embedding
LIMIT 5;
```

The ivfflat.probes setting controls the trade-off between search speed and accuracy. Higher values increase accuracy but slow down the search.

Benchmarking Results

Here's a comparison of query performance between PostgreSQL with pgvector, MongoDB Atlas, and Pinecone based on our testing:

Query Time Comparison (milliseconds, lower is better)

| Database | Query time |
|---|---|
| PostgreSQL + pgvector | 15 ms |
| MongoDB Atlas | 12 ms |
| Pinecone | 8 ms |
These results are based on a small dataset of 10,000 products with 1,536-dimensional embeddings. Your actual performance may vary depending on data size, hardware, network, and configuration.

Comparing Development and Production Environments

Let's compare the characteristics of our development (LocalStack) and production (AWS RDS) environments:

| Feature | LocalStack + pgvector (Development) | AWS RDS + pgvector (Production) |
|---|---|---|
| Cost | Free (local resources only) | Pay for RDS instance, storage, I/O |
| Performance | Limited by local hardware | Scalable to various instance types |
| Availability | Single instance, no HA | Multi-AZ option, high availability |
| Scalability | Limited to local resources | Vertical scaling, read replicas |
| Maintenance | Manual, developer-managed | Automated backups, patches, monitoring |
| Security | Basic local security | VPC, encryption, IAM, security groups |
| Network Access | Localhost only | Internet or VPC-private |
| Setup Complexity | Simple (Docker Compose) | More complex (AWS configuration) |

This comparison highlights the key strengths of each environment:

  • LocalStack: Perfect for development with zero cloud costs and simple setup
  • AWS RDS: Ideal for production with scalability, reliability, and security features

The beauty of our approach is that the same code works in both environments with just configuration changes, allowing for a smooth transition from development to production.

Conclusion

In this tutorial, we've built a comprehensive vector search system using PostgreSQL with pgvector, covering both local development with LocalStack and production deployment with AWS RDS. This end-to-end approach offers several advantages:

  • Complete development workflow: From zero-cost local development to robust cloud deployment
  • Flexibility: Self-hosted options for development, fully managed options for production
  • Cost optimization: Free local development, pay-as-you-go production with RDS
  • SQL familiarity: Leveraging existing SQL knowledge and tools
  • Production-ready: Scalable architecture with security and reliability features

We've seen how pgvector extends PostgreSQL with vector data types, similarity functions, and efficient indexes. We've implemented a complete vector search API and integrated it with a RAG-based AI assistant, all while demonstrating both development and production environments.

The choice between MongoDB Atlas and PostgreSQL with pgvector (or other vector database solutions) depends on your specific requirements, existing infrastructure, budget constraints, and development preferences. This tutorial gives you a powerful path that combines the best of both worlds: the simplicity of local development and the robustness of production cloud deployment.

As vector search continues to evolve and mature, having a flexible approach that can work with different vector database backends will position your applications for success. Whether you're building semantic search, recommendation systems, or AI assistants, the combination of PostgreSQL, pgvector, LocalStack, and AWS RDS offers a complete solution that covers the entire application lifecycle.

Further Reading

Additional resources to deepen your understanding:

Key Resources

  • PostgreSQL pgvector: Official PostgreSQL vector similarity extension
  • AWS RDS for PostgreSQL: Managed PostgreSQL database service on AWS
  • LocalStack Documentation: Official documentation for running cloud services locally