Google LangExtract: The Ultimate AI Developer's Guide to Structured Information Extraction

AI, ML, NLP, Python, Google, LangExtract, LLM, Gemini, Data Engineering

2025-08-04

Introduction

Google's LangExtract is a revolutionary open-source Python library that transforms unstructured text into structured, actionable data using large language models like Gemini and OpenAI's GPT models. Released in July 2025, this powerful tool addresses the critical challenge of extracting reliable, traceable information from complex documents—from clinical notes and legal contracts to research papers and customer feedback.

Unlike traditional Named Entity Recognition (NER) pipelines, it requires no model training: a handful of examples is enough to adapt to a new domain, and every extraction is grounded to its exact location in the source document, so results stay transparent, traceable, and verifiable.

In this comprehensive guide, we'll explore how to implement LangExtract in production environments, optimize performance for large-scale deployments, and leverage its capabilities for various AI applications including knowledge graphs, RAG systems, and document processing pipelines.

What is LangExtract?

LangExtract is a Python library designed to programmatically extract structured information from unstructured text documents using LLMs. Unlike traditional Named Entity Recognition (NER) tools that require extensive training data and domain-specific fine-tuning, LangExtract leverages the natural language understanding capabilities of modern LLMs to adapt to any domain with just a few examples.

The library transforms chaotic, free-form text into clean, structured data formats while maintaining precise source grounding—mapping every extraction back to its exact location in the original document. This ensures transparency, traceability, and verification of extracted information.

  • No training required - works with just 3-5 examples
  • Character-level source grounding for verification and visual highlighting
  • Supports 100+ languages out of the box
  • 99.9% accuracy on industry-standard datasets
  • Interactive HTML visualizations for easy review and validation
  • Multi-model support - works with cloud and local models via Ollama

How LangExtract Works

LangExtract operates through a sophisticated pipeline that combines prompt engineering, few-shot learning, and controlled generation to extract structured information from text.

Core Architecture

The extraction pipeline consists of several key steps; the sketch after the list shows how each step maps onto a parameter of lx.extract:

  1. Input Processing: Accepts text documents, URLs, or file paths as input
  2. Prompt Engineering: Uses developer-defined extraction prompts with clear instructions
  3. Few-Shot Learning: Leverages example data to guide the model's understanding
  4. LLM Processing: Employs advanced language models (Gemini, GPT, or local models via Ollama) for extraction
  5. Source Grounding: Maps each extracted entity to its precise location in the source text
  6. Structured Output: Generates JSONL format data with consistent schema
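
To make the mapping concrete, here is a minimal, annotated sketch showing which lx.extract arguments drive each pipeline step. The prompt, example, and buffer size here are illustrative placeholders, not recommendations.

Pipeline Step Mapping
import langextract as lx
import textwrap

prompt = textwrap.dedent("""\
    Extract product names mentioned in the text. Use exact text spans.""")

examples = [
    lx.data.ExampleData(
        text="The Pixel 9 launched last fall.",
        extractions=[
            lx.data.Extraction(
                extraction_class="product",
                extraction_text="Pixel 9",
            )
        ],
    )
]

result = lx.extract(
    text_or_documents="Reviewers praised the new phone's camera.",  # Step 1: text, URL, or file path
    prompt_description=prompt,      # Step 2: the extraction instructions
    examples=examples,              # Step 3: few-shot guidance for the model
    model_id="gemini-2.5-flash",    # Step 4: the LLM that performs extraction
    max_char_buffer=1000,           # chunk size used when inputs are long
)
# Steps 5 and 6 happen on the way out: each extraction carries its source
# offsets, and results can be saved as JSONL with a consistent schema.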

Key Features

LangExtract provides several powerful features that set it apart from traditional extraction tools:

  • Precise Source Grounding: Every extraction includes character-level mapping to the original text
  • Controlled Generation: Uses schema constraints and few-shot examples to ensure consistent outputs
  • Long Document Processing: Handles extensive documents through intelligent text chunking
  • Multi-Model Support: Works with cloud-based models (Gemini, OpenAI) and local models via Ollama

Installation and Setup

Getting started with LangExtract is straightforward. First, install the library using pip:

Installation
# Standard installation
pip install langextract

# For OpenAI models
pip install "langextract[openai]"

# For development
pip install -e ".[dev]"

For cloud-based models, you'll need to configure API access. Set up your API key using environment variables:

API Configuration
# Option 1: Environment variable
export LANGEXTRACT_API_KEY="your-api-key-here"

# Option 2: .env file (recommended)
echo "LANGEXTRACT_API_KEY=your-api-key-here" > .env
echo ".env" >> .gitignore
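
If you use the .env approach, the key still needs to reach the process environment before LangExtract runs. A minimal sketch using the third-party python-dotenv package (the package choice is an assumption; any loader that populates os.environ works):

Loading the .env File
# pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory into os.environ
api_key = os.environ["LANGEXTRACT_API_KEY"]

Since LangExtract picks up LANGEXTRACT_API_KEY from the environment, calling load_dotenv() before lx.extract is typically all that is required.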

Complete Code Examples

Basic Entity Extraction

Here's a simple example of extracting entities from text using LangExtract:

import langextract as lx
import textwrap

# Define extraction prompt
prompt = textwrap.dedent("""\
    Extract characters, emotions, and relationships in order of appearance.
    Use exact text for extractions. Do not paraphrase or overlap entities.
    Provide meaningful attributes for each entity to add context.""")

# Provide few-shot examples
examples = [
    lx.data.ExampleData(
        text=(
            "ROMEO. But soft! What light through yonder window breaks? "
            "It is the east, and Juliet is the sun."
        ),
        extractions=[
            lx.data.Extraction(
                extraction_class="character",
                extraction_text="ROMEO",
                attributes={"emotional_state": "wonder"},
            ),
            lx.data.Extraction(
                extraction_class="emotion",
                extraction_text="But soft!",
                attributes={"feeling": "gentle awe"},
            ),
            lx.data.Extraction(
                extraction_class="relationship",
                extraction_text="Juliet is the sun",
                attributes={"type": "metaphor"},
            ),
        ],
    )
]

# Input text to process
input_text = "Lady Juliet gazed longingly at the stars, her heart aching for Romeo"

# Run extraction
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
)

# Display results
for extraction in result.extractions:
    print(f"Class: {extraction.extraction_class}")
    print(f"Text: {extraction.extraction_text}")
    print(f"Attributes: {extraction.attributes}")
    print(f"Source location: {extraction.start_char}-{extraction.end_char}")
    print("---")

Advanced Document Processing

For more complex extraction tasks, you can optimize the extraction process with multiple passes and parallel processing:

import langextract as lx
import textwrap

# Complex extraction for business documents
prompt = textwrap.dedent("""\
    Extract companies, financial metrics, dates, and market sentiment.
    Use exact text for extractions. Include specific values and context.""")

examples = [
    lx.data.ExampleData(
        text=(
            "TechCorp reported Q3 revenue of $2.5B on October 15, 2024, "
            "exceeding analyst expectations and driving bullish market sentiment."
        ),
        extractions=[
            lx.data.Extraction(
                extraction_class="company",
                extraction_text="TechCorp",
                attributes={"type": "public_company"},
            ),
            lx.data.Extraction(
                extraction_class="financial_metric",
                extraction_text="Q3 revenue of $2.5B",
                attributes={"metric_type": "revenue", "period": "Q3", "value": "$2.5B"},
            ),
            lx.data.Extraction(
                extraction_class="date",
                extraction_text="October 15, 2024",
                attributes={"event": "earnings_report"},
            ),
            lx.data.Extraction(
                extraction_class="sentiment",
                extraction_text="bullish market sentiment",
                attributes={"sentiment": "bullish", "context": "earnings_reaction"},
            ),
        ],
    )
]

# Process a large document with optimization
result = lx.extract(
    text_or_documents="path/to/large_document.txt",  # Or a URL
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    extraction_passes=3,    # Multiple passes for better recall
    max_workers=20,         # Parallel processing
    max_char_buffer=1000,   # Chunk size for long inputs
)

print(f"Extracted {len(result.extractions)} entities")

Real-World Applications

LangExtract excels in various real-world applications where structured information extraction is critical. Here are some practical implementations:

  1. Healthcare: Extract medications, dosages, symptoms, and diagnoses from clinical notes with precise accuracy.
  2. Legal: Process contracts and legal documents to extract parties, terms, dates, and obligations.
  3. Finance: Analyze financial reports to extract metrics, companies, and market sentiment for investment analysis.
  4. Research: Extract findings, methodologies, and citations from academic papers for literature reviews.
  5. Customer Intelligence: Process customer feedback to extract sentiment, product mentions, and feature requests.

Medical Information Extraction
# Healthcare-specific extraction
import langextract as lx
import textwrap

prompt = textwrap.dedent("""\
    Extract medications, dosages, symptoms, and diagnoses from clinical notes.
    Include administration routes and frequencies where mentioned.
    Use exact medical terminology from the text.""")

examples = [
    lx.data.ExampleData(
        text="Patient prescribed Metformin 500mg twice daily for Type 2 diabetes",
        extractions=[
            lx.data.Extraction(
                extraction_class="medication",
                extraction_text="Metformin",
                attributes={"dosage": "500mg", "frequency": "twice daily"},
            ),
            lx.data.Extraction(
                extraction_class="diagnosis",
                extraction_text="Type 2 diabetes",
                attributes={"status": "ongoing_management"},
            ),
        ],
    )
]

clinical_note = """
Patient presents with chest pain and shortness of breath.
Prescribed Lisinopril 10mg once daily for hypertension.
Follow-up recommended in 2 weeks.
"""

result = lx.extract(
    text_or_documents=clinical_note,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
)

# Process results
medications = [e for e in result.extractions if e.extraction_class == "medication"]
for med in medications:
    print(f"Medication: {med.extraction_text}")
    print(f"Details: {med.attributes}")

OpenAI Models Integration

LangExtract also supports OpenAI models like GPT-4o with specific configuration requirements:

Using OpenAI Models
import langextract as lx
import os

# Configure for OpenAI
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    model_id="gpt-4o",
    api_key=os.environ.get("OPENAI_API_KEY"),
    fence_output=True,             # Required for OpenAI
    use_schema_constraints=False,  # Required for OpenAI
)

# Alternative models:
# model_id="gpt-4o-mini"   # Faster, cheaper option
# model_id="gpt-4-turbo"   # Balance of speed and capability

Batch Processing Pipeline

For processing multiple documents efficiently, implement a batch processing pipeline:

Batch Document Processing
import langextract as lx
from pathlib import Path

def process_document_batch(file_paths, prompt, examples, output_dir="results"):
    """Process multiple documents efficiently."""
    Path(output_dir).mkdir(exist_ok=True)
    results = []
    for file_path in file_paths:
        print(f"Processing {file_path}...")
        result = lx.extract(
            text_or_documents=file_path,
            prompt_description=prompt,
            examples=examples,
            model_id="gemini-2.5-flash",
            extraction_passes=2,
            max_workers=10,
        )
        results.append(result)

        # Save individual results
        filename = Path(file_path).stem
        lx.io.save_annotated_documents(
            [result],
            output_name=f"{filename}_extractions.jsonl",
            output_dir=output_dir,
        )
    return results

# Example usage
document_files = ["contract1.pdf", "report2.docx", "notes3.txt"]
batch_results = process_document_batch(document_files, prompt, examples)

Visualization and Output

Interactive HTML Visualization

One of LangExtract's most powerful features is its ability to generate interactive HTML visualizations that highlight extracted entities directly in the source text with precise character-level grounding:

Generating Interactive Visualizations
# Save results and create visualization
lx.io.save_annotated_documents(
    [result],
    output_name="extraction_results.jsonl",
    output_dir=".",
)

# Generate interactive HTML
html_content = lx.visualize("extraction_results.jsonl")
with open("visualization.html", "w", encoding="utf-8") as f:
    if hasattr(html_content, "data"):
        f.write(html_content.data)  # For Jupyter/Colab
    else:
        f.write(html_content)

print("Open visualization.html in your browser to review results")

# The HTML visualization provides:
# - Color-coded entity highlighting
# - Character-level source grounding
# - Hover tooltips with extraction details
# - Side panel with extraction list
# - Search and filter capabilities
# - Export options for further processing

JSONL Output Format

LangExtract outputs data in JSONL (JSON Lines) format, where each line represents an extracted document with its entities and precise source grounding:

JSONL Output Structure
{ "document_id": "1", "text": "Original input text...", "extractions": [ { "extraction_class": "character", "extraction_text": "ROMEO", "start_char": 0, "end_char": 5, "attributes": { "emotional_state": "wonder" } } ] }

Using Local Models with Ollama

For privacy-sensitive applications or when you need to process data offline, LangExtract supports running local models through Ollama integration:

Setting up Ollama
# Install Ollama from ollama.com
ollama pull gemma2:2b
ollama serve
Local Model Configuration
import langextract as lx

# Use local models (no API key required)
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemma2:2b",                # Ollama model
    model_url="http://localhost:11434",  # Ollama server
    fence_output=False,
    use_schema_constraints=False,
)

# Alternative local models:
# model_id="llama3.2"   # Larger model for better accuracy
# model_id="mistral"    # Good balance of speed and quality

# Benefits of local models:
# - Complete data privacy - no data leaves your infrastructure
# - No API costs or rate limits
# - Consistent latency without network dependencies
# - Compliance with strict data residency requirements

# Trade-offs:
# - Requires local compute resources
# - Model management and updates are manual
# - May have lower accuracy than cloud models
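
Before pointing LangExtract at a local server, it is worth confirming that Ollama is up and the model has been pulled. A small sketch against Ollama's standard REST endpoints (the requests dependency is an assumption):

Checking the Ollama Server
# pip install requests
import requests

OLLAMA_URL = "http://localhost:11434"

# The root endpoint responds with "Ollama is running" when the server is up
print(requests.get(OLLAMA_URL, timeout=5).text)

# /api/tags lists locally available models, so you can confirm the pull worked
models = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5).json()
print([m["name"] for m in models.get("models", [])])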

Performance Optimization

Optimize LangExtract performance for large-scale deployments with these proven strategies. LangExtract with Gemini achieves 99.9% accuracy and can process 100+ documents per second with proper configuration:

Model Selection Guidelines

  • gemini-2.5-flash: Recommended default, excellent balance of speed, cost, and quality
  • gemini-2.5-pro: Superior reasoning for complex extraction tasks
  • gpt-4o-mini: Fast OpenAI alternative for cost optimization
  • gemma2:2b: Lightweight local model via Ollama for privacy
Performance Optimization Strategies
import langextract as lx
import time
import logging
from functools import wraps

# Assumes prompt and examples are defined as in earlier sections,
# and large_document holds a long string, file path, or URL.

# 1. Optimize for Large Documents
result = lx.extract(
    text_or_documents=large_document,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    extraction_passes=3,   # Multiple passes improve recall
    max_workers=20,        # Parallel processing
    max_char_buffer=800,   # Smaller chunks favor accuracy; raise to cut cost
)

# 2. Smart Model Selection
def smart_model_selection(text_length, complexity):
    """Choose an optimal model based on task requirements."""
    if text_length < 1000 and complexity == "simple":
        return "gemini-2.5-flash"  # Fastest, cheapest
    elif complexity == "complex":
        return "gemini-2.5-pro"    # Best accuracy
    else:
        return "gemma2:2b"         # Local processing

# 3. Batch Processing with Rate Limiting
def rate_limited(max_per_minute=60):
    min_interval = 60.0 / max_per_minute
    last_called = [0.0]

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            elapsed = time.time() - last_called[0]
            left_to_wait = min_interval - elapsed
            if left_to_wait > 0:
                time.sleep(left_to_wait)
            ret = func(*args, **kwargs)
            last_called[0] = time.time()
            return ret
        return wrapper
    return decorator

@rate_limited(max_per_minute=30)
def controlled_extraction(text, prompt, examples):
    return lx.extract(
        text_or_documents=text,
        prompt_description=prompt,
        examples=examples,
        model_id="gemini-2.5-flash",
    )

# 4. Performance Monitoring
logger = logging.getLogger(__name__)

def timed_extraction(text, prompt, examples):
    """Extract with performance monitoring."""
    start_time = time.time()
    result = lx.extract(
        text_or_documents=text,
        prompt_description=prompt,
        examples=examples,
        model_id="gemini-2.5-flash",
    )
    elapsed = time.time() - start_time
    tokens_processed = len(text.split())
    logger.info(f"Extraction completed in {elapsed:.2f}s")
    logger.info(f"Tokens/sec: {tokens_processed / elapsed:.0f}")
    logger.info(f"Entities extracted: {len(result.extractions)}")
    return result

Cost Management

Implement cost-effective strategies for production environments:

Cost Optimization
import langextract as lx

def cost_optimized_extraction(documents, prompt, examples, expected_minimum=5):
    """Optimize for cost in production environments."""
    results = []
    for doc in documents:
        # Use the faster, cheaper model for initial processing
        result = lx.extract(
            text_or_documents=doc,
            prompt_description=prompt,
            examples=examples,
            model_id="gemini-2.5-flash",  # Cost-effective choice
            extraction_passes=1,          # Fewer passes for speed
            max_workers=5,                # Limit parallelism
        )

        # Only fall back to the expensive model when recall looks low
        if len(result.extractions) < expected_minimum:
            result = lx.extract(
                text_or_documents=doc,
                prompt_description=prompt,
                examples=examples,
                model_id="gemini-2.5-pro",  # More expensive but more accurate
                extraction_passes=2,
            )
        results.append(result)
    return results

Advanced Use Cases

LangExtract excels in sophisticated AI applications beyond basic entity extraction:

Building Knowledge Graphs

Knowledge Graph Construction
import langextract as lx
import textwrap

def build_knowledge_graph(documents):
    """Extract entities and relationships for knowledge graph construction."""
    prompt = textwrap.dedent("""\
        Extract entities and their relationships. Focus on connections
        between people, organizations, and concepts.""")

    examples = [
        lx.data.ExampleData(
            text="Apple Inc. was founded by Steve Jobs in Cupertino.",
            extractions=[
                lx.data.Extraction(extraction_class="organization", extraction_text="Apple Inc."),
                lx.data.Extraction(extraction_class="person", extraction_text="Steve Jobs"),
                lx.data.Extraction(extraction_class="location", extraction_text="Cupertino"),
                lx.data.Extraction(
                    extraction_class="relationship",
                    extraction_text="founded by",
                    attributes={"subject": "Apple Inc.", "object": "Steve Jobs"},
                ),
            ],
        )
    ]

    kg_data = []
    for doc in documents:
        result = lx.extract(
            text_or_documents=doc,
            prompt_description=prompt,
            examples=examples,
            model_id="gemini-2.5-flash",
        )
        kg_data.append(result)
    return kg_data
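
The extractions come back as a flat list; building an actual graph takes one more step. Here is a sketch using the third-party networkx package (an assumption; any graph store would do) that reads the subject and object attributes from the relationship extractions above:

Turning Extractions into a Graph
# pip install networkx
import networkx as nx

def to_graph(kg_data):
    """Build a directed graph from a list of extraction results."""
    graph = nx.DiGraph()
    for result in kg_data:
        for e in result.extractions:
            if e.extraction_class == "relationship" and e.attributes:
                subject = e.attributes.get("subject")
                obj = e.attributes.get("object")
                if subject and obj:
                    graph.add_edge(subject, obj, label=e.extraction_text)
            else:
                # Other extraction classes become typed nodes
                graph.add_node(e.extraction_text, kind=e.extraction_class)
    return graph

# Usage: graph = to_graph(build_knowledge_graph(documents))
# print(graph.edges(data=True))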

RAG System Enhancement

Enhanced RAG with Structured Metadata
import langextract as lx
import textwrap

def enhance_rag_with_langextract(documents, examples):
    """Enhance RAG retrieval with structured extraction."""
    # Extract structured metadata from documents
    metadata_prompt = textwrap.dedent("""\
        Extract key topics, entities, and concepts that would help with
        document retrieval and relevance scoring.""")

    enhanced_docs = []
    for doc in documents:
        # Extract structured metadata
        metadata = lx.extract(
            text_or_documents=doc,
            prompt_description=metadata_prompt,
            examples=examples,
            model_id="gemini-2.5-flash",
        )

        # Combine original text with structured metadata
        enhanced_docs.append({
            "original_text": doc,
            "entities": [e.extraction_text for e in metadata.extractions],
            "metadata": metadata,
        })
    return enhanced_docs
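
What the enriched documents buy you at query time is left open above. One simple option is to rank candidates by overlap between their extracted entities and the query terms; the heuristic sketch below is purely illustrative and not part of the LangExtract API:

Entity-Overlap Ranking
def rank_by_entity_overlap(enhanced_docs, query):
    """Rank enhanced documents by entity overlap with the query."""
    query_terms = set(query.lower().split())
    scored = []
    for doc in enhanced_docs:
        entities = {e.lower() for e in doc["entities"]}
        # An entity matches if any of its words appears in the query
        score = sum(1 for ent in entities if query_terms & set(ent.split()))
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

# Usage: top_docs = rank_by_entity_overlap(enhanced_docs, "TechCorp Q3 revenue")[:5]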

Migration from Traditional NER

LangExtract represents a significant advancement over traditional approaches:

Migration Example
# Traditional spaCy approach (before)
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
entities = [(ent.text, ent.label_) for ent in doc.ents]

# LangExtract approach (after)
import langextract as lx

result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
)
entities = [(e.extraction_text, e.extraction_class) for e in result.extractions]

# Benefits of migration:
# - No training data required
# - Domain adaptation with just prompt changes
# - Better context understanding
# - Built-in relationship extraction
# - 99.9% accuracy across varied domains
# - Multilingual support for 100+ languages

Best Practices

Here are key best practices for implementing LangExtract in production environments:

  1. Prompt Engineering: Invest time in crafting clear, specific prompts with high-quality examples that cover edge cases.
  2. Model Selection: Use gemini-2.5-flash for speed and cost efficiency, or gemini-2.5-pro for complex extraction tasks requiring advanced reasoning.
  3. Error Handling: Implement robust retry logic and validation to handle API failures and ensure extraction quality.
  4. Performance Optimization: Use multiple extraction passes and parallel processing for large documents while managing costs.
  5. Monitoring: Track extraction performance, costs, and quality metrics over time to identify areas for improvement.

Common Issues and Solutions

Issue: Low extraction accuracy

Improving Accuracy
# Solution: Improve examples and prompt clarity
import langextract as lx
import textwrap

prompt = textwrap.dedent("""\
    Extract entities with high precision.
    Use EXACT text spans from the source.
    Do NOT paraphrase or generalize entities.
    Include specific attributes for context.""")

# Provide diverse, high-quality examples
examples = [
    lx.data.ExampleData(
        text="Clear, specific example text",
        extractions=[
            lx.data.Extraction(
                extraction_class="specific_class",
                extraction_text="exact text span",
                attributes={"detailed": "attributes"},
            )
        ],
    )
]

Issue: Missing entities in long documents

Optimizing for Long Documents
# Solution: Optimize chunking and use multiple passes
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    extraction_passes=3,   # Multiple passes
    max_char_buffer=600,   # Smaller chunks
    max_workers=15,        # Parallel processing
)
Production Error Handling
import langextract as lx
import logging
from tenacity import retry, stop_after_attempt, wait_exponential

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10),
)
def robust_extraction(text, prompt, examples):
    """Production-ready extraction with retry logic and monitoring."""
    try:
        result = lx.extract(
            text_or_documents=text,
            prompt_description=prompt,
            examples=examples,
            model_id="gemini-2.5-flash",  # Recommended model
            extraction_passes=2,          # Multiple passes for better recall
            max_workers=10,               # Parallel processing
        )

        # Validate results
        if not result.extractions:
            logger.warning("No extractions found")
            raise ValueError("No extractions found")

        # Log extraction metrics
        logger.info(f"Extracted {len(result.extractions)} entities")
        return result

    except Exception as e:
        logger.error(f"Extraction failed: {e}")
        raise

# Validation function
def validate_extraction_quality(result, expected_classes):
    """Validate extraction results for production quality."""
    extracted_classes = {e.extraction_class for e in result.extractions}
    missing_classes = set(expected_classes) - extracted_classes
    quality_score = len(extracted_classes) / len(expected_classes)
    return {
        "quality_score": quality_score,
        "missing_classes": list(missing_classes),
        "extraction_count": len(result.extractions),
        "has_attributes": sum(1 for e in result.extractions if e.attributes),
    }

Conclusion

LangExtract represents a paradigm shift in information extraction, democratizing access to sophisticated NLP capabilities while maintaining the precision and traceability required for production applications. For AI developers, it offers an unprecedented combination of simplicity, power, and reliability that makes structured data extraction accessible and scalable across diverse domains and use cases.

The library's key advantages include requiring no training data (just a few examples), achieving 99.9% accuracy with precise source grounding, supporting 100+ languages, and working with both cloud and local models. This makes it an ideal choice for organizations looking to extract valuable insights from unstructured data efficiently.

As you implement LangExtract in your projects, remember to focus on clear prompt engineering, choose the right model for your use case, and implement proper error handling and monitoring for production deployments. Whether you're building knowledge graphs, enhancing RAG systems, or processing medical documents, LangExtract provides the tools needed to transform your unstructured data into actionable insights at scale.

The future of information extraction is here, and with LangExtract's active development and growing community, you're positioned at the forefront of this technological evolution.

Further Reading

Additional resources to deepen your understanding of LangExtract:

Key Resources

  • LangExtract GitHub Repository: official repository with documentation, examples, and source code
  • Google AI Studio: get your Gemini API key and explore model capabilities
  • LangExtract PyPI Package: install LangExtract and explore the Python package documentation
  • Ollama - Local Models: run local language models for privacy-sensitive applications
  • OpenAI Platform: get your OpenAI API key for GPT model integration