Retriever
Retriever for MemVid vector store.
Performs semantic search using FAISS index and retrieves documents from video storage.
Supports both essential metadata (fast) and full metadata (from video QR codes).
Implements frame caching for efficient repeated access.
- class langchain_memvid.retriever.Retriever(*args, **kwargs)
Bases: BaseRetriever, BaseModel
- Parameters:
- _get_frame(frame_number)
Get a specific frame from the video with caching.
- Parameters:
frame_number (int) – Frame number to get
- Return type:
- Returns:
Frame if found, None otherwise
- Raises:
RetrievalError – If frame retrieval fails
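The caching behaviour described for `_get_frame` can be illustrated with a small least-recently-used cache. This is a minimal sketch and not the library's implementation: `FrameCache`, `fake_loader`, and the capacity of 2 are hypothetical, and this reference does not state which eviction policy `_get_frame` actually uses.

```python
from collections import OrderedDict
from typing import Callable, Optional


class FrameCache:
    """Hypothetical LRU cache keyed by frame number (illustration only)."""

    def __init__(self, load_frame: Callable[[int], Optional[bytes]], capacity: int = 2):
        self._load_frame = load_frame   # stand-in for real video decoding
        self._capacity = capacity
        self._frames: "OrderedDict[int, bytes]" = OrderedDict()
        self.misses = 0                 # number of calls that reached the loader

    def get(self, frame_number: int) -> Optional[bytes]:
        if frame_number in self._frames:
            self._frames.move_to_end(frame_number)  # mark as most recently used
            return self._frames[frame_number]
        self.misses += 1
        frame = self._load_frame(frame_number)
        if frame is None:
            return None                 # missing frames are not cached
        self._frames[frame_number] = frame
        if len(self._frames) > self._capacity:
            self._frames.popitem(last=False)        # evict least recently used
        return frame


def fake_loader(frame_number: int) -> Optional[bytes]:
    """Pretend the video contains frames 0 through 9."""
    if 0 <= frame_number < 10:
        return f"frame-{frame_number}".encode()
    return None


cache = FrameCache(fake_loader, capacity=2)
cache.get(1)    # first access goes to the loader
cache.get(1)    # second access is served from the cache
cache.get(42)   # out of range, so the result is None
```

Bounding the cache matters because decoded frames are comparatively large; an unbounded dictionary would trade repeated decoding for unbounded memory growth.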
- _get_full_metadata_from_video(doc_id)
Get full metadata from video storage for a specific document.
This method implements the full metadata retrieval component of the hybrid storage approach:
Hybrid Storage Implementation
Video Decoding: Decodes specific video frames to extract complete metadata
Frame Mapping: Uses document-to-frame mapping for efficient frame location
Complete Data: Retrieves all metadata fields and custom attributes
Fallback Mechanism: Provides complete data access when FAISS data is insufficient
Performance Characteristics
Frame Lookup: O(1) lookup using frame mapping
Video Decoding: Additional processing time for frame decoding and QR code processing
Memory Usage: Medium (requires frame decoding and QR code processing)
Error Handling
Returns None if frame mapping is not available
Returns None if video decoding fails
Logs warnings for debugging purposes
Graceful degradation when video data is corrupted
Use Cases
Complete Metadata Access: When all metadata fields are required
Data Integrity Verification: When FAISS data needs validation
Backup Recovery: When FAISS index is corrupted or incomplete
Custom Field Access: When accessing fields not in essential metadata
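The lookup path and error handling listed above can be sketched as follows. Everything here is an assumption made for illustration: `full_metadata_for`, `frame_mapping`, `decode_qr_payload`, and the JSON payload layout (a `metadata` key inside the decoded QR text) are hypothetical and are not taken from the library.

```python
import json
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger("memvid_sketch")


def full_metadata_for(
    doc_id: int,
    frame_mapping: Optional[Dict[int, int]],
    decode_qr_payload: Callable[[int], str],
) -> Optional[Dict[str, Any]]:
    """Return the decoded metadata for doc_id, or None on any failure."""
    if not frame_mapping or doc_id not in frame_mapping:
        logger.warning("No frame mapping for document %s", doc_id)
        return None
    frame_number = frame_mapping[doc_id]            # O(1) dictionary lookup
    try:
        payload = decode_qr_payload(frame_number)   # assumed to return JSON text
        metadata = dict(json.loads(payload).get("metadata", {}))
    except Exception as exc:                        # unreadable frame or bad payload
        logger.warning("Could not decode frame %s: %s", frame_number, exc)
        return None
    metadata["metadata_type"] = "full"
    return metadata


def fake_decoder(frame_number: int) -> str:
    """Stand-in decoder: frame 7 is treated as corrupted."""
    if frame_number == 7:
        raise ValueError("unreadable QR code")
    return json.dumps({"text": "hello", "metadata": {"source": "a.txt", "author": "kim"}})


mapping = {0: 3, 1: 7}
full_metadata_for(0, mapping, fake_decoder)   # decoded metadata, tagged as "full"
full_metadata_for(1, mapping, fake_decoder)   # corrupted frame: None, with a warning
```

Returning None instead of raising keeps a single corrupted frame from aborting a larger retrieval, which matches the graceful degradation described above.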
- _get_relevant_documents(query)
Get documents relevant to the query.
This method implements the hybrid storage approach for optimal search performance:
Hybrid Storage Implementation
Essential Metadata Only: Returns documents with minimal metadata from FAISS
Fast Search: Leverages FAISS capabilities for sub-second search
Metadata Structure: Includes text, source, category, doc_id, metadata_hash
Metadata Type Flag: Sets "metadata_type": "essential" for identification
Performance Optimizations
Progress Bar: Shows progress for large result sets (>10 documents)
Memory Efficient: Processes results in batches to avoid memory issues
Caching: Leverages frame caching for repeated access
Metadata Structure
source: Document source
category: Document category
similarity: Similarity score
doc_id: Document ID
metadata_hash: Metadata hash
metadata_type: Metadata type
… other essential fields
- Parameters:
query (str) – Query string
- Return type:
List[Document]
- Returns:
List of relevant documents with essential metadata
- Raises:
RetrievalError – If retrieval fails
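The essential-metadata structure listed for `_get_relevant_documents` can be pictured with a short sketch. `Doc` is a stand-in for LangChain's `Document`, and `to_essential_documents` and the shape of `hits` (pairs of stored record and similarity score) are hypothetical; the real method's internals are not shown in this reference.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_essential_documents(hits: List[Tuple[Dict[str, Any], float]]) -> List[Doc]:
    """Convert (stored_record, similarity) pairs into documents with essential metadata."""
    docs = []
    for record, similarity in hits:
        metadata = {
            "source": record.get("source"),
            "category": record.get("category"),
            "similarity": similarity,
            "doc_id": record["doc_id"],
            "metadata_hash": record.get("metadata_hash"),
            "metadata_type": "essential",   # flag that marks the metadata as essential
        }
        docs.append(Doc(page_content=record["text"], metadata=metadata))
    return docs


hits = [
    ({"doc_id": 0, "text": "hello", "source": "a.txt", "category": "notes",
      "metadata_hash": "abc"}, 0.91),
]
documents = to_essential_documents(hits)
```

Callers can check `metadata["metadata_type"]` to decide whether a follow-up call with full metadata is needed.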
- async abatch(inputs, config=None, *, return_exceptions=False)
Asynchronously invoke the retriever on multiple inputs.
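The effect of a `return_exceptions` flag on a batch of asynchronous calls can be sketched with `asyncio.gather`. This is an illustration under assumptions: `fake_ainvoke` and `fake_abatch` are hypothetical stand-ins, and this reference does not state how `Retriever.abatch` schedules its work internally.

```python
import asyncio
from typing import List, Union


async def fake_ainvoke(query: str) -> List[str]:
    """Stand-in for an async retriever call; fails on an empty query."""
    await asyncio.sleep(0)
    if not query:
        raise ValueError("empty query")
    return [f"result for {query!r}"]


async def fake_abatch(
    inputs: List[str], *, return_exceptions: bool = False
) -> List[Union[List[str], BaseException]]:
    """Run all queries concurrently and keep results in input order."""
    return await asyncio.gather(
        *(fake_ainvoke(q) for q in inputs),
        return_exceptions=return_exceptions,
    )


# With return_exceptions=True the failing query yields an exception object
# in its slot instead of aborting the whole batch.
results = asyncio.run(fake_abatch(["cats", "", "dogs"], return_exceptions=True))
```

With the default `return_exceptions=False`, the first failure propagates to the caller, so the flag is mainly useful when partial results are acceptable.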
- async ainvoke(input, config=None)
Asynchronously invoke the retriever on a single input.
- Parameters:
input (str) – Query string
config (Optional[RunnableConfig]) – Optional configuration for the run
- Return type:
List[Document]
- Returns:
List of relevant documents
- batch(inputs, config=None, *, return_exceptions=False)
Invoke the retriever on multiple inputs.
- config: VectorStoreConfig
- decode_all_frames()
Decode all frames from the video.
- Return type:
List[Document]
- Returns:
List of all documents in the video
- Raises:
RetrievalError – If decoding fails
- decode_frame(frame_number)
Decode a specific frame from the video.
- Parameters:
frame_number (int) – Frame number to decode
- Return type:
Optional[Document]
- Returns:
Document if the frame contains a valid QR code, None otherwise
- Raises:
RetrievalError – If decoding fails
- get_document_by_id(doc_id, include_full_metadata=False)
Get a document by its ID.
This method supports the hybrid storage approach with flexible metadata retrieval:
Essential Metadata Only (include_full_metadata=False): Fast retrieval from FAISS index
Document text, source, category, doc_id, metadata_hash
O(1) lookup time from FAISS
Minimal memory usage
Metadata type: "essential"
Full Metadata (include_full_metadata=True): Complete metadata from video storage
All metadata fields and custom attributes
Requires video frame decoding
Complete data access with integrity checking
Metadata type: "full"
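- Parameters:
doc_id – ID of the document to retrieve
include_full_metadata – If True, retrieve the complete metadata from video storage; if False (the default), return only the essential metadata from the FAISS index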
- Return type:
Optional[Document]
- Returns:
Document if found, None otherwise
- Raises:
RetrievalError – If retrieval fails
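The two retrieval modes of `get_document_by_id` can be sketched with in-memory stand-ins. `ESSENTIAL`, `FULL`, and `get_by_id` are hypothetical, and the fallback to essential metadata when full metadata is unavailable is an assumption of this sketch, not behaviour documented here.

```python
from typing import Any, Dict, Optional

# Stand-in for the records kept alongside the FAISS index.
ESSENTIAL: Dict[int, Dict[str, Any]] = {
    0: {"text": "hello", "source": "a.txt", "category": "notes", "metadata_hash": "abc"},
}
# Stand-in for metadata decoded from video frames.
FULL: Dict[int, Dict[str, Any]] = {
    0: {"source": "a.txt", "category": "notes", "author": "kim", "tags": ["x"]},
}


def get_by_id(doc_id: int, include_full_metadata: bool = False) -> Optional[Dict[str, Any]]:
    """Return a document-like dict, or None if the ID is unknown."""
    record = ESSENTIAL.get(doc_id)
    if record is None:
        return None
    metadata = {k: v for k, v in record.items() if k != "text"}
    metadata.update(doc_id=doc_id, metadata_type="essential")
    if include_full_metadata:
        full = FULL.get(doc_id)
        if full is not None:   # assumption: keep essential metadata if decoding fails
            metadata = {**full, "doc_id": doc_id, "metadata_type": "full"}
    return {"page_content": record["text"], "metadata": metadata}


fast = get_by_id(0)                                   # essential metadata only
complete = get_by_id(0, include_full_metadata=True)   # includes custom fields
```

The default favours the cheap path; callers opt in to frame decoding only when they need fields that are not part of the essential set.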
- get_documents_by_ids(doc_ids, include_full_metadata=False)
Get documents by their IDs.
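- Parameters:
doc_ids – IDs of the documents to retrieve
include_full_metadata – If True, retrieve the complete metadata from video storage; if False (the default), return only the essential metadata from the FAISS index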
- Return type:
List[Document]
- Returns:
List of documents
- Raises:
RetrievalError – If retrieval fails
- index_manager: Union[IndexManager, Any]
- invoke(input, config=None)
Invoke the retriever on a single input.
- Parameters:
input (str) – Query string
config (Optional[RunnableConfig]) – Optional configuration for the run
- Return type:
List[Document]
- Returns:
List of relevant documents
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'from_attributes': True, 'protected_namespaces': (), 'strict': False, 'validate_assignment': True}
Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict.
- model_post_init(_Retriever__context)
Initialize additional attributes after Pydantic model initialization.
- Parameters:
_Retriever__context (Any)
- retrieve(query)
Retrieve documents relevant to the query.
- Parameters:
query (str) – Query string
- Return type:
List[Document]
- Returns:
List of relevant documents
- Raises:
RetrievalError – If retrieval fails
- video_processor: Union[VideoProcessor, Any]