Basic Usage Example
This example shows how to use langchain-memvid to store and retrieve text data, using a video file as the storage medium.
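Before running the example, the required packages need to be installed. A minimal sketch, assuming the PyPI package names match the import names used below:

```shell
pip install langchain-memvid langchain-huggingface sentence-transformers
```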
Quick Start Example
# Generated by ipykernel_memvid_extension from %dump in quickstart.ipynb. DO NOT EDIT.
# LangChain MemVid Quick Start Guide
# This notebook demonstrates the basic usage of the LangChain MemVid library, which allows you to store and retrieve
# text data using video files as a storage medium.
# Setup and Imports
# First, we'll install the required dependencies and import the necessary modules. The main components we need are:
# langchain-huggingface for embeddings
# sentence-transformers for the underlying embedding model
# VectorStore from langchain_memvid for our main functionality
from langchain_huggingface import HuggingFaceEmbeddings
from pathlib import Path
from langchain_memvid import VectorStore
# Creating a Vector Store
# Now we'll create a vector store with some example data. We'll:
# Define paths for storing the video and index files
# Initialize the embedding model
# Create sample text data with metadata
# Build the vector store from our texts
# Note: The metadata helps organize and filter our data, associating each text with a source, category, and ID.
# Paths to store the video and index files
knowledge_base_file = Path("knowledge_base.mp4")
knowledge_base_index_dir = Path("knowledge_base_index.d")
# Embedding model
embedding = HuggingFaceEmbeddings()
# Example text chunks
texts = [
    "The quick brown fox jumps over the lazy dog",
    "A fast orange fox leaps across a sleepy canine",
    "The weather is beautiful today",
    "It's raining cats and dogs outside",
    "Python is a popular programming language",
    "JavaScript is widely used for web development"
]
# Example metadata for each text
metadata = [
    {"id": 0, "source": "example1.txt", "category": "animals"},
    {"id": 1, "source": "example1.txt", "category": "animals"},
    {"id": 2, "source": "example2.txt", "category": "weather"},
    {"id": 3, "source": "example2.txt", "category": "weather"},
    {"id": 4, "source": "example3.txt", "category": "programming"},
    {"id": 5, "source": "example3.txt", "category": "programming"}
]
# Create vector store
vs = VectorStore.from_texts(
    texts=texts,
    embedding=embedding,
    video_file=knowledge_base_file,
    index_dir=knowledge_base_index_dir,
    metadatas=metadata,
)
# Performing Similarity Searches
# Let's test our vector store by performing similarity searches. We'll try different queries to see how well the system
# retrieves relevant information. The search will return the most similar texts along with their metadata and similarity
# scores.
# Example searches
queries = [
    "Tell me about foxes",
    "What's the weather like?",
    "What programming languages are mentioned?"
]
results = [
    {
        "query": query,
        "content": doc.page_content,
        **{k: v for k, v in doc.metadata.items() if k != "text" and v is not None}
    }
    for query in queries
    for doc in vs.similarity_search(query, k=2, include_full_metadata=True)
]
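The metadata-flattening pattern in the comprehension above can be exercised in isolation. The `Doc` class below is a hypothetical stand-in for a LangChain `Document`, not part of langchain-memvid:

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    # Hypothetical stand-in for a LangChain Document
    page_content: str
    metadata: dict = field(default_factory=dict)

docs = [
    Doc(
        "The quick brown fox jumps over the lazy dog",
        {"id": 0, "source": "example1.txt", "text": "duplicate", "category": None},
    )
]

rows = [
    {
        "content": doc.page_content,
        # Drop the redundant "text" key and any None-valued fields
        **{k: v for k, v in doc.metadata.items() if k != "text" and v is not None},
    }
    for doc in docs
]

print(rows[0])
```

The dict unpacking merges the surviving metadata keys next to the document content, so each result row is a flat dictionary.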
# Removing content
# Let's remove some of the documents and re-run the similarity search.
# Remove every second document (indices 0, 2, 4)
vs.delete_by_texts(texts[::2])
# Re-run the similarity search
results = [
    {
        "query": query,
        "content": doc.page_content,
        **{k: v for k, v in doc.metadata.items() if k != "text" and v is not None}
    }
    for query in queries
    for doc in vs.similarity_search(query, k=2, include_full_metadata=True)
]
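The deletion step relies on Python's extended slicing: `texts[::2]` selects every second element (indices 0, 2, and 4), so one text from each category is removed. A standalone sketch of which documents remain:

```python
texts = [
    "The quick brown fox jumps over the lazy dog",
    "A fast orange fox leaps across a sleepy canine",
    "The weather is beautiful today",
    "It's raining cats and dogs outside",
    "Python is a popular programming language",
    "JavaScript is widely used for web development",
]

removed = texts[::2]  # elements at indices 0, 2, 4
remaining = [t for t in texts if t not in removed]

print(removed)
print(remaining)
```

After the deletion, each category still has one document left, so the follow-up searches should still return results, just from a smaller pool.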
The complete example demonstrates:
Setting up embeddings with HuggingFace
Creating a vector store from texts
Adding metadata to documents
Performing similarity searches
Retrieving results with metadata
For the interactive Jupyter notebook version, see Jupyter Notebook Examples.