Basic Usage Example

This example shows how to use langchain-memvid to store text data in a video file and retrieve it with similarity search.

Quick Start Example
# Generated by ipykernel_memvid_extension from %dump in quickstart.ipynb. DO NOT EDIT.

# LangChain MemVid Quick Start Guide
# This notebook demonstrates the basic usage of the LangChain MemVid library, which allows you to store and retrieve
# text data using video files as a storage medium.

# Setup and Imports
# First, we'll install the required dependencies and import the necessary modules. The main components we need are:
# - langchain-huggingface for embeddings
# - sentence-transformers for the underlying embedding model
# - VectorStore from langchain_memvid for our main functionality

from langchain_huggingface import HuggingFaceEmbeddings
from pathlib import Path
from langchain_memvid import VectorStore

# Creating a Vector Store
# Now we'll create a vector store with some example data. We'll:
# - Define paths for storing the video and index files
# - Initialize the embedding model
# - Create sample text data with metadata
# - Build the vector store from our texts
# Note: The metadata helps organize and filter our data, associating each text with a source, category, and ID.

# Paths to store the video and index files
knowledge_base_file = Path("knowledge_base.mp4")
knowledge_base_index_dir = Path("knowledge_base_index.d")
# Embedding model
embedding = HuggingFaceEmbeddings()
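# Note: HuggingFaceEmbeddings() with no arguments falls back to the library's default
# sentence-transformers model. To pin a specific model, pass model_name explicitly
# (model_name is a documented parameter of HuggingFaceEmbeddings), e.g.:
# embedding = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")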
# Example text chunks
texts = [
    "The quick brown fox jumps over the lazy dog",
    "A fast orange fox leaps across a sleepy canine",
    "The weather is beautiful today",
    "It's raining cats and dogs outside",
    "Python is a popular programming language",
    "JavaScript is widely used for web development"
]
# Example metadata for each text
metadata = [
    {"id": 0, "source": "example1.txt", "category": "animals"},
    {"id": 1, "source": "example1.txt", "category": "animals"},
    {"id": 2, "source": "example2.txt", "category": "weather"},
    {"id": 3, "source": "example2.txt", "category": "weather"},
    {"id": 4, "source": "example3.txt", "category": "programming"},
    {"id": 5, "source": "example3.txt", "category": "programming"}
]
# Create vector store
vs = VectorStore.from_texts(
    texts=texts,
    embedding=embedding,
    video_file=knowledge_base_file,
    index_dir=knowledge_base_index_dir,
    metadatas=metadata,
)
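
# The build step is handled by from_texts; depending on the library version, the encoded
# video may already be on disk at this point. A quick sanity check using only pathlib
# (no library-specific API assumed):
if knowledge_base_file.exists():
    print(f"Video file written: {knowledge_base_file} "
          f"({knowledge_base_file.stat().st_size} bytes)")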

# Performing Similarity Searches
# Let's test our vector store by performing similarity searches. We'll try different queries to see how well the system
# retrieves relevant information. The search will return the most similar texts along with their metadata and similarity
# scores.

# Example searches
queries = [
    "Tell me about foxes",
    "What's the weather like?",
    "What programming languages are mentioned?"
]
results = [
    {
        "query": query,
        "content": doc.page_content,
        **{k: v for k, v in doc.metadata.items() if k != "text" and v is not None}
    }
    for query in queries
    for doc in vs.similarity_search(query, k=2, include_full_metadata=True)
]
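
# A minimal way to display the hits (plain Python, no extra dependencies). The
# "category" key comes from the metadata we attached above:
for result in results:
    print(f"Query:    {result['query']}")
    print(f"Match:    {result['content']}")
    print(f"Category: {result.get('category', 'n/a')}")
    print()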

# Removing content
# Let's remove some of the documents and re-run the similarity search.

# Remove every other document (indices 0, 2, 4)
vs.delete_by_texts(texts[::2])
# Re-run the similarity search
results = [
    {
        "query": query,
        "content": doc.page_content,
        **{k: v for k, v in doc.metadata.items() if k != "text" and v is not None}
    }
    for query in queries
    for doc in vs.similarity_search(query, k=2, include_full_metadata=True)
]
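
# Sanity check (illustrative): none of the removed texts should come back as hits,
# since deleted documents are excluded from subsequent searches.
removed = set(texts[::2])
assert all(result["content"] not in removed for result in results)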

The complete example demonstrates:

  • Setting up embeddings with HuggingFace
  • Creating a vector store from texts
  • Adding metadata to documents
  • Performing similarity searches
  • Retrieving results with metadata

For the interactive Jupyter notebook version, see Jupyter Notebook Examples.