LuQQiu / fts_benchmark.py
Last active August 13, 2025 21:39
Reddit Comment Dataset
#!/usr/bin/env python3
"""
Multi-process FTS (Full-Text Search) benchmark for LanceDB
Benchmarks FTS queries on the 600M-row Reddit comments dataset.
Requires two text files:
- words.txt: List of words for FTS queries (one per line)
- subreddits.txt: List of subreddits for filters (one per line)
Each query uses exactly the specified number of words and filters.