🔍 Async File Search with SQLite FTS5
A fast, local file-content search tool built with Python 3. It combines asynchronous indexing, SQLite FTS5 full-text search, regex search, change-aware indexing, tag management, saved profiles, fuzzy matching, and structured exports.
Features
- 📂 Recursive folder indexing
- ⚡ Async I/O for fast processing
- 🔍 Full-text search using SQLite FTS5
- 🧬 Regex content search with smart snippets
- 🧠 Fuzzy search (--fuzzy, --fuzzy-threshold)
- 🧠 Advanced logic: supports AND, OR, NOT, NEAR, quotes ", wildcards *, and grouping ()
- ⟳ NEAR operator for proximity search (term1 NEAR term2)
- 📂 Image metadata indexing (EXIF, dimensions, timestamp, camera, format)
- 📊 CSV analyzer (analyze-csv) for structured file summaries
- 🔍 Metadata filtering (--author, --camera, --image_created, --format)
- 📀 Change detection using content hashes
- 🗵️ Real-time folder watch (watchdog)
- 🏷️ Tagging system via CLI or virtual fields
- 📁 Profile saving/loading for repeatable queries
- 📤 Export results: PDF, TXT, JSON
- 📊 stats command shows DB insights and top tags
- 🎨 Colorized terminal snippets (colorama)
- 🌈 Animated ripple effect during wait times
Requirements
pip install -r requirements.txt
- Required: PyPDF2, python-docx, openpyxl, watchdog, colorama
- Optional: fpdf2, reportlab, pytesseract (needs external Tesseract OCR)
Usage Examples
📁 Indexing
indexly index "C:/docs"
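To illustrate the idea behind recursive, asynchronous indexing, here is a minimal sketch using only the standard library. It is not indexly's actual implementation: the function names `read_text` and `index_folder` and the `max_concurrency` parameter are hypothetical, and it only reads plain text files rather than routing by file type.

```python
import asyncio
from pathlib import Path


def read_text(path: Path) -> str:
    # Blocking read; undecodable bytes are replaced instead of raising.
    return path.read_text(encoding="utf-8", errors="replace")


async def index_folder(root: str, max_concurrency: int = 8) -> dict[str, str]:
    """Walk `root` recursively and read every file concurrently.

    Hypothetical helper: returns a mapping of file path to text content.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def load(path: Path) -> tuple[str, str]:
        async with semaphore:
            # Run the blocking read in a worker thread so the event loop stays free.
            text = await asyncio.to_thread(read_text, path)
        return str(path), text

    files = [p for p in Path(root).rglob("*") if p.is_file()]
    results = await asyncio.gather(*(load(p) for p in files))
    return dict(results)
```

A caller would run it with `asyncio.run(index_folder("C:/docs"))`. The semaphore caps how many files are open at once, which matters for very large folders.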
🔍 Full-Text Search
indexly search "invoice AND 2024"
indexly search "example" --fuzzy --fuzzy-threshold 85
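The README does not say which algorithm backs --fuzzy, so the following is only a sketch of how a 0-100 similarity threshold can work, using `difflib.SequenceMatcher` from the standard library as a stand-in. The function name `fuzzy_matches` is hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_matches(term: str, words: list[str], threshold: int = 85) -> list[tuple[str, int]]:
    """Return (word, score) pairs whose similarity to `term` meets the threshold.

    Scores are percentages derived from SequenceMatcher.ratio(); this is an
    assumed scoring scheme, not necessarily the one indexly uses.
    """
    hits = []
    for word in words:
        score = round(SequenceMatcher(None, term.lower(), word.lower()).ratio() * 100)
        if score >= threshold:
            hits.append((word, score))
    # Best matches first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

With a threshold of 85, a transposition such as "exampel" still matches "example", while a looser neighbour such as "sample" does not.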
⟳ Operators
indexly search '"invoice" AND "2024"'
indexly search '"term1" NEAR "term2"'
indexly search 'error NOT warning'
indexly search '"meeting*" AND (urgent OR deadline)'
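For readers curious how such operators map onto SQLite, the sketch below runs comparable queries directly against an in-memory FTS5 table with Python's `sqlite3` module. The table layout and the helper names `build_index` and `search` are assumptions for illustration, not indexly's schema. Note that raw FTS5 writes proximity as `NEAR(a b, N)` and prefix matches as `meet*`; how indexly translates its own `term1 NEAR term2` form is not shown in this README.

```python
import sqlite3


def build_index(docs: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table (assumed layout) and load `docs` into it."""
    conn = sqlite3.connect(":memory:")
    # `path` is stored but not tokenized, so only `content` is searched.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, content)")
    conn.executemany("INSERT INTO docs (path, content) VALUES (?, ?)", docs.items())
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return matching paths for a raw FTS5 query, best-ranked first."""
    rows = conn.execute(
        "SELECT path FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]


conn = build_index({
    "a.txt": "invoice for march 2024 payment",
    "b.txt": "error in parser without any alerts",
    "c.txt": "error and warning logged",
    "d.txt": "meeting notes: urgent deadline moved",
})

boolean_hits = search(conn, "invoice AND 2024")
negation_hits = search(conn, "error NOT warning")
proximity_hits = search(conn, "NEAR(invoice payment, 5)")
prefix_hits = search(conn, "meet* AND (urgent OR deadline)")
```

This requires a Python build whose bundled SQLite has FTS5 enabled, which is the case for most standard distributions.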
🧬 Regex Search
indexly regex "ERROR.*\d+"
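As a rough illustration of regex search with snippets, the sketch below finds each match and returns it with a few characters of surrounding context. The function name `regex_snippets` and the `context` parameter are hypothetical; indexly's own snippet logic may differ.

```python
import re


def regex_snippets(text: str, pattern: str, context: int = 20) -> list[str]:
    """Return each regex match with up to `context` characters on either side."""
    snippets = []
    for match in re.finditer(pattern, text):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(text))
        # Flatten newlines so each snippet prints on a single line.
        snippets.append(text[start:end].replace("\n", " "))
    return snippets
```

Because `.` does not cross newlines by default, a pattern like `ERROR.*\d+` matches within a single line of the file.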
🎯 Filtered Search
indexly search "report" --filetype .pdf .docx --date-from 2024-01-01 --path-contains finance
indexly search "image" --camera Nikon --image_created 2023-08-01 --format jpg
🏷️ Tagging
# Tagging a single file
indexly tag add --files "/path/to/file.txt" --tags important
# Multiple files
indexly tag add --files "/path/to/file1.txt" "/path/to/file2.txt" --tags report
# Entire folder, top-level only
indexly tag add --files "/path/to/folder" --tags projectX
# Entire folder recursively
indexly tag add --files "/path/to/folder" --tags projectX --recursive
📀 Profiles
indexly search "term1" --filetype .txt --date-from yyyy-mm-dd --save-profile invoice
indexly search "term2" --profile invoice
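Conceptually, a profile is a named set of search options stored on disk and reapplied later. The sketch below shows one way that could work with a JSON file; the file layout and the function names `save_profile` and `load_profile` are assumptions, and the real format of search_profiles.json is not documented here.

```python
import json
from pathlib import Path


def save_profile(store: Path, name: str, options: dict) -> None:
    """Add or overwrite the profile `name` in the JSON file at `store`."""
    profiles = json.loads(store.read_text(encoding="utf-8")) if store.exists() else {}
    profiles[name] = options
    store.write_text(json.dumps(profiles, indent=2), encoding="utf-8")


def load_profile(store: Path, name: str) -> dict:
    """Return the saved options for `name`; raises KeyError if it is missing."""
    profiles = json.loads(store.read_text(encoding="utf-8"))
    return profiles[name]
```

Keeping all profiles in one human-readable file makes them easy to inspect or edit by hand.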
📤 Export
indexly search "inventory" --export-format pdf --output results.pdf
indexly search "order" --export-format txt --output results.txt
indexly search "invoice" --export-format json --output results.json
👁️ Watch Folder
indexly watch "C:/docs"
📈 Stats
indexly stats
📊 CSV Analyzer
indexly analyze-csv --file sample.csv
indexly analyze-csv --file sample.csv --export summary.md --format md
Returns column summaries, basic stats (mean, median, std, IQR), and optional markdown/txt export.
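The statistics listed above can be computed with the standard library alone. The following is a minimal sketch, not the code in csv_analyzer.py: the function name `summarize_csv` is hypothetical, non-numeric columns are simply skipped, and the IQR uses the default quartile method of `statistics.quantiles`.

```python
import csv
import io
import statistics


def summarize_csv(handle) -> dict[str, dict[str, float]]:
    """Compute mean, median, std, and IQR for every fully numeric column."""
    rows = list(csv.DictReader(handle))
    summary = {}
    for column in (rows[0].keys() if rows else []):
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # Column has missing or non-numeric cells; skip it.
        if len(values) < 2:
            continue  # Sample std and quartiles need at least two values.
        q1, _, q3 = statistics.quantiles(values, n=4)
        summary[column] = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "std": statistics.stdev(values),
            "iqr": q3 - q1,
        }
    return summary


sample = io.StringIO("name,amount\na,10\nb,20\nc,30\nd,40\ne,50\n")
stats = summarize_csv(sample)
```

Here only the `amount` column is summarized, since `name` cannot be parsed as numbers.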
Supported File Types
- .txt, .csv, .md, .html, .htm
- .docx, .xlsx
- .pdf (native & OCR)
- .msg, .eml (Outlook email support)
- .jpg, .jpeg, .png, .gif, .bmp, .tiff (image metadata only)
Modular Project Structure
extsearch/
├── indexly.py # CLI entry point (search/index/regex/watch/analyze)
├── extract_utils.py # Per-file extraction + virtual tag logic
├── export_utils.py # Export to TXT / PDF / JSON
├── fts_core.py # Indexing logic + tag extraction
├── cache_utils.py # Caching logic for searches
├── filetype_utils.py # Filetype support check + text routing
├── db_utils.py # DB schema + connection
├── csv_analyzer.py # CSV summary/stats export
├── ripple.py # Ripple animation
├── watcher.py # Folder monitoring
├── log_utils.py # Daily log writer
├── config.py # Settings and constants
├── search_profiles.json # Saved profiles
├── changelog.json # Changelog
└── *.log # Dated logs (e.g., 2025-07-03_index.log)
Notes
- Local + offline: no cloud dependency
- Duplicate indexing avoided via hash
- Tags extracted from text and metadata (virtual tags)
- Colorized output + CLI filters for power users
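Hash-based duplicate avoidance can be sketched as follows. This is an assumption-laden illustration: the README does not name the hash algorithm or where hashes are stored, so SHA-256 and the in-memory `known_hashes` dictionary are stand-ins, and `file_hash` and `needs_reindex` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_reindex(path: Path, known_hashes: dict[str, str]) -> bool:
    """Return True if the file is new or changed, and record its current hash."""
    current = file_hash(path)
    if known_hashes.get(str(path)) == current:
        return False  # Content unchanged since the last run; skip it.
    known_hashes[str(path)] = current
    return True
```

Comparing content hashes rather than modification times means a file that was touched but not edited is not indexed again.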
Roadmap
🛠️ Planned Features:
- --exclude-path and --exclude-ext to skip certain patterns
- Export all saved profiles in batch mode
- Auto-tagging via date or content logic
- Web UI with tag manager
- Export as Markdown/HTML/CSV
Author
Built by N K Franklin-Gent – fast, local, smart.
Last updated: 2025-09-07 ✅