cbird help v0.8
„__„ CBIRD
{o,o} Content-Based Image Retrieval Database
|)__) https://github.com/scrubbbbs/cbird
-“–“- license: GPLv2 (see: -license)
Usage: /home/test/src/cbird/cbird [args...]
* Arguments are positional, and may be given multiple times
* Operations occur in the order given, can often be chained
* Definitions in <>, descriptions at the end of this help
* Optional values in []
* Alternatives separated by |
Setup
==============================================================================
-use <dir> set index location, in <dir>/_index, default current directory,
- use "@" to search for index in the parent tree of cwd
- use "@<path>" to search in parent tree of <path>
-create create index if there is not one already, otherwise prompt
-update [<dir>] create/refresh index, optionally only scan subdirectory <dir>
- index is created in -use <dir>, not the subdirectory
- use -i.include -i.exclude to filter paths considered by the indexer
-migrate convert index files to new format(s)
- use "-i.dryrun true" to test the migration before applying
-headless run without a window manager
-about system information
-args [option|<file>] load saved arguments file, one arg per line (default: global+local)
- "global" for ~/.config/cbird.args.txt
- "local" for _index/args.txt
- "none" to disable default processing
-h|-help help page
-version|--version version
-license|--license show software license and attribution
-v|-verbose enable detailed logging
-q|-quiet disable logging except for critical errors
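
The -args file format above (one argument per line) can be pictured with a short Python sketch. This is not cbird's own parser: the helper name load_args_file is hypothetical, and both the skipping of blank lines and the choice to put each switch and each value on its own line are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def load_args_file(path):
    """Read one argument per line. Skipping blank lines is an assumption."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Write a hypothetical args file to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    args_path = Path(tmp) / "args.txt"
    args_path.write_text("-p.dht\n7\n\n-similar\n-show\n", encoding="utf-8")
    loaded = load_args_file(args_path)

print(loaded)
```

The resulting list keeps the order of the file, which matters because cbird processes arguments in the order given.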
Queries
==============================================================================
-dups exact duplicates using md5 hash
-dups-in <selector> exact duplicates in subset
-similar similar items in entire index
-similar-in <selector> similar items within a subset
-similar-to <file>|<selector> similar items to a file, directory, or subset
- <file> can point outside the index parent
- non-indexed files are temporarily indexed
- non-indexed videos search for a few evenly-spaced frame grabs
- -p.refl only works with this method
-weeds previously deleted files that came back
Selections
==============================================================================
* repeated select commands append to the current selection
* operations on current selection usually clear the selection (-group-by etc)
-select-none clear the selection
-select-all everything
-select-id <integer> one item by its unique id
-select-one <file> one item by path
-select-path <selector> subset by path, using selector
-select-type <type> subset by type
-select-errors items with errors from last operation (-update,-verify)
-select-result to chain queries: convert the last query result into a selection,
and clear the result
-select-sql <sql> items with sql statement [select * from media where ...]
-select-files <file> [<file>]... ignore index, existing files/directories of supported file types
-select-grid <file> [<file>]... ignore index, detect a grid of thumbnails,
break up into separate images
Batch Deletion
==============================================================================
* deletion uses system trash location/api by default
* set CBIRD_TRASH_DIR to your own location
-nuke-dups-in <dir> delete dups under <dir> only
-nuke-weeds delete all weeds
-nuke delete selection
Selection/Result Filtering
==============================================================================
-with[out] <prop>[#<func>] <expr> remove items if expression is false, invert with -without
first in each group (needle) is never filtered
-or-with[out] <prop>[#<func>] <expr> logical OR for the preceding -with[out]
logical AND by using multiple -with[out]
-first keep the first item
-chop remove the first item
-first-sibling keep one item from each directory
-head <int> keep the first N items
-tail <int> keep the last N items
Sorting/Grouping
==============================================================================
* -rev to reverse/sort descending
* additional sort key (multisort) with another -sort immediately after
-sort[-rev] <prop>[#<func>] sort selection or result groups
-sort-result[-rev] <prop>[#<func>] sort result by first member of each group
-group-by <prop>[#<func>] group selection by property, store in result (clears selection)
-sort-similar sort selection by similarity
-merge <selector> <selector> merge two selections by similarity, into a new list,
assuming first selection is sorted
Operations on Selection/Results
==============================================================================
* move/rename operations preserve the index information (useful for videos or large corpus)
-remove remove from the index (force re-indexing)
-nuke remove from index and move files to trash
-rename <find> <replace> [-vxp] rename selection with find/replace, ignoring/preserving file extension
v * verbose preview, show what didn't match and won't be renamed
x * execute the rename, by default only preview
p * <find> matches the full path instead of the file name
-move <dir> move selection to another location in the index directory
-verify verify md5 sums
-dump print selection information
Viewing
==============================================================================
* to take effect, options must appear before -show
-show show results browser for the current selection/results
-folders enable group view, group results from the same parent directory
-sets enable group view, group results with the same pair of directories
-exit-on-select "select" action exits with selected index as exit code, < 0 if canceled
-focus-first select the first item by default (default select last)
-max-per-page <int> maximum items on one page [12]
-no-delete disable file deletion, renaming/moving is allowed
-theme <string> select widget theme
Auto * detect Dark/Light from system theme if possible (Mac/Windows default)
Dark * built-in dark theme (Linux default)
Light * built-in light theme
Qt * do not set any theme, adopt the Qt/System theme
Miscellaneous
==============================================================================
-qualityscore img no-reference quality score
-simtest testfile run automated matching test
-jpeg-repair-script <file> script/program to repair truncated jpeg files (-verify) [~/bin/jpegfix.sh]
-compare-videos <file> <file> open a pair of videos in compare tool
-test-csv <file> read csv of src/dst pairs for a similar-to test, store results in match.csv
-test-image-loader <file> test image decoding
-test-video-decoder <file> test video decoding
-test-video <file> test video search
-vacuum compact/optimize database files
-list-index-params list current index parameters
-list-search-params list current search parameters
-list-formats list available image and video formats
-list-codecs list available video codecs
-video-thumbnail <file> <frame#> crop video thumbnail at frame
Search Parameters (for -similar*)
==============================================================================
-p.<key> value
* default value in [ ]
* alternate value in ( )
* flag names are combined with +, e.g. -p.refl h+v+b == -p.refl 7
* must appear before -similar or other queries to take effect
* special value "help" or "?" can be passed to get help
* keys:
alg <enum> Search algorithm [dct(0)]
dct (0) DCT image hash
fdct (1) DCT image hashes of features
orb (2) ORB descriptors of features
color (3) Color histogram
video (4) DCT image hashes of video frames
dht <int> DCT hash distance threshold (dct,fdct,video) [5]
odt <int> ORB descriptor distance threshold (orb) [25]
vradix <int> Divides the haystack by ~ 2^R but loses accuracy (video) [10]
vfm <int> Minimum number of frames matched per video [30]
vfn <int> Minimum percent of frames near each other [60]
fs <bool> Filter Self: remove item that matched itself [true]
mn <int> Minimum matches per needle [1]
mm <int> Maximum matches per needle [5]
mt <int> Maximum threshold to try, until minMatches are found [0]
refl <flags> Also search reflections of needle []
none (0) No flipping
h (1) Flip horizontally
v (2) Flip vertically
b (4) Flip horizontal and vertical
types <flags> Enabled needle media types [i]
i (1) Image files
v (2) Video files
a (4) Audio files
crop <bool> Enable de-letterbox/autocrop pre-filter [false]
vtrim <int> Number of frames to ignore at start/end (video) [300]
tm <bool> Enable template match result filter [false]
tnf <int> Template match number of needle features [100]
thf <int> Template match number of haystack features [1000]
tdht <int> Template matcher DCT hash threshold [7]
tscale <int> Template matcher scale factor % [200]
neg <bool> Enable negative match result filter [false]
fg <bool> Filter Groups: remove duplicate groups from result: {a,b}=={b,a} [true]
fp <bool> Filter Parent: remove items in the same directory as needle [false]
mg <int> Merge n-connected groups: {a,b},{a,c}=>{a,b,c} [0]
eg <bool> Expand groups to make pairs {a,b,c}=>{a,b}+{a,c} [false]
verbose <bool> Enable diagnostic/verbose output [false]
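
The group notation used by fg, mg and eg above can be made concrete with a small Python sketch. It only mirrors the three examples from the key descriptions and is not cbird's implementation. In particular, the meaning of "n-connected" is not modeled: merge_groups below assumes that any shared item joins two groups. All function names are hypothetical.

```python
def filter_groups(groups):
    """fg: drop a group if the same members were already seen, {a,b}=={b,a}."""
    seen, kept = set(), []
    for group in groups:
        key = frozenset(group)
        if key not in seen:
            seen.add(key)
            kept.append(group)
    return kept


def merge_groups(groups):
    """mg (assumed rule): merge groups sharing any item, {a,b},{a,c}=>{a,b,c}."""
    merged = []
    for group in groups:
        overlapping = [m for m in merged if set(m) & set(group)]
        combined = []
        for m in overlapping:
            merged.remove(m)
            combined.extend(m)
        combined.extend(group)
        # Remove repeated items while keeping first-seen order.
        merged.append(list(dict.fromkeys(combined)))
    return merged


def expand_groups(groups):
    """eg: pair the first item (needle) with each match, {a,b,c}=>{a,b}+{a,c}."""
    return [[group[0], match] for group in groups for match in group[1:]]


print(filter_groups([["a", "b"], ["b", "a"]]))
print(merge_groups([["a", "b"], ["a", "c"]]))
print(expand_groups([["a", "b", "c"]]))
```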
Index Parameters (for -update)
==============================================================================
-i.<key> value
* default value in [ ]
* alternate value in ( )
* flag names are combined with +, e.g. -i.types i+v == -i.types 3
* must appear before -update to take effect
* special value "help" or "?" can be passed to get help
algos <flags> Enabled algorithms [dct+fdct+orb+color+video]
dct (1) DCT image hash
fdct (2) DCT image hashes of features
orb (4) ORB descriptors of features
color (8) Color histogram
video (16) DCT image hashes of video frames
types <flags> Enabled media types [i+v+a]
i (1) Image files
v (2) Video files
a (4) Audio files
sync <bool> Ensures previous algos persist even if -i.algos changes [true]
dirs <bool> Enable recursive scan of subdirectories [true]
exclude <glob> Add glob/pattern to exclude matching paths []
include <glob> Add glob/pattern to include matching paths []
fsize <int> Minimum file size in bytes, ignore smaller files [1024]
links <bool> Follow symlinks to files and directories [false]
resolve <bool> Store resolved symlink if it is child of index root [false]
dups <bool> Follow duplicate inodes (hard links,symlinks,junctions) [false]
modtime <bool> Force using potentially unreliable file modification time [false]
crop <bool> Enable border crop/de-letterbox for images (video=>always enabled) [true]
nfeat <int> Number of features per image (fdct,orb) [400]
rsize <int> Dimension for prescaling images before processing (dct,fdct,orb,color) [400]
vht <int> Dct threshold for discarding nearby frame hashes (video) [8]
hwdec <list> Add hardware decoder <device-id>,family=<family>[,...] []
forkhw <bool> Run hardware decoders in a separate process (for buggy drivers/codecs) [false]
decthr <int> Max threads for a cpu video decoding job (0==auto) [0]
idxthr <int> Max threads for all jobs (0==auto) [0]
bsize <int> Size of database write batches [1024]
ljf <bool> Estimate job cost and process longest jobs first [true]
ignored <bool> Log all ignored files [false]
verbose <bool> Log links followed, all files queued for processing, etc [false]
dryrun <bool> Don't index any files, only show what changes would be made [false]
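
The "+" notation for <flags> values in both parameter sections is plain bitmask arithmetic. The Python sketch below reproduces the numeric values listed in this help. The function name parse_flags and the dictionaries are illustrative only, not part of cbird.

```python
# Flag values copied from the parameter tables in this help.
REFL = {"none": 0, "h": 1, "v": 2, "b": 4}
TYPES = {"i": 1, "v": 2, "a": 4}
ALGOS = {"dct": 1, "fdct": 2, "orb": 4, "color": 8, "video": 16}


def parse_flags(text, names):
    """Turn 'h+v+b' (or an already numeric string such as '7') into an int."""
    if text.isdigit():
        return int(text)
    value = 0
    for name in text.split("+"):
        value |= names[name]  # an unknown flag name raises KeyError
    return value


print(parse_flags("h+v+b", REFL))
print(parse_flags("i+v", TYPES))
print(parse_flags("dct+fdct+orb+color+video", ALGOS))
```

Under this reading, the default -i.algos value with every algorithm enabled is the sum of all five flags.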
Definitions
==============================================================================
<file> path to file
<dir> path to directory
<item> file in the index
<group> list of items, usually a needle/target and its matches
<result> list of groups, from a query or group-by
<selection> list of items for further operations
<selector> defines a set of (indexed) items by path, matching expression
:<regular-expression> - pcre, prefixed with colon
<glob pattern> - path in the index with [*|?] wildcards (escape with \\*|\\?)
<dir>|<file> - existing file or directory
@ - use the current selection
<type> item media type (1=image,2=video,3=audio)
<find> source expression for string find/replace
<regular-expression> - pcre with optional captures for <replace>
* - shortcut for entire string (^.*$)
<replace> destination expression for string find/replace
<template> - replace entire string, must contain at least one capture
<string> - replace whole-words/strings, may not contain any capture
#0 #1 .. #n - capture: the nth capture from <find>, #0 captures the entire string
%n - special: the sequence number, with automatic zero-padding
{arg:<func>} - special: transform arg (after capture/special expansion) with function(s)
{<prop>[#<func>]} - special: insert property
<glob> pattern for file path/name matching
:<regular-expression> - not a glob, but a pcre, prefixed by colon
path glob - case-sensitive path glob (see QRegularExpression::wildcardToRegularExpression())
<binop> binary operator for <expr>
==, = - equal to
!= - not equal to
<, <=, >, >= - less-than/greater-than
~ - contains
! - does not contain
<expr> expression to test a value, returns true or false, default operator ==
:<regular-expression> - pcre matches value
[<binop>]<string> - compare using operator, string is converted to value's type
[<binop>]%needle - compare with needle (first item in group)
%<binop><string> - compare with needle, absolute difference
<expr>&&<expr> - logical AND, if both expressions are true -> true
<expr>||<expr> - logical OR, one expression is true -> true
%null - true if value is null
!%null - true if value is not null
%empty - true if value is empty (after conversion to string)
!%empty - true if value is not empty (after conversion to string)
<prop> item property for sorting, grouping, filtering
id - unique id
isValid - 1 if id != 0
md5 - checksum
type - 1=image,2=video,3=audio
path - file path
parentPath - archivePath if archive, or dirPath
dirPath - parent directory path
relPath - relative file path to cwd
name - file name
modified - date/time file was modified
created - date/time file was created
completeBaseName - file name w/o suffix
archive - archive/zip path, or empty if non-archive
suffix - file suffix
isArchived - 1 if archive member
archiveCount - number of archive members
contentType - mime content type
width - pixel width
height - pixel height
aspectRatio - pixel width/height
resolution - width*height
res - max of width, height
compressionRatio - resolution / file size
fileSize - compressed file size (bytes)
isWeed - 1 if tagged as weed (after query)
score - match score
matchFlags - match flags (Media::matchFlags)
exif#<tag1[,tagN]> - comma-separated EXIF tags, first available tag is used ("Exif." prefix optional)
iptc#<tag1[,tagN]> - comma-separated IPTC tags, first available tag is used ("Iptc." prefix optional)
xmp#<tag1[,tagN]> - comma-separated XMP tags, first available tag is used ("Xmp." prefix optional)
ffmeta#<tag1[,tagN]> - comma-separated ffmpeg metadata tags, first available tag is used
text#<key> - loads image and returns QImage::text() value
<func> transform a property value or string
<func>#<func>[#<func>]... - chain of functions, separated by #
mid,from,len - substring from index (from) with length (len) (see: QString::mid)
trim - remove whitespace from beginning/end
upper - uppercase
lower - lowercase
title - capitalize first letter
date,<format-string> - parse value as date and format as string (see: QDateTime::toString())
year - shortcut for date,yyyy
month - shortcut for date,yyyy-MM
day - shortcut for date,yyyy-MM-dd
split,<regexp|string> - split into array with regexp or string
camelsplit - split into array on uppercase/lowercase (camelCase => [camel, Case])
join,<string> - join array with string
push,<string> - append string to end of array
pop - remove string from end of array
shift - remove string from front of array
peek - return string at given index
foreach,<func>[|<func>]... - apply function(s) separated by pipe (|) to each array element
add,<integer> - add integer arg to value
pad,<integer> - pad integer value with zeros, to width argument
to<type> - convert data type (date,time,string,bool,int,float)
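
A <func> chain is read left to right, each function receiving the output of the previous one. The Python sketch below imitates a handful of the functions listed above to show how a chain such as split,-#shift#join,_ would be evaluated. It is an approximation under stated assumptions: apply_chain is a hypothetical name, split accepts only a plain string (no regexp), and the behavior is modeled from the one-line descriptions rather than from cbird's source.

```python
def apply_chain(value, chain):
    """Apply '#'-separated functions, each written as name[,arg[,arg]]."""
    for func in chain.split("#"):
        name, *args = func.split(",")
        if name == "trim":
            value = value.strip()
        elif name == "upper":
            value = value.upper()
        elif name == "lower":
            value = value.lower()
        elif name == "title":
            value = value[:1].upper() + value[1:]
        elif name == "mid":
            start, length = int(args[0]), int(args[1])
            value = value[start:start + length]
        elif name == "split":
            value = value.split(args[0])  # plain string separator only
        elif name == "join":
            value = args[0].join(value)
        elif name == "pop":
            value = value[:-1]
        elif name == "shift":
            value = value[1:]
        elif name == "add":
            value = int(value) + int(args[0])
        elif name == "pad":
            value = str(value).zfill(int(args[0]))
        else:
            raise ValueError(f"function not covered by this sketch: {name}")
    return value


print(apply_chain("  IMG_0042.JPG ", "trim#lower"))
print(apply_chain("a-b-c", "split,-#shift#join,_"))
print(apply_chain("7", "add,1#pad,3"))
```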
Examples
==============================================================================
create index in cwd cbird -create -update
update index in ~/Pictures cbird -use ~/Pictures -update
find exact duplicates cbird -dups -show
find near duplicates cbird -similar -show
find near duplicates (video) cbird -update -p.alg video -p.dht 7 -p.vtrim 1000 -similar -show
group photo sets by month cbird -select-type 1 -group-by exif#Photo.DateTimeOriginal#month -folders -show
browse items, 16 per page cbird -select-all -max-per-page 16 -show
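
When driving cbird from a script, the ordering rules stated above (arguments are processed in order, -p.<key> must come before -similar, viewing options before -show) can be encoded once in a helper. The Python sketch below only builds the argument list and does not run anything. The function name similar_command and the index path are hypothetical.

```python
def similar_command(index_dir, params):
    """Build a cbird argument list with -p.<key> values ahead of the query."""
    args = ["cbird", "-use", index_dir]
    for key, value in params.items():
        args += [f"-p.{key}", str(value)]
    # The query and the viewer come last so the parameters above take effect.
    args += ["-similar", "-show"]
    return args


cmd = similar_command("/data/pictures", {"alg": "video", "dht": 7, "vtrim": 1000})
print(" ".join(cmd))
```

The list could then be handed to subprocess.run(cmd) on a machine where cbird is installed.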