FoxTrot Query Syntax
A FoxTrot query is composed of one or more words. Case and accents are ignored. Punctuation is also ignored, except in the following cases:
- Wildcards
- words ending with an asterisk will match all words with this prefix (for example, word* will match word, words, or wordy).
- words starting with an asterisk will match all words with this suffix (for example, *ping will match ping, jumping, or dumping).
- words enclosed by asterisks will match all words containing these characters (for example, *box* will match box, boxer, shoebox, or shoeboxes).
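The wildcard rules above can be modeled in a few lines of Python. This is only an illustrative sketch of the matching behaviour described here, not FoxTrot's implementation; the helper names `fold` and `wildcard_matches` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase and strip accents, since queries ignore case and accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def wildcard_matches(term: str, word: str) -> bool:
    """Return True if `word` matches a query term with optional '*' at either end."""
    term, word = fold(term), fold(word)
    leading = term.startswith("*")
    trailing = term.endswith("*") and len(term) > 1
    core = term.strip("*")
    if leading and trailing:
        return core in word           # *box* : contains
    if leading:
        return word.endswith(core)    # *ping : suffix
    if trailing:
        return word.startswith(core)  # word* : prefix
    return word == core               # plain word : whole-word match


print(wildcard_matches("word*", "wordy"))      # prefix search
print(wildcard_matches("*box*", "shoeboxes"))  # substring search
```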
- Excluded words
- words starting with a minus sign are exclusion words. For example, michigan -lake will find all documents containing michigan but not containing lake. Note that a minus sign inside a compound word is treated as a normal word separator. For example, re-open is treated as two words: re open.
- Quoted strings
- use quoted strings to search for a sequence of words. For example, "lake michigan" will find lake michigan but will not find a small lake in michigan.
- quoted strings prefixed with a minus sign are exclusion phrases. For example michigan -"lake michigan" will find all documents containing michigan but not containing the expression lake michigan (whether they contain the single word lake or not).
- you can add excluded words at the beginning or at the end of a quoted string, to find documents in which this quoted string occurs without being directly preceded or followed by these excluded words.
For example, "john -doe" will find documents that have at least one occurrence of john that is not part of the string john doe; but doe or even john doe can occur somewhere else in the document (for example, it will find john smith or john smith meets bob doe or even john smith meets john doe, but it will not find just john doe).
Another example: "-john -bob doe" will find documents that contain the word doe where it is part of neither john doe nor bob doe (for example, it will find greg doe or greg doe meets john smith, but it will not find just john doe or bob doe).
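The behaviour of excluded words inside a quoted string can be sketched as follows. This is a simplified model written for illustration only: it assumes that an excluded word only matters when it directly precedes or follows the phrase, and the function names are hypothetical, not part of FoxTrot.

```python
import re


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def find_phrase(phrase, text):
    """Model a quoted string with optional '-word' exclusions at either end."""
    parts = phrase.lower().split()
    start = 0
    while start < len(parts) and parts[start].startswith("-"):
        start += 1
    end = len(parts)
    while end > start and parts[end - 1].startswith("-"):
        end -= 1
    excluded_before = {p[1:] for p in parts[:start]}
    excluded_after = {p[1:] for p in parts[end:]}
    core = parts[start:end]
    if not core:
        return False  # a phrase made only of exclusions matches nothing here
    words = tokenize(text)
    n = len(core)
    for i in range(len(words) - n + 1):
        if words[i:i + n] != core:
            continue
        previous = words[i - 1] if i > 0 else None
        following = words[i + n] if i + n < len(words) else None
        if previous in excluded_before or following in excluded_after:
            continue  # this occurrence touches an excluded word; try the next one
        return True
    return False


print(find_phrase("john -doe", "john smith meets john doe"))
print(find_phrase("john -doe", "john doe"))
```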
- Proximity searches
- FoxTrot gives a higher rank to documents in which the searched words are close to each other. However, if you want to find only the documents that contain the specified words within a given proximity range, you can use a quoted string, and specify the range (the maximum number of other words between the searched words) between braces immediately after the opening quote. For example, "{2} bob greg john" will find documents that have at least one occurrence of bob that is at most 2 words away from occurrences of greg and john. It will find bob, john and greg are friends or greg, john and bob are friends, but it will not find bob and john are friends of greg.
- you can also search for documents containing multiple quoted strings in a specified proximity range, by specifying the range (the maximum number of other words between the searched strings) between braces at the beginning of the query. For example, {4} "john doe" "bob smith" will find john doe is a friend of bob smith, but it will not find john doe is one of the best friends of bob smith.
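One possible reading of the proximity rules is sketched below. It assumes that the range is measured from an occurrence of the first item to an occurrence of each other item, counting the words in between; this interpretation is consistent with the examples above but is an assumption, and the function names are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def occurrences(tokens, phrase):
    """Start positions at which the word sequence `phrase` occurs in `tokens`."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def gap(start_a, len_a, start_b, len_b):
    """Number of words strictly between two occurrences (negative if they overlap)."""
    if start_b >= start_a:
        return start_b - (start_a + len_a)
    return start_a - (start_b + len_b)


def within_range(max_gap, items, text):
    """True if some occurrence of the first item has every other item within max_gap words."""
    tokens = tokenize(text)
    phrases = [tokenize(item) for item in items]
    first, others = phrases[0], phrases[1:]
    for a in occurrences(tokens, first):
        if all(
            any(0 <= gap(a, len(first), b, len(p)) <= max_gap
                for b in occurrences(tokens, p))
            for p in others
        ):
            return True
    return False


# Models "{2} bob greg john" and {4} "john doe" "bob smith"
print(within_range(2, ["bob", "greg", "john"], "bob, john and greg are friends"))
print(within_range(4, ["john doe", "bob smith"], "john doe is a friend of bob smith"))
```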
- Boolean operator
- use the | character (vertical bar) to combine two (or more) words with an OR. For example, washington | boston will find documents that contain either washington or boston. You can also combine quoted strings, for example washington | boston | "new york" | "san francisco".
- Exact strings
- note: exact string searches are quite slow, and very slow when searching for very frequent words. Use them with caution.
- strings enclosed between circumflex accents (^) are searched as exact strings: case, accents, spaces, punctuation, everything matters.
- note that the searched string must contain at least one whole word (composed of alphanumerical characters), and that partial words or wildcards are not handled inside an exact string. For example, searching for ^:-)^ is not allowed (but searching for ^hello :-)^ is); searching for ^ping^ will not find jumping; and searching for ^*ping^ will only find the word ping preceded by an asterisk character.
- exact string searches can be made insensitive to case, accents, etc. To do so, specify a set of flags between braces, after the leading circumflex accent (for example, ^{cd}“jérôme”^ is case and diacritics insensitive, and thus will find “Jerome” or “JÉRÔME”, but will not find «jérôme», which uses different quotes).
- case: the c flag is for case insensitivity; ^{c}jérôme^ will also find JÉRÔME.
- diacritics: the d or a flag is for diacritics (accents) insensitivity; ^{d}jérôme^ will also find jerome.
- punctuation: the p flag is for punctuation insensitivity; all punctuation characters are ignored; ^{p}quick brown fox^ will also find quick, brown, fox.
- blanks: the b flag is for blanks insensitivity; all blank characters are ignored, like spaces, tabs, line feeds, and other control characters; ^{b}quick brown fox^ will also find quick brown fox written with different blanks between the words (for example, several spaces, a tab, or a line feed). However, note that ignoring blanks (or punctuation, or marks and symbols) does not make it possible to find merged or split words; for example, ^{b}re open^ will not find reopen, and ^{bp}reopen^ will not find re open or re-open. But ^{bp}re open^ will find both re open and re-open.
- marks and symbols: the m or s flag will ignore all mark and symbol characters (as defined by the Unicode standard); ^{s}price $2000^ will also find price ©2000 or price 2000.
- composition: the k flag will ignore all character composition differences; ^{k}daemon…^ will also find dæmon...
- if the exact string you want to search for starts with an open brace, then you need to specify a (possibly empty) set of sensitivity flags, for example ^{}{NULL}^ will find {NULL}; note that ^{NULL}^ is an invalid query.
- if you want to search for a partial word, or for a string that does not contain any word, then you should enclose the string between “double vertical line” characters (U+2016: ‖), instead of circumflex accents. However, you can only do this when also searching for some words. For example, only searching for ‖:-)‖ is not allowed (but searching for hello ‖:-)‖ is); searching for dog ‖ping‖ will find jumping dog. You can also use the options between braces: jerome ‖{c}jérôme‖ will find both Jérôme and JÉRÔME, but not Jerome.
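The effect of the c and d flags can be approximated with Unicode normalization. The sketch below is a deliberately reduced model: it covers only the c and d (or a) flags, it assumes that an exact string must not be glued to surrounding letters or digits, and the names `normalize` and `exact_find` are hypothetical. The p, b, m/s and k flags are not modeled.

```python
import re
import unicodedata


def normalize(text, flags):
    """Apply the subset of flags modeled here: 'c' (case) and 'd'/'a' (diacritics)."""
    # Assumption of this sketch: compare in NFC so that results do not depend
    # on how the accented characters happen to be encoded.
    text = unicodedata.normalize("NFC", text)
    if "d" in flags or "a" in flags:
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if "c" in flags:
        text = text.casefold()
    return text


def exact_find(needle, haystack, flags=""):
    """True if `needle` occurs in `haystack` and is not part of a longer word."""
    pattern = r"(?<!\w)" + re.escape(normalize(needle, flags)) + r"(?!\w)"
    return re.search(pattern, normalize(haystack, flags)) is not None


# Models ^{cd}“jérôme”^ against text using the same and different quotes
print(exact_find("“jérôme”", "signed “JÉRÔME” here", "cd"))
print(exact_find("“jérôme”", "signed «jérôme» here", "cd"))
```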
- Regular expressions (also known as “regex” or “grep”)
- note: regular expression searches are not available in FoxTrot Personal Search.
- note: regular expression searches can be intricate. Hence, you may only use them in addition to other criteria, to reduce the number of possible matches.
- strings enclosed between grave accents (`) are searched as regular expressions.
- here are a few examples:
- bob `(?i)[\w\.-]+@[\da-z\.-]+\.[a-z]{2,6}` will find any document containing both bob and an email address.
- bob `\b\d+/\d+/\d+\b` will find any document containing both bob and a weakly defined date format, with any number of digits (\b anchors the search to word boundaries).
- bob `\b\d\d/\d\d/\d\d\d\d\b` searches for strictly defined date format (mm/dd/yyyy or dd/mm/yyyy).
- bob `\b\d?\d/\d?\d/(\d\d)?\d\d\b` searches for a date format with 1 or 2 digits for the day and the month, and 2 or 4 digits for the year ({m}m/{d}d/{yy}yy or {d}d/{m}m/{yy}yy).
- bob `\b(((0?[7-9]|1[0-2])/\d?\d/(20)?13)|(0?[1-6]/\d?\d/(20)?14))\b` searches for a date range (between {0}7/{d}d/{20}13 and {0}6/{d}d/{20}14).
- regular expressions are case sensitive; prefix them with (?i) for case insensitivity.
- the dot (.) character matches any character except newline. To match any character including newline, prefix the expression with (?s).
- \w and \d match any ASCII alphanumeric or numeric character, respectively. To match any Unicode alphanumeric or numeric character, prefix the expression with (*UCP).
- if the regular expression is preceded by a minus sign, FoxTrot will find documents that do not contain the regular expression. martin -`\b\d\d/\d\d/197\d\b` will find any document containing martin and no date in the 1970s.
- FoxTrot uses PCRE2 for searching regular expressions. For more information: regular-expressions.info, rexegg.com, PCRE syntax, PCRE2 pattern, regex101.com, regextester.com.
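The date patterns above can be tried out before using them in a query. The sketch below uses Python's `re` module as a stand-in, which is an assumption: FoxTrot uses PCRE2, and the two engines differ in places (for instance, Python does not support the (*UCP) prefix, so `re.ASCII` is used here to keep \d and \w ASCII-only). The helper functions are hypothetical and only mimic "word plus regular expression" and "word minus regular expression" queries.

```python
import re

# Patterns copied from the examples above; re.ASCII keeps \d, \w and \b ASCII-only.
STRICT_DATE = re.compile(r"\b\d\d/\d\d/\d\d\d\d\b", re.ASCII)
FLEXIBLE_DATE = re.compile(r"\b\d?\d/\d?\d/(\d\d)?\d\d\b", re.ASCII)
DATE_RANGE = re.compile(
    r"\b(((0?[7-9]|1[0-2])/\d?\d/(20)?13)|(0?[1-6]/\d?\d/(20)?14))\b", re.ASCII
)
SEVENTIES = re.compile(r"\b\d\d/\d\d/197\d\b", re.ASCII)


def has_word(word, text):
    """Case-insensitive whole-word test, standing in for a plain query word."""
    return word.lower() in re.findall(r"\w+", text.lower())


def word_and_pattern(word, pattern, text):
    """Models: word `pattern` (both must be present)."""
    return has_word(word, text) and pattern.search(text) is not None


def word_without_pattern(word, pattern, text):
    """Models: word -`pattern` (the pattern must be absent)."""
    return has_word(word, text) and pattern.search(text) is None


print(word_and_pattern("bob", FLEXIBLE_DATE, "Bob was born on 1/2/13."))
print(word_without_pattern("martin", SEVENTIES, "Martin, 05/06/1975"))
```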
You can combine several special characters in the same query. Here are a few examples:
- "john doe" "bob smith" will find john doe meets bob smith, but not john smith meets bob doe
- restaurant chinese | vietnamese | korean boston | washington | "new york" will find a chinese restaurant in Washington as well as a vietnamese restaurant near New York. Note that the | operator has precedence, i.e. this query is evaluated like restaurant ( chinese | vietnamese | korean ) ( boston | washington | "new york" ).
- fox* -fox -foxtrot will find all documents containing a word that starts with fox but without any occurrence of fox (as a full word) or foxtrot.
- *box* -*box will find all documents containing a word that contains box but without any occurrence of a word ending with box. For example, it could find a document that contains boxer or shoeboxes but it will not find one that contains shoebox.
- "www.ctmdev.com" (or "www ctmdev com") will find www.ctmdev.com (as well as www ctmdev com or www+ctmdev/com). Note that a dot inside a word is treated as a word separator, so www.ctmdev.com (without quotes) will also find information about ctmdev at www.somewhere-else.com.
- "big car*" "new york" -show* will find all documents that contain big car or big cars, that also contain new york, but that do not contain show or shows.
- "-pierre dupont" | "-jean pierre dupont" will find documents that contain dupont (but ignoring pierre dupont) or that contain pierre dupont (but ignoring jean pierre dupont); in other words, it will find all occurrences of dupont (but ignoring jean pierre dupont).
- note that a regular expression may contain the | alternation operator, but you can’t combine a regular expression with another one, or with another FoxTrot query expression, with the boolean | operator. Thus, bob `(smith|doe)` is a valid FoxTrot query, but bob `smith.*` | `doe.*` or bob | `smith.*` are not valid.
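The precedence of the | operator over the implicit AND between terms can be illustrated by grouping a query into OR-groups that must all be satisfied. This is a minimal sketch under simplifying assumptions: it handles only plain words, quoted strings and |, ignores wildcards, exclusions and regular expressions, and the function names are hypothetical.

```python
import re

# A token is a quoted string, a vertical bar, or a run of other non-blank characters.
TOKEN = re.compile(r'"[^"]*"|\||[^\s|"]+')


def group_query(query):
    """Group terms so that '|' binds tighter than the implicit AND between terms."""
    groups = []
    join_next = False
    for token in TOKEN.findall(query):
        if token == "|":
            join_next = True
            continue
        term = token.strip('"')
        if join_next and groups:
            groups[-1].append(term)  # OR with the previous term
        else:
            groups.append([term])    # start a new AND-ed group
        join_next = False
    return groups


def matches(query, text):
    """True if every group has at least one term present in the text as whole words."""
    padded = " " + " ".join(re.findall(r"\w+", text.lower())) + " "
    return all(
        any(f" {term.lower()} " in padded for term in group)
        for group in group_query(query)
    )


QUERY = 'restaurant chinese | vietnamese | korean boston | washington | "new york"'
print(group_query(QUERY))
print(matches(QUERY, "a vietnamese restaurant near New York"))
```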
Asian text: Chinese, Japanese and Korean text is searched as an exact sequence of characters, unless it contains spaces (or other punctuation characters) to delimit groups of characters. For example, searching 田中です will only find this exact string, but searching 田中 です will also find 田中はこの人です.