Feature Proposal: Bring together the SEARCH visions
Motivation
There are far too many FeatureRequests for small parts of TOM and QUERY functionality that will be implemented by completing the rearchitecting we've been beavering away at.
Description and Documentation
Pluggable Query Parsers
Foswiki now has a sufficiently complete Query::Node structure to represent the legacy regex, word and keyword SEARCH types. This means we are ready to move away from having the user specify the code engine that does the search; instead, they specify a Parser that will be used to convert their entered query into a Query::Node.
Create a mapping mechanism so that any Extension can add a new Parser.
Parsers output Query::Nodes, and can define their own Node types, though that clearly requires more implementation work by the extension developer (e.g. a parser for SQL).
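The mapping mechanism described above could look roughly like the following. This is a minimal sketch in Python rather than Foswiki's Perl, and every name in it (`Node`, `PARSERS`, `register_parser`, `parse`) is hypothetical; it only illustrates the idea that the SEARCH type selects a parser and every parser returns the same kind of node tree.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for Query::Node: an operator plus its operands."""
    op: str
    params: List[object] = field(default_factory=list)


# Maps the parser name a user would give (e.g. type="word") to a parse function.
PARSERS: Dict[str, Callable[[str], Node]] = {}


def register_parser(name: str):
    """Decorator an extension could use to add a parser under a given name."""
    def decorator(fn: Callable[[str], Node]) -> Callable[[str], Node]:
        PARSERS[name] = fn
        return fn
    return decorator


@register_parser("word")
def parse_word(text: str) -> Node:
    # Assumption: a word search means every whitespace-separated word must match.
    terms = [Node("contains", [w]) for w in text.split()]
    if len(terms) == 1:
        return terms[0]
    return Node("and", terms)


@register_parser("regex")
def parse_regex(text: str) -> Node:
    # The whole input is treated as a single regular expression.
    return Node("match", [text])


def parse(search_type: str, text: str) -> Node:
    """Look up the parser for search_type and convert text into a Node tree."""
    if search_type not in PARSERS:
        raise ValueError(f"no parser registered for {search_type!r}")
    return PARSERS[search_type](text)
```

An extension adding, say, an SQL parser would register one more function and could return its own node types, at the cost of also supplying something able to evaluate them.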
Pluggable Evaluation Engines
To match these pluggable Parsers, we need to be able to register Evaluation engines for Nodes - with ways to prefer one engine over another (because we will have more than one able to serve the same result).
- ok, so this is hard, as it depends not just on what the operation is, but on its context: the grep algo may well be faster at regex matching the raw_text, but the BruteForce query engine is likely to be better at regex matching a particular formfield.
Custom Operations (i.e. non-core ones) will need to be in a different namespace to be obvious and debuggable.
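One way to express "prefer one engine over another, depending on context" is a registry keyed by operator and context, with a priority per engine. The sketch below is in Python and purely illustrative: the function names, the priority numbers and the `MyPlugin::soundex` operator are assumptions, not an existing Foswiki API.

```python
from typing import Dict, List, Tuple

# (operator, context) -> list of (priority, engine name); higher priority wins.
# The context "*" means "any context".
ENGINES: Dict[Tuple[str, str], List[Tuple[int, str]]] = {}


def register_engine(op: str, context: str, name: str, priority: int) -> None:
    """Declare that engine `name` can evaluate `op` in `context`."""
    ENGINES.setdefault((op, context), []).append((priority, name))


def pick_engine(op: str, context: str) -> str:
    """Return the highest-priority engine for this operator in this context."""
    candidates = ENGINES.get((op, context), []) + ENGINES.get((op, "*"), [])
    if not candidates:
        raise LookupError(f"no engine can evaluate {op!r} on {context!r}")
    return max(candidates, key=lambda c: c[0])[1]


# Hypothetical registrations mirroring the example in the text.
register_engine("match", "raw_text", "grep", 10)
register_engine("match", "*", "BruteForce", 5)
# A non-core operation lives in its own namespace so its origin is obvious.
register_engine("MyPlugin::soundex", "*", "MyPluginEngine", 1)
```

With these registrations, `pick_engine("match", "raw_text")` selects grep, while `pick_engine("match", "formfield")` falls back to BruteForce.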
Move Scope into the query.
In regex and word SEARCH types, we have scope=name,topic,all. These are redundant, and pretty misleading when you try to read them into query searches, where the same thing is expressed as:
- "text ~ '*Something*' OR topic ~ '*Something*'"
On the other hand, this current query implies a query over a predefined 'ResultSet' we can call webs, i.e. it is possible to consider the above query as shorthand for:
- "webs[text ~ '*Something*' OR topic ~ '*Something*']"
giving us the opportunity to define queries on other ResultSets, including non-topic based ones, e.g.:
- "PreviouslySavedResult[text ~ '*Something*']"
- "attachments[user='SvenDowideit']"
- log[action='save' AND '12/12/2005'<date<'12/12/2008']
- "webs[text ~ '*Something*' OR topic ~ '*Something*']"
- web='Sandbox' AND revisions[author='JoeBloggs'] - search for all revisions saved by 'JoeBloggs' in topics that are currently in the Sandbox web
- revisions[web='Sandbox' AND author='JoeBloggs'] - search for all revisions saved by 'JoeBloggs' in topics that were ever in the Sandbox web (much more computationally expensive)
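The "shorthand for webs[...]" reading can be sketched as a small normalisation step: if the whole query is wrapped in `Name[...]`, that name is the ResultSet; otherwise the query is taken to run against `webs`. The Python below is only an illustration under that assumption; `split_result_set` is a hypothetical helper, and it ignores brackets inside quoted strings, which a real parser would handle in its grammar.

```python
import re
from typing import Tuple


def split_result_set(query: str, default: str = "webs") -> Tuple[str, str]:
    """Return (result set name, constraint) for a query string.

    A query of the form Name[constraint] names its ResultSet explicitly;
    anything else is treated as a constraint on the default ResultSet.
    """
    q = query.strip()
    m = re.match(r"([A-Za-z_]\w*)\[", q)
    if m and q.endswith("]"):
        # Check that the opening bracket is closed by the very last character,
        # so that "a[x] OR b[y]" is not mistaken for a single wrapped query.
        depth = 0
        for i in range(m.end() - 1, len(q)):
            if q[i] == "[":
                depth += 1
            elif q[i] == "]":
                depth -= 1
                if depth == 0:
                    if i == len(q) - 1:
                        return m.group(1), q[m.end():-1].strip()
                    break
    return default, q
```

Under this sketch, `web='Sandbox' AND revisions[author='JoeBloggs']` stays a constraint on `webs` with a nested `revisions` lookup, while `revisions[web='Sandbox' AND author='JoeBloggs']` is a query over `revisions` itself, which matches the two readings given in the list.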
complete the ResultSets abstraction
ResultSets become ordered lists of TOM addresses, which can be linked directly to the Object Cache.
ResultSets themselves don't contain Objects, but rather links to Objects, so that the lifetime of parsed or complexly retrieved information is independent of possibly short-lived lists.
At the same time, ResultSets become save/name-able and reusable.
eg
%SEARCH{type="query" "saveme[constraint]"=
Additionally, built-in ResultSets will need to be able to be defined by developers in code:
- log[action='save' AND '12/12/2005'<date<'12/12/2008']
- tags[name='silly']
- SQLDB1[SELECT * FROM evil LIMIT 12]
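The ResultSet ideas above (an ordered list of addresses, objects held in a separate cache, saved names, and built-in sets defined in code) might fit together as in the following Python sketch. All names here are hypothetical, and the `Web.Topic` address strings are only an assumed address format.

```python
from typing import Callable, Dict, Iterable, Iterator, List

OBJECT_CACHE: Dict[str, dict] = {}  # address -> loaded object, outlives any list
LOADS: List[str] = []               # records loader calls, for demonstration only


def load_object(address: str) -> dict:
    """Hypothetical loader; a real one would read and parse from the store."""
    LOADS.append(address)
    return {"address": address}


class ResultSet:
    """An ordered list of addresses; objects are fetched through the cache."""

    def __init__(self, addresses: Iterable[str]):
        self.addresses = list(addresses)

    def __iter__(self) -> Iterator[dict]:
        for address in self.addresses:
            if address not in OBJECT_CACHE:
                OBJECT_CACHE[address] = load_object(address)
            yield OBJECT_CACHE[address]


# Name -> factory returning a ResultSet. Saved and built-in sets share one table.
NAMED: Dict[str, Callable[[], ResultSet]] = {}


def save_result_set(name: str, rs: ResultSet) -> None:
    """Keep a computed ResultSet under a name so later queries can reuse it."""
    NAMED[name] = lambda: rs


def register_builtin(name: str, factory: Callable[[], ResultSet]) -> None:
    """Let code define a ResultSet (e.g. log, tags) that is built on demand."""
    NAMED[name] = factory
```

Because the list holds only addresses, discarding a ResultSet does not discard the parsed objects, and iterating the same set twice loads each object once.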
revision specific TOM address
We're still pondering the right way for a user to be able to see and specify a particular revision of an object; some discussion can be found at QueryAcrossTopicRevisions.
Once you have a set of results of 'TOM' addresses / objects, you need a way to format them, which is where the Pluggable FORMAT engine I began to migrate out of SEARCH comes in. Right now it's biased towards different MACROs being able to supply functions to call when a particular $format operator is found in the format string.
This needs to be extended to support the 'Nodes' in the ResultSet.
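The "supply functions to call when a particular $format operator is found" idea can be sketched as a token registry. This is a Python illustration, not the existing Foswiki formatter; `FORMATTERS`, `register_format_token` and `render` are hypothetical names, and the item is assumed to be a plain dictionary.

```python
import re
from typing import Callable, Dict

# Token name (without the leading $) -> function producing its replacement.
FORMATTERS: Dict[str, Callable[[dict], str]] = {}


def register_format_token(name: str, fn: Callable[[dict], str]) -> None:
    """Let a macro or extension supply the function for one $token."""
    FORMATTERS[name] = fn


def render(format_string: str, item: dict) -> str:
    """Expand every registered $token in format_string for one result item."""
    def substitute(match: "re.Match[str]") -> str:
        fn = FORMATTERS.get(match.group(1))
        # Unknown tokens are left untouched so another pass can handle them.
        return fn(item) if fn else match.group(0)
    return re.sub(r"\$(\w+)", substitute, format_string)


register_format_token("topic", lambda item: item["topic"])
register_format_token("author", lambda item: item["author"])
```

Extending this to the Nodes in a ResultSet would mostly mean deciding what `item` is for each kind of node (topic, revision, attachment, log entry) and which tokens apply to it.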
add grouping and filtering operators
For many of these Queries, you may want only the first() and last(), or only a unique(), e.g.
- list all the unique names of topics a particular user has ever saved - unique(name, revisions[author='JoeBloggs'])
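As a rough illustration of what these operators would do to an ordered result list, here is a Python sketch. The function signatures are assumptions, and each result is modelled as a plain dictionary.

```python
from typing import Dict, List


def first(results: List[Dict]) -> List[Dict]:
    """Keep only the first result (an empty list stays empty)."""
    return results[:1]


def last(results: List[Dict]) -> List[Dict]:
    """Keep only the last result (an empty list stays empty)."""
    return results[-1:]


def unique(key: str, results: List[Dict]) -> List[object]:
    """Return the distinct values of `key`, in order of first appearance."""
    seen = set()
    values = []
    for result in results:
        value = result[key]
        if value not in seen:
            seen.add(value)
            values.append(value)
    return values


# Hypothetical stand-in for revisions[author='JoeBloggs'].
revisions = [
    {"name": "WebHome", "rev": 1},
    {"name": "TestTopic", "rev": 1},
    {"name": "WebHome", "rev": 2},
]
```

Here `unique("name", revisions)` yields each topic name once, however many revisions the user saved.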
Changes intended:
- convert the Search::Parser for regex, word and keyword SEARCH into a proper parser that outputs Query::Nodes - removing the Search::Node placeholder
- add Configure support, and Foswiki::Search support, for a hash of parsers: $Foswiki::cfg{QueryParsers}{query} = "Foswiki::Query::Parser"; etc. - adding a new parser is a single line in a Config.spec
- design a mapping from Query::Nodes to evaluation engines.
.... more, I lost track
Impact
Implementation
- Contributors: Lots and lots of people, as this dates back to pre-dawn times.
Discussion
Gosh, that is a lot of work.
--
SvenDowideit - 16 Oct 2010
We shall conquer the world
As you know, I agree 100% with the vision. And I think pluggable Query + Eval engines are very interesting, I just wonder if they need to be delivered with 2.0? Perhaps if we get some experience with ResultSets and pluggable formatters (and the rest), we would be in a really solid position to introduce that stuff (and pluggable ResultSet implementations) in a 2.1 release. As I assume 'pluggable' also means publishing the API.
I am just nervous after the enormous dev cycle we had for 1.1, and the multi-month freezes of trunk... I am confident that 6-months between feature freeze would give us better quality releases.
Looking forward to helping break & un-break trunk
--
PaulHarvey - 16 Oct 2010
yes, I'm not expecting that it is all done by 2.0 - I wanted to try to tie it all together, so we can figure out what we need now, what is actually redundant, and all move in the same direction.
--
SvenDowideit - 16 Oct 2010
Removing myself as committed developer - I never committed to this, as I recall, though I do support it.
--
CrawfordCurrie - 24 Feb 2012
Setting to parked. Developers no longer active.