Feature Proposal: Provide a new comparitor to aid queries against fields which contain list data

Motivation

At the moment, if a form field contains a comma separated list of data, the only way to construct a query to test if the list contains a particular element is as follows:

"SomeField~'*SomeMemberElement*'"

But this also matches SomeMemberElements and ThenSomeMemberElement

Description and Documentation

Instead, we need a new comparator. We have =, ~, != and !~. Propose that we also add the following to represent "contains":

:= and !:=, Eg: "SomeField:='SomeMemberElement' AND SomeField!:='ThenSomeMemberElement'"

or (alternative):

$ and !$, Eg: "SomeField$'SomeMemberElement' AND SomeField!$'ThenSomeMemberElement'"

Problems:

  • Confine this to comma separated lists only?
    • Yes. Eventual mapping to structured/indexed data stores/databases requires us to limit what can be considered a list. We could support space separated lists using the $ if anybody thinks it's valuable, please discuss.

Impact

%WHATDOESITAFFECT%
edit

Implementation

-- Contributors: PaulHarvey - 25 Jan 2010

Discussion

We've also got =~ . So a contains can be expressed using a regular expression like "SomeField =~ '\bSomeMemberElement\b'"

which comes pretty close to contains.

-- MichaelDaum - 25 Jan 2010

Note that the =~ regexp syntax is really powerful, but is a complete bastard to map to SQL implementations that don't support regex searching. In that case you have no choice but to hoist the regex for late matching in the result set generated by the rest of the expression (or fall back to exhaustive eval). However SQL doesn't support an IN operator that matches the content of string values, either (SQL IN requires a list of values on the RHS, not a comma-separated string).

-- CrawfordCurrie - 25 Jan 2010

POSIX 1003.2 regular expressions are quite well supported among databases. Can't find a database that does not have them.

-- MichaelDaum - 25 Jan 2010

It should be reasonably easy to map a comma separated list:

:='SomeMemberElement, ThenSomeMemberElement, foo, bar'

to

IN ('SomeMemberElement', 'ThenSomeMemberElement', 'foo', 'bar')

... right?

Parking until we have a DB back-end to justify this.

-- PaulHarvey - 26 Jan 2010

You're missing the point about the lists. Yes, there's an IN operator that operates over lists in the DB, but you don't have a list, you have a comma-separated string (CHAR(*)). To map to a list you would have to map string values to lists in the DB when the DB is populated.

-- CrawfordCurrie - 26 Jan 2010

This is appropriate for DB back-ends. Unparking and placing on the list for 2.0

-- CrawfordCurrie - 17 Feb 2012

Michael's earlier comment about regex support is correct, but I've found that even with MongoDB, full regex-scans of many 10s of thousands of topics becomes enormously sluggish (several seconds).

My work-around has been ListFormfieldPlugin, which detects a dataform field type of list+foo (example, list+textboxlist) on save, and creates %META:LISTITEM{name=".." key="FormfieldName" value="item"}%, which I can set indexes for with MongoDBPlugin

Sadly, now, to do an indexed query for list-type formfields named 'List' which contain a 'foo' item now looks like %SEARCH{"listitems[key='List' AND value='foo']"}%

But, it's workable, and means we can do some very interesting (and horribly inefficient) nested searches which were previously almost timing-out under mongo, that now render in ~3s (we aim for < 1s average page render times).

Related: AllowTypedData

-- PaulHarvey - 17 Feb 2012

Changing to Parked, needs a developer to adopt.

-- Main.GeorgeClark - 19 Nov 2015 - 22:13

 
Topic revision: r9 - 19 Nov 2015, GeorgeClark
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere. See Copyright Statement. Creative Commons License    Legal Imprint    Privacy Policy