Steven Erat's Blog Steven Erat Photography
 
 
 
 

TalkingTree: Ray Camden - Verity for Fast Text Searching

 

Overview of some hidden gems in CFMX 7. Demonstration of isValid("url", form.yoururl). Read the release notes... for example, you can now change web service SOAP headers.

What is Verity? An incredibly powerful search engine. The version in CF was at 2.6 for a long time, but CFMX 7 upgrades Verity to 5.5. It is a separate product: you can go out and buy the full Verity K2 server from Verity for perhaps $60,000+, but you get it with ColdFusion MX Enterprise.

-Overview of Verity tags in CF, such as cfcollection, cfindex, cfsearch. File types include HTML, RTF, DOC, XLS, PDF, MP3, ZIP, and data files.

-Question about version of PDF support in CFMX 7 from Damon Gentry... CFMX 7 does support recent versions of Acrobat PDF files, see the docs for a table of file version support.

-Question about time of indexing for large collections.... Ray says indexing does take time although he hasn't seen any problems with it so far, and he thinks Verity 5.5 has improved.

Text index searches are faster than SQL calls to databases. Verity also ranks the query result sets from a cfsearch, and you get a context (category) which is new in CFMX 7.

Demo of creating a collection and indexing it. The cfcollection tag has a list action to see what collections exist, so he runs that, and if the collection is not there he creates a new collection of that name.

Then if any data is in a pre-existing collection, he runs a cfindex purge to clear that out, then an update to add new records.
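
The create-if-missing, purge, then update sequence from the demo can be sketched roughly as below. This is a Python stand-in rather than CFML, and InMemoryIndex and rebuild are hypothetical names invented for illustration; they only mimic the order of the cfcollection and cfindex steps described above, not the real tags.

```python
from typing import Dict, Iterable, List, Tuple


class InMemoryIndex:
    """Hypothetical stand-in for a search engine that holds named collections."""

    def __init__(self) -> None:
        self.collections: Dict[str, Dict[str, str]] = {}

    def list_collections(self) -> List[str]:
        return sorted(self.collections)

    def create(self, name: str) -> None:
        self.collections[name] = {}

    def purge(self, name: str) -> None:
        # Drop every indexed record but keep the collection itself.
        self.collections[name].clear()

    def update(self, name: str, rows: Iterable[Tuple[str, str]]) -> None:
        # Add new records, or overwrite records that share a key.
        for key, body in rows:
            self.collections[name][key] = body


def rebuild(index: InMemoryIndex, name: str, rows: Iterable[Tuple[str, str]]) -> None:
    """List, create if missing, purge old data, then update with fresh rows."""
    if name not in index.list_collections():
        index.create(name)
    index.purge(name)
    index.update(name, rows)
```

Calling rebuild twice with different rows leaves only the second set in the collection, which is the point of purging before the update.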

Now use cfsearch to test the collection from a web form. Ooops...! A glitch, 0 records returned.... Ray takes a moment to reindex, in about 20 seconds.

Don't have to index all documents at one time. Can break it up to index a fraction of the total collection at a time with the update action, and schedule that during low traffic times.
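
One way to picture the "fraction at a time" idea is to split the record keys into fixed-size batches and hand one batch to each scheduled update run. A minimal Python sketch, not CFML; the batches helper and the batch size are assumptions made for illustration.

```python
from typing import Iterator, List, Sequence


def batches(keys: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `keys`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(keys), size):
        yield list(keys[start:start + size])
```

Each yielded batch would be one update pass, run during a low-traffic window.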

-Question: If they index this way, can they defeat the Verity License Agreement in CFMX (250,000 docs in Enterprise)? Ray didn't know, but an informational message is logged stating the license has been exceeded.

Discussion of the new Verity status result that has extra information about Verity operations.... Discussion of if purging/adding works well, Ray says yes.

Demo of a press release CFC which encapsulates behaviors such as adding, deleting, and editing a press release. He converts this CFC to work with Verity: in the addPressRelease function, for example, in addition to the SQL that enters the record, cfindex is called so that just the new row is indexed into the Verity collection. By synchronizing SQL operations with Verity operations, the collection always matches what's in the database.
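
The pattern of pairing every database write with an index write can be sketched like this. It is a Python stand-in, not Ray's CFC: the PressReleases class, the press_releases table, and the plain dict used as the "index" are all hypothetical, and sqlite3 stands in for whatever database the demo used.

```python
import sqlite3
from typing import Dict


class PressReleases:
    """Keeps a database table and a search index in step on every write."""

    def __init__(self, db: sqlite3.Connection, index: Dict[int, str]) -> None:
        self.db = db
        self.index = index  # hypothetical stand-in for the search collection
        db.execute(
            "CREATE TABLE IF NOT EXISTS press_releases "
            "(id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
        )

    def add(self, title: str, body: str) -> int:
        cur = self.db.execute(
            "INSERT INTO press_releases (title, body) VALUES (?, ?)",
            (title, body),
        )
        self.db.commit()
        key = cur.lastrowid
        # Index only the new row instead of rebuilding the whole collection.
        self.index[key] = f"{title} {body}".lower()
        return key

    def delete(self, key: int) -> None:
        self.db.execute("DELETE FROM press_releases WHERE id = ?", (key,))
        self.db.commit()
        # Remove the matching index entry so the two stores never drift apart.
        self.index.pop(key, None)
```

Because the index write lives inside the same method as the SQL, callers cannot update one without the other.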

-Question about Verity storing folder info in file-based indexing. Ray says yes, you can extract info on folders from search results.

-Question about how to restrict Verity search results based on login authentication. Ray says yes you can: you have custom fields to take advantage of, so you can add a members-only key as a custom field and filter on that, but in CFMX 7 you can now also use the category metadata to run filters against.
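
The custom-field approach amounts to tagging restricted records at index time and dropping them from results for visitors who are not logged in. A small Python sketch of that filter, not CFML; the custom1 key and the "members" marker value are assumptions chosen for illustration.

```python
from typing import Dict, List


def visible_results(results: List[Dict[str, str]], is_member: bool) -> List[Dict[str, str]]:
    """Return all results for members, and only unrestricted ones otherwise."""
    if is_member:
        return list(results)
    # "members" is a hypothetical marker stored in a custom field at index time.
    return [row for row in results if row.get("custom1") != "members"]
```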

How to handle multilingual collections... can't hear question.... Tom Jordahl speaks up... there is an "any" locale (??.. can't hear).

Default Simple Search Rules: "dark side" will only match "dark side", but "wookies,jawas,lions" will match any term in that list. Stemming is done by default to match instruction, instructions, instructional, etc... If you send a mixed case search string then the search WILL be CASE SENSITIVE. Best to always lowercase search criteria for maximal matching. Recommendation to parse criteria from form input before passing to cfsearch, such as removing unneeded booleans.
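
The clean-up step recommended above might look something like this. It is a Python sketch rather than CFML, and clean_criteria is a hypothetical helper; it lowercases the input, collapses whitespace, and strips boolean operators left dangling at either end, which is only one reading of "unneeded booleans".

```python
def clean_criteria(raw: str) -> str:
    """Lowercase form input and strip dangling boolean operators."""
    words = raw.lower().split()
    # A leading AND/OR has nothing to its left, so it is dropped.
    while words and words[0] in {"and", "or"}:
        words.pop(0)
    # A trailing AND/OR/NOT has nothing to its right, so it is dropped too.
    while words and words[-1] in {"and", "or", "not"}:
        words.pop()
    return " ".join(words)
```

The cleaned string is what would then be passed on as the search criteria.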

-Question - couldn't hear, but Tom Jordahl threw out an answer from the back of the crowd. Nice to have *the* CF Verity engineer in the room :)

New in CFMX 7... Categories and trees. CategoryTree: sports/american, or category: football. There is also an Admin API to get available categories. Use the cfcollection attribute categories="yes" when collection is first built, and categoryTree and category attributes of cfindex while building the collection. Demo of how categories work by combining a form field for search criteria with another field for category.

-Question: Can categories be updated? Yes.

Sexiest new feature is Suggestions. This is the "Did you mean?..." functionality that gives the user clues about similar searches that might return something useful. Demo by misspelling the word email as "emil", and Verity returns with "Did you mean email?". Use suggestions=yes; suggestions are dependent on the language that the collection is built in. Can adjust when suggestions are offered, such as if fewer than 5 results come back, and can adjust how many suggestions are offered. Ray says this is brain-dead simple and very powerful, so it's a great feature that you should consider always using.
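
To make the "only suggest when few results come back" rule concrete, here is a rough Python sketch. It is not how Verity computes suggestions: it swaps in difflib's fuzzy matching from the Python standard library, and the suggest function, the vocabulary list, and the default thresholds are all assumptions.

```python
from difflib import get_close_matches
from typing import List


def suggest(term: str, vocabulary: List[str], result_count: int,
            threshold: int = 5, max_suggestions: int = 3) -> List[str]:
    """Offer similar known terms, but only when the search came back thin."""
    if result_count >= threshold:
        return []  # enough hits already, so no "Did you mean?" prompt
    return get_close_matches(term.lower(), vocabulary,
                             n=max_suggestions, cutoff=0.75)
```

With a vocabulary containing "email", a search for "emil" that returned nothing would come back with "email" as the suggestion.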

-Question about the Verity Spider utility. Can categories be used? No.

-Question about separate CF clusters: can you index/search the same collection from different CF servers on different hosts with the same Verity engine? No. CFMX 7 has a Verity restriction which allows multiple instances to connect to K2, but not multiple hosts.

Demo again... can bold or otherwise highlight the search terms using any HTML or other markup. Default is bold, but again can use css or html as you wish. Use the contextHighlightBegin and contextHighlightEnd attributes on cfsearch.
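
The begin/end marker idea is easy to picture outside of CF as well. A minimal Python sketch, not the cfsearch implementation: highlight is a hypothetical helper that wraps whole-word, case-insensitive matches in whatever markup is passed in, with bold as the default, and it assumes the text is plain text that needs no HTML escaping.

```python
import re
from typing import Iterable


def highlight(text: str, terms: Iterable[str],
              begin: str = "<b>", end: str = "</b>") -> str:
    """Wrap each whole-word occurrence of a search term in begin/end markup."""
    words = [re.escape(term) for term in terms if term]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    # A function replacement keeps backslashes in begin/end from being
    # interpreted as regex escapes.
    return pattern.sub(lambda match: begin + match.group(0) + end, text)
```

Passing begin='<span class="hit">' and end='</span>' would switch from bold to a CSS class, which mirrors the choice the two attributes give you.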

Subsearches and Natural/Internet style searches demo'd. The latter allows you to enter a question like "how do I check my email" and return appropriate responses, using the internet style.

 


Comments

The version of Verity bundled with CFMX 7 still has PDF hit highlighting disabled :( You have to spend another $35-45k to get the "full" version from Verity. OR you can spend about $10k-$12k and get Texis (http://www.thunderstone.com/texis/site/pages/Texis...) as we did.

-AD


 

 

Copyright © 2010 Steven Erat. All rights reserved.
This is a personal weblog. The opinions expressed here represent my own and not those of my employer