Full Text Search Libraries in Ruby

  • Topic: Information retrieval, Searching, Boolean logic
  • Published : December 1, 2012
Full-text search is a technique for searching a document or database stored on a computer. A full-text search engine examines all the words in every stored document to find matches for the keywords the user searched for. Many websites and application programs provide full-text search capabilities.
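
To make the idea concrete, here is a minimal sketch of an inverted index in plain Ruby. It is not taken from any of the libraries discussed here; the sample documents and the helper names (`tokenize`, `build_index`, `search`) are assumptions chosen for illustration. Each word is mapped to the set of documents that contain it, so a keyword lookup does not have to rescan every document.

```ruby
require "set"

# Hypothetical sample documents, keyed by an id.
DOCS = {
  1 => "Ruby is a dynamic programming language",
  2 => "Full text search examines every word",
  3 => "Search engines index every document"
}

# Split text into lowercase alphanumeric words.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Build a map from each word to the set of document ids containing it.
def build_index(docs)
  index = Hash.new { |hash, word| hash[word] = Set.new }
  docs.each do |id, text|
    tokenize(text).each { |word| index[word] << id }
  end
  index
end

# Return the sorted ids of documents containing the keyword.
def search(index, keyword)
  index.fetch(keyword.downcase, Set.new).to_a.sort
end

INDEX = build_index(DOCS)
p search(INDEX, "search")  # => [2, 3]
p search(INDEX, "python")  # => []
```

A real engine would add stemming, stop words and persistence on top of this, but the word-to-documents mapping is the core of the technique.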

There are quite a few choices when it comes to adding full-text search to a Ruby on Rails application. The choice can be based on the language the search engine is written in or on the scalability options that suit the application.

Acts As Indexed is a pure Ruby implementation, which makes it fully portable and suitable for almost any application that requires full-text search. Search queries support several standard operators, including exclusion of a term with a leading ’-’ and phrase matching with quotation marks. It is useful for a simple site that needs a basic search implemented very quickly.
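
The query syntax described above can be sketched in plain Ruby. This is not the Acts As Indexed API; `parse_query` and `matches?` are hypothetical helpers that only illustrate how a leading '-' and quotation marks might be interpreted. Phrases are matched as simple substrings, and a negated phrase is not handled.

```ruby
# Split a query into quoted phrases, required terms and excluded terms.
# Hypothetical helper, not part of any library.
def parse_query(query)
  phrases = query.scan(/"([^"]+)"/).flatten.map(&:downcase)
  rest    = query.gsub(/"[^"]*"/, " ")
  excluded, required = rest.split.partition { |term| term.start_with?("-") }
  {
    phrases:  phrases,
    required: required.map(&:downcase),
    excluded: excluded.map { |term| term[1..-1].downcase }.reject(&:empty?)
  }
end

# True when the text contains every phrase and required term
# and none of the excluded terms.
def matches?(text, query)
  parsed = parse_query(query)
  body   = text.downcase
  words  = body.scan(/[a-z0-9]+/)
  parsed[:phrases].all? { |phrase| body.include?(phrase) } &&
    parsed[:required].all? { |word| words.include?(word) } &&
    parsed[:excluded].none? { |word| words.include?(word) }
end

p matches?("Ruby on Rails search plugin", "ruby -python")  # => true
p matches?("Ruby and Python", "ruby -python")              # => false
p matches?("full text search in Ruby", '"text search"')    # => true
```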

Ferret is a full-text search engine library written for Ruby; it is integrated into a Rails application through the Acts As Ferret plugin. It is inspired by the Apache Lucene Java project. The first step in implementing a search is to build an index; the index is then searched for the documents that contain the keyword. One of the more useful features, especially in a web scenario, is highlighting the matched words, which Index’s highlight method makes trivial. It is also possible to use Ferret as a more general-purpose data store.
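
As a rough illustration of what highlighting involves, the sketch below wraps matched words in tags using only plain Ruby. It does not call Ferret; the `highlight` helper and its `pre_tag` and `post_tag` parameters are assumptions made for this example, and the text is not HTML-escaped.

```ruby
# Wrap every whole-word, case-insensitive match of the given terms
# in pre/post tags. Hypothetical helper, not Ferret's own method.
def highlight(text, terms, pre_tag: "<b>", post_tag: "</b>")
  return text if terms.empty?
  alternatives = terms.map { |term| Regexp.escape(term) }.join("|")
  pattern = /\b(?:#{alternatives})\b/i
  text.gsub(pattern) { |match| "#{pre_tag}#{match}#{post_tag}" }
end

puts highlight("Ferret makes search easy. Search well.", ["search"])
# prints: Ferret makes <b>search</b> easy. <b>Search</b> well.
```

In a web page the tags would usually be a `<span>` or `<mark>` element styled by the site's stylesheet.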

Xapian is written in C++, with bindings that allow use from Perl, Python, PHP, Java, Tcl, C# and Ruby. An important feature of Xapian is ranked probabilistic search: important words get more weight than unimportant words, so more relevant results appear at the top. It also supports synonyms as an automatic form of query expansion and can even suggest spelling corrections for user-supplied queries. It offers a full range of structured boolean search operators (“stock NOT market”, etc.)....
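
The idea that rarer words carry more weight can be sketched with a simple TF-IDF score in plain Ruby. This is a stand-in chosen for brevity, not Xapian's actual probabilistic weighting, and it does not use the Xapian bindings. The corpus and the helpers `rank` and `search_not` are hypothetical; `search_not` illustrates a boolean query such as “stock NOT market”.

```ruby
# Hypothetical sample corpus, keyed by document id.
CORPUS = {
  1 => "search tips for search engines",
  2 => "xapian search library",
  3 => "ruby search bindings",
  4 => "stock market report",
  5 => "stock prices today"
}

def tokenize_words(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Rank documents by TF-IDF: terms found in fewer documents weigh more.
def rank(docs, query)
  tokens = docs.transform_values { |text| tokenize_words(text) }
  total  = docs.size.to_f
  scores = Hash.new(0.0)
  tokenize_words(query).uniq.each do |term|
    containing = tokens.count { |_, words| words.include?(term) }
    next if containing.zero?
    idf = Math.log(1 + total / containing)
    tokens.each do |id, words|
      tf = words.count(term)
      scores[id] += tf * idf if tf > 0
    end
  end
  # Highest score first; ties broken by document id.
  scores.sort_by { |id, score| [-score, id] }.map(&:first)
end

# Boolean "wanted NOT unwanted" filter over whole words.
def search_not(docs, wanted, unwanted)
  docs.select do |_, text|
    words = tokenize_words(text)
    words.include?(wanted) && !words.include?(unwanted)
  end.keys
end

p rank(CORPUS, "search xapian")            # => [2, 1, 3]
p search_not(CORPUS, "stock", "market")    # => [5]
```

Document 2 ranks first even though document 1 mentions "search" twice, because "xapian" appears in only one document and therefore receives the larger weight.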