Oracle Ultra Search Online Documentation Release 9.2
Every Ultra Search instance has a stoplist associated with it. A stoplist is a list of words that are ignored during the indexing process. These words are known as stopwords. Stopwords are not indexed because they are deemed not useful, or even disruptive, to the performance and accuracy of indexing.
During installation, a default stoplist is created for the Ultra Search product. Subsequently, each time an Ultra Search instance is created, it receives its own copy of the default stoplist.
The default stoplist is created under the WKSYS schema. The default stoplist name is wk_stoplist. (This list is defined in the file $ORACLE_HOME/ultrasearch/admin/wk0pref.sql, which is run at installation.)
You can modify the default stoplist by adding or removing stopwords from it. However, remember that these modifications do not affect existing Ultra Search instances. They only affect Ultra Search instances that are created after the modifications are made.
Modifying an instance stoplist after initial crawling should be a last resort. The preferred method is to do one of the following:
- Modify the default stoplist before creating the instance.
- Replace the instance stoplist immediately after creating the instance.
Modifications made to the default stoplist are reflected in the stoplists of all instances created after the modification.
Replacing the instance stoplist immediately after creating the instance affects only that instance. You first need to create a user-defined stoplist.
In both cases, the Ultra Search instance stoplist is fully defined before initial crawling. This means that all documents collected by the Ultra Search crawler are evaluated against the correct stoplist. It is important to modify the stoplist before initial crawling to avoid having to recrawl all documents.
If necessary, you can alter an instance stoplist after initial crawling with one of the following methods:
- Add stopwords to the instance stoplist.
- Define a new stoplist, and replace the instance stoplist with the new stoplist.
Adding stopwords to the instance stoplist does not affect any documents already crawled or indexed. This operation is inexpensive.
Defining a new stoplist and replacing the instance stoplist with it invalidates the entire index. If you choose this method, you must force the Ultra Search crawler to recrawl all documents in the index. You can do this by selecting the "Process all documents" radio button in the Edit Schedule page. This is a very expensive operation. Therefore, this option should be the last resort.
(1) Modifying the default stoplist before creating the instance:
To add the stopword "web" to the default stoplist, log in as user WKSYS through SQL*Plus, and run the following statement:
EXEC ctx_ddl.add_stopword('wk_stoplist','web');
To remove the stopword "web" from the default stoplist, log in as user WKSYS through SQL*Plus, and run the following statement:
EXEC ctx_ddl.remove_stopword('wk_stoplist','web');
Subsequently, the stoplists of all new instances reflect the modifications made to the default stoplist.
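To confirm that a change to the default stoplist took effect, its contents can be listed from the data dictionary. The following is a minimal sketch, assuming the Oracle Text view CTX_USER_STOPWORDS is available to WKSYS and that the stoplist name may be stored in uppercase; neither point is stated in this document.

```sql
-- Run as WKSYS in SQL*Plus.
-- Assumption: the dictionary may store the stoplist name in uppercase,
-- so the comparison uses UPPER() to match either way.
SELECT spw_word
  FROM ctx_user_stopwords
 WHERE UPPER(spw_stoplist) = 'WK_STOPLIST'
 ORDER BY spw_word;
```

If "web" was added, it should appear in the result; if it was removed, it should be absent.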
(2) Replace the instance stoplist immediately after creating the instance:
You must create a new user-defined stoplist. Log in as the owner of the instance through SQL*Plus, and run the following statements:
BEGIN
  ctx_ddl.create_stoplist('example_stoplist');
  ctx_ddl.add_stopword('example_stoplist','example_stopword');
  -- ... (add more stopwords by repeating the previous line with new stopwords) ...
END;
/
To replace an instance stoplist with this new stoplist, log in as the owner of the instance through SQL*Plus, and run the following statement:
ALTER INDEX wk$doc_path_idx rebuild parameters('replace stoplist example_stoplist');
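When a user-defined stoplist needs many stopwords, the repeated add_stopword calls can be driven from a collection instead. The following is a minimal sketch, meant to be run as the instance owner in place of the stoplist-creation block and before the ALTER INDEX statement. The local type name word_list and the words 'web' and 'site' are illustrative and do not come from this document.

```sql
DECLARE
  -- Hypothetical word list; replace with the stopwords you need.
  TYPE word_list IS TABLE OF VARCHAR2(64);
  words word_list := word_list('example_stopword', 'web', 'site');
BEGIN
  -- Create the user-defined stoplist once.
  ctx_ddl.create_stoplist('example_stoplist');
  -- Add each word in the collection as a stopword.
  FOR i IN 1 .. words.COUNT LOOP
    ctx_ddl.add_stopword('example_stoplist', words(i));
  END LOOP;
END;
/
```

Keeping the words in one collection makes the list easier to review before the index is rebuilt.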
(1) Add stopwords to the instance stoplist:
To add the stopword "web" to the instance stoplist, log in as the owner of the instance through SQL*Plus, and run the following statement:
ALTER INDEX wk$doc_path_idx rebuild parameters('add stopword web');
(2) Replace the instance stoplist after initial crawling:
The method for replacing the instance stoplist after initial crawling is no different from replacing it before initial crawling. However, it is a very expensive operation: replacing the stoplist invalidates the index, so you must force the Ultra Search crawler to recrawl all documents in the index. Therefore, this method should be the last resort.
Copyright © 2002 Oracle Corporation. All Rights Reserved.