Oracle® Ultra Search Administrator's Guide 10g Release 2 (10.2) Part Number B14222-01
This chapter contains the following topics:
Note: Some information in this chapter is generic to all types of Oracle Ultra Search installations, and other information is specific to installing and configuring Oracle Ultra Search with an Oracle Database release. If you are installing Oracle Ultra Search with the Oracle Application Server release, then refer to Chapter 3, "Using Oracle Ultra Search with Oracle Application Server".
This section describes the system requirements for installing Oracle Ultra Search.
Oracle Ultra Search hardware requirements are based on the amount of data that you plan to process using Oracle Ultra Search. Oracle Ultra Search uses Oracle Text as its indexing engine and the Oracle Database as its repository.
Sufficient RAM
Along with the resource requirements for the database and the Text indexing engine, also consider the memory requirements of the Oracle Ultra Search crawler. The Oracle Ultra Search crawler is a pure Java program. When the crawler is launched, the Java Virtual Machine (JVM) is configured to start with 25MB and grow to 256MB. When crawling very large amounts of data, these values might need to be adjusted.
The Oracle Ultra Search administration tool is a J2EE 1.2 standard Web application. You can install and run it either on the same host as the Oracle Ultra Search backend or on a separate host. In either case, allocate enough memory for the J2EE engine. Oracle recommends using the Oracle HTTP Server with the Oracle Application Server Containers for J2EE (OC4J). Allocate enough memory for the HTTP Server as well as for the Java Development Kit (JDK) that runs the J2EE engine.
Sufficient Disk Space
As customer requirements vary widely, Oracle cannot recommend a specific amount of disk space. As a general guideline, the minimum requirements are as follows:
3GB of disk space is required for the Oracle Application Server Infrastructure or database and the Oracle Ultra Search backend.
15MB of disk space for the Oracle Ultra Search middle tier on top of the Web server's disk requirements.
For each remote crawler host, 3GB of disk space is required.
Disk space for a large TEMPORARY tablespace depends upon the amount of RAM on the host.
Disk space for the Oracle Ultra Search instance user's tablespace.
The Oracle Ultra Search instance user is a database user that you must create. All data that is collected and processed as part of crawling and indexing is stored in this user's schema.
You should create the tablespace as large as the total amount of data that you want to index. For example, if you estimate that the total amount of data to be crawled and indexed is 10GB, then create a tablespace that is at least 10GB for the Oracle Ultra Search instance user. Make sure to assign that tablespace as the default tablespace of the Oracle Ultra Search instance user.
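The sizing guideline above can be turned into a short SQL script. The following is a minimal sketch in shell that only prints the statements so that they can be reviewed before being run in SQL*Plus. The tablespace name, data file path, and user name are hypothetical placeholders, and the statements are generic Oracle SQL rather than anything specific to Oracle Ultra Search.

```shell
# Print SQL that creates a tablespace of the estimated size and makes it
# the default tablespace of the instance user. Nothing is executed here.
# All names passed in are illustrative placeholders, not product defaults.
print_tablespace_sql() {
  ts_name=$1; datafile=$2; size_gb=$3; owner=$4
  printf "CREATE TABLESPACE %s DATAFILE '%s' SIZE %sG;\n" "$ts_name" "$datafile" "$size_gb"
  printf "ALTER USER %s DEFAULT TABLESPACE %s;\n" "$owner" "$ts_name"
}

# Example: an estimated 10GB of data to crawl and index.
print_tablespace_sql us_data /u01/oradata/us_data01.dbf 10 us_owner
```

Printing the statements instead of running them keeps the sketch independent of any particular database connection.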
The Oracle Ultra Search backend consists of the following:
Oracle Ultra Search database schema: Data dictionary and PL/SQL packages.
Oracle Ultra Search crawler: Java program plus supporting files, libraries, and so on.
Oracle Ultra Search remote crawler: Crawler residing on a remote Oracle home.
The Oracle Ultra Search backend is installed as part of the Oracle Database Server installation.
The Oracle Ultra Search middle tier includes the following:
Oracle Ultra Search administration tool
Oracle Ultra Search Java query API
Oracle Ultra Search query applications
This section describes Oracle Ultra Search postinstallation tasks. There are five steps to the postinstallation:
Use the following command to start the Oracle Ultra Search middle tier. You must run this command manually to start the Oracle Ultra Search middle tier after installation.
$ORACLE_HOME/bin/searchctl start
Use the Oracle Process Manager and Notification Server (OPMN) to start the OC4J_Portal instance. For example:
$ORACLE_HOME/opmn/bin/opmnctl startproc instancename=OC4J_Portal
The Oracle Ultra Search installer creates a default Oracle Ultra Search instance based on the Oracle Ultra Search test user. You can test the Oracle Ultra Search functionality based on the default instance after installation.
The default instance name is WK_INST. It is created based on the database user WK_TEST. The default user password is WK_TEST.
For security purposes, WK_TEST is locked after installation. The administrator should log on to the database as DBA, unlock the WK_TEST user account, and set the password to be WK_TEST. The password expires after the installation. If the password is changed to anything other than WK_TEST, then you must also update the cached schema password using the administration tool in the Edit Instance page after you change the password in the database.
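The unlock step can be expressed as a single SQL statement. The sketch below is a small shell helper that prints that statement for review. The helper name is hypothetical, and the statement is the generic Oracle ALTER USER syntax rather than a script shipped with Oracle Ultra Search.

```shell
# Print the SQL a DBA could run to unlock a user account and set its
# password in one statement. The output is meant to be reviewed and then
# pasted into (or piped to) a SQL*Plus session connected as a DBA.
print_unlock_sql() {
  printf 'ALTER USER %s IDENTIFIED BY %s ACCOUNT UNLOCK;\n' "$1" "$2"
}

# Example for the default test user described above.
print_unlock_sql WK_TEST WK_TEST
```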
The default instance is also used by the Oracle Ultra Search query application. Make sure to update the data-sources.xml file.
Caution: Storing clear text passwords in data-sources.xml poses a security risk. Avoid this by using password indirection to specify the password. This lets you enter the password in jazn-data.xml, which automatically gets encrypted, and point to it from data-sources.xml. For more information, refer to "Creating An Indirect Password" in Oracle Application Server Containers for J2EE Security Guide.
The data-sources.xml file is located in the $ORACLE_HOME/oc4j/j2ee/OC4J_SEARCH/config directory. Under the <data-sources> tag, add the following:
<data-source
  class="oracle.jdbc.pool.OracleConnectionCacheImpl"
  name="UltraSearchDS"
  location="jdbc/UltraSearchPooledDS"
  username="username"
  password="password"
  url="jdbc:oracle:thin:@database_host:oracle_port:oracle_sid"
/>
In the preceding syntax, the following variables were used:
username and password are the Oracle Ultra Search instance owner's database user name and password.
database_host is the host name of the backend database computer.
oracle_port is the listener port of the user's Oracle Database.
oracle_sid is the SID of the user's Oracle Database.
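To show how the three connection variables combine into the url attribute, the following sketch assembles a thin-driver JDBC URL in shell. The function name is hypothetical; the URL layout is the one shown in the preceding syntax.

```shell
# Build a JDBC thin-driver URL of the form used in data-sources.xml:
#   jdbc:oracle:thin:@database_host:oracle_port:oracle_sid
jdbc_thin_url() {
  printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$1" "$2" "$3"
}

# Example with a database on the local host.
jdbc_thin_url localhost 1521 isearch
# prints jdbc:oracle:thin:@localhost:1521:isearch
```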
In addition to user name, password, and JDBC URL, data-sources.xml enables configuration of the connection cache size and the cache scheme.
The following tag specifies the minimum and maximum limits of the cache size, the inactivity time out interval, and the cache scheme.
<data-source
  class="oracle.jdbc.pool.OracleConnectionCacheImpl"
  name="UltraSearchDS"
  location="jdbc/UltraSearchPooledDS"
  username="wk_test"
  password="wk_test"
  url="jdbc:oracle:thin:@localhost:1521:isearch"
  min-connections="3"
  max-connections="30"
  inactivity-timeout="30">
  <property name="cacheScheme" value="1"/>
</data-source>
If you are adding the data source to the default Oracle Ultra Search instance user WK_TEST, then make sure to unlock WK_TEST first.
There are three values for the caching schemes:
1 = DYNAMIC_SCHEME
2 = FIXED_WAIT_SCHEME
3 = FIXED_RETURN_NULL_SCHEME
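Because the cacheScheme property takes a bare number, it can help to check a value against the list above before editing data-sources.xml. The following is a minimal sketch; the function name is hypothetical and the mapping is taken directly from the three values listed.

```shell
# Map a cacheScheme number to its scheme name, or fail for any other value.
cache_scheme_name() {
  case $1 in
    1) echo DYNAMIC_SCHEME ;;
    2) echo FIXED_WAIT_SCHEME ;;
    3) echo FIXED_RETURN_NULL_SCHEME ;;
    *) echo "unknown cacheScheme value: $1" >&2; return 1 ;;
  esac
}

# Example: the value used in the preceding data source definition.
cache_scheme_name 1
# prints DYNAMIC_SCHEME
```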
For the Database release, restart the Oracle Ultra Search middle tier. For example:
For the Oracle Application Server release, use Oracle Process Manager and Notification Server (OPMN) to start the OC4J_Portal instance. For example:
$ORACLE_HOME/opmn/bin/opmnctl startproc instancename=OC4J_Portal
If the database character set is changed after installation, you must reconfigure the Oracle Ultra Search backend to adapt to the new character set.
Two SQL scripts (wk0prefcheck.sql and wk0idxcheck.sql), located in $ORACLE_HOME/ultrasearch/admin/, are used for reconfiguration:
wk0prefcheck.sql is run by the wksys user to reconfigure the default cache character set and index preferences.
wk0idxcheck.sql is needed for reconfiguring instances created before the database character set change (for example, the default instance). This script must be run by the instance owner. Run wk0prefcheck.sql first, because wk0idxcheck.sql depends on the reconfigured default settings that wk0prefcheck.sql generates.
Running wk0idxcheck.sql also drops and recreates the Oracle Text index used by Oracle Ultra Search. If there are already data sources indexed, then you must force a recrawl of all of the data sources.
wk0idxcheck.sql must be run once for each instance. For example, if there are two instances, inst1 and inst2, owned by owner1 and owner2, respectively, then wk0idxcheck.sql should be run twice, once by owner1 and once by owner2.
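The required order (wk0prefcheck.sql once as wksys, then wk0idxcheck.sql once for each instance owner) can be sketched as a small planning helper. It only prints the SQL*Plus invocations; it does not run them. The function name is hypothetical, and the form "sqlplus user @script" is assumed here so that SQL*Plus prompts for each password.

```shell
# Print, in the required order, the SQL*Plus invocations to run after a
# database character set change.
#   $1      directory that holds the two scripts
#   $2...   one instance owner for each Oracle Ultra Search instance
print_reconfig_plan() {
  admin_dir=$1
  shift
  # Step 1: defaults are reconfigured once, as wksys.
  printf 'sqlplus wksys @%s/wk0prefcheck.sql\n' "$admin_dir"
  # Step 2: each instance is reconfigured by its own owner.
  for owner in "$@"; do
    printf 'sqlplus %s @%s/wk0idxcheck.sql\n' "$owner" "$admin_dir"
  done
}

# Example for the two instance owners named above.
print_reconfig_plan "\$ORACLE_HOME/ultrasearch/admin" owner1 owner2
```

After running the printed commands, a forced recrawl of the indexed data sources is still needed, as described above.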
Note: Oracle Ultra Search only supports database character sets supported by Oracle Text. For example, the AL32UTF8 character set is not supported. For Unicode support, use UTF8. For the complete list of supported database character sets, refer to the Oracle Text Reference for lexer types.
To configure the middle-tier and infrastructure to work with OracleAS Metadata Repository after its character set has been changed, do the following:
1. Modify the character set of all Database Access Descriptors (DADs) accessing the metadata repository to the new database character set.
   a. Using the Application Server Control Console, navigate to the middle-tier instance home page.
   b. In the System Components section, click HTTP_Server.
   c. On the HTTP_Server home page, click Administration.
   d. On the HTTP_Server Administration page, select PL/SQL Properties. This opens the mod_plsql Services page.
   e. Scroll to the DADs section and click the name of the DAD that you want to configure. This opens the Edit DAD page.
   f. In the NLS Language field, type in an NLS_LANG value whose character set is the same as the new character set for OracleAS Metadata Repository.
   g. Click OK.
   h. Repeat steps e to g for all DADs accessing OracleAS Metadata Repository.
2. Reconfigure the Oracle Ultra Search index as follows:
   a. Connect to OracleAS Metadata Repository as WKSYS and invoke the following SQL script to reconfigure the default cache character set and index preference:
      ORACLE_HOME/ultrasearch/admin/wk0prefcheck.sql
   b. Connect to OracleAS Metadata Repository as the default user (WK_TEST) and invoke the following SQL script:
      ORACLE_HOME/ultrasearch/admin/wk0idxcheck.sql
      The script requests you to enter the instance name (WK_INST). Enter "y" when prompted to go ahead with the change. This script reconfigures the instance (in this case, the default instance). It also truncates the Oracle Text index used by Oracle Ultra Search, and you must force a recrawl to rebuild the index.
   c. Repeat step b for all Oracle Ultra Search instances that were created before you changed the database character set. Invoke the script as the instance owner, and then force a recrawl of all data sources, if necessary.
This section describes how to check whether your installation was successful.
If you log on to the Oracle Ultra Search administration tool successfully, then you have completed the Oracle Ultra Search administration tool configuration process. Do the following to check the Oracle Ultra Search Administration Tool:
Check whether the Web Server is running.
Attempt to log on to the administration tool:
Visit the following URL:
http://hostname.domainname:port/ultrasearch/admin/index.jsp
In the preceding URL, hostname.domainname is the full name of the host where you have installed the Oracle Ultra Search middle tier, and port is the default Web server port.
Log on to the Oracle Ultra Search administration tool by entering the Oracle Ultra Search instance owner's database user name and password. During the installation of the Oracle Ultra Search backend, a new Ultra Search instance owner, WK_TEST, is created.
The first time any JSP page is accessed, it takes a few seconds to compile. Subsequent accesses are much faster.
After you verify that the Oracle Ultra Search administration tool is working, you should be able to run the Oracle Ultra Search query applications.
To test the Oracle Ultra Search query applications, do one of the following:
Visit the following URL:
http://hostname.domainname:port/ultrasearch/query/search.jsp
Follow the links in the Oracle Ultra Search welcome page: http://hostname.domainname:port/ultrasearch/index.html
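The three verification URLs in this section share one root. The following sketch builds them from a host name and port so that they can be pasted into a browser or passed to an HTTP client. The function name and the example host and port are hypothetical; the paths are the ones given above.

```shell
# Build one of the Oracle Ultra Search verification URLs.
#   $1 full host name of the middle tier, $2 Web server port,
#   $3 page: admin, query, or welcome
ultrasearch_url() {
  base="http://$1:$2/ultrasearch"
  case $3 in
    admin)   echo "$base/admin/index.jsp" ;;
    query)   echo "$base/query/search.jsp" ;;
    welcome) echo "$base/index.html" ;;
    *) echo "unknown page: $3" >&2; return 1 ;;
  esac
}

# Example with a placeholder host and port.
for page in admin query welcome; do
  ultrasearch_url search.example.com 7777 "$page"
done
```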
Locations for query applications are listed in the following section. Access the query source code by going to the directories list. You can also see a working demonstration of each query JSP page with the URL root, and you can append the correct JSP file name at the end of the URL root.
The query application is shipped as $ORACLE_HOME/ultrasearch/ultrasearch_query.ear.
The portlet is shipped as $ORACLE_HOME/ultrasearch/webapp/ultrasearch_portlet.ear.
This section describes how to troubleshoot Oracle Ultra Search.
Query finds no results
Refer to "Reconfigure the Oracle Ultra Search Backend for the Database Character Set".
Error when processing binary files
The Oracle Ultra Search crawler uses the Oracle Text INSO filter, ctxhx, for processing of binary files. These are non-text, non-HTML files such as PDF files, Microsoft Word files, and so on. For Oracle Ultra Search to use the INSO filter, the shared library path environment variable must contain the $ORACLE_HOME/ctx/lib path.
During installation, the Oracle Universal Installer automatically sets the variable to include $ORACLE_HOME/ctx/lib. If you restart the database after the installation, then you must manually set your shared library path environment variable to include $ORACLE_HOME/ctx/lib before starting the Oracle process. You must restart the database to pick up the new value for filtering to work.
On UNIX, set the $LD_LIBRARY_PATH environment variable to include $ORACLE_HOME/ctx/lib.
On Windows, set the $PATH environment variable to include $ORACLE_HOME/bin.
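On UNIX, the library path can be extended without adding a duplicate entry each time the environment is set up. The following is a minimal sketch for Bourne-compatible shells; the helper name is hypothetical.

```shell
# Return a colon-separated path list that contains the given directory,
# appending it only if it is not already present.
#   $1 current value of the path variable (may be empty), $2 directory to add
path_with() {
  case ":$1:" in
    *":$2:"*) printf '%s\n' "$1" ;;
    *)
      if [ -n "$1" ]; then
        printf '%s:%s\n' "$1" "$2"
      else
        printf '%s\n' "$2"
      fi
      ;;
  esac
}

# Typical use before starting the database (ORACLE_HOME must already be set):
#   LD_LIBRARY_PATH=$(path_with "$LD_LIBRARY_PATH" "$ORACLE_HOME/ctx/lib")
#   export LD_LIBRARY_PATH
```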
Error when crawling a file data source
If the globalization setting of the environment that starts the Oracle Database is not compatible with the locale of the target files, then a file-not-found error occurs for files or directories whose names contain CJK characters. This error occurs in a multibyte language environment such as Chinese, Japanese, or Korean, because the crawler relies on the correct locale setting to read operating system files.
To correct this, set the correct locale, restart the Oracle Database, and force Oracle Ultra Search to recrawl the data source. For example:
Shut down the Oracle Database instance:
SQL> shutdown immediate
Set the locale to 'ja' with the following:
> setenv LANG ja
> setenv LC_ALL ja
Restart the Oracle Database instance:
SQL> startup
Restart the Oracle Ultra Search schedule with a forced re-crawl.
Cannot log on to the Oracle Ultra Search administration tool
The ultrasearch.properties file contains configuration information used by the Oracle Ultra Search middle tier. The file is automatically configured by the Oracle Universal Installer, so there is no need to edit this file.
With a software or an advanced database installation, however, you must manually configure the Oracle Ultra Search administration tool by editing this file. You must replace %THIN_JDBC_CONN_STR% with a JDBC string to the database, and replace %DOMAIN% with the domain name.
Here is an example of the ultrasearch.properties file:
connection.driver=oracle.jdbc.driver.OracleDriver
#If set, The JDBC connection URL specified here will override the dynamically
#acquired one from Oracle Internet Directory.
#This setting is also used by the query sample (gsearch.jsp)
#Example: connection.url=jdbc:oracle:thin:@<host>:<port>:<sid>
connection.url=%JDBC_CONN_STR%
oracle.net.encryption_client=REQUESTED
oracle.net.encryption_types_client=(RC4_56,DES56C,RC4_40,DES40C)
oracle.net.crypto_checksum_client=REQUESTED
oracle.net.crypto_checksum_types_client=(MD5)
oid.app_entity_cn=m16bi.sgtcnsun03.cn.oracle.com
domain=us.oracle.com
In the preceding example, the following variables were used:
connection.driver specifies the JDBC driver you are using.
connection.url specifies the database to which the middle tier connects. Oracle Ultra Search supports the following formats:
host:port:SID (where host is the full host name of the Oracle Database instance running Oracle Ultra Search, port is the listener port number for the Oracle Database instance, and SID is the Oracle Database instance ID)
HA-aware string (for example, TNS keyword-value syntax)
oracle.net.encryption_client, oracle.net.encryption_types_client, oracle.net.crypto_checksum_client, and oracle.net.crypto_checksum_types_client control the properties of the secure JDBC connection made to the database. Refer to Oracle Database JDBC Developer's Guide and Reference for more information.
oid.app_entity_cn specifies the Oracle Ultra Search middle tier application entity name.
domain specifies the common domain for the Identity Management computer and the Oracle Ultra Search middle tier computer. This enables the delegated administrative service (DAS) list of values to work with Internet Explorer. For example, if the Oracle Ultra Search middle tier is in us.oracle.com and the Identity Management computer is in uk.oracle.com, then the common domain is oracle.com.
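The common domain in the example above is the longest run of trailing labels that the two domain names share. The following sketch computes it in shell by comparing labels from right to left. The function name is hypothetical, and the sketch assumes plain dot-separated names.

```shell
# Print the longest common trailing domain of two domain names.
# The body runs in a subshell so that its working variables do not leak.
common_domain() (
  a=$1
  b=$2
  result=""
  while [ -n "$a" ] && [ -n "$b" ]; do
    la=${a##*.}   # rightmost label of a
    lb=${b##*.}   # rightmost label of b
    [ "$la" = "$lb" ] || break
    result=${la}${result:+.$result}
    # Drop the rightmost label; stop when no label is left.
    case $a in *.*) a=${a%.*} ;; *) a="" ;; esac
    case $b in *.*) b=${b%.*} ;; *) b="" ;; esac
  done
  printf '%s\n' "$result"
)

common_domain us.oracle.com uk.oracle.com
# prints oracle.com
```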
Note: You need not configure the JDBC connect string in the ultrasearch.properties file. The database connect information is taken from Oracle Internet Directory.
The Oracle Ultra Search remote crawler enables multiple crawlers to run in parallel on different hosts. However, all remote crawler hosts must share common resources, such as common directories and a common Oracle Ultra Search database.
The Oracle Ultra Search remote crawler is part of the Oracle Ultra Search backend. The crawler installation procedure is similar to the Oracle Ultra Search backend installation.
On each remote crawler host, the Oracle Ultra Search backend is installed under a common directory known as ORACLE_HOME. The remote ORACLE_HOME directory is referred to as $REMOTE_ORACLE_HOME.
If you have not installed the Oracle HTTP Server during the Oracle Application Server installation, then you must perform the following steps manually for remote crawling:
Locate the file that defines the environment.
On UNIX, $REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/unix/define_env
On Windows, $REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/winnt/define_env.bat
Replace %ORACLE_HOME% with the value of the REMOTE_ORACLE_HOME environment variable.
Replace %s_jreLocation% with the directory path of a Java runtime environment (JRE) version 1.2.2 or higher. You should specify the root directory of the JRE.
Replace %s_jreJDBCclassfile% with the full path and file name of the Oracle JDBC Thin driver version 12.
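The three replacements above can also be made with a stream editor instead of by hand. The following is a minimal sketch that reads the file on standard input and writes the edited copy to standard output. The function name and the example paths are hypothetical, and the sketch assumes that none of the replacement values contains the characters | or &.

```shell
# Substitute the three placeholders found in define_env.
#   $1 remote Oracle home, $2 JRE root directory, $3 JDBC driver file
fill_define_env() {
  sed -e "s|%ORACLE_HOME%|$1|g" \
      -e "s|%s_jreLocation%|$2|g" \
      -e "s|%s_jreJDBCclassfile%|$3|g"
}

# Typical use, writing to a new file so the original is kept for comparison:
#   fill_define_env "$REMOTE_ORACLE_HOME" /path/to/jre /path/to/jdbc/driver/file \
#     < define_env > define_env.new
```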
The remote crawler requires a communication channel between the backend database and the remote crawler host.
The mechanisms of communication are RMI and JDBC. Configuration of the remote crawler differs depending on which mechanism you use. The JDBC-based mechanism requires you to provide a database user (or role) during the registration process.
The registration process is done by running a SQL script on the Oracle Ultra Search remote crawler host. The SQL script connects over SQL*Plus to the Oracle backend database and registers the remote crawler host.
Locate the correct ORACLE_HOME.
The Oracle Ultra Search middle tier is installed under a common directory known as ORACLE_HOME. If you have installed other Oracle products prior to the Oracle Ultra Search middle tier, then you may have multiple ORACLE_HOME directories on your host. The registration script requires that you enter the ORACLE_HOME directory in which the Oracle Ultra Search middle tier is installed.
Locate the WKSYS super-user password.
You must run the registration script as the WKSYS super-user or as a database user who has been granted super-user privileges.
Start SQL*Plus.
Be sure to run the correct version of SQL*Plus, because multiple versions can reside on the same host if you have installed some other Oracle products. On UNIX platforms, make sure that the correct values for the PATH, ORACLE_HOME, and TNS_ADMIN variables are set. On Windows, choose the correct menu item from the Start menu.
After you have identified how to run the correct SQL*Plus client, you must log on to the Oracle Ultra Search database. To do this, you might need to configure an Oracle Net service setting for the Oracle Ultra Search database.
After SQL*Plus is running, log on to the database using the schema and password that you located in Step 2.
Run the registration script.
Start up SQL*Plus as the WKSYS super-user and enter the following:
@full_path_of_registration_script
The registration script for RMI-based remote crawling is the following:
$REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/<platform>/register.sql
For example, if the value for $REMOTE_ORACLE_HOME on a UNIX host is /home/oracle10g, then enter the following at the SQL*Plus prompt to register an RMI-based remote crawler:
@/home/oracle10g/ultrasearch/tools/remotecrawler/scripts/unix/register.sql
The RMI-based registration script prompts you for three variables:
RMI_HOSTNAME: The remote hostname. This is where the RMI registry/daemon will run.
RMI_REGISTRY_PORT: The port that the RMI registry is listening on.
ORACLE_HOME: The Oracle home located in Step 1. For example, /u01/oracle10g on a UNIX host or d:/u01/oracle10g on a Windows host. Remember to use forward slashes for Windows hosts.
The registration script for JDBC-based remote crawling is the following:
$REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/<platform>/register_jdbc.sql
For example, if you are running SQL*Plus on Windows, and $REMOTE_ORACLE_HOME is in d:\Oracle\Oracle10g, then enter the following at the SQL*Plus prompt to register a JDBC-based remote crawler:
@d:\Oracle\Oracle10g\ultrasearch\tools\remotecrawler\scripts\winnt\register_jdbc.sql
The JDBC-based registration script prompts for three variables:
LAUNCHER_NAME: An arbitrary string used to identify a JDBC-based remote crawler launcher, which is needed when you start up the JDBC-based remote crawler launcher.
CONNECTUSER: The database user (or role) that the JDBC-based remote crawler launcher will use to establish a database connection and listen for launch events.
ORACLE_HOME: The Oracle home located in Step 1.
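The two registration scripts differ only in file name, and the platform directory is either unix or winnt. The following sketch builds the line to enter at the SQL*Plus prompt from those choices. The function name is hypothetical; the path layout and script names are the ones given above. The sketch prints forward slashes only.

```shell
# Print the SQL*Plus line that runs the remote crawler registration script.
#   $1 remote Oracle home, $2 platform (unix or winnt), $3 mechanism (rmi or jdbc)
registration_script() {
  case $3 in
    rmi)  script_name=register.sql ;;
    jdbc) script_name=register_jdbc.sql ;;
    *) echo "mechanism must be rmi or jdbc" >&2; return 1 ;;
  esac
  case $2 in
    unix|winnt) ;;
    *) echo "platform must be unix or winnt" >&2; return 1 ;;
  esac
  printf '@%s/ultrasearch/tools/remotecrawler/scripts/%s/%s\n' "$1" "$2" "$script_name"
}

# Example matching the RMI-based case on a UNIX host.
registration_script /home/oracle10g unix rmi
# prints @/home/oracle10g/ultrasearch/tools/remotecrawler/scripts/unix/register.sql
```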
The registration script invokes the wk_crw.register_remote_crawler PL/SQL API. The REMOTE_CRAWLER_HOSTNAME and ORACLE_HOME variables are used to compose arguments for the wk_crw.register_remote_crawler API. You may optionally choose to call this API directly, especially if you need to register multiple remote crawlers programmatically.
Verify and complete the remote crawler profile configuration. Be sure to enter the correct values for both variables. To verify that the registration has completed correctly, do the following:
Log on to the Oracle Ultra Search administration tool.
Click the Remote Crawler Profiles tab in the Crawler tab. You should see the remote crawler launcher you registered in the remote crawler profile list.
For RMI-based remote crawlers, you will see the host:port combination that uniquely identifies the RMI-subsystem.
For JDBC-based remote crawlers, you will see the Launcher name.
Click Edit to complete the configuration process for the remote crawler profile.
See Also: Oracle Net Services Administrator's Guide for information on how to configure a service setting
If you enter wrong values for the register.sql script, then you need to unregister the remote crawler using the unregister.sql script. Run the unregister script the same way you ran the registration script. The unregister.sql script calls the wk_crw.unregister_remote_crawler PL/SQL API. After you have successfully unregistered the remote crawler, you can rerun the register.sql script.
Before you upgrade, log on to the Oracle Ultra Search administration tool. Stop and disable all crawler synchronization schedules in every Oracle Ultra Search instance. You can enable all crawler synchronization schedules after the upgrade.
To upgrade Oracle Ultra Search shipped with the Oracle Database release, do the following:
Run the Oracle Ultra Search backend upgrade. This includes upgrading the Oracle Ultra Search database schemas and server files. Install the new Oracle software, and run Oracle Database Upgrade Assistant to upgrade the database and Oracle Ultra Search component to the new release. See the Oracle Database Upgrade Guide for details.
Follow the steps in "Installing the Oracle Ultra Search Middle Tier" to install the new Oracle Ultra Search middle tier.
After upgrading to the current release, follow these post-upgrade configuration steps:
Set the ORACLE_HOME and ORACLE_SID environment variables to Oracle Database 10g.
Change directories to ORACLE_HOME/ultrasearch/admin/.
Run the following statement:
sqlplus "sys/password as sysdba"
Run the following statement:
@wk0config.sql WKSYSPW JDBC_CONNSTR LAUNCH_ANYWHERE NET_SERVICE_NAME
In the preceding statement, the following parameters were used:
WKSYSPW is the password for the WKSYS schema.
JDBC_CONNSTR is the JDBC connection string. Use the format hostname:port:sid, such as machine1:1521:iasdb, if the database is not in the Oracle Real Application Clusters environment.
If the database is in an Oracle Real Application Clusters environment, then use the TNS keyword-value format instead, because it enables connection to any node of the system:
(DESCRIPTION=(LOAD_BALANCE=yes) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=cls02a)(PORT=3001)) (ADDRESS=(PROTOCOL=TCP)(HOST=cls02b)(PORT=3001))) (CONNECT_DATA=(SERVICE_NAME=sales.us.acme.com)))
In the preceding syntax, the following parameters were used:
LAUNCH_ANYWHERE is the mode of the database. Setting it to TRUE indicates that the database is in Oracle Real Application Clusters mode; FALSE indicates that it is not.
NET_SERVICE_NAME is the network service name used by wk0config.sql to establish the database connection. Setting it to "" (empty string) while running wk0config.sql from the database host eliminates the need to specify the network service name.
The following is an example of the post-upgrade script for a non-Oracle Real Application Cluster environment:
@wk0config.sql welcome1 machine:1521:iasdb FALSE ""
The following is an example of the post-upgrade script for an Oracle Real Application Clusters environment:
@wk0config.sql welcome1 "(DESCRIPTION=(LOAD_BALANCE=yes) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=cls02a)(PORT=3001)) (ADDRESS=(PROTOCOL=TCP)(HOST=cls02b)(PORT=3001))) (CONNECT_DATA=(SERVICE_NAME=sales.us.acme.com)))" FALSE ""
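Because the four parameters are positional, a missing space or quote changes what the script receives. The following sketch assembles the line to enter at the SQL*Plus prompt and quotes the connection string only when it contains spaces, as a TNS descriptor does. The function name is hypothetical, and the sketch only prints the line; it does not run SQL*Plus.

```shell
# Print the SQL*Plus line that runs wk0config.sql.
#   $1 WKSYSPW, $2 JDBC_CONNSTR, $3 LAUNCH_ANYWHERE (TRUE or FALSE),
#   $4 NET_SERVICE_NAME (may be empty)
wk0config_command() {
  case $3 in
    TRUE|FALSE) ;;
    *) echo "LAUNCH_ANYWHERE must be TRUE or FALSE" >&2; return 1 ;;
  esac
  # A connection string with spaces must be passed as one quoted argument.
  case $2 in
    *" "*) conn_arg="\"$2\"" ;;
    *)     conn_arg=$2 ;;
  esac
  printf '@wk0config.sql %s %s %s "%s"\n' "$1" "$conn_arg" "$3" "$4"
}

# Example for a single-instance database, run from the database host.
wk0config_command welcome1 machine1:1521:iasdb FALSE ""
# prints @wk0config.sql welcome1 machine1:1521:iasdb FALSE ""
```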