Oracle® Ultra Search Administrator's Guide
10g Release 2 (10.1.2) Part No. B14041-01
This chapter contains the following topics:
Note: Some information in this chapter is generic to all types of Oracle Ultra Search installations. Other information in this chapter is specific to installing and configuring Oracle Ultra Search with an Oracle Database release. If you are installing Oracle Ultra Search with the Oracle Application Server release, then you should also read Chapter 4, "Using Oracle Ultra Search with Oracle Application Server".
This section describes the Oracle Ultra Search system requirements.
Oracle Ultra Search hardware requirements are based on the quantity of data that you plan to process using Oracle Ultra Search. Oracle Ultra Search uses Oracle Text as its indexing engine and the Oracle Database as its repository.
See Also: Oracle Text Application Developer's Guide and Oracle Database Performance Tuning Guide
Sufficient RAM
Along with the resource requirements for the database and the Text indexing engine, also consider the memory requirements of the Oracle Ultra Search crawler. The Oracle Ultra Search crawler is a pure Java program. When the crawler is launched, the Java Virtual Machine (JVM) is configured to start with 25MB and grow to 256MB. When crawling very large amounts of data, these values might need to be adjusted.
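In terms of the standard JVM heap flags, the start and maximum sizes described above correspond to -Xms and -Xmx. The following is a minimal sketch that only builds the option string; the helper name jvm_heap_opts is hypothetical, and where Oracle Ultra Search actually stores or applies these values is not covered by this sketch.

```shell
# Build JVM heap options from an initial and a maximum size, both in megabytes.
# -Xms and -Xmx are the standard JVM flags for initial and maximum heap size.
# jvm_heap_opts is a hypothetical helper, not part of Oracle Ultra Search.
jvm_heap_opts() {
    printf '%s\n' "-Xms${1}m -Xmx${2}m"
}

# Values described above: start with 25MB and grow to 256MB.
jvm_heap_opts 25 256
```

When crawling very large amounts of data, the same helper could be called with larger values, for example jvm_heap_opts 64 512.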
The Oracle Ultra Search administration tool is a J2EE 1.2 standard Web application. It can be installed and run on a separate host from the Oracle Ultra Search backend. You might want to install and run this on the same host as the Oracle Ultra Search backend. Regardless of your choice, allocate enough memory for the J2EE engine. Oracle recommends using the Oracle HTTP Server with the Oracle Application Server Containers for J2EE (OC4J). Allocate enough memory for the HTTP Server as well as for the Java Development Kit (JDK) that runs the J2EE engine.
Sufficient Disk Space
Because customer requirements vary widely, Oracle cannot recommend a specific amount of disk space. As a general guideline, the minimum requirements are as follows:
Approximately 3GB of disk space for the Oracle Application Server Infrastructure or database and the Oracle Ultra Search backend.
15MB of disk space for the Oracle Ultra Search middle tier on top of the Web server's disk requirements.
For each remote crawler host, the same amount of disk space as needed to install the Oracle Ultra Search backend.
Disk space for a large TEMPORARY tablespace. Create a TEMPORARY tablespace as large as possible, depending on the amount of RAM on the host.
Disk space for the Oracle Ultra Search instance user's tablespace.
The Oracle Ultra Search instance user is a database user that you must explicitly create. All data that is collected and processed as part of crawling and indexing is stored in this user's schema.
You should create the tablespace as large as the total amount of data that you want to index. For example, if you estimate that the total amount of data to be crawled and indexed is 10GB, then create a tablespace that is at least 10GB for the Oracle Ultra Search instance user. Make sure to assign that tablespace as the default tablespace of the Oracle Ultra Search instance user.
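The sizing guideline above can be sketched as a small shell helper that prints the SQL a DBA might run. This is an illustration under assumptions: the user name, tablespace name, and datafile path are placeholders, and tablespace_sql is a hypothetical helper. On a real system the output would be reviewed and then run in SQL*Plus.

```shell
# Print SQL that creates a tablespace of the given size (in GB) and assigns it
# as the default tablespace of the instance user.
# Arguments: user, tablespace name, datafile path, size in GB (all placeholders).
tablespace_sql() {
    printf "CREATE TABLESPACE %s DATAFILE '%s' SIZE %sG;\n" "$2" "$3" "$4"
    printf 'ALTER USER %s DEFAULT TABLESPACE %s;\n' "$1" "$2"
}

# Example: roughly 10GB of data to be crawled and indexed.
tablespace_sql us_owner us_data /u01/oradata/us_data01.dbf 10
```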
The Oracle Ultra Search backend consists of the following:
Oracle Ultra Search database schema: Data dictionary and PL/SQL packages.
Oracle Ultra Search crawler: Java program plus supporting files, libraries, and so on.
Oracle Ultra Search remote crawler: Crawler residing on a remote Oracle home.
The Oracle Ultra Search backend is installed as part of the Oracle Database Server installation, which is accomplished using the Oracle Universal Installer.
See Also: Oracle Universal Installer Concepts Guide.
The Oracle Ultra Search middle tier includes the following:
Oracle Ultra Search administration tool
Oracle Ultra Search Java query API
Oracle Ultra Search query applications
This section covers Oracle Ultra Search postinstallation tasks. There are five steps to the postinstallation:
This section explains these five steps in more detail.
For the Oracle Database release, use the following command to start the Oracle Ultra Search middle tier. You must run this command manually to bring up the Oracle Ultra Search middle tier after installation.
$ORACLE_HOME/bin/searchctl start
For the Oracle Application Server release, use Oracle Process Manager and Notification Server (OPMN) to start the OC4J_Portal instance. For example:
$ORACLE_HOME/opmn/bin/opmnctl startproc instancename=OC4J_Portal
The Oracle Ultra Search installer creates a default Oracle Ultra Search instance based on the default Oracle Ultra Search test user. You can test Oracle Ultra Search functionality based on the default instance after installation.
The default instance name is WK_INST. It is created based on the database user WK_TEST. The default user password is WK_TEST.
For security purposes, WK_TEST is locked after the installation. The administrator should log on to the database with the DBA role, unlock the WK_TEST user account, and set the password to WK_TEST. The password expires after the installation. If the password is changed to anything other than WK_TEST, then after you change the password in the database you must also update the cached schema password using the Edit Instance page of the administration tool.
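As a sketch of the unlock step, the following helper prints the two ALTER USER statements a DBA would typically run. The helper name unlock_user_sql is hypothetical; the statements themselves are standard Oracle SQL, and on a real system the output would be run in a SQL*Plus session connected with DBA privileges.

```shell
# Print the SQL to unlock a database account and reset its expired password.
# Arguments: user name, new password.
# unlock_user_sql is a hypothetical helper used only for illustration.
unlock_user_sql() {
    printf 'ALTER USER %s ACCOUNT UNLOCK;\n' "$1"
    printf 'ALTER USER %s IDENTIFIED BY %s;\n' "$1" "$2"
}

# Default test user and password described above.
unlock_user_sql WK_TEST WK_TEST
```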
The default instance is also used by the Oracle Ultra Search query application. Make sure to update the data-sources.xml file.
Caution: Storing clear text passwords in data-sources.xml poses a security risk. Avoid this by using password indirection to specify the password. This lets you enter the password in jazn-data.xml, which automatically gets encrypted, and point to it from data-sources.xml. For more information, see "Creating An Indirect Password" in Oracle Application Server Containers for J2EE Security Guide.
The data-sources.xml file is located in the $ORACLE_HOME/oc4j/j2ee/OC4J_SEARCH/config directory. Under the <data-sources> tag, add the following:
<data-source class="oracle.jdbc.pool.OracleConnectionCacheImpl" name="UltraSearchDS" location="jdbc/UltraSearchPooledDS" username="username" password="password" url="jdbc:oracle:thin:@database_host:oracle_port:oracle_sid" />
In the preceding syntax, the following variables were used:
username and password are the Oracle Ultra Search instance owner's database user name and password.
database_host is the host name of the back end database computer.
oracle_port is the port to the user's Oracle Database.
oracle_sid is the SID of the user's Oracle Database.
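The url attribute is assembled from the three variables just described. A minimal sketch follows, with a hypothetical helper named jdbc_thin_url; the host, port, and SID in the example call are the sample values used elsewhere in this chapter.

```shell
# Compose a thin-driver JDBC URL of the form
# jdbc:oracle:thin:@database_host:oracle_port:oracle_sid
# jdbc_thin_url is a hypothetical helper used only for illustration.
jdbc_thin_url() {
    # Reject an empty or non-numeric port before building the URL.
    case $2 in
        ''|*[!0-9]*) echo "port must be numeric" >&2; return 1 ;;
    esac
    printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$1" "$2" "$3"
}

jdbc_thin_url localhost 1521 isearch
```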
In addition to user name, password, and JDBC URL, data-sources.xml allows configuration of the connection cache size and the cache scheme.
The following tag specifies the minimum and maximum limits of the cache size, the inactivity time out interval, and the cache scheme.
<data-source class="oracle.jdbc.pool.OracleConnectionCacheImpl" name="UltraSearchDS" location="jdbc/UltraSearchPooledDS" username="wk_test" password="wk_test" url="jdbc:oracle:thin:@localhost:1521:isearch" min-connections="3" max-connections="30" inactivity-timeout="30"> <property name="cacheScheme" value="1"/> </data-source>
If you are adding the data source for the default Oracle Ultra Search instance user WK_TEST, then make sure to unlock WK_TEST first.
There are three values for the caching schemes:
1 = DYNAMIC_SCHEME
2 = FIXED_WAIT_SCHEME
3 = FIXED_RETURN_NULL_SCHEME
For the database release, stop and restart the Oracle Ultra Search middle tier. For example:
For the Oracle Application Server release, use Oracle Process Manager and Notification Server (OPMN) to start the OC4J_Portal instance. For example:
$ORACLE_HOME/opmn/bin/opmnctl startproc instancename=OC4J_Portal
You must reconfigure the Oracle Ultra Search backend to adapt to the new character set if one of the following has occurred:
You choose a database character set (for example, UTF8) other than the default WE8ISO8859P1 character set during the Oracle Ultra Search installation.
The database character set is changed after installation.
Two SQL scripts (wk0prefcheck.sql and wk0idxcheck.sql), located in $ORACLE_HOME/ultrasearch/admin/, are used for reconfiguration:
wk0prefcheck.sql is run by the wksys user to reconfigure the default cache character set and index preferences.
wk0idxcheck.sql is needed for reconfiguring instances created before the database character set change (for example, the default instance). This script must be run by the instance owner, and wk0prefcheck.sql must be run first, because wk0idxcheck.sql depends on the reconfigured default settings generated by wk0prefcheck.sql.
Running wk0idxcheck.sql also drops and recreates the Oracle Text index used by Oracle Ultra Search. If there are already data sources indexed, then you must force a recrawl of all of the data sources.
wk0idxcheck.sql must be run once for each instance. For example, if there are two instances, inst1 and inst2, owned by owner1 and owner2, respectively, then wk0idxcheck.sql should be run twice, once by owner1 and once by owner2.
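The required ordering can be sketched as a helper that prints which script each user runs, in sequence. The helper name reconfig_plan and the Oracle home path are assumptions for illustration; the sketch only prints the plan and does not connect to a database.

```shell
# Print the reconfiguration steps in the required order:
# wk0prefcheck.sql once as wksys, then wk0idxcheck.sql once per instance owner.
# Arguments: admin directory, followed by one or more instance owners.
# reconfig_plan is a hypothetical helper; it prints steps and runs nothing.
reconfig_plan() (
    admin_dir=$1
    shift
    echo "as wksys: @$admin_dir/wk0prefcheck.sql"
    for owner in "$@"; do
        echo "as $owner: @$admin_dir/wk0idxcheck.sql"
    done
)

# Two instances owned by owner1 and owner2, with a hypothetical Oracle home.
reconfig_plan /u01/oracle10g/ultrasearch/admin owner1 owner2
```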
Note: Oracle Ultra Search only supports database character sets supported by Oracle Text. For example, the AL32UTF8 character set is not supported. For Unicode support, use UTF8. For the complete list of supported database character sets, see the Oracle Text Reference for lexer types.
This section describes how to check that your installation was successful.
If you log on to the Oracle Ultra Search administration tool successfully, then you have completed the Oracle Ultra Search administration tool configuration process. Do the following to check the Oracle Ultra Search Administration Tool:
Check that the Web Server is running.
Attempt to log on to the administration tool:
Visit the following URL:
http://hostname.domainname:port/ultrasearch/admin/index.jsp
In the preceding URL, hostname.domainname is the full name of the host where you have installed the Oracle Ultra Search middle tier, and port is the default Web server port.
Log on to the Oracle Ultra Search administration tool by entering the Oracle Ultra Search instance owner's database user name and password. During the installation of the Oracle Ultra Search backend, a new Ultra Search instance owner, WK_TEST, is created.
The first time any JSP page is accessed, it takes a few seconds to compile. Subsequent accesses are much faster.
After you verify that the Oracle Ultra Search administration tool is working, you should be able to run the Oracle Ultra Search query applications.
To test the Oracle Ultra Search query applications, do one of the following:
Visit the following URL:
http://hostname.domainname:port/ultrasearch/query/search.jsp
Follow the links in the Oracle Ultra Search welcome page:
http://hostname.domainname:port/ultrasearch/index.html
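The administration, query, and welcome URLs share the same root and differ only in the trailing path. A minimal sketch follows; the helper name ultrasearch_url is hypothetical, and the host name and port in the example calls are placeholders for your own middle tier.

```shell
# Compose an Oracle Ultra Search URL from host, port, and the page path that
# follows /ultrasearch/.
# ultrasearch_url is a hypothetical helper used only for illustration.
ultrasearch_url() {
    printf 'http://%s:%s/ultrasearch/%s\n' "$1" "$2" "$3"
}

# Placeholder host and port; substitute the values for your Web server.
ultrasearch_url search.example.com 7777 admin/index.jsp
ultrasearch_url search.example.com 7777 query/search.jsp
ultrasearch_url search.example.com 7777 index.html
```

Each printed URL can then be opened in a browser or probed with an HTTP client of your choice.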
Locations for query applications are listed in the following section. Access the query source code by going to the directories list. You can also see a working demonstration of each query JSP page with the URL root, and you can append the correct JSP file name at the end of the URL root.
The query application is shipped as $ORACLE_HOME/ultrasearch/ultrasearch_query.ear.
The portlet is shipped as $ORACLE_HOME/ultrasearch/webapp/ultrasearch_portlet.ear.
This section describes how to troubleshoot Oracle Ultra Search.
Query finds no results
See "Reconfigure the Oracle Ultra Search Backend for the Database Character Set".
Error when processing binary files
The Oracle Ultra Search crawler uses the Oracle Text INSO filter, ctxhx, for processing of binary files. These are non-text, non-HTML files such as PDF files, Microsoft Word files, and so on. For Oracle Ultra Search to be able to use the INSO filter, the shared library path environment variable must contain the $ORACLE_HOME/ctx/lib path.
At installation, the Oracle Universal Installer automatically sets the variable to include $ORACLE_HOME/ctx/lib. If you restart the database after the installation, then you must manually set your shared library path environment variable to include $ORACLE_HOME/ctx/lib before starting the Oracle process. You must restart the database to pick up the new value for filtering to work.
On UNIX, set the $LD_LIBRARY_PATH environment variable to include $ORACLE_HOME/ctx/lib.
On Windows, set the $PATH environment variable to include $ORACLE_HOME/bin.
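On UNIX, the check and the fix can be sketched with two small helpers that work on any colon-separated path list. The helper names path_contains and path_with are hypothetical, and the Oracle home in the example call is a placeholder.

```shell
# Return success if the colon-separated list $1 contains directory $2.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print list $1 with directory $2 appended, unless it is already present.
path_with() {
    if path_contains "$1" "$2"; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "${1:+$1:}$2"
    fi
}

# Example with a hypothetical Oracle home.
path_with "/usr/lib:/opt/lib" "/u01/oracle10g/ctx/lib"
```

In a Bourne-style shell, the result could be applied before starting the database with LD_LIBRARY_PATH=$(path_with "$LD_LIBRARY_PATH" "$ORACLE_HOME/ctx/lib") followed by export LD_LIBRARY_PATH.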
Error when crawling a file data source
If the globalization setting for the environment that starts the Oracle Database is not compatible with the target files' locale, then a file-not-found error occurs for files or directories with names containing CJK characters. This error occurs in a multibyte language environment such as Chinese, Japanese, or Korean, because the crawler relies on the correct locale setting to read operating system files.
To correct this, set the correct locale, restart the Oracle Database, and force Oracle Ultra Search to re-crawl the data source. For example:
Shut down the Oracle Database instance:
SQL> shutdown immediate
Set the locale to 'ja' with the following:
> setenv LANG ja
> setenv LC_ALL ja
Restart the Oracle Database instance:
SQL> startup
Restart the Oracle Ultra Search schedule with a forced re-crawl.
Cannot log on to the Oracle Ultra Search administration tool
The ultrasearch.properties file contains configuration information used by the Oracle Ultra Search middle tier. You do not need to edit this file, because it is automatically configured by the Oracle Universal Installer.
However, with a software-only or an advanced database installation, you must manually configure the Oracle Ultra Search administration tool by editing it. You must replace %THIN_JDBC_CONN_STR% with a JDBC string to the database, and replace %DOMAIN% with the domain name.
Here is an example of the ultrasearch.properties file:
connection.driver=oracle.jdbc.driver.OracleDriver
#If set, The JDBC connection URL specified here will override the dynamically
#acquired one from Oracle Internet Directory.
#This setting is also used by the query sample (gsearch.jsp)
#Example: connection.url=jdbc:oracle:thin:@<host>:<port>:<sid>
connection.url=%JDBC_CONN_STR%
oracle.net.encryption_client=REQUESTED
oracle.net.encryption_types_client=(RC4_56,DES56C,RC4_40,DES40C)
oracle.net.crypto_checksum_client=REQUESTED
oracle.net.crypto_checksum_types_client=(MD5)
oid.app_entity_cn=m16bi.sgtcnsun03.cn.oracle.com
domain=us.oracle.com
In the preceding example, the following variables were used:
connection.driver specifies the JDBC driver you are using.
connection.url specifies the database to which the middle tier connects. Oracle Ultra Search supports the following formats:
host:port:SID (where host is the full host name of the Oracle Database instance running Oracle Ultra Search, port is the listener port number for the Oracle Database instance, and SID is the Oracle Database instance ID)
HA-aware string (for example, TNS keyword-value syntax)
oracle.net.encryption_client, oracle.net.encryption_types_client, oracle.net.crypto_checksum_client, and oracle.net.crypto_checksum_types_client control the properties of the secure JDBC connection made to the database. See Oracle Database JDBC Developer's Guide and Reference for more information.
oid.app_entity_cn specifies the Oracle Ultra Search middle tier application entity name.
domain specifies the common domain for the Identity Management machine and the Oracle Ultra Search middle tier machine. This enables the delegated administrative service (DAS) list of values to work with Internet Explorer. For example, if the Oracle Ultra Search middle tier is in us.oracle.com and the Identity Management machine is in uk.oracle.com, then the common domain is oracle.com.
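The common domain in the example above is the longest run of trailing labels that the two domain names share. A minimal sketch of that computation follows; the helper name common_domain is hypothetical, and the comparison is case-sensitive, so lowercase both names first if needed.

```shell
# Print the longest common dot-separated suffix of two domain names.
# Prints an empty line when the names share no trailing label.
# common_domain is a hypothetical helper used only for illustration.
common_domain() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "[.]")
        m = split(b, y, "[.]")
        out = ""
        # Walk both names from the last label backward while the labels match.
        while (n > 0 && m > 0 && x[n] == y[m]) {
            out = (out == "") ? x[n] : x[n] "." out
            n--
            m--
        }
        print out
    }'
}

common_domain us.oracle.com uk.oracle.com
```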
Note: You no longer need to configure the JDBC connect string in the ultrasearch.properties file. The database connect information is taken from Oracle Internet Directory.
The Oracle Ultra Search remote crawler allows multiple crawlers to run in parallel on different hosts. However, all remote crawler hosts must share common resources, such as common directories and a common Oracle Ultra Search database.
The Oracle Ultra Search remote crawler is part of the Oracle Ultra Search backend. The crawler installation procedure is the same as installing the Oracle Ultra Search backend.
On each remote crawler host, the Oracle Ultra Search backend is installed under a common directory known as ORACLE_HOME. You should have been prompted by the Oracle Universal Installer to enter this directory. The remote ORACLE_HOME directory is referred to as $REMOTE_ORACLE_HOME.
If you choose not to install the Oracle HTTP Server during the Oracle Application Server installation, then you must perform the following steps manually for remote crawling:
Locate the file that defines the environment.
On UNIX, $REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/unix/define_env
On Windows, $REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/winnt/define_env.bat
Replace %ORACLE_HOME% with the value of the REMOTE_ORACLE_HOME environment variable.
Replace %s_jreLocation% with the directory path of a Java runtime environment (JRE) version 1.2.2 or higher. You should specify the root directory of the JRE.
Replace %s_jreJDBCclassfile% with the full path and file name of the Oracle JDBC Thin driver version 12.
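The three replacements above can be sketched as a sed filter that reads a template on standard input. This is an illustration under assumptions: the helper name fill_define_env is hypothetical, the template line and all paths in the example are placeholders, and the real define_env file has its own contents.

```shell
# Fill in the three installer placeholders in a define_env template read from
# standard input. Arguments: Oracle home, JRE root directory, JDBC driver file.
# "|" is the sed delimiter, so the values must not contain "|", "&", or
# backslashes. fill_define_env is a hypothetical helper.
fill_define_env() {
    sed -e "s|%ORACLE_HOME%|$1|g" \
        -e "s|%s_jreLocation%|$2|g" \
        -e "s|%s_jreJDBCclassfile%|$3|g"
}

# Illustrative one-line template with hypothetical paths.
printf 'ORACLE_HOME=%%ORACLE_HOME%%\n' |
    fill_define_env /u01/oracle10g /usr/java/jre /u01/oracle10g/jdbc/lib/driver.zip
```

On a real host, the filter would read the existing define_env file and write to a new file, which you would review before replacing the original.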
The remote crawler requires a communication channel between the backend database and the remote crawler host.
The mechanisms of communication are RMI and JDBC. Configuration of the remote crawler differs depending on which mechanism you use. The JDBC-based mechanism requires you to supply a database user (or role) during the registration process.
The registration process is done by running a SQL script on the Oracle Ultra Search remote crawler host. The SQL script connects over SQL*Plus to the Oracle backend database and registers the remote crawler host.
Locate the correct ORACLE_HOME.
The Oracle Ultra Search middle tier is installed under a common directory known as ORACLE_HOME. If you have installed other Oracle products prior to the Oracle Ultra Search middle tier, then you could have multiple ORACLE_HOME directories on your host. The registration script requires that you enter the ORACLE_HOME directory in which the Oracle Ultra Search middle tier is installed.
Locate the WKSYS super-user password.
You must run the registration script as the WKSYS super-user or as a database user that has been granted super-user privileges.
Start SQL*Plus.
Be sure to run the correct version of SQL*Plus, because multiple versions can reside on the same host if you have previously installed some Oracle products. On UNIX platforms, make sure that the correct values for the PATH, ORACLE_HOME, and TNS_ADMIN variables are set. On Windows platforms, choose the correct menu item from the Start menu.
After you have identified how to run the correct SQL*Plus client, you must log on to the Oracle Ultra Search database. To do this, you might need to configure an Oracle Net service setting for the Oracle Ultra Search database.
After SQL*Plus is running, log on to the database using the schema and password that you located in Step 2.
Run the registration script.
Start up SQL*Plus as the WKSYS super-user and enter the following:
@full_path_of_registration_script
The registration script for RMI-based remote crawling is the following:
$REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/<platform>/register.sql
For example, if the value for $REMOTE_ORACLE_HOME on a UNIX host is /home/oracle10g, then enter the following at the SQL*Plus prompt to register an RMI-based remote crawler:
@/home/oracle10g/ultrasearch/tools/remotecrawler/scripts/unix/register.sql
The RMI-based registration script prompts you for three variables:
RMI_HOSTNAME: The remote hostname. This is where the RMI registry/daemon will run.
RMI_REGISTRY_PORT: The port that the RMI registry is listening on.
ORACLE_HOME: The Oracle home located in Step 1.
For example, /u01/oracle10g on a UNIX host or d:/u01/oracle10g on a Windows host. Remember to use forward slashes for Windows hosts.
The registration script for JDBC-based remote crawling is the following:
$REMOTE_ORACLE_HOME/ultrasearch/tools/remotecrawler/scripts/<platform>/register_jdbc.sql
For example, if you are running SQL*Plus on Windows, and $REMOTE_ORACLE_HOME is in d:\Oracle\Oracle10g, then enter the following at the SQL*Plus prompt to register a JDBC-based remote crawler:
@d:\Oracle\Oracle10g\ultrasearch\tools\remotecrawler\scripts\winnt\register_jdbc.sql
The JDBC-based registration script prompts for three variables:
LAUNCHER_NAME: An arbitrary string used to identify a JDBC-based remote crawler launcher, which is needed when you start up the JDBC-based remote crawler launcher.
CONNECTUSER: The database user (or role) that the JDBC-based remote crawler launcher will use to establish a database connection and listen for launch events.
ORACLE_HOME: The Oracle home located in Step 1.
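The two registration script paths differ only in the platform directory and the script name. The following sketch composes the SQL*Plus @ command for either mechanism using forward slashes; the helper name registration_script is hypothetical, and the platform values unix and winnt come from the script directories named in this chapter.

```shell
# Print the SQL*Plus command that runs the registration script.
# Arguments: remote Oracle home, platform (unix or winnt), mechanism (rmi or jdbc).
# registration_script is a hypothetical helper used only for illustration.
registration_script() {
    case $3 in
        rmi)  _script=register.sql ;;
        jdbc) _script=register_jdbc.sql ;;
        *) echo "mechanism must be rmi or jdbc" >&2; return 1 ;;
    esac
    case $2 in
        unix|winnt) ;;
        *) echo "platform must be unix or winnt" >&2; return 1 ;;
    esac
    printf '@%s/ultrasearch/tools/remotecrawler/scripts/%s/%s\n' "$1" "$2" "$_script"
}

registration_script /home/oracle10g unix rmi
```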
The registration script invokes the wk_crw.register_remote_crawler PL/SQL API. The REMOTE_CRAWLER_HOSTNAME and ORACLE_HOME variables are used to compose arguments for the wk_crw.register_remote_crawler API. You can also call this API directly, especially if you need to register multiple remote crawlers programmatically.
Verify and complete the remote crawler profile configuration. Be sure to enter the correct values for both variables. To verify that the registration has completed correctly, do the following:
Log on to the Oracle Ultra Search administration tool.
Click the Remote Crawler Profiles tab in the Crawler tab. You should see the remote crawler launcher you registered in the remote crawler profile list.
For RMI-based remote crawlers, you will see the host:port combination that uniquely identifies the RMI-subsystem.
For JDBC-based remote crawlers, you will see the Launcher name.
Click Edit to complete the configuration process for the remote crawler profile.
See Also: Oracle Net Services Administrator's Guide for information on how to configure a service setting
If you enter any wrong values for the register.sql script, then you should unregister the remote crawler using the unregister.sql script. Run the unregister script the same way as you ran the registration script. The unregister.sql script calls the wk_crw.unregister_remote_crawler PL/SQL API. After you have successfully unregistered the remote crawler, you can rerun the register.sql script.
Before you upgrade, log on to the Oracle Ultra Search administration tool. Stop and disable all crawler synchronization schedules in every Oracle Ultra Search instance. You can enable all crawler synchronization schedules after the upgrade.
To upgrade Oracle Ultra Search shipped with the Oracle Database release, do the following:
Run the Oracle Ultra Search backend upgrade. This includes upgrading the Oracle Ultra Search database schemas and server files. Install the new Oracle software, and run Oracle Database Upgrade Assistant to upgrade the database and Oracle Ultra Search component to the new release. See the Oracle Database Upgrade Guide for details.
Follow the steps in "Installing the Oracle Ultra Search Middle Tier" to install the new Oracle Ultra Search middle tier.
After upgrading to the current release, follow these post-upgrade configuration steps:
Set the ORACLE_HOME and ORACLE_SID environment variables to Oracle Database 10g.
Change directories to ORACLE_HOME/ultrasearch/admin/.
Run the following statement:
sqlplus "sys/password as sysdba"
Run the following statement:
@wk0config.sql WKSYSPW JDBC_CONNSTR LAUNCH_ANYWHERE NET_SERVICE_NAME
In the preceding statement, the following parameters were used:
WKSYSPW is the password for the WKSYS schema.
JDBC_CONNSTR is the JDBC connection string. Use the format hostname:port:sid, such as machine1:1521:iasdb, if the database is not in an Oracle Real Application Clusters environment.
If the database is in an Oracle Real Application Clusters environment, then use the TNS keyword-value format instead, because it allows connection to any node of the system:
(DESCRIPTION=(LOAD_BALANCE=yes) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=cls02a)(PORT=3001)) (ADDRESS=(PROTOCOL=TCP)(HOST=cls02b)(PORT=3001))) (CONNECT_DATA=(SERVICE_NAME=sales.us.acme.com)))
In the preceding syntax, the following parameters were used:
LAUNCH_ANYWHERE is the mode of the database. Setting it to TRUE indicates that the database is in Oracle Real Application Clusters mode; FALSE indicates that the database is not in Oracle Real Application Clusters mode.
NET_SERVICE_NAME is the network service name used by wk0config.sql to establish the database connection. Setting it to "" (empty string) while running wk0config.sql from the database host eliminates the need to specify the network service name.
The following is an example of the post-upgrade script for a non-Oracle Real Application Cluster environment:
@wk0config.sql welcome1 machine:1521:iasdb FALSE ""
The following is an example of the post-upgrade script for an Oracle Real Application Clusters environment:
@wk0config.sql welcome1 "(DESCRIPTION=(LOAD_BALANCE=yes) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=cls02a)(PORT=3001)) (ADDRESS=(PROTOCOL=TCP)(HOST=cls02b)(PORT=3001))) (CONNECT_DATA=(SERVICE_NAME=sales.us.acme.com)))" TRUE ""
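The four positional parameters of wk0config.sql can be assembled with a small helper that also checks the LAUNCH_ANYWHERE value. This is a sketch under assumptions: the helper name wk0config_command is hypothetical, and it always wraps the connection string in double quotes, which is needed for the TNS keyword-value form because it contains spaces and is assumed to be harmless for the hostname:port:sid form.

```shell
# Print the SQL*Plus line that runs wk0config.sql.
# Arguments: WKSYSPW, JDBC_CONNSTR, LAUNCH_ANYWHERE (TRUE or FALSE),
# NET_SERVICE_NAME (may be empty).
# wk0config_command is a hypothetical helper used only for illustration.
wk0config_command() {
    case $3 in
        TRUE|FALSE) ;;
        *) echo "LAUNCH_ANYWHERE must be TRUE or FALSE" >&2; return 1 ;;
    esac
    printf '@wk0config.sql %s "%s" %s "%s"\n' "$1" "$2" "$3" "$4"
}

# Single-instance database, run from the database host.
wk0config_command welcome1 machine1:1521:iasdb FALSE ""
```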