Oracle9i Application Developer's Guide - Large Objects (LOBs) Release 2 (9.2) Part Number A96591-01 |
|
This chapter contains the following sections:
This description has been extracted from an article by Samir S. Shah in Java Developer's Journal. Reprinted by permission of Java Developer's Journal.
Toolset Used
Today building an information repository is essential for businesses. the information repository helps establish a paperless office and allows data to be shared in or outside an enterprise.
With the toolset mentioned earlier, you can build an enterprise-class, scalable web-enabled multimedia-rich information repository that incorporates various forms of media. This repository includes unstructured data, such as document files, video clips, photographs, ... and sound files. It uses Java and Oracle's Large Objects (LOBs).
This section describes how you can build such an information repository for storing and searching documents such as Microsoft Word, HTML, and XML files which are stored in a LOB column of a database table.
The example used here populates the repository with Microsoft Word resumes, indexes it using Oracle Text (interMedia Text), and reads the repository using Java streams from a servlet.
See Figure 14-1.
Building repositories using Java and Oracle8i/9i has several benefits. The documents can inherently take advantage of the transaction management and ACID (Atomocity, Concurrency, Integrity, and Durability) properties of the relational database. This means that changes to an internal LOB can be committed or rolled-back. Moreover, because the unstructured data is stored by the database, your applications can seamlessly take advantage of database features such as backup and recovery. This helps Administrators who would no longer have to perform separate database and file system backups for relational information and documents.
All data in the database, including structured (relational) and unstructured (document files), can be written, searched, and accessed using SQL. The SQL statements can be executed from Java using JDBC.
Oracle8i and Oracle9i support several types of LOB columns. One type, BLOBs, can house binary information such as audio, video, images, and comments internally in the database. Each row can store up to 4 gigabytes of data. The application described here uses a BLOB data type to store Microsoft Word resumes.
The Oracle database stores a locator in-line with the data. The locator is a pointer to the actual location of the data (LOB value). The LOB data can be stored in the same or a separate table. the advantage of using the locator is that the database will not have to scan the LOB data each time it reads multiple rows because only the LOB locator value is read. The actual LOB data is read only when required.
When working with Java and LOBs, first execute the SELECT statement to get the LOB locator, then read or write LOBs using JDBC.
Note: The JDBC driver's Oracle type extension package, oracle.sql, is used to read and write from an oracle database. |
The actual LOB data is materialized as a java stream from the database, where the locator represents the data in the table. The following code reads the resume of an employee whose employee number is 7900. Employee number is stored in a LOB column called "resume" in table, sam_emp.
Statement st = cn.createStatement(); ResultSet rs = st.executeQuery ("Select resume from sam_emp where empno=7900"); rs.next(); oracle.sql.BLOB blob=((OracleResultSet)rs).getBLOB(1); InputStream is=blob.getBinaryStream();
The documents can be written to LOB columns using Java, PL/SQL, or a bulk loading utility called Oracle SQL*Loader. To insert a new row, perform the following:
getBinaryOutputStream()
method of this object to write your document or any binary information to that column. For example, to insert information about a new employee whose employee number is 9001 in table sam_emp, first insert all the structured information along with an empty BLOB using JDBC. next select the LOB column, resume, of the same row to get the oracle.sql.BLOB object (the locator).st.execute("INSERT INTO sam_emp(empno, resume) VALUES(9001,empty_blob())"); ResultSet rs = st.executeQuery( "select resume from sam_emp where empno=9001 for update"); rs.next(); oracle.sql.BLOB blob = ((OracleResultSet)rs).getBLOB(1); OutputStream os = blob.getBinaryOutputStream();
Optionally, you can use java.awt.FileDialog
class and java.io
package to dynamically select and read a file from your PC. Then, load it to a LOB column using the preceding code.
The way you search and retrieve documents does not depend on how you load the documents. For example, you can store the documents using PL/SQL or SQL*Loader, then search and retrieve using Java servlets.
The following code (steps 2-5) inserts MyResume.doc in the resume column of sam_emp table.
You must have CREATE DIRECTORY privilege in Oracle.
create or replace directory MY_FILES as 'C:\MY_DATA';
declare f_lob bfile; b_lob blob; begin insert into sam_emp(empno,ename,resume) values ( 9001, 'Samir',empty_blob() ) return documents into b_lob; f_lob := bfilename( 'MY_FILES', 'MyResume.doc' ); dbms_lob.fileopen(f_lob, dbms_lob.file_readonly); dbms_lob.loadfromfile ( b_lob, f_lob, dbms_lob.getlength(f_lob) ); dbms_lob.fileclose(f_lob); commit; end; /
Documents stored in the LOB columns can be indexed using Oracle Text (interMedia Text). Oracle9i Text provides you with advanced search capabilities such as fuzzy, stemming, proxy, phrases, and more. It can also generate thematic searches and gist. The documents can be indexed using `create index' database command.
The following code shows you how the index was built on the resume column of the sam_emp table. Once the index is created, the Java applications can search the repository by simply submitting SELECT statements.
The following steps index all the Microsoft Word formatted resumes stored in the resume column to the sam_emp table.The resumes can then be searched using SQL.
alter table sam_emp add constraint pk_sam_emp primary key(empno);
create index ctx_doc_idx on sam_emp(resume) indextype is ctxsys.context parameters ('filter CTXSYS.INSO_FILTER');
The following code lists the servlet `MyServletCtx'. It searches the term passed to it as a parameter in the resume column of table, sam_emp. The servlet returns the rows matching the search criteria in HTML table format. The employee names in the HTML table are hyperlinked to another servlet, `MyServlet`, which reads the entire resumé from the database, in its original format.
1234567890123456789012345678901234567890123456789012 package package1; import javax.servlet.*; import javax.servlet.http.*; import java.io.*; import java.sql.*; /** * This servlet searches documents stored in Oracle8i * database repository using SQL and JDBC. The hit * list is displayed in html table with hyper links. * JDK 1.1.7 and Oracle Thin JDBC 1.22 complient * driver is used. * * @author Samir Shah * @version 1.0 **/ public class MyServletCtx extends HttpServlet{ Connection cn; public void init(ServletConfig parm1) throws ServletException { super.init( parm1); try{ DriverManager.registerDriver( (new oracle.jdbc.driver.OracleDriver())); cn =DriverManager.getConnection ("jdbc:oracle:thin:@sshah:1521:o8i", "scott", "tiger"); } catch (SQLException se){se.printStackTrace();} } public void doGet(HttpServletRequest req, HttpServletResponse res) throws IOException{ doPost(req,res); } public void doPost(HttpServletRequest req, HttpServletResponse res) throws IOException{ PrintWriter out = res.getWriter(); res.setContentType("text/html"); //The term to search in resume column String term = req.getParameter("term"); if (term == null) term="security"; out.print("<html>"); out.print("<body>"); out.print("<H1>Search Result</H1>"); out.print("<table border=1 bgcolor=lightblue>"); out.print("<tr><th>ID#</th><th>Name</th></tr>"); out.print("<tr>"); try{ Statement st = cn.createStatement(); //search the term in resume column using SQL String query = "Select empno,ename from sam_emp" + " where contains(resume,'"+term+"')>0"; ResultSet rs = st.executeQuery(query); while (rs.next()){ out.print("<td>"+ rs.getInt(1)+"</td>"); out.print("<td>" + "<A HREF=http://sshah:8080/" + "servlet/MyServlet?term=" + rs.getString(1) + " target=Document>" + rs.getString(2) + "</A></td>"); out.print("</tr>"); } out.print("</table>"); out.print("</body>"); out.print("</html>"); }//try catch (SQLException se){se.printStackTrace();} } }
The document retrieval using Java is similar to writing documents to the repository. The section, "How this Application Uses LOBs" describes how to read LOBs from the database.
The following code in `MyServlet' reads a Microsoft Word resumé from the table, sam_emp. It sets the content type, then streams it out to the browser using an output stream.
1234567890123456789012345678901234567890123456789012 package package1; import javax.servlet.*; import javax.servlet.http.*; import java.sql.*; import java.io.*; import oracle.jdbc.driver.*; import oracle.sql.*; //for oracle.sql.BLOB /** * This class reads the entire document from the * resume LOB column. It takes one parameter,term, * to search a specific employee from the sam_emp * table and returns the doucement stored in that * row. * * JDK 1.1.7, Oracle Thin JDBC 1.22 complient driver * Use Oracle JDBC Type extends package oracle.sql. * * @author Samir Shah * @version 1.0 **/ public class MyServlet extends HttpServlet{ Connection cn; public void doGet(HttpServletRequest req, HttpServletResponse res) { try{ doPost(req,res); }catch (IOException ie){ie.printStackTrace();} } public void init(ServletConfig parm1) throws ServletException { super.init( parm1); try{ DriverManager.registerDriver( (new oracle.jdbc.driver.OracleDriver())); cn =DriverManager.getConnection( "jdbc:oracle:thin:@sshah:1521:o8i", "scott", "tiger"); } catch (SQLException se){se.printStackTrace();} } public void doPost(HttpServletRequest req, HttpServletResponse res) throws IOException { InputStream is=null; oracle.sql.BLOB blob=null; res.setContentType("application/msword"); OutputStream os = res.getOutputStream(); String term = req.getParameter("term"); if (term==null) term="9001"; try{ Statement st = cn.createStatement(); ResultSet rs = st.executeQuery ("Select resume from sam_emp"+ " where empno="+term); while (rs.next()){ blob=((OracleResultSet)rs).getBLOB(1); is=blob.getBinaryStream(); } int pos=0; int length=0; byte[] b = new byte[blob.getChunkSize()]; while((length=is.read(b))!= -1){ pos+=length; os.write(b); } }//try catch (SQLException se) { se.printStackTrace(); } finally { is.close(); } } }
This section showed you how to store, search and retrieve Word documents using LOB data types and Java.
You can also store, index, parse and transform XML documents using the Oracle9i database. By storing XML documents in the database, there is no need to administer and manage multiple repositories for relational and XML data. Oracle9i and Oracl9i Application Server are XML-enabled whereby you can run the Oracle XML Parser for Java and parse and transform XML files in the database before outputting to an application server.
Design and Build a LOB and interMedia Based Web site. The Web site must include video 'thumbnails' where by users can click a specific thumbnail to see a short 6 - 8 second video clip.
Here are some ideas for setting up your LOB-based web-site:
create table video_clips ( move_id integer, thumbnail ordsys.ordimage, movie ordsys.ordvideo);
See Note 2.
Your disk space depends on the frame size, frame rate, and compression settings. One second of video, at 720x480 pixels, 30 frames per second (fps), takes roughly 3.6MB of disk space. 720x480 is pretty big for the web but should be fine if this is on an intranet. 30 fps looks very smooth but might not be necessary. Test a sample to see what 320x240 looks like. Check if there is sufficient detail present. If not, increase the resolution until you are satisfied.
|
Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved. |
|