Oracle® Application Server Containers for J2EE Support for JavaServer Pages Developer's Guide
10g Release 2 (10.1.2) B14014-02
The JSP container in OC4J provides standard globalization support (also known as National Language Support, or NLS) according to the JSP specification, and also offers extended support for servlet environments that do not support multibyte parameter encoding.
Standard Java support for localized content depends on the use of Unicode for uniform internal representation of text. Unicode is used as the base character set for conversion to alternative character sets. (The Unicode version depends on the JDK version. You can find the Unicode version through the Sun Microsystems Javadoc for the java.lang.Character class.)
This chapter describes key aspects of JSP support for globalization and internationalization. The following sections are included:
Note: For detailed information about Oracle Application Server Globalization Support, see the Oracle Application Server Globalization Guide.
The following sections cover standard ways to statically or dynamically specify the content type for a JSP page. There is also discussion of an Oracle extension method that enables you to specify a non-IANA (Internet Assigned Numbers Authority) character set for the JSP writer object.
The page directive has two attributes, pageEncoding and contentType, that affect the character encoding of the JSP page source (during translation) or response (during runtime). The contentType attribute also affects the MIME type of the response. The function of each attribute is as follows:
You can use contentType to set the character encoding of the page source and response, and the MIME type of the response.
You can use pageEncoding to set the character encoding of the page source. The main purpose of this attribute, which was introduced in the JSP 1.2 specification, is to allow you to set a page source character encoding that is different from the response character encoding. However, this setting also acts as a default for the response character encoding if there is no contentType attribute that specifies a character set.
There is more information about the relationship between contentType and pageEncoding later in this section.
Use the following syntax for contentType:
contentType="TYPE; charset=character_set"
Alternatively, to set the MIME type while using the default character set:
contentType="TYPE"
Use the following syntax for pageEncoding:
pageEncoding="character_set"
Use the following syntax to set everything:
<%@ page ... contentType="TYPE; charset=character_set" pageEncoding="character_set" ... %>
TYPE is an IANA MIME type; character_set is an IANA character set. When specifying a character set through the contentType attribute, the space after the semicolon is optional.
Here are some examples of contentType and pageEncoding settings:
<%@ page language="java" contentType="text/html" %>
or:
<%@ page language="java" contentType="text/html; charset=ISO-8859-1" %>
or:
<%@ page language="java" contentType="text/html; charset=ISO-8859-1" pageEncoding="US-ASCII" %>
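To make the contentType syntax concrete, the following is a minimal sketch of how such a value splits into a MIME type and an optional character set. The class ContentTypeValue is hypothetical and is not part of OC4J or the servlet API; it only illustrates the syntax described above, including the optional space after the semicolon.

```java
import java.util.Locale;

/** Hypothetical helper (not an OC4J class): splits a contentType value into its parts. */
final class ContentTypeValue {
    final String mimeType;
    final String charset; // null when the value carries no charset parameter

    private ContentTypeValue(String mimeType, String charset) {
        this.mimeType = mimeType;
        this.charset = charset;
    }

    /** Parses values such as "text/html" or "text/html; charset=ISO-8859-1". */
    static ContentTypeValue parse(String value) {
        String[] parts = value.split(";");
        String mime = parts[0].trim();
        String charset = null;
        for (int i = 1; i < parts.length; i++) {
            // trim() makes the space after the semicolon optional
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                charset = param.substring("charset=".length()).trim();
            }
        }
        return new ContentTypeValue(mime, charset);
    }

    public static void main(String[] args) {
        ContentTypeValue v = parse("text/html;charset=ISO-8859-1");
        System.out.println(v.mimeType + " / " + v.charset);
    }
}
```

A value without a charset parameter yields a null charset here, which corresponds to the case where the default character set applies.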
Without any page directive settings, default settings are as follows:
The default MIME type is text/html for traditional JSP pages; it is text/xml for JSP XML documents.
The default for the page source character encoding (for translation) is ISO-8859-1 (also known as Latin-1) for traditional JSP pages; it is UTF-8 or UTF-16 for JSP XML documents.
The default for the response character encoding is ISO-8859-1 for traditional JSP pages; it is UTF-8 or UTF-16 for JSP XML documents.
The determination of UTF-8 versus UTF-16 is according to "Autodetection of Character Encodings" in the XML specification, at the following location:
http://www.w3.org/TR/REC-xml.html
Be aware, however, that there is a relationship between pageEncoding and contentType regarding character encodings, as documented in Table 9-1.
Table 9-1 Effect of pageEncoding and contentType on Character Encodings
pageEncoding Status | contentType Status | Page Source Encoding Status | Response Encoding Status |
---|---|---|---|
Specified | Specified | According to pageEncoding | According to contentType |
Specified | Not specified | According to pageEncoding | According to pageEncoding |
Not specified | Specified | According to contentType | According to contentType |
Not specified | Not specified | According to default | According to default |
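The interaction between the two attributes can be sketched as a small resolution function. This is an illustration for traditional JSP pages only, derived from the attribute descriptions earlier in this section; the class EncodingResolver is hypothetical and does not reflect how the OC4J translator is actually implemented. A null argument stands for "not specified".

```java
/** Hypothetical sketch of the attribute rules for traditional JSP pages. */
final class EncodingResolver {
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    /**
     * Returns a two-element array: [0] is the page source encoding,
     * [1] is the response encoding. Pass null for an unspecified setting.
     */
    static String[] resolve(String pageEncoding, String contentTypeCharset) {
        // Page source: pageEncoding wins, then the contentType charset, then the default.
        String source = pageEncoding != null ? pageEncoding
                : contentTypeCharset != null ? contentTypeCharset
                : DEFAULT_ENCODING;
        // Response: the contentType charset wins, then pageEncoding, then the default.
        String response = contentTypeCharset != null ? contentTypeCharset
                : pageEncoding != null ? pageEncoding
                : DEFAULT_ENCODING;
        return new String[] { source, response };
    }

    public static void main(String[] args) {
        String[] r = resolve("US-ASCII", "ISO-8859-1");
        System.out.println("source=" + r[0] + ", response=" + r[1]);
    }
}
```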
Be aware of the following important usage notes.
A page directive that sets contentType or pageEncoding should appear as early as possible in the JSP page.
When a page is a JSP XML document, any pageEncoding setting is ignored. The JSP container will instead use the XML encoding declaration of the document. Consider the following example:
<?xml version="1.0" encoding="EUC-JP" ?>
<jsp:root xmlns:jsp="http://java.sun.com/JSP/Page" version="1.2">
<jsp:directive.page contentType="text/html;charset=Shift_Jis" />
<jsp:directive.page pageEncoding="UTF-8" />
...
The effective page encoding would be EUC-JP, not UTF-8.
You should use pageEncoding only for pages where the byte sequence represents legal characters in the target character set.
You should use contentType only for pages or response output where the byte sequence represents legal characters in the target character set.
The target character set of the response output (as specified by contentType, for example) should be a superset of the character set of the page source. For example, UTF-8 is a superset of Big5, but ISO-8859-1 is not.
The parameters of a page directive are static. If a page discovers during execution that a different character set specification is necessary for the response, it can do one of the following:
Use the servlet response object API to set the content type during execution, as described in "Dynamic Content Type Settings".
or:
Forward the request to another JSP page or to a servlet.
A traditional JSP page source (not a JSP XML document) written in a character set other than ISO-8859-1 must set the appropriate character set in a page directive (through the contentType or pageEncoding attribute). The character set for the page encoding cannot be set dynamically, because the JSP container has to be aware of the setting during translation.
This manual, for simplicity, assumes the typical case that the page text, request parameters, and response parameters all use the same encoding (although other scenarios are technically possible). Request parameter encoding is controlled by the browser, although Netscape and Internet Explorer browsers follow the setting you specify for the response parameters.
The IANA maintains a registry of MIME types. See the following site for a list of types:
http://www.iana.org/assignments/media-types-parameters
The IANA maintains a registry of character encodings at the following site. Use the indicated "preferred MIME name" if one is listed:
http://www.iana.org/assignments/character-sets
You should use only character sets from the IANA list, except for any additional Oracle extensions as described in "Oracle Extension for the Character Set of the JSP Writer Object".
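The usage note above about the response character set being a superset of the page source character set can be checked for a given piece of text with the standard java.nio.charset API. The following is a minimal sketch; the class ResponseCharsetCheck and its method name are hypothetical, and the sample strings are chosen only for illustration.

```java
import java.nio.charset.Charset;

/** Hypothetical helper: tests whether text survives encoding in a given character set. */
final class ResponseCharsetCheck {
    /** Returns true if every character of text can be encoded in the named character set. */
    static boolean canRepresent(String text, String charsetName) {
        // A fresh encoder is created per call because encoders are not thread-safe.
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two CJK characters, written as Unicode escapes
        System.out.println("UTF-8: " + canRepresent(chinese, "UTF-8"));
        System.out.println("ISO-8859-1: " + canRepresent(chinese, "ISO-8859-1"));
    }
}
```

A check like this applies to one string at a time, so it can flag a mismatch for sample page content but does not prove that one character set is a superset of another in general.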
For situations where the appropriate content type for the HTTP response is not known until runtime, you can set it dynamically in the JSP page. The standard javax.servlet.ServletResponse interface specifies the following method for this purpose:
void setContentType(java.lang.String contenttype)
Important: To use dynamic content type settings in an OC4J environment, you must enable the JSP static_text_in_chars configuration parameter. See "JSP Configuration Parameters" for a description.
The implicit response object of a JSP page is a javax.servlet.http.HttpServletResponse instance, where the HttpServletResponse interface extends the ServletResponse interface.
The setContentType() method input, like the contentType setting in a page directive, can include a MIME type only, or both a character set and a MIME type. For example:
response.setContentType("text/html; charset=UTF-8");
or:
response.setContentType("text/html");
As with a page directive, the default MIME type is text/html for traditional JSP pages or text/xml for JSP XML documents, and the default character encoding is ISO-8859-1.
Set the content type as early as possible in the page, before writing any output to the JspWriter object.
The setContentType() method has no effect on interpreting the text of the JSP page during translation. If a particular character set is required during translation, that must be specified in a page directive, as described in "Content Type Settings in the page Directive".
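As an illustration of choosing the content type at runtime, the following sketch builds the string that a page could pass to response.setContentType(). The class DynamicContentType and its locale-to-character-set policy (Shift_JIS for Japanese, UTF-8 otherwise) are assumptions made for this example only; they are not an OC4J feature or a recommended mapping.

```java
import java.util.Locale;

/** Hypothetical helper: picks a content type string based on a runtime locale. */
final class DynamicContentType {
    /** Assumed policy for illustration: Shift_JIS for Japanese, UTF-8 for everything else. */
    static String forLocale(Locale locale) {
        String charset = "ja".equals(locale.getLanguage()) ? "Shift_JIS" : "UTF-8";
        return "text/html; charset=" + charset;
    }

    public static void main(String[] args) {
        // In a JSP page, the result would be passed to the implicit response object
        // before any output is written, for example:
        //   response.setContentType(DynamicContentType.forLocale(request.getLocale()));
        System.out.println(forLocale(Locale.JAPAN));
        System.out.println(forLocale(Locale.US));
    }
}
```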
Note: In servlet 2.2 and higher environments, such as OC4J, the response object has a setLocale() method that takes a java.util.Locale object as input and sets the character set based on the specified locale. For example, the following method call results in a character set of Shift_JIS:
response.setLocale(new Locale("ja", "JP"));
For dynamic specification of the character set, the most recent call to
In standard usage, the character set of the content type of the response object, as determined by the page directive contentType parameter or the response.setContentType() method, automatically becomes the character set of the JSP writer object as well. The JSP writer object is a javax.servlet.jsp.JspWriter instance.
There are some character sets, however, that are not recognized by IANA and therefore cannot be used in a standard content type setting. For this reason, OC4J provides the static setWriterEncoding() method of the oracle.jsp.util.PublicUtil class:
static void setWriterEncoding(JspWriter out, String encoding)
You can use this method to specify the character set of the JSP writer directly, overriding the character set of the response object. The following example uses Big5 as the character set of the content type, but specifies MS950, a non-IANA Hong Kong dialect of Big5, as the character set of the JSP writer:
<%@ page contentType="text/html; charset=Big5" %>
<% oracle.jsp.util.PublicUtil.setWriterEncoding(out, "MS950"); %>
Note: Use the setWriterEncoding() method as early as possible in the JSP page.
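Before relying on a non-IANA encoding name, it can be useful to confirm that the running JVM recognizes it. The sketch below uses the standard java.nio.charset.Charset lookup for that purpose. Whether setWriterEncoding() resolves names the same way is an assumption, not something this manual states, and the class WriterEncodingCheck is hypothetical.

```java
import java.nio.charset.Charset;

/** Hypothetical helper: reports whether the running JVM knows an encoding name. */
final class WriterEncodingCheck {
    /** Returns true if the JVM can resolve the (syntactically legal) encoding name. */
    static boolean isUsable(String name) {
        return Charset.isSupported(name);
    }

    /** Describes the lookup result, including the canonical name the JVM reports. */
    static String describe(String name) {
        if (!isUsable(name)) {
            return name + ": not supported by this JVM";
        }
        return name + ": supported as " + Charset.forName(name).name();
    }

    public static void main(String[] args) {
        // MS950 and Big5 depend on the extended charsets shipped with the JDK,
        // so the result can differ between installations.
        System.out.println(describe("Big5"));
        System.out.println(describe("MS950"));
    }
}
```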
The servlet specification has a method, setCharacterEncoding(), in the javax.servlet.ServletRequest interface. This method is useful in case the default encoding of the servlet container is not suitable for multibyte request parameters and bean property settings, such as for a getParameter() call in Java code or a jsp:setProperty tag to set a bean property in JSP code.
The setCharacterEncoding() method and equivalent Oracle extensions affect parameter names and values, specifically:
Request object getParameter() method output
Request object getParameterValues() method output
Request object getParameterNames() method output
jsp:setProperty settings for bean property values
These topics are covered in the following sections:
Beginning with the servlet 2.3 specification, the setCharacterEncoding() method is specified in the javax.servlet.ServletRequest interface as the standard mechanism for specifying a nondefault character encoding for reading HTTP requests. The signature of this method is as follows:
void setCharacterEncoding(java.lang.String enc) throws java.io.UnsupportedEncodingException
The enc parameter is a string specifying the name of the desired character encoding and overrides the default character encoding. Call this method before reading request parameters or reading input through the getReader() method, which is also specified in the ServletRequest interface.
There is also a corresponding getter method:
String getCharacterEncoding()
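To see why the request encoding matters for multibyte parameters, the following sketch decodes the same URL-encoded parameter value with two different character encodings, using the standard java.net.URLDecoder class. It runs outside any servlet container and only imitates the decoding step; the class RequestDecodingDemo is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical demo: the same request bytes decoded with two different encodings. */
final class RequestDecodingDemo {
    /** Decodes a URL-encoded value, wrapping the checked exception for convenience. */
    static String decode(String rawValue, String encoding) {
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // %E3%81%82 is the UTF-8 byte sequence for the single character U+3042.
        String raw = "%E3%81%82";
        System.out.println("UTF-8 length: " + decode(raw, "UTF-8").length());
        System.out.println("ISO-8859-1 length: " + decode(raw, "ISO-8859-1").length());
    }
}
```

Decoding with UTF-8 yields one character, while decoding the same bytes as ISO-8859-1 yields three unrelated characters, which is the kind of corruption that a correct setCharacterEncoding() call is meant to prevent.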
In pre-2.3 servlet environments, the setCharacterEncoding() method is not available. For such environments, Oracle provides two alternative mechanisms:
oracle.jsp.util.PublicUtil.setReqCharacterEncoding() static method (preferred)
translate_params configuration parameter (or equivalent code)