The rewrite valve implements URL rewrite functionality in a way that is very similar to mod_rewrite from Apache HTTP Server.
The rewrite valve is configured as a regular valve, by adding the following to server.xml as a child of an Engine or Host element (or inside a context.xml file):
<Valve className="org.jboss.web.rewrite.RewriteValve" />
The valve will then use a rewrite.properties file containing the rewrite directives, located according to the container it is associated with.
The rewrite.properties file contains a list of directives which closely resemble the directives used by mod_rewrite, in particular the central RewriteRule and RewriteCond directives.
Note: This section is a modified version of the mod_rewrite documentation, which is Copyright 1995-2006 The Apache Software Foundation, and licensed under the Apache License, Version 2.0.
Syntax: RewriteCond TestString CondPattern
The RewriteCond directive defines a rule condition. One or more RewriteCond directives can precede a RewriteRule directive. The following rule is then used only if the current state of the URI matches its pattern and these conditions are met.
TestString is a string which can contain the following expanded constructs in addition to plain text:
| HTTP headers: | connection & request: | |
|---|---|---|
| HTTP_USER_AGENT HTTP_REFERER HTTP_COOKIE HTTP_FORWARDED HTTP_HOST HTTP_PROXY_CONNECTION HTTP_ACCEPT | REMOTE_ADDR REMOTE_HOST REMOTE_PORT REMOTE_USER REMOTE_IDENT REQUEST_METHOD SCRIPT_FILENAME REQUEST_PATH CONTEXT_PATH SERVLET_PATH PATH_INFO QUERY_STRING AUTH_TYPE | |
| server internals: | date and time: | specials: |
| DOCUMENT_ROOT SERVER_NAME SERVER_ADDR SERVER_PORT SERVER_PROTOCOL SERVER_SOFTWARE | TIME_YEAR TIME_MON TIME_DAY TIME_HOUR TIME_MIN TIME_SEC TIME_WDAY TIME | THE_REQUEST REQUEST_URI REQUEST_FILENAME HTTPS |
These variables all correspond to the similarly named HTTP MIME-headers and Servlet API methods. Most are documented elsewhere in the Manual or in the CGI specification. Those that are special to the rewrite valve include those below.
Other things you should be aware of:
CondPattern is the condition pattern, a regular expression which is applied to the current instance of the TestString. TestString is first evaluated, before being matched against CondPattern.
Remember: CondPattern is a perl compatible regular expression with some additions:
All of these tests can also be prefixed by an exclamation mark ('!') to negate their meaning.
RewriteCond %{REMOTE_HOST} ^host1.* [OR]
RewriteCond %{REMOTE_HOST} ^host2.* [OR]
RewriteCond %{REMOTE_HOST} ^host3.*
RewriteRule ...some special stuff for any of these hosts...
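The way a group of conditions gates the rule that follows can be sketched in Java. This is only an illustration, not the valve's own code: Cond, CondSketch, expand and conditionsMet are hypothetical names, the variables are supplied through a plain map, and the sketch assumes (as in mod_rewrite) that conditions are combined with AND unless a line carries the [OR] flag.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of one RewriteCond line (not part of the valve's API).
final class Cond {
    final String testString;   // e.g. "%{REMOTE_HOST}"
    final String condPattern;  // e.g. "^host1.*", or "!^host1.*" when negated
    final boolean orNext;      // true when the line carries the [OR] flag

    Cond(String testString, String condPattern, boolean orNext) {
        this.testString = testString;
        this.condPattern = condPattern;
        this.orNext = orNext;
    }
}

final class CondSketch {
    private static final Pattern VARIABLE = Pattern.compile("%\\{([A-Z_]+)\\}");

    // Replace every %{NAME} in the TestString with its value (empty if unknown).
    static String expand(String testString, Map<String, String> vars) {
        Matcher m = VARIABLE.matcher(testString);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Evaluate one condition: expand TestString first, then match; '!' negates.
    static boolean matches(Cond c, Map<String, String> vars) {
        boolean negate = c.condPattern.startsWith("!");
        String regex = negate ? c.condPattern.substring(1) : c.condPattern;
        boolean found = Pattern.compile(regex).matcher(expand(c.testString, vars)).find();
        return negate != found;
    }

    // Assumed combination logic: AND by default, [OR] links a line to the next one.
    static boolean conditionsMet(List<Cond> conds, Map<String, String> vars) {
        for (int i = 0; i < conds.size(); i++) {
            Cond c = conds.get(i);
            boolean ok = matches(c, vars);
            if (c.orNext) {
                if (ok) {
                    // One alternative matched: skip the rest of this [OR] chain.
                    while (i < conds.size() && conds.get(i).orNext) {
                        i++;
                    }
                }
                continue;
            }
            if (!ok) {
                return false;
            }
        }
        return true;
    }
}
```

With the three REMOTE_HOST conditions above, conditionsMet would return true as soon as any one of the host patterns matches, and false when none does.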
Example:
To rewrite the home page of a site according to the "User-Agent:" header of the request, you can use the following:
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
RewriteRule ^/$ /homepage.max.html [L]
RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
RewriteRule ^/$ /homepage.min.html [L]
RewriteRule ^/$ /homepage.std.html [L]
Explanation: If you use a browser which identifies itself as 'Mozilla' (including Netscape Navigator, Mozilla, etc.), then you get the max homepage (which could include frames, or other special features). If you use the Lynx browser (which is terminal-based), then you get the min homepage (which could be a version designed for easy, text-only browsing). If neither of these conditions applies (you use any other browser, or your browser identifies itself as something non-standard), you get the std (standard) homepage.
RewriteMap
Syntax: RewriteMap name rewriteMapClassName optionalParameters
The maps are implemented using an interface that users must implement. Its class name is org.jboss.web.rewrite.RewriteMap, and its code is:
package org.jboss.web.rewrite;
public interface RewriteMap {
public String setParameters(String params);
public String lookup(String key);
}
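As an illustration of what an implementation might look like, here is a minimal sketch of a map backed by a small lookup table. Several things in it are assumptions rather than documented behaviour: the class name TableRewriteMap, the "key=value,key=value" parameter format, and returning null from setParameters (the meaning of its return value is not described in this section). The interface is repeated locally only so that the example compiles on its own; a real map would implement org.jboss.web.rewrite.RewriteMap.

```java
import java.util.HashMap;
import java.util.Map;

// Local copy of the interface shown above, so this sketch is self-contained.
interface RewriteMap {
    String setParameters(String params);
    String lookup(String key);
}

// Hypothetical map: the parameters are comma-separated "key=value" pairs.
final class TableRewriteMap implements RewriteMap {
    private final Map<String, String> table = new HashMap<>();

    @Override
    public String setParameters(String params) {
        table.clear();
        if (params != null) {
            for (String pair : params.split(",")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    table.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
                }
            }
        }
        // Assumption: the return value's meaning is not documented here, so return null.
        return null;
    }

    @Override
    public String lookup(String key) {
        // Returns null when the key has no entry in the table.
        return table.get(key);
    }
}
```

Following the syntax above, such a class would presumably be registered with a line like "RewriteMap hosts com.example.TableRewriteMap old=new,docs=manual", where the map name, package and parameter string are all made up for this example.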
RewriteRule
Syntax: RewriteRule Pattern Substitution
The RewriteRule directive is the real rewriting workhorse. The directive can occur more than once, with each instance defining a single rewrite rule. The order in which these rules are defined is important - this is the order in which they will be applied at run-time.
Pattern is a Perl-compatible regular expression, which is applied to the current URL. "Current" means the value of the URL when this rule is applied. This may not be the originally requested URL, which may already have matched a previous rule, and have been altered.
Some hints on the syntax of regular expressions:
Text:
. Any single character
[chars] Character class: Any character of the class "chars"
[^chars] Character class: Not a character of the class "chars"
text1|text2 Alternative: text1 or text2
Quantifiers:
? 0 or 1 occurrences of the preceding text
* 0 or N occurrences of the preceding text (N > 0)
+ 1 or N occurrences of the preceding text (N > 1)
Grouping:
(text) Grouping of text
(used either to set the borders of an alternative as above, or
to make backreferences, where the Nth group can
be referred to on the RHS of a RewriteRule as $N)
Anchors:
^ Start-of-line anchor
$ End-of-line anchor
Escaping:
\char escape the given char
(for instance, to specify the chars ".[]()" etc.)
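The constructs listed above can be tried out with a few lines of Java. Note the assumptions: java.util.regex is used here as a stand-in (it is close to, but not identical with, Perl regular expressions), the helper names matches and firstGroup are invented for this example, and the URLs are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexHints {
    // True if the regular expression matches somewhere in the URL.
    static boolean matches(String regex, String url) {
        return Pattern.compile(regex).matcher(url).find();
    }

    // Contents of the first group, i.e. what $1 would refer to; null if no match.
    static String firstGroup(String regex, String url) {
        Matcher m = Pattern.compile(regex).matcher(url);
        return (m.find() && m.groupCount() >= 1) ? m.group(1) : null;
    }
}
```

For instance, matches("^/(images|css)/", "/css/site.css") combines an anchor, a grouping and an alternative; matches("^/colou?r$", "/color") uses the ? quantifier; matches("^/docs/.*\\.html$", "/docs/index.html") escapes the dot; and firstGroup("^/item/([0-9]+)$", "/item/42") captures the digits matched by a character class with the + quantifier.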
For more information about regular expressions, have a look at the Perl regular expression manpage ("perldoc perlre"). If you are interested in more detailed information about regular expressions and their variants (POSIX regex etc.) the following book is dedicated to this topic:
Mastering Regular Expressions, 2nd Edition
In the rules, the NOT character ('!') is also available as a possible pattern prefix. This enables you to negate a pattern; to say, for instance: "if the current URL does NOT match this pattern". This can be used for exceptional cases, where it is easier to match the negative pattern, or as a last default rule.
Note: When using the NOT character to negate a pattern, you cannot include grouped wildcard parts in that pattern. This is because, when the pattern does NOT match (i.e., the negation matches), there are no contents for the groups. Thus, if negated patterns are used, you cannot use $N in the substitution string!
The substitution of a rewrite rule is the string which is substituted for (or replaces) the original URL which Pattern matched. In addition to plain text, it can include back-references, server-variables, and mapping-function calls.
Back-references are identifiers of the form $N (N=0..9), which will be replaced by the contents of the Nth group of the matched Pattern. The server-variables are the same as for the TestString of a RewriteCond directive. The mapping-functions come from the RewriteMap directive and are explained there. These three types of variables are expanded in the order above.
As already mentioned, all rewrite rules are applied to the Substitution (in the order in which they are defined in the config file). The URL is completely replaced by the Substitution and the rewriting process continues until all rules have been applied, or it is explicitly terminated by a flag.
There is a special substitution string named '-' which means: NO substitution! This is useful in providing rewriting rules which only match URLs but do not substitute anything for them. It is commonly used in conjunction with the C (chain) flag, in order to apply more than one pattern before substitution occurs.
Additionally you can set special flags for Substitution by appending [flags] as the third argument to the RewriteRule directive. Flags is a comma-separated list of any of the following flags:
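The rule processing described in this section (rules applied in order to the current URL, $N back-references, the '!' prefix and the '-' substitution) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the valve's implementation: Rule and RuleSketch are hypothetical names, conditions, server-variables, maps and flags are left out entirely, and java.util.regex stands in for the regular expression engine.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of one RewriteRule line (flags are not modelled).
final class Rule {
    final String pattern;       // may start with '!' to negate the pattern
    final String substitution;  // "-" means: do not substitute

    Rule(String pattern, String substitution) {
        this.pattern = pattern;
        this.substitution = substitution;
    }
}

final class RuleSketch {
    // Replace each $N (N = 0..9) with the contents of the Nth group of the match.
    static String expandBackReferences(String substitution, Matcher m) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < substitution.length(); i++) {
            char c = substitution.charAt(i);
            boolean isRef = c == '$' && i + 1 < substitution.length()
                    && Character.isDigit(substitution.charAt(i + 1));
            if (!isRef) {
                out.append(c);
                continue;
            }
            int n = substitution.charAt(++i) - '0';
            if (n <= m.groupCount() && m.group(n) != null) {
                out.append(m.group(n));
            }
        }
        return out.toString();
    }

    // Apply every rule, in definition order, to the current URL.
    static String apply(List<Rule> rules, String url) {
        String current = url;
        for (Rule rule : rules) {
            boolean negate = rule.pattern.startsWith("!");
            String regex = negate ? rule.pattern.substring(1) : rule.pattern;
            Matcher m = Pattern.compile(regex).matcher(current);
            boolean found = m.find();
            if (found == negate) {
                continue; // the rule does not apply to the current URL
            }
            if (rule.substitution.equals("-")) {
                continue; // matched, but '-' leaves the URL untouched
            }
            // A negated pattern has no group contents, so $N is not expanded.
            current = negate ? rule.substitution : expandBackReferences(rule.substitution, m);
        }
        return current;
    }
}
```

Because each rule sees the output of the previous one, a request rewritten by an early rule can be rewritten again by a later rule, which is why the order of the rules matters.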