mod_rewrite

mod_rewrite overview

mod_rewrite is a powerful regular-expression-based URL manipulation module for IIS. It allows URL modifications that depend on various conditions, such as HTTP headers and the state of server variables.

Map file support makes it possible to rewrite large numbers of URLs in a flexible and easily manageable way.

URL manipulations are defined in plain text configuration files containing a set of directives. Configurations may occur on different levels:

  • Global (per-server) configuration directives are placed in the httpd.conf file residing in the mod_rewrite folder. There are also several tags that allow you to apply directives to specific locations: <VirtualHost>, <Directory>, <DirectoryMatch>, <Files>, <FilesMatch>, <Location> and <LocationMatch>.
  • Per-site configurations are located in .htaccess files that may be placed in any directory under the web site (including the site root) and that are applied to this directory and all its subdirectories.

All configuration files are reloaded automatically every time they change. File content may also be changed by third-party programs and scripts.

In most cases mod_rewrite is used to rewrite the requested URL. In addition to rewriting, it can modify, create or remove any HTTP header of the client request. The module can rewrite, proxy, redirect or block the original client request to a server.

Rewriting causes the server to continue request processing with a new URL as if it had been originally requested by the client. The new URL can include a query string section (following the question mark) and may point to any plain static files, scripts (like ASP), programs (like EXE), etc. within the same web application (which usually means the same web site). Rewriting is completely transparent to the user and to web site applications because it is done internally on the server, before the web application receives the request.

Proxying causes the resulting URL to be internally treated as a target on another server and immediately (i.e. rules processing stops here) passed to the remote server. The response of the remote server is then passed back to the client. Proxying requires you to specify a fully qualified URL, starting from the protocol, host name, etc.

Redirection results in an immediate response with a redirect instruction (HTTP response code 301 or 302) that sets the substituted URL as the new location. You can use the absolute URL format (as required by RFC 2616) in the redirection instruction to redirect the request to a different host, port and protocol. If this information is omitted, mod_rewrite will automatically prepend the URL with the current request's protocol, server name and directory location.
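
The four actions described above might be combined in one ruleset as in the following sketch. It is written for httpd.conf; the paths and the host names backend.example and www.example.com are hypothetical, and only flags documented in the RewriteRule section are used.

```apache
RewriteEngine on

# Rewriting: handled internally, the client never sees /catalog.asp
RewriteRule ^/catalog/(\d+)$ /catalog.asp?id=$1 [L]

# Proxying: needs a fully qualified URL; rule processing stops here
RewriteRule ^/reports/(.*)$ http://backend.example/reports/$1 [P]

# Redirection: the client receives a 301 response with the new location
RewriteRule ^/old-news/(.*)$ http://www.example.com/news/$1 [R=301]

# Blocking: the client receives an immediate 403 response
RewriteRule ^/internal/ - [F]
```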

Rules are applied in order of appearance in configuration files. Rules from the global (server) configuration file are applied first. Directory-level configuration files are then processed file by file, starting from the parents. The order of rules is important because the substitution result of one rule becomes the source for subsequent rules.
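
As a small illustration of why order matters, consider the following sketch (hypothetical paths, written for httpd.conf). The second rule never sees the original URL, only the result of the first one.

```apache
RewriteEngine on

# Step 1: /shop/item/42 becomes /store/item/42
RewriteRule ^/shop/(.*)$ /store/$1

# Step 2: matches the output of step 1, so /store/item/42
# becomes /item.asp?id=42
RewriteRule ^/store/item/(\d+)$ /item.asp?id=$1 [L]
```

If the two rules were swapped, a request for /shop/item/42 would end up as /store/item/42, because the item rule would run before the URL had been moved under /store/.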

mod_rewrite directives

RewriteBase

Description: Explicitly sets the base URL for per-directory rewrites.
Syntax: RewriteBase URL-path
Default: RewriteBase requested-directory-path
Context: directory, .htaccess
Module: mod_rewrite

When the RewriteRule directive is used in per-directory configuration files (.htaccess), it automatically strips the local directory prefix from the path and applies rules only to the remainder. The RewriteBase directive allows you to explicitly specify a base for the rules, i.e. the part that will be stripped.

For the directory context RewriteBase is empty by default, but for .htaccess configurations it contains the virtual path to the requested directory.
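
A minimal sketch of the effect, assuming a hypothetical .htaccess file placed in the /shop directory of a site. The directory name and the target script are illustrative only.

```apache
# /shop/.htaccess
RewriteEngine on

# With the default base (the /shop directory here), a request for
# /shop/item/42 is matched as "item/42", without the directory prefix
RewriteRule ^item/(\d+)$ /shop/item.asp?id=$1 [L]

# Alternative: set the base explicitly so that only "/" is stripped
# and patterns are written relative to the site root
# RewriteBase /
# RewriteRule ^shop/item/(\d+)$ /shop/item.asp?id=$1 [L]
```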

RewriteCond

Description: Defines a condition for the following RewriteRule
Syntax: RewriteCond TestString CondPattern [flags]
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

RewriteCond directive defines a single condition for the following RewriteRule, RewriteHeader or RewriteProxy directive. There can be one or more conditions preceding a rule directive and the rule will only apply if all of the conditions are met.

In addition to plain text, TestString can contain the following constructs:

  • Back references to the RewriteRule pattern using the syntax $N.
  • Back references to preceding RewriteCond patterns using the syntax %N.
  • RewriteMap expansions using the syntax ${mapname:key|default}.
  • HTTP header values using the syntax %{HTTP:header}.
  • Server variables using the syntax %{NAME_OF_VARIABLE}.
    Here is the list of available server variables:

    HTTP_USER_AGENT
    HTTP_REFERER
    HTTP_COOKIE
    HTTP_FORWARDED
    HTTP_HOST
    HTTP_PROXY_CONNECTION
    HTTP_ACCEPT

    REMOTE_ADDR
    REMOTE_HOST
    REMOTE_PORT
    REMOTE_USER
    REMOTE_IDENT
    REQUEST_METHOD
    SCRIPT_FILENAME
    PATH_INFO
    QUERY_STRING
    AUTH_TYPE

    DOCUMENT_ROOT
    SERVER_NAME
    SERVER_ADDR
    SERVER_PORT
    SERVER_PROTOCOL
    SERVER_SOFTWARE

    API_VERSION
    THE_REQUEST
    REQUEST_URI
    REQUEST_FILENAME
    HTTPS

    TIME_YEAR
    TIME_MON
    TIME_DAY
    TIME_HOUR
    TIME_MIN
    TIME_SEC
    TIME_WDAY
    TIME

    Additionally all special IIS server variables are supported.

CondPattern specifies a regular expression that will be applied to the expanded TestString.

The following special values are also supported:

  • Prefix the regular expression with the '!' symbol to specify a negated pattern.
  • '<CondPattern' Treats CondPattern as a plain string and compares it lexicographically with TestString; true if TestString is less than CondPattern.
  • '>CondPattern' Lexicographic comparison; true if TestString is greater than CondPattern.
  • '=CondPattern' Lexicographic comparison; true if TestString equals CondPattern.
  • '-d' TestString is existing directory.
  • '-f' TestString is existing file.
  • '-s' TestString is a file of nonzero size.

The following values are not supported because they have no meaning in IIS:

  • '-l' Link.
  • '-x' Has executable permissions.
  • '-F' Is existing file, via subrequest.
  • '-U' Is existing URL, via subrequest.

RewriteCond directive may be accompanied by the following flags:

  • 'nocase|NC'

    This flag makes the Pattern case-insensitive.
     
  • 'ornext|OR'

    This flag combines subsequent RewriteCond directives with logical OR instead of implicit AND.

  • 'O'

    Normalizes the string before processing. Normalization includes removal of URL-encoding, illegal characters, etc. Also, IIS normalization of a URL completely removes the query string from it.

Note! RewriteCond directive affects only ONE subsequent RewriteRule, RewriteHeader or RewriteProxy directive.
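
The following sketch shows a few of the constructs above working together. It is an illustration only: the file and host names are hypothetical, and it assumes that TIME_HOUR and TIME_MIN expand to two-digit values, as they do in Apache.

```apache
RewriteEngine on

# Both conditions must hold (implicit AND): the time is after 07:00
# and before 19:00, compared lexicographically as plain strings
RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
RewriteRule ^/foo\.htm$ /foo.day.htm [L]

# No conditions attached: the ones above applied to one rule only
RewriteRule ^/foo\.htm$ /foo.night.htm [L]

# Either condition is enough because of [OR]
RewriteCond %{HTTP_HOST} ^site1\.example$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.site1\.example$ [NC]
RewriteRule ^/(.*)$ /site1/$1 [L]
```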

RewriteEngine

Description: Enables or disables runtime rewriting engine
Syntax: RewriteEngine on|off
Default: RewriteEngine on
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

RewriteEngine enables or disables the rewriting runtime. Use RewriteEngine off instead of commenting out rewrite rules if you need to disable the mod_rewrite module or a specific .htaccess file.

RewriteHeader

Description: Rewrites any HTTP header in request
Syntax: RewriteHeader HeaderName: Pattern Substitution [flags]
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

The RewriteHeader directive is a more general variant of the RewriteRule directive and is designed to rewrite not only the URL part of the client request, but any HTTP header. Technically, the RewriteRule directive is equivalent to RewriteHeader URL: Pattern Substitution [flags]. This directive can be used to rewrite, create or delete any HTTP header in the client request before it is processed by other applications on IIS.

HeaderName: specifies the name of HTTP header that will be rewritten.

Pattern, Substitution and flags are the same as for RewriteRule directive.

Note! RewriteHeader directive has no equivalent in Apache (see Compatibility chart).
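
Following the syntax above, a header rewrite might look like the sketch below. The pattern is hypothetical, and because this section does not describe how absent headers are matched, the example only touches a header that the client is assumed to send.

```apache
RewriteEngine on

# Reduce a regional English variant (en-US, en-GB, ...) at the start
# of Accept-Language to plain "en" before the application sees it
RewriteHeader Accept-Language: ^en-[A-Za-z]+(.*)$ en$1
```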

RewriteLog

Description: Sets the name and path to mod_rewrite log file
Syntax: RewriteLog file-path
Default: RewriteLog installdir\rewrite.log
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

The RewriteLog directive sets the name of the log file where mod_rewrite will log all its actions.

Example:

RewriteLog "C:\local\path\rewrite.log"

RewriteLogLevel

Description: Sets the level of logging into rewrite.log
Syntax: RewriteLogLevel Level
Default: RewriteLogLevel 0
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

The RewriteLogLevel directive sets the verbosity of logging output. The default value of 0 means no logging, while the maximum level of 9 means all actions will be logged.

Higher logging levels may slow down mod_rewrite operation. We recommend disabling logging by setting the log level to 0 once debugging of your ruleset has been completed.
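
A typical debugging setup might look like the following sketch, which reuses the log path from the RewriteLog example. Switch the level back to 0 when debugging is finished.

```apache
RewriteEngine on
RewriteLog "C:\local\path\rewrite.log"

# 9 logs every action and is slow; use it only while debugging
RewriteLogLevel 9
```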

RewriteMap

Description: Defines a key to value mapping function
Syntax: RewriteMap MapName MapType:MapSource
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

The RewriteMap directive is used to define a key-to-value lookup function. This is useful when you need to map a large number of values, since it is much faster than doing it with rule matching. There are four types of mapping:

  • txt: mapping using text file map;
  • rnd: random value select from multiple choices;
  • int: internal functions;
  • dbd: mapping using external database.

MapName is the name of the mapping function that will be used to refer to this map from the RewriteRule directive. Make sure that every mapping is defined with a unique name. You can call the mapping function in the Substitution value of the RewriteRule directive using the following syntax:

${MapName:LookupKey|DefaultValue}

If this construct is found in Substitution, mod_rewrite will look up the key in the map and, if it is found, replace the construct with its value. If the key is not found, the optional DefaultValue will be used. If no DefaultValue is specified, the construct will be replaced by an empty string.

RewriteMap directive may be accompanied by the following flag:

  • 'nocase|NC'

    This flag makes the comparison of mapfile values with corresponding rule submatch case-insensitive.

Here is an example of using maps:

RewriteMap examplemap txt:/path/to/file/map.txt [NC]

Then you may use this map in RewriteRule as follows:

RewriteRule ^/ex/(.*) ${examplemap:$1}

The following combinations for MapType and MapSource can be used:

txt: Plain text mapping. The source is a Windows file system path to a valid text file. Text file should be of the following format:

#This is a comment
key1 value1 #Another comment
key2 value2
keyN valueN
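
As a worked illustration, suppose a hypothetical map file C:\maps\fruit.txt contains the two entries shown in the comments below. The path, the map name and the target URLs are assumptions made for this sketch, which is written for httpd.conf.

```apache
# C:\maps\fruit.txt:
#   apple /fruit.asp?id=1
#   pear  /fruit.asp?id=2

RewriteEngine on
RewriteMap fruitmap txt:C:\maps\fruit.txt [NC]

# /ex/apple  is rewritten to /fruit.asp?id=1
# /ex/banana has no entry, so the default /notfound.asp is used
RewriteRule ^/ex/(.*)$ ${fruitmap:$1|/notfound.asp} [L]
```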

rnd: Random multiple values lookup. Source is the path to a text file of the following format:

#This is a comment
key1 value1|value2|value3
key2 value4|value5|value6|valueN

int: Internal function call. Source should be one of the following pre-defined internal functions:

  • toupper: Converts the key to all upper case.
  • tolower: Converts the key to all lower case.
  • escape: Encodes special characters as hex values.
  • unescape: Decodes hex values back to special characters.
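
An internal function is called like any other map. The sketch below, written for httpd.conf, redirects URLs that contain upper-case letters to their lower-case form; the map name lc is arbitrary.

```apache
RewriteEngine on
RewriteMap lc int:tolower

# Matches only when the URL contains at least one upper-case letter,
# so the redirected (all lower-case) request is not matched again
RewriteRule ^(.*[A-Z].*)$ ${lc:$1} [R=301]
```

Note that the NC flag must not be added to this rule, since a case-insensitive pattern would also match the lower-case result.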

dbd: Database-driven mapping. Below is the example of how to use it to rewrite SEO-friendly URLs to original physical pages.

What you need is a database table with two columns: OriginalURL (with real physical URLs) and SEO_URL (with pretty SEO-friendly links). Once the table has been created, you are ready to use this config:

# Database connection settings
DBDriver mssql
DBDParams "Data Source=server;Initial Catalog=database;User ID=user;Password=password"
# Prepared statement; @KEY stands for the key being looked up
DBDPrepareSQL "select OriginalURL from seo_mapping where SEO_URL = @KEY" seo_map_select

RewriteEngine On
RewriteMap map_dbd dbd:seo_map_select
# Look up the requested URL; NOT_FOUND is used when no row matches
RewriteCond ${map_dbd:$1|NOT_FOUND} (.*)
RewriteCond %1 !NOT_FOUND
RewriteRule (.+) %1 [L]

RewriteOptions

Description: Specifies special options
Syntax: RewriteOptions Options
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

RewriteOptions directive can set special options for mod_rewrite. Currently only one option is available:

inherit: forces the current config to inherit all options and rules from the parent. This means that all rules from the parent config will be executed again, but in the current context, as if they had been written in the current config.
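
For example, a hypothetical .htaccess file in a subdirectory could pull in its parent's rules as in this sketch (the directory and script names are illustrative):

```apache
# /blog/.htaccess
RewriteEngine on

# Re-apply the parent configuration's rules in this directory's context
RewriteOptions inherit

RewriteRule ^archive/(\d+)$ /blog/archive.asp?year=$1 [L]
```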

RewriteProxy

Description: Proxies a request to a remote server
Syntax: RewriteProxy Pattern Substitution [flags]
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

RewriteProxy causes the resulting URL to be internally treated as a target on another server and immediately (i.e. rules processing stops here) passed to the remote server. The response of the remote server is then passed back to the client. Proxying requires you to specify a fully qualified URL, starting from the protocol, host name, etc.

Syntax and operation are the same as for RewriteRule directive, but RewriteProxy supports some additional flags:

  • A (Add authentication headers)

    Allows authentication information to be passed from the proxy to an internal server when client authentication against the proxy server is used. The proxy module will append the headers

    X-ISRW-Proxy-AUTH-TYPE,
    X-ISRW-Proxy-AUTH-USER,
    X-ISRW-Proxy-LOGON-USER,
    X-ISRW-Proxy-REMOTE-USER

    corresponding to server variables

    AUTH_TYPE,
    AUTH_USER,
    LOGON_USER,
    REMOTE_USER

    to a request sent to a proxied server.

  • C (use Credentials)

    The proxy module will try to log in to the remote server with the credentials specified in the URL or in basic authentication headers. With this flag you can use the http://user:password@host.com/path/ syntax as a URL within the substitution string.

Note! RewriteProxy directive has no equivalent in Apache (see Compatibility chart).
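
A sketch of both flags in use, written for httpd.conf with hypothetical paths and the hypothetical host name intranet.example:

```apache
RewriteEngine on

# Forward /app/... to an internal server and pass along the client's
# authentication details as X-ISRW-Proxy-* headers
RewriteProxy ^/app/(.*)$ http://intranet.example/app/$1 [NC,A]

# Log in to the remote server with the credentials given in the URL
RewriteProxy ^/feed/(.*)$ http://user:password@host.com/path/$1 [C]
```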

RewriteRule

Description: Defines URL rewriting rule
Syntax: RewriteRule Pattern Substitution [flags]
Context: server config, virtual host, directory, .htaccess
Module: mod_rewrite

The RewriteRule directive defines a single URL rewriting operation. It may occur more than once in a configuration file, with each instance defining a URL rewriting rule. The RewriteRule directive consists of a URL-matching Pattern, a Substitution string and an optional set of flags.

Pattern is a Perl-compatible regular expression which will be matched against the current URL. The current URL can be the originally requested URL or a URL already altered by preceding rules. The URL never includes the protocol or host name and starts from the first slash character. The current URL also differs depending on the level of configuration that is applied, i.e. for directory-level configuration the current directory name is omitted from the URL to match. Please see the Regular expression syntax section of the documentation for more information on building regular expressions.

Prefixing the pattern with a '!' character negates the entire expression. A negated pattern cannot generate submatches, so you cannot use $N references in the substitution.
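
For instance, a negated pattern combined with the '-' (no substitution) string can block everything outside one folder. The folder name is hypothetical and the rule is written for httpd.conf.

```apache
RewriteEngine on

# Any URL that does NOT start with /public/ gets a 403 response;
# no $N references are available because the pattern is negated
RewriteRule !^/public/ - [F]
```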

The Substitution string specifies the format string used to generate the new URL if the Pattern was matched. In addition to plain text it can include:

  • back references to the RewriteRule pattern ($N);
  • back references to RewriteCond patterns (%N);
  • server variables (%{VARNAME});
  • conditional format patterns (?Ntrue_string:false_string);
  • grouping parentheses '(' and ')'.

The following escape sequences are also allowed:

\a The bell character.
\f The form feed character.
\n The newline character.
\r The carriage return character.
\t The tab character.
\v A vertical tab character.
\x A hexadecimal character - for example \x0D.
\x{} A possible Unicode hexadecimal character - for example \x{1A0}
\cx The ASCII escape character x, for example \c@ is equivalent to escape-@.
\e The ASCII escape character.
\dd An octal character constant, for example \10.
\\ Single back slash character '\'

RewriteRules are applied in the order of appearance in the configuration file, starting from the parent configuration files. Each rule applies only if its Pattern matches the URL and all attached conditions (RewriteCond) are also met. The URL is then completely replaced by the Substitution, and the rewriting process continues until the end of the configuration file or until a rule with a flag that terminates rule processing is reached. The special string '-' (dash) in Substitution means no substitution and is useful when you need to apply the rule while leaving the original URL untouched.

Here is the list of flags that can affect rule behavior. Flags that are not supported are marked as such in their descriptions and will be ignored.

  • caselower|CL

    Changes the case of substitution result to lower.
     
  • caseupper|CU

    Changes the case of substitution result to upper.
     
  • chain|C

    Chains the current rule with the next rule. The next rule will be executed only if the current rule is matched. Several rules can be chained in sequence.
     
  • cookie|CO=NAME:VAL:domain[:lifetime[:path]]

    Sets a cookie header with the specified fields and sends it to the client with the response to the current request.
     
  • env|E=VAR:VAL

    Sets an environment variable.
     
  • forbidden|F

    Sends immediate 403 FORBIDDEN response to the client. Stops rule processing and all other subsequent processing on this request.
     
  • gone|G

    Sends immediate 410 GONE response to the client. Stops rule processing and all other subsequent processing on this request.
     
  • handler|H=Content-handler

    Unsupported. Explicitly specifies handler for a request. In IIS world this can be achieved by rewriting file extension of requested file but there is no direct translation from Apache handlers to IIS file extensions.
     
  • last|L

    Stops rewriting process here and doesn’t apply any more rules from the current configuration file. Descendant .htaccess files will still be applied if any.
     
  • loop|LP

    Re-runs the current single rule in a loop while its pattern and conditions are matched. The number of iterations is limited to 200 to avoid infinite loops.
     
  • next|N

    Re-runs entire rewriting process starting from the beginning of current configuration file. Number of iterations is limited to a value of 200 to avoid infinite loops.
     
  • nocase|NC

    This flag makes the Pattern case-insensitive.
     
  • noescape|NE

    This flag prevents mod_rewrite from applying the usual URI escaping rules to the rewriting result. With this flag set special characters (e.g. '%', '$', ';', etc) will NOT be escaped into their hexcode equivalents ('%25', '%24', '%3B' respectively).
     
  • nosubreq|NS

    Forces the rewriting engine to skip a rewriting rule if the current request is an internal sub-request. Supported only in IIS7.
     
  • normalize|O

    Normalization includes removal of URL-encoding, illegal characters, etc. Normalization of a URL also completely removes the query string from it. When the O flag is not set, normalization takes place. If you set O at the end of the rule, normalization will not occur.
     
  • proxy|P

    Forces the resulting URL to be internally treated as a target on another server and immediately (i.e. rules processing stops here) passed to the remote server. The response of the remote server is then passed back to the client. Proxying requires you to specify a fully qualified URL, starting from the protocol, host name, etc. ISAPI_Rewrite uses an ISAPI extension to handle proxy requests. You can read more about this in the configuring proxy chapter.
     
  • passthrough|PT

    Unsupported or always on. The result is always passed through to the next handler in IIS.
     
  • qsappend|QSA

    Appends current query string data to a substitution string instead of replacing it by a substitution. Use this when you need to add more query string parameters while preserving original parameters.
     
  • redirect|R [=code]

    Forces the server to send an immediate response with a redirect instruction, providing Substitution as the new location. It can optionally prefix Substitution with http://thishost[:thisport]/, bringing the URL to a valid absolute form. If no code is given, an HTTP response of 302 (MOVED TEMPORARILY) will be used. You can optionally specify any code from the 3xx range.
     
  • skip|S=num

    Forces the rewriting engine to skip the next num rules in sequence, if the current rule matches.
     
  • statuscode|SC=200..500

    Forces server to send immediate response with custom status code page.
     
  • type|T=MIME-type
     
    Forces the MIME-type of the target file to be MIME-type. This can be used to set up the content-type based on some conditions.
     
  • unmanglelog|U
     
    Unmangle log. Log the URL as it was originally requested and not as the URL was rewritten.
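
The sketch below combines several of the flags listed above. All paths are hypothetical and the rules are written for httpd.conf.

```apache
RewriteEngine on

# C: the second rule is evaluated only if the first one matched
RewriteRule ^/docs/(.*)$ /documentation/$1 [NC,C]
RewriteRule ^(.*)\.html$ $1.asp [NC,L]

# R=301 with QSA: redirect permanently and keep the original
# query string in addition to the new parameter
RewriteRule ^/old\.asp$ /new.asp?moved=1 [NC,R=301,QSA]

# G: answer 410 GONE without changing the URL
RewriteRule ^/retired/ - [G]

# S=1: for image requests, skip the one rule that follows
RewriteRule \.(?:gif|jpg|png)$ - [NC,S=1]
RewriteRule ^/(.*)$ /index.asp?page=$1 [L,QSA]
```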

Examples

Note! All rules in these examples are intended for the httpd.conf file. In mod_rewrite the base path for rules differs depending on the directory where you put the .htaccess file. The initial leading slash only exists if you put rules in httpd.conf, while in .htaccess files the virtual path to these files is stripped. Rules that rely on a root path are preceded by the RewriteBase / directive to allow them to work in any location, both in httpd.conf and in directory-level .htaccess files.

Simple search engine friendly URLs

This example demonstrates how to easily hide query string parameters using the loop flag. Suppose you have a URL like http://www.myhost.com/foo.asp?a=A&b=B&c=C and you want to access it as http://www.myhost.com/foo.asp/a/A/b/B/c/C

The following rule performs this transformation:

RewriteEngine on
RewriteRule ^(.*?\.asp)/([^/]*)/([^/]*)(/.+)? $1$4?$2=$3 [NC,LP,QSA]
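
To see what the loop flag does here, the rule can be traced by hand for the request /foo.asp/a/A/b/B/c/C. The trace below is a sketch: it assumes that the pattern is matched against the path without the query string and that QSA appends the existing query string after the newly generated parameter, so the final parameter order may differ in practice.

```apache
RewriteEngine on
RewriteRule ^(.*?\.asp)/([^/]*)/([^/]*)(/.+)? $1$4?$2=$3 [NC,LP,QSA]

# Pass 1: /foo.asp/a/A/b/B/c/C  ->  /foo.asp/b/B/c/C?a=A
# Pass 2: /foo.asp/b/B/c/C      ->  /foo.asp/c/C?b=B&a=A
# Pass 3: /foo.asp/c/C          ->  /foo.asp?c=C&b=B&a=A
# Pass 4: /foo.asp no longer matches the pattern, so the loop ends
```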

Note! This rule may break page-relative links to CSS files, images, etc. This happens because of a change in the base path (the parent folder of the page) that the browser uses to calculate the complete resource URI. The problem occurs only if you use the directory separator as the replacement character. There are three possible solutions:

  1. Use the tilde-separated variant of the rule given below. It does not affect the base path.
  2. Directly specify the correct base path for a page with the help of the <base href="/folder/page.asp"> tag.
  3. Change all page-relative links to either root-relative or absolute form.

There are also many variations of this rule with different separator characters and file extensions. For example, to use URLs like http://www.myhost.com/foo.asp~a~A~b~B~c~C the following rule can be used:

RewriteEngine on
RewriteRule ^(.*?\.asp)~([^~]*)~([^~]*)(.*) $1$4?$2=$3 [NC,LP,QSA]

Keyword rich URLs

In the previous example we used a general technique to simply hide query string markers. A much more useful solution for search engine optimization is to make your URLs keyword rich. Consider the following URL: http://www.mysite.com/productpage.asp?productID=127 This is a common situation for most web sites. But you can significantly increase the rating of your page in search engines by using the following URL format instead: http://www.mysite.com/products/our_super_tool.asp The keywords “our super tool” in this URL will be indexed and improve page rank. But “our_super_tool” cannot be used to retrieve productID=127 directly. Several solutions to this problem exist.

The first solution, which we recommend if you have a short URL format with only a few parameters, is to include both keywords and numeric identifiers in the URL. In this case your URL may look like this: http://www.mysite.com/products/our_super_tool_127.asp Only one rule is needed to achieve this rewrite:

RewriteEngine on
RewriteBase /
RewriteRule ^products/[^?/]*_(\d+)\.asp /productpage.asp?productID=$1

Another, more complex but effective, solution is to create a one-to-one map file and use it to map “our_super_tool” to 127. This solution is useful for long URLs with many parameters and allows you to hide even the numeric identifier. The URL will look like http://www.mysite.com/products/our_super_tool.asp. Please note that the “our_super_tool” part must uniquely identify the product and its identifier. Here is an example of this solution:

RewriteEngine on
RewriteBase /
RewriteMap mapfile txt:mapfile.txt
RewriteRule ^products/([^?/]+)\.asp /productpage.asp?productID=${mapfile:$1}

And you will need to create mapfile.txt map file with the following content:

one_product 1
another_product 2
our_super_tool 127
more_products 335

The advantage of this method is that you can use it to combine quite complex URL transformations.

Use IIS as reverse proxy

Assume you have an internet-facing server running IIS and several backend servers or applications running on another platform or machine. These servers are not directly accessible from the internet, but you need to provide access to them for others. Here is an example of how to map the entire content of one web site into a folder on another site:

RewriteEngine on
RewriteBase /
RewriteRule mappoint(.+) http://sitedomain$1 [NC,P]

Emulating host-header-based virtual sites

Suppose you have registered two domains, www.site1.com and www.site2.com. You can now create two different sites using a single physical site. Here is an example of the rules:

RewriteEngine on

#Fix missing trailing slash char on folders
RewriteCond %{HTTP:Host} (.*)
RewriteRule ([^.?]+[^.?/]) http\://%1$1/ [R]

#Emulate site1
RewriteCond %{HTTP:Host} (?:www\.)?site1\.com
RewriteRule (.*) /site1$1 [NC,L]

#Emulate site2
RewriteCond %{HTTP:Host} (?:www\.)?site2\.com
RewriteRule (.*) /site2$1 [NC,L]

Now just place your sites in the /site1 and /site2 directories. Note that www.site1.com and www.site2.com must be mapped in IIS to this web site to allow mod_rewrite to intercept the request.

Or you can use more generic rules to map any request to the folder with the same name as the host name in request:

RewriteEngine on

#Fix missing trailing slash char on folders
RewriteCond %{HTTP:Host} (.*)
RewriteRule ([^.?]+[^.?/]) http\://%1$1/ [R]

#Map requests to the folders
RewriteCond %{HTTP:Host} (www\.)?(.+)
RewriteRule (.*) /%2$1 

Directory names for sites should be like /somesite1.com, /somesite2.info, etc.

Blocking inline-images (stop hot linking)

Assume you have some pages with inline GIF graphics under http://www.mysite.com/. Some other sites incorporate these graphics into their pages via hyperlinks. This adds useless traffic to your site and you want to stop this practice.

mod_rewrite cannot protect the images from inclusion completely (only the HotlinkBlocker product can do this), but you can at least restrict the cases in which the browser sends an HTTP Referer header. The following rules allow access to the images only if the referer is from the same host or is empty.

RewriteEngine on
RewriteCond %{HTTP:Host}#%{HTTP:Referer} ^([^#]+)#(?!http://\1.*).+
RewriteRule .*\.(?:gif|jpg|png) /block.gif [NC]
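
A possible variation, sketched under the assumption that referring pages may also be served over HTTPS, accepts both schemes and answers with a 403 response instead of a placeholder image:

```apache
RewriteEngine on
RewriteCond %{HTTP:Host}#%{HTTP:Referer} ^([^#]+)#(?!https?://\1.*).+
RewriteRule .*\.(?:gif|jpg|png) - [NC,F]
```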

Moving site location

This is a very common problem when you move a web site from one domain name to another, or just to another folder. You want to redirect all requests from one web site to the other, preserving the requested resource name and parameters. This is especially useful when you want to preserve the page ranks of existing pages and external links. The following configuration should be used on the old web server:

RewriteEngine on
#Permanent redirect to update old links
RewriteRule (.+) http://newserver.com$1 [R=301]
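
For the other case mentioned above, a move to another folder on the same site, a sketch with hypothetical folder names could look like this:

```apache
RewriteEngine on

# /oldfolder/page.asp -> 301 redirect to /newfolder/page.asp
RewriteRule ^/oldfolder/(.*)$ /newfolder/$1 [NC,R=301]
```

Because the redirected URL no longer starts with /oldfolder/, the follow-up request is not matched by the rule again.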

Browser-dependent content

It is sometimes necessary to provide browser-dependent content, at least for important top-level pages, i.e. one has to provide a full-featured version for Internet Explorer, a minimum-featured version for the Lynx browser and an average-featured version for all others.

RewriteEngine on

RewriteCond %{HTTP:User-Agent} MSIE
RewriteRule /foo\.htm /foo.IE.htm [L]

RewriteCond %{HTTP:User-Agent} (?:Lynx|Mozilla/[12])
RewriteRule /foo\.htm /foo.20.htm [L]

RewriteRule /foo\.htm /foo.32.htm [L]

Block annoying robots

Here is a useful example that blocks a number of known robots and site extractors by their user agents. Please note that this rule is long, so we have split it across lines. For it to work correctly, no spaces may be added to the end or beginning of the lines:

RewriteEngine on

#Block spambots
RewriteCond %{HTTP:User-Agent} (?:Alexibot|Art-Online|asterias|BackDoorbot|Black.Hole|\
BlackWidow|BlowFish|botALot|BuiltbotTough|Bullseye|BunnySlippers|Cegbfeieh|Cheesebot|\
CherryPicker|ChinaClaw|CopyRightCheck|cosmos|Crescent|Custo|DISCo|DittoSpyder|DownloadsDemon|\
eCatch|EirGrabber|EmailCollector|EmailSiphon|EmailWolf|EroCrawler|ExpresssWebPictures|ExtractorPro|\
EyeNetIE|FlashGet|Foobot|FrontPage|GetRight|GetWeb!|Go-Ahead-Got-It|Go!Zilla|GrabNet|Grafula|\
Harvest|hloader|HMView|httplib|HTTrack|humanlinks|ImagesStripper|ImagesSucker|IndysLibrary|\
InfonaviRobot|InterGET|Internet\sNinja|Jennybot|JetCar|JOC\sWeb\sSpider|Kenjin.Spider|Keyword.Density|\
larbin|LeechFTP|Lexibot|libWeb/clsHTTP|LinkextractorPro|LinkScan/8.1a.Unix|LinkWalker|lwp-trivial|\
Mass\sDownloader|Mata.Hari|Microsoft.URL|MIDown\stool|MIIxpc|Mister.PiX|Mister\sPiX|moget|\
Mozilla/3.Mozilla/2.01|Mozilla.*NEWT|Navroad|NearSite|NetAnts|NetMechanic|NetSpider|Net\sVampire|\
NetZIP|NICErsPRO|NPbot|Octopus|Offline.Explorer|Offline\sExplorer|Offline\sNavigator|Openfind|\
Pagerabber|Papa\sFoto|pavuk|pcBrowser|Program\sShareware\s1|ProPowerbot/2.14|ProWebWalker|ProWebWalker|\
psbot/0.1|QueryN.Metasearch|ReGet|RepoMonkey|RMA|SiteSnagger|SlySearch|SmartDownload|Spankbot|spanner|\
Superbot|SuperHTTP|Surfbot|suzuran|Szukacz/1.4|tAkeOut|Teleport|Teleport\sPro|Telesoft|The.Intraformant|\
TheNomad|TightTwatbot|Titan|toCrawl/UrlDispatcher|toCrawl/UrlDispatcher|True_Robot|turingos|\
Turnitinbot/1.5|URLy.Warning|VCI|VoidEYE|WebAuto|WebBandit|WebCopier|WebEMailExtrac.*|WebEnhancer|\
WebFetch|WebGo\sIS|Web.Image.Collector|Web\sImage\sCollector|WebLeacher|WebmasterWorldForumbot|\
WebReaper|WebSauger|Website\seXtractor|Website.Quester|Website\sQuester|Webster.Pro|WebStripper|\
Web\sSucker|WebWhacker|WebZip|Wget|Widow|[Ww]eb[Bb]andit|WWW-Collector-E|WWWOFFLE|\
Xaldon\sWebSpider|Xenu's|Zeus) [NC]
RewriteRule .* - [F]

Dynamically generated robots.txt

robots.txt is a file that search engines use to discover URLs that should or should not be indexed. But creating this file for large sites with lots of dynamic content can be a very complex task. Have you ever thought about a robots.txt dynamically generated from a script? Let's write a robots.asp script:

<%@ Language=JScript EnableSessionState=False%>
<%

//The script must return plain text
Response.ContentType="text/plain";

/*

Place generation code here

*/

%>

Now make it available as robots.txt using a single rule:

RewriteEngine on
RewriteRule /robots\.txt /robots.asp [NC]

Emulating load balancing

This example emulates a kind of DNS round-robin load balancing technique. Suppose you have a main site www.mysite.com and a number of web servers which you have registered as www[1-9].mysite.com. If you install mod_rewrite on the main server, you can spread traffic randomly between all servers by redirecting the initial client request to a specific server. Once redirected, the client will continue using this specific server. While this solution is not ideal, it can really spread your traffic and helps to avoid problems with preserving session state.

Use the following rule to redirect clients:

RewriteEngine on

RewriteMap hosts rnd:hosts.txt

RewriteCond %{HTTP:Host} (www)\.mysite\.com [NC]
RewriteRule (.*) http://${hosts:%1}.mysite.com$1 [R]

And here is hosts.txt file content:

www www1|www2|www3|www4|www5|www6|www7|www8|www9