mod_proxy

mod_proxy overview

mod_proxy provides a set of directives for flexible configuration of a forward or reverse proxy on your server.

Forward and reverse proxy

The Helicon Ape mod_proxy module can be configured in both forward and reverse proxy modes.

A forward proxy is an intermediate server that sits between the client and the destination server. To get content from the destination server, the client sends a request to the proxy, naming the destination server as the target; the proxy then requests the content from the destination server and returns it to the client. The client must be configured to use the forward proxy in order to access other sites through it.

A forward proxy is typically used to provide Internet access to internal clients that are otherwise restricted by a firewall.

Forward proxying is enabled with the ProxyRequests directive. Because a forward proxy allows clients to access arbitrary sites through your server and to hide their true origin, you must secure your server before enabling it, so that only authorized users can access the proxy.

A reverse proxy, by contrast, looks to the client like an ordinary web server. No special configuration on the client is needed. The client makes conventional requests for content in the namespace of the reverse proxy. The reverse proxy then decides where to send those requests and returns the requested content as if it were itself the origin.

A reverse proxy is typically used to give Internet users access to a server that is behind a firewall. A reverse proxy may also act as a load balancer that distributes load among several back-end servers, or provide caching for a slower back-end server. In addition, reverse proxies can be used simply to bring several servers into the same URL space.

A reverse proxy is enabled with the ProxyPass directive or with the [P] flag of the RewriteRule directive. There is no need to enable ProxyRequests to configure a reverse proxy.

Here are very basic examples of forward and reverse proxy configurations:

Forward Proxy

ProxyRequests On
ProxyVia On
<Proxy *>
	Order deny,allow
	Deny from all
	Allow from internal.example.com
</Proxy>

Reverse Proxy

ProxyRequests Off
<Proxy *>
	Order deny,allow
	Allow from all
</Proxy>
ProxyPass /foo http://foo.example.com/bar
ProxyPassReverse /foo http://foo.example.com/bar

mod_proxy as balancer with PHP sticky sessions

A load-balanced proxy is nothing new for mod_proxy, but by default it does not work with PHP sessions and many other applications that keep session state on a single backend. Below is a rather simple solution for that issue.

Say you have two backend servers: www1.example.com and www2.example.com. Add the following to the vhost configuration of the first backend (www1):

RewriteEngine On
RewriteRule .* - [CO=BALANCEID:balancer.www1:.example.com]

Then do the same for www2, changing the cookie value so that it ends in www2 instead of www1. Now tell your frontend proxy to look for this cookie, and which server each "route" refers to:

ProxyPass / balancer://cluster/ lbmethod=byrequests stickysession=BALANCEID
ProxyPassReverse / balancer://cluster/
<Proxy balancer://cluster>
	BalancerMember http://www1.example.com route=www1
	BalancerMember http://www2.example.com route=www2
</Proxy>

Each new incoming request is directed to a backend server according to your load-balancing method, and any subsequent requests from that user (assuming cookies are enabled) go back to the same backend server. When the user closes the browser and the cookie expires, the "binding" is reset and the user is assigned a server afresh on the next connection.
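For reference, the counterpart rule for the second backend would presumably look like the sketch below. It simply repeats the www1 rule with the route suffix changed; the value balancer.www2 is inferred from the route=www2 setting on the frontend rather than taken from a tested configuration.

```apache
# Vhost configuration of www2.example.com
RewriteEngine On
# Set the BALANCEID cookie; the part after the dot (www2) must match route=www2 on the frontend
RewriteRule .* - [CO=BALANCEID:balancer.www2:.example.com]
```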

mod_proxy directives

NoProxy

Description: Hosts, domains, or networks that will be connected to directly
Syntax: NoProxy host [host] ...
Context: server config, virtual host
Module: mod_proxy

The NoProxy directive is only applicable to proxy servers within intranets. It specifies a list of subnets, IP addresses, hosts and/or domains, separated by spaces. A request to a host that matches an item in the list is served directly, without being forwarded to the configured ProxyRemote proxy server(s).

Example:

ProxyRemote * http://firewall.mycompany.com:81
NoProxy .mycompany.com 192.168.112.0/21

Each host argument of NoProxy can be one of the following types:

  • Domain
    Domain is a partially qualified DNS domain name preceded by a period. It represents a list of hosts which logically belong to the same DNS domain or zone (i.e., the hostnames all end with Domain).

Example:

.com .domain.org.

To distinguish Domains from Hostnames, Domains are always written with a leading period.

Note! Domain name comparisons are case-insensitive, and Domains are always assumed to be anchored at the root of the DNS tree; therefore .MyDomain.com and .mydomain.com. (note the trailing period) are considered equal. As domain comparison does not involve a DNS lookup, it is much more efficient than SubNet comparison.

  • SubNet
    SubNet is a partially qualified internet address in numeric (dotted quad) form, optionally followed by a slash and the netmask, specified as the number of significant bits in the SubNet. It is used to represent a subnet of hosts which can be reached over a common network interface. In the absence of an explicit netmask, it is assumed that omitted (or zero-valued) trailing digits specify the mask. (In this case, the netmask can only be a multiple of 8 bits wide.)

Example:

192.168 or 192.168.0.0

the subnet 192.168.0.0 with an implied netmask of 16 valid bits (sometimes written in the netmask form 255.255.0.0)

192.168.112.0/21

the subnet 192.168.112.0/21 with a netmask of 21 valid bits (sometimes written in the form 255.255.248.0)

As a degenerate case, a SubNet with 32 valid bits is equivalent to an IPAddr, while a SubNet with zero valid bits (e.g., 0.0.0.0/0) is the same as the constant _Default_, matching any IP address.

  • IPAddr
    IPAddr represents a fully qualified internet address in numeric (dotted quad) form. Usually, this address represents a host, but there need not necessarily be a DNS domain name connected with the address.

Example:

192.168.123.7

Note! An IPAddr does not need to be resolved by the DNS system, so using it can result in more efficient server performance.

  • Hostname
    A Hostname is a fully qualified DNS domain name which can be resolved to one or more IPAddrs via the domain name service (DNS). It represents a logical host (in contrast to Domains, see above) and must be resolvable to at least one IPAddr (or often to a list of hosts with different IPAddrs).

Example:

prep.ai.mit.edu
www.apache.org

Note! In many situations, it is more efficient to specify an IPAddr in place of a Hostname, since a DNS lookup can be avoided.

Hostname comparisons are case-insensitive, and Hostnames are always assumed to be anchored at the root of the DNS tree; therefore the two hosts WWW.MyDomain.com and www.mydomain.com. (note the trailing period) are considered equal.
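The four argument types can be mixed in a single directive. The following sketch extends the earlier NoProxy example with one argument of each kind; www.mycompany.com is a made-up hostname used only for illustration.

```apache
ProxyRemote * http://firewall.mycompany.com:81
# Domain            SubNet           IPAddr        Hostname
NoProxy .mycompany.com 192.168.112.0/21 192.168.123.7 www.mycompany.com
```

Requests to any host matched by one of these arguments would be served directly; all other requests would go through the remote proxy.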

ProxyBlock

Description: Words, hosts, or domains that are banned from being proxied
Syntax: ProxyBlock *|word|host|domain [word|host|domain] ...
Context: server config, virtual host
Module: mod_proxy

The ProxyBlock directive specifies a list of words, hosts and/or domains, separated by spaces. HTTP, HTTPS, and FTP requests to sites whose names contain any of the specified words, hosts or domains will be blocked by the proxy server. The proxy module will also attempt to determine the IP addresses of the list items and cache them for matching as well. This may slow down server startup.

Example:

ProxyBlock somesite.com some-host.ru sub.domain.gov

sub.domain.gov will also be matched if requested by IP address.

Note that the word domain alone would also be sufficient to match sub.domain.gov, since it matches any site whose name contains that word.

Note also that

ProxyBlock * 

blocks connections to all sites.

ProxyIOBufferSize

Description: Determine size of internal data throughput buffer
Syntax: ProxyIOBufferSize bytes
Default: ProxyIOBufferSize 8192
Context: server config, virtual host
Module: mod_proxy

The ProxyIOBufferSize directive allows you to set the size of the internal buffer that temporarily holds data between input and output. The size must be less than or equal to 8192.

Note! There are hardly any cases when you need to change that value.

ProxyMaxForwards

Description: Maximum number of proxies that a request can be forwarded through
Syntax: ProxyMaxForwards number
Default: ProxyMaxForwards 10
Context: server config, virtual host
Module: mod_proxy

The ProxyMaxForwards directive specifies the maximum number of proxies through which a request may pass if the request contains no Max-Forwards header. This helps to avoid infinite proxy loops.

Example:

ProxyMaxForwards 15 

ProxyPass

Description: Maps remote servers into the local server URL-space
Syntax: ProxyPass [path] !|url [key=value [key=value ...]]
Context: server config, virtual host, directory
Module: mod_proxy

The ProxyPass directive maps remote servers into the URL space of the local server; the local server does not act as a proxy in the conventional sense, but appears to be a mirror of the remote server. path is the name of a local virtual path; url is a partial URL for the remote server and cannot include a query string.

Note! If you put the ProxyPass directive into the httpd.conf file, you must specify the path parameter explicitly. When the directive is used inside a <Location> section or in .htaccess, however, this parameter must be omitted: mod_proxy automatically uses the path given in the <Location> section, or the path to the .htaccess file, as the ProxyPass path.

Example:

ProxyPass /app/ http://backend.domain.com/

is equivalent to

<Location /app/>
ProxyPass http://backend.domain.com/
</Location>

The ProxyRequests directive should usually be set to Off when using ProxyPass.

Suppose the local server has the address http://domain.com/; then

ProxyPass /mirror/foo/ http://backend.domain.com/

will cause a local request to http://domain.com/mirror/foo/bar to be internally converted into a proxy request to http://backend.domain.com/bar.

The ! argument is used when you don't want to reverse proxy a particular subdirectory.

Example:

ProxyPass /mirror/foo/i !
ProxyPass /mirror/foo http://backend.domain.com

will proxy all requests to /mirror/foo to backend.domain.com except requests to /mirror/foo/i.

Note! Order is important: you need to put the exclusions before the general ProxyPass directive.

It is also possible to use pooled connections to the backend server. The key=value parameters are used to tune connection pooling.

Parameter | Default | Description
timeout | - | Connection timeout in seconds. If not set, the proxy will wait until a free connection is available. This parameter is used to limit the number of connections to the backend server.
loadfactor | 1 | Worker load factor. Used with BalancerMember. It is a value between 1 and 100 that defines the normalized weighted load applied to the server (balancer member).
route | - | Route of the server (balancer member) when used inside a load balancer. The route is a value appended to the session id.
redirect | - | Redirection route of the server (balancer member). This value is usually set dynamically to enable safe removal of the node from the cluster. If set, all requests without a session id will be redirected to the balancer member whose route parameter equals this value.

The following parameters are used when the proxy acts as a load balancer:

Parameter | Default | Description
lbmethod | byrequests | Balancer load-balancing method. Possible values: byrequests, to perform weighted request counting, or bytraffic, to perform weighted traffic byte count balancing.
stickysession | - | Balancer sticky session name. Common values are JSESSIONID or PHPSESSIONID; they depend on the backend application server that supports sessions.
nofailover | off | If set to on, the session will break if the balancer member is in an error state or disabled. Set this value to on if the backend servers do not support session replication.
timeout | 0 | Balancer timeout in seconds. If set, this will be the maximum time to wait for a free balancer member.
maxattempts | 1 | Maximum number of failover attempts before giving up.


Example:

ProxyPass /folder/proxy/balancer/fake/ balancer://cluster1/
<Proxy balancer://cluster1>
	BalancerMember http://localhost:80/folder/proxy/balancer/real/ loadfactor=1
	BalancerMember http://localhost:81/folder/proxy/balancer/real/ loadfactor=1
</Proxy>

ProxyPass /ape/proxy/balancer/faketraff/ balancer://cluster2/ lbmethod=bytraffic

<Proxy balancer://clustersession>
	BalancerMember http://localhost:80/ape/proxy/balancer/real/ loadfactor=100 route=p80
	BalancerMember http://localhost:81/ape/proxy/balancer/real/ loadfactor=1 route=p81
	BalancerMember http://localhost:82/ape/proxy/balancer/real/ loadfactor=1 route=p82
	BalancerMember http://localhost:83/ape/proxy/balancer/real/ loadfactor=1 route=p83 redirect=p82 status=+d
</Proxy>
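
To show several of the balancer parameters listed above working together, here is a hedged sketch of a sticky-session balancer. The /app/ path, the cluster name appcluster, and the app1/app2 hosts are made up for illustration; the parameter names and values come from the parameter lists above, but this particular combination has not been tested against Helicon Ape.

```apache
# Balance by request count, keep each session on one member, and give up after two failover attempts
ProxyPass /app/ balancer://appcluster/ lbmethod=byrequests stickysession=JSESSIONID nofailover=on timeout=10 maxattempts=2
ProxyPassReverse /app/ balancer://appcluster/
<Proxy balancer://appcluster>
	# app1 is weighted to receive roughly twice as many requests as app2
	BalancerMember http://app1.example.com:8080/ loadfactor=2 route=app1
	BalancerMember http://app2.example.com:8080/ loadfactor=1 route=app2
</Proxy>
```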

When used inside a <Location> section, the first argument is omitted and the local directory is obtained from the <Location>.

If you need more flexible reverse proxy configuration, consider using RewriteRule directive with [P] flag.
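
As a rough illustration of that alternative, the sketch below proxies only requests whose path matches a pattern. The pattern, the /app/ prefix, and the host backend.domain.com are illustrative assumptions; the leading slash in the pattern assumes that the rule is placed in the server-level configuration rather than in .htaccess.

```apache
RewriteEngine On
# Forward /app/<anything> to the backend, keeping the remainder of the path
RewriteRule ^/app/(.*)$ http://backend.domain.com/$1 [P]
# Rewrite redirect headers from the backend so that they point back at the proxy
ProxyPassReverse /app/ http://backend.domain.com/
```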

ProxyPassReverse

Description: Adjusts the URL in HTTP response headers sent from a reverse proxied server
Syntax: ProxyPassReverse [path] url
Context: server config, virtual host, directory
Module: mod_proxy

The ProxyPassReverse directive adjusts the URL in the Location, Content-Location and URI headers of HTTP redirect responses. This is essential when using a reverse proxy: otherwise, HTTP redirects issued by the backend servers behind the reverse proxy would lead clients to bypass it.

Only the HTTP response headers mentioned above will be rewritten. This means that if the proxied content contains absolute URL references, they will bypass the proxy.

path is the name of a local virtual path; url is a partial URL for the remote server. Both are used in the same way as in the ProxyPass directive.

Example:

Say the local server has address http://domain.com/.

ProxyPass /mirror/foo/ http://backend.domain.com/
ProxyPassReverse /mirror/foo/ http://backend.domain.com/
ProxyPassReverseCookieDomain backend.domain.com public.domain.com
ProxyPassReverseCookiePath / /mirror/foo/ 

The above configuration will cause a local request to http://domain.com/mirror/foo/bar to be internally treated as a proxy request to http://backend.domain.com/bar (ProxyPass functionality). It will also take care of redirects sent by the backend.domain.com server: when it redirects http://backend.domain.com/bar to http://backend.domain.com/quux, the proxy adjusts this to http://domain.com/mirror/foo/quux before forwarding the HTTP redirect response to the client.

Note that the ProxyPassReverse directive can also be used in conjunction with the proxy pass-through feature (RewriteRule ... [P]) of mod_rewrite, because it doesn't depend on a corresponding ProxyPass directive.

When used inside a <Location> section, the first argument is omitted and the local directory is obtained from the <Location>.

ProxyPassReverseCookieDomain

Description: Adjusts the Domain string in Set-Cookie headers from a reverse-proxied server
Syntax: ProxyPassReverseCookieDomain internal-domain public-domain
Context: server config, virtual host, directory
Module: mod_proxy

ProxyPassReverseCookieDomain is used similarly to ProxyPassReverse, but it rewrites domain string in Set-Cookie headers.

ProxyPassReverseCookiePath

Description: Adjusts the Path string in Set-Cookie headers from a reverse-proxied server
Syntax: ProxyPassReverseCookiePath internal-path public-path
Context: server config, virtual host, directory
Module: mod_proxy

ProxyPassReverseCookiePath is used similarly to ProxyPassReverse, but it rewrites the path string in Set-Cookie headers.
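
The combined effect of the two cookie directives can be pictured as follows. This is a sketch that assumes Helicon Ape rewrites matching Domain and Path attributes in the same way as Apache's mod_proxy; the cookie name id is hypothetical, and the header lines in the comments are illustrative rather than captured output.

```apache
ProxyPassReverseCookieDomain backend.domain.com public.domain.com
ProxyPassReverseCookiePath / /mirror/foo/
# Header sent by the backend (assumed):
#   Set-Cookie: id=1; Domain=backend.domain.com; Path=/
# Header expected to reach the client after rewriting:
#   Set-Cookie: id=1; Domain=public.domain.com; Path=/mirror/foo/
```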

ProxyReceiveBufferSize

Description: Network buffer size for proxied HTTP and FTP connections
Syntax: ProxyReceiveBufferSize bytes
Default: ProxyReceiveBufferSize 0
Context: server config, virtual host
Module: mod_proxy

The ProxyReceiveBufferSize directive specifies an explicit TCP/IP network buffer size for proxied HTTP and FTP connections, for increased throughput. It must be greater than 512, or set to 0 to indicate that the system's default buffer size should be used.

Example:

ProxyReceiveBufferSize 2048

ProxyRemote

Description: Remote proxy used to handle certain requests
Syntax: ProxyRemote match remote-server
Context: server config, virtual host
Module: mod_proxy

ProxyRemote defines remote proxies for this proxy. match is either the name of a URL scheme that the remote server supports, or a partial URL for which the remote server should be used, or * to indicate that the server should be contacted for all requests. remote-server is a partial URL for the remote server (only the http protocol is supported).

Example:

ProxyRemote http://thissite.com/ http://thatsite.com:8000
ProxyRemote * http://othersite.com
ProxyRemote ftp http://ftpproxy.domain.com:8080

The ProxyRemote directive also supports reverse proxy configurations: a backend web server can be embedded into a virtual host's URL space even if that server is hidden behind another forward proxy.
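
Such a reverse proxy setup might be configured along the following lines. This is only a sketch: the hosts backend.internal.example and upstream-proxy.example.com and the /app/ path are hypothetical, and it assumes that ProxyRemote is applied to requests generated by ProxyPass.

```apache
# Requests for the hidden backend are sent through the intermediate forward proxy
ProxyRemote http://backend.internal.example/ http://upstream-proxy.example.com:8080
# The backend is mapped into the local URL space as usual
ProxyPass /app/ http://backend.internal.example/
ProxyPassReverse /app/ http://backend.internal.example/
```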

ProxyRemoteMatch

Description: Remote proxy used to handle requests matched by regular expressions
Syntax: ProxyRemoteMatch regex remote-server
Context: server config, virtual host
Module: mod_proxy

ProxyRemoteMatch is identical to the ProxyRemote directive, except that the first argument is a regular expression that is matched against the requested URL.
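
A minimal sketch of a regular-expression match is shown below. The example.org domain and the proxy.example.com host are placeholders, and the pattern is only one possible way to select hosts by suffix.

```apache
# Send requests for any host ending in .example.org through the remote proxy
ProxyRemoteMatch ^http://[^/]*\.example\.org/ http://proxy.example.com:8080
```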

ProxyRequests

Description: Enables forward (standard) proxy requests
Syntax: ProxyRequests On|Off
Default: ProxyRequests Off
Context: server config, virtual host
Module: mod_proxy

ProxyRequests directive enables or disables forward proxy functionality.

If you are implementing a reverse proxy configuration, this option should be set to Off.

Warning! Do not enable ProxyRequests feature until your server is secured. Open proxy servers are dangerous for your network as well as for the Internet as a whole.

ProxyTimeout

Description: Network timeout for proxied requests
Syntax: ProxyTimeout seconds
Default: ProxyTimeout 300
Context: server config, virtual host
Module: mod_proxy

The ProxyTimeout directive allows you to specify a timeout for proxy requests. This is useful when you have a slow application server: rather than waiting indefinitely, it is better to return a timeout response.
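
For instance, to stop waiting for a slow backend after one minute instead of the default 300 seconds, a setting like the following could be used. The value 60 is an arbitrary choice for illustration, not a recommendation.

```apache
# Give up on a proxied request if the backend has not responded within 60 seconds
ProxyTimeout 60
```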

 

ProxyVia

Description: Information provided in the Via HTTP response header for proxied requests
Syntax: ProxyVia On|Off|Full|Block
Default: ProxyVia Off
Context: server config, virtual host
Module: mod_proxy

The ProxyVia directive controls the use of the Via: HTTP header by the proxy. Its intended use is to control the flow of proxy requests along a chain of proxy servers.

The following values may be assigned to this directive:

  • Off (default) - no special processing is performed. If a request or reply contains Via: header, it is passed through unchanged.
  • On - each request and reply will get Via: header line added for the current host.
  • Full - each generated Via: header line will additionally include the Helicon Ape version, shown as a Via: comment field.
  • Block - every proxy request will have all its Via: header lines removed. No new Via: header will be generated.