why does a remote address continue to crawl after redirect?

ISAPI_Rewrite is a powerful URL manipulation engine based on regular expressions
Posts: 4
Joined: 22 Apr 2013, 14:51

22 Apr 2013, 15:09

Hi. I am using the ISAPI_Rewrite version 2 module on my shared host's IIS 6 server. My .htaccess file looks like this:

RewriteCond %{REMOTE_ADDR} (199\.\d{1,3}\.\d{1,3}\.\d{1,3}|\b182\.\d{1,3}\.\d{1,3}\.\d{1,3})
RewriteRule (.*) http://www.dietrichsvault.com/CASP_BotCatch.aspx?htaccess=1&agentblock=0&ipblock=1 [IRP]

In reviewing my log files, I see that an agent from 199.187.122.91 is redirected to my bot-catch form but continues to crawl other pages. I want this IP address to be permanently redirected to CASP_BotCatch.aspx. How can I do this? Or does this agent have an index file that it refers to when crawling the pages on my site?

Thanks for any and all assistance.

Posts: 871
Joined: 12 Mar 2012, 09:54

Re: why does a remote address continue to crawl after redire

23 Apr 2013, 07:54

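Can you please fix the rules as follows to ensure the proper redirect (the ":" and "?" in the target URL are escaped, and the flags are written as [I,RP] instead of [IRP]):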

RewriteCond %{REMOTE_ADDR} (199\.\d{1,3}\.\d{1,3}\.\d{1,3}|\b182\.\d{1,3}\.\d{1,3}\.\d{1,3})
RewriteRule (.*) http\://www.dietrichsvault.com/CASP_BotCatch.aspx\?htaccess=1&agentblock=0&ipblock=1 [I,RP]

And I believe the configuration file is httpd.ini, not .htaccess, if you use ISAPI_Rewrite 2...
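
For anyone who wants to check which addresses the condition pattern catches before deploying it, here is a minimal Python sketch. It assumes Python's `re` module is a close enough stand-in for the ISAPI_Rewrite regex engine for this simple pattern, which may not hold in every detail. The function name `is_redirected` and the test addresses other than 199.187.122.91 are made up for illustration.

```python
import re

# Condition pattern copied from the RewriteCond line above.
PATTERN = re.compile(
    r"(199\.\d{1,3}\.\d{1,3}\.\d{1,3}|\b182\.\d{1,3}\.\d{1,3}\.\d{1,3})"
)

def is_redirected(remote_addr: str) -> bool:
    """Return True if the condition pattern matches the given IPv4 address.

    Assumption: Python's `re` approximates the rewrite engine's matching
    for this pattern; the real engine may differ.
    """
    return PATTERN.search(remote_addr) is not None

if __name__ == "__main__":
    # Only 199.187.122.91 comes from the thread; the rest are illustrative.
    for addr in ["199.187.122.91", "182.10.20.30", "198.51.100.7", "10.199.1.2"]:
        print(addr, is_redirected(addr))
```

The last address shows that a 199 in the second octet is not caught, because the pattern needs three more octets after the 199.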

Posts: 4
Joined: 22 Apr 2013, 14:51

Re: why does a remote address continue to crawl after redire

23 Apr 2013, 13:31

Thanks for the reply, Anton. I will correct the regex errors in the rewrite rule. I tested this on my own IP address and it worked. Is there something more complicated I need to do to it?

My shared hosting provider uses .htaccess. I thought it was httpd.ini as well. But it is version 2.

Thanks again.

Posts: 871
Joined: 12 Mar 2012, 09:54

Re: why does a remote address continue to crawl after redire

24 Apr 2013, 04:59

"Is there something more complicated I need to do to it?"
- I don't think so. It should work as expected.

"My shared hosting provider uses .htaccess. I thought it was httpd.ini as well. But it is version 2."
- Really weird... ISAPI_Rewrite 3 uses .htaccess and version 2 uses httpd.ini. But if it works fine for you, then it's fine.

Posts: 4
Joined: 22 Apr 2013, 14:51

Re: why does a remote address continue to crawl after redire

24 Apr 2013, 11:18

Anton, one last question.

Please confirm my understanding. An IP address that I want to redirect makes a request to my website and gets redirected based on the regex in the rewrite condition. So any attempt to access my site by an offending IP address matched by the condition is "blocked/redirected" no matter what page it requests. If this is true, then "it" must have a cached copy of my site index somewhere, correct? Is this what you and your colleagues have seen?

And based on your experience, how persistent are these problem/questionable crawlers/visitors? If you redirect them one time, do they keep coming back, give up at some point, or try a new IP address or new agent name? My guess is all three.

Thank you for your time and excellent help. It has been a learning experience. I am going to monitor my logs to make sure that I am not blocking good bots and real customers. My blocking covers entire class A-sized (/8) IP address ranges, but they are related to suspect countries that I don't want on my site anyway. If I miss a customer from Kazakhstan, well, that's not my target audience anyway. Thanks again.
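
To get a feel for how wide these blocks are, here is a rough Python sketch using the standard `ipaddress` module. It assumes the two alternatives in the condition pattern are equivalent to the networks 199.0.0.0/8 and 182.0.0.0/8, which is how the pattern reads for ordinary IPv4 addresses. The names `BLOCKED` and `in_blocked_range` are made up for illustration.

```python
import ipaddress

# Assumption: the regex alternatives 199.x.x.x and 182.x.x.x correspond
# to these two /8 networks.
BLOCKED = [
    ipaddress.ip_network("199.0.0.0/8"),
    ipaddress.ip_network("182.0.0.0/8"),
]

def in_blocked_range(addr: str) -> bool:
    """Return True if addr falls inside one of the blocked /8 networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED)

# Total number of addresses covered by the two blocks.
total_blocked = sum(net.num_addresses for net in BLOCKED)

if __name__ == "__main__":
    print(in_blocked_range("199.187.122.91"))
    print(total_blocked)
```

Each /8 covers 16,777,216 addresses, so the two blocks together cover about 33.5 million, which is why checking the logs for legitimate visitors in those ranges is worthwhile.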

Posts: 871
Joined: 12 Mar 2012, 09:54

Re: why does a remote address continue to crawl after redire

25 Apr 2013, 07:57

"Please confirm my understanding. An ipaddress that I want to redirect makes its request to my website, it gets redirected based on the regex in the rewrite condition. So any attempt to access my site by the offending ipaddress as matched in the regex condition is "blocked/redirected" no matter what page it requests."
- yes, this is true.

"And based on your experience how persistent are these problem/questionable crawlers/visitors? If you redirect them one time, do they keep coming back or do they give up at some point, try a new ipaddress or new agent name--my guess is all three."
- You are probably right: all three. Although I don't have any statistics...
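
If the logs still show a matched address fetching other pages, one way to confirm it is to list every request from a matched IP that did not receive a redirect status. The sketch below is only an illustration: it assumes the server writes W3C extended log files with a `#Fields:` line that includes `c-ip`, `cs-uri-stem` and `sc-status`, and that a working rule answers with 301 or 302. The sample log lines and the function name `unredirected_hits` are invented for the example, not taken from the real site.

```python
import re

# Same condition pattern as in the rewrite rule.
PATTERN = re.compile(
    r"(199\.\d{1,3}\.\d{1,3}\.\d{1,3}|\b182\.\d{1,3}\.\d{1,3}\.\d{1,3})"
)

# Hypothetical log excerpt in W3C extended format (not real data).
SAMPLE_LOG = """\
#Fields: date time c-ip cs-uri-stem sc-status
2013-04-22 15:01:10 199.187.122.91 /default.aspx 301
2013-04-22 15:01:12 199.187.122.91 /products.aspx 200
2013-04-22 15:02:00 198.51.100.7 /default.aspx 200
"""

def unredirected_hits(log_text: str):
    """List (ip, path, status) for matched IPs that were NOT redirected."""
    fields = []
    hits = []
    for line in log_text.splitlines():
        if line.startswith("#Fields:"):
            # Column names follow the directive; remember their order.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        # Assumption: 301/302 means the rule fired for this request.
        if PATTERN.search(row["c-ip"]) and row["sc-status"] not in ("301", "302"):
            hits.append((row["c-ip"], row["cs-uri-stem"], row["sc-status"]))
    return hits

if __name__ == "__main__":
    for hit in unredirected_hits(SAMPLE_LOG):
        print(hit)
```

Any line this reports means the rule did not fire for that request, which points at the rule or its scope rather than at the crawler holding a cached index.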
