Rewriting URLs with an ISAPI Filter

黄景胜
2023-12-01
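Problem: a Web site needs to rewrite URLs on second-level (sub)domains. For example, http://abc.company.com should be rewritten to http://company.com/usersite.asp?sitename=abc, where the subdomain abc is not fixed; it is whatever name the user registered. A DNS server on the machine resolves every *.company.com subdomain to company.com. After the rewrite, the browser's address bar still shows http://abc.company.com, while the request actually served is http://company.com/usersite.asp?sitename=abc, so the user appears to be visiting http://abc.company.com.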
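The intended mapping can be illustrated with a small standalone sketch. The helper name map_subdomain and its signature are hypothetical and are not part of IIRF; the sketch only shows the string transformation the filter is ultimately expected to perform, and it compares the host case-sensitively for brevity.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of IIRF): given a Host value such as
 * "abc.company.com" and the base domain "company.com", write the internal
 * target "/usersite.asp?sitename=abc" into out.
 * Returns 1 on success, 0 if host is not a direct subdomain of base
 * or the output buffer is too small. */
int map_subdomain(const char *host, const char *base, char *out, size_t outlen)
{
    size_t hlen = strlen(host);
    size_t blen = strlen(base);
    size_t nlen;
    int written;

    /* need at least "x." in front of the base domain */
    if (hlen < blen + 2) return 0;
    /* host must end with ".<base>" */
    if (host[hlen - blen - 1] != '.') return 0;
    if (strcmp(host + hlen - blen, base) != 0) return 0;

    nlen = hlen - blen - 1;                 /* length of the subdomain label */
    if (memchr(host, '.', nlen) != NULL)    /* reject "a.b.company.com" */
        return 0;

    written = snprintf(out, outlen, "/usersite.asp?sitename=%.*s",
                       (int)nlen, host);
    return written > 0 && (size_t)written < outlen;
}
```

A request for the bare domain (company.com) is deliberately left alone, so the main site keeps working unchanged.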

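On Apache this is apparently easy: configure mod_rewrite and the problem is solved. Microsoft's IIS7 reportedly supports this as well, because IIS7 can run an HttpModule at any point in the IIS request pipeline. Below is a configuration for IIS7 (using the UrlRewriter.NET module):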
<?xml version="1.0" encoding="UTF-8"?>
<configuration> <configSections>
<section name="rewriter" 
requirePermission="false" 
type="Intelligencia.UrlRewriter.Configuration.RewriterConfigurationSectionHandler, Intelligencia.UrlRewriter" />
</configSections>
<system.web>
<httpModules>
<add name="UrlRewriter" type="Intelligencia.UrlRewriter.RewriterHttpModule, Intelligencia.UrlRewriter" />
</httpModules>
</system.web>
<system.webServer>
<modules runAllManagedModulesForAllRequests="true">
<add name="UrlRewriter" type="Intelligencia.UrlRewriter.RewriterHttpModule" />
</modules>
<validation validateIntegratedModeConfiguration="false" />
</system.webServer>
<rewriter>
<rewrite url="~/products/(.+)" to="~/products.aspx?category=$1" />
</rewriter>
</configuration>


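Unfortunately, the server runs Win2003 + IIS6 and hosts both ASP and ASP.NET applications. The UrlRewriter.NET component can only rewrite at the HttpModule level, so it only affects extensions handled by %WINDIR%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll, such as .aspx and .ascx requests. ASP requests are handled by %WINDIR%\system32\inetsrv\asp.dll and are never dispatched to the ASP.NET HttpModule, so the UrlRewriter.NET HttpModule has no effect on them. Requests for .jpg, .gif, and .htm cannot be rewritten either. That leaves ISAPI Filters ( http://msdn2.microsoft.com/en-us/library/ms525908.aspx ) as the only option.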

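There are two well-known ISAPI Filter projects:
1. Helicon Tech ISAPI Rewrite:  http://www.isapirewrite.com/ The full version of this ISAPI URL-rewriting product costs 99 US dollars (with a free 30-day trial); a free lightweight version is also available.
2. Ionics ISAPI Rewrite:  http://cheeso.members.winisp.net/IIRF.aspx A completely free, open-source component.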

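The full version of Helicon Tech ISAPI Rewrite already solves this problem. However, Helicon Tech offers few payment methods, and its online purchase process is hard to use. If paying were as easy as Alipay (one click and the money is gone, without even a confirmation step, heh), it would probably sell more.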

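So it is time to do it myself: take Ionics ISAPI Rewrite as the base and turn it into an ISAPI Filter that supports subdomain rewriting. Download Ionics Isapi Rewrite Filter (IIRF for short below) from  http://www.codeplex.com/IIRF/Release/ProjectReleases.aspx?ReleaseId=5018 . The latest version is 1.2.12c Beta. Do not use 1.2.12b, which has a memory leak; 1.2.12c fixes it (by adding the free( myCopy ) code).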

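The download contains 8 main files: IsapiRewrite4.c, TestDriver.c, IirfConfig.h, IirfConstants.h, IirfRequestContext.h, RewriteRule.h, pcre-5.0/pcre.h, and pcre-5.0/pcre.lib. TestDriver.c builds an exe for testing regular expressions and can be ignored for now. RewriteRule.h declares the rule structures, and pcre-5.0/pcre.h with pcre-5.0/pcre.lib provide the regular-expression engine.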

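In the makefile, change the VC="C:\Program Files\Microsoft Visual Studio 8\VC" and PCRESOURCE=E:\Project\InUrlRewrite\SRC\pcre-5.0 settings to match the local machine. Then open the .NET 2.0 command prompt, cd into the download directory, and build with nmake -f makefile. Make sure the project compiles cleanly.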

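Next, set up an environment for testing the self-built ISAPI Filter. Create a WebSite and add the freshly built IsapiRewrite4.dll under ISAPI Filters, then restart IIS. In IsapiRewrite4.ini, set the log level to 5 (detailed logging) and specify where the log is written. Edit %WINDIR%\system32\drivers\etc\hosts and add the two lines 127.0.0.1 abc.company.com and 127.0.0.1 company.com to make testing easier. Add a test.asp file to the new WebSite; it only needs to contain Response.Write(Request.QueryString("Domain")) so that the rewritten request can be checked.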

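That completes the environment needed before coding, and the work on IIRF can begin. After reading all of the code, it turns out that the core is in IsapiRewrite4.c, which is a little over 2600 lines long. The core function is DoRewrites. The externally visible entry-point functions are GetFilterVersion, HttpFilterProc, and TerminateFilter; the ISAPI Filter exposes its functionality to IIS through these three functions, which IIS calls. They are described in detail at  http://msdn2.microsoft.com/en-us/library/ms525572.aspx .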

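Modifying DoRewrites is essentially all that is needed:
1. Add a new rule modifier flag: [D].
2. Change how ApplyRules handles the URL: when a rule carries the [D] flag, read http_host via GetServerVariable("HTTP_HOST", pfc), prepend it to OriginalUrl, and then run the regular-expression match.
These two steps implement subdomain rewriting.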
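The effect of step 2 can be tried outside IIS with a minimal sketch. It swaps PCRE for the POSIX regex API so that it builds without the IIRF sources; the function name match_domain_rule and the pattern are assumptions chosen for illustration, not the rule the original project used.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Sketch: build the match subject as host + url (what a [D] rule sees),
 * then capture the subdomain with a regular expression.
 * Hypothetical pattern: one label, then ".company.com", then the URL path.
 * Returns 1 and copies the captured subdomain into out on a match. */
int match_domain_rule(const char *host, const char *url,
                      char *out, size_t outlen)
{
    static const char *pattern = "^([^./]+)\\.company\\.com(/.*)$";
    char subject[512];
    regex_t re;
    regmatch_t m[3];
    int n, ok = 0;

    /* e.g. "abc.company.com" + "/" -> "abc.company.com/" */
    n = snprintf(subject, sizeof subject, "%s%s", host, url);
    if (n < 0 || (size_t)n >= sizeof subject) return 0;

    /* host names are case-insensitive, hence REG_ICASE */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0) return 0;

    if (regexec(&re, subject, 3, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < outlen) {
            memcpy(out, subject + m[1].rm_so, len);
            out[len] = '\0';
            ok = 1;
        }
    }
    regfree(&re);
    return ok;
}
```

Because the host is part of the subject, the first capture group plays the role of $1 in the replacement string, which is how the subdomain would end up in the rewritten query string.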

Note: the IIRF license (described in detail at the top of IsapiRewrite4.c) is strict. It may not permit redistribution or commercial use.

The modified project:
1. Modify RewriteRule.h: add a boolean IsMatchDomain field to the RewriteRule struct to mark the new flag. The struct after the change:
typedef struct RewriteRule {
    pcre * RE;
    char * Pattern;
    char * Replacement;
    boolean IsRedirect;
    int RedirectCode;
    boolean IsForbidden;
    boolean IsNotFound;
    boolean IsLastIfMatch;
    boolean IsCaseInsensitive;
    boolean RecordOriginalUrl;
    boolean IsMatchDomain;   // new: set by the [D] modifier flag

    // any condition that applies
    struct RewriteCondition * Condition;

    // doubly-linked list
    struct RewriteRule * next;
    struct RewriteRule * previous;
} RewriteRule, *P_RewriteRule;


2. Modify ParseRuleModifierFlags in IsapiRewrite4.c to handle IsMatchDomain. The function after the change:
void ParseRuleModifierFlags(char * pModifiers, RewriteRule *rule)
{
char MsgBuffer[512];

rule->IsRedirect= FALSE;
rule->IsForbidden= FALSE;
rule->IsLastIfMatch= FALSE;
rule->IsNotFound= FALSE;
rule->IsCaseInsensitive= FALSE;
rule->RecordOriginalUrl= FALSE;
rule->IsMatchDomain = FALSE;

if (pModifiers==NULL) return; // no flags at all

sprintf_s(MsgBuffer,512,"ParseRuleModifierFlags: %s", pModifiers);
LogMsg(2, MsgBuffer);

if ((pModifiers[0] != '[') ||
(pModifiers[strlen(pModifiers)-1] != ']')) {
LogMsg(1, "WARNING: Badly formed RewriteRule modifier flags.");
return;
}
else {
char * p1, *p2;
char * StrtokContext= NULL;
p1= pModifiers+1; // skip leading [
pModifiers[strlen(pModifiers)-1]=0; // remove trailing ]

p2= strtok_s(p1, ",", &StrtokContext); // split by commas
while (p2 != NULL) {
if (config->LogLevel >= 5 ) {
sprintf_s(MsgBuffer,512,"ParseRuleModifierFlags: token %s", p2);
LogMsg(5, MsgBuffer);
}

if (p2[0]=='R') { // redirect
rule->IsRedirect= TRUE;
rule->RedirectCode= REDIRECT_CODE_DEFAULT; // use the default redirect code
if ((p2[1]!=0) && (p2[1]=='=') && (p2[2]!=0)) {
int n= atoi(p2+2);
if ((n <= REDIRECT_CODE_MAX) && (n >= REDIRECT_CODE_MIN))
rule->RedirectCode= n;
}
}
else if ((p2[0]=='F') && (p2[1]==0)) { // forbidden (403)
LogMsg(5, "rule: Forbidden");
rule->IsForbidden= TRUE;
}
else if ((p2[0]=='N') && (p2[1]=='F') && (p2[2]==0)) { // not found (404)
LogMsg(5, "rule: Not found");
rule->IsNotFound= TRUE;
}
else if ((p2[0]=='L') && (p2[1]==0)) { // Last rule to process if match
LogMsg(5, "rule: Last");
rule->IsLastIfMatch= TRUE;
}
else if ((p2[0]=='I') && (p2[1]==0)) { // case-insensitive
LogMsg(5, "rule: Case Insensitive match");
rule->IsCaseInsensitive= TRUE;
}
else if ((p2[0]=='U') && (p2[1]==0)) { // Unmangle URLs
LogMsg(5, "rule: Unmangle URLs");
rule->RecordOriginalUrl= TRUE;
}
else if ((p2[0]=='D') && (p2[1]==0)) { // Match Domain Rule URLs
LogMsg(5, "rule: Match Domain URLs");
rule->IsMatchDomain = TRUE;
}
else {
sprintf_s(MsgBuffer,512,"WARNING: unsupported RewriteRule modifier flag %s", p2);
LogMsg(1, MsgBuffer);
}

p2= strtok_s(NULL, ",", &StrtokContext); // next token
}

// consistency checks
if (rule->IsForbidden && rule->IsRedirect)
LogMsg(1, "WARNING: Conflicting modifier flags - F,R");
if (rule->IsForbidden && rule->IsLastIfMatch)
LogMsg(1, "WARNING: Redundant modifier flags - F,L");
if (rule->IsForbidden && rule->IsNotFound)
LogMsg(1, "WARNING: Conflicting modifier flags - F,NF");
if (rule->IsNotFound && rule->IsLastIfMatch)
LogMsg(1, "WARNING: Redundant modifier flags - NF,L");
if (rule->IsNotFound && rule->IsRedirect)
LogMsg(1, "WARNING: Conflicting modifier flags - NF,R");
if (rule->IsRedirect && rule->IsLastIfMatch)
LogMsg(1, "WARNING: Redundant modifier flags - R,L");

}
return;
}
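
The flag-parsing idea in the function above can be exercised in isolation with a reduced sketch. The bit-mask constants and the name parse_flags are hypothetical (the real filter sets boolean fields on RewriteRule), only three of the flags are handled, and the sketch scans the string by hand instead of using strtok_s so that it does not modify its input.

```c
#include <string.h>

/* Hypothetical flag bits used only in this sketch. */
enum { FLAG_LAST = 1, FLAG_NOCASE = 2, FLAG_DOMAIN = 4 };

/* Parse a modifier string such as "[D,L]".
 * Returns a bit mask of the flags found, or -1 if the brackets are
 * missing or a token is not one of L, I, D. */
int parse_flags(const char *mods)
{
    size_t len = strlen(mods);
    const char *p, *end;
    int flags = 0;

    if (len < 2 || mods[0] != '[' || mods[len - 1] != ']')
        return -1;                       /* badly formed modifier list */

    p = mods + 1;                        /* skip leading '[' */
    end = mods + len - 1;                /* points at trailing ']' */
    while (p < end) {
        const char *comma = memchr(p, ',', (size_t)(end - p));
        const char *tokEnd = comma ? comma : end;
        size_t tokLen = (size_t)(tokEnd - p);

        if (tokLen == 1 && *p == 'L')      flags |= FLAG_LAST;
        else if (tokLen == 1 && *p == 'I') flags |= FLAG_NOCASE;
        else if (tokLen == 1 && *p == 'D') flags |= FLAG_DOMAIN;
        else return -1;                  /* unsupported flag */

        p = comma ? comma + 1 : end;     /* advance to the next token */
    }
    return flags;
}
```

With this shape, a rule ending in [D,L] would be recognised as both a domain-matching rule and the last rule to apply on a match.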


3. Modify ApplyRules in IsapiRewrite4.c to handle the HTTP host part of the request. The function after the change:

int ApplyRules( 
   HTTP_FILTER_CONTEXT * pfc, 
   char * subject, 
   int depth, 
   /* out */ char **result, 
   /* out */ boolean *pRecordOriginalUrl
)
{
RewriteRule * current= config->rootRule;
int retVal= 0; // 0 = do nothing, 1 = rewrite, 403 = forbidden, other = redirect
char MsgBuffer[512];
int c=0;
int RuleMatchCount, i; 
int *RuleMatchVector;
#if _WRITE_LOG
sprintf_s(MsgBuffer,512,"ApplyRules (depth=%d)", depth);
LogMsg(3, MsgBuffer);
#endif
if (current==NULL) {
#if _WRITE_LOG
  LogMsg(2, "ApplyRules: No configuration available.");
#endif
  return 0;
}


// The PCRE doc says vector length should be 3n?? why? seems like it ought to be 2n. or maybe 2n+1.
// In any case we allocate 3n. 
RuleMatchVector= (int *) malloc((config->MaxMatchCount*3)*sizeof(int)); 

// The way it works: First we evaluate the URL request, against the RewriteRule pattern. 
// If there is a match, then the logic evaluates the Conditions attached to the rule. 
// This may be counter-intuitive, since the Conditions appear BEFORE the rule in the file, 
// but the Rule is evaluated FIRST. 

// TODO: employ a MRU cache to map URLs
while (current!=NULL) {
  c++;

  LogMsg(3, "subject");
  LogMsg(3, subject);
    
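  // [D] rule: match against host + URL (e.g. "abc.company.com/path") instead of the URL alone.
  // Note: subject stays host-prefixed for the rules that follow in this loop.
  if(current->IsMatchDomain && depth==0)
  {
    char *serverHost;
    char *originalUrlWithDomain;
    serverHost= GetServerVariable("HTTP_HOST", pfc);
    if (serverHost != NULL) {
      // memory obtained from AllocMem is freed automatically by IIS
      originalUrlWithDomain = (char *) pfc->AllocMem(pfc, strlen(serverHost)+strlen(subject)+1, 0);
      if (originalUrlWithDomain != NULL) {
        strcpy(originalUrlWithDomain,serverHost);
        strcat(originalUrlWithDomain,subject);
        subject = originalUrlWithDomain;
      }
    }
  }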

    RuleMatchCount = pcre_exec( 
     current->RE, /* the compiled pattern */
     NULL, /* no extra data - we didn't study the pattern */
     subject, /* the subject string */
     strlen(subject), /* the length of the subject */
     0, /* start at offset 0 in the subject */
     0, /* default options */
     RuleMatchVector, /* output vector for substring position information */
     config->MaxMatchCount*3); /* number of elements in the output vector */

  // return code: >=0 means number of matches, <0 means error

  if (RuleMatchCount < 0) {
   if (RuleMatchCount== PCRE_ERROR_NOMATCH) {
#if _WRITE_LOG
    sprintf_s(MsgBuffer,512,"Rule %d : %d (No match)", c, RuleMatchCount );
    LogMsg(3, MsgBuffer);
#endif
   }
   else {
#if _WRITE_LOG  
      sprintf_s(MsgBuffer,512,"Rule %d : %d (unknown error)", c, RuleMatchCount);
      LogMsg(2, MsgBuffer);
#endif
   }
  }
  else if (RuleMatchCount == 0) {
#if _WRITE_LOG
    sprintf_s(MsgBuffer,512,"Rule %d : %d (The output vector (%d slots) was not large enough)", 
     c, RuleMatchCount, config->MaxMatchCount*3);
   LogMsg(2, MsgBuffer);
#endif
  }
  else {
   // we have a match and we have substrings
   boolean ConditionResult= FALSE;

   PcreMatchResult RuleMatchResult;
   PcreMatchResult CondMatchResult; 
#if _WRITE_LOG
   sprintf_s(MsgBuffer,512,"Rule %d : %d matches", c, RuleMatchCount);
   LogMsg(2, MsgBuffer);
#endif
   // easier to pass these as a structure
   RuleMatchResult.Subject= subject;
   RuleMatchResult.SubstringIndexes= RuleMatchVector;
   RuleMatchResult.MatchCount= RuleMatchCount;

   // The fields in CondMatchResult may be filled by the EvaluateConditionList(), but
   // we must init them because the EvaluateConditionList may never be called. The
   // results reflect only the "last" Condition evaluated. This may or may not be
   // the final Condition in the file; the evaluation engine won't evaluate
   // Conditions unnecessarily. Check the readme for more details.
   CondMatchResult.Subject= NULL;
   CondMatchResult.SubstringIndexes= NULL;
   CondMatchResult.MatchCount= 0;

   // evaluate the condition list, if there is one. 
   ConditionResult= 
    (current->Condition==NULL) || 
    EvaluateConditionList(pfc, 
         &RuleMatchResult, 
         &CondMatchResult, 
         current->Condition);

   // Check that any associated Condition evaluates to true, before 
   // applying this rule. 
   if ( ConditionResult ) {

    char *ts1;
    char *newString;

    // generate the replacement string
    // step 1: substitute server variables, if any.
    ts1= ReplaceServerVariables(pfc, current->Replacement);

    // step 2: substitute back-references as appropriate.
    newString= GenerateReplacementString(ts1, // current->Replacement,
             &RuleMatchResult, 
             &CondMatchResult);
    free(ts1);
    FreeMatchResult(&CondMatchResult);

    if (sizeof(MsgBuffer)-28> strlen(newString)) { 
#if _WRITE_LOG
     sprintf_s(MsgBuffer,512,"Result (length %d): %s", strlen(newString), newString);
     LogMsg(2,MsgBuffer);
#endif
    }
    else {
#if _WRITE_LOG  
      LogMsg(2,"(Log Buffer too small to show new string)");
     sprintf_s(MsgBuffer,512,"Result length: %d", strlen(newString));
     LogMsg(3,MsgBuffer);
#endif
    }

    // set output params
    *result= newString; 

    // if the current rule asks to record the original URL, then set the OUT flag. 
    *pRecordOriginalUrl |= current->RecordOriginalUrl;

    // check modifiers
    if (current->IsRedirect) {
     retVal = current->RedirectCode; // = redirect
    }
    else if (current->IsForbidden) {
     // no recurse 
#if _WRITE_LOG
     LogMsg(2,"returning: Forbidden");
#endif
     retVal = 403; // = forbidden
    }
    else if (current->IsNotFound) {
     // no recurse 
#if _WRITE_LOG  
      LogMsg(2,"returning: Not Found");
#endif
      retVal = 404; // = not found
    }
    else {
     // rewrite
     retVal= 1;
     if (current->IsLastIfMatch) {
      // no recurse
#if _WRITE_LOG
      LogMsg(2,"Last if Match");
#endif
      break;
      //current= NULL; // as a way to stop the loop
     }
     else {
      // by default, we recurse on the RewriteRules.
      if (depth < config->IterationLimit) {
       char * t; 
       int rv= ApplyRules(pfc, newString, depth+1, &t, pRecordOriginalUrl);
       if (rv) { 
        *result= t; // a newly allocated string
        retVal= rv; // for return to caller
        free(newString); // free our string, we no longer need it
       }
       // else, no match on recursion, so don't free newString (keep the existing result).
      }
      else {
#if _WRITE_LOG
       sprintf_s(MsgBuffer,512,"Iteration stopped; reached limit of %d cycles.", 
         config->IterationLimit);
       LogMsg(2,MsgBuffer);
#endif
      }
     }
    }

    break; // break out of while loop on the first match
   }

  }

  // We did not break out of the loop. 
  // Therefore, this rule did not apply. 
  // Therefore, go to the next rule. 
  if (current!=NULL)
   current= current->next;
}

free(RuleMatchVector);
#if _WRITE_LOG
sprintf_s(MsgBuffer,512,"ApplyRules: returning %d", retVal);
LogMsg(3,MsgBuffer);
#endif
return retVal;
}
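
One detail in the listing above is worth noting: subject is reassigned inside the while loop, so once a [D] rule has been tried, the rules after it are matched against the host-prefixed string as well, and a second [D] rule would have the host prepended twice. A possible way around this, sketched here with a hypothetical helper that is not part of IIRF, is to derive a fresh match subject for each rule and leave the original URL untouched.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string that a single rule
 * should be matched against. The host is prepended only for [D] rules on
 * the first pass (depth 0); url itself is never modified.
 * The caller frees the result after the rule has been evaluated. */
char *subject_for_rule(int isMatchDomain, int depth,
                       const char *host, const char *url)
{
    size_t hostLen =
        (isMatchDomain && depth == 0 && host != NULL) ? strlen(host) : 0;
    char *s = malloc(hostLen + strlen(url) + 1);

    if (s == NULL) return NULL;
    if (hostLen > 0) memcpy(s, host, hostLen);   /* "abc.company.com" */
    strcpy(s + hostLen, url);                    /* + "/path"         */
    return s;
}
```

In the filter itself the same string would also have to be used as RuleMatchResult.Subject, since the back-reference offsets returned by pcre_exec refer to whatever string was actually matched.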


Reposted from: http://zhangsichu.com/blogview.asp?Content_Id=82
