Robin | zevenseas | SharePoint Blog

September 23
Experiences with URL Rewriting and SharePoint

With IIS7 comes the URL Rewrite module, and the beauty of it is that you can write your own provider to do the magic. And that’s just what we did in our current project!

Why? Well, because instead of rewriting ‘dirty’ urls to ‘friendly’ urls, we do it the other way around. We take ‘friendly’ urls and make them ‘dirty’ again so that our solution can do its thing. We can do this because we also wrote our own navigation provider, meaning that we have complete control over the urls that are rendered on the page (they are either friendly or dirty).

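Let me give you an example of such a url translation (the shape follows from the provider code further down):

Friendly: /onlineshop/{category}/{product-name}-{productid}
Dirty:    /onlineshop/pages/summary.aspx?category={category}&productid={productid}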

(on the summary.aspx page, we have two webparts that, based on the querystring, either show an overview of all the products within a category or show detailed information about the selected product)

This is how the url rewrite module section of the web.config looks:

<rewrite>
  <rules>
    <rule name="ContosoUrlRewrite" enabled="true" patternSyntax="Wildcard" stopProcessing="false">
      <match url="*" />
      <action type="Rewrite" url="{ReplaceProvider:{REQUEST_URI}}" appendQueryString="false" />
    </rule>
  </rules>
  <outboundRules>
    <preConditions>
      <preCondition name="ResponseIsHtml1">
        <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
      </preCondition>
    </preConditions>
  </outboundRules>
  <providers>
    <provider name="ReplaceProvider" type="Contoso.Web.Common.RewriterProvider, Contoso.Web.Common, Version=1.0.0.0, Culture=neutral, PublicKeyToken=f582335141651861">
      <settings />
    </provider>
  </providers>
</rewrite>

As you can see, we apply the rewrite action to the REQUEST_URI of every url that is handled by IIS, using our own provider. So, what does the provider look like then, eh?

First the REQUEST_URI is passed to the Rewrite method. Based on what the value contains, we either parse it and rewrite the url, or do nothing at all if the url is already valid (for example, when a dirty url is used we just pass it through untouched).
Next, if the url contains the word “onlineshop” it means that it was a friendly url, so we need to rewrite it and determine what kind of url to pass back (either a category url or a category plus productid url).

You may also notice the ‘Avoid302’ method. When you navigate to the root of a site/web (like http://www.contoso.com), SharePoint redirects you to /pages/default.aspx or /default.aspx with a 302 status. SEO-wise, a 302 is not ‘friendly’ and it should be a ‘301’ instead. But no redirect at all is even better! In our project we only use /pages/default.aspx for our homepages, so we can rewrite the request straight away instead of having SharePoint redirect for us, thereby avoiding a 302 or 301.

/// <summary>
/// This is where the magic happens.. from IIS we get every request passed through,
/// by only filtering where the url contains 'onlineshop' we rewrite only the urls that
/// are used in the product and category pages.
/// </summary>
/// <param name="value"></param>
/// <returns></returns>
public string Rewrite(string value)
{
    //Nothing to rewrite, so hand the value back untouched
    if (string.IsNullOrEmpty(value))
        return value;

    //If the url contains a ~ symbol it means that the requested 
    //url comes from the layouts or another virtual directory, 
    //so we split it and only return the server relative url
    if (value.Contains("~"))
    {
        var parts = value.Split('~');
        if (parts.Length > 1)
        {
            return parts[1];
        }
    }

    //Rewriting category and product urls, falling back to the
    //original url when the parser does not recognise it
    if (value.Contains("/onlineshop"))
        return ParseUrl(value) ?? value;

    if (value.EndsWith("/"))
        return Avoid302(value);

    //If nothing matches then we leave the url as it is..
    return value;
}
 
/// <summary>
/// This parser is responsible for figuring out 
/// what kind of url is requested and thus 
/// determines what the return url is going to be so that
/// our code can handle it properly.
/// </summary>
/// <param name="incoming"></param>
/// <returns></returns>
private string ParseUrl(string incoming)
{
    //Check if we got a product
    string url = GetProductUrl(incoming);
    if (url != null)
        return url;

    //Check if we got a category (null when this does not match either)
    return GetCategoryUrl(incoming);
}

/// <summary>
/// The awesome RegEx comes from no other than 
/// Emile Bosch (http://nl.linkedin.com/in/ebosch)
/// </summary>
/// <param name="incoming"></param>
/// <returns></returns>
private string GetCategoryUrl(string incoming)
{
    var r = new Regex("onlineshop/(.*)");
    var m = r.Match(incoming);
    if (m.Success)
    {
        var category = m.Groups[1].Value;
        return BuildUrl(category, string.Empty);
    }
    return null;
}

/// <summary>
/// The awesome RegEx comes from no other than 
/// Emile Bosch (http://nl.linkedin.com/in/ebosch)
/// </summary>
/// <param name="incoming"></param>
/// <returns></returns>
private string GetProductUrl(string incoming)
{
    var productRegex = new Regex("(onlineshop/(.*?)/([A-Za-z_0-9-]+-([0-9]+)))",
        RegexOptions.Compiled);

    var x = productRegex.Match(incoming);
    if (x.Success)
    {
        var category = x.Groups[2].Value;
        var prodid = x.Groups[4].Value;
        return BuildUrl(category, prodid);
    }
    return null;
}

/// <summary>
/// This method builds the Url so that our logic doesn't even know that url was 'clean'
/// </summary>
/// <param name="categoryId"></param>
/// <param name="productId"></param>
/// <returns></returns>
private string BuildUrl(string categoryId, string productId)
{
    categoryId = categoryId.Replace("/", "_");
   
    if (!string.IsNullOrEmpty(productId))
    {
        return "/onlineshop/pages/summary.aspx?category=" + categoryId + "&productid=" + productId;
    }

    return "/onlineshop/pages/summary.aspx?category=" + categoryId;
}
 
/// <summary>
/// This is to avoid the 302 that SharePoint likes to give when landing on the "/"
/// of a SharePoint site/web and redirects you to default.aspx. 
/// </summary>
/// <param name="value"></param>
/// <returns></returns>
private string Avoid302(string value)
{
    return value + "pages/default.aspx";
}
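
To make the translation concrete, here is a small stand-alone sketch that runs the same two patterns (with fewer capture groups) outside of IIS. The class name FriendlyUrlDemo, the method name Translate and the sample urls (bikes, trail-bike-42) are made up for illustration; only the patterns and the BuildUrl logic mirror the provider above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FriendlyUrlDemo
{
    // Same patterns as the provider, trimmed down to the groups we need.
    private static readonly Regex ProductRegex =
        new Regex("onlineshop/(.*?)/[A-Za-z_0-9-]+-([0-9]+)");
    private static readonly Regex CategoryRegex =
        new Regex("onlineshop/(.*)");

    public static string Translate(string friendlyUrl)
    {
        // Products are tried first because the category pattern matches them too.
        var product = ProductRegex.Match(friendlyUrl);
        if (product.Success)
            return BuildUrl(product.Groups[1].Value, product.Groups[2].Value);

        var category = CategoryRegex.Match(friendlyUrl);
        if (category.Success)
            return BuildUrl(category.Groups[1].Value, string.Empty);

        // Not a shop url, leave it alone.
        return friendlyUrl;
    }

    private static string BuildUrl(string categoryId, string productId)
    {
        // Nested categories are flattened: bikes/mountain becomes bikes_mountain.
        categoryId = categoryId.Replace("/", "_");

        if (!string.IsNullOrEmpty(productId))
            return "/onlineshop/pages/summary.aspx?category=" + categoryId + "&productid=" + productId;

        return "/onlineshop/pages/summary.aspx?category=" + categoryId;
    }

    public static void Main()
    {
        // /onlineshop/pages/summary.aspx?category=bikes
        Console.WriteLine(Translate("/onlineshop/bikes"));
        // /onlineshop/pages/summary.aspx?category=bikes_mountain
        Console.WriteLine(Translate("/onlineshop/bikes/mountain"));
        // /onlineshop/pages/summary.aspx?category=bikes_mountain&productid=42
        Console.WriteLine(Translate("/onlineshop/bikes/mountain/trail-bike-42"));
    }
}
```

Note that the product id is whatever number follows the last dash of the last url segment, so the product name in front of it is purely cosmetic.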
 

This is how we solved the url rewriting bit. Some things for your consideration: there is no HttpContext (and thus no SPContext) at the time the url rewriting takes place. Therefore, no context-dependent logic can be used to do even more fancy stuff, yet it also means that it’s very lightweight and has little performance impact. Also, once a url has been rewritten, the rewrite is cached, which sometimes makes debugging very difficult, so expect a lot of IISRESETs while debugging (but we aren’t new to that now, are we? ;))
Also, if you look around and do your homework, you will find out that URL Rewrite + SharePoint = No Support, so make sure your client/customer knows about this. On the other hand, the beauty of this technique is that you can turn it on and off relatively easily in IIS, plus everything will keep working if you pass the ‘dirty’ urls, since we are only rewriting the friendly ones.
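
One thing to watch when relying on that pass-through: a dirty url such as /onlineshop/pages/summary.aspx?category=... also contains “/onlineshop”, and as far as the code shown here goes, the category pattern would match it too. Below is a minimal sketch of an explicit check, under the assumption (not stated in this post) that dirty urls always point at an .aspx page and friendly urls never do. DirtyUrlGuard and IsFriendlyShopUrl are hypothetical names.

```csharp
using System;

public static class DirtyUrlGuard
{
    // Assumption: dirty shop urls always target an .aspx page,
    // friendly shop urls never contain ".aspx" in their path.
    public static bool IsFriendlyShopUrl(string requestUri)
    {
        if (string.IsNullOrEmpty(requestUri))
            return false;

        // Only inspect the path; the query string of a dirty url can contain anything.
        int queryStart = requestUri.IndexOf('?');
        string path = queryStart >= 0 ? requestUri.Substring(0, queryStart) : requestUri;

        return path.IndexOf("/onlineshop/", StringComparison.Ordinal) >= 0
            && path.IndexOf(".aspx", StringComparison.OrdinalIgnoreCase) < 0;
    }
}
```

In the Rewrite method, the “onlineshop” branch could then call IsFriendlyShopUrl(value) instead of value.Contains("/onlineshop"), so that dirty urls fall through to the final return value untouched.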

In short, by taking this approach we are not rewriting SharePoint urls to be friendly; we are ‘just’ handling friendly urls that come from a custom navigation provider and that we have to make ‘SharePoint’-ish again.


Comments

Why parse all urls?

I guess images, web service calls, etc. are also processed by the rewriter. Maybe making the match of the rule in web.config more specific will improve site performance. Try running dotTrace on your project to find that out.
 on 14/01/2011 13:09
 
