28. 06. 2009.   #5
Marko Medojevic
Member
Certified
 
Join date: 12.05.2007
Location: Beograd
Posts: 82
Thanks given: 20
Thanked 293 times in 7 posts

Pattern matching

Yes, Googlebot interprets some pattern matching. This is an extension of the standard, so not all bots may follow it.

Matching a sequence of characters using *
You can use an asterisk (*) to match a sequence of characters. For instance, to block access to all subdirectories that begin with private, you could use the following entry:

User-agent: Googlebot
Disallow: /private*/

To block access to all URLs that include a question mark (?), you could use the following entry:

User-agent: *
Disallow: /*?

To block access to all URLs containing the word "private", you could use:

User-agent: *
Disallow: /*private*
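
To make the * behaviour concrete, here is a rough Python sketch of how such a rule could be checked against a URL path. It is only an illustration, not Googlebot's actual code: the function name star_rule_matches is made up for this example, and it assumes that a rule matches from the start of the path and that * stands for any run of characters.

```python
import re

def star_rule_matches(rule: str, path: str) -> bool:
    """Hypothetical helper: does a robots.txt path rule containing '*' match the path?

    Assumption: rules are prefix matches, and '*' matches any sequence of
    characters (including none). Everything else is taken literally.
    """
    # Escape the literal pieces, then join them with ".*" in place of each '*'.
    regex = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start of the path only, which gives prefix matching.
    return re.match(regex, path) is not None

# The three rules from the examples above:
print(star_rule_matches("/private*/", "/private-files/index.html"))  # matches
print(star_rule_matches("/*?", "/page?id=1"))                        # matches
print(star_rule_matches("/*private*", "/docs/private-notes.html"))   # matches
```

Note that the ? in /*? is treated as a literal question mark here, not as a wildcard.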

Matching the end characters of the URL using $
You can use the $ character to match the end of the URL. For instance, to block any URLs that end with .asp, you could use the following entry:

User-agent: Googlebot
Disallow: /*.asp$
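
The effect of the trailing $ can be sketched in Python as well. Again this is only an illustration under assumptions: rule_matches is a made-up name, rules are assumed to be prefix matches unless they end in $, and $ is only given a special meaning as the last character of the rule.

```python
import re

def rule_matches(rule: str, path: str) -> bool:
    """Hypothetical helper: match a robots.txt path rule with '*' and a trailing '$'."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # '*' becomes ".*"; every other character is matched literally.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        # A trailing '$' means the path must end right here.
        regex += r"\Z"
    return re.match(regex, path) is not None

print(rule_matches("/*.asp$", "/shop/list.asp"))         # True: path ends with .asp
print(rule_matches("/*.asp$", "/shop/list.asp?page=2"))  # False: characters follow .asp
print(rule_matches("/*.asp", "/shop/list.asp?page=2"))   # True: without $, a prefix match is enough
```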

You can use this pattern matching in combination with the Allow directive. For instance, if a ? indicates a session ID, you may want to exclude all URLs that contain them to ensure Googlebot doesn't crawl duplicate pages. But URLs that end with a ? may be the version of the page that you do want included. For this situation, you can set your robots.txt file as follows:

User-agent: *
Allow: /*?$
Disallow: /*?

The Disallow: /*? line will block any URL that includes a ? (more specifically, it will block any URL that begins with your domain name, followed by any string, followed by a question mark, followed by any string).

The Allow: /*?$ line will allow any URL that ends in a ? (more specifically, it will allow any URL that begins with your domain name, followed by a string, followed by a ?, with no characters after the ?).
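
How the two lines interact can be sketched like this. The quoted text does not say how a conflict between Allow and Disallow is resolved, so the sketch assumes the rule that the longest matching pattern wins and that Allow wins a tie, which is how Google describes Googlebot's behaviour; other bots may decide differently. The names rule_matches, is_allowed and RULES are invented for this example.

```python
import re

def rule_matches(rule: str, path: str) -> bool:
    """Hypothetical helper: match a path rule with '*' wildcards and a trailing '$'."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None

def is_allowed(rules, path: str) -> bool:
    """Assumed precedence: the longest matching rule wins, Allow wins ties.

    If no rule matches at all, the path is treated as allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not rule_matches(pattern, path):
            continue
        is_allow = directive.lower() == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len = len(pattern)
            allowed = is_allow
    return allowed

# The robots.txt entry from the example above, as (directive, pattern) pairs.
RULES = [("Allow", "/*?$"), ("Disallow", "/*?")]

print(is_allowed(RULES, "/page?"))              # True: ends in ?, the longer Allow rule wins
print(is_allowed(RULES, "/page?sessionid=42"))  # False: only the Disallow rule matches
print(is_allowed(RULES, "/page"))               # True: no rule matches
```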

Source:
http://www.google.com/support/webmas...n&answer=40367