All beginner questions — All beginner questions should be posted in this forum; if a thread turns into a quality discussion of interest to everyone, we will move it to the appropriate forum. We ask the "experts" not to belittle beginners: if they want to help, we will all be grateful; if not, they can simply skip this forum.
28. 06. 2009. | #1 |
professional
Qualified
Join date: 19.05.2007
Posts: 123
Thanks: 13
Thanked 3 times in 3 posts
robots.txt - preventing indexing of certain links
Hi.
I'm not sure where to post this, and it's a fairly beginner question. In robots.txt I put:

Code:
User-agent: *
Disallow: /posalji-email-*

Where am I going wrong? Thanks and regards.
28. 06. 2009. | #2 |
Super Moderator
Knowledge base
Join date: 02.10.2006
Location: Niš
Posts: 1.618
Thanks: 263
Thanked 275 times in 104 posts
I don't think search engines recognize * in Disallow.
In short - you won't be able to do this with robots.txt. Put nofollow on the links to those pages, and remove the already indexed pages through Google Webmaster Tools.

Last edited by Peca: 28. 06. 2009. at 16:18.
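The claim that strict, original-spec parsers treat Disallow values as literal path prefixes (no wildcard expansion) can be checked with Python's standard urllib.robotparser. A quick sketch - the example URL is made up:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
# Feed the rules directly instead of fetching a live robots.txt
rp.parse([
    "User-agent: *",
    "Disallow: /posalji-email-*",
])

# A spec-strict parser reads "/posalji-email-*" as a literal prefix
# (asterisk included), so a real URL like this is NOT considered blocked:
print(rp.can_fetch("*", "http://example.com/posalji-email-123"))
```

This prints True (fetching allowed), i.e. the wildcard rule does not block the URL for a parser that follows the original standard - which is exactly why the rule seemed to do nothing for some crawlers.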
28. 06. 2009. | #3 |
professional
Qualified
Join date: 19.05.2007
Posts: 123
Thanks: 13
Thanked 3 times in 3 posts
I put rel="nofollow" on those links, but it seems Google has still recognized them as disallowed: under "URL restricted by robots.txt" in Google Webmaster Tools it says about 120 links are blocked, while only about 10 of them have actually been indexed. Those indexed links also appear in the Restricted by robots.txt list.
Now, I can't remember 100%, but I think I added these rules to robots.txt afterwards, so Google managed to index those ~10 links within a day. Thanks for the help. To be honest, I had never used Google Webmaster Tools before. Regards.
28. 06. 2009. | #4 |
Super Moderator
Knowledge base
Join date: 02.10.2006
Location: Niš
Posts: 1.618
Thanks: 263
Thanked 275 times in 104 posts
28. 06. 2009. | #5 |
member
Certified
Pattern matching

Yes, Googlebot interprets some pattern matching. This is an extension of the standard, so not all bots may follow it.

Matching a sequence of characters using *

You can use an asterisk (*) to match a sequence of characters. For instance, to block access to all subdirectories that begin with private, you could use the following entry:

Code:
User-agent: Googlebot
Disallow: /private*/

To block access to all URLs that include a question mark (?), you could use the following entry:

Code:
User-agent: *
Disallow: /*?

To block access to all URLs containing the word "private", you could use:

Code:
User-agent: *
Disallow: /*private*

Matching the end characters of the URL using $

You can use the $ character to specify matching the end of the URL. For instance, to block any URLs that end with .asp, you could use the following entry:

Code:
User-agent: Googlebot
Disallow: /*.asp$

You can use this pattern matching in combination with the Allow directive. For instance, if a ? indicates a session ID, you may want to exclude all URLs that contain them to ensure Googlebot doesn't crawl duplicate pages. But URLs that end with a ? may be the version of the page that you do want included. For this situation, you can set your robots.txt file as follows:

Code:
User-agent: *
Allow: /*?$
Disallow: /*?

The Disallow: /*? line will block any URL that includes a ? (more specifically, it will block any URL that begins with your domain name, followed by any string, followed by a question mark, followed by any string). The Allow: /*?$ line will allow any URL that ends in a ? (more specifically, it will allow any URL that begins with your domain name, followed by a string, followed by a ?, with no characters after the ?).

Source: http://www.google.com/support/webmas...n&answer=40367
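For anyone who wants to experiment with these rules locally, here is a minimal sketch of the matching logic described above. google_rule_matches is a made-up helper, not part of any official library, and real crawlers may differ in edge cases:

```python
import re

def google_rule_matches(rule: str, path: str) -> bool:
    """Check whether a Googlebot-style Disallow/Allow pattern matches a URL path.

    '*' matches any sequence of characters; a trailing '$' anchors the
    match to the end of the path. Otherwise the rule is a prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape regex metacharacters, then turn the escaped '*' back into '.*'
    pattern = re.escape(rule).replace(r"\*", ".*")
    pattern = "^" + pattern + ("$" if anchored else "")
    return re.search(pattern, path) is not None

print(google_rule_matches("/*.asp$", "/reports/annual.asp"))  # True
print(google_rule_matches("/*?$", "/page.php?"))              # True (ends in ?)
print(google_rule_matches("/*?$", "/page.php?id=3"))          # False (chars after ?)
print(google_rule_matches("/*?", "/page.php?id=3"))           # True (contains ?)
```

The last two calls illustrate the Allow: /*?$ versus Disallow: /*? combination from the quoted article: the anchored rule only matches URLs that end in the question mark.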
2 members thank Marko Medojevic for this post:
28. 06. 2009. | #6 |
Super Moderator
Knowledge base
Join date: 02.10.2006
Location: Niš
Posts: 1.618
Thanks: 263
Thanked 275 times in 104 posts
this is exactly what I needed
tnx.
14. 05. 2010. | #7 |
professional
Qualified
Join date: 19.05.2007
Posts: 123
Thanks: 13
Thanked 3 times in 3 posts
Does anyone know why Google indexes pages that have <meta name="robots" content="noindex" /> inside the head tags?
The format of those URLs is: http://domen.com/forum/viewtopic.php...t=0&view=print

In robots.txt I also put:

Code:
Disallow: /forum/*&start=0&view=print
Disallow: /forum/*view=print

but no luck - they don't show up under 'Restricted by robots.txt' in Google Webmaster Central. Is there any trick for sending bulk removal requests in Google Webmaster Central, e.g. to remove all links that contain 'print', or does it have to be done one by one?

EDIT: When I send a removal request for one of those links, it does get removed; normally, for a link to be removable it has to be either 404, noindex, or restricted by robots.txt.

Last edited by mb_sa: 14. 05. 2010. at 21:59.
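A likely explanation, worth verifying for the specific setup: a crawler can only obey a noindex meta tag on pages it is allowed to fetch. If the same URLs are also blocked in robots.txt, the pages are never crawled, the tag is never seen, and the URLs can remain indexed from inbound links alone. So while waiting for noindex to take effect, the robots.txt block usually has to be lifted. A hypothetical fragment:

```
# robots.txt - sketch, not the poster's actual file
User-agent: *
# Temporarily removed (commented out) so the crawler can fetch the
# print pages and see their <meta name="robots" content="noindex" /> tag:
# Disallow: /forum/*view=print
```

Once the pages have dropped out of the index, the Disallow rule can be restored to save crawl budget.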
Similar threads
Thread | Thread starter | Forum | Replies | Last post
robots.txt | GaVrA | (X)HTML, JavaScript, DHTML, XML, CSS | 4 | 14. 11. 2008. 19:34
Drupal robots.txt not working as it should | BluesRocker | Marketing and SEO | 2 | 11. 08. 2008. 23:18
Awstats Robots/Spiders visitors statistics | novi | All beginner questions | 4 | 28. 01. 2008. 12:12
robots-nocontent tag | Eniac | Marketing and SEO | 0 | 03. 05. 2007. 11:33
Google indexing a forum as a php nuke module | bukovski | Marketing and SEO | 4 | 11. 11. 2006. 08:47