Need help!
Posted: 25 Oct 2004 01:09 pm
by Duvel78
I need some help with the robots.txt file, to avoid a duplicate content problem caused by my URL rewriting... but I'm lost.
For example, I found this code in a tutorial:
Code:
Disallow: /your-forum-folder/sutra*.html$
Disallow: /your-forum-folder/ptopic*.html$
Disallow: /your-forum-folder/ntopic*.html$
Disallow: /your-forum-folder/ftopic*asc*.html$
But a "*" is not allowed in a "Disallow" instruction...

Posted: 25 Oct 2004 01:24 pm
by 5lab
why not? surely robots.txt just tells bots what to do?
Posted: 25 Oct 2004 01:28 pm
by Duvel78
Yep, it tells the bots what to do and what not to do.
But in the code, a "*" can't appear after "Disallow".
Posted: 25 Oct 2004 01:47 pm
by 5lab
different bots use slightly different rules. how do you know you can't use a * after a disallow?
Posted: 25 Oct 2004 02:13 pm
by Duvel78
It's a general rule (http://www.robotstxt.org):
Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".
whole article:
http://www.robotstxt.org/wc/exclusion-admin.html
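To make the quoted rule concrete: under the original standard, a Disallow value is compared against the start of the URL path, character for character. The snippet below is only a rough sketch of that prefix matching, not any crawler's real code; the function name is_disallowed and the sample path are made up for illustration.

```python
def is_disallowed(path, disallow_rules):
    """Original-standard matching: a rule blocks any path that starts with it.

    An empty rule value blocks nothing, so it is skipped.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


# Hypothetical URL path produced by the forum's URL rewriting.
url_path = "/your-forum-folder/sutra123.html"

# The tutorial's rule, read literally: "*" and "$" are ordinary characters.
wildcard_rules = ["/your-forum-folder/sutra*.html$"]

# A plain prefix rule with no wildcard characters.
prefix_rules = ["/your-forum-folder/sutra"]

print(is_disallowed(url_path, wildcard_rules))  # False: no path literally starts with "sutra*"
print(is_disallowed(url_path, prefix_rules))    # True: the path starts with the prefix
```

So for a bot that follows only the original standard, the tutorial's lines would block nothing, while a bare prefix such as /your-forum-folder/sutra would block every URL beginning with it.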
Posted: 25 Oct 2004 02:18 pm
by 5lab
ah gotcha. is there another character, something like ? you can use?
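For what it's worth, some crawlers (Googlebot, for instance) document an extension to the original standard in which "*" matches any run of characters and a trailing "$" anchors the end of the URL. The snippet below is a hedged sketch of how such a bot might read the tutorial's rule; the helper name wildcard_rule_to_regex and the sample paths are assumptions for illustration, not taken from any crawler's source.

```python
import re


def wildcard_rule_to_regex(rule):
    """Translate an extended Disallow rule into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Without "$", the rule behaves like a prefix match.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then join the pieces with ".*" for each "*".
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


rule = wildcard_rule_to_regex("/your-forum-folder/sutra*.html$")

print(bool(rule.match("/your-forum-folder/sutra123.html")))      # True
print(bool(rule.match("/your-forum-folder/sutra123.html?x=1")))  # False: does not end in .html
```

Bots that stick to the original standard would still ignore these characters, so such rules only help with crawlers that document the extension.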
Posted: 31 Oct 2004 06:41 pm
by foggyjames
Presumably the * is a 'wildcard', which doesn't work with this board... in which case the 'wildcard' for this board must be something else...
...he says from a position of complete ignorance
cheers
James
Posted: 31 Oct 2004 06:46 pm
by 5lab
you kinda had the right idea, but not entirely right. never mind eh
