Best robots.txt Format for Simple Machines Forums

The following is the best robots.txt (https://www.dailytechtuts.com/robots.txt) format for Simple Machines Forum (SMF) sites:

User-agent: *

# --------------------------------------------------
# 1. Allow rendering resources globally (critical)
# --------------------------------------------------
Allow: /*.css$
Allow: /*.js$
Allow: /*.png$
Allow: /*.jpg$
Allow: /*.jpeg$
Allow: /*.gif$
Allow: /*.webp$
Allow: /*.svg$
Allow: /*.woff$
Allow: /*.woff2$
Allow: /*.ttf$

# --------------------------------------------------
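# 2. Block low-value system directories
# (the longest matching rule wins, so the short global
# Allow rules in section 1 do not reach inside these
# directories; the scoped Allow rules below keep theme
# CSS, JS, and images crawlable)
# --------------------------------------------------
Disallow: /Themes/
Allow: /Themes/*.css$
Allow: /Themes/*.js$
Allow: /Themes/*.png$
Allow: /Themes/*.jpg$
Allow: /Themes/*.gif$
Allow: /Themes/*.svg$
Disallow: /Sources/
Disallow: /Packages/
Disallow: /Smileys/
Disallow: /avatars/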

# --------------------------------------------------
# 3. Block guaranteed crawl traps (not speculative)
# --------------------------------------------------
Disallow: /index.php?action=printpage
Disallow: /index.php?action=search
Disallow: /index.php?action=profile
Disallow: /index.php?action=login
Disallow: /index.php?action=register
Disallow: /index.php?action=help
Disallow: /index.php?action=calendar
Disallow: /index.php?action=stats
Disallow: /index.php?action=who
Disallow: /index.php?action=recent
Disallow: /index.php?action=unread
Disallow: /index.php?action=mlist

# --------------------------------------------------
# 4. Block true duplication at scale
# --------------------------------------------------
Disallow: /*sort=

# --------------------------------------------------
# 5. Block session IDs (critical)
# --------------------------------------------------
Disallow: /*;PHPSESSID=
Disallow: /*PHPSESSID=

# --------------------------------------------------
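# 6. Core content (topics and boards)
# Crawlable by default, so no Allow rule is needed. An
# explicit "Allow: /index.php?topic=" is longer than the
# sort= and PHPSESSID= patterns above and would override
# them for every topic and board URL.
# --------------------------------------------------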

Sitemap: https://www.dailytechtuts.com/sitemap.xml
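
Because Allow and Disallow rules interact, it can help to check a few URLs before deploying a file like this. Below is a minimal Python sketch of the precedence described in RFC 9309 (the longest matching pattern wins, and a tie goes to Allow). It is only an approximation of how a real crawler evaluates rules. The names `pattern_to_regex`, `is_allowed`, `RULES` and `SCOPED` are made up for this example. `RULES` is a small subset of the file above, and the `/Themes/*.css$` rule is a directory-scoped Allow used here for illustration.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs. The longest
    matching pattern wins, a tie goes to Allow, and a path that
    matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), is_allow
    return allowed


# Illustrative subset of the rules in the file above.
RULES = [
    ("Allow", "/*.css$"),
    ("Disallow", "/Themes/"),
    ("Disallow", "/index.php?action=search"),
    ("Disallow", "/*PHPSESSID="),
]

# Same rules plus a hypothetical directory-scoped Allow.
SCOPED = RULES + [("Allow", "/Themes/*.css$")]

for path in (
    "/Themes/default/css/index.css",
    "/index.php?topic=1.0",
    "/index.php?topic=1.0;PHPSESSID=abc123",
    "/index.php?action=search;advanced",
):
    print(path, is_allowed(RULES, path), is_allowed(SCOPED, path))
```

With `RULES`, the theme stylesheet comes back blocked, because `/Themes/` (8 characters) is longer than `/*.css$` (7 characters). With `SCOPED`, the longer `/Themes/*.css$` pattern wins and the stylesheet is allowed. Real crawlers may differ in details, so treat this as a sanity check and confirm with the search engine's own robots.txt testing tools.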
