SEO: Duplicate Content Preventer

Started by Aaron, November 09, 2006, 01:06:34 PM

bluegray

Google has no problem spidering a standard SMF install. How soon pages appear in Google's search results depends on when the index is updated, so it can take a while if you were spidered just after an update. A proper sitemap might help Google spider your important pages faster, but it does not guarantee they will show up in the results. Googlebot will follow all the URLs, but duplicate pages will not be indexed normally; they show up as a 'Supplemental Result', or only the URL is listed. Some of these links might already be in the Google index, and they will take a while to disappear unless you specifically ask Google to remove them.

Check which pages are in the Google index by searching for 'site:yourwebsite.com'.

Getting your pages into the top ten is a whole other story. You have to provide content that is relevant to the search words. Although the standard SMF theme is fine, there are plenty of optimizations that can be done (header tags, titles, words in the URL, etc.). The prettyurls mod, while not necessary to get spidered, can provide extra keywords for Google to rank the page by. I get spidered every day, and most of my pages are no more than 3 days old in Google's cache. Searches on relevant keywords also return my site in the top ten.

To keep Googlebot away from certain pages I also use a robots.txt file. It's not necessary, but it means Googlebot doesn't waste time spidering content that people probably don't want to see:


User-agent: *
Disallow: /cgi-bin/
Disallow: /index.php?action=activate
Disallow: /index.php?action=admin
Disallow: /index.php?action=arcade
Disallow: /index.php?action=calendar
Disallow: /index.php?action=collapse
Disallow: /index.php?action=deletemsg
Disallow: /index.php?action=editpoll
Disallow: /index.php?action=help
Disallow: /index.php?action=helpadmin
Disallow: /index.php?action=lock
Disallow: /index.php?action=login
Disallow: /index.php?action=logout
Disallow: /index.php?action=markasread
Disallow: /index.php?action=mergetopics
Disallow: /index.php?action=mlist
Disallow: /index.php?action=modifykarma
Disallow: /index.php?action=movetopic
Disallow: /index.php?action=notify
Disallow: /index.php?action=notifyboard
Disallow: /index.php?action=pm
Disallow: /index.php?action=post
Disallow: /index.php?action=profile
Disallow: /index.php?action=register
Disallow: /index.php?action=removetopic2
Disallow: /index.php?action=reporttm
Disallow: /index.php?action=search
Disallow: /index.php?action=sendtopic
Disallow: /index.php?action=splittopics
Disallow: /index.php?action=stats
Disallow: /index.php?action=sticky
Disallow: /index.php?action=trackip
Disallow: /index.php?action=unread
Disallow: /index.php?action=unreadreplies
Disallow: /index.php?action=who
Disallow: /Themes/
Disallow: /*.msg
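
One way to sanity-check a file like this before uploading it is to run a few forum URLs through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser` with a shortened copy of the rules above; the sample URLs and the `is_blocked` helper are made up for illustration. That parser does plain prefix matching and, as far as I know, does not implement wildcard patterns, so the `.msg` rule is left out here.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above; the full list works the same way.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /index.php?action=profile
Disallow: /index.php?action=search
Disallow: /Themes/
"""


def is_blocked(url, agent="Googlebot"):
    """Return True if the rules above tell `agent` not to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: a normal topic, a profile page, a theme file
    for url in (
        "http://yourwebsite.com/index.php?topic=123.0",
        "http://yourwebsite.com/index.php?action=profile;u=1",
        "http://yourwebsite.com/Themes/default/style.css",
    ):
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Real crawlers may interpret some rules differently from this parser, so treat the result as a rough check rather than a guarantee of what Googlebot will do.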

Niteblade

Nice robots.txt file ...

Here's mine ... with some Tinyportal additions.

User-agent: Fasterfox
Disallow: /

User-agent: *
Disallow: /arcade/
Disallow: /arcade
Disallow: /attachments/
Disallow: /attachments
Disallow: /avatars/
Disallow: /avatars
Disallow: /chat/
Disallow: /chat
Disallow: /FCKeditor/
Disallow: /FCKeditor
Disallow: /gallery/
Disallow: /gallery
Disallow: /Packages/
Disallow: /Packages
Disallow: /Smileys/
Disallow: /Smileys
Disallow: /Sources/
Disallow: /Sources
Disallow: /Themes/
Disallow: /Themes
Disallow: /tp-downloads/
Disallow: /tp-downloads
Disallow: /tp-images/
Disallow: /tp-images
Disallow: /wysiwyg/
Disallow: /wysiwyg
Disallow: /apc.php
Disallow: /ssi_examples.php
Disallow: /ssi_examples.shtml
Disallow: /status.php
Disallow: /status.php?php
Disallow: /Settings.php
Disallow: /Settings_bak.php
Disallow: /index.php?action=admin
Disallow: /index.php?action=activate
Disallow: /index.php?action=arcade
Disallow: /index.php?action=calendar
Disallow: /index.php?action=collapse
Disallow: /index.php?action=deletemsg
Disallow: /index.php?action=editpoll
Disallow: /index.php?action=gallery
Disallow: /index.php?action=help
Disallow: /index.php?action=helpadmin
Disallow: /index.php?action=lock
Disallow: /index.php?action=login
Disallow: /index.php?action=logout
Disallow: /index.php?action=markasread
Disallow: /index.php?action=mergetopics
Disallow: /index.php?action=mlist
Disallow: /index.php?action=modifykarma
Disallow: /index.php?action=movetopic
Disallow: /index.php?action=notify
Disallow: /index.php?action=notifyboard
Disallow: /index.php?action=pm
Disallow: /index.php?action=post
Disallow: /index.php?action=printpage
Disallow: /index.php?action=profile
Disallow: /index.php?action=register
Disallow: /index.php?action=removetopic2
Disallow: /index.php?action=reporttm
Disallow: /index.php?action=search
Disallow: /index.php?action=sendtopic
Disallow: /index.php?action=splittopics
Disallow: /index.php?action=stats
Disallow: /index.php?action=sticky
Disallow: /index.php?action=tpadmin
Disallow: /index.php?action=tpmod
Disallow: /index.php?action=trackip
Disallow: /index.php?action=unread
Disallow: /index.php?action=unreadreplies
Disallow: /index.php?action=who
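
Since Disallow rules match by prefix, a rule without the trailing slash already covers the one with it: `/arcade` matches `/arcade/` and everything under it (and also paths like `/arcade.php`). Here is a small Python sketch of how the paired entries in a list like this could be collapsed. `drop_covered_rules` is a hypothetical helper, not part of SMF or TinyPortal, and it assumes plain case-sensitive prefix matching with no wildcards.

```python
def drop_covered_rules(paths):
    """Keep only Disallow paths that are not already covered by a shorter prefix."""
    kept = []
    # Sorting puts a prefix ahead of every path that starts with it
    for path in sorted(set(paths)):
        if not any(path.startswith(prefix) for prefix in kept):
            kept.append(path)
    return kept


if __name__ == "__main__":
    rules = ["/arcade/", "/arcade", "/Themes/", "/Themes",
             "/status.php", "/status.php?php"]
    for path in drop_covered_rules(rules):
        print("Disallow:", path)
```

Keeping both forms does no harm; the shorter list is just easier to maintain.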

Chantal Matar

Guys where do I put the robots.txt file?

I just checked my site on Google and they've indexed all the profiles and really irrelevant stuff!

Do I install this mod first, and then upload a robots.txt file somewhere?

Your help would be much appreciated.  :)

Neorics

Does this work with SMF 1.1.3, and in conjunction with the SEO4SMF mod?

Flying Drupalist

Quote from: eldʌkaː on November 10, 2006, 09:49:20 PM
You should never overwrite the default theme.

Why not? The default theme can always be easily restored from the clean SMF.

Aileen

Does this mod still work on 1.1.4? thanks

vbgamer45

If you have SMF 1.1.x you do not need to install this mod, since it is built into that release.

TrueSatan

Quote from: Miraploy on September 21, 2007, 06:00:55 PM
Quote from: eldʌkaː on November 10, 2006, 09:49:20 PM
You should never overwrite the default theme.

Why not? The default theme can always be easily restored from the clean SMF.

Mods may well not install properly if you overwrite the default theme, and keeping it intact gives you a fallback for problems in other themes, even if you install each mod manually. The advice not to overwrite it is absolutely correct!

AlexAcosta

Quote from: Aäron on November 09, 2006, 01:06:34 PM

This mod will tell robots not to index topics that are being accessed with .msg, prev_next, ;all, or by printing the topic (?action=printpage), by adding <meta name="robots" content="noindex" /> to these pages.

Note: this mod requires a modification in index.template.php. It changes only the default theme's index.template.php, so you'll have to apply the changes manually in any custom theme!

I believe you should also add rel=nofollow to all links pointing to pages like these, not only the noindex tag. Noindex will avoid duplicate content, but PR will still spread to these useless pages.
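
To make the two ideas concrete, here is a rough Python sketch of the decision described in the quoted post, plus the nofollow suggestion. It is not the mod's actual code (the mod edits SMF's PHP templates); the function names and the simple substring matching are assumptions made for illustration, and HTML escaping is left out for brevity.

```python
# Markers named in the mod description for duplicate views of a topic
DUPLICATE_MARKERS = (".msg", "prev_next", ";all", "action=printpage")

NOINDEX_TAG = '<meta name="robots" content="noindex" />'


def is_duplicate_view(url):
    """Return True if the URL looks like an alternate view of a topic."""
    return any(marker in url for marker in DUPLICATE_MARKERS)


def robots_meta(url):
    """Return the noindex tag for duplicate views, or an empty string otherwise."""
    return NOINDEX_TAG if is_duplicate_view(url) else ""


def link(url, text):
    """Render a link, adding rel="nofollow" when it points at a duplicate view."""
    rel = ' rel="nofollow"' if is_duplicate_view(url) else ""
    return f'<a href="{url}"{rel}>{text}</a>'


if __name__ == "__main__":
    print(robots_meta("/index.php?topic=5.msg42"))
    print(link("/index.php?action=printpage;topic=5.0", "Print"))
    print(link("/index.php?topic=5.0", "Topic"))
```

The canonical topic URL gets neither the tag nor the nofollow attribute, so it stays the one version that is indexed and linked normally.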
