See Spiders Online

Started by David, September 21, 2003, 03:27:25 PM


The_Altered1

I have seen them on my skinning server less than 30 minutes after pointing a no-ip redirect URL at it. One appeared as a guest and I was like "huh?", so I traced the IP and it was registered to Google. Not even 30 minutes... wow.

Fizzy

Once you get used to the IP ranges for the bots, it's easy enough to watch them through Who's Online.
One of the problems though is keeping track of and denying access to certain undesirable Bots, like the new MSN Bot.
I had to deny that one in the end because it was trawling my site taking everything it could get its hands on but not actually contributing anything to search results.
I can understand now why the devs decided to drop the previous "Access Log". It was very handy for tracking incoming search results, but it certainly slowed the forum down.
"Reality is merely an illusion, albeit a very persistent one." - A.E.


Ben_S

Quote from: Fizzy on October 01, 2004, 11:42:51 AM
Once you get used to the IP ranges

Far easier to go by the user agent.

Fizzy

Agreed  :)
That's one thing that I do miss from the old access log


andrea

This feature can be achieved by theming.


Quote from: treo on September 28, 2003, 11:17:47 AM
it's not too hard to identify such a spider because they have such "cool" user agent names like Googlebot/2.1 (+http://www.googlebot.com/bot.html)

You can customize the "Who.template.php" file in your theme directory to achieve that.

Intercept the variable $context['members']['query']['USER_AGENT'] in order to display your desired information about the online user.

To show it only to admins, check the variable $context['user']['is_admin'].
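
The idea above (match on the user agent, show the result only to admins) can be sketched outside of SMF. This is a minimal Python illustration, not the actual template code: the names `KNOWN_SPIDERS`, `spider_name` and `who_online_label` are hypothetical, and the substring list is an assumption that would need regular upkeep.

```python
# Hypothetical substring -> label table; an assumption for illustration only.
KNOWN_SPIDERS = {
    "googlebot": "Google",
    "msnbot": "MSN",
    "slurp": "Inktomi",
}


def spider_name(user_agent):
    """Return a label if the user agent contains a known spider marker, else None."""
    ua = user_agent.lower()
    for marker, label in KNOWN_SPIDERS.items():
        if marker in ua:
            return label
    return None


def who_online_label(user_agent, is_admin):
    """Admins see the spider's name; everyone else just sees a guest."""
    name = spider_name(user_agent)
    if name is not None and is_admin:
        return f"{name} (spider)"
    return "Guest"


print(who_online_label("Googlebot/2.1 (+http://www.googlebot.com/bot.html)", True))
print(who_online_label("Googlebot/2.1 (+http://www.googlebot.com/bot.html)", False))
```

In a real theme the user agent and the admin flag would come from the two $context variables mentioned above.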

Andrea Hubacher
Ex Lead Support Specialist
www.simplemachines.org




orange

Quote from: Fizzy on October 01, 2004, 11:42:51 AM
One of the problems though is keeping track of and denying access to certain undesirable Bots, like the new MSN Bot.
I had to deny that one in the end because it was trawling my site taking everything it could get its hands on but not actually contributing anything to search results.

Yeah, I got this too... last month the MSNBot single-handedly used up 3 GB of bandwidth crawling my site, so in the end I had to ban its IPs. Is it possible to ban by user agent, and if not, how hard would that be to add?

The_Altered1

Wow, that sounds nasty. I did not know about this. Is this MSN bot new?

dschwab9

Try the attached template file to be able to see the bots.

This is for RC1; I don't know whether it will work with other versions.

Fizzy

Quote from: The_Altered1 on October 02, 2004, 12:31:16 PM
Wow, that sounds nasty. I did not know about this. Is this MSN bot new?

Not new, but it does not actually add to MSN search results. That is covered by Inktomi (the last I read).
MSN Bot is experimental, which is what makes it even more annoying.
It's a bandwidth burner and nothing more.


Fizzy

Quote from: orange on October 02, 2004, 12:21:08 PM
Yeah, I got this too... last month the MSNBot single-handedly used up 3 GB of bandwidth crawling my site, so in the end I had to ban its IPs. Is it possible to ban by user agent, and if not, how hard would that be to add?


Have you tried using .htaccess or robots.txt?

User-agent: MSNBOT
Disallow: /
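
Before uploading a rule like this, it can be checked locally. A minimal sketch using Python's standard `urllib.robotparser`; the user agent strings and the `/index.php` path are only examples, not taken from a real log.

```python
from urllib.robotparser import RobotFileParser

# Feed the two robots.txt lines from above straight to the parser.
rules = RobotFileParser()
rules.parse([
    "User-agent: MSNBOT",
    "Disallow: /",
])

# Ask whether each (example) crawler may fetch an example forum URL.
msn_allowed = rules.can_fetch("msnbot/1.0", "/index.php")
google_allowed = rules.can_fetch("Googlebot/2.1", "/index.php")

print("msnbot allowed:", msn_allowed)
print("Googlebot allowed:", google_allowed)
```

Keep in mind that robots.txt is only advisory: a well-behaved bot will honour it, but a bot that ignores it still has to be blocked at the server level, for example with .htaccess or an IP ban.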


Seb87

Has anybody created a mod for recognizing spider bots?

OvermindDL1

I would like that, so I could then set it not to add spiders to the users online count; it would not be artificially inflated then. Maybe even choose to give bots a different theme for the entire forum (or sections or whatever) to take less bandwidth, like IPB does...
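
Keeping spiders out of the online count could look roughly like this. It is a sketch in Python rather than SMF code: `SPIDER_MARKERS`, `is_spider` and `count_online` are hypothetical names, and the marker list is an assumption, not a complete list of bots.

```python
# Assumed user agent markers; any real list would need regular updating.
SPIDER_MARKERS = ("googlebot", "msnbot", "slurp")


def is_spider(user_agent):
    """True if the user agent contains one of the assumed spider markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in SPIDER_MARKERS)


def count_online(guest_user_agents):
    """Split guest sessions into (human_guests, spiders)."""
    spiders = sum(1 for ua in guest_user_agents if is_spider(ua))
    return len(guest_user_agents) - spiders, spiders


sessions = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Googlebot/2.1 (+http://www.googlebot.com/bot.html)",
    "msnbot/1.0",
]
guests, spiders = count_online(sessions)
print(f"{guests} guest(s), {spiders} spider(s)")
```

The same `is_spider` check is where a lighter theme could be selected for bots.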

Fizzy

Quote from: OvermindDL1 on December 31, 2004, 06:13:57 AM
Maybe even choose to give bots a different theme for the entire forum (or sections or whatever) to take less bandwidth, like IPB does...

That's a good idea. Would an .htaccess redirect to imode cover that?


Trekkie101

Imode would be a good version for lowering bandwidth consumption.

Fizzy

That's what I was thinking too :)


Trekkie101

Bump!

Has this been considered for SMF 1.1 at all?

I like the idea and would like to see it. Maybe if only the major engines like Google, MSN, Yahoo (Inktomi?) were used. :)

[Unknown]

In the end, it's a small feature some people might use once or twice and then likely never again, but which requires a good amount of upkeep. Search engine identification has to be kept up to date, and it has to be processed all the time...

It's a novelty.  Just really strikes me as a mod thing.

-[Unknown]

Cerberus

There was a template around for the Who's Online page which shows the user agents of all the users online, so you can see the spiders :)
Best Regards, Cerberus

Trekkie101

Does the current mod affect SMF the way the hostname lookup setting would, e.g. slow the forum down?

Because I might just use the mod.

(reminds self to delete SMF and re-upload fresh copies of everything)

sudden

IPB has an option to show bots as guests... I would love to see that in SMF too :D
