
2 Discussions, 98 Comments...3,679 Page Views??!

StumpyJoe
edited January 2014 in Vanilla 2.0 - 2.8

Is that normal? Looking at the Dashboard, I see that my page views are ALWAYS vastly higher than the number of comments and discussions for that day.

One day had 0 new discussions, only 13 comments, and 1,593 page views. That's the lowest number in the past month.

I'm not running a large forum at all, just a small, private message board--maybe 40 active users, if that. What could be causing so many page views?

I disallowed the forum's directory on robots.txt to see if that would do anything, so we'll see, but I wanted to ask here if those numbers are normal or a red flag.

(Version 2.1b1 -- Upgraded from 2.0.18.4 which came from a phpBB import)

Thanks.

Comments

  • peregrine MVP
    edited January 2014

    I'm not running a large forum at all, just a small, private message board--maybe 40 active users, if that. What could be causing so many page views?

    Look in your web logs. They should record the IP address of every request, so you can see who is hitting your site.
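
    For example, a rough sketch of tallying hits per IP with a few lines of Python (the log path and combined log format are assumptions, so adjust for your host):

    ```python
    # rough sketch: count hits per client IP in a combined-format access log
    # the path below is an assumption -- use whatever your server actually writes
    from collections import Counter

    hits = Counter()
    with open("/var/log/apache2/access.log") as log:
        for line in log:
            ip = line.split(" ", 1)[0]   # first field is the client IP
            hits[ip] += 1

    for ip, count in hits.most_common(20):
        print(f"{count:8d}  {ip}")
    ```

    Anything with thousands of hits from a single address is worth a closer look.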

    To digress, perhaps someone is scraping your site to build a list of accounts or passwords to try. You could use a sign-in logger to see whether anyone is trying to guess your admin password.

    You should upgrade your version, or at least apply the security patches.


  • emzi
    edited January 2014

    What do you mean by "private"? Is access to your community limited even for bots/spiders/crawlers? Check your logs for bot IPs, such as Google's.

  • hgtonight ∞ · New Moderator

    I am guessing it is bot views.

    As @peregrine noted, you will want to update to 2.1b2 for security fixes.


  • vrijvlinder Papillon-Sauvage MVP

    I found a very good article on the subject. Here is a snippet and the source link:

    Imagine you own an eCommerce site, and dozens of customers are trying to check out their shopping carts at 4:00 pm. If your server CPU is busy serving thousands of hits from bots at that moment — with bot requests taking up, say, 80% of your server processing capacity — then the human visitors’ experiences will suffer. Some of them might bounce away and never return, finding the product elsewhere. And it's all because of bots!

    Why Don’t I See Bots From Web Analytics?

    Like most web analytics services, Google Analytics relies on running a small JavaScript snippet inside your web page to track your site visitors. That snippet collects visitor-behavior data and sends it back to the Google Analytics backend, which does the number crunching to produce the reports.

    The key requirement for web analytics to work is that the JavaScript snippet is executed by the client-side browser. Browsers that human visitors use to view the web are equipped to automatically run JavaScript, as well as render images, execute CSS, and perform the multitude of other tasks that result in a web page looking and acting like it’s supposed to. Bots, on the other hand, crawl around your site without running a real browser. They don’t need to execute JavaScript or render images: they can get all the information they need by crawling through the raw HTML documents. Since they don’t execute the Google Analytics script, the Google Analytics service is not aware of such traffic and the bots fly under the radar.

    The Good, The Bad, The…

    It would be one thing if blocking all bots were a viable solution. But as you're surely aware, the "good" bots from sites like Google, Bing, and other companies drive your site's SEO value. Smart site owners would sooner roll out a red carpet and usher these bots onto their pages than block them. So if you want to both allow good bots to crawl your site but save your server processing capacity for your human visitors, then the solution for dealing with bot traffic must lie somewhere between "carte blanche" and total blocking.

    There are various methods for dealing with bots that fall into this middle space. IP blocking is a way to deal with individual bots that you have identified as troublesome by keeping them off your site. Throttling deals with bots generally (good, bad, and everywhere between) by placing a limit on the number of times one can hit your site.

    HOW IP BLOCKING CAN HELP YOUR SITE'S SAFETY AND PERFORMANCE

    Some bots are bad. They might be scraping your site's content to republish it (illegally) elsewhere on the web, posting spam comments on your blog, or showing advertisements to some of your visitors.

    To deal with pernicious bots, there are tools that allow you to block them. First, you must find out which bots are causing trouble. There are a number of ways to tell whether your site is getting an unusually high number of hits from a certain bot: your hosting provider, firewall provider, or other backend service provider can supply that information. There’s also the raw server log for your site, but logs are usually so massive that you need a tool like Deep Log Analyzer or AWStats to read them effectively.
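
    If you want a quick look without a dedicated tool, a few lines of Python can also tally hits per user agent, which is usually enough to spot the big crawlers (the log path and combined format are again assumptions):

    ```python
    # quick tally of hits per user agent from a combined-format access log
    # (the user agent is the last quoted field in that format)
    from collections import Counter

    agents = Counter()
    with open("/var/log/apache2/access.log") as log:
        for line in log:
            parts = line.rstrip().rsplit('"', 2)
            agent = parts[-2] if len(parts) == 3 else "unknown"
            agents[agent] += 1

    for agent, count in agents.most_common(15):
        print(f"{count:8d}  {agent}")
    ```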

    Once you've identified one or more IP addresses to block, enlist a firewall service or use your existing one, and enter as many IP addresses as you wish to block.
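
    As one concrete (and hypothetical) example of the same idea without a firewall service: if your site runs on Apache 2.4, a handful of addresses can be blocked directly in an .htaccess file. The addresses below are placeholders.

    ```
    # .htaccess -- Apache 2.4+ syntax; these addresses are placeholders
    <RequireAll>
        Require all granted
        Require not ip 203.0.113.45
        Require not ip 198.51.100.0/24
    </RequireAll>
    ```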

    HOW THROTTLING CAN HELP YOUR SITE'S PERFORMANCE

    Bots don't have to be malicious to cause problems, however. If neutral or even friendly bots hit your site too many times they will slow down your site's backend performance, causing more harm than good. If your prized human visitors are having a bad experience on your site, no bot is worth its weight in gold.

    Throttling limits the number of times any one client can hit your site. You set the maximum number of requests allowed in a given time period, and if a bot or any other type of client hits that number, it will be rejected. The idea is to set your throttling limit high enough that friendly bots have room to roam your pages, but low enough that bots hitting your site excessively will be cut off before they can seriously impact performance.
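
    A minimal sketch of the idea, assuming a simple fixed-window counter per client (real deployments usually do this at the proxy or firewall layer rather than in application code):

    ```python
    # minimal fixed-window throttle sketch: at most MAX_REQUESTS per client per window
    import time
    from collections import defaultdict

    WINDOW_SECONDS = 60
    MAX_REQUESTS = 35              # hypothetical per-minute limit ("E" in the rule of thumb below)

    windows = defaultdict(lambda: [0.0, 0])   # client -> [window start, request count]

    def allow(client_ip: str) -> bool:
        now = time.time()
        start, count = windows[client_ip]
        if now - start >= WINDOW_SECONDS:      # window expired: start a new one
            windows[client_ip] = [now, 1]
            return True
        if count < MAX_REQUESTS:               # still under the limit
            windows[client_ip][1] = count + 1
            return True
        return False                           # over the limit: reject this request
    ```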

    To find your ideal throttle limit, examine your traffic patterns. Develop an idea of how many requests per minute from each client are normal for your site. A rule of thumb for setting the throttle figure might go something like this:

    Find out the peak number of hits your server receives per day on its busiest days (let's call this number "A")
    Find out the typical number of visitors to your site per day (number "B")
    Divide A by B (call the result "C")
    Convert C to a "per minute" figure by dividing by 1440 (call the result "D")
    Multiply D by 10 (number "E")

    "E" would be the threshold to set as the throttle limit on a per-minute basis. It assures that all activity on your site will go on undisturbed except for a totally wayward bot that is threatening your site's performance.

    (If you’re a Yottaa customer, you can go to the "Optimizer Overview" page under the Optimizer tab on the Yottaa Dashboard, and you'll see a graph of your site's requests. This includes all hits to your server, including bot traffic invisible to Google Analytics.)

    HOW MUCH BOT TRAFFIC DO YOU HAVE?

    Take a look at your site traffic and see if bots are in danger of impacting your site's performance. Don't let bot traffic drag you down!

    http://www.yottaa.com/blog/bid/215372/The-Real-Story-of-Bot-Traffic-And-How-to-Prevent-it-from-Killing-Performance

  • StumpyJoe

    Wow, thanks everyone. :)

    A while ago I did catch a weird IP from a French ISP constantly hitting the site, and blocked that IP. Like I said, I just disallowed the forum's directory in my robots.txt, so I'm hoping that helps. I also added a crawl-delay.
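
    For reference, the lines I added look roughly like this (the /forum/ path is just a placeholder for wherever the board actually lives; well-behaved crawlers honor Disallow, and Crawl-delay is respected by Bing and Yandex but ignored by Google):

    ```
    # robots.txt
    User-agent: *
    Disallow: /forum/
    Crawl-delay: 10
    ```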

    By "private" I just meant you need an account to view/post, accounts are invite-only, etc. Thanks for that link, vrijvlinder -- gonna read up some more, and also upgrade for that security fix.
