
Noindex based on min word count?

edited June 2015 in Feedback

Please can someone provide me an IF statement I can insert somewhere so that any thread pages with a word count below X will have the following added:
<meta name="robots" content="noindex"/>

The reason is that I'd like to minimise what Google might consider 'thin content'. Word count is only one factor, but it's an effective one.

Comments

  • I think you are taking the phrase too literally

    watch this video

    https://support.google.com/webmasters/answer/2604719?hl=en

    it is nothing to do with succinct discussions. You can have really useful questions that are short, with a single simple answer.

    You are far better off encouraging the behaviours you want, rather than trying to prevent indexing on low quality content.

    grep is your friend.

  • edited June 2015

    I appreciate your point and, working at the leading SEO agency in Europe, I do have a reasonable grasp of this. By the nature of my forum, there are a lot of one-liner 'request' topics that offer no value to searchers, but it would not be reasonable for me to try to stop that activity.

    I understand thin content is not simply about word count but it just so happens that limiting word count in my situation would prevent indexing topics that might get flagged by Panda. Also many of the one liner topics are similar to previous ones, and canonical tags are not an option, so word count restriction would simply rule those similar topics out anyway.

    Is there a way to do the IF statement mentioned in first post?

  • hgtonight ∞ · New Moderator
    edited June 2015

    I assume you want to filter on the number of words in the discussion and comment bodies.

    public function discussionController_render_before($sender) {
        $threshold = C('AutoNoIndex.Threshold', 100);
        $wordCount = str_word_count($sender->Discussion->Body);
    
        if($wordCount >= $threshold) {
            return;
        }
    
        $comments = $sender->Data('Comments', array());
        foreach($comments as $comment) {
            $wordCount += str_word_count($comment->Body);
    
            if($wordCount >= $threshold) {
                return;
            }
        }
    
        $sender->Head->AddTag('meta', array('name' => 'robots', 'content' => 'noindex'));
    }
    

    You can adjust the word count threshold via the following configuration in /conf/config.php:

    $Configuration['AutoNoIndex']['Threshold'] = 10;
    

    Search first

    Check out the Documentation! We are always looking for new content and pull requests.

    Click on insightful, awesome, and funny reactions to thank community volunteers for their valuable posts.

  • x00 MVP
    edited June 2015

    Again, you will need to encourage your users to make better topics; there are a number of ways to do that, from interface changes to behavioural incentives.

    If you want to add noindex conditionally in a theme hooks file you can add this hook

    public function DiscussionController_Render_Before($Sender) {
        // Only tag pages that carry a discussion whose text-only body is short.
        if (isset($Sender->Discussion) && str_word_count(Gdn_Format::Text($Sender->Discussion->Body)) < 10) {
            $Sender->Head->AddTag('meta', array('name' => 'robots', 'content' => 'noindex'));
        }
    }
    

    This will also catch tag-only posts such as links and videos, because it counts words in a text-only version of the body rather than in the markup, which is not easily taken into consideration.

    A better idea would be to flag this up during posting by triggering a validation error, to encourage better-quality posts.
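
    As a rough illustration of such a posting-time check, here is a minimal sketch in plain PHP. It assumes strip_tags() as a stand-in for Vanilla's text formatter, and validateMinWords() is a hypothetical helper name, not a Vanilla API. How to wire it into the save process is not covered in this thread, so that part is left out.

    ```php
    <?php
    // Sketch only: validateMinWords() is a hypothetical helper, not a Vanilla API.
    // It returns an error message for a too-short body, or null when the body passes.
    function validateMinWords(string $body, int $minWords): ?string
    {
        // strip_tags() is an assumed stand-in for a text-only formatter.
        $words = str_word_count(strip_tags($body));
        if ($words < $minWords) {
            return sprintf('Please write at least %d words (you wrote %d).', $minWords, $words);
        }
        return null;
    }

    // Usage: a two-word post checked against a minimum of ten words.
    $error = validateMinWords('<b>Invite please</b>', 10);
    echo $error === null ? "ok\n" : $error . "\n";
    ```

    Returning a message instead of throwing keeps the helper easy to plug into whatever validation mechanism the forum software provides.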

  • edited June 2015

    Thanks guys.

    Encouraging better discussion really wouldn't help at all. When they create request topics, it is correct for it to only be a couple of sentences and of minimal value to outside world. Ideally I would just put them all in a requests sub-forum and block the directory. However, half of these requests actually turn into discussions and therefore are of value for searchers but I don't have time to moderate and move them on a per topic basis.

  • phreak VanillaAPP - Your white label app for Vanillaforums MVP
  • @hgtonight Where exactly do you place your code? I tried in bootstrap.early/before/after.php but it all caused the forum to load a blank page. I tried removing 'public' from your code, which allowed the forum to load, but the code didn't seem to actually work.

  • hgtonight ∞ · New Moderator

    Put it in your favorite plugin.

    Don't have one?

  • Never created a plugin before so what I did was I downloaded your 'Testing Ground' and dumped your code at the bottom of class.testingground.plugin.php and it seems to be working. Is this what you had in mind?

  • hgtonight ∞ · New Moderator

    @w1ckedsick said:
    Never created a plugin before so what I did was I downloaded your 'Testing Ground' and dumped your code at the bottom of class.testingground.plugin.php and it seems to be working. Is this what you had in mind?

    Sounds good to me.

  • edited June 2015

    @hgtonight Is your 'Testing Ground' compatible with latest version of Vanilla? When enabled you can't use basic functionality such as deleting topics as it just gives a discussion controller error.

  • edited June 2015

    Ok, when I remove 'noindex' code from your plugin container, the forum is fine. So perhaps something in the code you provided me is causing this problem. Any ideas?

  • hgtonight ∞ · New Moderator

    Enable the debugger plugin, then the testing ground plugin. This should give you a detailed error instead of a "nice" error.

  • edited June 2015

    Ok so I enabled debugger in config and this is what I get when I try and delete a topic:

    FATAL ERROR IN: DiscussionController.__get();
    
    "DiscussionController->Discussion not found."
    
    LOCATION: /home/mywebsite/site/public_html/forum/applications/vanilla/controllers/class.discussioncontroller.php
    > 40:             Deprecated('DiscussionController->CommentData', "DiscussionController->Data('Comments')");
    > 41:             return $this->Data('Comments');
    > 42:             break;
    > 43:       }
    >>> 44:       throw new Exception("DiscussionController->$Name not found.", 400);
    > 45:    }
    > 46: 
    > 47:    /**
    > 48:     * Default single discussion display.
    
    BACKTRACE:
    [/home/mywebsite/site/public_html/forum/plugins/Noindex/class.testingground.plugin.php 76] DiscussionController->__get();
    [/home/mywebsite/site/public_html/forum/library/core/class.pluginmanager.php 705] TestingGround->discussionController_render_before();
    [/home/mywebsite/site/public_html/forum/library/core/class.pluginmanager.php 638] Gdn_PluginManager->CallEventHandler();
    [/home/mywebsite/site/public_html/forum/library/core/class.pluggable.php 196] Gdn_PluginManager->CallEventHandlers();
    [/home/mywebsite/site/public_html/forum/applications/vanilla/controllers/class.discussioncontroller.php 608] Gdn_Pluggable->__call();
    [/home/mywebsite/site/public_html/forum/applications/vanilla/controllers/class.discussioncontroller.php 608] DiscussionController->Render();
    [/home/mywebsite/site/public_html/forum/applications/vanilla/controllers/class.discussioncontroller.php 608] DiscussionController->Delete();
    [/home/mywebsite/site/public_html/forum/library/core/class.dispatcher.php 350] PHP::call_user_func_array();
    [/home/mywebsite/site/public_html/forum/index.php 46] Gdn_Dispatcher->Dispatch();
    
  • edited June 2015

    Please can anyone figure out why @hgtonight's 'noindex' code is causing discussion controller errors on most of the forum functions? Deleting, favouriting etc.

    Reminder of @hgtonight 's code:

    public function discussionController_render_before($sender) {
        $threshold = C('AutoNoIndex.Threshold', 100);
        $wordCount = str_word_count($sender->Discussion->Body);
        if($wordCount >= $threshold) {
            return;
        }
        $comments = $sender->Data('Comments', array());
        foreach($comments as $comment) {
            $wordCount += str_word_count($comment->Body);
            if($wordCount >= $threshold) {
                return;
            }
        }
        $sender->Head->AddTag('meta', array('name' => 'robots', 'content' => 'noindex'));
    }
    
  • hgtonight ∞ · New Moderator
    edited June 2015

    Huh, thought I had responded to this.

    You need to check and make sure the discussion object exists. Replace line 3 with:

    $discussion = val('Discussion', $sender, false);
    if($discussion === false) {
      return;
    }
    $wordCount = str_word_count($discussion->Body);
    

    You could also improve this by passing the bodies through the format class like @x00 did.
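
    Put together, the guard plus the cumulative count can be sketched in plain PHP as below. This is not the plugin code itself: shouldNoIndex() is a hypothetical helper name, the discussion and comments are plain stand-in objects, and Vanilla's C(), val() and Head->AddTag() calls are deliberately left out so the logic can be run on its own.

    ```php
    <?php
    // Sketch only: plain PHP stand-ins, no Vanilla classes involved.
    // shouldNoIndex() is a hypothetical helper name, not a Vanilla API.
    function shouldNoIndex($discussion, array $comments, int $threshold): bool
    {
        // Guard: some controller actions render without a discussion object.
        if (!is_object($discussion) || !isset($discussion->Body)) {
            return false;
        }

        $wordCount = str_word_count($discussion->Body);
        if ($wordCount >= $threshold) {
            return false;
        }

        // Add comment bodies until the threshold is reached, then stop early.
        foreach ($comments as $comment) {
            $wordCount += str_word_count($comment->Body);
            if ($wordCount >= $threshold) {
                return false; // enough text overall, leave the page indexable
            }
        }
        return true;
    }

    $short = (object) ['Body' => 'Any invites please?'];
    $reply = (object) ['Body' => 'Sent you one just now.'];

    var_dump(shouldNoIndex(false, [], 10));        // no discussion: nothing to tag
    var_dump(shouldNoIndex($short, [], 10));       // discussion alone is below the threshold
    var_dump(shouldNoIndex($short, [$reply], 8));  // discussion plus reply reaches the threshold
    ```

    Returning early when the discussion is missing is what avoids the "DiscussionController->Discussion not found." error on actions such as delete.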

  • edited June 2015

    Thanks, that worked! :)

    I did also try @x00's code, but my forum pages were just blank. I guess that doesn't matter now anyway.

  • edited July 13

    I've just upgraded from 2.1.x to 2.3.1 and this little hack/plugin is no longer working consistently.

    Any ideas?

  • R_J Cheerleader & Troubleshooter Munich Moderator

    Please provide the full code...

  • edited July 13

    $PluginInfo['TestingGround'] = array( // You put whatever you want to call your plugin folder as the key
        'Name' => 'Testing Ground', // User friendly name, this is what will show up on the garden plugins page
        'Description' => 'A skeleton plugin that adds its resources to every page, creates a settings page, and creates a stub minicontroller.', // This is also shown on the garden plugins page. Will be used as the first line of the description if uploaded to the official addons repository at vanillaforums.org/addons
        'Version' => '0.1', // Anything can go here, but it is suggested that you use some type of naming convention; will appear on the garden vanilla plugins page
        'RequiredApplications' => array('Vanilla' => '2.0.18.8'), // Can require multiple applications (e.g. Vanilla and Conversations)
        'RequiredTheme' => FALSE, // Any prerequisite themes
        'RequiredPlugins' => FALSE, // Any prerequisite plugins
        'MobileFriendly' => FALSE, // Should this plugin be run on mobile devices?
        'HasLocale' => TRUE, // Does this plugin have its own local file?
        'RegisterPermissions' => FALSE, // E.g. array('Plugins.TestingGround.Manage') will register this permissions automatically on enable
        'SettingsUrl' => '/settings/testingground', // A settings button linked to this URL will show up on the garden plugins page when enabled
        'SettingsPermission' => 'Garden.Settings.Manage', // The permissions required to visit the settings page. Garden.Settings.Manage is suggested.
        'Author' => 'Zachary Doll', // This will appear in the garden plugins page
        'AuthorEmail' => 'hgtonight@daklutz.com',
        'AuthorUrl' => 'http://www.daklutz.com',
        'License' => 'GPLv3' // Specify your license to prevent ambiguity
    );
    
    class TestingGround extends Gdn_Plugin {
    
        // add a Testing Ground page on the settings controller
        public function SettingsController_TestingGround_Create($Sender) {
            // add the admin side menu
            $Sender->AddSideMenu('settings/testingground');
    
            $Sender->Title($this->GetPluginName() . ' ' . T('Settings'));
            $Sender->Render($this->GetView('settings.php'));
        }
    
        public function PluginController_TestingGround_Create($Sender) {
            // Makes it act like a mini controller
            $this->Dispatch($Sender, $Sender->RequestArgs);
        }
    
        public function Controller_Index($Sender) {
            echo T('Plugins.TestingGround.SadTruth');
            echo "\nPlugin Index: " . $this->GetPluginIndex();
            echo "\nPlugin Folder: " . $this->GetPluginFolder();
        }
    
        public function Base_Render_Before($Sender) {
            $this->_AddResources($Sender);
            // decho($Sender)
        }
    
        private function _AddResources($Sender) {
            //$Sender->AddJsFile($this->GetResource('js/testingground.js', FALSE, FALSE));
            //$Sender->AddCssFile($this->GetResource('design/testingground.css', FALSE, FALSE));
        }
    
        public function Setup() {
            // SaveToConfig('Plugins.TestingGround.EnableAdvancedMode', TRUE);
        }
    
        public function OnDisable() {
            // RemoveFromConfig('Plugins.TestingGround.EnableAdvancedMode');
        }
    
        public function discussionController_render_before($sender) {
            $threshold = C('AutoNoIndex.Threshold', 300);
            $discussion = val('Discussion', $sender, false); if($discussion === false) {   return; } $wordCount = str_word_count($discussion->Body);
            if($wordCount >= $threshold) {
                return;
            }
            $comments = $sender->Data('Comments', array());
            foreach($comments as $comment) {
                $wordCount += str_word_count($comment->Body);
                if($wordCount >= $threshold) {
                    return;
                }
            }
            $sender->Head->AddTag('meta', array('name' => 'robots', 'content' => 'noindex, follow'));
        }
    }
    
  • R_J Cheerleader & Troubleshooter Munich Moderator

    Normally simple things like that either work or they don't. Is it possible that "this little hack/plugin is no longer working consistently" is based on a misunderstanding on what the plugin should do/can do?

    hgtonight advised you to pass "the bodies through the format class like @x00 did". The difference between counting the words in the raw body and counting the words in the body formatted as text can be enormous if a discussion contains a lot of links and text formatting. The markup would be counted as words even though you cannot see it when the post is displayed.
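
    To make that difference concrete, here is a small sketch in plain PHP. strip_tags() is only an assumed stand-in for Gdn_Format::text(), which may treat some post formats differently, and the sample body is made up.

    ```php
    <?php
    // Sketch only: strip_tags() is a stand-in for Gdn_Format::text().
    $body = '<p>See <a href="http://example.com/some/page">this link</a></p>';

    $rawCount  = str_word_count($body);              // tag names and URL fragments count too
    $textCount = str_word_count(strip_tags($body));  // counts only "See this link"

    printf("raw: %d, text only: %d\n", $rawCount, $textCount);
    ```

    With a threshold near the raw count, a post like this would escape the noindex tag even though a reader sees only three words.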

    Furthermore, you simply added the code snippet to a plugin template without reading or understanding what is in there. One of the more interesting lines in the plugin is this: 'MobileFriendly' => FALSE, // Should this plugin be run on mobile devices?. You can read it like this:
    Question: Should this plugin be run on mobile devices?
    Answer: No.

    Therefore that optimization doesn't work for all clients. I'm not sure how search engines do their crawling, but I would always assume that they might also crawl your mobile pages. Since you haven't changed that line, this optimization has never been effective for mobile agents.

    But coming back to your question: change this

        $threshold = C('AutoNoIndex.Threshold', 300);
    $discussion = val('Discussion', $sender, false); if($discussion === false) {   return; } $wordCount = str_word_count($discussion->Body);
        if($wordCount >= $threshold) {
    

    to that

        $threshold = C('AutoNoIndex.Threshold', 300);
        $discussion = val('Discussion', $sender, false);
        if($discussion === false) {
            return;
        }
        $wordCount = str_word_count(Gdn_Format::text($discussion->Body));
        if($wordCount >= $threshold) {
    

    and this

            $wordCount += str_word_count($comment->Body);
    

    to that

            $wordCount += str_word_count(Gdn_Format::text($comment->Body));
    
  • edited July 14

    Thank you and I had no idea about the mobile exclusion. Very important!

  • phreak VanillaAPP - Your white label app for Vanillaforums MVP

    A bit of input from my own experience for others interested in no-index, about a manual approach. (I understand the question here was about automation.)

    Google is pretty strict about user-generated content when it comes to SEO. If you have two or more discussions with similar content, which is likely in communities (and merging topics can make things quite confusing), it makes sense to check with an SEO tool which one ranks better in the SERPs and no-index the lower-performing discussions. Otherwise Google may weigh them against each other without a clear winner, which weakens both.

    There is a manual no-index plugin by the Vanilla Team which seemed to be broken ever since:
    https://open.vanillaforums.com/addon/noindex-plugin

    It has a permission issue (which is also in action on this forum) and also didn't add no-index to the discussion when I tried it.

  • R_J Cheerleader & Troubleshooter Munich Moderator

    @phreak said:
    There is a manual no-index plugin by the Vanilla Team which seemed to be broken ever since:
    https://open.vanillaforums.com/addon/noindex-plugin

    It has a permission issue (which is also in action on this forum) and also didn't add no-index to the discussion when I tried it.

    The plugin looks good to me. I've just tested it on my Vanilla 2.3.1 test installation.

    As long as a role either has Garden.Moderation.Manage or Garden.Curation.Manage rights, users of that role have the right to "Add NoIndex"/"Remove NoIndex"

    A discussion which is marked with "NoIndex" has the following line added to the head <meta name="robots" content="noindex,noarchive" />

    So from what I can tell, it works exactly as intended.


    If you are facing problems, I could think of two scenarios:
    1. If you add NoIndex and don't see any difference, you might be looking at an outdated version of your page, and a refresh might solve that problem (which only exists in your browser).
    2. User permissions get cached for a while. So if your users had one of the above-mentioned rights, they might still see the option in the discussion options (though normally they should not; maybe they are looking at an outdated version of the page, too). But when they try to use that option, they will be presented with a permission error.
