HackerOne users: Testing against this community violates our program's Terms of Service and will result in your bounty being denied.

VanillaHtmlFormatter: how to customize, add attributes and class

Is there a way to extend this class or utilize its public functions in my own plugin/themehook?

In one case I had to edit this core file to achieve some things.

core/class.vanillahtmlformatter.php


These two settings would awesome in Vanilla: and the ability to use wildcard for allowed_attributes and allowed_classes. IIRC, right now, there only exists deny_attribute

«1

Comments

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    I would support a PR adding addAllowedClasses(array $classes) and removeAllowedClasses(array $classes) to VanillaHtmlFormatter .

    Then you could configure it in container_init

  • SOLUTION!

    @charrondev here is an idea, after studying htmlawed a bit it seems it is quite a little bit achievable rocket science after all.

    It turns out I might not need allowed_attributes, because deny_attributes, actually doubles as both allow and deny according to the official documentation. And vanillaHtmlFormatter already begs the setting from config.php

    // $Configuration['Garden']['Html']['BlockedAttributes'] = "-data-, -href, -style";
    

    $attributes = c('Garden.Html.BlockedAttributes', 'on*, target, download');


    created new settings for config.php

    $Configuration['Garden']['Html']['ClassedElements'] = ["a"]; //only link tag is allowed classname
    $Configuration['Garden']['Html']['AllowedClasses'] = ["float-left"]; //only float-left will ever be allowed as a classname for any tag
    // $Configuration['Garden']['Html']['ExtraAllowedClasses'] = [];
    
    

    As for anyone's needs for classes, I suggest the spec function be a little smarter like thus. And this allows the user to place all required settings in config.php

    //core/class.vanillahtmlformatter.php
    
    //make the spec function smarter and more in line with the Vanilla way of pulling settings from config.php(!??!)
    private function spec() {
        static $spec;
        if ($spec === null) {
            $allowedClasses = c('Garden.Html.AllowedClasses', $this->allowedClasses); //<-added (defaults to the hardcoded list in this file)
            $extraAllowedClasses = c('Garden.Html.ExtraAllowedClasses', []); //<-added (defaults to nothing)
            $classedElements = c('Garden.Html.ClassedElements', $this->classedElements); //<-added (defaults to the hardcoded list in this file
            $spec = [];
            // $allowedClasses = implode('|', array_merge($this->allowedClasses, $this->extraAllowedClasses)); //removed
            // foreach ($this->classedElements as $tag) { //<-removed
            $allowedClasses = implode('|', array_merge($allowedClasses, $extraAllowedClasses)); //<-added
            foreach ($classedElements as $tag) { //<-added
                if (!array_key_exists($tag, $spec) || !is_array($spec[$tag])) {
                    $spec[$tag] = [];
                }
                if (!array_key_exists('class', $spec[$tag])) {
                    $spec[$tag]['class'] = [];
                }
                $spec[$tag]['class']['oneof'] = $allowedClasses;
            }
        }
        return $spec;
    }
    


    Cheers. Should this make into a PR?

  • edited August 2019
    // A lot of damage can be done by hackers with these attributes.
    $config['deny_attribute'] .= ',style,class'; //<_remove this
    

    Of course the above must be gotten rid of. Because right now it mistakenly and invalidates this setting ...

    $Configuration['Garden']['Html']['BlockedAttributes'] = "etc";
    

    htmlawed library seems very particular about its value


    At some point it would be totally awesome if this setting is stored in config.php

    $config = [
        'anti_link_spam' => ['`.`', ''],
        'balance' => 1,
        'cdata' => 0,
        'comment' => 1,
        'css_expression' => 1,
        'deny_attribute' => $attributes,
        'direct_list_nest' => 1,
        'elements' => '*-applet-button-embed-fieldset-form-iframe-input-legend-link-object-optgroup-option-script-select-style-textarea',
        'keep_bad' => 0,
        'schemes' => 'classid:clsid; href: aim, feed, file, ftp, gopher, http, https, irc, mailto, news, nntp, rapidminer, sftp, ssh, telnet; style: nil; *:file, http, https', // clsid allowed in class
        'unique_ids' => 1,
        'valid_xhtml' => 0
    ];
    
  • R_JR_J Ex-Fanboy Munich Admin

    I do not agree that this should all be in the config. I would like to see only those values in the config that people like to configure quite often or that should have visible configuration option. If this is something which is rarely used, I'd say there are better options. Look at the Advanced Editor, here is a snippet:

           // Let plugins and themes override the defaults.
           $this->EventArguments['actions'] = &$allowedEditorActions;
           $this->EventArguments['colors'] = &$fontColorList;
           $this->EventArguments['format'] = &$fontFormatOptions;
           $this->EventArguments['font'] = &$fontFamilyOptions;
           $this->fireEvent('toolbarConfig');
    

    You can see that the configuration of this plugin has been done completely "internally" but by firing an event "at the right place", users could expand it to change the options the editor shows.


    I guess that's why charrondev suggested the "setter"-like methods. Adding two methods to the class does change nothing, really nothing in the current flow but gives you configuration options. I think this is very elegant.

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    @donshakespeare We've been looking into replacing the main configuration with JSON for some time (and we already do on our infrastructure) so I think we'd prefer to reduce the logic in the config itself. PRs are definitely accepted, but it would be better if they were more narrowly tailored, with the logic living in the class instead of in the config file.

    Could you open an issue with a couple stories of what you're trying to achieve? The next release is shipping with most of Gdn_Format completely refactored and deprecated, so there is a lot of flexibility at the moment to do things in a clean way, rather than just tacking something on.

  • rbrahmsonrbrahmson "You may say I'm a dreamer / But I'm not the only one" NY ✭✭✭
    edited August 2019

    May I suggest a compromise -- that in some cases if settings are less commonly changed they could still be in config-defaults.php ? These are mostly "hidden" from admins but easily refrencable by more knowledgable admins. That is of course independent of whether the information is stored in text or JSON formats.

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    I don't see an issue with more configurations for something like this. I just don't want the configuration file to contain for loops. I actually misread @donshakespeare's post and was thinking he was suggesting putting that loop into the config file itself. @donshakespeare Please go forward with a PR for that, although I would recommend using StaticCacheConfigTrait and self::c() instead of just c() so that the config values aren't refetched for every post.

  • R_JR_J Ex-Fanboy Munich Admin

    I would recommend using StaticCacheConfigTrait and self::c() instead of just c() so that the config values aren't refetched for every post.

    What magic is that?

    My way of "optimizing" that would have been to fetch "Garden.Html" and then use work with that.

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff
    edited August 2019

    The source for fetching a configuration value is more complex than I wish it was.

    https://github.com/vanilla/vanilla/blob/master/library/core/class.configuration.php#L344-L364

    In particular the explode() in there is O(n). Not a big deal unless you're using it 100's or 1000's or 10's of 1000's of times in a page request. This particular gets called a LOT and we made a lot of performance improvements from 2.5 to today by just doing the lookup once and then relying on the lookup of an existing saved local value.

    Nowadays my personal preference has gone to configuration objects to neatly bundle up our config access. For example: https://github.com/vanilla/vanilla/blob/refactor/garden-format/library/Vanilla/Formatting/FormatConfig.php

    But, if you're not about to do some big refactoring, it can be easier to just slap in StaticCacheConfigTrait and call it a day.

  • R_JR_J Ex-Fanboy Munich Admin

    Sounds weird to me.

    You are using one object for 5 configuration options only. Can this be efficient? You are exchanging one big array by dozens of trivial objects if you use this approach consequently. I'm just asking because I can not imagine that creating objects doesn't come with system "costs", too. Don't you have to upscale some other places if you get better performance for the array/object swap?

    When the class you have linked to is instantiated, all config settings are fetched anyway. So if I only want to access one config setting, using a class with a constructor like that would fetch multiple items that might not even be need, which cannot be a performance benefit.

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    It definitely would not be any better if you instantiates a new config object every time.

    Objects like this can be configured to only ever be instantiated once and that significantly reduced the overhead.

    The primary benefit is in readability and definition of defaults in one single location. We’ll be playing around with this a big more and it might not stick, but from our profiling with blackfire multiple config lookups is a lot slower than a class instantiation.

  • edited August 2019

    I ported a couple of my popular editor implementations to vanilla and I noticed a few things. A bit of difficulty matching the frontend tag/attributes restriction to the vanilla rendering one. Hence my digging.

    My editor allows sets of possibilities which the user can regulate, I'd like them to be also able to regulate the corresponding Vanilla settings. Hence I was recommending they be stored in the config. It is unnecessary to have a plugin just to regulate vanilla format.

    In TinyMCE for e.g, we use this, and such a thing would be great if easily mimicked in Vanilla.

      extended_valid_elements: 'img[class=myclass|!src|border:0|alt|title|width|height|style]',
      invalid_elements: 'strong,b,em,i'
    
  • Also may I suggest a setting to turn off this formatting that happens upon rendering? And maybe we need another function (if it not already exists) that cleans/strips on SAVE/EDIT.

    Many users apply a strict filter on their editor, so vanilla would be nice to prevent the unexpected from being saved at all.

    I noticed that this current formatting is superficial, all the strange stuff and unsuitable code that the user posted already went to the database, but is not shown/rendered.


    In a nutshell:

    What you post is what is stored in the database and exactly what you see rendered.

    And what you post is strictly filtered by the owner of the forum.

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    So in the new formatting system, you would implement FormatInterface and register it with FormatService . There is a method FormatService::filter() that seems exactly what you are looking for here.

    • Allows you modify/filter the data on the way in for your format.
    • Allows you to throw a FormattingException that will prevent the post from being made and return an error to the user.
  • This is sounding very promising. I tried to copy from Rich Editor implementation but, this is really in the core and not much in its plugin folder. Still digging.

    Question, how do I do this sitewide with or without my editor, and while using existing formats?

    I am using Html/Text/Wysiwyg formats. And sometimes the editor is disabled and only a plain textarea is used.

    And we still want to curb and sanitize all input to match the rendering rules of the vanillaHtmlFormatter

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    @donshakespeare A formatter can definitely be made in a plugin. Plugins are capable of modifying the container using the the container_init(Container $dic) . Here's an example:

    I should have more docs on the this new service ready with the next release.

  • R_JR_J Ex-Fanboy Munich Admin

    @charrondev This is getting an awful nightmare. I thought it would be easy after figuring that one out


    public function container_init(Container $dic) {
      $dic->get(FormatService::class)->registerFormat(
          'html',
          $dic->get(CustomHtmlFormat::class)
      );
    }
    


    But the workflow is like that:


    HtmlFormat->renderHtml() {
       HtmlSanitizer->filter()
       ...
    }
    
    HtmlSanitizer->filter() {
       ...
       HtmlFilterer->format()
       ...
    }
    
    
    HtmlFilterer->format() {
       ...
      return Htmlawed::filter($html, $config, $spec);
    }
    


    All I want in the end is to take influence on the $config in that last step.

    The classes HtmlFormat, HtmlSanitizer and HtmlFilterer are full of private properties and methods. If I want to reuse most of what is alreay there, I extend those classes, but since they are using "private" properties/methods I find myself duplicating the classes instead of extending them. Good bye DRY.

    I do not want to simply bypass all those steps and directly render the Htmlawed output. I assume the processing inbetween is done for a purpose.

    But even if I duplicate them, I end up with a "Could not find required constructor param htmlenhancer in the autoloader." error.

    Am I on the right track? It looks all too complicated...

  • In lay plain English @R_J what are you trying to achieve?

    I have done somethings in this area ... I might be able to help. And as I myself detest complications I invent new weird shortcuts to doing the same thing.

  • charrondevcharrondev Developer Lead (PHP, JS) Montreal Vanilla Staff

    @R_J there is a bit of an issue with your container init.

    1. I'd recommend using the /settings/bootstrap.php file at this point. There are some tradeoffs with the formatter than can complicate things. I wrote new doc over here about it:
    2. You container registration is creating the object. That's not really what you want to do. You want to configure rules for it, not apply them directly. Eg:

    settings/bootstrap.php

    $container = Gdn::getContainer();
    $container->rule(FormatService::class)
      ->addCall('registerFormat', [CustomHtmlFormat::FORMAT_KEY, new \Garden\Container\Reference(CustomHtmlFormat::class)]);
    

    @R_J

    But even if I duplicate them, I end up with a "Could not find required constructor param htmlenhancer in the autoloader." error.

    Could you share the full error? I'm guessing that it's missing an namespace. The full class is Vanilla\Formatting\Html\HtmlEnhancer.

  • R_JR_J Ex-Fanboy Munich Admin

    Thanks for looking at it. Here is the complete error message:

    Fatal Error in Garden\Container\Container.makeDefaultArgs();
    Could not find required constructor param htmlenhancer in the autoloader.
    The error occurred on or near: /usr/share/nginx/html/vanilla/installations/release_current/vendor/vanilla/garden-container/src/Container.php
    
    609:            } catch (\ReflectionException $e) {
    
    610:                // If the class is not found in the autoloader a reflection exception is thrown.
    
    611:                // Unless the parameter is optional we will want to rethrow.
    
    612:                if (!$param->isOptional()) {
    
    613:                    throw new NotFoundException(
    
    614:                        "Could not find required constructor param $name in the autoloader.",
    
    615:                        500,
    
    616:                        $e
    
    617:                    );
    
    Backtrace:
    
    [/usr/share/nginx/html/vanilla/installations/release_current/vendor/vanilla/garden-container/src/Container.php:454] Garden\Container\Container->makeDefaultArgs();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/vendor/vanilla/garden-container/src/Container.php:708] Garden\Container\Container->makeFactory();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/vendor/vanilla/garden-container/src/Container.php:356] Garden\Container\Container->createInstance();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/vendor/vanilla/garden-container/src/Container.php:791] Garden\Container\Container->getArgs();
    
    [/usr/share/nginx/html/vanilla/plugins/mine/rj-ckeditor/CKEditorPlugin.php:65] Garden\Container\Container->get();
    
    [/usr/share/nginx/html/vanilla/plugins/mine/rj-ckeditor/CKEditorPlugin.php:65] RJPlugins\CKEditorPlugin->container_init();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/library/Garden/EventManager.php:278] PHP::call_user_func_array();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/bootstrap.php:491] Garden\EventManager->fire();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/bootstrap.php:491] PHP::{closure}();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/vendor/vanilla/garden-container/src/Container.php:740] PHP::call_user_func_array();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/bootstrap.php:492] Garden\Container\Container->call();
    
    [/usr/share/nginx/html/vanilla/installations/release_current/index.php:22] PHP::require_once();
    


    I wouldn't be surprised if this is a namespace issue. I started using a custom namespace for my plugins which didn't make things easier for me, since I never used that PHP feature before... 🤔

Sign In or Register to comment.