Retroactive changes to HTML parsing in posts

jamesinc Sydney ✭✭
edited June 2020 in Vanilla 3.x Help

I've got an issue with my long-running Vanilla forum. There are lots of older posts that include HTML tables, and something on the rendering side has led to these tables now being displayed as plain text (i.e. < is being translated to &lt;).

A lot of these older posts contain information with a very long useful lifespan and I would like to not have this sort of content rot on my forum.

How can I make the renderer ignore these older posts? They are all stored in the DB with Format=BBCode, but the forum uses the Rich editor these days.

Running Vanilla 3.3

Comments

  • Linc Detroit Admin

    I suspect HTML support was removed from the BBCode parser. Is there also BBCode mixed in those posts?

  • R_J Ex-Fanboy Munich Admin

    I was curious and created a BBCode post in my test forum:

    And this is what it looks like:


    Can it be caused by another plugin? Have you taken a look at what is stored on DB level?

  • jamesinc Sydney ✭✭

    @R_J I think my original post was vague around this - your DB input and rendered output are consistent with mine. The DB has HTML stored in the post body, but when rendered to the page it is being escaped and displays as html-as-text, same as yours.

  • R_J Ex-Fanboy Munich Admin

    I dug a little deeper and found that the NBBC parser doesn't let you specify safe HTML tags. The following code in the parser hints at what you could do to allow some tags:

       /**
        * Escape HTML characters.
        *
        * This function is used to wrap around calls to htmlspecialchars() for
        * plain text so that you can add your own text-evaluation code if you want.
        * For example, you might want to make *foo* turn into <b>foo</b>, or
        * something like that. The default behavior is just to call htmlspecialchars()
        * and be done with it, but if you inherit and override this function, you
        * can do pretty much anything you want.
        *
        * Note that htmlspecialchars() is still used directly for doing things like
        * cleaning up URLs in tags; this function is applied to *plain* *text* *only*.
        *
        * @param string $string The string to replace.
        * @return string Returns an encoded version of {@link $string}.
        */
       public function htmlEncode($string) {
    


    I guess you would also need to replace Vanilla's BBCodeFormatter so that it uses a customized Nbbc instance. But even then, I personally wouldn't know a good way to allow only table tags.
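One possible way to allow only table tags, as a rough sketch rather than a tested Vanilla integration: escape everything with htmlspecialchars() first, then restore a small whitelist of bare table tags. The function name escapeExceptTableTags() and the tag list are assumptions made for illustration. The idea is that an overridden htmlEncode() in an Nbbc subclass could return its result, but that wiring is not shown or verified here.

```php
<?php
// Hypothetical helper (not part of Nbbc or Vanilla): escape all HTML,
// then restore only attribute-free table tags from a fixed whitelist.
function escapeExceptTableTags(string $text): string
{
    // Step 1: escape everything, exactly as the default behavior does.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Step 2: un-escape bare opening and closing tags from the whitelist.
    // Tags with attributes (e.g. <td onclick="...">) do not match and
    // therefore stay escaped.
    $allowed = 'table|thead|tbody|tfoot|tr|th|td|caption';
    return preg_replace(
        '#&lt;(/?)(' . $allowed . ')&gt;#i',
        '<$1$2>',
        $escaped
    );
}

// Usage: table markup survives, everything else is shown as text.
echo escapeExceptTableTags('<table><tr><td>1 < 2</td></tr></table>'), "\n";
// prints: <table><tr><td>1 &lt; 2</td></tr></table>
```

The trade-off of this approach is that old posts using attributes such as <table border="1"> would still show those tags as text. Widening the pattern to accept attributes would need careful filtering of its own.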

    There is a simpler but more dangerous option, though: you can allow all tags. I tried a script tag and it was filtered, so it might be reasonably safe, since you would only be using it for old posts:

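    // Place this method in a plugin class. It switches off Nbbc's escaping of
    // plain-text content, so HTML in BBCode posts is no longer turned into entities.
    public function bBCode_afterBBCodeSetup_handler($sender, $args) {
        $args['BBCode']->setEscapeContent(false);
    }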
    