
Thinking about read/unread handling

edited February 2012 in Vanilla 2.0 - 2.8

I imagine this would be a better discussion for the Developer's forum; however, that doesn't appear to be open to the public.

I started a discussion over at the Simple Machines developer forum (http://www.simplemachines.org/community/index.php?topic=466125.0) about a method I've been using to track user viewing history on a forum I developed. It involves storing "boundaries": an algorithm manages a set of records per user, each representing a lower message ID and an upper message ID, such that all IDs within that boundary are considered read.
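To make the boundary idea concrete, here is a minimal Python sketch. It assumes integer message IDs and one in-memory list of (low, high) pairs per user. The names `mark_read` and `is_read` are hypothetical, and this is an illustration of the general range-merging technique, not the implementation from the linked SMF topic.

```python
def mark_read(boundaries, message_id):
    """Return a new sorted list of (low, high) ranges with message_id marked read.

    `boundaries` is a list of inclusive (low, high) ID ranges for one user.
    Ranges that touch or overlap the new ID are merged into a single range.
    """
    low = high = message_id
    kept = []
    for lo, hi in boundaries:
        if hi < low - 1 or lo > high + 1:
            # Not adjacent to the growing range: keep it unchanged.
            kept.append((lo, hi))
        else:
            # Adjacent or overlapping: absorb it into the growing range.
            low, high = min(low, lo), max(high, hi)
    kept.append((low, high))
    kept.sort()
    return kept


def is_read(boundaries, message_id):
    """True if message_id falls inside any stored boundary."""
    return any(lo <= message_id <= hi for lo, hi in boundaries)


# Reading "out of order": the last posts first, then back-filling.
history = []
for seen in [10, 9, 5, 6]:
    history = mark_read(history, seen)
```

In this sketch, reading posts 10, 9, 5, and 6 leaves two boundaries, so posts 7 and 8 still count as unread even though later posts were viewed. Merging on adjacent IDs assumes every ID between two read IDs is itself read; a real schema would need to decide how to treat deleted or foreign IDs.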

As far as I can tell, Vanilla offers similar functionality to SMF, such that a user's tally of unread posts is determined by either how far the user has read into a topic, or by when they last replied. Either way, the method (in my opinion) is a little unsound, given that it has the potential to mark posts or pages of topics as "read" when really you've simply clicked past a certain point. The boundary solution I proposed efficiently stores the records (or as close as one can get and maintain accuracy on a per-post basis).

If you read the topic that I linked to, you'll see the first question I'm asked is why I did things this way. My answer is that I wanted the forum to tell me accurately what I had and hadn't read, and once I was displaying my forum in a threaded fashion, that became a necessity. The other question I got immediately was about performance, which sort of went into the weeds about building thread trees, etc. Since I don't have a large user base for my implementation forum, the best I can say is that performance isn't a problem yet; however, I'll admit that larger-scale testing of my boundary method is certainly warranted before it's granted much credence.

I wanted to get opinions on this. Or, I would love to be shown whether Vanilla addresses the unread posts feature in a different way than SMF, phpBB, myBB, vBulletin, etc. I haven't seen much mention of this feature, nor anything past a "mark as unread" add-on.

But really, if indeed Vanilla uses a similar or equivalent method for unread posts, I'd like to get a decent answer on whether Vanilla's decision to do things this way came as a result of specific testing. And for the tests, were there alternative methods of implementing this feature? How did they work? And what ultimately made you folks choose your current method? And have users ever expressed concern about how posts are marked as read or unread?

Thanks in advance.

Answers

  • Additionally, if I've made a bad assumption in my analysis of Vanilla, I'd be glad to be educated as to my mistake. And if this is the wrong place for this sort of question, I would appreciate some direction as to the right one.

  • Todd (Vanilla Staff)

    I don't quite understand what you mean when you describe your way of doing things, but we haven't had any problems with Vanilla's implementation. What problem are you trying to solve exactly?

  • Forgive me, I'm not too familiar with Vanilla's terminology, nor its internals. For certain "aspects" of the forum software (say, the aspect used for the Penny Arcade forums), you have a more classic way of interacting with a forum: forums, subforums, and topics (as opposed to the questions/answers here). As part of this other aspect, it seems you're storing records on how "far" one has read into a discussion (I base this on the presence of this add-on: http://vanillaforums.org/addon/unreaddiscussion-plugin). This is a very similar way to how phpBB, SMF, and a number of other forums manage their read/unread features, allowing users to see only posts they haven't read.

    The problem I'm trying to solve is when people read in the "wrong" order. If, as a user, you were basing your reading habits on what you haven't read (say, you're trying to catch up with a backlog of posts), you might want to see exactly which posts have been displayed to you, and not just how far you got into a discussion, or looking for replies after your own. For myself, I have a bad habit of reading the last page of a discussion, and then reading back if I'm interested. Given how (I think) you, phpBB, and SMF store viewing history, this would mean that the forum would mark the discussion as "read" when really I've only looked at, say, the last page.

    I understand you guys work heavily with notifications, and have bookmarking. And that some aspects of your software probably don't deal with paging, or skipping ahead to the latest post. However, for those types of aspects that wouldn't get value from notifications (again, reading through an extensive backlog), and that you wouldn't want to actively maintain bookmarks, an accurate system for viewing history (and not just one that marks according to the farthest you got in a discussion) would be valuable to a user.

    I hope that explains some of what I mean.

  • Maybe here's a better way to ask my question: Have you folks ever had cause to try to track which posts a user has actually seen? (As in, which posts were rendered on a page requested by a user, and not simply updating the UserDiscussion.DateLastViewed column)

  • Todd (Vanilla Staff)
    via Email
    We have had cause to try and track which posts a user has actually seen: inline comments. This is a feature that I'd love to be able to do, but we haven't because we can't think of a good UX for it that includes being able to track the read status of comments.

    When you have inline comments then all of a sudden comments get displayed out of order which causes problems for being able to jump to the next unread comment. So if I were using your solution I'd want it to support this use-case.

    In terms of scaling I'd say that what you want to be able to do is reduce your number of joins. So if you are grabbing the discussion list you want your read/unread query to be very simple. Only do your complex query when actually in the discussion. If this means that you have to have some redundant data then so be it.

    What we've learned as far as scaling is that it's often better to never join with the db, but rather join in the application. This way you can offload some of the queries to cache-gets where appropriate.

    Application-level joining is a double-edged sword though. On the appropriate hardware, in-application joins perform better, but on self-hosted environments the opposite is often true. This reality kind of sucks, but we've decided that it's better to be able to support million-user communities than to support ten-thousand-user communities on a $5/month GoDaddy account.
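As a rough illustration of the application-level join described above, here is a minimal Python sketch. It assumes the discussion list comes from a simple single-table query and that per-user read counts live in a cache, with a plain dict standing in for something like memcached. All names (`get_read_counts`, `join_unread`, `load_from_db`, the `comment_count` field) are hypothetical and are not Vanilla's actual API or schema.

```python
def get_read_counts(cache, load_from_db, user_id, discussion_ids):
    """Look up per-discussion read counts, hitting the DB only on cache misses."""
    result = {}
    missing = []
    for discussion_id in discussion_ids:
        key = (user_id, discussion_id)
        if key in cache:
            result[discussion_id] = cache[key]
        else:
            missing.append(discussion_id)
    if missing:
        # One simple query for all misses; no join against the discussion table.
        loaded = load_from_db(user_id, missing)
        for discussion_id in missing:
            count = loaded.get(discussion_id, 0)  # never viewed: 0 comments read
            cache[(user_id, discussion_id)] = count
            result[discussion_id] = count
    return result


def join_unread(discussions, read_counts):
    """Join in the application: attach an unread count to each discussion row."""
    return [
        dict(d, unread=max(d["comment_count"] - read_counts.get(d["id"], 0), 0))
        for d in discussions
    ]


# Example with stand-in data instead of real queries.
discussions = [
    {"id": 1, "title": "Read/unread handling", "comment_count": 10},
    {"id": 2, "title": "Inline comments", "comment_count": 3},
]
cache = {}
counts = get_read_counts(cache, lambda user_id, ids: {1: 4}, 7, [1, 2])
rows = join_unread(discussions, counts)
```

The design trade-off is the one mentioned in the post: the merge loop moves work from the database to the web process, which is cheap to scale horizontally but costs CPU on a constrained shared host.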
  • I handle my read status in a few ways. Note that I'm a programmer, not a graphic designer, but here's some of how I handle the UI for read/unread status...

    This is in "collapsed" view, with subtle color differences between read (darker orange) and unread (brighter yellow) link colors on subsequent posts in the thread (my "inline" comments). Because the thread is collapsed, only one message is displayed at a time, with indicators as to your position within the thread (green background). And because only one message is displayed at a time, only one message is considered "seen" at a time. Note also that I have "!!" at the end of unread posts in the excerpt structure below.
    [Screenshot: collapsed view]

    In full view, message contents are displayed, but with an orange border surrounding unread messages. Because you're being shown the full contents, the user is considered to have read the thread's contents after viewing, but it is indicated to the user which posts were unread before page render.
    [Screenshot: full view]

    In my forum view (multiple threads/topics/discussions listed in succession), I use collapsed view, and then let it be a user preference on whether you open full or collapsed view after clicking into a thread. For this particular view, I don't do any read/unread marking unless a user clicks into a thread.
    [Screenshot: forum view]

    Before page rendering, in all cases, you have other indicators that you have either unread replies to your posts, or counts of unread messages in each of your subforums as part of the navigation elements. There are also shortcuts to the "next unread post," taking you to the thread display of a post you haven't read yet. And, like you say, the complex updating of viewing history only occurs when viewing a thread.

    As for performance, I find approximately the same thing. The multiple iterations of my boundary algorithm have gone from total reliance to almost no reliance on database logic, save for retrieval and storage. There always seems to be a good heuristic that makes relational logic unwieldy or inefficient. But it's interesting to hear you say that a performance gain at the 1-million user level doesn't help at the 10-thousand user level. I'm guessing your in-memory footprint worsens the more you do with application joins? And that's more of an issue on shared systems than larger, dedicated infrastructures?

  • Todd (Vanilla Staff)
    via Email
    Doing more in-application makes your application more CPU-bound, which makes a lot of shared hosts puke. On a fully scaled-out infrastructure you can just add web-fronts though. I think that's the tradeoff.
  • Well, if it's any sort of positive testimonial, my CPU-bound read/unread calculation runs well on my $10/mo shared service :)
