Building a vBulletin 3.x to Vanilla 2 importer
Linc
Admin
I'm taking a crack at the foundation of a vBulletin to Vanilla importer this week. I'm going to use this discussion as a place to throw ideas and questions out as I move along.
About me: I'd call myself an advanced (not expert) PHP programmer, mostly because I've never had the opportunity to work closely with another PHP programmer. I only know as much as I've read (snore) and figured out as I've gone along my way the last few years. I understand and have written OOP and MVC-based code, but have only gotten that far in the past year or so. I'm operating at the boundaries of my comfort zone, which is a good place to be, I suppose.
I say this mostly by way of disclaimer up front that whatever I do is going to go through progressive improvements and will be a learning experience for me. I'm not going to poop out a polished importer in 10 days or something.
I have Vanilla 2 running (finally) and have sorta gotten my head around the database schema. I know vBulletin's data structure pretty well as I've been hacking at it for over three years (including writing a custom vBulletin/Wordpress bridge). I'm at the base of a big learning curve on the Garden/Vanilla code with only a modest and incomplete amount of documentation available so far. On the other hand I've wrestled with PDFlib in PHP so I'm not a stranger to this situation, haha. Any suggestions and help will be much appreciated.
[more in a minute - stupid max post size]
Comments
I'm breaking the importer up into parts by data type (users, categories, discussions, etc) and making it as modular as possible. I have to make some decisions about which data to bring along and which to cull. For now, I'm focusing on the core of vBulletin. Stuff like Albums I'm just not interested in worrying about presently. I can always add more later. My priorities are:
- Usergroups / Users
- Categories / Discussions
- Private messages -> Conversations (very delicate / awkward, I expect)
- Subscriptions -> Bookmarks
Beyond these, I'd also like to move the data for polls, smilies, tags, wall posts, and stats. However, I'm not really sure what format they will take on the Vanilla side of the equation since the functionality for those doesn't exist afaik. It quickly gets into "however I write the importer is going to lock in how any future poll/tagging/etc plugins work" territory.
Starting with users, I run into this immediately. vBulletin has lots of "extra" userfields like Location, Interests, AIM, etc. and the thing is extensible, which means just about everyone has custom fields. I can either start inserting a lot of columns into Garden's user table, or go the Wordpress "usermeta" table route. Given that every import is going to be different, I'm putting my money on the Wordpress method and creating a GDN_UserMeta table in the same style. All the "extra" stuff is going to go there in anticipation of being unlocked by some simple profile addons.
[/wall-o-text]
Feedback and smacks upside the head welcome. I'm well-stocked on tea and time; hopefully I will have more to say soon.
In Vanilla 1, all custom fields are stored in the User table as a serialized array. I prefer this method to adding a new table.
Am I mistaken? Do you have a reason you'd prefer a serialized array?
Second, take a look at the Vanilla 1 to Vanilla 2 import script. It's kind of broken from a front-end perspective, but it should give you a good idea of how we're tackling that problem for Vanilla. It is in /applications/garden/controllers/import.php
Third, you can add columns to the user table if you want, but serializing the data is a decent option as well that requires far less effort. There is an "Attributes" field on the user table that takes all of that serialized data. You just need to pull the data already in that table out, unserialize it to an array using Format::Unserialize($Data), then add your values to it, then serialize it again with Format::Serialize($Data), and save it back into the Attributes field for that user.
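A rough sketch of that pull-out, unserialize, add, serialize, save-back cycle, written in Python purely for illustration. Note the assumptions: `json` stands in for Format::Unserialize / Format::Serialize (the real Attributes field is not stored as JSON), and the function name and sample values are made up.

```python
import json


def add_user_attributes(serialized_attributes, new_values):
    """Unserialize the stored attributes, merge in new values, serialize again.

    `json` is only a stand-in for the forum's own serialize format.
    """
    # Treat an empty or NULL Attributes field as an empty array.
    attributes = json.loads(serialized_attributes) if serialized_attributes else {}
    attributes.update(new_values)
    return json.dumps(attributes, sort_keys=True)


# Hypothetical existing field value plus two imported vBulletin profile fields.
stored = '{"Theme": "default"}'
updated = add_user_attributes(stored, {"Location": "Detroit", "AIM": "example"})
print(updated)
```

The value returned is what would be written back to the Attributes field for that user. Every edit has to go through the same three steps, which is the trade-off being discussed here.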
I have an idea for how to handle this, but it requires understanding how vBulletin encrypts those passwords. So, please pass along that info if you have it, and @Todd or I will help out with that part of the process.
copy :realname, :LastName
transform (:LastName) { |n,v,r| v.split(/[\s]+/)[1] }
which does assume that the real name has only two parts separated by some space. Mostly, that will be true. I think you could do something with passwords as well. All that matters is that you can decrypt them and re-encrypt them via ruby.
vBulletin stores passwords in the database thusly: md5(md5(password) . salt) //EDIT: corrected Dec 8 '09//
and in cookies thusly: md5(ENCRYPTED_PASS . COOKIE_SALT)
where ENCRYPTED_PASS is the value stored in the database, and COOKIE_SALT is a unique value to each vBulletin installation (defined in /path/to/vbulletin/includes/functions.php).
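For anyone who wants to check values against their own database, here is a minimal sketch of the two formulas above, in Python for illustration. The function names are my own, and encoding the input as UTF-8 is an assumption (PHP's md5() simply hashes whatever bytes it is given).

```python
import hashlib


def md5_hex(text):
    """Lowercase 32-character hex MD5, the same shape PHP's md5() returns."""
    # Assumption: inputs are encoded as UTF-8 before hashing.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def vb_password_hash(password, salt):
    """Database value: md5(md5(password) . salt)."""
    return md5_hex(md5_hex(password) + salt)


def vb_cookie_hash(encrypted_pass, cookie_salt):
    """Cookie value: md5(ENCRYPTED_PASS . COOKIE_SALT)."""
    return md5_hex(encrypted_pass + cookie_salt)


# Made-up password, salt, and cookie salt, for illustration only.
db_value = vb_password_hash("hunter2", "a1b")
cookie_value = vb_cookie_hash(db_value, "1111111a")
print(db_value, cookie_value)
```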
My first thought was that it would be necessary to bring along the encrypted password in a new field and add a plugin that transfers the password at each user's first login after the migration.
//edit: I forgot the COOKIE_SALT is actually just the license number (mine is 7 numbers followed by a single letter). It's in almost every PHP file as a comment on the fourth line in this format:
|| # vBulletin 3.8.1 Patch Level 1 - Licence Number 1111111a
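The "transfer the password at first login" idea could look roughly like the sketch below, in Python for illustration. Everything here is an assumption: the user record is a plain dict with hypothetical `legacy_hash` / `legacy_salt` / `password` keys, and PBKDF2 is only a stand-in for whatever hashing the new forum actually uses.

```python
import hashlib
import hmac
import os


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def vb_password_hash(password, salt):
    # vBulletin's stored value: md5(md5(password) . salt)
    return md5_hex(md5_hex(password) + salt)


def new_password_hash(password):
    # Stand-in for the target forum's scheme (assumption: salted PBKDF2).
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt.hex() + "$" + digest.hex()


def migrate_on_first_login(user, password):
    """Return True if the legacy hash matched and the password was transferred."""
    if user.get("legacy_hash") is None:
        return False  # nothing left to migrate for this user
    candidate = vb_password_hash(password, user["legacy_salt"])
    if not hmac.compare_digest(candidate, user["legacy_hash"]):
        return False  # wrong password: leave the legacy data untouched
    user["password"] = new_password_hash(password)
    user["legacy_hash"] = None
    user["legacy_salt"] = None
    return True
```

The point of the design is that the plaintext password is only ever available at login time, so that is the one moment the old hash can be checked and replaced.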
You really think the edit process being unserialize->change->serialize is easier than just having a UserMeta table and editing values directly? The developer in me likes that serializing packs it all in one place very efficiently, but the designer in me hates out-of-control arrays that make the database more confusing and add a step to every edit.
//edit: I guess this is how I feel about my experience with both so far: vBulletin uses serialized arrays EVERYWHERE and it's a pain (especially for novice devs) to get basic info out of it. You have to unserialize it just to figure out if it's even the info you were looking for. Wordpress uses a key-based UserMeta table and it's always felt super-simple to get info out of it. You can open the table in phpmyadmin and ta-da! it all becomes clear.
> Am I mistaken?
No, You are correct.
> Do you have a reason you'd prefer a serialized array?
Yes, I have two.
1. It allows ANY "field-value" pair (with Garden's automatic validation model, your fields would be restricted to your table structure/columns).
2. I'm not planning to use these fields in search, sort, etc., only on the "profile" page. It's faster, and there are no wasted tables or columns.
Also, I'm not sure we're all talking about the same thing as far as a UserMeta table is concerned. Are you familiar with Wordpress's model? It has 4 columns: unique ID, user ID, metakey, and metavalue. To store something like a Steam name you'd set the user ID, a metakey of 'steam' and then a metavalue of their Steam name. This model prevents wasted columns, and it doesn't create any more duplicated key text than a serialized array would.
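To make the four-column model concrete, here is a small sketch using Python's built-in sqlite3, purely for illustration. GDN_UserMeta is the table name proposed earlier in this thread; the column names and helper functions are my own guesses, not an existing Garden schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE GDN_UserMeta (
        UserMetaID INTEGER PRIMARY KEY,  -- unique ID
        UserID     INTEGER NOT NULL,     -- owner of the value
        MetaKey    TEXT NOT NULL,        -- e.g. 'steam'
        MetaValue  TEXT
    )
    """
)


def set_user_meta(conn, user_id, key, value):
    """Store one key/value pair for a user."""
    conn.execute(
        "INSERT INTO GDN_UserMeta (UserID, MetaKey, MetaValue) VALUES (?, ?, ?)",
        (user_id, key, value),
    )


def get_user_meta(conn, user_id, key):
    """Return the most recently stored value for this user and key, or None."""
    row = conn.execute(
        "SELECT MetaValue FROM GDN_UserMeta"
        " WHERE UserID = ? AND MetaKey = ?"
        " ORDER BY UserMetaID DESC LIMIT 1",
        (user_id, key),
    ).fetchone()
    return row[0] if row else None


set_user_meta(conn, 1, "steam", "st1")
set_user_meta(conn, 2, "steam", "st2")
print(get_user_meta(conn, 1, "steam"))
```

Because each value is its own row, a question like "which users have a Steam name?" is a plain WHERE MetaKey = 'steam' query, with nothing to unserialize first.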
> It has 4 columns
Is this UserMeta concept what you mentioned above?
If I understood correctly, here is an example.
Example data for three users (1,2,3).
UID  UserID  Key    Value
#    1       steam  st1
#    2       steam  st2
#    3       steam  st3
Isn't "steam" a duplicated key here?
IMHO, the game is not worth the candle.
I guess my point is that I'd rather err on the side of the format that will enable the larger number of use cases. "I don't have a use for it yet" doesn't seem like a good rationale for picking a format that others will likely need to use.
It does. But there's no need for new data tables for it.
I'm also going to be importing Attachments and "Smilies" as I think these are two things that most vBulletin communities would be loath to lose. I plan to follow up the importer with quick Vanilla addons for parsing emoticons and showing existing Comment Attachments. I want to at least enable switchers to not *lose* these things even if I don't immediately create robust enough addons to add and manage them.
Attachments present a particularly nasty problem because of the way vBulletin stores them (as extension-less hashes in a directory that can exist outside the web root, or as database blobs). I'm going to require that they be moved out of the database before the migration (this is an option in vBulletin) and move and rename every attachment during the migration to overcome this (to a format of /path/to/vanilla/foldername/year/month/actual-filename).
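A minimal sketch of building that destination path, in Python for illustration. It assumes the upload time is available as a Unix timestamp and that the original filename is stored alongside the hash; the function name is hypothetical, and filename collisions within the same month are not handled.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def attachment_destination(base_dir, uploaded_at, filename):
    """Build base_dir/year/month/actual-filename from a Unix timestamp."""
    when = datetime.fromtimestamp(uploaded_at, tz=timezone.utc)
    # Keep only the final path component so a stored name cannot escape base_dir.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    return PurePosixPath(base_dir) / f"{when.year:04d}" / f"{when.month:02d}" / safe_name


# 1260230400 is 2009-12-08 00:00:00 UTC; the filename is made up.
print(attachment_destination("/path/to/vanilla/foldername", 1260230400, "photo.jpg"))
```

The actual move would then copy the extension-less hash file to this path, creating the year/month directories as needed.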
Lastly, I'm going to create a permanent ImportID column for discussions, users, comments, and attachments to enable a redirect addon. Breaking links would be fatal to my site.
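Roughly how a redirect addon might use that ImportID column, sketched in Python for illustration. Assumptions are flagged in the comments: old links look like showthread.php?t=ID, new links look like /discussion/ID, and `import_ids` is a hypothetical lookup (old vBulletin thread ID to new discussion ID) built from the ImportID column.

```python
from urllib.parse import parse_qs, urlparse


def redirect_target(old_url, import_ids):
    """Map an old thread URL to a new discussion path, or None if unknown."""
    parsed = urlparse(old_url)
    # Assumption: old thread links are of the form showthread.php?t=<id>.
    if not parsed.path.endswith("showthread.php"):
        return None
    thread_ids = parse_qs(parsed.query).get("t")
    if not thread_ids or not thread_ids[0].isdigit():
        return None
    new_id = import_ids.get(int(thread_ids[0]))
    # Assumption: new discussion links are of the form /discussion/<id>.
    return f"/discussion/{new_id}" if new_id is not None else None


# Hypothetical mapping: old thread 1234 became new discussion 56.
print(redirect_target("http://example.com/forum/showthread.php?t=1234", {1234: 56}))
```

The same lookup shape would work for users, comments, and attachments, each with its own ImportID mapping.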
I've decided to leave polls, tags, and stats outside the scope of my first draft as I really want to home in on the basics to get something working before I start adding things with more dubious cost:benefit ratios.
...and if anyone actually read all that and has suggestions I'm all ears.
tl;dr: UserMeta table = yes, importing attachments (ugh) and emoticons, ignoring everything else non-core for now.
I've managed to import roles, users, and user/role relationships so far.