Limited access to Vanilla API
I don't really know what I'm asking but I'll ask it anyway. Here's what happened...
I was running Vanilla forum 2.0.18 and my server crashed, losing all of the 2000+ discussions I had built up over the years. I was able to use a web scraper called Warrick that went through and retrieved a ton of the discussions. Unfortunately, it is not in the Vanilla database format, so I have someone working with the data I collected to try and rebuild the discussions in database form. They say they can do it manually; however, it would be much quicker if they had access to the Vanilla API (I'm not sure what that is).
Is there any way I could get them temporary access to the Vanilla API to rebuild the discussions from my cosmetic science forum?
thanks
Perry, 44
Comments
Sorry to hear about your loss!
If I wanted to go the API route, I would install a local version of 2.1b2 and try using @kasper's Vanilla API application addon found here: http://vanillaforums.org/addon/api-application
That said, I don't think using an API is going to be any faster than manually populating the tables or using a customized version of the porter script. If my understanding of Warrick is correct, it will only get the fully composed pages, so manual/custom extraction is required anyway.
Hi Hgtonight - I am the developer doing the work for Perry. From what I've read, people haven't had much success getting the application by @kasper to work. Have you or anyone you know gotten it set up and functioning properly (writing data, not just reading)?
I already wrote a parser for the Warrick download data and have all the forum postings in a nice XML format: something like threadid, subject, username, datestamp, postdata, etc. So extraction is already finished. In order to get this clean data into Vanilla, I still have to reverse engineer the MySQL database, hoping there aren't too many constraints, triggers, etc. Obviously, having a REST-like API with POST functionality ("Create Discussion" and "Add Post" mainly, although "Add User" would also be nice) would be quicker than doing that.
Thanks, and thanks in advance to anyone who has input!
Vanilla is a PHP application with nearly no intelligence in the database: that means there are no constraints or triggers. The only "intelligence" you will find on the DB side is indexes and auto-increment primary keys.
There are some denormalized count columns (number of discussions and comments) in the User and Category tables that PHP updates when you post a discussion or a comment. If you have a backup of the User and Category tables too, that will be no problem.
And even if you don't, recalculating those numbers is no big deal. They get updated each time an appropriate action is taken anyway (a user creates a new discussion or comment), and if that is the only downside for the users, they will surely be tolerant.
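If you do want to fix the counters up front, something along these lines should do it. This is a rough sketch assuming the default GDN_ table prefix and the 2.0 column names; double-check them against your own schema before running anything.

-- Recalculate the denormalized counters after a bulk import.
UPDATE GDN_User u
SET u.CountDiscussions = (SELECT COUNT(*) FROM GDN_Discussion d WHERE d.InsertUserID = u.UserID),
    u.CountComments    = (SELECT COUNT(*) FROM GDN_Comment c WHERE c.InsertUserID = u.UserID);

-- Same idea for the per-category discussion counter.
UPDATE GDN_Category ca
SET ca.CountDiscussions = (SELECT COUNT(*) FROM GDN_Discussion d WHERE d.CategoryID = ca.CategoryID);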
Restoring with simple SQL inserts should be fairly easy.
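For example, importing one thread could look roughly like this. It is a sketch, not a tested migration: table and column names are taken from the 2.0 structure.php scripts linked below, and the user/category IDs and dates are made up.

-- In Vanilla the opening post lives on the discussion row itself.
INSERT INTO GDN_Discussion (CategoryID, InsertUserID, Name, Body, Format, DateInserted)
VALUES (1, 2, 'Thread title from the XML', '<p>First post body</p>', 'Html', '2012-05-01 10:00:00');

-- Replies go into GDN_Comment, keyed to the discussion just created.
INSERT INTO GDN_Comment (DiscussionID, InsertUserID, Body, Format, DateInserted)
VALUES (LAST_INSERT_ID(), 3, '<p>Reply body</p>', 'Html', '2012-05-01 10:05:00');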
If you want to scan the structure of the database, it might help to look at these three files:
Users
https://github.com/vanillaforums/Garden/blob/2.0/applications/dashboard/settings/structure.php
Discussions, Comments, Categories
https://github.com/vanillaforums/Garden/blob/2.0/applications/vanilla/settings/structure.php
Conversations (PMs)
https://github.com/vanillaforums/Garden/blob/2.0/applications/conversations/settings/structure.php
Those are the scripts that build the database initially.
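You can also ask MySQL directly for the structure your install actually created, e.g. (assuming the default GDN_ table prefix):

SHOW TABLES LIKE 'GDN%';
SHOW CREATE TABLE GDN_Discussion;
SHOW CREATE TABLE GDN_Comment;
SHOW CREATE TABLE GDN_User;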
If you permanently lost files due to drive failure or the like, that is unfortunately something that can happen on any physical server regardless of what software you are running. Usually there is some mitigation for this, like a second hard drive (redundancy) and an external backup. Make sure you choose your hosting wisely. I presume forensic recovery has been tried on the drives? There are no backups lying around anywhere? That is neither good nor professional.
API or no API, it is not going to help you get the data into the database faster. Personally, at this stage I would work only in pure SQL or with a custom version of the porter script; either way there is going to be some work.
Since the data you scraped is already post-parsed HTML, you can enter it as raw HTML (Format Raw); just close the discussions so they can't be edited to inject malicious code.
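After the import, that could be as simple as the following (same GDN_ prefix assumption as above; the WHERE condition is a made-up placeholder for whatever identifies your imported rows):

-- Mark imported bodies as raw HTML and close the threads against editing.
UPDATE GDN_Discussion SET Format = 'Raw', Closed = 1 WHERE DiscussionID < 5000;
UPDATE GDN_Comment SET Format = 'Raw' WHERE DiscussionID < 5000;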
You are going to have to cut your losses and enter only the basic users, discussions, and comments. You will need to do the calculations for the counts.
Being able to replay the whole history of interactions that ever happened isn't really realistic. There are no operational transforms, etc.
The various APIs work fine for what they are meant for, but in this case they aren't really relevant: an API is not going to be more efficient for batch operations of this scale. Pure SQL would be the fastest, but it requires some parsing capability and knowledge of the schema.
Another possibility is just making a static archive and starting from scratch. Whatever you are going to do, you will need to reset all the passwords for all the users; any information that wasn't publicly visible is basically gone.
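For the password reset, one blunt but effective option is to overwrite every stored hash with a value that can never match a login attempt, forcing everyone through the "forgot password" flow. A sketch under the same schema assumptions as above (newer Vanilla versions also track a hash method column, so check what your version actually stores):

-- Invalidate all passwords; the SHA1 of a random value will never match any real hash check.
UPDATE GDN_User SET Password = SHA1(RAND());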
grep is your friend.
I personally set it up a while ago and played around with it. I don't need it, so it was mostly for exploration purposes, but I remember getting it to post new tweets of mine.
If you already have your data in an easy-to-parse format, all you have left is mapping the data onto Vanilla's database schema. This is pretty straightforward, and I am happy to help any way I can.