InnoDB, Foreign Keys and Not Null
data:image/s3,"s3://crabby-images/4e782/4e782487c458d7498786fea26b19ed02c9549bab" alt="R_J"
I have been surfing around a bit for db optimization and came across three thesis that I wanted to discuss here. I'm no expert and I wouldn't bet on any of this points but I found them convincing and worth considering. So what follows below is no knowledge or truth but simply thoughts I'd like to share and discuss.
Maybe I start with the least effective one:
Saving memory space by disallowing Null values in columns
Thesis: If you don't allow Null values in columns, you will need less memory
Explanation: if you choose for example integer for your column and allow Null values, you will have to store either integer or Null and so you will have to consume more memory per column. Easy: integer is one piece of info, integer or Null are two different kind of data.
My opinion: This seems to be design decision that could give only a small benefit. Nevertheless it would be better than nothing. I don't know if Vanilla needs Null or if it couldn't be substituted by 0 or '' where it is used by now. "null" is found in about 250 files in my local Vanilla 2.1b2 copy (might also include plugins). So this can not be easily judged.
Beside from small memory benefits when Null is abandoned, I simply don't like Null because Null means dontknowanddontcare in most cases.
10 years ago I was working with databases quite more often than nowadays. I remember than Null values where in 90% the reason when something which seemed simple did not work.
So I would like to abandon Null and if it has a performance (or better resource) advantage too, then I would think it would be worth the effort to go through Vanillas sources.
Foreign Keys to speed up queries
Thesis: Foreign Keys speed up joins
Explanation: I don't remember well, but it was something like helping the query compiler to optimize queries? I almost forgot it, sorry...
My opinion: Whether you look at discussions or user info, Vanilla uses a lot of joins, so if there is something that's speeding up joins, it fit's well to Vanilla.
Downside of Foreign Keys seem to be that writing actions will take longer. So there seems to be a general decision again: is speeding up read actions worth slowing down right actions. I'd say yes, for a forum! You will have a huge amount more reads than writes to your db. So if you sacrifice writing speed but gain reading speed for that, go for it!
InnoDB is better than MyIsam
Thesis: InnoDB is the better choice for a forum.
Explanation: MyIsam (M.) is blocking the complete table whenever any query uses the table, InnoDB (I.) locks only the current row. I. allow to use foreign keys, which should be a benefit (see above) and M. does not, I. is actively developed and might become even better in the future while M. is old fashioned and kind of fix.
My opinion: I was shocked when I read that M. is blocking the complete table on every query. That is simply not acceptable to me.
The downside of I. is, that it does not support fulltext indexing of columns in "older" versions of MySQL. I think it has only been implemented in the most current version (5.6). Vanilla relies on that fulltext index for its search capability
But what if someone implements a basic search plugin that uses an internal index table instead of the built in search function? Parsing each post on save, store the words of the post in an index table and use that one for a custom search function should be possible. If such a plugin would exist, comment and discussion table could be used with the InnoDB engine.
Well, what do you think of this three tweaking possibilities? Are they worth any effort at all?
I wanted to test the InnoDB and Foreign Key effect for discussion and comment table but to my shame, Ihave to confess that I didn't even know how to do this. If anybody can point me in the right direction, I'd be happy to provide some numbers.
Comments
My opinion, based on my experience as a DBA:
My shop | About Me
Null should be used where ever there is a chance of incomplete data and a default value is unacceptable. If it is implemented properly (and I see no reason it wouldn't be), NULL be a 1 bit boolean. Add in the fact that NULL is handled natively, I really don't see a reason to throw out some information about the data for a single bit/column. You could add a bit to whatever column type
you want and implement your own masking for the type on the application end, but that seems like a lot of extra work rather than checking for NULL.
I will be the first to admit that my db optimization skills are pretty low, so take this with a grain of salt.data:image/s3,"s3://crabby-images/b4948/b49486d15d348358bb5bc89e26ed97acd947b2dc" alt=":D :D"
Search first
Check out the Documentation! We are always looking for new content and pull requests.
Click on insightful, awesome, and funny reactions to thank community volunteers for their valuable posts.
Thanks for the feedback by now. From what I get back, it is not completely idiotic to switch discussions and comments to innodb and I have been thinking about a custom search for quite a while. It would be easy to exclude your own stopwords and define the length of words to search.
I need two tables and one temp table (although I think creating three tables and deleting the 'temp' table at the end of the process would be better):
I start the names of my tables with "0" to see them at the top of my db, but that will be changed later on. Just if you are curious...
Here's my temp table
I have taken a text and made an array of it after performing some search and replace steps:
That text will be my example text, stored in the temp table. The plugin would hook somewhere AfterSaveDiscussion/Comment and use the post body. Here is how I store it:
Don't ask me from which book "zoölogical" comes from!data:image/s3,"s3://crabby-images/73832/7383274cf9e58c50f6868488fd4277313dcb0080" alt=":o :o"
I clean up the temp table but I think this should be done beforehand with PHP
Afterwards, I add all "new" words to my wordlist
Now I can update my temp table with the IDs of the words:
Now I can insert new lines for my discussion/comment in the occurance table:
Searching for words will first look up the WordID and then do a
SELECT * FROM 0Occurance WHERE WordID = X ORDER BY OccuranceCount DESC
These are my thoughts so far. I do not know when I will have time to elaborate that so I just posted it here in order not to lose track of itdata:image/s3,"s3://crabby-images/f6ba2/f6ba2c365bff14e22312b38e83f4c414773e5ad1" alt=";) ;)"
And if you have some thoughts and early feedback that you like to share with me, I would be happy!
(My biggest concern by now is that I will not be able to use the sad blubbot as an icon for that plugin)