| home | community | downloads | matrix | search | news | join | members: 4811 |
|
Fri, 2006-04-21 06:49
Database DesignOkay, I have to say it. Because I have spent another week trying to make sense of the mess. Drupal's database design is stupid! it sucks! it is crap! To things clearer I am talking about design errors. Yes, they do exist. Some say that design errors are the basic cause of bugs in a system. They are right. Rather than go through a bunch of queries or rather hundreds of queries, I will just say that the database design has no relational qualities that would optimize and speed up the system what so ever. This a glaring whole in the knowledge of the Development team so if you are a joiner and want to contribute to the project then think about helping them out with the database design first. Please start in on the taxonomy schema. It is scary. Textbook type errors that could be used for examples of what not to do in any database course. I took some time to compare the database to that of ezPublish. I don't know why. Maybe because it was the only othe CMS I had installed. Congratulations to ezPublish! Finally something I can say they have done with excellence and may bring me over to using the CMS. The database design is one of the best I have seen. Normalized, efficent SQL that looks like it would not pull more than 50 queries at most. Very nice. I may even open up that $30 book on ezPublish that I have been thinking of using to soak up that pool of oil from my parking area with. The the type of errors found run the gamut. Some are comical and make me giggle while others are serious in that they cascade through the system and make me want to cry. Most are the fault of the ease of use of MySQL. MySQL does not require that you be a trained database administrator or designer to get the job done. MySQL also does not complain much unlike Postgresql or MS SQL Server ( this is where I used to live ) which tell you to get lost if things aren't right. Only recently have CMS developers been forced by performance bottlenecks to look into learning more about optimizing SQL queries and designing schema that are naturally relational rather forcing relationships with complicated SQL statements on the fly. It is the on-the-fly SQL that makes the Drupal db design so weird. Rather than submitting a patch that would change the database schema most contributor's write on-the-fly mechanisms and try and gain efficiency by racing queries. What's that? Hmm sounds like the beat of drums the native are restless again. The drums are saying "If you are so smart then ...." Well to be honest I have only found two queries that just are not needed. Somehow they got added because there was a missing foreign key and probably some hurried logic used. I am still in the midst of doing the access system over so if it does not relate to that then I have to leave it marked and go back to it later. One thing that I am going to publish is changes in the database that stop query fluctuations. These are a pain and can be seen by using the Drupal devel.module. For some reason two identical calls for pages (including access control) will have two different number of queries called. This might be a indexing problem, but I suspect it is more of one set of queries winning the race at different times. Perhaps it is some redundacy also. I see that some calls to functions like variable_get() are being double checked with extra queries in the code.
<?phpThe above code just makes me laugh. There are four queries involved ( two hidden in variable_set()) to return a vocabulary id. Not a list or a complicated result set but a single id number. Somewhere something is very wrong. This is made especially clear by the fact that there are two hardcoded strings in the code that are the same strings in the row being called from the database. Hardcoding means there is a known value. If you already know the value then there is no reason to ask the database more than once for corresponding information. ![]() While you are trying to take in the view above think a bit about the node type column shown. While the number of node types may be small and never grow to more than a dozen the vocabularies column could grow into the thousands. Doing a string comparison to find what node type is associated with a thousand vocabularies is not the best way. Why not a seperate list of types? The node table has the same string comparison design and while many may not use vocabularies even the simpliest of websites may have 8,000 nodes. This is where 'DISTINCT' starts being over used. Again there should be a list table for node types. If you are wondering why all the noise about this. Well, it's because the front page of a test site and another suddenly became slow after using a mySQL script to put up a finished vocabulary for the classified ads module. The classifieds taxonomy is a standard used by newspapers and other media. When I started creating different node types for different types of ads I thought this would speed things up because each node type would have a finished query associated with it. But instead the entire thing slowed to a crawl. So it's back to the drawing board for that idea. That's all I have time for now. Remember, beat the drum not the messenger ;)
Tags: Developer Zone This article brought to you by the
Hiveminds Magazine - Staff. Contact us if you want to post an article or announcement anonymously |
|
NewsletterGet updates on Hiveminds services, articles and downloads by signing up for the newsletter. |
Editor's choiceSome of the better articles, stories and tutorials found at Hiveminds. |
Find moreFind more of Hiveminds articles, stories, tutorials and user comments by searching. |
Picked linksHand picked websites and articles from around the web that provide quality reading. |