Sat, 2008-05-10 07:59
Drupal

Many open source content management systems written in PHP want to be recognized by the business world as "enterprise ready". This is not only a mark of prestige and status; it also puts them in a position where large companies are willing to invest in the software as a platform for their projects. Drupal is now making its move to be enterprise ready, but it has a long way to go.

Design Patterns

Drupal is a content management system built on PHP and is noted for its flexibility. But that flexibility comes at a price. In its present state Drupal is not easily scalable. It can be scaled, but doing so is not easy, and those who have done it are either very secretive about how they accomplished the task or too embarrassed to show their ugly solutions. Drupal is based on the front controller pattern and suffers from all of the drawbacks of that pattern, while complicating it with an adherence to procedural coding. A front controller is a simple way of guiding all requests for web pages through the same entry point. Because many different requests pass through the front controller, it is performance sensitive: any logic placed in the front controller that is not needed by every request adds unnecessary overhead to the requests that do not use it.
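As a rough illustration of the front controller idea described above, here is a minimal sketch. It is written in Python rather than PHP purely to keep it self-contained, and the class, handler names and paths are all hypothetical; nothing here is taken from Drupal's code. The point to notice is that whatever sits in `bootstrap()` runs on every request, including the ones that end in a 404.

```python
from typing import Callable, Dict


# Hypothetical page handlers; the names are illustrative only.
def show_article(path: str) -> str:
    return f"article page for {path}"


def show_user(path: str) -> str:
    return f"user page for {path}"


class FrontController:
    """Single entry point that every request passes through."""

    def __init__(self) -> None:
        self.routes: Dict[str, Callable[[str], str]] = {}
        self.bootstrap_calls = 0  # counts work done on every request

    def register(self, prefix: str, handler: Callable[[str], str]) -> None:
        self.routes[prefix] = handler

    def bootstrap(self) -> None:
        # Anything placed here runs for every request, needed or not.
        self.bootstrap_calls += 1

    def handle(self, path: str) -> str:
        self.bootstrap()
        # Dispatch on the first path segment.
        prefix = path.strip("/").split("/")[0]
        handler = self.routes.get(prefix)
        if handler is None:
            return "404 not found"
        return handler(path)


controller = FrontController()
controller.register("article", show_article)
controller.register("user", show_user)

print(controller.handle("/article/42"))
print(controller.handle("/user/7"))
print(controller.handle("/missing"))
print(controller.bootstrap_calls)
```

Three requests produce three bootstrap calls, so the cost of the shared code path grows with traffic regardless of which page is asked for.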

There is also a helper pattern, the application (or page) controller. This is usually an array or a text file of some sort. In Drupal it is an array that is implemented by the menu system. Though an application controller works well with a front controller, this is where compatibility with other design patterns ends. Front controllers do not play well with their sibling design patterns. This is because front controllers are typically created using the Singleton pattern, meaning that only one instance of the front controller can exist. That brings other baggage with it, such as emulating a registry and not being extendable. Bringing another design pattern into a front controller environment like Drupal is almost impossible. This is bad because you need those other patterns to solve difficult issues when programming extensions and third-party applications that run under Drupal. You can't just offload something to the registry pattern when you need a registry, or to the factory pattern when you need to handle forms. You can read more about the cons of implementing a front controller in PHP here.
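To make the "application controller as an array" idea concrete, here is a small sketch, again in Python for self-containment. The route table, the handler names and the longest-prefix lookup are assumptions chosen for illustration; this is not how Drupal's menu system is actually implemented.

```python
from typing import Optional

# Hypothetical route table: plain data mapping paths to handler names.
ROUTES = {
    "node": "list_nodes",
    "node/add": "add_node_form",
    "user": "show_user",
}


def resolve(path: str) -> Optional[str]:
    """Return the handler name for the longest matching path prefix."""
    parts = path.strip("/").split("/")
    while parts:
        candidate = "/".join(parts)
        if candidate in ROUTES:
            return ROUTES[candidate]
        parts.pop()  # drop the last segment and try a shorter prefix
    return None


print(resolve("node/add/story"))
print(resolve("node/12"))
print(resolve("admin/settings"))
```

Because the table is only data, the front controller can consult it without knowing anything about the handlers themselves, which is why the two patterns sit together comfortably.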

Changing to a different design pattern like MVC, which seems to be the holy grail of web application design nowadays, and starting over is not going to hurt, since there is no backwards compatibility in Drupal anyway. There is, however, a wall of stubbornness to be climbed. That wall is the belief that classes and the other tools available in PHP can be used in a procedural fashion. Avoiding OO design in a large PHP application is possible, but it comes at a great cost, one that becomes very apparent when you try to find out where something happens within the core code. If Drupal were fully object oriented it could make use of helper classes to overcome many of its shortcomings.

So Drupal is not an MVC front controller like Java Struts. But it seems to be slowly finding its way there. If you have been following the development of Drupal over the last few years you will have noticed that the ideas coming into the CMS are ones that already existed in Struts. Take a look at the phpMVC project and then study the latest version of Drupal in development. You will see the similarities, and you could in fact write a roadmap for Drupal using the diagrams for phpMVC. You might even notice that Drupal is moving parallel to another MVC front controller CMS, Joomla!. The MVC implementation in the latest version, Joomla! 1.5, is a nice piece of work. The difference is that Drupal developers are trying to do something similar using only procedural code, without any enforced coding conventions or even a plan.

Duct Tape

One of the things PHP developers do to offset the overhead of loading thousands of lines of code on every request is to use conditional includes. This may work well enough for the core structure of a CMS, but it does not help much when an end user adds dozens of contributed modules. Many developers are lazy and do not want to take the time to add the logic that conditional includes require. They may also be put off by the fact that complex business logic is very difficult to implement using conditional includes. If you are building a large shopping cart system with dozens of supporting modules, adding another hundred or so small includes to shave overhead is probably the last thing you will be thinking about. Maintaining the system afterwards is a large and time-consuming task. I feel for the groups that have to maintain and upgrade things like the Drupal e-commerce modules with every new release.
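The conditional include technique can be sketched roughly as follows. The example uses Python's `importlib` as a stand-in for a PHP conditional `include`, and the request types and the mapping are hypothetical; standard library modules play the part of CMS modules.

```python
import importlib

# Hypothetical mapping of request types to the module each one needs.
# Standard library modules stand in for contributed CMS modules.
MODULE_FOR_REQUEST = {
    "export": "csv",
    "feed": "xml.dom.minidom",
}


def handle(request_type: str) -> str:
    """Load the supporting module only for requests that need it."""
    module_name = MODULE_FOR_REQUEST.get(request_type)
    if module_name is None:
        return "handled without extra modules"
    module = importlib.import_module(module_name)  # loaded only when needed
    return f"handled using {module.__name__}"


print(handle("page"))
print(handle("export"))
print(handle("feed"))
```

Even in this toy form the bookkeeping is visible: every new module means another entry in the mapping and another branch to keep correct, which is the maintenance burden the paragraph above complains about.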

Database Architecture

Another reason that Drupal is not scalable is that all the content for the application lives in a single database table. Here is a true story. The company I work for is seriously thinking of moving student.se over to Drupal. During a recent update to a newer version of the site I sat down with the website's administrator and went through the system. The very first thing I asked about was how the database looked. Migrating users is always a pain and I wanted to get an idea of how dirty the system was. He replied, "Which one? There are five of them." Drupal is not designed to be spread over multiple databases. When I say this I mean multiple databases, not multiple database servers. To put it simply, there is no way of spreading the load that is put on the Drupal node table without some code-breaking refactoring. You may just as well start from scratch. If you create 10 different node types and the number of rows for each type is between 100,000 and 600,000, you end up with a single table holding millions of rows that is constantly being read, updated and searched.
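The arithmetic behind that last sentence is simple enough to check. The snippet below only multiplies the figures quoted above; the function and constant names are made up for the example.

```python
NODE_TYPES = 10
ROWS_PER_TYPE_MIN = 100_000
ROWS_PER_TYPE_MAX = 600_000


def single_table_rows(types: int, rows_per_type: int) -> int:
    """Total rows when every node type shares one table."""
    return types * rows_per_type


low = single_table_rows(NODE_TYPES, ROWS_PER_TYPE_MIN)
high = single_table_rows(NODE_TYPES, ROWS_PER_TYPE_MAX)
print(f"{low:,} to {high:,} rows in one table")
```

So the shared table ends up somewhere between one and six million rows, all of them competing for the same indexes and locks.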

Caching

One of the big fixes for Drupal's front controller is the cache-everything policy adopted with Drupal 6. Caching goes a long way, but once it is done, it is done: there is nothing more to squeeze out of it. You cannot hope to compensate for bad design choices by caching. The better way is to start refactoring the design and make better choices. Caching should be a last resort. Caching is also a huge factor if you are thinking of using Drupal as a social networking application. Caching and real time do not like each other. Users of social networking sites love who's-online lists, status reports, chats and other real-time interaction.
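The tension between caching and real-time features can be shown with a tiny time-to-live cache. This is a generic sketch in Python, not Drupal's cache API; the class name, the 60-second lifetime and the who's-online list are assumptions for the example, and time is passed in explicitly so the behaviour is easy to follow.

```python
from typing import Callable, List, Optional, Tuple


class TTLCache:
    """Caches one computed value for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._entry: Optional[Tuple[float, List[str]]] = None

    def get(self, now: float, compute: Callable[[], List[str]]) -> List[str]:
        # Serve the cached value while it is younger than the TTL.
        if self._entry is not None and now - self._entry[0] < self.ttl:
            return self._entry[1]
        value = compute()
        self._entry = (now, value)
        return value


online = ["alice"]
cache = TTLCache(ttl_seconds=60)

first = cache.get(now=0, compute=lambda: list(online))
online.append("bob")  # bob logs in after the list was cached
stale = cache.get(now=30, compute=lambda: list(online))
fresh = cache.get(now=61, compute=lambda: list(online))

print(first, stale, fresh)
```

For the thirty seconds in the middle, visitors are told that bob is not online. Shortening the lifetime fixes that only by giving up the saving the cache was there to provide.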

Third Party Software

Getting third-party software to work in a front controller environment is not easy. This is because you can't just send requests off to the application that handles them. The requests have to be filtered and massaged by the front controller. Adding modules is partly how the front controller gets the metadata it needs about the third-party application so that it can be sewn into the rest of the system. This of course brings more overhead with it. Because procedural code cannot be extended, the best way to let third-party software integrate is to create loads of APIs. This brings another set of problems with it: the core developers have to guess which parts of the system should be exposed.

Refactoring

Because of its design, Drupal cannot be updated without affecting the entire system. You can never tell what is going to happen when a module is turned on or a bit of code changes. It is kind of like putting your finger in a glass of water. The entire glass of water is influenced, no matter where or how you place your finger. There is also a lot of unseen activity: the water may have been contaminated, something that will only become visible as time goes by and the bacteria grow. If this happens there may not be much of a choice. You will have to start over with a new glass of water. Compare this with other patterns, which are built like a pile of toothpicks. While the pile may be in chaotic order, you can straighten it out with a careful touch. When you draw out one toothpick you can do it without moving the others if you are careful. You can also see to some degree how things fit together, and even make predictions based on the visual layout of the pile.

Globals

If you have spent even a small amount of time going through the Drupal API then you know that the system lives off the use of globals. Just about every PHP programmer worth anything agrees that globals will make your life miserable when the time comes to refactor code. Globals are okay for small applications, but once you have built an enterprise-size website with hundreds or thousands of pages and users, changing code becomes an insurmountable task.
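Why globals hurt at refactoring time can be seen in a few lines. The sketch below is in Python and every name in it is hypothetical; it is not code from Drupal. One function reads shared global state, the other takes what it needs as a parameter.

```python
# Shared mutable state that any code in the program can change.
current_user = {"name": "alice", "is_admin": False}


def can_edit_global() -> bool:
    # Hidden dependency: the result depends on state nobody passed in.
    return current_user["is_admin"]


def can_edit(user: dict) -> bool:
    # Explicit dependency: the result depends only on the argument.
    return user["is_admin"]


def promote() -> None:
    # Some unrelated piece of code touching the same global.
    current_user["is_admin"] = True


before = can_edit_global()
promote()
after = can_edit_global()

print(before, after)
print(can_edit({"name": "bob", "is_admin": False}))
```

The same call to `can_edit_global()` gives two different answers, and nothing in its signature says why. Multiply that by every function that reads the global and it becomes clear why changing one of them is risky.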

Overriding Code

One of the best things about PHP frameworks built on object-oriented classes and methods is the ability to override the core system when necessary. CodeIgniter is very impressive in the way it implements this. If you don't like the way a method in route.php (a class) works, you put your own route.php in place and override the core file. In Drupal, overriding code in the core is convoluted, and for most of the structure impossible. This is a major drawback of procedural coding, one that very seldom gets mentioned in arguments about procedural programming vs. OOP.

Take a look at two sources of database overhead, the functions node_load() and user_load(), and you will be able to identify a common problem. They are designed to fetch only a single object, one row of fields from the database. The result is that if you want a hundred users or nodes you have to write a loop. Looping causes more database round trips and queries. The solution is to write your own functions, which kind of defeats the purpose of having the framework.
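The cost of loading one row at a time can be counted directly. The following sketch uses Python's built-in `sqlite3` with an in-memory table; the `UserStore` class, its methods and the table layout are assumptions made for the example and are not the real node_load() or user_load() implementations.

```python
import sqlite3
from typing import Dict, List, Optional


class UserStore:
    """Tiny data layer that counts how many queries it issues."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT)")
        self.conn.executemany(
            "INSERT INTO users (uid, name) VALUES (?, ?)",
            [(i, f"user{i}") for i in range(1, 101)],
        )
        self.queries = 0

    def load_one(self, uid: int) -> Optional[Dict[str, object]]:
        # One query per call, in the style of a single-object loader.
        self.queries += 1
        row = self.conn.execute(
            "SELECT uid, name FROM users WHERE uid = ?", (uid,)
        ).fetchone()
        return {"uid": row[0], "name": row[1]} if row else None

    def load_many(self, uids: List[int]) -> List[Dict[str, object]]:
        # One query for the whole batch.
        if not uids:
            return []
        self.queries += 1
        placeholders = ",".join("?" for _ in uids)
        rows = self.conn.execute(
            f"SELECT uid, name FROM users WHERE uid IN ({placeholders}) ORDER BY uid",
            uids,
        ).fetchall()
        return [{"uid": uid, "name": name} for uid, name in rows]


store = UserStore()
wanted = list(range(1, 101))

looped = [store.load_one(uid) for uid in wanted]
queries_looped = store.queries

store.queries = 0
batched = store.load_many(wanted)
queries_batched = store.queries

print(queries_looped, queries_batched)
```

Both approaches return the same hundred users, but the loop issues a hundred queries where the batch issues one. A batch loader like `load_many` is exactly the kind of function the paragraph above says you end up writing yourself.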

Conclusion

Perhaps six or seven years ago Dries Buytaert, the originator of the Drupal CMS and, I believe, originally a Java programmer, read great things about the front controller design and all of its pluses. But these pluses are not available in PHP. If Drupal were ported to a language other than PHP it would probably scale and serve very well in the enterprise. If PHP had some type of persistence, and maybe if it worked more like Java, there would be no problems. But you can find developer blogs all over the internet that repeatedly say "PHP is not Java".

 
a Visitor posted on: Thu, 2008-05-08 14:41.

Just about every CMS product I've seen, other than very specialized products such as WordPress and MediaWiki, has serious problems.

I don't think PHP needs to be "more like Java." PHP's lack of persistence means that PHP applications are more reliable and scalable than Java applications via a "shared-nothing" approach.

For instance, many Java webapps require constant reboots of the Tomcat application server. This doesn't happen with PHP. Period.

A PHP bytecode cache mitigates many of the problems involved with reparsing complicated apps.

Hiveminds posted on: Thu, 2008-05-08 16:49.

That was just a "what if" type of statement. I don't want PHP to be like Java. I am just saying that any plan to use a design pattern that works well in one language should be carefully investigated before trying to implement the same pattern in another.

a Visitor posted on: Thu, 2008-05-08 22:04.

I am not an expert in design patterns ... so I can't really debate with you there.

However, you can query multiple databases from Drupal:
http://drupal.org/node/18429

Also, in terms of traffic capacity, Drupal does scale well, and I've found it reasonably well documented:
http://www.johnandcailin.com/blog/john/scaling-drupal-open-source-infrastructure-high-traffic-drupal-sites

Plus Drupal has numerous sites that prove its high-traffic capacity:
http://buytaert.net/tag/drupal-sites

Also, you mention Joomla 1.5 as being impressive, but I have yet to find an actual high-traffic site being hosted on Joomla, nor have I heard of hundreds of sites being hosted on one Joomla installation (like Drupal can do).

Also, *this* site is using Drupal....

a Visitor - posted on: Fri, 2008-05-09 05:57.

Actually, you cannot query multiple databases by default. There is db_set_active(), which makes a call to the hardcoded db settings, but Drupal is not set up to handle multiple databases natively.

To explain: let's say you have entries of the types blog and guestbook, but you don't want to have them in the same table or database. You want to optimize as much as possible. You cannot explicitly tell Drupal that blog entries should go in db_blogs.tbl_blogpost and that db_gbooks.tbl_gbpost should handle all nodes of the guestbook type. This would require rewriting the core system to allow it to see the different databases.

Now if this were a framework other than Drupal, the system would be written so that you could first override the database handler, in the way the article describes for route.php. Then you would be able to write code that does this on a need-to-have basis.

The article you mention is just reinforcing what is mentioned here. Scaling across multiple database servers and using clusters is not the same as using multiple databases.

As mentioned in the very first paragraphs of this article, there may be large enterprise sites using Drupal, but they do not show how it is done. My suspicious nature tells me that they are simply using Drupal as a pretty front end to some very hardcore systems. Or perhaps they have hacked the code so much to get everything to work that the CMS is Drupal in name only.

No, you probably have not heard of any Joomla! sites that are huge enterprise systems. But Joomla! does not propagandize that it is such a system. Joomla! pushes excellence in the area that suits it most.

The reason that this article exists is that one executive on the board of directors listened to all the propaganda surrounding Drupal and got mesmerized by the large brand names mentioning Drupal. He failed to do any research, open a code editor, or ask and trust the opinions of the company's programmers, and made decisions based on inaccuracies. These decisions are now apparently written in stone, because even after reading an article like this one the company will continue trying to use Drupal and wonder "Why is it taking so long?" and "Why are we spending more?".

a Visitor posted on: Fri, 2008-05-09 15:46.

Ideas for future articles
- five reasons why the Internet is dead
- five reasons why Microsoft will disappear
- five reasons why Apple sucks

and so on... I'm always sceptical about posts like this one that show only the negative sides (as I am about articles with only a positive point of view).

Hiveminds posted on: Sat, 2008-05-10 07:19.

Not everything written in this world has to be given a positive light for it to be true and accurate. Take some time and learn to program, then take a closer look at the code. Then read the article again and tell us if there is anything that is not on the money.

The items talked about here are listed to be informative, not negative. There are plenty of people answering the question "Can I do this with Drupal?" with "Yes you can". But there are very few, if any, who tell developers and users what cannot be done using Drupal. The limitations of the software are never discussed.

 
