"Jesse", a blogger who apparently has just one article and either wants to build a reputation for a new venture or simply wants to be anonymous (hard to say which) posted a fairly long article on his site called 4 problems with Drupal.

I read this article, and I started out a little annoyed. It makes some false or at least misleading assertions. I immediately commented on one obvious one, and then as I thought about it, I felt a need to address the entirety of his arguments, because they are, in all, flawed.

The API is ignorant of context

Put simply, loading a node (or a user) is an all-or-nothing deal unless you want to write your own SQL. In Drupal each module has the option of hooking into the node (and/or user) API. The most common case is where a node has some additional data it wants to associate with the node. Typically the node will create a hook so that every time a node is loaded it queries the database and inserts some of that data into the node. This happens every time one calls node_load. The situation is analogous when one invokes user_load to load a user object.
Now let’s take the buddylist module as an example. This is a third-party module and thus not part of the Drupal core, but the idiom is used throughout and module developers basically have no other option if they’re looking for any kind of maintainability. This is the heart of the buddylist_get_buddies function, which returns a formatted list of friends.

<?php
if ($buddies = buddylist_get_buddies($user->uid)) {
  foreach (array_keys($buddies) as $buddy) {
    $account = user_load(array('uid' => $buddy));
    $listbuddies[] = $account;
  }
  return theme('user_list', $listbuddies);
}
?>

The problem is that user_load loads all data associated with a user — it is completely ignorant of context. What happens if I have ten modules each of which issues a query when a user is loaded and I have fifty users on my buddy list? One way around this is to have buddylist_get_buddies issue a single query at the top of the page. After all, a “buddylist” is probably just going to consist of a list of usernames. Rather than issuing 100 queries I could simply do something like

SELECT u.uid, u.name FROM buddylist b JOIN user u ON (u.uid = b.buddy_id) WHERE b.uid = %d

This query is fast for any reasonably-sized buddylist but it totally breaks maintainability. What if someone else wants to use more than a name in the buddylist? The “Drupal way” would be to overwrite the theming function so that when theme(’user_list’, $listbuddies); is called we get our custom output. The extra data is available because user_load gives us all the data, irrespective of our need for it. But the buddylist module has no way of knowing what data you want up front — only the top-level stuff, the theming functions, really know for sure.

This is fairly typical in many systems; it's actually fairly difficult in many programs for the use of an object to be truly context sensitive. Also, context sensitivity can often be less efficient than simply loading the whole object. For example, if I load an object and it has three properties, 'foo', 'bar', and 'baz', and I load 'foo' when it is used...and then later load 'bar' when it is used...and finally load 'baz' when it is used...I've loaded those properties at different times. That level of context sensitivity is not a bonus, it's a penalty.
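To make the 'foo', 'bar', 'baz' point concrete, here is a minimal sketch that counts round trips. FakeDb is a hypothetical stand-in for a database, not anything in Drupal; it only counts how many queries get issued.

```php
<?php
// Hypothetical stand-in for a database: counts how many queries were issued.
class FakeDb {
    public $queries = 0;
    private $row = array('foo' => 1, 'bar' => 2, 'baz' => 3);

    // One round trip that returns only the requested columns.
    public function select(array $columns) {
        $this->queries++;
        return array_intersect_key($this->row, array_flip($columns));
    }
}

// Whole object at once: all three properties in a single query.
$eager_db = new FakeDb();
$eager = $eager_db->select(array('foo', 'bar', 'baz'));

// Property at a time: each first use of a property costs its own query.
$lazy_db = new FakeDb();
$lazy = array();
foreach (array('foo', 'bar', 'baz') as $property) {
    $lazy += $lazy_db->select(array($property));
}

printf("eager: %d query, lazy: %d queries\n", $eager_db->queries, $lazy_db->queries);
```

Both approaches end up with the same data; the property-at-a-time version just pays for three round trips instead of one whenever all three properties are eventually used.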

All this is to say that the points where you know exactly what data you need are precisely those points where you’re least able to get it. Oops! To contrast, in an MVC-based framework like Rails, you’d get around this by having a controller fetch the active user and creating a view more-or-less thus:

      <ul id="buddylist">
          <% User.buddies.each do |buddy| %>
          <li><a href="/user/view/<%= buddy.uid %>"><%= buddy.name %></a></li>
          <% end %>
      </ul>

The view knows exactly what it needs to display and the data for a user’s buddies isn’t retrieved from the database until you access User.buddies.

This little snippet does not demonstrate your last sentence in any way. How does it know what it needs? How does it load it? Does it load the entire buddy when you do User.buddies.each? Does it just load the name when you go for 'buddy.name'? How is this any more maintainable than the SELECT query mentioned above? This doesn't constitute a good counterpoint in any meaningful way, as it compares two completely different levels of code. The Rails snippet is presented at a very high level, while the Drupal side is discussed in terms of its underlying mechanics.

It doesn't offer any reasonable alternatives to loading the entire user object when a user is needed, or an entire node object when a node is needed.

2. Event-driven? What’s that?

As I noted above Drupal is procedural at heart. Event or signal-driven designs are very common in the procedural world, but, again, it seems like Drupal can’t decide what it wants. Rather than create an actual event/listener system Drupal uses a strange “hooks” system.

The core defines a certain set of so-called “hooks.” For example, hook_load is the hook associated with a node getting loaded. If I have a module called foo.module and create within that module a foo_load function then every time node_load is called my foo_load function will also get called. This function will probably fetch foo-specific data from the database and stuff it back into the node. An example:

      function foo_load($node) {
        $additions = db_fetch_object(db_query('SELECT * FROM {mytable} WHERE nid = %d', $node->nid));
        return $additions;
      }

Now, personally, I find this system a little strange. For one, it forces me to know each and every hook within Drupal when I’m naming a function. If there’s a hook called hook_shipoopie and I, for whatever reason, create a foo_shipoopie function — a totally legal function name — then there could be all sorts of unintended consequences. Indeed, this decision seems so strange to me that I can’t imagine it wasn’t made on purpose. Maybe someone can give me a good reason?

I admit I'm not fond of magic naming like this either, but the reason for it is that it's *easy*. By simply naming something appropriately, bam, you get the functionality. Registration of events is another step, which some developers don't like. (Personally, I'm working on a patch for Drupal 6 to do exactly this, though, because there are some limitations of the existing hook system I do not like.)

More annoying, though, is the fact that even though the “hooks” system is the mechanism by which modules interact with the core, there’s no way to use it for inter-module communication. The buddylist module, for example, might want to add “befriend” and “defriend” events. As a web application developer I can then choose what happens when a user gets added or removed from someone’s buddylist. Since I know that I have the private message module installed I might just want to send someone a private message. Yeah, fine, the buddylist module could define “befriend” and “defriend” hooks, call module_invoke_all, but we’re still polluting the global namespace. And what if I need to attach custom data to the event? A real event system would allow for much greater flexibility.

What's a real event system? The Views module, which this guy bashes a little later, uses the hook system too, and it creates its own events. There is no reason that other modules can't also utilize the hook system. There is a sort of gentleman's agreement in place about how to name hooks properly, even. I.e, all of Views' hooks are prefixed with views_* to help deal with the namespace pollution problem. It's not ideal, by any means, but it works.
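For readers who haven't seen the mechanism, here is a simplified sketch of name-based hook dispatch and of a module defining its own prefixed hook. It is not Drupal's actual implementation: sketch_invoke_all, the module list, and the privatemsg_buddylist_befriend function are all hypothetical names chosen for illustration.

```php
<?php
// Simplified sketch of name-based hook dispatch; NOT Drupal's actual code.
// $modules stands in for the list of enabled modules.
function sketch_invoke_all(array $modules, $hook) {
    // Everything after the first two parameters is passed on to the hooks,
    // which is how custom data can travel with the "event".
    $args = array_slice(func_get_args(), 2);
    $results = array();
    foreach ($modules as $module) {
        $function = $module . '_' . $hook;
        // A module "implements" a hook simply by defining a matching function.
        if (function_exists($function)) {
            $results[] = call_user_func_array($function, $args);
        }
    }
    return $results;
}

// Hypothetical module reacting to a hook that a buddylist-style module defines.
// The hook name carries the defining module's prefix to limit collisions.
function privatemsg_buddylist_befriend($uid, $buddy_uid) {
    return "notify $buddy_uid: $uid added you";
}

// 'og' defines no matching function here, so it is silently skipped.
$messages = sketch_invoke_all(array('privatemsg', 'og'), 'buddylist_befriend', 7, 42);
print_r($messages);
```

The prefix convention does nothing to enforce uniqueness; it only makes accidental collisions less likely, which is the gentleman's agreement described above.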

These design issues encourage bad developer practices and have huge performance implications.

1. Just One More Query Syndrome

Drupal suffers from what one might call the “just one more query syndrome.” Unfortunately because of the context-blind nature of the Drupal API, a lot of these extra queries are being executed inside big loops. For the CS people out there, consider this. Most modules exist to alter the way nodes or users work, either by adding additional functionality or content. This means that often times you will be executing at least one query per node on a page and one query per user. Let u be the number of users on a page, n the number of nodes, and m the number of modules you have installed. Then we get the following:

#queries per page = O((u+n)m)

I can’t overstate the importance this has for the scalability of Drupal. Imagine trying to create a social networking site, which Drupal claims it can do. You will have hundreds of thousands of nodes, tens of thousands of users, and will need at least a dozen-or-so modules to get the functionality you want. Here’s a scenario: load a listing of the ten most popular blogs on your site with a buddylist, a list of the five popular groups, and a list of new users elsewhere on the site. Let’s say 10 modules issue queries for a node (not unusual) and 3 for a user. This makes 15*10 + 3*10 = 180 queries (groups are nodes, oddly enough). I have definitely seen cases where a complex page powered by Drupal would execute on the order of 1000 database queries uncached. Caching is only a symptomatic cure.

This is just the number of queries to get the display elements on the screen. Drupal stores everything in the database: session data, application variables, URL aliases, everything! If database resources were cookies Drupal would be the Cookie Monster after spending the afternoon getting high with his pals. You could try to cache the expensive parts of the page, but the ability to increase performance via caching does not excuse bad design, it only hides it. And God forbid you get a cache miss. Hello 500 error!
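The arithmetic in that scenario can be reproduced with a few lines. This is only a back-of-the-envelope sketch: estimate_queries is a hypothetical helper, it assumes every module issues exactly one query per object, and reading the "3*10" term as ten users with three queries each is an assumption on my part.

```php
<?php
// Back-of-the-envelope count for the scenario in the quoted article.
// Assumes every module issues exactly one query per object it decorates.
function estimate_queries($nodes, $modules_per_node, $users, $modules_per_user) {
    return $nodes * $modules_per_node + $users * $modules_per_user;
}

// 10 blog nodes + 5 group nodes, 10 queries each; 10 users, 3 queries each.
echo estimate_queries(15, 10, 10, 3), "\n";   // prints 180
```

The figure is entirely driven by the per-object query counts plugged in, so it is worth measuring those on a real site before trusting the total.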

Well, databases are where data is stored. I've seen people arguing about Drupal storing data in the database as a bad thing, but I haven't gotten a good idea what the alternative is.

It is true that modules can add queries, though I haven't seen a set of modules that adds 10 queries to the node object. I'm using about 15 contrib modules on this very Drupal-based blog and I think I only add 1 or 2 queries per node object. The sidebars also load a few nodes to show recent posts and comments, etc. It hasn't been too bad.

The social networking site I run on Drupal has around 150,000 nodes on it, as well. It hasn't choked under its own weight yet, though I admit that Drupal does have scalability issues here and we do need to figure out good ways to work on them, but it is not as dire as this author makes it out to be.

2. Third-party modules are awful performers

Some third-party modules will beat your site into a bloody pulp if you’re not careful. The two biggest offenders I’ve seen are og and views. og is a module which provides group functionality and access controls based around those groups. Views is a generic system by which administrators can define rules to display a list of nodes (e.g., show me every node tagged with “foobar” written in the last week). This is all done through an administrative GUI interface.

og is basically essential if you want group functionality, a mainstay of social networking sites. Drupal has probably the worst hook ever, hook_db_rewrite_sql, which allows any module to rewrite any SQL headed towards the database. og uses this for permissions. Basically for every list of nodes you fetch og adds a WHERE clause to the effect of “AND you have permission to view this node.” These where clauses are absolutely horrendous, of course, and can cause otherwise innocuous queries to take over 20ms to run. If you’re fetching multiple lists of nodes on a single page it only gets worse.

This isn't just OG, this is any 3rd party module that implements node security, and there are some very interesting tradeoffs on this security system. I haven't seen the node access system make node queries take over 20ms to run, myself, so I guess I've seen some very different results than this author has. Either that or he's making up numbers. Not sure which.

From the description of views above you can probably guess that the module programmatically generates SQL from the parameters specified on the administrative page. As anyone familiar with frameworks and CMSs knows, generated SQL is almost universally awful. The views module provides no exception. This SQL is only there to fetch a list of node ids, mind you. Each individual node id is then passed to node_load, which then results in another avalanche of database hits, even if the SQL generated by views is already using some of this data to filter the list of node ids. For example, if I had a view which produced a list of all posts authored by “jim” it would generate SQL something like

SELECT n.nid FROM node n JOIN user u ON (n.uid = u.uid) WHERE u.name = 'jim'

Later, when we invoke node_load, this very same data will be fetched by the user module using hook_load. Views is too clever by half, in my opinion.

It is far more clever than the author of this post, who didn't bother to actually pay any attention to how Views works. There are instances where Views will load the node ID, and there are instances where Views will load only the data it needs. The instances are based on two very important features:

1) Are you telling Views what data you need? I.e, using the right sort of view -- list/table
2) Is the data you need available without extra work? I.e, some fields can't be simply loaded from the database, they need some kind of transformation. This is extremely typical of the 'node body' field which can go through all kinds of transformations.

This is where I will, in my position of experience, come flat out and say that the author of this post is a fraking idiot. Views has a lot of quirks and a lot of problems, and I'll happily agree with you when you talk about certain kinds of inefficiencies that Views has, but please, get your facts straight.

These two are biggies because they’re such popular modules, but as with any system most third-party modules for Drupal simply suck. I can’t really blame Drupal for that, except perhaps insofar as I think its core APIs make it more difficult to write good modules.

I'm afraid the evidence he's given simply doesn't support this statement, in my opinion, and all I can do is disagree with it. The core APIs are definitely not responsible for the author's misunderstandings.

I really don’t mean to dump on Drupal so much. Well, ok, I do, but it’s not out of spite.

I can't help thinking that this is like any good politician, doing exactly the opposite of what he says. I.e, "I'm not attacking you out of spite," he said spitefully.

Part of it is that, as a developer, I find Drupal to be very unfriendly.

That's the strangest thing I've ever heard from a developer. Generally people find Drupal to be very user-unfriendly. It's fantastic for developers because it gives you a ridiculous amount of control over the system. Not quite enough control yet, but far more than most systems out there give you.

I know exactly what I want to do and more often than not Drupal is a roadblock. Add to this the performance issues and I think anyone trying to use Drupal to design a medium-to-large sized interactive website should definitely take a second look. Developing such a site with Drupal would take no less time than using a more general rapid development framework like Ruby on Rails, Django, or Catalyst if you have experience developing such web applications already.

I can't help but feel, given that this is the only article on this guy's blog, that "Jesse" has a bone to pick somewhere. I can't actually argue with his final assessment...there ARE some road blocks in Drupal. It's still fairly difficult to make the UI truly friendly and customize it to your userbase, though it's improved a lot lately. It's still quite difficult to change certain aspects of Drupal's behavior which are too hardcoded. It's still very difficult for some 3rd party modules to interact. And yes, Drupal can do too many queries if you go crazy with the contrib modules.

It's funny -- his conclusion is probably right; there are some problems with Drupal. He just identified mostly the wrong problems.

EDIT: I want to apologize. Jesse isn't necessarily a male name, and a comment in the last response leads me to believe I assumed a gender that I should not necessarily have. Sorry about that, Jesse; for the moment I'll just assume I don't know (and it isn't really relevant, either).

Comments

Guessing where he comes from -Ruby on Rails and the likes- might put his complaints in another light. He is mostly complaining about issues that /any/ general system has, in one way or another. A blog in Drupal is always (technically) inferior to a blog you have written yourself especially for yourself. Drupal carries weight around that you will not need. This is a result of it being 'general'.

His other complaint is about being modular. It does not matter how well you architect your module system, it will add overhead. Complaining about such overhead is like complaining that a truck uses more petrol than a motorbike when you only want to deliver a small parcel. The extra weight (for modularity, flexibility, etc.) is there. That comes with a cost. Live with it. When we look at Ruby on Rails, we see it is not modular: because you need to develop everything yourself, there is no need for modules. RoR has engines, which allow certain features to be developed faster, and using engines makes your application hardly any better than a modular Drupal: engines often come with extra weight and overhead too, performing queries /you/ don't need and so on.
When one uses views, that is always less performant than simply writing your dedicated join query and while-ing that result into a theme_table. No optimisation of views will solve that. But is that a problem? Going further, I think a SQL guru is able to create a single SQL query that fetches the whole front page at once. But as soon as you start abstracting stuff, modularizing it, and so on, you will end up with more queries than the single dedicated one.

Overall, he has valid points, but fails to give solutions. And worse, fails to see the reason behind the problems.

Thanks for your comments. I agree that there were a few mistakes in my article, but I believe it is still mostly correct. For now I'll leave my professional background aside as I don't want to prematurely associate my employer with my blog.

My response:

For example, if I load an object and it has three properties, 'foo', 'bar', and 'baz', and I load 'foo' when it is used...and then later load 'bar' when it is used...and finally load 'baz' when it is used...I've loaded those properties at different times

What Drupal does is worse. Let's say I load a user and have five modules which implement the 'load' op in hook_user. A user_load then invokes five additional queries irrespective of my need for the data.

Possible solutions:
1) Pass in an extra parameter which tells Drupal what other modules' data you want, e.g., user_load($uid, array('profile', 'blog')), etc. This at least reduces the number of queries. However, you still run into a problem where somewhere out in a theming function you want X data. If it hasn't already been loaded by the time you get the user object then you're out of luck unless you want to start issuing database queries from your theming functions.
2) Use an ORM, so that a user "object" understands that when it gets asked for property X it should look to see if it already has it, and if it doesn't, go find it. There's obviously overhead here.

In other words, properties should be lazily loaded but with the option of loading them up front if you need them. An ORM is just one way to implement this.
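A minimal sketch of that idea in PHP 5, assuming a hypothetical LazyUser class with per-property loader callbacks (none of these names exist in Drupal or in any particular ORM). Properties named in the eager list are fetched up front; anything else is fetched on first access through the __get magic method.

```php
<?php
// Hypothetical sketch: LazyUser and the loader callbacks are illustrative names,
// not part of Drupal or any ORM.
class LazyUser {
    public $queries = 0;
    private $loaders;
    private $loaded = array();

    // $loaders maps a property group (e.g. 'profile') to a callable that fetches it.
    // $eager lists the groups to load up front.
    public function __construct(array $loaders, array $eager = array()) {
        $this->loaders = $loaders;
        foreach ($eager as $name) {
            $this->fetch($name);
        }
    }

    // Each fetch stands in for one database query.
    private function fetch($name) {
        $this->queries++;
        $this->loaded[$name] = call_user_func($this->loaders[$name]);
    }

    // PHP calls this when an undeclared property is read from outside the class.
    public function __get($name) {
        if (!array_key_exists($name, $this->loaded)) {
            $this->fetch($name);
        }
        return $this->loaded[$name];
    }
}

function load_profile() { return array('city' => 'Oslo'); }
function load_blog() { return array('posts' => 12); }

$loaders = array('profile' => 'load_profile', 'blog' => 'load_blog');
$user = new LazyUser($loaders, array('profile'));   // 'profile' is loaded eagerly
```

Reading $user->profile afterwards costs nothing extra, while the first read of $user->blog triggers its loader once. The sketch does no error handling for unknown property names.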

This little snippet does not demonstrate your last sentence in any way. How does it know what it needs? How does it load it? Does it load the entire buddy when you do User.buddies.each? Does it just load the name when you go for 'buddy.name'?

Sorry, I was assuming that people reading the article were familiar with all sorts of web frameworks. Rails being one of the hotter ones I thought it might go without saying. There are plenty of tutorials out there. In any event, Rails does this exactly the way I describe in (2) above. If I have a list of users, I can do User.find(:conditions => 'age > 18', :include => :profile). This would find all users who were older than 18 and include the data from the relationship described by :profile. I could leave off the :include and if in my view I did user.profile, it would fetch it from the DB.

If I know I need the data up-front I can get it cheaply. If I don't know, then it's loaded lazily. Drupal, OTOH, just blindly loads everything up front whether I need it or not. Queries, queries, queries!

This isn't just OG, this is any 3rd party module that implements node security, and there are some very interesting tradeoffs on this security system. I haven't seen the node access system make node queries take over 20ms to run, myself, so I guess I've seen some very different results than this author has. Either that or he's making up numbers. Not sure which.

If your site is sufficiently large and complex OG will append something on the order of a dozen additional WHERE clauses (I think in general it's one per group to which the user belongs). If you have 10000 groups and the average user belongs to 10-15, then you start getting hit. Hard. I'm not talking about node_access itself; I haven't looked at its performance.

That's the strangest thing I've ever heard from a developer. Generally people find Drupal to be very user-unfriendly. It's fantastic for developers because it gives you a ridiculous amount of control over the system. Not quite enough control yet, but far more than most systems out there give you.

To me, Drupal tries to make the hard stuff easy and the easy stuff becomes harder. Ok, I can move around abstract content blocks around the page from an administrative panel. Neat. But it's more difficult to fetch a list of items and theme each one in a way that works for users, designers, and developers. Frameworks like Rails, CakePHP, etc., make the easy (and tedious) stuff dead simple and give me the freedom to define higher level stuff like how the various components of my application interact, without getting in my way (too much).

This is where I will, in my position of experience, come flat out and say that the author of this post is a fraking idiot. Views has a lot of quirks and a lot of problems, and I'll happily agree with you when you talk about certain kinds of inefficiencies that Views has, but please, get your facts straight.

Aww, love ya too, hon. xoxo. :-*

If I know I need the data up-front I can get it cheaply. If I don't know, then it's loaded lazily. Drupal, OTOH, just blindly loads everything up front whether I need it or not. Queries, queries, queries!

Actually, this isn't always true. It depends upon the implementation of the module.

Take profile.module, which adds a significant amount of information to a user. It doesn't load profile information automatically. If you want to access profile information, you must call, if I remember right, profile_load_profile on the user object.

It's an extra step that some find annoying. Depending upon what a given module does, this is a good way to go, and there is no reason module developers can't utilize this tactic, especially if the data they are loading is unlikely to be needed whenever a module is loaded.

The problem here may well be a lack of education amongst contributed developers. But pick any framework and look at contributed code, and I guarantee you some of it will suck.

Sorry, I was assuming that people reading the article were familiar with all sorts of web frameworks. Rails being one of the hotter ones I thought it might go without saying. There are plenty of tutorials out there. In any event, Rails does this exactly the way I describe in (2) above. If I have a list of users, I can do User.find(:conditions => 'age > 18', :include => :profile). This would find all users who were older than 18 and include the data from the relationship described by :profile. I could leave off the :include and if in my view I did user.profile, it would fetch it from the DB.

Clearly there is a lot going on under the hood here, and that is one of the niceties of the language, I must assume. It looks pretty similar, in fact, to calling profile_load_profile if you need profile fields, to me, so while the language handles the concept in a more object-oriented manner (and I will sigh here because I think Drupal should be more OO than it is, even without being OO), it's very much the same thing.

So again, I still disagree with your point here.

If your site is sufficiently large and complex OG will append something on the order of a dozen additional WHERE clauses (I think in general it's one per group to which the user belongs). If you have 10000 groups and the average user belongs to 10-15, then you start getting hit. Hard. I'm not talking about node_access itself; I haven't looked at its performance.

Again, it is node_access that does the query rewriting, not OG, which is merely utilizing the node_access features of Drupal to implement this.

I can't argue that there are some inefficiencies added here, but I work on a very large site that uses this same model, and the hits I'm seeing are not nearly as bad as you make them out to be. It's also only one WHERE clause -- it uses an IN () to join all the groups you belong to into a single clause.
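To illustrate the difference between the two shapes of clause being discussed, here is a small sketch that builds both from the same list of group ids. The column name na.gid and both helper functions are assumptions made for illustration; this is not the SQL that OG or node_access actually emits.

```php
<?php
// Sketch only: the column name 'na.gid' and both helpers are assumptions.
// Neither helper guards against an empty list, which would yield invalid SQL.

// One clause, with every group id folded into a single IN () list.
function where_with_in(array $group_ids) {
    $ids = implode(', ', array_map('intval', $group_ids));
    return "na.gid IN ($ids)";
}

// The same condition spelled out as a chain of ORs, one term per group.
function where_with_or(array $group_ids) {
    $parts = array();
    foreach ($group_ids as $gid) {
        $parts[] = 'na.gid = ' . intval($gid);
    }
    return '(' . implode(' OR ', $parts) . ')';
}

echo where_with_in(array(3, 8, 15)), "\n";   // na.gid IN (3, 8, 15)
echo where_with_or(array(3, 8, 15)), "\n";   // (na.gid = 3 OR na.gid = 8 OR na.gid = 15)
```

The two forms are logically equivalent; whether one performs better than the other depends on the database and version, so it is worth checking with EXPLAIN rather than assuming.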

To me, Drupal tries to make the hard stuff easy and the easy stuff becomes harder. Ok, I can move around abstract content blocks around the page from an administrative panel. Neat. But it's more difficult to fetch a list of items and theme each one in a way that works for users, designers, and developers. Frameworks like Rails, CakePHP, etc., make the easy (and tedious) stuff dead simple and give me the freedom to define higher level stuff like how the various components of my application interact, without getting in my way (too much).

In part there are some differences in language here, at least when talking about Rails. And to be fair, CakePHP has some really interesting library functions we could probably borrow in Drupal to make some things easier. The funny thing here is that in one context, you completely bashed generated queries. In another, you are talking up code that clearly generates queries. I'm a little confused by which one you find bad.

It's an extra step that some find annoying. Depending upon what a given module does, this is a good way to go, and there is no reason module developers can't utilize this tactic, especially if the data they are loading is unlikely to be needed whenever a module is loaded.

The problem with this is that there's no standard or convention. How am I supposed to know that if I need profile data I must invoke this function manually? It's just more overhead for the developer when it should be the framework that is providing these basics.

It's also only one WHERE clause -- it uses an IN () to join all the groups you belong to into a single clause.

Interesting. The version of OG I'm using definitely produces queries like "AND (blah or blah or blah) AND (blah or blah or blah)." Maybe it's an issue of the version?

Clearly there is a lot going on under the hood here, and that is one of the niceties of the language, I must assume. It looks pretty similar, in fact, to calling profile_load_profile if you need profile fields, to me, so while the language handles the concept in a more object-oriented manner (and I will sigh here because I think Drupal should be more OO than it is, even without being OO), it's very much the same thing.

The lack of OO is definitely a huge frustration for me. One of my "four points" was originally going to be the inability to easily extend modules. Let's say I want everything the user module does, except that it provides for multiple profile images. Am I supposed to *&^@#ing copy the user module entirely and make my_user.module? Maybe I'm lucky and I can hack it in by writing my own module that does the translation. This is stupid because PHP already provides exactly this functionality with its object model.

In another, you are talking up code that clearly generates queries. I'm a little confused by which one you find bad.

Well, in my experience the queries generated with Rails are better than those generated with certain Drupal modules, probably not least because the automatic query generation is centralized rather than diffused among modules. If you improve it in the Rails (or Cake or Django or whatever) core you improve it everywhere. With Drupal it's like whack-a-mole. What's more, with any MVC framework I've used, it's easy to replace the generated SQL with your own. If Rails is producing shit I can tell it what SQL I need to find the widgets I want. If Drupal is producing shit then I...what? Write my own SQL and invoke db_query myself? That's maintainable.

The problem with this is that there's no standard or convention. How am I supposed to know that if I need profile data I must invoke this function manually? It's just more overhead for the developer when it should be the framework that is providing these basics.

Well, without being properly OO there isn't a way to fix that in PHP. PHP4's OO really sucks; it wasn't until PHP5 that OO really became feasible, and it won't be until PHP6 that it's actually good. So, if you want to argue that PHP has some problems here...ok, sure. That's fine. Oh and the answer to your question there is "read the profile.module documentation". Presumably if you're using its API you need to do this at some point anyway.

The lack of OO is definitely a huge frustration for me. One of my "four points" was originally going to be the inability to easily extend modules. Let's say I want everything the user module does, except that it provides for multiple profile images. Am I supposed to *&^@#ing copy the user module entirely and make my_user.module? Maybe I'm lucky and I can hack it in by writing my own module that does the translation. This is stupid because PHP already provides exactly this functionality with its object model.

Goodness, I hope not. The extensibility of modules is based entirely upon how the module itself is coded. A lack of OO mentality does make procedural code harder to extend. I agree with that. You have to rely on the good developers to provide code that can be extended easily. There's a group of us that believe heavily in creating APIs for that exact reason, rather than simple do-it-once modules that do it their way and their way alone.

Certainly the example you suggest doesn't require writing a new user.module. It probably means turning off the existing user_picture stuff, and writing something else that works like profile.module and providing hooks so that themes can output your new data. I'm sure if you try hard enough you'll come up with examples you simply cannot do without something weird, but the one you picked is actually pretty straightforward.

Well, in my experience the queries generated with Rails are better than those generated with certain Drupal modules, probably not least because the automatic query generation is centralized rather than diffused among modules. If you improve it in the Rails (or Cake or Django or whatever) core you improve it everywhere. With Drupal it's like whack-a-mole. What's more, with any MVC framework I've used, it's easy to replace the generated SQL with your own. If Rails is producing shit I can tell it what SQL I need to find the widgets I want. If Drupal is producing shit then I...what? Write my own SQL and invoke db_query myself? That's maintainable.

Well, I don't know about Rails' automatic query generation. PHP simply doesn't have it, so we have to come up with our own. With Views I did what I could. I tried very hard to write a system that produces decent queries. For the most part, though, the SQL is written by the contributor, and it's as good as the author of the contributed code.

On the other hand, I'm going to go back to your original statement that generated SQL generally sucks. See, I agree with that very much, especially as the author of a query generator. So I still find your argument here to be of the "have your cake and eat it too" variety.

PHP4's OO really sucks; it wasn't until PHP5 that OO really became feasible, and it won't be until PHP6 that it's actually good.

Yep, I know. But even so, there are simple things that could have been done to improve on this situation. For example, accessing static members in a class has almost no overhead in PHP, but already provides you with OO extensibility and real namespacing. This was definitely feasible in PHP4 and I have written several sites in PHP5 which use exactly this approach.
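For readers who want to picture the static-member approach described here, the following is a minimal sketch and not code from Drupal or from the commenter's sites. The class names UserPictures and MultiUserPictures are hypothetical, and the sketch assumes PHP 5.3 or later because it relies on late static binding (static::); on older versions the subclass would have to override accept() as well.

```php
<?php
// Hypothetical classes for illustration only; nothing here is Drupal API.

class UserPictures {
    // Maximum number of pictures a profile may hold.
    public static function limit() {
        return 1;
    }

    // Keep at most limit() pictures from the submitted list.
    // static:: resolves to the class the call was made on (PHP 5.3+).
    public static function accept(array $pictures) {
        return array_slice($pictures, 0, static::limit());
    }
}

// Extending the behaviour means overriding one method,
// not copying the whole class.
class MultiUserPictures extends UserPictures {
    public static function limit() {
        return 5;
    }
}

echo count(UserPictures::accept(array('a.png', 'b.png', 'c.png'))), "\n";
echo count(MultiUserPictures::accept(array('a.png', 'b.png', 'c.png'))), "\n";
```

The class name doubles as a namespace, and the subclass changes one rule while inheriting the rest, which is roughly the "multiple profile images" case raised earlier in the thread.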

Certainly the example you suggest doesn't require writing a new user.module. It probably means turning off the existing user_picture stuff, and writing something else that works like profile.module and providing hooks so that themes can output your new data. I'm sure if you try hard enough you'll come up with examples you simply cannot do without something weird, but the one you picked is actually pretty straightforward.

And that doesn't seem complicated to you from a development perspective? I have some object (whether it's implemented as an actual honest-to-goodness PHP object or not) and I want to extend it. To me this is a Solved Problem and to take any other approach requires good justification.

For the most part, though, the SQL is written by the contributor, and it's as good as the author of the contributed code.

I've actually found the SQL to be much worse, personally. Often times there are tons of queries against columns without indexes, over-aggressive table locking, etc. I can't really blame Drupal for the fact that most people don't know what makes a good database query. What's more, since views relies on third-party modules to provide filters, you're back in the same boat. If I have to write my own code I'd rather do it in an environment that I feel accommodates my needs.

On the other hand, I'm going to go back to your original statement that generated SQL generally sucks. See, I agree with that very much, especially as the author of a query generator. So I still find your argument here to be of the "have your cake and eat it too" variety.

Is there a way to supply custom SQL to a view? Can I just enter SQL into the admin interface somewhere and have everything else work as normal?

If there is then that's wonderful and I recant most of my objections to the views module. I've actually had to write a module which acts as an intermediary between the views core and display. It runs as a cron job outside of drupal and periodically fetches the generated SQL from each view and populates a table with (view, nid) pairs. When the original menu item is invoked it then fetches the nid list from this table and proceeds as usual using the views theming functions.

If you want to argue that Drupal is not OO, and therefore this is bad, I don't have a great counterpoint to that. I'm a fan of OO myself. I do believe that OO brings extra cruft, so I don't entirely agree that simply using proper OO methodology would make it better. It takes really, really solid developers to write OO code that isn't excessively heavy.

As for Views...there isn't a way to do that via the admin UI. However, there are a couple of ways to affect the SQL that is generated.

First, there is hook_views_query_alter() which gives you the $query object and lets you futz with it before the actual SQL is generated.

Second, Views tries to cache generated queries so that it doesn't always spend lots of time actually generating them. It can't always, due to the dynamic nature of some Views, so this won't always work. At the moment this caching is stored in the wrong place (in the views database itself). And you *can* just drop a query in there and it'll get used. Of course, this cached query will get removed at certain points, but I *do* see the utility of making it more feasible to just replace a Views' query with something that's been hand-tweaked. Because as I said, I agree that generated SQL may well just suck.
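The first option above, altering the query object before any SQL is generated, might be pictured along these lines. This is only a sketch: the exact signature of hook_views_query_alter() and the shape of the real $query object depend on the Views version, so the code uses a hypothetical stand-in class (SketchQuery) and a made-up module name (mymodule) rather than the real Views API.

```php
<?php
// SketchQuery is a hypothetical stand-in for the query object Views
// hands to the hook. It is NOT the real Views class.
class SketchQuery {
    public $table = 'node';
    public $where = array();

    public function addWhere($clause) {
        $this->where[] = $clause;
    }

    // Build the SQL string from the collected parts.
    public function toSql() {
        $sql = 'SELECT nid FROM ' . $this->table;
        if ($this->where) {
            $sql .= ' WHERE ' . implode(' AND ', $this->where);
        }
        return $sql;
    }
}

// Shaped like a hook implementation; the real hook takes other
// arguments as well, which are omitted here as an assumption.
function mymodule_views_query_alter($query) {
    // Adjust the query before the SQL string exists.
    $query->addWhere('status = 1');
}

$query = new SketchQuery();
mymodule_views_query_alter($query);
echo $query->toSql(), "\n";
```

The point of the pattern is that a module gets to change the structured query, not patch a finished SQL string.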

Also, I highly recommend using List views where possible, since you can avoid node_loads. In fact, Karoly Negyesi went so far as to write a little nodeapi hook that stores the fully converted teaser in the database so that he could utilize a List view as a teaser view, something that is ordinarily not possible because teasers have to be processed. With a little nodeapi magic, however, he actually gained significant efficiency using Views.

To be honest, I was surprised. I never expected to hear about anyone gaining efficiency with Views. And to be even more fair, he is a serious Drupal expert and your average Drupal developer isn't going to think to do this. But it can be done.

It takes really, really solid developers to write OO code that isn't excessively heavy.

Maybe, but it's not like most third-party modules are shining paragons of good code, anyhow. I'm just saying that PHP provides a native mechanism by which a lot of these problems can be easily solved and yet Drupal obstructs me from doing that. I would be fine if Drupal were agnostic with respect to my choice of development methodology, but it's not.

Second, Views tries to cache generated queries so that it doesn't always spend lots of time actually generating them

Yeah, it's not the generation that is expensive, but rather the queries themselves.

Also, I highly recommend using List views where possible, since you can avoid node_loads.

Unfortunately a teaser view is what I find I use views for most often. "Tricking" views into making the list view behave like a teaser view is, well...it just seems like getting around your ass to get to your elbow, as my aunt says. I know precisely what data I want. I'm forced to either dive neck deep into Drupal voodoo or write my own SQL and call db_query. Neither extreme is desirable.

I guess my biggest aesthetic complaint is that when developing with Drupal it feels like it has a huge footprint. Maybe I'm treating Drupal too much like a development framework and not enough like a CMS. As a CMS it's alright, albeit with performance issues. As a development framework it makes me want to rip my hair out.

Well. The trouble there being that the teaser can be processed so many ways that simply retrieving it from the database just isn't viable. node_view() can do a lot of things. In general, that is the price of extensibility.

When you want to factor in performance, you've got to pay back some of the extensibility. There are ways to do it, but they cost you maintainability.

"Good, fast, cheap" -- it may be analogous to "Extensible, maintainable, fast". Except that one sounds really dull.

Comparing Drupal to Rails and CakePHP has always been unfair. Rails and CakePHP both amount to library sets that you pick and choose functionality from. Drupal has much more Man in the Middle stuff going on. If you don't want to do things the Drupal way, then by all means, Drupal is not right.

That doesn't mean it's poor as a development platform. It does mean that we have a lot of work to do to really be the development platform that I want it to be.

One specific response:

Maybe, but it's not like most third-party modules are shining paragons of good code, anyhow.

Imagine how much worse it'd be if it were OO? I tell ya, I've seen bad code, and then I've seen the OO code of people who Just Don't Get OO code. It's another level of bad.

Maybe I'm treating Drupal too much like a development framework and not enough like a CMS.

And right there is the cognitive dissonance. You've been dancing around that point for this entire thread so far. Let's get it out in the open:

Hand-crafted custom SQL for a given task will be faster than dynamically built customizable queries. Period. Duh.

Hand-crafted custom PHP code for a given task will be faster than dynamically constructed multi-part plugin-based code (whether done via hooks, events, or inheritance). Period. Duh.

Hand-crafted custom C code will be faster for a given task than PHP will be. Period. Duh.

Hand-crafted custom assembly code will be faster for a given task than C will be. Period. Duh.

No matter what level you're working at, there is a trade off between Modifiability and Performance. In most cases, those two architectural axes are at odds with each other. You have to trade-off somewhere between the two. Depending on the task and code you can ameliorate the losses to one or the other, but you still have to trade-off between those two. Where you trade-off will depend on your priorities on a given project.

Drupal takes a trade-off point that emphasizes Modifiability first and Performance second. That's its design philosophy. "Make it extensible, and oh yeah fast." Rails frameworks, at least in my admittedly limited experience (getting a Ruby on Rails sales pitch multiple times, including from David, and testing out CakePHP, a Rails-on-PHP framework, for an upcoming project), do the opposite. Get code mostly working fast, keep the code fast, but you have to stay "on the rails" in terms of how things are done (hence the name) and it's not as easy to drop in extensions as it is in Drupal (or Firefox, or Eclipse, or any other plugin-based or hyper-modular system). Although it's funny to be having this conversation, since not long ago one of the main complaints against RoR was that its performance was terrible in large-scale systems because its SQL was so unoptimized. :-)

Sure, using CCK-based nodes and Views is slower than a custom-coded node type and targeted SQL queries. Absolutely it takes more CPU cycles. But when (not if, when) a client comes back and says "by the way, for the launch we have in two hours, can we add two more fields to that list and make it a table instead of an unordered list", it takes me 10 hours to change that in hand-crafted targeted code and 10 minutes to change a few settings in a View. Larry cycles cost more than CPU cycles, and I want to go home on time. That's why I push as much as I can into CCK and Views these days. It helps MY performance. (And to be fair, Drupal is still not too shabby speed wise compared against other systems in its class, such as Joomla or Typo3. Dries had some benchmarks on his blog a while back on that.)

If you're looking for a targeted, low-profile, one-off, ORM-based framework for building an application, then you don't want Drupal. Drupal is not for you. Drupal isn't even trying to provide that. Don't use a bulldozer to drive to the grocery store to get milk.

If you're building a social networking site, though, you can get the basic functionality of such a site (users, buddy lists, a wiki, blogs, comments, reviews, a forum, etc.) setup in a lazy afternoon in Drupal. With the hundred hours you just saved, spend some time making it look really pretty (which, if you know Drupal's theme system well, is also a not-insanely-long process). With the 100 hours just saved by using Drupal, at lots of dollars per hour, buy yourself a bigger server to handle the not-fully-optimized SQL and the overhead of all that indirection. In the end you'll get the job done faster, for less money, and have more time to spend with your family.

I'm not just saying that, either. The company I work for just switched to Drupal for our main site development this year for exactly that reason. For an upcoming targeted web app, though, we're planning to use CakePHP instead, because it's more suited for that sort of setup.

Insert that old adage about hammers and nails here.

it takes me 10 hours to change that in hand-crafted targeted code and 10 minutes to change a few settings in a View.

I'm sorry, that's nonsense. If you're working in a framework, either of your own design or someone else's, and it takes you 10 hours to change the presentation of a given set of data then you seriously need to reevaluate your technical decisions. I can write a simple templating system in less time than that which would totally separate data from presentation.

In fact, when I was debating about what points to include (I have way more than four), one of the top contenders was the fact that Drupal has no consistent data/presentation separation. In an MVC framework I could give a designer access to the views and with a little guidance they would be able to understand what was happening. With Drupal, design is splattered all over the place: templates, theming functions, etc. What's worse, many theming functions make PHP function calls or retrieve data from the database.

This adds overhead for the developers because now they have to act as intermediaries between the designers and Drupal.

If you're building a social networking site, though, you can get the basic functionality of such a site (users, buddy lists, a wiki, blogs, comments, reviews, a forum, etc.) setup in a lazy afternoon in Drupal. With the hundred hours you just saved, spend some time making it look really pretty (which, if you know Drupal's theme system well, is also a not-insanely-long process). With the 100 hours just saved by using Drupal, at lots of dollars per hour, buy yourself a bigger server to handle the not-fully-optimized SQL and the overhead of all that indirection.

I think you're underestimating how violently Drupal can abuse the database. I've seen pages which, using nothing more than off-the-shelf modules, take upwards of 2000 queries to generate. You can talk to me about caching all you want, but that's a sign of a fundamental design problem right there. Some of this is because third-party code sucks and some of it is because of the way the API and hooks system interacts with the database. This is what the buddylist example in my original post was meant to illustrate. The code is inefficient, but there is no way to fix it without breaking extensibility or maintainability.

In the end you'll get the job done faster, for less money, and have more time to spend with your family.

Look, you might subscribe to the Ron Popeil school of web development, but I think really you're just exaggerating to make a point. It's all about a website's needs. My points are (and were) as follows:

1) Drupal doesn't scale well with respect to the number of modules
2) The infrastructure of Drupal causes it to have a lot of overhead

This means that if your website is going to be complex (let's say 20+ modules) or high traffic (let's say 15M+ pageviews / mo) then Drupal will become a roadblock in the future. If that's the case then the 100 hours you saved by using Drupal as a turn-key solution are just being displaced to a later time.

I'm going to use Rails as an example again. At some point you're probably going to want to use a shared memory cache like memcached. Because Rails can treat any object store as a data source it is easy to put a memcache layer between the model and the controller. With little effort you get the effects of memcache everywhere, always. With Drupal the only "solution" is going to be to make Drupal itself memcache-aware. To do that you're going to have to make each API memcache-aware independently. Fun times.
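To make the "cache layer between the model and the controller" idea concrete, here is a small read-through cache sketch. It is not Rails or Drupal code: UserStore, CachedStore and ArrayCache are hypothetical names, and a plain in-process array stands in for memcached so the example runs on its own.

```php
<?php
// All names are hypothetical. ArrayCache stands in for memcached.
class ArrayCache {
    private $items = array();

    public function get($key) {
        return array_key_exists($key, $this->items) ? $this->items[$key] : null;
    }

    public function set($key, $value) {
        $this->items[$key] = $value;
    }
}

// The "model": every find() costs one (pretend) database query.
class UserStore {
    public $queries = 0;

    public function find($uid) {
        $this->queries++;
        return array('uid' => $uid, 'name' => 'user' . $uid);
    }
}

// Same find() interface as UserStore, so callers need not know
// whether they are talking to the cache layer or the store itself.
class CachedStore {
    private $store;
    private $cache;

    public function __construct($store, $cache) {
        $this->store = $store;
        $this->cache = $cache;
    }

    public function find($uid) {
        $key = 'user:' . $uid;
        $hit = $this->cache->get($key);
        if ($hit !== null) {
            return $hit;                      // served from cache
        }
        $value = $this->store->find($uid);    // cache miss: ask the store
        $this->cache->set($key, $value);
        return $value;
    }
}

$store = new UserStore();
$users = new CachedStore($store, new ArrayCache());
$users->find(7);
$users->find(7);
echo $store->queries, "\n";
```

Because the wrapper exposes the same interface as the store, caching is added in one place rather than in every caller, which is the contrast being drawn with making each API cache-aware separately. Invalidation on writes is left out of the sketch.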

Yes, I understand that Drupal is not a framework in the way that Rails is, but I still need to be able to do the same things with it. And if you're never going to reach this point, fine. But if you are then can you honestly tell me that Drupal will save me 100 hours?

answer?

Drupal seems to have an edge when it comes to setting up a quick and dirty portal / community site / automated Web CMS. (to some degree, so does Joomla)

In lieu of using Drupal -- for either "code issues", SQL efficiencies, or scalability issues -- what's the better product to use? (FOSS or Commercial -- either way)

IBM's Websphere Portal comes to mind .. but then issues of infrastructure impacts and software licensing costs also seem to come along with this class of product.

If you stacked up all of the "objecting points" with the discussion so far, and pointed to a package that addressed them all (without injecting other demons along the way), what would that package be?

Thanks in advance

Giant Mike

There is no web framework which solves all the problems of web development. Drupal is designed first and foremost to be a drop-in solution: install some modules, configure them, maybe change the theme, and voila! In no time flat you have a (relatively) complex web application.

But every site has different needs. Some sites need performance and some sites need flexibility. This is probably the central trade-off in web development and the appropriate balance differs from application to application.

I think Drupal is fine if you (1) want a flexible CMS, (2) you don't need many modules, and (3) you aren't going to be pulling in a large number of visitors (20M+ per month, let's say). I wouldn't build the next GMail or Twitter or Amazon using Drupal, ever. Even something like a social network is hard to create with Drupal because many of the modules (og is by far the worst) will absolutely abuse your database. The architecture of Drupal, IMO, only exacerbates the problem.

Earl Miles and Jesse probably don't have time to sit down and hash out all the ways drupal should not or should be improved, but although the root of the dispute is Rails/Cake type development framework versus CMS, I think there's a lot of value getting into the specifics of how the different frameworks work.

Because Drupal is also a development framework. It has too much good stuff and too much extendability for people not to jump in.

In particular, looking for a standard way to make modules extendible would be very interesting for me. Could this be done without going full-fledged OO? A way to tell Drupal "I'm overriding module_doallthis with mymodule_override_module_doallthis" (which could recreate or call module_doallthis inside it?)

That's probably not a good example, but most usefully from this conversation Drupal could be looking for ways to have greater tweakability to manage with high traffic (by enabling generated queries to be overwritten for instance, or even centrally managed)... and ask people like Jesse coming from a Rails or Cake background to contribute code patches that get us closer...
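The naming-convention override floated above could be sketched roughly as follows. Drupal has no such convention; call_overridable() is a hypothetical helper invented for this illustration, and the function names simply follow the ones used in the comment.

```php
<?php
// Hypothetical sketch; Drupal provides no such override mechanism.

// The function a module author originally wrote.
function module_doallthis($text) {
    return 'original:' . $text;
}

// Another module's override, found purely by its name.
function mymodule_override_module_doallthis($text) {
    // Wrap the original rather than replacing it outright.
    return strtoupper(module_doallthis($text));
}

// Call $function, unless one of the enabled modules supplies
// a function named <module>_override_<function>.
function call_overridable($function, array $modules, array $args) {
    foreach ($modules as $module) {
        $override = $module . '_override_' . $function;
        if (function_exists($override)) {
            return call_user_func_array($override, $args);
        }
    }
    return call_user_func_array($function, $args);
}

echo call_overridable('module_doallthis', array('mymodule'), array('x')), "\n";
echo call_overridable('module_doallthis', array(), array('x')), "\n";
```

The catch is that every caller has to go through the dispatcher instead of calling module_doallthis() directly, and each call pays for a function_exists() lookup per enabled module.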

Drupal is a content management system pretending to be a framework. It has undoubtedly become bloated over the years. It's feature creep at its best. They mistake hours of hacking and frustration for productivity. Drupal is a great tool, but until they start refactoring major parts of their core to work simply and powerfully, with developer productivity in mind, it will always lose newcomers.

"This is fairly typical in many systems; it's actually fairly difficult in many programs for the use of an object to be truly context sensitive. Also, context sensitivity can often be less efficient than simply loading the whole object. For example, if I load an object and it has three properties, 'foo', 'bar', and 'baz', and I load 'foo' when it is used...and then later load 'bar' when it is used...and finally load 'baz' when it is used...I've loaded those properties at different times. That level of context sensitivity is not a bonus, it's a penalty."

This made me laugh ... If someone's wife is pregnant, should he buy clothes for his kids in sizes 1,2,3,4,5,6,7,8,9,10 .. M, X, XXL, XXXXL (you never know how big he can be)

Honestly speaking, I am not an EXPERT programmer, but I know some bits and bytes of debugging and have basic knowledge of OOP and control structures... For the last 8 months I have been working with it (I guess 4 months must have been spent searching through APIs and docs) and found it's deep shit for developing a really complex system ...

Note: There is only one reason and two rules why I am working with it: "BOSS"

Rule 1: "BOSS is always right"
Rule 2: "If the boss is wrong see rule 1"

One more thing I just observed: do I really need to preview a comment each time I want to submit ????
hahahahaha

Since I no longer work there, I guess it can't hurt to say, but once upon a time I worked for Sugar Publishing. You can read about the sort of stuff I did with Drupal here: http://drupal.org/node/116578 (yes, I'm farmerje).

So, I wouldn't try to pull the "yeah, but what have YOU built with Drupal?" card.

I'll just add that at the time I left our traffic numbers were more than double what I quoted in that post.

Burgh. This was meant to be a reply to the chap below, not you. Sorry.

I do wonder if those that dislike Drupal have only checked out the APIs and a fresh install, or have actually used it to develop an actual website for a client. I mean, the taxonomy system alone would take hours to code even with an MVC framework. And the fact that MTV and The Onion are using Drupal means that it can handle high traffic just fine depending on how you configure caching etc. Considering that processing power is getting ever cheaper while labor is getting ever more expensive, I simply cannot imagine a scenario in which the audiences that use Drupal (newspapers, blogs, communities) would want to use a framework or code from scratch.

And of course it would be silly to, say, switch Wikipedia or Amazon to Drupal, but those aren't exactly examples of your typical web portal, online newspaper, or whatever. So perhaps you (Jesse) are misjudging what drupal is meant to do and whom it aims to serve.

To be fair, the Onion goes to some pretty interesting extremes in order to handle their massive loads. But they are a VERY VERY VERY high traffic site. They put normal high traffic sites to shame, in fact.

Since I no longer work there, I guess it can't hurt to say, but once upon a time I worked for Sugar Publishing. You can read about the sort of stuff I did with Drupal here: http://drupal.org/node/116578 (yes, I'm farmerje).

So, I wouldn't try to pull the "yeah, but what have YOU built with Drupal?" card.

I'll just add that at the time I left our traffic numbers were more than double what I quoted in that post.

It's not a "card" that I tried to pull, it's a genuine question, because I simply can't imagine building something with the same functionality as the average drupal-plus-modules site in a reasonable timespan within a framework like CakePHP; and I'd suppose that only pays off in some corporate contexts or for the specialized sites/services you've referred to. Apparently your experience has been different. Nothing much to discuss about, then :)

I also agree from personal experience that Drupal has a hard time keeping up with high volume sites.
I agree with what Jesse has said and have experienced similar things. Furthermore, I don't -just- "use" Drupal; if need be (time provided) I could actually build a CMS like Drupal as well.

All I want to say is you aren't going to run into some of these issues until you work on and experience a high traffic site for yourself first hand.

Drupal is great for MANY situations and MANY people - but it's not a universal solution. No CMS is. Just have to come to grips with that. It's not about what you love. What you know. Mac vs. PC vs. Linux. Drupal vs. Joomla! etc. It's about what works and what doesn't.

If there was one right answer we wouldn't have all these various options and applications available to us, now would we? =)

For the record, I am involved in some of the highest traffic Drupal sites on the net, so I'm quite aware of what it takes to run a high volume Drupal site.

So Drupal.org has a Google Page Rank of Nine (great SEO is baked into the system), so yeah, Drupal can handle high volumes of traffic. FYI: Amazon also has a page rank of 9. Feel free to fact check this statement:

http://www.seochat.com/seo-tools/pagerank-lookup/

that Drupal has a hard time keeping up with high volume sites.

Drupal does have problems with high volume sites, but you really gotta be up there with the traffic for it to have a big impact.

Regards,

Tom G.

You really have to be up there in traffic to notice the difference. I've never had to worry about it :-(

Glad to see that someone loves Drupal :-)

Thanks for all of the information.

Regards,

Terry

Jesse, learn more about Drupal and get involved instead of writing such long articles :)
Drupal 6 will have better performance, so Drupal gets better from time to time.

Seems like here's a real Drupal pro speaking :-) Actually, most of what you wrote is correct, but as you too made some mistakes you should allow Jesse to make his own ..

I've been enjoying this helpful discussion so far, and noticed almost all of the comments around Drupal relating to "community-driven" sites.

I'm looking for a CMS for our e-commerce startup. Is Drupal a complete write-off for an e-commerce project, or should we give Drupal E-commerce a try?

Thanks in advance.

Cheers,

Dre.

Check out ubercart. See http://ubercart.org