home forums resources search newsjoinmembers: 5931
Main Network: Joomla Wordpress Drupal Drupal.se Fireorb Flash Java PHP Ruby Windows Linux

Drupal - Speeding up taxonomy_term_node_count

 
Hiveminds - posted on:Wed, 2007-09-05 09:16

Drupals taxonomy node count function takes too much time to process. The problem is that the design is not optimized for use with a large number of terms and hundreds of thousands of nodes. Here is the devel read out for the process.

734.89ms 1 taxonomy_term_count_nodes

SELECT t.tid, COUNT(n.nid) AS c FROM term_node t INNER JOIN node n ON t.nid = n.nid LEFT JOIN i18n_node i18n ON n.nid = i18n.nid WHERE (i18n.language ='sv' OR i18n.language ='' OR i18n.language IS NULL) AND ( n.status = 1 ) GROUP BY t.tid

<?php

/**
 * Given a term id, count the number of published nodes in it.
 */
function taxonomy_term_count_nodes($tid$type 0) {
  static 
$count;
  if (!isset(
$count[$type])) {
    
// $type == 0 always evaluates true is $type is a string
    
if (is_numeric($type)) {
      
$result db_query(db_rewrite_sql('SELECT t.tid, COUNT(n.nid) AS c FROM {term_node} t INNER JOIN {node} n ON t.nid = n.nid WHERE n.status = 1 GROUP BY t.tid'));
    }
    else {
      
$result db_query(db_rewrite_sql("SELECT t.tid, COUNT(n.nid) AS c FROM {term_node} t INNER JOIN {node} n ON t.nid = n.nid WHERE n.status = 1 AND n.type = '%s' GROUP BY t.tid"), $type);
    }
    while (
$term db_fetch_object($result)) {
      
$count[$type][$term->tid] = $term->c;
    }
  }
  foreach (
_taxonomy_term_children($tid) as $c) {
    
$children_count += taxonomy_term_count_nodes($c$type);
  }
  return 
$count[$type][$tid] + $children_count;
}
?>

As you can see there is a JOIN between two tables that can have an incredible amount of rows on a large and busy website. The term_node table which is used in the query is also one that cannot be normalised as it is a join table of sorts. Caching does not work in this case because the number of nodes per term is a dynamic value.

My idea is to change the taxonomy node save function to have it save/delete to a term_node_count table. This table would allow for a single query to be used and speed things up tremendously. This would involve changes in:

taxonomy_save_term
taxonomy_del_term
taxonomy_save_node
taxonomy_term_count_nodes


This article brought to you by the
Hiveminds Magazine - Staff. Contact us if you want to post an article or announcement anonymously

 

 
by Carl on Wed, 2007-09-05 13:50

Ran into a miss in taxonomy_nodeapi. The function uses a switch based on the operation/action being called. But it calls the same function for different actions without passing the calling action on to the recieving function. This makes it impossible to know if the operation is a "update" or an "insert".

<?php

/**
 * Implementation of hook_nodeapi().
 */
function taxonomy_nodeapi($node$op$arg 0) {
  switch (
$op) {
    case 
'load':
     
//$output['taxonomy'] = taxonomy_node_get_terms($node->nid);
     
return $output;
    case 
'insert':
      
taxonomy_node_save($node->nid$node->taxonomy);
      break;
    case 
'update':
      
taxonomy_node_save($node->nid$node->taxonomy);
      break;
    case 
'delete':
      
taxonomy_node_delete($node->nid);
      break;
    case 
'validate':
      
taxonomy_node_validate($node);
      break;
    case 
'rss item':
      return 
taxonomy_rss_item($node);
    case 
'update index':
      return 
taxonomy_node_update_index($node);
  }
}
?>

So changing taxonomy_nodeapi to suit is called for before I can insert or update another table in the taxonomy_node_save function.

With ten years of experience in web development he spends most of his work day developing in PHP,mySQL and Drupal
Carl McDade - Systems Developer
by Hiveminds on Wed, 2007-10-17 07:34

In the end I just hacked the taxonomy module and removed most of the code in taxonomy_term_count_nodes(). This solution has been running for a few weeks now without any noticable changes to Drupals behaviour.

<?php

function taxonomy_term_count_nodes($tid$type 0) {
  
/*static $count;
  if (!isset($count[$type])) {
    // $type == 0 always evaluates true is $type is a string
    if (is_numeric($type)) {
      $result = db_query(db_rewrite_sql('SELECT t.tid, COUNT(n.nid) AS c FROM {term_node} t INNER JOIN {node} n ON t.nid = n.nid WHERE n.status = 1 GROUP BY t.tid'));
    }
    else {
      $result = db_query(db_rewrite_sql("SELECT t.tid, COUNT(n.nid) AS c FROM {term_node} t INNER JOIN {node} n ON t.nid = n.nid WHERE n.status = 1 AND n.type = '%s' GROUP BY t.tid"), $type);
    }
    while ($term = db_fetch_object($result)) {
      $count[$type][$term->tid] = $term->c;
    }
  }
  foreach (_taxonomy_term_children($tid) as $c) {
    $children_count += taxonomy_term_count_nodes($c, $type);
  }
  return $count[$type][$tid] + $children_count; */
  
$data db_fetch_object(db_query("SELECT node_count FROM {term_node_count} WHERE tid = %d"$tid));
  return  
$data->node_count;
}
?>

db_rewrite_sql was also left out to keep the function from reacting and running more than once on multi language websites.

This article brought to you by the Hiveminds Magazine - Staff. Contact us if you want to post an article or announcement anonymously

Enter your Hiveminds Magazine username.

Enter the password that accompanies your username.

 

Newsletter

Get updates on Hiveminds services, articles and downloads by signing up for the newsletter.

Editor's choice

Some of the better articles, stories and tutorials found at Hiveminds.

Find more

Find more of Hiveminds articles, stories, tutorials and user comments by searching.




Picked links

Hand picked websites and articles from around the web that provide quality reading.