<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cruncht &#187; drupal</title>
	<atom:link href="http://cruncht.com/tag/drupal/feed/" rel="self" type="application/rss+xml" />
	<link>http://cruncht.com</link>
	<description>Semantic web development and publishing</description>
	<lastBuildDate>Sun, 21 Mar 2010 23:54:04 +0000</lastBuildDate>
	
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Drupal Performance Quick Reference (part 13)</title>
		<link>http://cruncht.com/103/drupal-performance-quick-reference</link>
		<comments>http://cruncht.com/103/drupal-performance-quick-reference#comments</comments>
		<pubDate>Mon, 08 Feb 2010 10:05:56 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=103</guid>
		<description><![CDATA[Time to revisit the different types of Drupal sites to see where gains can be made. What type of site do you have? This quick reference recaps the previous articles and lists the areas where different types of Drupal sites can improve performance.

All Sites

Get the best server for your budget and requirements.
Enable CSS and JS [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/103/drupal-performance-quick-reference" title="Drupal Performance Quick Reference (part 13)"><img src="http://cruncht.com/wp-content/uploads/2010/02/to_do_list-150x150.jpg" alt="To do list" class="feed-image" title="Drupal Performance checklist" /></a><p>Time to revisit the <a href='/83/drupal-server-profile'>different types of Drupal sites</a> to see where gains can be made. What type of site do you have? This quick reference recaps the previous articles and lists the areas where different types of Drupal sites can improve performance.</p>
<p><span id="more-103"></span></p>
<h2>All Sites</h2>
<ul>
<li>Get the best server for your budget and requirements.</li>
<li>Enable CSS and JS optimization in Drupal</li>
<li>Enable compression in Drupal</li>
<li>Enable Drupal page cache and consider Boost</li>
<li>Install APC if available</li>
<li>Ensure no slow queries from rogue modules</li>
<li>Tune MySQL for decent query cache and key buffer</li>
<li>Optimize file size where possible</li>
</ul>
<h2>Server: Low resources</h2>
<ul>
<li>Boost avoids loading PHP and the Drupal bootstrap</li>
<li>Sensible module selection</li>
<li>Avoid node load in views lists</li>
<li>Smaller JVMs possibly if running Solr</li>
<li>Nginx has a smaller footprint than Apache</li>
<li>mod_fcgid has a smaller footprint than mod_php</li>
</ul>
<h2>Server: Farm</h2>
<ul>
<li>Split off Solr</li>
<li>Split off DB server, watch the latency</li>
<li>With Cache Router select Memcache over APC for shared pools</li>
<li>Master + slaves for DB</li>
<li>Load balancing across web servers</li>
</ul>
<h2>Size: Many Nodes</h2>
<ul>
<li>Buy more RAM for database indexes</li>
<li>Index columns, especially for views</li>
<li>Thoroughly check slow queries</li>
<li>Warm up database</li>
<li>Swap in Solr for search</li>
<li>Solr to handle taxonomy pages</li>
</ul>
<h2>Activity: Many requests</h2>
<ul>
<li>Boost or</li>
<li>Pressflow and Varnish</li>
<li>Nginx over Apache</li>
<li>InnoDB on cache tables</li>
</ul>
<h2>Users: Mainly logged in</h2>
<ul>
<li>View/Block caching</li>
<li>CacheRouter (APC or Memcache)</li>
</ul>
<h2>Contention: Many Writes</h2>
<ul>
<li>InnoDB</li>
<li>Watchdog to file</li>
</ul>
<h2>Content: Heavy</h2>
<ul>
<li>Optimized files</li>
<li>Well positioned server</li>
<li>CDN</li>
</ul>
<h2>Functionality: Rich</h2>
<ul>
<li>Well behaved modules</li>
<li>Not too many modules</li>
<li>View/Block caching</li>
</ul>
<h2>Page browsing: Dispersed</h2>
<ul>
<li>Boost over Varnish if RAM is tight</li>
</ul>
<h2>Audience: Dispersed</h2>
<ul>
<li>CDN</li>
</ul>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/103/drupal-performance-quick-reference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal Page Rendering (part 12)</title>
		<link>http://cruncht.com/101/drupal-page-rendering</link>
		<comments>http://cruncht.com/101/drupal-page-rendering#comments</comments>
		<pubDate>Sun, 07 Feb 2010 10:04:30 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=101</guid>
		<description><![CDATA[The time for a page to render in a user&#8217;s browser is comprised of two factors. The first is the time it takes to build a page on the server. The second is the time it takes to send and render the page with all the contained components. This guide has mainly been concerned with [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/101/drupal-page-rendering" title="Drupal Page Rendering (part 12)"><img src="http://cruncht.com/wp-content/uploads/2010/02/slow-150x150.jpg" alt="Slow" class="feed-image" title="Drupal page rendering optimization" /></a><p>The time for a page to render in a user&#8217;s browser is made up of two factors. The first is the time it takes to build a page on the server. The second is the time it takes to send and render the page with all the contained components. This guide has mainly been concerned with the former &#8211; how to get the most from your server. However, it is estimated that 80% to 90% of the total time is taken up by the latter: sending and rendering the page.</p>
<p><span id="more-101"></span></p>
<p>It&#8217;s no good serving a cached page in the blink of an eye if there are countless included files which need to be requested and many large images which need to be transported across the globe. Optimizing page rendering time can make a noticeable difference to the user and is the icing on the cake of a well optimized site. It is therefore important to consider and optimize this final leg of the journey.</p>
<dl class='more'>
<dt><a href='http://wimleers.com/article/improving-drupals-page-loading-performance'>Improving Drupal&#8217;s page loading performance</a></dt>
<dd>Wim Leers covers all the bases on how to improve loading performance.</dd>
<dt><a href='http://www.amazon.com/High-Performance-Web-Sites-Essential/dp/0596529309'>High Performance Web Sites: Essential Knowledge for Front-End Engineers</a></dt>
<dd>Steve Souders, Chief Performance Yahoo! and author of the YSlow extension, covers the Yahoo recommendations in this book.</dd>
<dt><a href='http://www.amazon.com/Even-Faster-Web-Sites-Performance/dp/0596522304'>Even Faster Web Sites: Performance Best Practices for Web Developers</a></dt>
<dd>Another Steve Souders book covering JavaScript (AJAX), network (image compression, chunked encoding) and browser (CSS selectors, etc).</dd>
</dl>
<p>It is worthwhile reviewing <a href='http://developer.yahoo.com/performance/rules.html'>Yahoo&#8217;s YSlow recommendations</a> to see all of the optimizations which are possible. We cover selected areas where the default Drupal install can be improved upon.</p>
<h2><a id='network-requests'>Minimize HTTP Requests</a></h2>
<h3>Combined Files</h3>
<p>The <a href='/87/drupal-performance-out-of-the-box'>Out of The Box</a> section covered the inbuilt CSS and JS aggregation and file compression. The use of &#8220;combined files&#8221; is a significant factor in Drupal&#8217;s relatively good score in the YSlow tests. Make sure you have this enabled.</p>
<p class='summary'>All sites: Enable CSS and JS aggregation.</p>
<h3>CSS Sprites</h3>
<p>CSS image sprites are another method of cutting down the number of requests. This approach combines a number of smaller images into one large one, which is then selectively displayed to the user through the use of a background offset in CSS. It is a useful approach for things such as small icons, where each request carries a relatively large amount of HTTP overhead. Something for the theme designers to consider.</p>
<p class='summary'>Custom designs: Use CSS sprites if appropriate.</p>
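<p>As a minimal sketch of the background offset idea, the Python snippet below generates one CSS rule per icon for a sprite in which the icons are stacked vertically. The icon names, the 16 pixel height and the <code>sprite.png</code> file name are all hypothetical and chosen only for illustration; a sprite generator would normally produce these rules for you.</p>

```python
ICON_HEIGHT = 16  # pixels; hypothetical icon size

icons = ["rss", "comment", "tag"]  # hypothetical icon names


def sprite_rules(names, height=ICON_HEIGHT, image="sprite.png"):
    """Yield one CSS rule per icon stacked vertically in a single image."""
    for index, name in enumerate(names):
        # Each icon sits one icon-height further down the combined image,
        # so the background is shifted up by that amount to reveal it.
        offset = -index * height
        yield (
            f".icon-{name} {{ background: url({image}) "
            f"no-repeat 0 {offset}px; }}"
        )


for rule in sprite_rules(icons):
    print(rule)
```

<p>All three icons are then delivered by a single HTTP request for the combined image.</p>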
<dl class='more'>
<dt><a href='http://www.alistapart.com/articles/sprites'>CSS Sprites: Image Slicing’s Kiss of Death</a></dt>
<dd>Overview of how CSS sprites work and how they can be used.</dd>
<dt><a href='http://www.advomatic.com/blogs/jack-haas/lesson-usefulness-css-sprite-generators'>A lesson in the usefulness of CSS sprite generators</a></dt>
<dd>Covers commonly used sprite generators.</dd>
</dl>
<h2><a id='cdn'>Use a Content Delivery Network (CDN)</a></h2>
<p>This is the number two recommended best practice.</p>
<blockquote><p>A content delivery network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity. For example, the server with the fewest network hops or the server with the quickest response time is chosen.<br /><a href='http://developer.yahoo.com/performance/rules.html#cdn'>http://developer.yahoo.com/performance/rules.html#cdn</a></p></blockquote>
<p>Of all the CDN web services, <a href='http://www.simplecdn.com/'>SimpleCDN</a> seems to be getting positive press amongst Drupal folks as it is simple and cheap. It offers the &#8220;origin pull&#8221; Mirror Buckets service which will serve content for between 1.9 and 3.9 cents per GB. At this price you will probably be saving money on your bandwidth costs as well as serving content faster.</p>
<p>The <a href='http://drupal.org/project/cdn'>CDN integration module</a> is the recommended module to use for integration with content delivery networks as it supports &#8220;origin pull&#8221; as well as push methods. It supports content delivery for all CSS, JS, and image files (including ImageCache).</p>
<p class='summary'>High traffic, geographically dispersed: use CDN</p>
<dl class='more'>
<dt><a href='http://drupal.org/project/cdn'>CDN integration module</a></dt>
<dd>Wim Leers&#8217; fully featured module which integrates with a wide range of CDN servers.</dd>
<dt><a href='http://drupal.org/project/simplecdn'>SimpleCDN module</a></dt>
<dd>Simple CDN re-writes the URL of certain website elements (which can be extended using plugins) for use with a CDN Mirror service.</dd>
<dt><a href='http://wimleers.com/talk/drupalcon-dc-2009'>Drupal CDN integration: easier, more flexible and faster!</a></dt>
<dd>Slides covering advantages of CDNs and possible implementations.</dd>
<dt><a href='http://www.voxel.net/labs/mod_cdn'>mod_cdn</a></dt>
<dd>Apache2 module which shows some promise but not much info available for it with regards to Drupal.</dd>
<dt><a href='http://groups.drupal.org/node/47258'>Best Drupal CDN module?</a></dt>
<dd>Drupal Groups discussion.</dd>
</dl>
<p>On a related note, many sites can benefit from judicious placement of the server if traffic tends to come from one place and no CDN is being used. Sites based outside the US may find the proximity of a server hosted in their own region worth the extra cost of hosting.</p>
<h2><a id='expires-headers'>Add Expires Headers</a></h2>
<p>When a file is served by a web server an &#8220;Expires&#8221; header can be sent back to the client telling it that the content being sent will expire at a certain date in the future and that the content may be cached until that time. This speeds up page rendering because the client doesn&#8217;t have to send a GET request to see if the file has been modified.</p>
<p>By default the .htaccess file in the root of Drupal contains rules which set a two week expiry for all files (CSS, JS, PNG, JPG, GIF) except for HTML pages, which are considered to be dynamic and therefore not cacheable.</p>
<p><code><br />
# Requires mod_expires to be enabled.<br />
&lt;IfModule mod_expires.c&gt;<br />
  # Enable expirations.<br />
  ExpiresActive On<br />
  # Cache all files for 2 weeks after access (A).<br />
  ExpiresDefault A1209600<br />
  # Do not cache dynamically generated pages.<br />
  ExpiresByType text/html A1<br />
&lt;/IfModule&gt;<br />
</code></p>
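<p>To make the <code>A1209600</code> value concrete, the following Python sketch computes the <code>Expires</code> date that corresponds to a file accessed at a given moment. The function name <code>expires_header</code> and the sample access time are assumptions made for this example only; in practice mod_expires does the equivalent calculation inside Apache.</p>

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

TWO_WEEKS = 1209600  # seconds, the "A1209600" value from .htaccess


def expires_header(access_time, seconds=TWO_WEEKS):
    """Return an HTTP-date string `seconds` after the access time."""
    expiry = access_time + timedelta(seconds=seconds)
    # usegmt=True emits the "GMT" suffix used by HTTP dates;
    # it requires a timezone-aware datetime in UTC.
    return format_datetime(expiry, usegmt=True)


# Hypothetical access time, used only for illustration.
accessed = datetime(2010, 2, 7, 10, 4, 30, tzinfo=timezone.utc)
print(expires_header(accessed))
```

<p>Until that date a browser may reuse its cached copy without contacting the server at all.</p>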
<p>The Expires header will not be generated unless you have mod_expires enabled in Apache. To make sure it is enabled in Apache2 run the following as admin.</p>
<p><code><br />
# a2enmod expires<br />
# /etc/init.d/apache2 restart<br />
</code></p>
<p>Ensuring this is enabled will raise your YSlow score by about 10 points.</p>
<p class='summary'>All sites: Configure Apache correctly for fewer requests.</p>
<h2><a id='gzip'>Gzip components</a></h2>
<p>You can Gzip by enabling compression in the performance area of admin. Alternatively you could configure Apache to do it.</p>
<p class='summary'>All Sites: Enable Gzip compression</p>
<h2><a id='optimize-images'>Optimize Images</a></h2>
<p>Binary files such as images, audio and video are usually already compressed and so do not shrink significantly after Gzip compression. Gains can be made by ensuring that such rich media are (i) targeted for the correct display resolution and (ii) have an appropriate amount of lossy compression applied. Since these files will generally only be downloaded once they do not benefit from caching in the client and so care must be taken to ensure that they are as small as reasonably possible.</p>
<p class='summary'>All Sites: Compress binary files</p>
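<p>The difference between text and binary content is easy to demonstrate. The Python sketch below gzips a block of repetitive text and an equally sized block of random bytes. The random bytes are only a stand-in for an already compressed image, and the sample text is made up for the example.</p>

```python
import gzip
import os

text = b"Drupal performance and scalability. " * 500
binary = os.urandom(len(text))  # stands in for an already compressed image


def gzip_ratio(data):
    """Compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)


text_ratio = gzip_ratio(text)
binary_ratio = gzip_ratio(binary)
print(f"text: {text_ratio:.2f}  binary: {binary_ratio:.2f}")
```

<p>The text shrinks to a small fraction of its size while the random data does not shrink at all, which is why images need to be optimized before they are uploaded rather than at delivery time.</p>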
<dl class='more'>
<dt><a href='http://pmt.sourceforge.net/pngcrush/'>Pngcrush</a></dt>
<dd>Pngcrush is an optimizer for PNG (Portable Network Graphics) files. It can be run from a commandline in an MSDOS window, or from a UNIX or LINUX commandline.</dd>
</dl>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/101/drupal-page-rendering/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Drupal Benchmarking (part 11)</title>
		<link>http://cruncht.com/99/drupal-benchmarking</link>
		<comments>http://cruncht.com/99/drupal-benchmarking#comments</comments>
		<pubDate>Sat, 06 Feb 2010 10:03:15 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=99</guid>
		<description><![CDATA[Benchmarking a system is a reliable way to compare one setup with another and is particularly helpful when comparing different server configurations. We cover a few simple ways to benchmark a Drupal website.

A performant system is not just one which is fast for a single request. You also need to consider how the system performs [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/99/drupal-benchmarking" title="Drupal Benchmarking (part 11)"><img src="http://cruncht.com/wp-content/uploads/2010/02/tape-150x150.jpg" alt="Blue Tape" class="feed-image" title="Drupal Performance benchmarking" /></a><p>Benchmarking a system is a reliable way to compare one setup with another and is particularly helpful when comparing different server configurations. We cover a few simple ways to benchmark a Drupal website.</p>
<p><span id="more-99"></span></p>
<p>A performant system is not just one which is fast for a single request. You also need to consider how the system performs under stress (many requests) and how stable the system is (memory). Benchmarking with tools such as ab allows you to stress the server with many concurrent requests to replicate traffic when a site is being slashdotted. With a more customised setup these tools can also be used in more sophisticated ways to mimic traffic across a whole site.</p>
<dl class="more">
<dt><a href="http://drupal.org/node/79237">Benchmarking and profiling Drupal</a></dt>
<dd>Documentation which covers tools of the trade including Apache Bench (ab) and SIEGE.</dd>
</dl>
<h2><a id="apache-bench">Apache Bench (ab)</a></h2>
<p><a href="http://httpd.apache.org/docs/trunk/programs/ab.html">ab</a> is the most commonly used benchmarking tool in the community. It shows you how many requests per second your site is capable of serving. Concurrency can be set to 1 to get end to end speed results or increased to get a more realistic load for your site. Look to the &#8220;failed requests&#8221; and &#8220;requests per second&#8221; results.</p>
<p>In order to test the speed of a single page, turn off page caching and run ab with concurrency of one to get a baseline.</p>
<p><code>ab -n 1000 -c 1 http://drupal6/node/1</code></p>
<p>To check scalability turn on the page cache and ramp up concurrent connections (10 to 50) to see how much the server can handle. You should also make sure keep-alives are turned on (-k) as this leads to a more realistic result for a typical web browser. At higher concurrency levels making new connections can be a bottleneck. Also, set compression headers (-H) as most clients will support this feature.</p>
<p><code>ab -n 1000 -c 10 -k -H 'Accept-Encoding: gzip,deflate' http://drupal6/node/1</code></p>
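<p>When comparing many runs it can help to pull the two headline figures out of the ab report automatically. The Python sketch below does this for the &#8220;Failed requests&#8221; and &#8220;Requests per second&#8221; lines. The <code>SAMPLE</code> text is an illustrative fragment with made up numbers, not a measured result, and the function name <code>summarize</code> is hypothetical.</p>

```python
import re

# Illustrative fragment of an ab report; the numbers are made up.
SAMPLE = """\
Concurrency Level:      10
Complete requests:      1000
Failed requests:        0
Requests per second:    512.34 [#/sec] (mean)
"""


def summarize(ab_output):
    """Pull the two headline figures out of ab's text report."""
    failed = re.search(r"^Failed requests:\s+(\d+)", ab_output, re.M)
    rps = re.search(r"^Requests per second:\s+([\d.]+)", ab_output, re.M)
    return {
        "failed": int(failed.group(1)),
        "requests_per_second": float(rps.group(1)),
    }


print(summarize(SAMPLE))
```

<p>In real use the report text would come from capturing the output of one of the ab commands shown above.</p>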
<dl class="more">
<dt><a href="http://drupal.org/node/282862">Drupal Performance Measurement &amp; Benchmarking</a></dt>
<dd>Testing with ab and simple changes you can make within Drupal.</dd>
<dt><a href="http://drupaleasy.com/blogs/ryanprice/2009/04/drupal-performance-testing-apache-benchmark">On Drupal Performance: Testing with Apache Benchmark</a></dt>
<dd>Covers server side tools and walks through ab options and use.</dd>
<dt><a href="http://ezra-g.com/blog/20080229/benchmarking-authenticated-drupal-users-with-apachebench">Benchmarking Authenticated Drupal Users with ApacheBench</a></dt>
<dd>Demonstrates how to pull out current session id and how to pass that to ab so that authenticated users can be tested.</dd>
<dt><a href="http://groups.drupal.org/node/26485">Has anyone tried nginx caching with Drupal?</a></dt>
<dd>Illustrative discussion where different Drupal setups are benchmarked with ab.</dd>
</dl>
<h2><a id="jmeter">JMeter</a></h2>
<p><a href="http://jakarta.apache.org/jmeter/">JMeter</a> is a Java desktop app designed to test function and performance. It is the preferred testing tool of many administrators.</p>
<dl class="more">
<dt><a href="http://github.com/jacobSingh/Drupal-Performance-Testing-Suite">Drupal-Performance-Testing-Suite</a></dt>
<dd>Perl script which runs a JMeter test on Drupal and provides graphs.</dd>
<dt><a href="http://groups.drupal.org/node/53963">jmeter scripts to test server performance</a></dt>
<dd>Some scripts to get you started testing with JMeter.</dd>
</dl>
<h2><a id="benchmarking-thoughts">Thoughts on benchmarking</a></h2>
<p>Benchmarking is essential if you wish to have an objective comparison between different setups. However, it is not the final measurement with regards to performance. Remember that <a href="#page-rendering">page rendering</a> times are what are important for users and that too needs to be optimized. Also, benchmarks tend to be artificial in the sense that they often measure unrealistic situations. Will all of your requests be for one anonymous page only? Maybe in the Slashdot situation but there are other considerations obviously. Finally, it is easy to focus intently on the number, especially when it comes to caching scores, and forget that minor differences may not make so much of a difference to real life scenarios. Don&#8217;t forget the logged in user.</p>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/99/drupal-benchmarking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Custom Drupal Distributions (part 10)</title>
		<link>http://cruncht.com/97/custom-drupal-distributions</link>
		<comments>http://cruncht.com/97/custom-drupal-distributions#comments</comments>
		<pubDate>Fri, 05 Feb 2010 10:02:43 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[mercury]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[pressflow]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=97</guid>
		<description><![CDATA[There are a couple of projects which have made it easy to achieve performance gains by making slight amendments to core or packaging up the code in a helpful manner. Pressflow makes it possible to use a reverse proxy such as Varnish, amongst other things. Mercury packages Pressflow up as an Amazon EC2 image. The [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/97/custom-drupal-distributions" title="Custom Drupal Distributions (part 10)"><img src="http://cruncht.com/wp-content/uploads/2010/02/sharing-150x150.jpg" alt="Sharing" class="feed-image" title="Custom Drupal Distributions" /></a><p>There are a couple of projects which have made it easy to achieve performance gains by making slight amendments to core or packaging up the code in a helpful manner. Pressflow makes it possible to use a reverse proxy such as Varnish, amongst other things. Mercury packages Pressflow up as an Amazon EC2 image. The development of both these projects is the sign of a maturing ecosystem where serious deployments can easily be rolled out.</p>
<p><span id="more-97"></span></p>
<h2><a id='pressflow'>Pressflow</a></h2>
<p><a href='https://launchpad.net/pressflow'>Pressflow</a> is a distribution which attempts to bring many of the improvements discussed above (SQL improvements, Varnish) into a single package. Pressflow is a standard Drupal install which has had its core modified to fix bottlenecks and facilitate the use of advanced caching features. FourKitchens don&#8217;t regard Pressflow as a fork since many of the initiatives found in Pressflow are contributed back into the development of the head of Drupal.</p>
<p>So long as you haven&#8217;t hacked core yourself then using Pressflow is a simple matter of swapping out core drupal and replacing it with Pressflow.</p>
<p>In a nutshell Pressflow allows the following:</p>
<ul>
<li>Support for database replication</li>
<li>Support for Squid and Varnish reverse proxy caching</li>
<li>Optimization for MySQL</li>
<li>Optimization for PHP 5</li>
</ul>
<p class='summary'>High Traffic, Varnish required: Easy setup.</p>
<dl class='more'>
<dt><a href='http://fourkitchens.com/pressflow-makes-drupal-scale'>Pressflow makes Drupal scale</a></dt>
<dd>Announcement covering the advantages and features of Pressflow.</dd>
</dl>
<h2><a id='project-mercury'>Project Mercury</a></h2>
<p><a href='http://www.chapterthree.com/blog/josh_koenig/project_mercury_preconfigured_drupalvarnish_ec2_ami'>Project Mercury</a> is an innovative project from <a href=''>Chapter Three</a> which wraps up a tricked out PressFlow installation into a preconfigured Amazon Machine Image (AMI) for use on Amazon EC2 instances.</p>
<dl class='more'>
<dt><a href='http://www.chapterthree.com/blog/zack_rosen/pantheon_project_blazes_ahead'>The Pantheon Project Blazes Ahead</a></dt>
<dd>Hot off the press: Mercury will also be available for deployment on other servers, not just EC2. Further, there will be a Mercury On Demand service at Rackspace.</dd>
</dl>
<blockquote><p>The goal of this project is to make Drupal as fast as possible for as many people as possible. To that end, we are developing a pre-built Amazon Machine Image (AMI) which will allow anyone with an Amazon Web Services account to spin up an EC2 instance and see how all this works in real-time. The ultimate goal is a production-ready release that can be used for deploying real websites.</p></blockquote>
<p>If you want to get started using the image all you need to do is sign up to Amazon Web Services and then start up an instance of your choosing. You are now in control of a fully configured, scalable server. This sounds easy in principle; however, if you are considering going down this path there are a few considerations:</p>
<ul>
<li>Amazon is not the cheapest provider of bandwidth, RAM and storage. Other virtual servers have better deals. You are paying for the ability to spawn servers on the fly.</li>
<li>Persistent storage is an issue which needs to be overcome and managed if scaling out your web server.</li>
<li>There is a bit of a learning curve with some of the tricks of the trade when managing the servers.</li>
</ul>
<p>Project Mercury and EC2 is a worthy combination if you really need the ability to serve massive amounts of traffic and also have the ability to temporarily scale out during peak times.</p>
<dl class='more'>
<dt><a href='http://groups.drupal.org/node/25617'>Project Mercury Benchmarks: 2000+ Requests Per Second!</a></dt>
<dd>Drupal is fast with APC + Page Cache. It is very fast with PressFlow and Varnish. NB. It would have been interesting to see how Boost went against Varnish for this test.</dd>
</dl>
<p>The configuration chosen by the project is interesting because it shows how other sites might go about setting up a scalable server. In brief the setup is as follows:</p>
<ul>
<li>Ubuntu 32 or 64 bit</li>
<li>Pressflow</li>
<li>APC for opcode cache</li>
<li>CacheRouter and APC/Memcached for non-SQL caches</li>
<li>Varnish as reverse proxy</li>
</ul>
<p class='summary'>High Traffic, Varnish required, EC2 required: Easy setup considering.</p>
<dl class='more'>
<dt><a href='http://groups.drupal.org/node/25425'>Step-by-step: Setting up Varnish, Apache, APC and Solr Project Mercury Style</a></dt>
<dd>Step by step instructions for setting up Project Mercury on an Ubuntu server. Very helpful for admins wishing to install manually on their own server.</dd>
</dl>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/97/custom-drupal-distributions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal Caching (part 9)</title>
		<link>http://cruncht.com/95/drupal-caching</link>
		<comments>http://cruncht.com/95/drupal-caching#comments</comments>
		<pubDate>Thu, 04 Feb 2010 10:02:08 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=95</guid>
		<description><![CDATA[Several advanced options exist to take caching to the next level in Drupal. With advanced caching Drupal is able to scale to levels required by the most demanding of sites.

We have already discussed (i) page  and block caching and (ii) other caches which come baked into the core of Drupal in the Out of [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/95/drupal-caching" title="Drupal Caching (part 9)"><img src="http://cruncht.com/wp-content/uploads/2010/02/squirrel-150x150.jpg" alt="Red Squirrel" class="feed-image" title="Drupal caching" /></a><p>Several advanced options exist to take caching to the next level in Drupal. With advanced caching Drupal is able to scale to levels required by the most demanding of sites.</p>
<p><span id="more-95"></span></p>
<p>We have already discussed (i) page and block caching and (ii) other caches which come baked into the core of Drupal in the <a href="/87/drupal-performance-out-of-the-box">Out of the Box</a> section. Whilst big gains can be made through enabling simple config options, it is possible to make several more improvements to the way Drupal caches and serves data, making Drupal a system which can scale to serve 1000s of requests a second if the need arises.</p>
<h2><a id="page-caching">Page Caching</a></h2>
<h3><a id="drupal-bootstrap">The Bootstrap</a></h3>
<p>Every time Drupal code needs to be run, either through a web page or a script, Drupal must undertake the bootstrap process which has a certain amount of overhead. Generally this process cannot be avoided &#8211; you need to run code after all. However, there are a couple of cases where it is possible. Firstly, Normal Page caching avoids much of the bootstrap save for hook_boot() and hook_exit() hooks which are used by the statistics and throttle modules. Secondly, Aggressive Page Caching allows for all of the bootstrap to be avoided when serving content. Finally, contributed modules such as Boost and Varnish allow for the avoidance of PHP and bootstrap since they are operating at a stage before PHP is invoked. This is the main reason Boost and Varnish are so attractive as a page cache.</p>
<h3><a id="boost">Boost</a></h3>
<p><a href="http://drupal.org/project/boost">Boost</a> is a page cache which works in a similar way to aggressive page caching in that it attempts to serve cached data without running the bootstrap. However, it is able to go one step further and avoid PHP being run as the redirect takes place in rewrite rules located in .htaccess. If the static file exists then it is served as the response. The end result is super fast response without running PHP or the bootstrap. This frees up the server to handle more requests from logged in users.</p>
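<p>The decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Boost&#8217;s implementation: Boost does this work in Apache rewrite rules, and the cache directory layout, the <code>.html</code> suffix, the function names and the rule that only anonymous requests are eligible are all assumptions made for the example.</p>

```python
import os
import tempfile


def cached_file_for(cache_root, host, path):
    """Map a request to the static file a Boost-like cache might hold."""
    name = path.strip("/") or "index"
    return os.path.join(cache_root, host, name + ".html")


def serve(cache_root, host, path, is_anonymous):
    """Return cached HTML if present, else None to fall through to PHP."""
    if not is_anonymous:
        return None  # assumed: logged in users always get a dynamic page
    candidate = cached_file_for(cache_root, host, path)
    if os.path.isfile(candidate):
        with open(candidate, encoding="utf-8") as handle:
            return handle.read()
    return None


# Build a throwaway cache holding one page, then look up two requests.
root = tempfile.mkdtemp()
target = cached_file_for(root, "example.com", "/node/1")
os.makedirs(os.path.dirname(target))
with open(target, "w", encoding="utf-8") as handle:
    handle.write("cached page")

print(serve(root, "example.com", "/node/1", is_anonymous=True))
print(serve(root, "example.com", "/node/2", is_anonymous=True))
```

<p>A return value of <code>None</code> corresponds to the request being passed on to PHP and the full Drupal bootstrap.</p>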
<p>Boost is an attractive module because it is easy to install and has many configuration options which give good control over cache building and invalidation. It is highly recommended.</p>
<p>One aspect of Boost which may be forgotten is that it isn&#8217;t entirely a file based solution. If the operating system file cache is working well then there is a good chance that the &#8220;file&#8221; will come out of RAM rather than off disk. This makes Boost a competitive option when compared to other more complicated reverse proxy setups such as Squid or Varnish.</p>
<p>Because Boost is file based it is able to potentially cache a lot more data than a RAM based solution. If you have thousands or millions of pages to cache putting them all in RAM is probably not optimal. Best to put them on disk and save the RAM for DB indexes or possibly a CacheRouter cache.</p>
<p class="summary">All sites, especially big, infrequently changing, high traffic: Big gains. Easy setup.</p>
<h3><a id="varnish">Varnish</a></h3>
<p>The <a href="http://drupal.org/project/varnish">Varnish HTTP Accelerator Integration</a> module integrates Drupal with <a href="http://varnish-cache.org/">Varnish</a>, a reverse proxy which sits in front of Apache, PHP and Drupal. Varnish stores cached content in RAM and avoids the overhead of Apache and the Drupal bootstrap. As such it offers very high performance for anonymous users on cached pages and is the preferred option for many sites where scaling is paramount.</p>
<p>Varnish requires either a patch to core to add HTTP headers, PressFlow or Drupal 7. Most people are therefore running Varnish in conjunction with PressFlow.</p>
<p class="summary">High Traffic: Big gains when performance critical.</p>
<h3><a id="squid">Squid</a></h3>
<p>Squid is a reverse proxy similar to Varnish and is in use on drupal.org. It doesn&#8217;t seem to be as commonly deployed, probably because PressFlow has been altered to work with Varnish, which is reportedly the higher performing of the two.</p>
<p class="summary">Varnish preferred over Squid.</p>
<h3><a id="varnish-vs-boost">Varnish vs Boost</a></h3>
<p>A popular thread in the <a href="http://groups.drupal.org/high-performance">High Performance</a> Drupal group asks for the <a href="http://groups.drupal.org/node/46042">perfect recipe for page caching</a> &#8211; whether to use Varnish or Boost. Varnish has the edge in speed over Apache+Boost as well as Nginx+Boost, according to the <a href="http://groups.drupal.org/node/45514#comment-119868">results</a> published by <a href="http://groups.drupal.org/user/16022">brianmercer</a>.</p>
<table style="height: 88px;" width="417">
<tbody>
<tr>
<th style="text-align: left;">Setup</th>
<th style="text-align: left;">Approx requests/s</th>
</tr>
<tr>
<td>Boost with Apache-prefork</td>
<td>500</td>
</tr>
<tr>
<td>Boost with Nginx</td>
<td>2000</td>
</tr>
<tr>
<td>Varnish</td>
<td>2400</td>
</tr>
</tbody>
</table>
<p>Varnish may have the edge in speed but it is more complex to install and requires a patched core. Some people may not want to run a non standard Drupal installation. Varnish also requires RAM to store the cached material &#8211; something which may be better spent elsewhere (database or max clients). Boost offers similar performance, is easy to install and has good control of cache invalidation and warmup. Boost is also able to serve files from RAM if the OS has cached them.</p>
<p>It is certainly up to you to decide which avenue to take. This guide is attracted to the relative simplicity of Drupal + Nginx + Boost over Pressflow + Varnish + Apache/Nginx. Nginx brings better performance to the web server as a whole, ie. pages not in the cache, and RAM savings if it is needed elsewhere. It must be stressed that each site has a different profile and the ultimate decision is up to you.</p>
<dl class="more">
<dt><a href="http://groups.drupal.org/node/26485">Has anyone tried nginx caching with Drupal?</a></dt>
<dd>Drupal Groups thread with lots of benchmarking. Final conclusion seems to be that Varnish vs Boost+Nginx is a pretty close thing.</dd>
<dt><a href="http://www.chapterthree.com/blog/josh_koenig/project_mercury_preconfigured_drupalvarnish_ec2_ami">Project Mercury: A pre-configured Drupal+Varnish EC2 AMI</a></dt>
<dd>Josh Koenig of Chapter Three claims that Varnish is faster than Boost but gives no numbers.</dd>
<dt><a href="http://www.metaltoad.com/blog/quick-drupal-cacherouter-and-boost-benchmarks">Quick Drupal Cacherouter and Boost benchmarks</a></dt>
<dd>Dylan Tack likes Boost: &#8220;Response times are all close enough that it doesn&#8217;t really matter what caching backend you choose&#8230; The only factor that&#8217;s really relevant is how good your system&#8217;s cache expiration and regeneration logic is&#8230; it seems like Boost is the clear winner here as well.&#8221;</dd>
<dt><a href="http://groups.drupal.org/node/46042">What&#8217; recipe should i choose for best performance?</a></dt>
<dd>Discussion with participants split between Varnish and Boost depending on circumstances. Nginx+Boost seems pretty equal with Varnish.</dd>
<dt><a href="http://groups.drupal.org/node/21897">Caching: Modules that make Drupal scale</a></dt>
<dd>Table of modules, performance gains and features.</dd>
</dl>
<h2><a id="cache-router">Cache Router</a></h2>
<p><a href="http://drupal.org/project/cacherouter">Cache Router</a> is a module which enables you to cache the Drupal cache tables (including views, blocks, menus, variables, and filters) in RAM. Drupal no longer has to hit the database to pull out this content &#8211; a big win for logged in users who might not be able to enjoy the advantages of a page cache. Cache Router therefore fills an important niche in your caching strategy.</p>
<p>Cache Router is able to do this via a number of backends including APC or Memcache. APC should be considered if you are running Drupal on a single node and only need the single local store. <a href="http://www.chapterthree.com/blog/josh_koenig/project_mercury_preconfigured_drupalvarnish_ec2_ami">According to Josh Koenig</a>, APC &#8220;is less error-prone, more secure, and allegedly as fast (if not faster) than running memcached according to the folks at Facebook.&#8221; Memcache can be used to share cached data between or across multiple servers.</p>
<p class="summary">High Traffic, Logged in: Massive gains when page cache not hit.</p>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/95/drupal-caching/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal Troubleshooting (part 8)</title>
		<link>http://cruncht.com/93/drupal-troubleshooting</link>
		<comments>http://cruncht.com/93/drupal-troubleshooting#comments</comments>
		<pubDate>Wed, 03 Feb 2010 10:01:41 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=93</guid>
		<description><![CDATA[Sometimes your installation will be slow for no apparent reason. Time to grab your toolkit and get started on finding and eliminating the problem. Here are some of the common problems faced by developers.

Database Queries
The database is generally the first place you look when trying to identify problems on a page. It is possible to [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/93/drupal-troubleshooting" title="Drupal Troubleshooting (part 8)"><img src="http://cruncht.com/wp-content/uploads/2010/02/flat_tire-150x150.jpg" alt="Flat Tire" class="feed-image" title="Debugging Drupal" /></a><p>Sometimes your installation will be slow for no apparent reason. Time to grab your toolkit and get started on finding and eliminating the problem. Here are some of the common problems faced by developers.</p>
<p><span id="more-93"></span></p>
<h2><a id='database-queries'>Database Queries</a></h2>
<p>The database is generally the first place you look when trying to identify problems on a page. It is possible to identify problems through MySQL&#8217;s slow query log or through the query log of the Devel module.</p>
<dl class='more'>
<dt><a href='http://www.amazon.com/High-Performance-MySQL-Optimization-Replication/dp/0596101716/ref=sr_1_1?ie=UTF8&#038;s=books&#038;qid=1264569847&#038;sr=1-1'>High Performance MySQL: Optimization, Backups, Replication, and More</a></dt>
<dd>Must have book for anyone serious about getting the most from MySQL.</dd>
</dl>
<h3><a id='database-indexes'>Views and indexes</a></h3>
<p>The Views module makes building queries easy and sometimes a crucial part of the query will rely on a CCK column with no index. This tends to happen with sorting or filtering. Queries can run slow in these cases. The solution is to place an index on the column in question.</p>
<p class='summary'>Large Sites: CCK columns need indexes</p>
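<p>The effect of indexing a filtered column can be sketched as follows. This is only an illustration: SQLite (via Python&#8217;s standard library) stands in for MySQL so that the example is runnable anywhere, and the table <code>content_type_book</code>, the column <code>field_year_value</code> and the index name are hypothetical. MySQL&#8217;s own EXPLAIN output looks different, but the principle of a full scan turning into an index lookup is the same.</p>

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")

# Hypothetical CCK-style table; all names are illustrative only.
conn.execute(
    "CREATE TABLE content_type_book ("
    "nid INTEGER PRIMARY KEY, "
    "field_year_value INTEGER, "
    "field_isbn_value TEXT)"
)

# The kind of filter a View might generate on a CCK column.
sql = "SELECT nid FROM content_type_book WHERE field_year_value = 1999"

# Without an index the whole table has to be scanned.
plan_before = query_plan(conn, sql)

# Add an index on the filtered column and look at the plan again.
conn.execute(
    "CREATE INDEX idx_book_year ON content_type_book (field_year_value)"
)
plan_after = query_plan(conn, sql)

print("before:", plan_before)
print("after: ", plan_after)
```

<p>On a real MySQL installation the equivalent step would be an ALTER TABLE or CREATE INDEX statement on the CCK table, followed by an EXPLAIN of the slow query to confirm that the index is picked up.</p>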
<h3><a id='database-joinst'>Views and Left Joins</a></h3>
<p>The Views module will often design queries with LEFT JOIN rather than INNER JOIN, especially when joining from the node table to a content type CCK table. In many cases you might only want an inner join, especially when the node table is very big. In these cases it is possible to rewrite the query by hacking the query in hook_views_pre_execute.</p>
<p class='summary'>Large Sites: Some View SQL inefficient</p>
<dl class='more'>
<dt><a href='http://drupal.org/node/372994'>Ability to INNER JOIN to node for a specific field</a></dt>
<dd>Discussion of Views SQL for joining between node and content type tables.</dd>
</dl>
<h3><a id='database-schema'>Inefficient schema</a></h3>
<p>The database design of some modules could possibly be improved. It is up to you the developer/administrator to ensure that you are happy with the internal design of contributed modules. If you find an inefficiency then submit an issue to the module, provide a patch or remove the module from your site.</p>
<h3><a id='database-composite-index'>Composite indexes</a></h3>
<p>MySQL is limited to using one index per table when sorting and filtering. This can make it tricky when you wish to use multiple AND clauses or involve a sort and a filter at the same time. In these cases adding a composite index can get you out of trouble.</p>
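<p>A minimal sketch of the idea, again using SQLite in place of MySQL purely so the example runs without a server. The simplified <code>node</code> table and the index names are assumptions made for illustration. With an index on the filter column alone the engine still needs a separate sort step, whereas a composite index on the filter column followed by the sort column serves both at once.</p>

```python
import sqlite3

# A filter on one column combined with a sort on another.
QUERY = "SELECT nid, title FROM node WHERE type = 'book' ORDER BY created"


def plan_with_index(index_ddl):
    """Build a fresh in-memory table with one index and return the query plan."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for a node table; not the real Drupal schema.
    conn.execute(
        "CREATE TABLE node ("
        "nid INTEGER PRIMARY KEY, type TEXT, created INTEGER, title TEXT)"
    )
    conn.execute(index_ddl)
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return [row[3] for row in rows]


# Index on the filter column only: rows are found quickly but must be sorted.
single = plan_with_index("CREATE INDEX idx_type ON node (type)")

# Composite index: filter column first, then the sort column.
composite = plan_with_index(
    "CREATE INDEX idx_type_created ON node (type, created)"
)

print("single:   ", single)
print("composite:", composite)
```

<p>The column order in the composite index matters: the column used for equality filtering comes first and the sort column second. In MySQL the thing to look for is whether &#8220;Using filesort&#8221; disappears from the EXPLAIN output once the composite index is in place.</p>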
<h3><a id='problems-core'>Problems in Core</a></h3>
<p>In a number of cases there are problems in core of Drupal where queries are very slow on very big Drupal installations with millions of nodes. The author has experienced problems with:</p>
<ul>
<li>Inability to browse content in Admin section  due to join from member to user table</li>
<li>Inability to edit nodes with many CCK fields. Massive RAM usage when loading nodes.</li>
<li>Taxonomy pages take a very long time to display</li>
<li>Search system unable to index content</li>
</ul>
<p>These problems can be avoided partly by swapping Solr in for search and having it override the taxonomy pages. The other problems you just have to live with <img src='http://cruncht.com/wp-includes/images/smilies/icon_neutral.gif' alt=':-|' class='wp-smiley' /></p>
<dl class='more'>
<dt><a href='http://wtanaka.com/drupal/million-nodes-6'>Drupal with millions of nodes</a></dt>
<dd>Some good research from Wesley Tanaka into problems with many nodes. Attitude from some here seems a little dismissive of the issues.</dd>
</dl>
<h2><a id='slow-modules'>Slow Modules</a></h2>
<p>Modules can also be badly written, leading to poor performance in certain circumstances. Generally this happens when the module&#8217;s creator did not test the module against (i) large installations with many nodes or (ii) complex installations with heavy nodes or detailed taxonomies. You generally will only run into these problems if your site is large. In some cases you can fix the module by creating more indexes in the DB. In others you just have to remove the module from your installation&#8230; or submit a patch.</p>
<p>Some known offenders for large sites include:</p>
<ul>
<li>XML Sitemap</li>
<li>Node access</li>
<li>Taxonomy Browser</li>
<li>Fivestar</li>
</ul>
<dl class='more'>
<dt><a href='http://2bits.com/articles/how-drupals-nodeaccess-table-can-negatively-impact-site-performance.html'>How Drupal&#8217;s node_access table can negatively impact site performance</a></dt>
<dd></dd>
<dt><a href='http://2bits.com/articles/scalability-taxonomy-browser-module-restricting-number-terms.html'>Scalability of the Taxonomy Browser module: Restricting number of terms</a></dt>
<dd>&#8220;Query from hell&#8221; with many joins leading to queries which never finish.</dd>
<dt><a href='http://2bits.com/articles/xml-sitemap-6x-2x-how-drupal-modules-can-overload-site-during-cron-solutions.html'>XML Sitemap 6.x-2.x: How Drupal modules can overload a site during cron, with solutions</a></dt>
<dd>XML Sitemap module needs to be configured correctly</dd>
</dl>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/93/drupal-troubleshooting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal Implementation Decisions (part 7)</title>
		<link>http://cruncht.com/91/drupal-implementation-decisions</link>
		<comments>http://cruncht.com/91/drupal-implementation-decisions#comments</comments>
		<pubDate>Tue, 02 Feb 2010 10:01:08 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=91</guid>
		<description><![CDATA[Bad performance can be down to decisions you have made during development because you haven&#8217;t been aware of natural limitations of the system or the way Drupal works. The more you poke about the more you will understand. Here&#8217;s a grab bag of things which may bite you.

Search
The search which comes inbuilt with Drupal has [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/91/drupal-implementation-decisions" title="Drupal Implementation Decisions (part 7)"><img src="http://cruncht.com/wp-content/uploads/2010/02/bail-150x150.jpg" alt="Bail" class="feed-image" title="Drupal implementation mistakes" /></a><p>Bad performance can be down to decisions you have made during development because you haven&#8217;t been aware of natural limitations of the system or the way Drupal works. The more you poke about the more you will understand. Here&#8217;s a grab bag of things which may bite you.</p>
<p><span id="more-91"></span></p>
<h2><a id='search'>Search</a></h2>
<p>The search which comes inbuilt with Drupal has long been regarded as unsatisfactory for larger sites:</p>
<ul>
<li>failure to index large sites efficiently</li>
<li>slow at returning results</li>
<li>relatively limited feature set compared to dedicated search solutions</li>
</ul>
<p>Part of the problem lies with the fact that standard relational databases such as MySQL and interpreted languages such as PHP are not well suited to handle the large indexes and filtering required for big datasets. Search can also be an intensive process if the corpus is large and the traffic high. Being able to move search off the main box will give more resources to Drupal to do other things. Both of the following solutions can help solve these issues.</p>
<h3 id='solr'>Solr</h3>
<p>Enter the Apache Lucene project and Apache Solr.</p>
<dl class='more'>
<dt><a href='http://lucene.apache.org/'>Welcome to Lucene!</a></dt>
<dd>Lucene &#8220;provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.&#8221;</dd>
<dt><a href='http://lucene.apache.org/solr/'>Welcome to Solr</a></dt>
<dd>&#8220;Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world&#8217;s largest internet sites&#8221;</dd>
</dl>
<p>The <a href='http://drupal.org/project/apachesolr'>Apache Solr Search Integration</a> module integrates Solr with Drupal. Installation requires a setup of a JVM and a container such as Tomcat or Jetty &#8211; something which is a little outside the usual LAMP configuration. Installation is not so difficult with some good documentation available. If this is beyond you then it may be worth checking out the <a href='http://acquia.com/products-services/acquia-search'>Acquia Search</a> hosted solution.</p>
<p>The beauty of Solr is that it is able to index massive datasets, millions if not 100s of millions of nodes, and return results surprisingly quickly. The faceted search (taxonomy, language, CCK, Author, Content Type) is its main selling point which will be sure to impress users. It really is a product which will take your site to the next level if search is important to you. More importantly Solr solves crippling scaling problems with core search as well as viewing taxonomy terms with many nodes.</p>
<p>Check out the following search for <a href="http://uriverse.com/search/apachesolr_search/madonna?filters=language%3Aen%20type%3Adwrk">&#8220;Madonna&#8221; filtered by &#8220;Artistic Works&#8221; type and &#8220;English&#8221; language</a>. Alternatively, here is <a href="http://uriverse.com/search/apachesolr_search/madonna?filters=type%3Adpsn%20language%3Ait">Madonna filtered by &#8220;Person&#8221; type and &#8220;Italian&#8221; language</a>. <a href="http://www.darkbrownbuckets.com/media/madonna.jpg">Italians do it better</a>.</p>
<p>Solr is a natural fit for installation on another server because it runs as a web service over HTTP. This is a very easy way to take load off your main server, especially if search is a big part of your site. The disk requirements of the index and Java memory requirements for Solr can be significant on big sites so moving it off the main server may well be a necessity. My experience on <a href="http://uriverse.com">Uriverse</a> (a large number of smallish nodes) would suggest that each node takes around 200 bytes of RAM in the JVM. Your requirements could vary dramatically from this but this may be helpful as a rough guide for those wanting to know about RAM consumption.</p>
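<p>As a back of the envelope aid, the 200 bytes per node figure can be turned into a rough heap estimate. The function name and the node counts below are made up for illustration, and the per node figure is only the single observation quoted above, so treat the output as an order of magnitude guide rather than a sizing rule.</p>

```python
# Rough per node figure observed on one site with many small nodes.
# Real installations may differ dramatically.
BYTES_PER_NODE = 200


def estimated_solr_heap_mb(node_count, bytes_per_node=BYTES_PER_NODE):
    """Estimate JVM RAM used by the index data, in megabytes."""
    return node_count * bytes_per_node / (1024 * 1024)


# Hypothetical site sizes, chosen only to show how the estimate scales.
for nodes in (1_000_000, 10_000_000, 100_000_000):
    heap_mb = estimated_solr_heap_mb(nodes)
    print(f"{nodes:,} nodes: roughly {heap_mb:,.0f} MB")
```

<p>Whatever the estimate, leave headroom on top of it for the JVM itself and for the operating system file cache.</p>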
<p class='summary'>Large sites: Core search fails. Solr shines.</p>
<dl class='more'>
<dt><a href='http://wiki.apache.org/solr/SolrPerformanceFactors'>SolrPerformanceFactors</a></dt>
<dd>Solr documentation including optimization.</dd>
<dt><a href='http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr'>Scaling Lucene and Solr</a></dt>
<dd>Some very good notes on scaling Solr. File system cache is important as well as JVM and need 2+GB on top of JVM for big index.</dd>
<dt><a href='http://stackoverflow.com/questions/1546898/how-to-reduce-solr-memory-usage'>how to reduce solr memory usage?</a></dt>
<dd>Smaller documents, less facets, no sorting.</dd>
<dt><a href='http://old.nabble.com/How-much-disk-space-does-optimize-really-take-to25790344.html#a25792748'>How much disk space does optimize really take</a></dt>
<dd>Be careful of optimize &#8211; required disk space can double.</dd>
</dl>
<h3 id='google-custom-search'>Google Custom Search</h3>
<p>Another option for those users looking to switch away from the in-built search is <a href="http://www.google.com/cse">Google Custom Search</a> &#8211; a search service offered by Google where Google indexes your data and stores the results on their servers. You are able to search the data via a simple form on your site. Google Custom Search is free for individuals, who don&#8217;t mind ads with their results, with a business version starting at $100 pa. The <a href="http://drupal.org/project/google_cse">Google Custom Search Module</a> integrates it into Drupal by providing a block with the form.</p>
<h2><a id='module-bloat'>Module bloat</a></h2>
<p>Every module you add to a Drupal site leads to the consumption of more RAM, reducing the number of simultaneous clients you can serve. Modules also consume extra CPU. As a site designer you need to be aware that all modules added will have some cost on the performance of your site. Is the extra functionality worth the performance cost?</p>
<dl class='more'>
<dt><a href='http://2bits.com/articles/server-indigestion-the-drupal-contributed-modules-open-buffet-binge-syndrome.html'>Server indigestion: The Drupal contributed modules &#8220;open buffet binge&#8221; syndrome</a></dt>
<dd>Minimal install of Drupal has Apache process around 17MB-33M. A bloated system has 93M bloating to 100MB after a few requests. This means fewer requests can be handled due to RAM consumption.</dd>
</dl>
<p class='summary'>Feature rich sites: RAM wasted. Low max clients.</p>
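<p>The link between process size and max clients is simple arithmetic. The sketch below uses the upper process sizes quoted in the article above (33MB minimal, 100MB bloated); the 2048MB of RAM set aside for Apache is a hypothetical budget, and the function name is illustrative.</p>

```python
def max_clients(ram_for_apache_mb, process_size_mb):
    """How many Apache processes of a given size fit in the RAM budget."""
    return ram_for_apache_mb // process_size_mb


# Hypothetical budget: RAM left for Apache after MySQL, the OS and so on.
RAM_FOR_APACHE_MB = 2048

# Upper process sizes quoted in the 2bits article referenced above.
PROCESS_SIZES_MB = {"minimal install": 33, "bloated install": 100}

for label, size_mb in PROCESS_SIZES_MB.items():
    clients = max_clients(RAM_FOR_APACHE_MB, size_mb)
    print(f"{label}: {size_mb} MB per process, about {clients} clients")
```

<p>With these numbers the bloated install can serve roughly a third as many simultaneous clients from the same RAM, which is the cost to weigh against each extra module.</p>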
<h2><a id='noad-load'>Node load</a></h2>
<p>The loading of a node in Drupal can be a very quick/light or very slow/heavy process depending on the circumstances. It all comes down to (i) how much data is in the node and (ii) how fast that data comes back out of the database. If you have nodes with many CCK fields sitting in a database which hasn&#8217;t been able to cache all of the necessary indexes, a node load is something which you really want to avoid. Instead of lazily loading data when required a node load loads it all in at once! In extreme cases a node could take seconds to load.</p>
<p>This becomes a problem when you want to handle many nodes at once, such as when you want to display node teasers in a View. If you notice your page slowing down when displaying 50 teasers on a view then it is a good bet that the DB is getting hammered trying to load in all that data just to display the title, description and url_alias! In these cases skip the teaser and just do it with fields. You should notice a big improvement.</p>
<p>Similarly, performing massive (millions of nodes) imports for new/updated nodes can be prohibitive as well. In these rare cases you need to skip the API and execute SQL on the DB directly.</p>
<p class='summary'>Many Nodes, Big Nodes: Speed and RAM suffers</p>
<h2><a id='cck-design'>CCK design</a></h2>
<p>It&#8217;s important to know how to work within the confines of a system to get the most out of it so it is well worth investigating how data is stored in the backend for nodes and CCK fields. Smart, sensible design of CCK fields should ensure that database access is kept to a sensible level. Whilst this guide recommends designing a data model first and then worrying about data access times second, it is worth knowing the consequences of your decisions.</p>
<p>In a nutshell, CCK will create another table in the backend for multi fields and fields which are shared between content types. There&#8217;s no avoiding the first reason but in the case of shared fields, a separate DB query will need to be run for each shared property. This can be an issue on very large sites on nodes with a lot of properties when you want to keep DB queries to a minimum.</p>
<p>A couple of rules of thumb would be:</p>
<ul>
<li>Try to keep the properties of a content type encapsulated to that particular content type. eg. you might be tempted to share book.author and film.screenwriter in a single CCK field. Unless you are going to be doing queries across both then it makes sense to store the properties separately with their content types.</li>
<li>If you are going to share properties between content types, then the more sharing that can be done the better. eg. a &#8216;geo-location&#8217;, &#8216;country&#8217;, &#8216;intended-audience&#8217; are all possible candidates for stretching across several content types. This design is views friendly, aiding efficient queries across multiple content types.</li>
<li>It is probably better to lean towards designing content types which use taxonomy types and have some null properties instead of creating too many &#8216;sub-class&#8217; content types to do the job. eg. let&#8217;s say you have two possible types, &#8216;it-employee&#8217; and &#8216;accounts-employee&#8217;, which share some properties (birthday, address) but not others (preferred-os). It probably would be better to add an employee-type taxonomy to do the sub-classing and have the unshared preferred-os property as optional. This ensures easy filtering using taxonomy and allows for fast employee retrieval from a single table.</li>
</ul>
<p class='summary'>Complex Data Model: Inefficient data</p>
<h2><a id='module-development'>Module development</a></h2>
<p>When designing your own modules you should have an eye to efficient design and use caching the Drupal way. Using the Drupal cache mechanism means that the custom caches are available to Cache Router to store in RAM (if appropriate).</p>
<p>It is also possible to use the &#8220;static&#8221; variable keyword to cache the variable for that particular script.</p>
<dl class='more'>
<dt><a href='http://www.lullabot.com/articles/a_beginners_guide_to_caching_data'>A beginner&#8217;s guide to caching data</a></dt>
<dd>Steps module developers can take to cache data the PHP/Drupal way.</dd>
</dl>
<p class='summary'>Custom Modules: cache data the Drupal way</p>
<h2><a id='keep-up-to-date'>Keep up to date!</a></h2>
<p>From time to time different point releases are made to Drupal core. Keeping up to date with these patches will not only help you stay secure, it may also speed up your site. If you take a look at the SQL patches many of them are ALTER statements which add indexes to the database which could help slow queries on big sites. Further, code inefficiencies may be removed through better coding techniques.</p>
<p class='summary'>Drupal Core updates: Better performance (especially for big sites) and stability.</p>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/91/drupal-implementation-decisions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal LAMP Server Tuning (part 6)</title>
		<link>http://cruncht.com/89/drupal-lamp-server-tuning</link>
		<comments>http://cruncht.com/89/drupal-lamp-server-tuning#comments</comments>
		<pubDate>Mon, 01 Feb 2010 09:59:41 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=89</guid>
		<description><![CDATA[Getting the most from your Drupal site means getting the most from your server &#8211; optimizing the various layers of the LAMP stack. This includes the filesystem, database, web server, PHP, RAM and CPU. Tuning the LAMP stack is a major subject requiring a lot of study and practice to become proficient. It&#8217;s something [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/89/drupal-lamp-server-tuning" title="Drupal LAMP Server Tuning (part 6)"><img src="http://cruncht.com/wp-content/uploads/2010/02/tune-150x150.jpg" alt="Tune" class="feed-image" title="Tune your Drupal LAMP stack" /></a><p>Getting the most from your Drupal site means getting the most from your server &#8211; optimizing the various layers of the LAMP stack. This includes the filesystem, database, web server, PHP, RAM and CPU. Tuning the LAMP stack is a major subject requiring a lot of study and practice to become proficient. It&#8217;s something you will probably never completely master <img src='http://cruncht.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Try Googling <a href="http://www.google.com.au/search?q=lamp+performance+tune">lamp performance tune</a> for a few articles to whet your appetite. For now, we&#8217;ll cover a few of the major considerations for Drupal, although most of this advice would apply to any PHP web app running on Linux.</p>
<p><span id="more-89"></span></p>
<dl class="more">
<dt><a href="http://drupal.org/node/2601">Server tuning considerations</a></dt>
<dd>Drupal documentation covering the basics.</dd>
<dt><a href="http://www.ibm.com/developerworks/linux/library/l-tune-lamp-1.html">Tuning LAMP systems, Part 1: Understanding the LAMP architecture</a></dt>
<dd>Intermediate article covering LAMP.</dd>
<dt><a href="http://www.ibm.com/developerworks/linux/library/l-tune-lamp-2.html">Tuning LAMP systems, Part 2: Optimizing Apache and PHP</a></dt>
<dd>Intermediate article covering Apache and PHP.</dd>
<dt><a href="http://www.ibm.com/developerworks/linux/library/l-tune-lamp-3.html">Tuning LAMP systems, Part 3: Tuning your MySQL server</a></dt>
<dd>Intermediate article covering MySQL.</dd>
</dl>
<h2><a id="php-opcode-cache">Opcode cache</a></h2>
<p>Opcode caches cache the compiled form of a PHP script in shared memory to avoid the overhead of parsing and compiling the code every time the script runs. This saves RAM and reduces script execution time.</p>
<p>Quite a bit of benchmarking has been done in the Drupal and PHP communities between <a href="http://php.net/manual/en/book.apc.php">APC</a>, <a href="http://eaccelerator.net/">eAccelerator</a> and <a href="http://xcache.lighttpd.net/">XCache</a>. eAccelerator may have the edge in raw performance, but it appears that APC is the preferred opcode cache in the Drupal community because it is well maintained and less buggy.</p>
<p class="summary">All sites: faster and less RAM. Moderate install.</p>
<dl class="more">
<dt><a href="http://buytaert.net/drupal-webserver-configurations-compared">Drupal web server configurations compared</a></dt>
<dd>APC gives 2x to 4x increase in throughput under load. PHP5 is around 10% slower.</dd>
<dt><a href="http://2bits.com/articles/php-op-code-caches-accelerators-a-must-for-a-large-site.html">PHP op-code caches / accelerators: Drupal large site case study</a></dt>
<dd>Op-code caches are a must for large sites serving many pages.</dd>
<dt><a href="http://2bits.com/articles/benchmarking-apc-vs-eaccelerator-using-drupal.html">Benchmarking APC vs. eAccelerator using Drupal</a></dt>
<dd>eAccelerator is faster and smaller than APC. Both offer around a 6x to 7x speedup over plain PHP.</dd>
<dt><a href="http://2bits.com/articles/high-php-execution-times-drupal-and-tuning-apc-includeonce-performance.html">High PHP execution times for Drupal, and tuning APC for include_once() performance</a></dt>
<dd>Make sure apc.shm_size can fit the whole page else there will be no caching.</dd>
</dl>
<h2><a id="database">Database</a></h2>
<p>There are a number of choices to be made when tuning your MySQL database server. The MySQLTuner script can be helpful for identifying outstanding issues you may be unaware of. It can be run on a functioning production server to see how your database is performing in the wild. It&#8217;s possible to take a best guess at config options on your dev machine but you aren&#8217;t going to know how things are going to shape up until real users start hitting the DB.</p>
<dl class="more">
<dt><a href="http://blog.mysqltuner.com/">MySQLTuner</a></dt>
<dd> Perl script which is able to report on the operation of your MySQL installation and offer suggestions as to what can be fixed.</dd>
<dt><a href="http://www.howtoforge.com/tuning-mysql-performance-with-mysqltuner">Tuning MySQL Performance with MySQLTuner</a></dt>
<dd>Helpful tutorial.</dd>
</dl>
<h3><a id="myisam-innodb">Storage Engine: InnoDB vs MyISAM</a></h3>
<p>A default install of Drupal 6 installs the DB tables as MyISAM. This will change in Drupal 7 with the default set to InnoDB. A Drupal 6 installation may well have some InnoDB tables as modules may create new tables in the InnoDB engine. Your installation may therefore be a mix between the two engines.</p>
<p>In many places on the web you will read statements such as &#8220;All high performance Drupal sites run InnoDB&#8221;. This is not necessarily so, as there are some cases where MyISAM may still be preferred, although with recent changes to Drupal core the pendulum has swung to InnoDB as a sensible default.</p>
<p>A list of the main differences between the engines is as follows:</p>
<ul>
<li>InnoDB is transactional (better integrity), MyISAM isn&#8217;t</li>
<li>InnoDB is more reliable (better recovery), MyISAM can be repaired</li>
<li>InnoDB has row level locking (better concurrency), MyISAM locks tables</li>
<li>InnoDB uses clustered indexes (faster access to data), MyISAM indexes just the keys</li>
<li>InnoDB has a bigger memory footprint</li>
</ul>
<p>In general, you would consider sticking with MyISAM if</p>
<ul>
<li>Memory footprint was an issue. If you have very big indexes which might only just fit into the key buffer then MyISAM could offer faster lookups.</li>
<li>Most activity is read only.</li>
</ul>
<p>InnoDB tables definitely should be used for all of the Drupal cache tables since this is where most contention is likely to occur.</p>
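<p>As a rough sketch, the conversion can be done from the mysql client. The schema name <code>drupal6</code> is an assumption, and the table list assumes the six core cache tables of a stock Drupal 6 install, so adjust both to suit and take a backup first.</p>
<p><code><br />
-- List the engine currently used by each table<br />
SELECT table_name, engine FROM information_schema.TABLES WHERE table_schema = 'drupal6';<br />
-- Convert the core cache tables<br />
ALTER TABLE cache ENGINE = InnoDB;<br />
ALTER TABLE cache_block ENGINE = InnoDB;<br />
ALTER TABLE cache_filter ENGINE = InnoDB;<br />
ALTER TABLE cache_form ENGINE = InnoDB;<br />
ALTER TABLE cache_menu ENGINE = InnoDB;<br />
ALTER TABLE cache_page ENGINE = InnoDB;<br />
</code></p>
<p>Contributed modules may add further cache tables of their own, which can be converted in the same way.</p>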
<p>Finally, it must be noted that Drupal was written based on the MyISAM engine and as such many queries were not optimized for InnoDB. A SELECT COUNT(*) with no WHERE clause is particularly slow in InnoDB because it must scan an index to count the rows, whereas MyISAM keeps a stored row count. Many of these shortcomings have been removed in the Pressflow distribution and have since made their way back into core.</p>
<p class="summary">All sites: InnoDB for less contention on cache tables<br />
Most sites: InnoDB for everything else<br />
Big unchanging sites: MyISAM for faster reads and less RAM</p>
<dl class="more">
<dt><a href="http://tag1consulting.com/MySQL_Engines_MyISAM_vs_InnoDB">MySQL Engines: MyISAM vs. InnoDB</a></dt>
<dd>InnoDB is a good fit for many cases and &#8220;in most cases, InnoDB is the correct choice for a Drupal site&#8221;. Very good comparison between the two engines.</dd>
<dt><a href="http://2bits.com/articles/mysql-innodb-performance-gains-as-well-as-some-pitfalls.html">MySQL InnoDB: performance gains as well as some pitfalls</a></dt>
<dd>InnoDB does row level locking but is slower for some queries. NB: the Pressflow distribution fixes some slow InnoDB queries.</dd>
<dt><a href="http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/">InnoDB vs MyISAM vs Falcon benchmarks – part 1</a></dt>
<dd>Benchmarks which challenge the myth that MyISAM is faster than InnoDB in all cases.</dd>
<dt><a href="http://groups.drupal.org/node/35188">Which Tables can be converted to InnoDB</a></dt>
<dd>High Performance discussion emphasizing that InnoDB should definitely be used for cache tables and complex joins in CCK if memory allows.</dd>
</dl>
<h3 id="mysql-configuration">MySQL Configuration</h3>
<p>There are a number of MySQL config variables which must be tweaked to suit your data. It is impossible to specify one set of options to suit all sites. A few rules of thumb are offered below.</p>
<dl class="more">
<dt><a href="http://www.databasejournal.com/features/mysql/article.php/3367871/Optimizing-the-mysqld-variables.htm">Optimizing the mysqld variables</a></dt>
<dd>Clear article with some good rules of thumb for MySQL variables.</dd>
</dl>
<h4><a id="mysql-key-buffer">Key buffer</a></h4>
<p>If you are running MyISAM tables then the key buffer is a very important variable to set. The key buffer stores table indexes in memory, allowing for fast lookups and joins. For large node, node_revisions and url_alias tables it is a must to have enough room to fit their indexes into memory, otherwise your site will be very slow on the most basic of operations: looking up nodes, titles and paths.</p>
<p>One rule of thumb is to set this buffer to somewhere between 25% and 50% of the memory on the server. To determine the best value up front sum the size of all the .MYI files.</p>
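<p>If you would rather not sum the .MYI files by hand, a query against <code>information_schema</code> gives roughly the same figure. This is a sketch only: the <code>drupal6</code> schema name and the 256MB value are assumptions to be replaced with your own.</p>
<p><code><br />
-- Approximate size of all MyISAM indexes, in MB<br />
SELECT ROUND(SUM(index_length) / 1024 / 1024) AS myisam_index_mb FROM information_schema.TABLES WHERE table_schema = 'drupal6' AND engine = 'MyISAM';<br />
-- Current setting, in bytes<br />
SHOW VARIABLES LIKE 'key_buffer_size';<br />
-- Resize without a restart (example value: 256MB)<br />
SET GLOBAL key_buffer_size = 268435456;<br />
</code></p>
<p>A value set with SET GLOBAL is lost when MySQL restarts, so the final figure also needs to go into my.cnf.</p>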
<p class="summary">MyISAM sites: most queries faster. Essential.</p>
<dl class="more">
<dt><a href="http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_key_buffer_size">key_buffer_size</a></dt>
<dd>Documentation on the use of key_buffer_size.</dd>
</dl>
<h4><a id="mysql-query-cache">Query cache</a></h4>
<p>MySQL has a query cache which stores results up to a certain size in memory. The cache is very handy for quickly returning commonly accessed data when all other forms of caching (reverse proxies, page cache, Drupal caches) have not been invoked. Queries which may otherwise take some time return almost instantly.</p>
<dl class="more">
<dt><a href="http://www.databasejournal.com/features/mysql/article.php/3110171/MySQLs-Query-Cache.htm">MySQL&#8217;s Query Cache</a></dt>
<dd>Covers config and operation of the query cache.</dd>
</dl>
<p>During the development and testing of a site the query cache can catch developers out since a query may appear to be performing quite well the second and subsequent times through. To really test a query you need to fire up the mysql client (or phpMyAdmin) and add the SQL_NO_CACHE option to the query to see the real time it takes. Don&#8217;t be fooled!</p>
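<p>For example, a node listing can be timed with and without the cache. The query itself is only illustrative, written against the stock Drupal 6 node table.</p>
<p><code><br />
-- May be answered from the query cache on repeat runs<br />
SELECT nid, title FROM node WHERE status = 1 ORDER BY created DESC LIMIT 10;<br />
-- Skips the query cache for this statement<br />
SELECT SQL_NO_CACHE nid, title FROM node WHERE status = 1 ORDER BY created DESC LIMIT 10;<br />
</code></p>
<p>Bear in mind that SQL_NO_CACHE only bypasses the query cache. Key buffers and the file system cache may still be warm, so the first run after a restart remains the harshest test.</p>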
<dl class="more">
<dt><a href="http://dev.mysql.com/doc/refman/5.1/en/query-cache-in-select.html">Query Cache SELECT Options</a></dt>
<dd>Documentation on the use of SQL_NO_CACHE.</dd>
</dl>
<p>Cached results for a table are thrown away whenever any row in that table is changed, so the query cache cannot be relied upon if tables are changing frequently. The cache shines when there are big tables which don&#8217;t change that often. Unless your site has such characteristics it is best to limit the cache so that it fits small unchanging tables and then some for the most popular queries. Examination of cache hit rates will show you if it needs to be extended or reduced.</p>
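<p>A sketch of how the hit rate might be examined from the mysql client, using the server&#8217;s own status counters:</p>
<p><code><br />
-- Hits, inserts, prunes and free memory for the query cache<br />
SHOW GLOBAL STATUS LIKE 'Qcache%';<br />
-- SELECT statements which were not served from the cache<br />
SHOW GLOBAL STATUS LIKE 'Com_select';<br />
-- Current query cache settings<br />
SHOW VARIABLES LIKE 'query_cache%';<br />
</code></p>
<p>A common rule of thumb puts the hit rate at Qcache_hits / (Qcache_hits + Com_select). A steadily climbing Qcache_lowmem_prunes suggests the cache is too small for the results being stored, while a consistently large Qcache_free_memory suggests it could be reduced.</p>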
<dl class="more">
<dt><a href="http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_query_cache_size">query_cache_size</a></dt>
<dd>Documentation on the use of query_cache_size.</dd>
</dl>
<p class="summary">All sites: common queries faster</p>
<h4><a id="innodb-buffer-pool">InnoDB Buffer Pool Size</a></h4>
<p>If you are running InnoDB tables then it is essential to optimize the InnoDB Buffer Pool Size, increasing the memory to reduce query time. InnoDB is more memory intensive and so the pool will be larger than the key buffer used for MyISAM tables. MySQL documentation suggests that the size can be upped to 80% of physical memory on a dedicated database server. Any more could lead to swap issues.</p>
<p class="summary">InnoDB sites: most queries faster. Essential.</p>
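<p>One way to judge the pool size, sketched here with the <code>drupal6</code> schema name assumed, is to compare the size of the InnoDB data and indexes with the pool and then watch how often reads miss the pool.</p>
<p><code><br />
-- Approximate size of InnoDB data plus indexes, in MB<br />
SELECT ROUND(SUM(data_length + index_length) / 1024 / 1024) AS innodb_mb FROM information_schema.TABLES WHERE table_schema = 'drupal6' AND engine = 'InnoDB';<br />
-- Current pool size, in bytes<br />
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';<br />
-- Logical read requests versus reads which had to go to disk<br />
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';<br />
</code></p>
<p>If Innodb_buffer_pool_reads grows quickly relative to Innodb_buffer_pool_read_requests then the working set is probably not fitting in the pool.</p>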
<dl class="more">
<dt><a href="http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size">innodb_buffer_pool_size</a></dt>
<dd>Documentation on the use of innodb_buffer_pool_size.</dd>
</dl>
<h4>Other variables</h4>
<p>Other variables worth tweaking include the following. See <a href="http://www.databasejournal.com/features/mysql/article.php/3367871/Optimizing-the-mysqld-variables.htm">Optimizing the mysqld variables</a> for more.</p>
<ul>
<li>table cache</li>
<li>sort buffer</li>
<li>read_rnd_buffer_size</li>
<li>tmp_table_size</li>
</ul>
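<p>The status counters below give a rough indication of whether these variables need attention. This is a sketch, and the interpretation offered is a rule of thumb rather than a hard limit.</p>
<p><code><br />
SHOW GLOBAL STATUS WHERE Variable_name IN ('Opened_tables', 'Created_tmp_tables', 'Created_tmp_disk_tables', 'Sort_merge_passes');<br />
</code></p>
<p>A rapidly growing Opened_tables may point to a table cache which is too small, a high proportion of Created_tmp_disk_tables to a tmp_table_size which is too small, and many Sort_merge_passes to an undersized sort buffer.</p>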
<h3><a id="database-warmup">Database Warmup</a></h3>
<p>A warm database will perform much better than a recently started one because its caches and buffers will be primed with keys and data. It therefore makes sense to warm up a DB every time the database is restarted. For MyISAM tables the best way to do this is to load the indexes of commonly used tables into the key cache. This guide recommends loading in node, node_revisions and url_alias. The taxonomy tables could be good candidates as well.</p>
<p><code><br />
USE drupal6;<br />
LOAD INDEX INTO CACHE node;<br />
LOAD INDEX INTO CACHE node_revisions;<br />
LOAD INDEX INTO CACHE url_alias;<br />
LOAD INDEX INTO CACHE term_data;<br />
LOAD INDEX INTO CACHE term_node;<br />
</code></p>
<p>This SQL code can then be put in a script and run when MySQL restarts. It is possible to configure the <code>init_file</code> variable in my.cnf to tell mysql where to find the startup SQL.</p>
<p><code>init-file = /etc/mysql/init-file.sql</code></p>
<p class="summary">Many nodes: faster for most queries which rely on an index.</p>
<dl class="more">
<dt><a href="http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html">Index Preloading</a></dt>
<dd>How to use <code>LOAD INDEX INTO CACHE t1</code>.</dd>
<dt><a href="http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_init_file">init_file</a></dt>
<dd>How to use <code>init_file</code> variable to specify startup SQL.</dd>
</dl>
<h3><a id="database-indexes">Database Indexes</a></h3>
<p>Indexes on columns can dramatically speed up queries if the columns are used for filtering, sorting or joining. Generally, Drupal has most of the indexes you need covered, however, there are some areas where standard tables can benefit from an additional index. It is recommended that you profile your queries to see where things are slow before adding indexes in a scattergun approach because adding indexes can harm performance if they are not being used properly. You can use MySQL&#8217;s slow query log for queries with no index to identify areas for improvement.</p>
<p><a href="http://groups.drupal.org/user/29726">mikeytown2</a> has come up with <a href="http://groups.drupal.org/node/56438#comment-160418">a list</a> of tables which could do with an index:</p>
<blockquote>
<ul>
<li>All CCK fields that you use in a view. File Field: create an index on the fid; date: index on date; index on value; etc&#8230;</li>
<li>access: type, mask, status</li>
<li>comments: timestamp</li>
<li>node_comment_statistics: comment_count</li>
<li>menu_links: external, updated, customized, depth</li>
<li>users: pass, status</li>
<li>menu_custom: title</li>
<li>date_format_types: title</li>
<li>filter_formats: roles</li>
<li>content_group: weight, type_name, group_name</li>
<li>term_data: name</li>
<li>system: name</li>
<li>imagecache_preset: presetname</li>
<li>blocks: module, delta</li>
<li>system: status, type</li>
<li>content_node_field: type, widget_type</li>
</ul>
</blockquote>
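<p>As a sketch of the profile-first approach on MySQL 5.1 or later, the slow query log can be told to record queries which use no index, and a candidate index can then be checked with EXPLAIN before and after it is added. The comments query and the index name <code>comments_timestamp</code> are illustrative assumptions.</p>
<p><code><br />
-- Log slow queries, including those which do not use an index<br />
SET GLOBAL slow_query_log = 1;<br />
SET GLOBAL log_queries_not_using_indexes = 1;<br />
-- Inspect the query plan for a suspect query<br />
EXPLAIN SELECT cid, subject FROM comments WHERE status = 0 ORDER BY timestamp DESC LIMIT 10;<br />
-- Add the index, then run the EXPLAIN again to confirm it is used<br />
ALTER TABLE comments ADD INDEX comments_timestamp (timestamp);<br />
</code></p>
<p>If the plan does not change, drop the index again as it will only slow down writes.</p>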
<h2><a id="webserver">Web server</a></h2>
<p>Apache + MPM Prefork + mod_php is the default web server configuration in the LAMP stack. This combination does consume large amounts of RAM which can be a problem for handling many requests. It can also be quite heavy and slow for serving static content. Many administrators have looked to replace it with other combinations including multithreaded processes (MPM Worker) and external PHP (mod_fcgid) as well as swapping it out completely for another server such as Nginx. This guide has adopted the position that Apache problems can be ameliorated somewhat by removing unneeded modules, running fcgid to connect with PHP and using MPM Worker to enable multithreading per process. However, in some cases this won&#8217;t be enough and Nginx is a must.</p>
<h3>Apache vs Nginx</h3>
<p>Other Drupal users have replaced Apache with faster, more lightweight (RAM and CPU) web servers such as <a href="http://nginx.org/en/">Nginx</a> and <a href="http://www.lighttpd.net/">Lighttpd</a>. Nginx is generally preferred over Lighttpd because of memory leaks in the latter. It is currently possible to run Nginx without losing any functionality in Drupal. Boost, a module based on .htaccess rules, now supports Nginx so it is feasible to run Nginx as the main web server. If you are constrained by CPU or have high loads then this certainly is an option worth considering.</p>
<p>Setting up Nginx is not trivial but it is reasonably straightforward if you are comfortable with compiling and patching. There are some good tutorials on the Web for users who want to do this.</p>
<p class="summary">Low resources, High Traffic, Many logged in: Possible to get more for less with Nginx.</p>
<dl class="more">
<dt><a href="http://www.joeandmotorboat.com/2008/02/28/apache-vs-nginx-web-server-performance-deathmatch/">Apache vs Nginx : Web Server Performance Deathmatch</a></dt>
<dd>&#8220;Nginx seems to compete pretty well with Apache and there doesn’t seem like there is a good reason not to use it especially in CPU usage constrained situations (ie. huge traffic, slow machines and etc).&#8221;</dd>
<dt><a href="http://groups.drupal.org/node/20813">In reply to kbahey: apache vs nginx </a></dt>
<dd>Discussion and results over the pros and cons of Nginx vs various Apache setups.</dd>
<dt><a href="http://www.allthepages.org/archives/2009/02/how-get-drupal-working-nginx">How to get Drupal working with Nginx</a></dt>
<dd>Simple guide for installing and configuring Nginx on a server with only 256MB RAM. Uses FastCGI which may not be the preferred method.</dd>
<dt><a href="http://interfacelab.com/nginx-php-fpm-apc-awesome/">NGINX + PHP-FPM + APC = Awesome</a></dt>
<dd>&#8220;The following guide will walk you through setting up possibly the fastest way to serve PHP known to man&#8230;In this article, we’ll be installing nginx http server, PHP with the PHP-FPM patches, as well as APC.&#8221;</dd>
<dt><a href="http://php-fpm.org/">PHP-FPM &#8211; A simple and robust FastCGI Process Manager for PHP</a></dt>
<dd>Preferred way of connecting Nginx with PHP. Currently in PHP core for 5.3.2+ but not yet released. Requires patch to PHP 5.2.</dd>
</dl>
<h3><a id="apache-unneeded-modules">Apache: Unneeded modules</a></h3>
<p>It is possible to turn off unneeded modules in Apache to reduce memory footprint. The modules you require depends very much on your setup.</p>
<p>The traditional way of controlling modules in Apache has been through the LoadModule directive in httpd.conf. Ubuntu and Debian do it differently with the /etc/apache2/mods-available directory and the a2enmod command. To list all modules available to enable try:</p>
<p><code><br />
$ sudo a2enmod<br />
$ sudo /etc/init.d/apache2 force-reload<br />
</code></p>
<p>And to see what you have enabled you can do <code>$ sudo a2dismod</code>.</p>
<p class="summary">All sites: Good savings in RAM</p>
<dl class="more">
<dt><a href="http://groups.drupal.org/node/41320">What Apache2 modules can be disabled?</a></dt>
<dd>Lists of modules which should be enabled in Apache2.</dd>
</dl>
<h3><a id="apache-threading">Apache threading: MPM Worker (multi-threaded) vs MPM Prefork</a></h3>
<p>The use of <a href="http://httpd.apache.org/docs/2.0/mod/worker.html">MPM Worker</a> allows for the handling of more requests due to multithreading in each process. It has a smaller memory footprint than Prefork and is faster. According to docs, Apache must be compiled with the <code>--with-mpm</code> argument in order to install Worker as &#8220;prefork&#8221; is the default on Unix systems.</p>
<p class="summary">RAM limited: Worker preferable to Prefork.</p>
<dl class="more">
<dt><a href="http://httpd.apache.org/docs/2.0/misc/perf-tuning.html#compiletime">Compile-Time Configuration Issues</a></dt>
<dd>&#8220;Choosing an MPM&#8221; section covers differences between the two models.</dd>
<dt><a href="http://httpd.apache.org/docs/2.0/mpm.html">Multi-Processing Modules (MPMs)</a></dt>
<dd>Apache documentation on installation.</dd>
<dt><a href="http://ivan.gudangbaca.com/installing_apache2_and_php5_using_mod_fcgid">Installing Apache2 and PHP5 using mod_fcgid</a></dt>
<dd>Hey, you don&#8217;t have to recompile Apache. Tutorial on how to install MPM Worker using apt-get with Apache2 on Ubuntu. Just the ticket.</dd>
<dt><a href="http://www.complich8.net/archives/404">mpm-worker versus mpm-prefork, and mod_php versus fastcgi</a></dt>
<dd>PreFork and FastCGI is still a win if you find that Worker is unstable due to long downloads as this person did.</dd>
</dl>
<h3><a id="webserver-fastcgi">Connectors: mod_php, FastCGI, mod_fcgid</a></h3>
<p>The use of mod_php with Apache is the most common setup for calling PHP. mod_php works by embedding PHP into every Apache process. This has the disadvantage of a large memory footprint for each Apache process. FastCGI and mod_fcgid overcome this problem and reduce resource utilization, albeit with no gains in performance.</p>
<p><a href="http://groups.drupal.org/user/327">kbahey</a> <a href="http://groups.drupal.org/node/27174#comment-94376">lists the disadvantages of mod_php</a> as follows:</p>
<ul>
<li>All PHP loaded into the process</li>
<li>Heavy process even if flat file</li>
<li>Many processes will hog RAM</li>
</ul>
<p class="summary">Use mod_fcgid for lower memory use and fewer DB/network connections</p>
<dl class="more">
<dt><a href="http://buytaert.net/drupal-webserver-configurations-compared">Drupal webserver configurations compared</a></dt>
<dd>The most common, Apache+mod_php is the slowest. Tests conducted with FastCGI which is faster. NB: FastCGI has subsequently suffered from stability issues.</dd>
<dt><a href="http://2bits.com/articles/apache-fcgid-acceptable-performance-and-better-resource-utilization.html">Apache with fcgid: acceptable performance and better resource utilization</a></dt>
<dd>Informative article which comes out in favor of mod_fcgid over FastCGI and mod_php. This is the must read article if you wish to attempt fcgid.</dd>
<dt><a href="http://groups.drupal.org/node/44938">Configure Apache for high performance on drupal 6</a></dt>
<dd>Some solid comments from kbahey from 2bits regarding stable setup: Apache, MPM Worker, fcgid, APC (code cache), memcache No SQL.</dd>
</dl>
<h3><a id="apache-maxclients">Apache MaxClients</a></h3>
<p>The MaxClients parameter controls how many simultaneous clients Apache is able to serve. If it is set too high, RAM will be chewed up and the machine will go into swap. If it is set too low then your site will be unnecessarily limited in the number of clients it can serve. The setting of this value should be determined after consideration of (i) how much spare RAM is available on the server and (ii) how much RAM each Apache process consumes. Obviously you will want to maximize available RAM through frugal allocation of RAM to MySQL, JVM, etc and minimize the size of each Apache process through the techniques described above.</p>
<p><a href="http://2bits.com/">2bits</a> <a href="http://2bits.com/articles/tuning-the-apache-maxclients-parameter.html">provide</a> the following formula:</p>
<p><code>MaxClients = (Total Memory - Operating System Memory - MySQL memory) / Size Per Apache process.</code></p>
<p>The only addition this guide would make is that it is important to leave some RAM free for the OS file buffer to allow efficient operation of the OS.</p>
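<p>A worked example with hypothetical figures: a box with 4096MB of RAM, 512MB set aside for the operating system and its file buffer, 1024MB for MySQL, and Apache processes averaging 40MB each. The mysql client makes a handy calculator.</p>
<p><code><br />
-- (Total - OS - MySQL) / size per Apache process, all in MB<br />
SELECT FLOOR((4096 - 512 - 1024) / 40) AS max_clients;<br />
</code></p>
<p>These numbers give a MaxClients of 64. Measure the real size of your Apache processes under load before settling on a figure.</p>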
<dl class="more">
<dt><a href="http://2bits.com/articles/tuning-the-apache-maxclients-parameter.html">Tuning the Apache MaxClients parameter</a></dt>
<dd>How to set MaxClients param.</dd>
</dl>
<h3 id="htaccess">.htaccess</h3>
<p>If you are running Apache then it is possible to use either .htaccess or the Apache conf file to specify directives such as rewrite rules, etc. If you use .htaccess then Apache must look for .htaccess files in the directory hierarchy for every request. This can take some time even if no rules are found. You may consider moving the rules into httpd.conf/apache2.conf and setting <code>AllowOverride None</code> if you are looking to eke out the most performance from your site.</p>
<p class="summary">.htaccess can slow down a site. Avoid it if performance is crucial.</p>
<dl class="more">
<dt><a href="http://www.fubra.com/blog/2008/01/07/htaccess-vs-httpdconf/">.htaccess vs httpd.conf</a></dt>
<dd>Evidence that .htaccess can slow a site by 6.6%.</dd>
</dl>
<h2><a id="ram">RAM: A precious resource</a></h2>
<p>Given the above, serious thought should be given to how the RAM on your box is to be divided up. In a nutshell we have the following apps contesting for their fair share:</p>
<ul>
<li>The JVM if you are running Solr</li>
<li>MySQL query cache and key buffers</li>
<li>Apache processes for client requests</li>
<li>PHP if it runs outside Apache</li>
<li>Memcached for holding Drupal caches</li>
<li>The file system cache</li>
</ul>
<p>Consider the following when deciding how to divide up your box:</p>
<ul>
<li>The JVM needs a certain amount or else Solr will crash.</li>
<li>MySQL really should have indexes buffered for MyISAM and InnoDB. Use MySQLTuner. If they can&#8217;t fit then buy more RAM or (i) reduce max clients and (ii) forget about CacheRouter.</li>
<li>Apache MaxClients should be set to consume available RAM.</li>
<li>The file system cache needs to be big enough to allow smooth running of system.</li>
</ul>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/89/drupal-lamp-server-tuning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal Performance Out of the Box (part 5)</title>
		<link>http://cruncht.com/87/drupal-performance-out-of-the-box</link>
		<comments>http://cruncht.com/87/drupal-performance-out-of-the-box#comments</comments>
		<pubDate>Sun, 31 Jan 2010 09:57:34 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=87</guid>
		<description><![CDATA[Drupal has a number of features which allow for performance to be improved without the addition of external modules or complicated configuration by administrators. A slow system can easily be turned into one which performs well under heavy loads.

Firstly, Drupal shortcuts the execution of unneeded code by using an internal caching system which stores [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/87/drupal-performance-out-of-the-box" title="Drupal Performance Out of the Box (part 5)"><img src="http://cruncht.com/wp-content/uploads/2010/02/lunch_box-150x150.jpg" alt="Lunch Box" class="feed-image" title="Drupal performance out of the box" /></a><p>Drupal has a number of features which allow for performance to be improved without the addition of external modules or complicated configuration by administrators. A slow system can easily be turned into one which performs well under heavy loads.</p>
<p><span id="more-87"></span></p>
<p>Firstly, Drupal shortcuts the execution of unneeded code by using an internal caching system which stores results of expensive routines. Secondly, Drupal improves page rendering time by allowing for the aggregation of static JS and CSS files. Thirdly, page output can be compressed, saving download time. Fourthly, Drupal allows for manual configuration of the caching of Blocks and Pages which can improve performance significantly.</p>
<h2 id='core-caching'>Drupal Core caching system</h2>
<p>Drupal comes with a number of in-built caches which store the results of expensive calculations (strings) in the database so that they can be retrieved quickly later on. There are six caches enabled by default: cache, cache_block, cache_menu, cache_filter, cache_form and cache_page. Contributed modules are able to create their own caches for storing data which is handy for module designers.  This default cache system provides improved performance across the whole app.</p>
<p>The caching system is pluggable and allows for custom storage engines to be substituted in for the default database implementation. Later in this guide you will see how the Cache Router module is able to swap in memory based storage in place of the database, making the retrieval of cached content that much faster.</p>
<p class='summary'>All sites: Improved performance across the app. No config.</p>
<h3 id='aggregate'>Aggregate and compress JS and CSS</h3>
<p>Drupal&#8217;s modular system means that pages can have a large number of CSS and JS includes which results in a lot of client server communication &#8211; slowing the page draw time down. The problem can be alleviated by merging the files and then compressing them. This results in fewer includes and faster downloads. Up to 90% of download time can be attributed to downloading CSS, JS and images so it makes sense to aggregate and compress if possible.</p>
<p>During development it is advisable to keep this option turned off so that CSS and JS errors can be debugged.</p>
<p class='summary'>All users: Lower render times. One click config.</p>
<dl class='more'>
<dt><a href='http://developer.yahoo.com/performance/rules.html#num_http'>Minimize HTTP Requests</a></dt>
<dd>Yahoo place this at the top of their list of ways to reduce download time. 40%-60% of users are first time users so client side caching is no help to them. Fewer HTTP requests are.</dd>
</dl>
<h3 id='page-cache'>Page Cache for anonymous users</h3>
<p>Pages for anonymous users can be cached, meaning that a full build of the page isn&#8217;t necessary for each new request which comes in, providing you with savings in CPU and DB load as well as giving the user much faster response times. This is a massive win for your website, especially if the majority of your page requests are from anonymous users. Basically it can help you survive a Slashdotting. Drupal offers &#8220;aggressive&#8221; and &#8220;normal&#8221; options &#8211; normal page caching is recommended for most websites. Other options for Page Caching are discussed below.</p>
<p>During development it is advisable to keep this option turned off so that any changes to logic or design can be debugged.</p>
<p class='summary'>Anonymous users: Big wins in speed and CPU. One click config.</p>
<dl class='more'>
<dt><a href='http://buytaert.net/drupal-vs-joomla-performance'>Drupal vs Joomla: performance</a></dt>
<dd>An older article comparing Joomla and Drupal. Joomla faster on non-cached pages but caching makes Drupal win.</dd>
</dl>
<h3 id='block-cache'>Block Cache</h3>
<p>Enabling the block cache allows finer grained control over cached content. Caching blocks which don&#8217;t change frequently will enable speedups for logged in users who need a dynamic page built for them each request.</p>
<p class='summary'>Logged in users: Moderate wins. Easy to implement.</p>
<dl class='more'>
<dt><a href='http://lists.drupal.org/pipermail/documentation/2008-March/005949.html'>Drupal guide to caching</a></dt>
<dd>Covers the various database tables which store data for Drupal&#8217;s caching system.</dd>
</dl>
<h3 id='page-compression'>Page Compression</h3>
<p>It is also possible to enable page compression for the pages sent. This will reduce the page size by 50% or more depending on the page. Users requesting big pages on slower connections will love this. </p>
<p class='summary'>Slow connections: Massive win. Easy to implement.</p>
<dl class='more'>
<dt><a href='http://www.mostlygeek.com/tech/how-to-make-drupal-run-85x-faster-in-5-minutes/'>How to make Drupal run 8.5x faster in 5 minutes…</a></dt>
<dd>Page cache provides a 3x speedup.</dd>
<dt><a href='http://blamcast.net/articles/speed-up-drupal'>How I Survived a 2300% Traffic Increase With Drupal</a></dt>
<dd>Demonstrates that these out of the box techniques can be very effective even for sites on shared hosting.</dd>
</dl>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/87/drupal-performance-out-of-the-box/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drupal Hosting Environment (part 4)</title>
		<link>http://cruncht.com/85/drupal-hosting-environment</link>
		<comments>http://cruncht.com/85/drupal-hosting-environment#comments</comments>
		<pubDate>Sat, 30 Jan 2010 09:56:18 +0000</pubDate>
		<dc:creator>Murray Woodman</dc:creator>
				<category><![CDATA[Drupal Planet]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://cruncht.com/?p=85</guid>
		<description><![CDATA[Sometimes the server environment must be taken as a given and we must work within the bounds of what has been provided. Moving servers can be a time consuming process and not something you wish to do if it can be avoided. It&#8217;s best to make the right decisions up front ensuring you have room [...]]]></description>
			<content:encoded><![CDATA[<a href="http://cruncht.com/85/drupal-hosting-environment" title="Drupal Hosting Environment (part 4)"><img src="http://cruncht.com/wp-content/uploads/2010/02/greener_grass_hosting-150x150.jpg" alt="Greener Grass" class="feed-image" title="Grass is greener with dedicated and shared hosting" /></a><p>Sometimes the server environment must be taken as a given and we must work within the bounds of what has been provided. Moving servers can be a time consuming process and not something you wish to do if it can be avoided. It&#8217;s best to make the right decisions up front ensuring you have room to grow if needed. Opting for a shared hosting plan can limit the resources available, the control you have as well as ability to grow.</p>
<p><span id="more-85"></span></p>
<h2><a id='shared-hosting'>Shared Hosting</a></h2>
<p>Shared hosting plans are cheap ($5 per month and up) and it is possible to run small to moderate Drupal sites on them successfully if you take a few precautions. However, they do come with limitations which can make performance optimization difficult.</p>
<h3><a id='ram-cpu-throttle'>RAM and CPU throttled</a></h3>
<p>Drupal is a system which can be heavy on system resources, especially RAM, and so running it on shared hosting can be a challenge in some situations. In many cases a relatively minimal Drupal install will exceed maximum limits enforced by your hosting company. In these cases you will receive a polite letter telling you to fix your site or else you can upgrade to a dedicated server. Don&#8217;t worry. See <a href='#out-of-the-box'>Out of the box</a> for some quick and easy wins on shared hosting plans.</p>
<p>Hosting companies often oversell the box you are hosted on, leaving you at the mercy of other users who might be exceeding their allotment. Your performance will be adversely affected in these cases. YMMV.</p>
<h3><a id='limited-control'>Limited control</a></h3>
<p>The major downside of shared servers is the limited control you have over the hosting environment, leaving you with relatively few options to improve performance. Luckily Drupal has some handy features built in which make it possible to improve speed, memory and CPU. These are dealt with in more detail below.</p>
<p>It is possible to find good virtual hosting plans for around $20 to $30 a month and it is recommended to give these a try if you desire more control over your installation. You must be comfortable with server admin to do this though so it isn&#8217;t everyone&#8217;s cup of tea: partitioning, installation, configuration, security, certificates, SVN, backup, mail, monitoring, etc. <a href='http://www.linode.com/?r=ebdc977aea6d3ec02c7c6a176073580bf836875b'>Linode</a> (contains affiliate id) is well respected in the Drupal community and is often recommended for virtualized servers.</p>
<h2><a id='dedicated'>Dedicated and Virtual Hosting</a></h2>
<p>Obviously a lot of options are opened up when you move across from shared hosting to a dedicated or virtual setup. Full access to the server means that you are able to implement all of the options discussed in this guide. You also have the freedom to implement systems which scale out, allowing a number of different configurations, including load balancers, multiple web servers, master-slave databases, memory sharing for caches, etc.</p>
<dl class='more'>
<dt><a href='http://www.johnandcailin.com/blog/john/scaling-drupal-open-source-infrastructure-high-traffic-drupal-sites'>scaling drupal &#8211; an open-source infrastructure for high-traffic drupal sites</a></dt>
<dd>Incremental recipes for scaling a Drupal server out by adding more hardware: basic install on a single box, splitting off the DB, load balanced multi web servers, redundant load balancers with multi web servers, database clustering.</dd>
</dl>
<p>This guide doesn&#8217;t examine these techniques in any detail as it is mostly focused on getting the most performance possible out of a single box. Many virtual hosts allow you to scale up to a new plan, unlocking more memory, disk and processor without any other system changes.</p>
<dl class='more'>
<dt><a href='http://www.xenscale.com/docs/vps-comparison-matrix'>Comparison of Major VPS Providers</a></dt>
<dd>Linode comes out on top again. RAM looks to be the limiting factor here though.</dd>
<dt><a href='http://journal.dedasys.com/2008/11/24/slicehost-vs-linode'>Slicehost vs Linode</a></dt>
<dd>Linode comes out ahead in memory/bandwidth/cpu/storage per $.</dd>
</dl>
<p>Another option for those looking to scale their hardware with virtual servers is to go with a service such as <a href="http://aws.amazon.com/ec2/">Amazon Elastic Compute Cloud</a> (Amazon EC2).</p>
<blockquote><p>Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.</p></blockquote>
<p>The main difference between EC2 and other virtual solutions such as Slicehost and Linode is that EC2 is able to scale out to new servers when demand requires it. This could be an attractive option for admins who expect large and fluctuating flows of traffic. Project Mercury is a Pressflow image for deployments to EC2 and is <a href="/97/custom-drupal-distributions#project-mercury">discussed later</a>.</p>
<h2 id='hosted'>Hosted Drupal</h2>
<p>Hosted Drupal is a final option for those who are willing to pay a monthly fee for a great setup with quality management and none of the maintenance hassles. As the Drupal community matures, hosting solutions are beginning to appear. Hosted plans should offer you a range of scaling options without any further setup required. (If your company isn&#8217;t listed here please let me know.)</p>
<dl class='more'>
<dt><a href='http://acquia.com/products-services/acquia-hosting'>Acquia Hosting</a></dt>
<dd>Acquia offer a highly tuned environment for your apps: cloud-based hosting (EC2) with load balancing, redundancy, opcode cache, LAMP tuning and Varnish for high performance and scalability. Starts at $500 per month.</dd>
</dl>
<hr />
<p>This article forms part of a series on Drupal performance and scalability. The first article in the series is <a href="http://cruncht.com/75/drupal-performance-scalability">Squeezing the last drop from Drupal: Performance and Scalability</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cruncht.com/85/drupal-hosting-environment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
