Squeezing the last drop from Drupal
Performance and Scalability
Murray Woodman
cruncht.com
A bit about me
- Working with Drupal for 6 months
- Have worked on one large site: uriverse.com
- 13M nodes
- 90 languages
- 190M rows in DB
- Not much experience with heavy traffic sites but a fair amount of research.
Performance vs Scalability
- Performance refers to the speed of the system. Basically, how long does it take to produce a page?
- Scalability refers to the ability of the system to handle more requests, i.e. as the load on a system increases, how well placed is it to cope?
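To make the distinction concrete, here is a minimal Python sketch. The function name and the sample numbers are illustrative assumptions, not measurements from a real site. Average response time speaks to performance; requests served per second under load is one rough indicator of scalability.

```python
from statistics import mean

def summarize(durations, window_seconds):
    """Return (average response time in seconds, requests per second).

    durations: response times in seconds for requests completed in the window.
    window_seconds: wall-clock length of the measurement window.
    """
    if not durations or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    return mean(durations), len(durations) / window_seconds

# Hypothetical numbers: 4 requests completed within a 2 second window.
avg, rps = summarize([0.5, 0.25, 0.75, 0.5], 2.0)
print(f"performance: {avg:.2f}s per page, throughput: {rps:.1f} req/s")
```

A site can do well on one number and badly on the other: fast pages for a single visitor say little about how it behaves with hundreds.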
Easy Wins and Hard Slog
A combination of techniques can get the most out of Drupal.

© rogersmith
It looks hard
It is not always easy to scale Drupal -- not because Drupal sucks, but simply because scaling the LAMP stack (including Drupal) takes no small amount of skill. You need to buy the right hardware, install load balancers, setup MySQL servers in master-slave mode, setup static file servers, setup web servers, get PHP working with an opcode cache, tie in a distributed memory object caching system like memcached, integrate with a content delivery network, watch security advisories for every component in your system and configure and tune the hell out of everything.
Dries Buytaert
But there are some easy wins
Even novice developers and admins can achieve good performance.
- All sites can benefit from some easy tweaks
- Sensible decisions can avoid problems
- Continual learning process
- Hosted and pre-rolled solutions
Server Tools
Diagnostic tools to measure and monitor performance.

© origomi
Performance Measuring
Essential for development and troubleshooting.
- YSlow: time to return "doc"
- Devel: sql timer, page timer, RAM
- MySQL Administrator: health
- mysql client: show processlist
- mytop: MySQL report
- top, ps: processes CPU and RAM
- vmstat, free: virtual memory and RAM
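When none of these tools is to hand, a few lines of Python can act as a rough page timer. This is only a sketch: `time_calls` and `fake_page` are hypothetical names, and `fake_page` sleeps instead of making a real request so the example runs anywhere.

```python
import time

def time_calls(fetch, runs=5):
    """Call fetch() `runs` times and return the list of elapsed seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings

def fake_page():
    # Stand-in for a real page request; sleeps to simulate server work.
    time.sleep(0.01)

timings = time_calls(fake_page, runs=3)
print(f"min {min(timings):.3f}s  max {max(timings):.3f}s")
```

To time a real page, pass a callable that fetches it, for example one built on `urllib.request.urlopen` with your own URL. Run it several times: the first request is often slower than the rest.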
Performance Monitoring
Required for ongoing maintenance and problem identification.
- Munin
- Google Analytics
- Woopra
- AWStats
Drupal and Server Profile
Every site is different. How does your site compare?

© alexbarlow
Different kinds of sites
Sites matching the first option in each pair are more likely to run into problems.
- Nodes: many vs few
- Traffic: high vs low
- Users: logged in vs anonymous
- Page browsing: dispersed vs concentrated
- Contention: many writes vs few writes
- Content: heavy vs light
- Functionality: rich vs poor
- Audience: dispersed vs concentrated
Hosting Environment
Hosting environment determines base performance and flexibility.

© candiedwomanire
Hosting Options
Price, performance and simplicity. Pick two.
- Shared
- Dedicated
- Virtual (Linode, Slicehost)
- Cloud: (EC2 - Mercury)
- Hosted: (Acquia, Mercury)
Performance Out of the Box
Unoptimized Drupal is slow, but Drupal has some tricks up its sleeve.

© kp_sonny
Performance built in
YSlow scores are pretty good out of the box thanks to:
- Core caching system
- Aggregate and compress JS and CSS
- Page Cache for anonymous users
- Block Cache
- Page Compression
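Why page compression is worth enabling can be shown with a small Python sketch. The HTML below is made-up and deliberately repetitive, so real pages will compress less dramatically; `compression_ratio` is a hypothetical helper name.

```python
import gzip

def compression_ratio(body: bytes) -> float:
    """Return compressed size divided by original size for a response body."""
    return len(gzip.compress(body)) / len(body)

# Hypothetical repetitive HTML, standing in for a rendered page.
html = b"<div class='node'><h2>Title</h2><p>Some teaser text.</p></div>\n" * 200
ratio = compression_ratio(html)
print(f"compressed page is {ratio:.1%} of the original size")
```

Fewer bytes on the wire means a shorter download for the visitor, at the cost of a little CPU on the server.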
LAMP Server Tuning
Drupal runs in the context of the LAMP stack, which has many levels that can be tuned.

© crashmaster
Opcode cache
Compiles PHP and puts it into shared memory. Big win for all sites.
Database
Tune database according to your needs.
- Engine: InnoDB vs MyISAM
- Query Cache
- Key Cache and InnoDB Buffer pool
- Warm Up
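One way to judge whether the query cache and key cache are sized sensibly is to look at MySQL's status counters. The Python sketch below computes two commonly used ratios from counters reported by `SHOW GLOBAL STATUS`. The counter values are invented for illustration and the function names are hypothetical. Note that the query cache exists in the MySQL versions of this era but was removed in MySQL 8.0.

```python
def query_cache_hit_rate(status):
    """Fraction of SELECTs answered from the query cache."""
    hits = status["Qcache_hits"]
    total = hits + status["Com_select"]
    return hits / total if total else 0.0

def key_cache_miss_rate(status):
    """Fraction of MyISAM index read requests that had to go to disk."""
    requests = status["Key_read_requests"]
    return status["Key_reads"] / requests if requests else 0.0

# Hypothetical counters, as might be copied from SHOW GLOBAL STATUS.
status = {"Qcache_hits": 750, "Com_select": 250,
          "Key_reads": 10, "Key_read_requests": 1000}
print(query_cache_hit_rate(status))  # prints 0.75
print(key_cache_miss_rate(status))   # prints 0.01
```

A low hit rate or a high miss rate after the server has been up for a while suggests the corresponding buffer deserves a closer look.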
Web Server
Apache + mod_php + MPM Prefork is the default: heavy and slow.
- Apache vs Nginx
- fcgid vs mod_php
- Worker vs Prefork
- Unneeded modules
- MaxClients
RAM is precious
Judiciously allocate memory. Avoid swapping at all costs.
- JVM with Solr
- Database indexes
- CacheRouter
- OS file cache
- Apache processes
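A back-of-the-envelope memory budget helps pick a MaxClients value that will not push the box into swap. This Python sketch is an assumption-laden estimate, not an official formula: the helper name and all figures are hypothetical, and you should measure the real process size with top or ps.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, apache_process_mb):
    """Largest number of Apache processes that fit in the RAM left over."""
    available = total_ram_mb - reserved_mb
    if available <= 0 or apache_process_mb <= 0:
        raise ValueError("no RAM left for Apache, or invalid process size")
    return int(available // apache_process_mb)

# Hypothetical 2 GB box: 1 GB reserved for MySQL, Solr's JVM, caches and the OS,
# with each Apache + mod_php process using about 40 MB.
print(estimate_max_clients(2048, 1024, 40))  # prints 25
```

Setting MaxClients above this estimate risks swapping once traffic peaks.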
Implementation Decisions
Work with Drupal, not against it.

© rowanbank
Issues to consider
In the back of your mind...
- Solr for Search
- Module bloat
- Node load
- CCK design
- Module design - caching, algorithms
- Keep Drupal up to date - DB patches
Drupal Troubleshooting
Oh no! It's broken. What is wrong? How to fix it.

© aasmundbo
Usual suspects
Crack out your performance measuring tools (YSlow, Devel, mysql) to identify issues.
- Views and indexes
- Views and left joins
- Problems in core
- Dodgy modules
Caching
Drupal can perform at "internet scale" and handle the biggest of sites.

Advanced page caching options
Avoid the bootstrap and PHP and maximise performance for anonymous users.
- Internal page cache (normal + aggressive)
- Boost
- Varnish (Pressflow)
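The idea shared by all three options can be sketched in Python: serve anonymous visitors a stored copy and skip the expensive rendering. This is a toy illustration, not how Drupal, Boost or Varnish are implemented; `PageCache` and `render` are hypothetical names.

```python
import time

class PageCache:
    """Tiny in-memory page cache keyed by URL path, with a time-to-live."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # path -> (expires_at, html)

    def get_page(self, path, logged_in, render):
        # Logged-in users get personalised pages, so always render for them.
        if logged_in:
            return render(path)
        entry = self.store.get(path)
        if entry and entry[0] > self.clock():
            return entry[1]  # cache hit: the expensive render is skipped
        html = render(path)
        self.store[path] = (self.clock() + self.ttl, html)
        return html

renders = []
def render(path):
    renders.append(path)  # record each expensive render for the demo
    return f"<html>{path}</html>"

cache = PageCache()
cache.get_page("/node/1", False, render)
cache.get_page("/node/1", False, render)   # served from cache
cache.get_page("/node/1", True, render)    # logged in: rendered again
print(len(renders))  # prints 2
```

The further forward the cache sits (Drupal, then static files, then a reverse proxy), the less of the stack each anonymous request has to touch.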
Varnish vs Boost
Comparison between the two.
- Varnish is faster
- Boost is simpler
- RAM vs File
- Fit more pages onto File System
- Patch vs no patch
- Pressflow vs Plain Old Drupal
CacheRouter
CacheRouter provides pluggable backends for the cache tables, speeding things up for logged-in users and taking load off MySQL.
- database: default
- file: shared
- APC: single box with mod_php
- Memcache: multiple boxes or fcgid
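The "pluggable backend" idea is easy to sketch in Python. This is not CacheRouter's API: the class and function names are hypothetical, and the two backends only loosely mirror an in-process store and a file store. What matters is that the calling code stays the same when the backend changes.

```python
import hashlib
import os
import pickle
import tempfile

class MemoryBackend:
    """Keeps entries in this process's memory."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def set(self, key, value):
        self.data[key] = value

class FileBackend:
    """Keeps entries as files in a directory that several processes could share."""
    def __init__(self, directory):
        self.directory = directory
    def _path(self, key):
        name = hashlib.sha256(key.encode()).hexdigest()
        return os.path.join(self.directory, name)
    def get(self, key):
        try:
            with open(self._path(key), "rb") as f:
                return pickle.load(f)
        except FileNotFoundError:
            return None
    def set(self, key, value):
        with open(self._path(key), "wb") as f:
            pickle.dump(value, f)

def cached(backend, key, build):
    """Return the cached value for key, building and storing it on a miss."""
    value = backend.get(key)
    if value is None:
        value = build()
        backend.set(key, value)
    return value

for backend in (MemoryBackend(), FileBackend(tempfile.mkdtemp())):
    first = cached(backend, "menu:primary", lambda: ["Home", "About"])
    second = cached(backend, "menu:primary", lambda: ["rebuilt"])
    print(type(backend).__name__, first == second)
```

Both lines print True: the second call never runs its builder because the value is already stored.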
Custom Drupal Distributions
Easy routes to tuned setups. Drupal ecosystem is maturing.

© ryanr
Complexity hidden
End to end solutions covering most tips in this guide.
- Pressflow: SQL, Varnish, CacheRouter, Code tweaks
- Mercury: Pressflow on EC2
- Mercury on different platforms
- Acquia
Benchmarking Tools
Hammer your server with the following tools.
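The principle behind any load-testing tool can be sketched in Python: fire many requests concurrently and record throughput and the slowest response. Everything here is a hypothetical illustration; `hammer` is an invented name and the "request" is a short sleep rather than a real HTTP call. A dedicated benchmarking tool will do this job far better.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def hammer(request, total=20, concurrency=5):
    """Run `request` `total` times across `concurrency` threads.

    Returns (requests per second, slowest single request in seconds).
    """
    def timed(_):
        start = time.perf_counter()
        request()
        return time.perf_counter() - start

    began = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        durations = list(pool.map(timed, range(total)))
    elapsed = time.perf_counter() - began
    return total / elapsed, max(durations)

# Stand-in request that takes about 10 ms.
rps, slowest = hammer(lambda: time.sleep(0.01))
print(f"{rps:.0f} req/s, slowest request {slowest:.3f}s")
```

Only ever point a load test at a server you own, and watch top and mytop on the server while it runs.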
Page Rendering
The final leg to the browser makes a big difference: front-end work can account for up to 90% of the time a user waits for a page.

© fatboyke
Drupal performs well by default
YSlow shows good performance in:
- CSS and JS aggregation
- Page compression
Rendering improvements
Areas where you can make even more of a difference:
- CSS sprites
- Compress binary files
- Content Delivery Networks
- Turn on Apache's Expires module (mod_expires)
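What an Expires setup sends to the browser can be illustrated in Python. This is a sketch of the resulting headers only, not Apache configuration; the helper name and the 14 day lifetime are assumptions chosen for the example.

```python
from email.utils import formatdate

def far_future_headers(now_epoch, max_age_seconds):
    """Build Expires and Cache-Control headers for a static file."""
    return {
        "Expires": formatdate(now_epoch + max_age_seconds, usegmt=True),
        "Cache-Control": f"max-age={max_age_seconds}",
    }

# Hypothetical policy: let browsers keep static files for 14 days.
headers = far_future_headers(now_epoch=0, max_age_seconds=14 * 24 * 3600)
print(headers["Expires"])  # prints Thu, 15 Jan 1970 00:00:00 GMT
```

With headers like these, a returning visitor's browser reuses its copy of images, CSS and JS instead of asking the server again.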
Quick Reference
What site do you have? Recap of suggestions.

© koalazymonkey
All sites
Easy wins for all sites.
- Best server for your budget and requirements
- Enable CSS and JS optimization in Drupal
- Enable compression in Drupal
- Enable Drupal page cache and consider Boost
- Install APC if available
- Tune MySQL for decent query cache and key buffer
- Optimize file size where possible
Low Resources
When RAM is an issue, cut the fat.
- Boost avoids loading PHP and the Drupal bootstrap
- Sensible module selection
- Avoid node load in views lists
- Possibly a smaller JVM if running Solr
- Nginx has a smaller footprint than Apache
- mod_fcgid has a smaller footprint than mod_php
Server farm
Dedicated and virtual hosting gives you more options.
- Split off Solr
- Split off DB server, watch the latency
- With CacheRouter, select Memcache over APC for shared pools
- Master + slaves for DB
- Load balancing across web servers
Big Site
Millions of nodes uncover DB weak spots.
- Buy more RAM for database indexes
- Index columns, especially for views
- Thoroughly check slow queries
- Warm up database
- Swap in Solr for search
- Solr to handle taxonomy pages
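The effect of indexing a filtered column can be seen with a small Python sketch. It uses SQLite purely as a stand-in because it ships with Python; MySQL's planner and EXPLAIN output differ, and the `node` table here is a simplified, hypothetical version of Drupal's.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO node (type, title) VALUES (?, ?)",
    [("page" if i % 10 else "story", f"Node {i}") for i in range(1000)],
)

sql = "SELECT nid, title FROM node WHERE type = ?"
before = query_plan(conn, sql, ("story",))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_node_type ON node (type)")
after = query_plan(conn, sql, ("story",))    # planner can now use the index
print("before:", before)
print("after: ", after)
```

On MySQL, the equivalent check is to run EXPLAIN on the queries that show up as slow and see whether an index is being used.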
High activity
Keep memory footprints small and reduce contention on the database.
- Boost or Varnish
- Nginx over Apache
- InnoDB on cache tables
Many logged in users
The page cache does not apply to logged-in users, so use finer-grained caching.
- View/Block caching
- CacheRouter (APC or Memcache)
High contention
New nodes and new cached elements.
Heavy Content
Big (uncached) multimedia files.
- Optimized files
- Well positioned server
- CDN
Rich Functionality
Lots of code.
- Well behaved modules
- Not too many modules
- View/Block caching
Audience dispersed
Global audience, local store.