28 Jul

Mirasvit Full Page Cache

I’ve talked about Full Page Cache before, and how a fast site is important for your customers (not to mention GoogleBot), and ultimately for better sales conversions.

From a SysAdmin point of view, sites with some kind of FPC can handle much more traffic, with fewer server resources (read: cheaper for you, IT Manager) and are usually much better at handling sudden traffic spikes.

Varnish is as fast as it gets. But Varnish requires a lot of skill to implement well and work around any niggles (and there are always issues).  Now, I love Varnish, but for many it’s just too complex, or time consuming, especially if you’re working to a tight deadline. Instead, there are plenty of code-based solutions which aim to implement Enterprise-like FPC but for a fraction of the cost.

I’ve seen a lot of Community Edition customers using this extension successfully, and I wanted to see what it was all about.

https://mirasvit.com/magento-extensions/full-page-cache.html

Special mention here to the folks at Mirasvit, who were kind enough to send us a copy for evaluation at Rackspace. The turnaround was quick, so I’m confident you’ll get a responsive support experience. For us, that’s really important.

 

Your Mileage May Vary

I was testing with stock Magento Community 1.9.1.0 and the sample data.

The settings I’ll discuss here should work fine for most, but your mileage may vary if your Magento store is heavily customised. Always test new modules in a staging environment before implementing on your live website.

Installation

I pretty much followed the bundled instructions – no need for me to detail it here but it was very straightforward. See also the Mirasvit FPC user manual.

 

Configuration

Let’s dive into the config, in your Magento Dashboard (System > Configuration > MIRASVIT EXTENSIONS > Full Page Cache).

General Settings

[Screenshot: MirasvitFPC_General]

  • Enabled: Yes (obviously)
  • Cache Lifetime (sec): I’ve gone for two days here, you could use more. If your site gets indexed by a search engine once a day, the first hit will warm up, and the page won’t have expired by the next day’s index. If your site traffic is quite low, and it could be a few days between page views of any one particular product, then you should keep this value high, like a week (604800 seconds).
  • Flush Cache Expr: Leave it empty to disable the auto-flushing. I tested that saving a product will automatically expire the relevant pages, so you are not likely to see out-of-date content. My general rule is that you shouldn’t have to specifically flush caches (development aside); the more you flush them, the less effective they are.
  • Max. Cache Size (Mb): 128 is probably OK for most, but you might need more if you have a lot of products/categories. You should understand where your cache is, though, before increasing this. For example, if you’re using a 512M Redis instance from ObjectRocket, then setting this higher than 500 would start to cause problems if it gets full. For a local Redis instance, your maxmemory directive in /etc/redis.conf will be relevant here.
  • Max. Number of Cache Files: 20000 seems ample; you might need to increase this if you have a lot of SKUs, categories, etc.
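To illustrate the point about matching the cache size to your backend, here’s roughly what the relevant line of a local /etc/redis.conf might look like (the 512mb figure is just an example — size it to your server’s RAM):

```ini
# /etc/redis.conf — example value only
# Keep Magento's "Max. Cache Size (Mb)" comfortably below this figure,
# and check how your eviction policy interacts with Magento's tag-based
# cache before relying on Redis to evict entries for you.
maxmemory 512mb
```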

Crawler Settings

The crawler seems to work really well. What I like is that it only crawls the pages your customers are actually hitting, rather than just spidering the whole site needlessly.

[Screenshot: MirasvitFPC_Crawler]

  • Enabled: Yes. If your site is pretty busy, and your expiry times are high, then you might find your customers do a great job of warming up the cache for you. For quieter sites though, or to ensure that most people hit cache most of the time, definitely enable it.
  • Number of Threads: 1.  First of all, you should find out how many CPU cores are available on your server. My test site is running on a small Cloud Server, with only one vCPU core, and my load testing experience tells me the default of ‘2’ might slow things down for my 1 vCPU core. lscpu is a command you can run to find out quickly.  Half that number, as a rule of thumb, should safely avoid impacting performance for real users.
  • Thread Delay: I’ve put half a second in there to further reduce load impact.
  • Limit of Crawled URLs per Run / Schedule : A higher limit here will warm up the cache more quickly, in conjunction with the Schedule, but the idea here is to prevent the crawler from running away with itself and endlessly hammering your server. The default setup is going to crawl up to 10 URLs every 15 minutes, which is fairly conservative and only 40 pages per hour. Something like 20 URLs every 10 minutes should be fine. If you wanted to get even more granular, we could run every 10 mins but avoid peak hours (let’s say they are 12-2pm and 6-10pm),  with something like:
    • */10  0-11,14-17,22-23  *  *  *
  • Sort Crawler urls by: Popularity. Sounds sensible; I didn’t bother setting up the custom order.
  • Run crawler as apache user: No. I didn’t need to do this, although my PHP is running under FPM as its own user, and that user is also running the Magento Cron job.
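A quick way to apply that threads rule of thumb on the box itself (a sketch — assumes a Linux server with coreutils’ nproc available):

```shell
# Crawler threads rule of thumb: half the CPU cores, minimum of 1.
cores=$(nproc)                      # lscpu reports the same figure as "CPU(s):"
threads=$(( cores / 2 ))
if [ "$threads" -lt 1 ]; then threads=1; fi
echo "Suggested crawler threads: $threads"
```

On my 1 vCPU test server this prints 1, which is why I turned the default of 2 down.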

Cache Rules

This is really the nuts and bolts of what gets cached and what doesn’t.

[Screenshot: MirasvitFPC_CacheRules]

  • Max. Allowed Page Depth: 10. I’ve seen sites where heavily layered or even cyclical navigation leads to endless unique URLs, and it’s not practical to cache them all. This is there to prevent over-caching of those pages, and 10 seems like a decent default value.
  • Cacheable Actions: The defaults here are the home page, product pages, and category pages. That’s probably fine for most; you might need to add bits if you have heavy CMS pages, or if your store is heavily customised.
  • Allowed/Ignored Pages: What it says on the tin. Maybe you have a special CMS page which includes a live Twitter feed, and you don’t want to cache it.
  • User Agent Segmentation: If you have a responsive theme, you won’t need this. If any part of your code relies on device detection, like a different tablet layout, or a popup about your iPhone app, then it’s likely you need to use this. My example expressions should take care of most popular devices right now (2015); you might need to work on your own depending on which devices/browsers your site cares about. One thing you should not do is separate GoogleBot or other engines/bots/crawlers – if you do that then they’re less likely to get the page from your main cache. Faster for them is good for your rankings, and hitting the cache is good for your server load.

Debug

The debug options are pretty self-explanatory, and should usually be disabled in production. The Time Stats are really handy to compare uncached vs. cached performance, and I like that you can show these only for your IP address(es). The code is using $_SERVER['REMOTE_ADDR'] though, so it won’t work behind reverse proxies or load balancers.

[Screenshot: MirasvitFPC_DebugHints]

 

Cache Management

The obvious thing here is that you get another option for Full Page Cache under System > Cache Management. You’ll need to enable that, then flush all cache, for the FPC to start working.

Everyone loves a nice graph – ask NewRelic – and understanding how your cache is performing will help you drive a faster experience for your users.  Here’s what Mirasvit FPC adds to your Cache Management page:

[Screenshot: MirasvitFPC_stats]

You can zoom the graph to a smaller time period, or get an overview for much longer.

Source: http://fpc.demo.mirasvit.com/admin/?demo=fpc (because my test store didn’t have enough data for an interesting graph).

More screenshots on the Mirasvit website or Magento Connect.

One Caveat

When testing with two browsers side by side, I did at first get some crossover where one browser would see the page with cart contents showing from the other session. I found that this was down to the way Magento includes the Session ID in the URL by default, combined with the default of not doing any session validation.  After disabling that, everything worked as expected.

  • System > Configuration > Web > Session Validation:  “Use SID on Frontend” =  “No”.
  •  Clear all cache to apply.

What I liked

  • Easy setup. I just plonked the files in place, and pressed “go”. You may want to tweak the default settings as above, but it pretty much works out of the box.
  • No extra local.xml config. It just uses whatever <cache><backend> you already have configured, which is great. I was already using Redis, and Mirasvit lapped it up.
  • Good support: That’s the main theme in the comments on Magento Connect, and the team did respond to my email within a day. For an extra $50 USD, Mirasvit will even install the plugin for you – great if you don’t have the skills or don’t have a developer on hand.
  • Cache Rules are really nice to configure, and extras like User Agent separation mean that it’s very flexible.
  • Built in Crawler seemed to work really well, and it won’t smash your server to pieces.
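For reference, the kind of existing local.xml <cache> block it picks up — this is the standard Cm_Cache_Backend_Redis configuration bundled with Magento 1.8+ (your server/port/database will differ):

```xml
<!-- app/etc/local.xml, inside <global> — example Redis cache backend -->
<cache>
    <backend>Cm_Cache_Backend_Redis</backend>
    <backend_options>
        <server>127.0.0.1</server>
        <port>6379</port>
        <database>0</database>
        <compress_data>1</compress_data>
    </backend_options>
</cache>
```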

I didn’t test:

  • Dynamic blocks.  Blocks and layouts are going to be unique for each store, so working on the default probably won’t help. Mirasvit provide full documentation and offer to help with this as part of the installation service. You may not need to configure this – in my experience, hole-punching for dynamic blocks generally creates more complexity and extra work in the long run for your frontend developer. Simply using the cache as-is will still cut out 90% of server load while keeping your deployment simple.
  • Debug stats behind a reverse proxy.  A lot of the customers I work with have their main web server(s) behind a load balancer, or maybe a CDN like CloudFlare. It’d be nice to see this implemented from the Magento client IP, which can be configured in local.xml to get the real IP from  X-Forwarded-For or any other HTTP headers.
    • UPDATE from Mirasvit: “We will use similar approach in our next releases.”
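That client-IP configuration is worth showing, since it already exists in Magento 1: local.xml can list trusted headers to read the real IP from, something like this (which header applies depends on your proxy or CDN):

```xml
<!-- app/etc/local.xml, inside <global> — take the client IP from the proxy -->
<remote_addr_headers>
    <header1>HTTP_X_REAL_IP</header1>
    <header2>HTTP_X_FORWARDED_FOR</header2>
</remote_addr_headers>
```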

Final thoughts

Quick to get going, feature rich, and not overly complicated, it’s a great alternative to a complex Varnish configuration. My cached page loads (coming from Redis) were showing around 37-70 ms, which is on par with the Enterprise FPC.  With great support too, and all for a one-off $149, it’s probably the best $149 you could spend for your Community Edition Magento store.

 

01 Jun

6 reasons your Magento site went down

This article is about extreme traffic overwhelming your site.

There’s plenty of marketing you can do to drive traffic, but TV appearances are by far the most hard-hitting. Your target audience is sitting on the sofa with a laptop or tablet at the ready, and they’re all going to hit your site within the same 10 seconds or so. Email campaigns can have a similar effect, but usually you can stagger delivery to limit the impact. Read on especially if you’re a startup planning on launching with a big bang.

Specifically, we’re talking about:

  1. Plan for failure
  2. Full Page Cache
  3. Database
  4. Backend Cache Scaling
  5. CDN
  6. Load testing

Before we start, I’m assuming your site is well hosted and generally performs well under normal conditions (Magento category page Time To First Byte < 1.0 second without FPC). If it doesn’t, then stop reading. These will be band-aids rather than solutions.

Here are six things you can do to prepare for huge traffic spikes.
 
 

1. Plan for failure

Despite everything else in this article, it can be hard to gauge just how much of a load spike you’ll get.  Get some error pages in place, at every level possible. Your developers or digital agency should be able to knock something up in no time.

Do: Think about including a discount code on error pages, encouraging your customers to come back later. I’m told it really works.

Do: Make it look nice, on brand, including contact details. Some are light-hearted and fun – I’ve even seen Pac-Man embedded to play while waiting – but the main message you need is to encourage a repeat visit. Default or generic error pages do not inspire confidence in your brand.

Don’t: Include any images, CSS, etc, on this error page from your web servers, in case they’re not responding. Host these assets elsewhere; a CDN would be perfect.
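As a sketch of the idea in nginx terms (paths and the CDN hostname are made up — adapt to your own setup), the holding page itself stays tiny and static, and every asset it references lives on the CDN:

```nginx
# Serve a static holding page when the backends are overloaded or down.
error_page 502 503 504 /holding.html;

location = /holding.html {
    root /var/www/error-pages;  # holding.html links its CSS/images from
    internal;                   # cdn.example.com, not from this server
}
```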

 
 

2. Full Page Cache

A good Full Page Cache is essential to absorb the majority of your traffic, and the majority of server load.

It can be quite a complicated thing to get right though, so I wrote a separate article with my thoughts on Magento Full Page Cache.

If you’re doing it right, your pages and categories should be coming back in under 100 milliseconds.

 

3. Database

First, use persistent connections. I’ve seen the sudden influx of DB connections overwhelm the TCP stack on the DB host. Make sure MySQL is configured to accommodate that, though: every single PHP-FPM or Apache child process could be hanging onto a connection. We can do some quick maths here: if you have ten web servers with pm.max_children=250, then your MySQL max_connections needs to be 2500. Add a few more for any monitoring or diagnostics.

Configure it in local.xml:

<connection>
    <host><![CDATA[magento_db_host]]></host>
    <username><![CDATA[magento_db_user]]></username>
    <password><![CDATA[magento_db_pass]]></password>
    <dbname><![CDATA[magento]]></dbname>
    <active>1</active>
    <persistent>1</persistent>
</connection>
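On the MySQL side, the quick maths above becomes a one-line change, plus a little headroom:

```ini
# /etc/my.cnf — 10 web servers x pm.max_children=250 = 2500,
# plus headroom for monitoring and diagnostics
[mysqld]
max_connections = 2550
```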

Replication?

No.
Three points about this:

  1. Database is not normally the bottleneck. CPU for PHP execution is. You will want a reasonably powerful machine with good I/O, but so long as your FPC is effective, it’s very unlikely that you’ll need to scale out.
  2. Broken shopping carts, and other weird behaviour. Replication takes time, like a second or two, which can be long enough to cause a problem. For the most part, the Magento read/write separation does account for replication delay but third party extensions might not. It’s especially important when they are related to shopping carts or checkout functionality.
  3. Replication is not resilience. I want to mention this, because I think a lot of people ask for replication out of a misunderstanding that it’ll make the site more resilient or Highly Available. A master-slave setup still has single points of failure, and Magento will error out if it can’t connect to the master or ANY of your slaves. A multi-master implementation could work if you have a floating IP, but in my experience the complexity far outweighs the benefit. At Rackspace, our go-to HA solution is to run MySQL (Percona usually) under the Red Hat Cluster Suite, and that works brilliantly. Magento gets one database connection, which is a floating IP between resilient nodes. The backend servers are often less powerful than the web servers; see point 1.

That’s it for the database. I’m not going into general DB optimisation here, but Major Hayden’s mysqltuner.pl is a good start.

 
 

4. Backend Cache Scaling

Most of the time, you’ll be sharing your Redis cache between web nodes. This is important for management of the cache via Magento admin. At a massive scale though, you can overwhelm the physical network, TCP stack on the Redis host, and run into performance problems because Redis is single-threaded.

Too many servers on one Redis instance

The simple solution is to install a local Redis instance on each web server, and connect on localhost or a UNIX socket. Cuts out the network load completely, and scales out to the Nth degree.

One Redis instance on each Web server

The major disadvantage of doing this though is that management operations, like clearing the cache, or general invalidation when you make changes, will not happen across the board.  Here’s a quick-and-dirty proof of concept bash script for clearing out all your caches at once, assuming you are also configuring each to listen on your local or isolated network:

REDIS_SERVERS="192.168.100.5 192.168.100.6 192.168.100.7 192.168.100.8 192.168.100.9"
for server in $REDIS_SERVERS; do
    # redis-cli closes the connection cleanly; piping to raw nc can hang
    # waiting for EOF unless you add a timeout (e.g. nc -w1)
    redis-cli -h "$server" FLUSHALL
done

NB: This kind of cache setup is only for extreme cases; 99% of the time a single Redis instance is OK for Magento cache. I like to use a second one for Enterprise full_page_cache. You could look into Redis sharding for ultimate performance, but that’s a little more complicated than we need here. This is for a one-off event, and when it’s done you can scale back to a single Redis instance for easier cache management.

NB: Cache storage must not be confused with Session storage. It often is, when the same technology is involved. Despite the above, I would still advise keeping all your sessions in one place, mainly because I don’t like to rely on load balancers’ session persistence. Session traffic is very unlikely to saturate the network the way cache traffic can. I prefer Memcached over Redis for sessions; it’s simple and multi-threaded. On that note, ensure MAXCONN and CACHESIZE are suitably configured.
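On RHEL/CentOS, for example, those memcached settings live in /etc/sysconfig/memcached (the numbers here are illustrative, not recommendations):

```ini
# /etc/sysconfig/memcached — example values only
MAXCONN="2048"     # at least the sum of your web servers' PHP workers
CACHESIZE="256"    # MB; sessions are small, but size this for your traffic
```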

 
 

5. CDN

Content Delivery Networks are not a magic solution, and usually have no effect on those initial page loads, nor the PHP load on your web servers. While some CDNs do have full page caching features, I haven’t seen anyone successfully integrate them into an application as complex as Magento.

What a CDN will do, however, is speed up the delivery of extra content for the overall page load. Especially if you’ve got an ocean between your customers and your server(s). If all your customers are in the same country as your server, though, it probably won’t be that much faster and might not be worth the effort.

The biggest advantage for me is to reduce the network load on your infrastructure. On most Magento stores (most websites in general), the bulk of actual data content is product imagery. Offloading that to a CDN will definitely help to avoid network saturation, and load on net devices like firewalls and load balancers.

You need to be using a CDN which pulls from origin; the days of trying to upload with ImageCDN are long gone. And for faster pageloads, you can use separate URLs for your skin, media, and javascript elements, leveraging parallel downloads. Once those are set up, it’s pretty trivial in Magento to configure the URLs under System > Configuration > Web. It might be a little more work for SSL, but if most of your window shopping is done over plain HTTP then start with the unsecure base URLs for the quickest win.
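To make the parallel-downloads idea concrete, the unsecure base URLs under System > Configuration > Web end up looking something like this (hostnames are placeholders for your own CDN-backed domains):

```text
Base Skin URL (unsecure):        http://skin.example.com/skin/
Base Media URL (unsecure):       http://media.example.com/media/
Base JavaScript URL (unsecure):  http://js.example.com/js/
```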

 
 

6. Load testing

You need to know where your website actually stands in terms of traffic, and you need to do it properly.

Don’t: rely on tools like siege, or services like loader.io and blitz.io. They can be extremely useful of course, but only if you are able to interpret the results properly. Unless you have a deep understanding of HTTP headers, cookies, and unique session IDs, these tools probably won’t help you all that much.
 
Do: get it done professionally. You need a test that can mimic actual human user journeys, and repeat them on a massive scale. JMeter is good, but doing it right can be complicated and very time consuming. I would argue this is best left to professionals who do just that. Not cheap, but a necessary investment for your website’s future. Your hosting provider might offer professional load testing services, or refer you to someone who can. Soasta are excellent.

27 May

Magento Full Page Cache

I find myself talking about this a lot, so here are my musings on Magento and Full Page Cache, written down.

To run at scale, and keep 90% of the load away from your CPUs, you just have to have a Full Page Cache that works. A quick way to test is to measure the Time To First Byte. If FPC is working, your TTFB should be around or under 100ms (not including network latency) after about 2-3 requests.
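Measuring that is a one-liner with curl (swap in one of your own category page URLs, and run it two or three times so the FPC has warmed up):

```shell
# Print just the Time To First Byte for a page.
# time_starttransfer includes connection setup, so run this from near the
# server (or subtract time_connect) to exclude network latency.
curl -o /dev/null -s -w 'TTFB: %{time_starttransfer}s\n' http://www.example.com/
```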

From an SEO and Customer perspective:

Important: Before thinking about Full Page Cache, you do still need reasonable performance without it. Otherwise you’re really just masking other problems, and your customers are going to be less than satisfied when they hit a page that isn’t cached. If your TTFB is much more than 1 second, you first need to talk to your developer and/or hosting provider about optimisation and config. Additionally, this absolutely does not negate the need for a good cache backend, like Redis.

Full Page Cache is the icing on what should already be a tasty cake.

 

Magento 2.x

Magento 2.x has Varnish 3/4 support out of the box, and it’s recommended over Redis as a Full Page Cache.

  1. Install Varnish, and have it listen on port 80 in front of Apache/nginx.
  2. Configure Magento to use Varnish, and export a VCL. Stores > Settings > Configuration > Advanced > System > Full Page Cache.
  3. This should create a Varnish VCL under <Docroot>/var. Configure Varnish to use that.
  4. If you have multiple web servers, you need to:
    • Ensure your local network range is included in acl_purge{} in the Varnish VCL.
    • define an array for http_cache_hosts in the app/etc/env.php configuration
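The env.php entry for multiple Varnish hosts looks something like this (the IPs and port are examples — use whatever your Varnish instances actually listen on):

```php
// app/etc/env.php — tell Magento where to send PURGE requests
'http_cache_hosts' => [
    ['host' => '192.168.100.10', 'port' => '6081'],
    ['host' => '192.168.100.11', 'port' => '6081'],
],
```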

Enterprise Edition 1.x

In Magento Enterprise, you just turn on Full Page Cache, under System > Cache Management. Simple as that! If your developers advise that it needs to remain off for X functionality to work, then get better developers.  If the TTFB is still slow after 3 or 4 requests, despite FPC being turned on, then it’s likely that a third party extension is preventing content being cached (or immediately invalidating the FPC entries). Engage your developers about this.

As far as the config goes, you can use local.xml to configure <full_page_cache> independently from <cache>. Use a second Redis instance if you get plenty of traffic; it’s nicer for sysadmin management and gives you an extra thread.
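A second Redis instance for the FPC in local.xml looks much like the main cache block, just pointing at a different port (6380 here is an assumption — use whatever your second instance listens on):

```xml
<!-- app/etc/local.xml, inside <global> — separate Redis for Enterprise FPC -->
<full_page_cache>
    <backend>Cm_Cache_Backend_Redis</backend>
    <backend_options>
        <server>127.0.0.1</server>
        <port>6380</port>
        <database>0</database>
    </backend_options>
</full_page_cache>
```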

Community Edition 1.x

Third party code

As with most Enterprise features, there are plenty of Community extensions to replicate Full Page Cache. Usually though, you need a skilled Magento developer for good integration. In the case of FPC, it’s likely you’ll need to work on your templates for dynamic block placeholders.

Gordon Lesti’s free module is popular; Extendware, Mirasvit and Brim have good solutions for a small fee; there are countless others to choose from. If these extensions work for you, they’re often a great balance between performance and complexity.

UPDATE: Full article on how to configure Mirasvit FPC. It’s one of the best.

Varnish

Instead of a code-based FPC, it’s possible to use Varnish. Varnish can be insanely fast, but you need a Magento extension to integrate and manage it properly. Amongst their weaponry are such diverse elements as setting the right cache-control headers, TTLs, purging expired content, and giving you the all-important Varnish VCL.

Turpentine is the most feature-rich of the Varnish integrations, and works great under most conditions. But my one issue with it is that the necessary ESI requests for form_keys can happen many times per page (depending on your templates), and these really add up. I have seen those form_key requests alone overwhelm even a very large web infrastructure, during high traffic events like TV promos and load testing.

My personal favourite is to use the Phoenix PageCache implementation for Varnish, with a few of my own VCL tweaks to sort out User-Agent normalisation, SSL termination, and one or two other bits. With PageCache, the form_keys are generated by an embedded C function, then stored in a header for later re-use. It’s a much more efficient way of dealing with Magento’s form_keys. The free PageCache module for Community Edition doesn’t handle hole-punching for logged-in users as Turpentine can, so everyone with a session bypasses Varnish, but that also makes it simpler to manage; you don’t need to mess with your templates. In my experience it does the best job of handling load spikes and gives you the fewest headaches.

 

Varnish is not a silver bullet

If configured badly, it can cause you no end of problems. Broken sessions, add-to-cart not working, seeing someone else’s shopping cart, and generic 503 errors are very common. On the other hand, a stray Vary: header here, or an unnecessary session_start() there, and the site will seem to work but Varnish probably won’t be caching much.   Varnish can make your site blisteringly fast, but you need some solid experience for a successful implementation.

Do: talk to your dev agency and/or hosting provider for advice and expertise.

Don’t: follow a beginners’ how-to and hope for the best.