27 Mar

Is Magento indexing breaking my site?

Magento indexing in Community Edition 1.x (and older EE ≤ 1.12) is an absolute train wreck.

Sooner or later it’s going to take your site down, or key parts of it like searching and checkout.

This article is about how to tell whether or not this is a problem for your site.

The worst thing is that the default behaviour is to reindex every time you save a product. Chances are this will be business hours, when your content editors are working on products. Not good.

System > Index Management.

System > Index Management.

 

The two that usually take the longest are are:

  1. Catalog URL Rewrites”This holds all references to every product, including old products if you’re keeping links for SEO.
  2. “Catalog Search Index”By default this is a MyISAM table with a FULLTEXT index, so the whole table gets locked during reindexing.  In MySQL 5.6 variants we can switch this table to InnoDB for row-level locking, but there is usually still disruption.

 

How long does it take?

Longer for larger catalogs, but you can find out from the database:

mysql> SELECT * FROM index_process;

Or, a quicker way:

mysql>  SELECT indexer_code, TIMESTAMPDIFF(SECOND, started_at, ended_at) as duration from index_process ORDER BY duration DESC;
+---------------------------+----------+
| indexer_code              | duration |
+---------------------------+----------+
| catalog_url               |       16 |
| catalog_product_attribute |        2 |
| catalog_category_product  |        2 |
| catalogsearch_fulltext    |        2 |
| catalog_product_price     |        1 |
| catalog_product_flat      |        0 |
| catalog_category_flat     |        0 |
| cataloginventory_stock    |        0 |
| tag_summary               |        0 |
+---------------------------+----------+
9 rows in set (0.00 sec)

This was just with uses the stock data and took 16 seconds for the catalog_url index. Most real-world shops will take several minutes; half an hour is pretty normal.  I’ve seen it take hours, where there are tens of thousands of SKUs.

Tell-tale signs

When indexing is causing a problem, you’re most likely to see Magento errors like this in your /var/report/ directory:

SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction

Anyone on StackOverflow might suggest increasing the innodb_lock_wait_timeout, but I don’t think that helps. Here’s why:
The default timeout 50 seconds, but a lot of browsers, reverse proxies etc, tend to time out after 30 seconds. You’re likely to see a 503/unavailable/bad gateway/ or something similar. Regardless, your customers are very unlikely to wait 50 seconds or longer for a page load. Increasing this value might make the SQL errors go away but it doesn’t address the root cause.

More subtly, if you’re using Nginx, especially behind a reverse proxy, you might see 499 errors in your access log. Although not an official RFC-defined error code, it means Nginx gave up because the connection was terminated at the client end. I wanted to mention that because sometimes this is the only error you’ll see, even if Magento/PHP/MySQL aren’t throwing errors lower down the stack.

Timing is key

Two ways to see exactly when it’s happening.

mysql> SELECT * FROM index_process;

The timestamps in this table will show you the most recent runs, and also remind you which ones are set to update on save (mode=”real_time”).

  • # grep [Rr]eindex access_log

    This will show up exactly when the Admin Dashboard was used to do indexing.

    # grep [Rr]eindex access_log
    127.0.0.1 - - [19/Mar/2015:13:43:34 +0000] "GET /index.php/admin/process/reindexProcess/process/1/key/3587af3c674a88e5304db11774e36326/ HTTP/1.1" 302 - "https://www.domain.com/index.php/admin/process/list/key/1129d37eabefa571c1956a19f45f632b/" "User Agent String"
    127.0.0.1 - - [19/Mar/2015:13:43:34 +0000] "GET /index.php/admin/process/reindexProcess/process/1/key/3587af3c674a88e5304db11774e36326/ HTTP/1.1" 302 - "https://www.domain.com/index.php/admin/process/list/key/1129d37eabefa571c1956a19f45f632b/" "User Agent String"
    127.0.0.1 - - [19/Mar/2015:13:44:04 +0000] "POST /index.php/admin/process/massReindex/key/0d75d56b5e90243c0175922deecb4e43/ HTTP/1.1" 302 - "https://www.domain.com/index.php/admin/process/list/key/1129d37eabefa571c1956a19f45f632b/" "User Agent String"
    
  • “massReindex” shows up when using the bulk reindex option
  • “reindexProcess/process/N” is a single index refresh, where N corresponds to the ID in the index_process table.

So, take the timestamp in the log here, let’s say 13:44:04. I know from earlier that it takes about 23 seconds to get through all my indexes, and given that log entries are generally made when a request is finished, I can count backwards and to work out we could’ve had possible website disruption between 13:43:41 and 13:44:04.  Not much of an issue for my test site, but in the real world, we’re usually looking at several minutes, usually enough for your customers to get bored and shop somewhere else.

I’ve found this to be really useful when supporting customers wondering why their website broke at X time on X day.

The solutions?

… another post for another day. UPDATE: here’s that post.