20 Apr

Foolproof Magento Indexing

This is for Community Edition and Enterprise Editions before 1.13.

Once you have established that Magento indexing is breaking your site, here is a simple 1-2-3 solution.

Reindexing in the daytime on a busy site can cause problems, and by default Magento will fully reindex after any product/catalogue changes. You probably don’t want that to happen in peak business hours.

1. Manual indexes.

Two of the indexes are more likely to cause you problems than any of the others – the URL rewrites and the Fulltext search. Set them to manual – the others should be OK.


System > Index Management

Alternatively you can set this directly in the database:

mysql> UPDATE index_process SET mode='manual' WHERE indexer_code='catalog_url';
mysql> UPDATE index_process SET mode='manual' WHERE indexer_code='catalogsearch_fulltext';
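
The same change can also be made from the command line rather than in SQL. This is a minimal sketch, assuming the shell/indexer.php script that ships with Magento 1.x and its --mode-manual and --mode options. MAGE_ROOT, PHP_BIN, the run helper and the DRY_RUN switch are placeholders introduced for this example, not part of Magento:

```shell
#!/bin/sh
# Placeholders (assumptions): adjust both for your environment.
MAGE_ROOT="${MAGE_ROOT:-/path/to/magento/documentroot}"
PHP_BIN="${PHP_BIN:-/usr/bin/php}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Switch the URL rewrite and fulltext search indexers to "Manual Update",
# then show the resulting mode of each.
set_manual_mode() {
    indexers="catalog_url,catalogsearch_fulltext"
    run "$PHP_BIN" "$MAGE_ROOT/shell/indexer.php" --mode-manual "$indexers" &&
    run "$PHP_BIN" "$MAGE_ROOT/shell/indexer.php" --mode "$indexers"
}
```

Calling set_manual_mode with DRY_RUN=1 only prints the two commands, which is a cheap way to check the paths before running it for real as the application user.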

2. Configure a cron job to do that manual reindex, every day.

crontab -e -u username

username is the user that runs your PHP-FPM, or just apache for mod_php. I try to avoid having root run these jobs: root then owns the lock files created in ~/var/, and the application user will not be able to work with them.

Your added cron job should look something like this:

@daily /usr/bin/php /path/to/magento/documentroot/shell/indexer.php reindexall >/dev/null 2>&1

I’ve used @daily as a cron shortcut, which is usually midnight (server timezone). You could be more specific if you like, for example if you need to avoid other jobs like database backups. This is in addition to the normal Magento cron running every 5 minutes.
Obviously you need to replace /path/to/magento/documentroot with whatever’s relevant in your hosting environment.
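
If @daily clashes with something else, the entry can name an exact time instead. The sketch below prints such a crontab line. The function name reindex_cron_line and the 02:30 slot are assumptions made for illustration; pick whatever time avoids your own backups and other jobs:

```shell
#!/bin/sh
# Print a crontab entry that runs the full reindex at a fixed time each day.
# Arguments: minute, hour, Magento root. The PHP path comes from PHP_BIN.
reindex_cron_line() {
    minute="$1"
    hour="$2"
    mage_root="$3"
    printf '%s %s * * * %s %s/shell/indexer.php reindexall >/dev/null 2>&1\n' \
        "$minute" "$hour" "${PHP_BIN:-/usr/bin/php}" "$mage_root"
}

# Example: 02:30 server time, chosen here only as an illustration.
reindex_cron_line 30 2 /path/to/magento/documentroot
```

The printed line can be pasted into the crontab in place of the @daily entry.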

If you don’t have access or confidence to do this via SSH, your hosting provider should be able to help.

3. Ignore the banner.

It might seem like a silly thing to mention, but I’ve often seen cases where a diligent member of staff followed the banner’s advice and ran the reindex by hand, unaware that it was causing problems and that the cron job would do it anyway. If you have a large team of admin staff, just be sure to let them all know.



Business critical updates?

When I suggest this, I’m often greeted with something like, “...but it’s absolutely essential that new products are searchable and available via their URLs IMMEDIATELY!”

You have a few choices here:

  1. Think about your business requirements vs. impact vs. cost. Do you really need that? All the time? If it’s just occasionally, then continue as above and deal with the occasional manual reindex in the daytime.

  2. Third party code. This extension claims to do the job. There are probably others, too. I can’t vouch for it because I’m not really a developer. As with all third party extensions, the fewer the better and, of course, YMMV.

  3. Buy the Enterprise Edition. Or upgrade if you’re on EE < 1.13. There are plenty of other reasons for this, but index management is a major factor. If you have enough products for indexing to be an issue, and it really is business critical that your indexes are up-to-the-minute fresh, and you need vendor escalation with your software, then it’s a no-brainer. Talk to your finance director, bite the bullet, and invest in software that does the job out of the box.
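
For the first option, an occasional daytime reindex does not have to rebuild everything. A rough sketch, assuming the --reindex option of Magento 1.x's shell/indexer.php, which takes a comma-separated list of indexer codes. MAGE_ROOT, PHP_BIN, the run helper and DRY_RUN are placeholders introduced for this example:

```shell
#!/bin/sh
# Placeholders (assumptions): adjust both for your environment.
MAGE_ROOT="${MAGE_ROOT:-/path/to/magento/documentroot}"
PHP_BIN="${PHP_BIN:-/usr/bin/php}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rebuild only the named indexer(s), e.g. after adding urgent products.
# $1 is a comma-separated list of indexer codes.
reindex_only() {
    run "$PHP_BIN" "$MAGE_ROOT/shell/indexer.php" --reindex "$1"
}
```

For example, reindex_only catalogsearch_fulltext rebuilds just the search index, and reindex_only catalog_url,catalogsearch_fulltext rebuilds both of the indexes set to manual earlier. As with the cron job, run it as the application user rather than root.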