pg_utils: Low_used_indexes.sql, how it works and why it should be lower than 0.01?
If you work with PostgreSQL the chances that you are aware or used our
pg_utils are quite high. One of them is
low_used_indexes.sql that lately we received a few questions about, so I thought it might be a good idea to write a short blog post about it.
As you might gather from its name,
low_used_indexes.sql goal is to identify indexes that are rarely used. “Rarely” is the key here, since those indexes that are never used can be discarded and those that are used frequently should definitely be kept. With
low_used_indexes.sql we are searching for indexes that need closer investigation since while taking up the precious disk space their usage also can’t be justified from the business logic perspective.
How is this “rare usage” calculated? Let’s take a look at this example:
SELECT pg_stat_user_indexes.schemaname||'.'||pg_stat_user_indexes.relname, indexrelname, pg_stat_user_indexes.idx_scan, (coalesce(n_tup_ins,0)+coalesce(n_tup_upd,0)-coalesce(n_tup_hot_upd,0)+coalesce(n_tup_del,0)) as write_activity, pg_stat_user_tables.seq_scan, pg_stat_user_tables.n_live_tup, pg_size_pretty(pg_relation_size(pg_index.indexrelid::regclass)) as size from pg_stat_user_indexes join pg_stat_user_tables on pg_stat_user_indexes.relid=pg_stat_user_tables.relid join pg_index ON pg_index.indexrelid=pg_stat_user_indexes.indexrelid where pg_index.indisunique is false and pg_stat_user_indexes.idx_scan::float/(coalesce(n_tup_ins,0)+coalesce(n_tup_upd,0)-coalesce(n_tup_hot_upd,0)+coalesce(n_tup_del,0)+1)::float<0.01 and (coalesce(n_tup_ins,0)+coalesce(n_tup_upd,0)-coalesce(n_tup_hot_upd,0)+coalesce(n_tup_del,0))>10000 order by 4 desc,1,2
At first glance, it’s not clear why
pg_stat_user_indexes.idx_scan is divided by
write_activity and why it should be lower than 0.01? Afterall, we could choose indexes
withidx_scan = 0?
First, unfortunately, among indexes that are seldomly used
idx_scan=0 will miss those that are used once in twenty-four hours or once a week or once a month.
Second, if there are not many writes in the table such rarely used indexes do not pose any threat, that is aside the fact that they still use your disk space. However, if you have rarely used indexes AND your table has many writes you will sooner or later be facing quite severe consequences in terms of disk space. So, in the latter case it is better to delete it and perform a regular
sequential scan, especially if
pg_stat_user_tables.seq_scan isn’t equal to 0.
<0.01 is a somewhat rule-of-thumb number that is mainly based on our own experience working with a variety of clients.
Normally indexes that fall into this category are worthwhile reviewing and based on what business logic stands behind them, in what queries they are used and based on that consider if there won’t be any faults in the process if you leave them as they are.
Usually, the smaller the table the less risky is to remove one of those rarely used indexes, but one should consider any deletion of that level carefully.
Hope this is useful and if you would like us to post more information like this about the rest of our utils let us know in the comments below.