somedb=#CREATE EXTENSION unaccent;CREATE EXTENSION somedb=#SELECT unaccent('Hôtel'); unaccent ---------- Hotel somedb=#SELECT * FROM table WHERE lower(unaccent(table.id))= lower('Jose');
And speed things up a bit by generating an accentless, lower-cased index:
somedb=#CREATE INDEX CONCURRENTLY ON table(lower(unaccent(id)));
I use the unaccent module for that – which is something different entirely than what you are linking to.
unaccent is a text search dictionary that removes accents (diacritic signs) from lexemes.
Install once per database with:
CREATE EXTENSION unaccent;
It also provides the function
unaccent() you could use with your example (where
LIKE seems not needed).
SELECT*FROM users WHERE unaccent(name)= unaccent('João');
To use an index for that kind of query, create an index on the expression. However, Postgres only accepts
IMMUTABLE functions for functional indexes. If a function can return a different result under different circumstances, the index could silently break.
unaccent() is only
IMMUTABLE. According to this thread on pgsql-bugs, this is due to three reasons:
- It depends on the behavior of a dictionary.
- There is no hard-wired connection to this dictionary.
- It therefore also depends on the current
search_path, which can change easily.
Some tutorials on the web instruct to just alter the function and declare it
IMMUTABLE. This is a brute-force method that might break under rare circumstances.
Others suggest a simple
IMMUTABLE wrapper function, like I did myself in the past.
Another alternative would be this module with an IMMUTABLE
unaccent() function by Musicbrainz, provided on Github. Haven’t tested it myself. I think I have come up with a better idea:
Best for now
All of this put together I propose this mostly safe approach that is just as efficient as other solutions floating around, but safer: Create a simple wrapper function with the two-parameter variant that hard-codes the correct dictionary as well as the right
CREATEOR REPLACE FUNCTION f_unaccent(text) RETURNS text AS$func$SELECT unaccent('unaccent',$1)$func$ LANGUAGE sql IMMUTABLE SET search_path =public, pg_temp;
search_path to the schema where you install your extensions (default is
Why the dangling
pg_temp? To rule out temporary objects coming first. More in the manual here.
You can build a functional index using that, since it is declared
CREATEINDEX users_unaccent_name_idx ON users(f_unaccent(name));
Adapt your query to use the index:
SELECT*FROM users WHERE f_unaccent(name)= f_unaccent('João');
If you want to use it with
LIKE (and a pattern that is not left-anchored), you can combine this with the module
pg_tgrm in PostgreSQL 9.1 or later. Create a functional GIN or GIST index. Example for GIN:
CREATEINDEX users_unaccent_name_trgm_idx ON users USING gin (f_unaccent(name) gin_trgm_ops);
Be aware that GIN and GIST indexes are somewhat more expensive to maintain. Would be used in a query like:
SELECT*FROM users WHERE f_unaccent(name)LIKE('%'|| f_unaccent('João')||'%');
I have written a more detailed answer about pattern matching and performance in a recent answer on dba.SE.
pg_tgrm also provides very useful operators for “similarity”
% and “distance”