PgSearch full-text search with GIN
prerequisites
- install PgSearch gem
gem 'pg_search'
- If there is exist data before install pg_search gem, Insert data to pg_search_documents Table
$ rails console > PgSearch::Multisearch.rebuild(YourModel, clean_up: true)
- seed file for large data
$ rails console 200000.times do FactoryBot.create :your_model end
Before GIN(Generalized Inverted Index)
"Execution Time: 7238.816 ms"
> ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{PgSearch.multisearch('Diablo').to_sql}").values
(7342.5ms) EXPLAIN ANALYZE SELECT "pg_search_documents".* FROM "pg_search_documents" INNER JOIN (SELECT "pg_search_documents"."id" AS pg_search_id, (ts_rank((to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))), (to_tsquery('simple', ''' ' || 'Diablo' || ' ''')), 0)) AS rank FROM "pg_search_documents" WHERE ((to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'Diablo' || ' ''')))) AS pg_search_ce9b9dd18c5c0023f2116f ON "pg_search_documents"."id" = pg_search_ce9b9dd18c5c0023f2116f.pg_search_id ORDER BY pg_search_ce9b9dd18c5c0023f2116f.rank DESC, "pg_search_documents"."id" ASC
=>
[["Gather Merge (cost=117241.98..117645.20 rows=3456 width=126) (actual time=7011.033..7063.181 rows=3053 loops=1)"],
[" Workers Planned: 2"],
[" Workers Launched: 2"],
[" -> Sort (cost=116241.95..116246.27 rows=1728 width=126) (actual time=6951.903..6960.173 rows=1018 loops=3)"],
[" Sort Key: (ts_rank(to_tsvector('simple'::regconfig, COALESCE(pg_search_documents_1.content, ''::text)), '''diablo'''::tsquery, 0)) DESC, pg_search_documents.id"],
[" Sort Method: quicksort Memory: 537kB"],
[" Worker 0: Sort Method: quicksort Memory: 584kB"],
[" Worker 1: Sort Method: quicksort Memory: 551kB"],
[" -> Nested Loop (cost=0.42..116149.03 rows=1728 width=126) (actual time=57.831..6938.574 rows=1018 loops=3)"],
[" -> Parallel Seq Scan on pg_search_documents pg_search_documents_1 (cost=0.00..106474.33 rows=1728 width=89) (actual time=55.811..6794.069 rows=1018 loops=3)"],
[" Filter: (to_tsvector('simple'::regconfig, COALESCE(content, ''::text)) @@ '''diablo'''::tsquery)"],
[" Rows Removed by Filter: 72317"],
[" -> Index Scan using pg_search_documents_pkey on pg_search_documents (cost=0.42..5.35 rows=1 width=122) (actual time=0.026..0.026 rows=1 loops=3053)"],
[" Index Cond: (id = pg_search_documents_1.id)"],
["Planning Time: 102.483 ms"],
["JIT:"],
[" Functions: 24"],
[" Options: Inlining false, Optimization false, Expressions true, Deforming true"],
[" Timing: Generation 18.007 ms, Inlining 0.000 ms, Optimization 11.028 ms, Emission 79.955 ms, Total 108.990 ms"],
["Execution Time: 7238.816 ms"]]After GIN(Generalized Inverted Index)
"Execution Time: 294.301 ms"
> ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{PgSearch.multisearch('Diablo').to_sql}").values
(296.3ms) EXPLAIN ANALYZE SELECT "pg_search_documents".* FROM "pg_search_documents" INNER JOIN (SELECT "pg_search_documents"."id" AS pg_search_id, (ts_rank((to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))), (to_tsquery('simple', ''' ' || 'Diablo' || ' ''')), 0)) AS rank FROM "pg_search_documents" WHERE ((to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'Diablo' || ' ''')))) AS pg_search_ce9b9dd18c5c0023f2116f ON "pg_search_documents"."id" = pg_search_ce9b9dd18c5c0023f2116f.pg_search_id ORDER BY pg_search_ce9b9dd18c5c0023f2116f.rank DESC, "pg_search_documents"."id" ASC
=>
[["Gather Merge (cost=9465.68..9540.09 rows=647 width=394) (actual time=214.256..268.902 rows=3053 loops=1)"],
[" Workers Planned: 1"],
[" Workers Launched: 1"],
[" -> Sort (cost=8465.67..8467.29 rows=647 width=394) (actual time=205.182..217.448 rows=1526 loops=2)"],
[" Sort Key: (ts_rank(to_tsvector('simple'::regconfig, COALESCE(pg_search_documents_1.content, ''::text)), '''diablo'''::tsquery, 0)) DESC, pg_search_documents.id"],
[" Sort Method: quicksort Memory: 871kB"],
[" Worker 0: Sort Method: quicksort Memory: 800kB"],
[" -> Nested Loop (cost=28.95..8435.47 rows=647 width=394) (actual time=1.086..190.694 rows=1526 loops=2)"],
[" -> Parallel Bitmap Heap Scan on pg_search_documents pg_search_documents_1 (cost=28.53..3622.37 rows=647 width=357) (actual time=0.781..18.373 rows=1526 loops=2)"],
[" Recheck Cond: (to_tsvector('simple'::regconfig, COALESCE(content, ''::text)) @@ '''diablo'''::tsquery)"],
[" Heap Blocks: exact=1402"],
[" -> Bitmap Index Scan on index_pg_search_documents_on_content (cost=0.00..28.25 rows=1100 width=0) (actual time=1.038..1.056 rows=3053 loops=1)"],
[" Index Cond: (to_tsvector('simple'::regconfig, COALESCE(content, ''::text)) @@ '''diablo'''::tsquery)"],
[" -> Index Scan using pg_search_documents_pkey on pg_search_documents (cost=0.42..7.19 rows=1 width=390) (actual time=0.012..0.012 rows=1 loops=3053)"],
[" Index Cond: (id = pg_search_documents_1.id)"],
["Planning Time: 0.618 ms"],
["Execution Time: 294.301 ms"]]How to make pg_search_document index with GIN (Generalized Inverted Index)
% docker-compose run --no-deps web rails generate migration add_index_to_documents_content
class AddIndexToDocumentsContent < ActiveRecord::Migration[6.1]
def change
add_index :pg_search_documents, %[to_tsvector('simple', coalesce("pg_search_documents"."content"::text, ''))], using: :gin, name: "index_pg_search_documents_on_content"
end
end
% docker-compose run --no-deps web rails db:migrateCheck pg_search_documents index
you can see "index_pg_search_documents_on_content" that is new generated index with GIN
% docker-compose run --no-deps web rails console
> ActiveRecord::Base.connection.indexes(:pg_search_documents)
>=>
[#<ActiveRecord::ConnectionAdapters::IndexDefinition:0x000056053f9e1c00
@columns="to_tsvector('simple'::regconfig, COALESCE(content, ''::text))",
@comment=nil,
@lengths={},
@name="index_pg_search_documents_on_content",
@opclasses={},
@orders={},
@table=:pg_search_documents,
@type=nil,
@unique=false,
@using=:gin,
@where=nil>,
#<ActiveRecord::ConnectionAdapters::IndexDefinition:0x000056053f9e12a0
@columns=["searchable_type", "searchable_id"],
@comment=nil,
@lengths={},
@name="index_pg_search_documents_on_searchable",
@opclasses={},
@orders={},
@table=:pg_search_documents,
@type=nil,
@unique=false,
@using=:btree,
@where=nil>]
https://github.com/Casecommons/pg_search/wiki/Building-indexes#how-to-create-a-model-scope-and-use-multisearch-index