Postgres Full Text Search

Postgres has built-in functions to handle Full Text Search queries. This is like a "search engine" within Postgres.

to_tsvector()

Converts your data into searchable "tokens". to_tsvector() stands for "to text search vector". For example:

select to_tsvector("the green eggs and ham")
-- Returns 'egg':2 'green':1 'ham':4


select to_tsvector("the green egg and ham")
-- Returns 'egg':2 'green':1 'ham':4

Collectively these tokens are called a document which Postgres can use for comparisons, where every token is a lexeme (unit of lexical meaning). The stopwords (and, or, the, ...) are conveniently omitted. As you can see, the eggs will be normalized as a lexeme in English: egg.

to_tsquery()

to_tsquery(), which accepts a list of words that will be checked against the normalized vector we created with to_tsvector().

The @@ operator to check if tsquery matches tsvector, it's returns true (t) if matched, otherwise returns false (f).

Let's see some queries below:

select to_tsvector("the green eggs and ham") @@ to_tsquery("egg")
?column?
--------
t
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs")
?column?
--------
t

Use & for AND in the search query:

select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs & red")
?column?
--------
f

Use | for OR in the search query:

select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs | red")
?column?
--------
t

Use the proximity symbol <-> for searching for terms that are a certain "distance" apart. For example, search the phase green egg, green is followed immediately by a match for egg.

select to_tsvector("the green eggs and ham") @@ to_tsquery("green <-> egg")
?column?
--------
t

For example, search for the phase egg [1 word] ham, find egg and ham within 2 words of each other:

select to_tsvector("the green eggs and ham") @@ to_tsquery("egg <1> ham")
?column?
--------
t

Use the negation symbol ! to find phrases which don't contain a search term. For example, search for the phase that have egg but not ham:

select to_tsvector("the green eggs and ham") @@ to_tsquery("egg & !ham")
?column?
--------
f

References

DataData, Data Engineer, Database, Postgres