0

Table flights:

ID Path
1 NZ:EU
2 JP:CA
SELECT
    path
FROM
    flights
WHERE
    path ILIKE '%' || 'jpca' || '%'

Above query isn't working, it needs to return second row. But it works if I provide for example:

  • jp
  • ca
  • p
  • jp:ca
  • CA

It should also work for:

  • jpca
  • pj
  • cp
  • a:p

Regex answer is also accepted.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
MusabShakeel
  • 15
  • 1
  • 4
  • 2
    Why would you expect `'jpca'` to match the string `'JP:CA'`? If you want a match you need `'jp:ca'` –  May 14 '22 at 14:29
  • Do you mean `path ILIKE ANY '{%j%,%p%,%c%,%a%}'::text[]`? But tbh this doesn't make much sense – Bergi May 14 '22 at 14:35
  • @a_horse_with_no_name can it be done through regex? – MusabShakeel May 14 '22 at 14:44
  • @Bergi ye this won't be helpful – MusabShakeel May 14 '22 at 14:45
  • @MusabShakeel So what exactly *are* you looking for then? `path ILIKE '%jp%ca'`? – Bergi May 14 '22 at 19:29
  • @Bergi any letter/character matches in Path string, it needs to return that record. I provided example too. Postgres ILIKE is doing search linearly (left to right) on a string, but I need a solution which can search letter by letter or something like that. So if I provide "pj" it needs to return "JP:CA" record, if I provide jpca it needs to return "JP:CA" record and so on. – MusabShakeel May 14 '22 at 20:49
  • @MusabShakeel So you mean *all* of the letters of the string you provide must be found in the `offset` column value? But in arbitrary order? – Bergi May 14 '22 at 21:22
  • 1
    @Bergi I didn't properly understand your question, but if I provide `ac`, it matches second record and returns `JP:CA` but if I provide some odd string like `f`, `PH` etc. these aren't available in any plot so it'll just return no record (null). – MusabShakeel May 15 '22 at 09:07
  • @MusabShakeel Yes, that answers the question. So if you provided `':'`, it would match both rows? – Bergi May 15 '22 at 14:53
  • @Bergi yes it'll return both rows – MusabShakeel May 15 '22 at 17:51

2 Answers2

1

It's much easier if you process the column path before the match

Match individual characters

(Updated question.)

Assumptions:

  • All characters are significant, including punctuation.
  • A pattern matches if every character is found in the path.
  • Match case-insensitive.

Lower-case both operands and treat them as arrays.
If there can be duplicate letters, eliminate them for efficiency.

SELECT path
FROM   flights
WHERE  string_to_array(lower(path), null)
    @> string_to_array(lower('JPCA'), null);

Or:

...
WHERE  string_to_array(lower(path), null)  @> '{j,p,c,a}';

Returns all rows where path contains every character in the search patter.

@> is the array "contains" operator.

If the table is big, support it with an GIN index on the expression to make that fast (which is the point of this route):

CREATE INDEX flights_path_array_lower_gin_idx ON flights
USING gin (string_to_array(lower(path), null));

Related, with links to more:

If you don't need index support, a simper check will do:

...
WHERE path ~* ALL (string_to_array('JPCA', null))

~* is the case-insensitive regular expression match operator.

Related:

Substring matching

(Original question.)

Assumptions:

  • The sequence of characters in the search term must be matched.
  • Only ASCII letters are significant
  • Duplicate characters are significant
SELECT path
FROM   flights
WHERE  lower(regexp_replace(path, '[^a-zA-Z]', '', 'g')) ~ lower('JPCA');

This removes all characters except A-Z and a-z and casts the result to lower case before trying a regular expression match. Related:

If your table is big and you need it to be fast, create a trigram expression index:

CREATE INDEX flights_path_expr_idx ON flights
USING gin (lower(regexp_replace(path, '[^a-zA-Z]', '', 'g') gin_trgm_ops);

Requires the additional module pg_trgm installed. See:

Or add a "generated column" to your table and a plain B-tree index on that:

ALTER TABLE flights 
  ADD COLUMN path_ascii text GENERATED ALWAYS AS (lower(regexp_replace(path, '[^a-zA-Z]', '', 'g'))) STORED;

CREATE INDEX flights_path_ascii_trgm_idx ON flights USING gin (path_ascii gin_trgm_ops);

Then:

SELECT path FROM flights WHERE path_ascii ~ 'jpca';

See:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

If you don't care for the characters of your term to occur in exactly that sequence, but only want to specify a search that matches each of these characters somewhere, you can use the ALL keyword to match multiple ILIKE patterns at once:

SELECT path
FROM flights
WHERE path ILIKE ALL( ARRAY['%j%', '%p%', '%c%', '%a%'] );

Now to generate that array from just a single string, you can use

SELECT *
FROM flights
WHERE path ILIKE ALL (SELECT '%' || regexp_split_to_table('jpca', '') || '%');

(online demo)

Bergi
  • 630,263
  • 148
  • 957
  • 1,375