We're developing a search as a part of a bigger system.
We have Microsoft SQL Server 2014 - 12.0.2000.8 (X64) Standard Edition (64-bit)
with this setup:
CREATE TABLE NewCompanies(
[Id] [uniqueidentifier] NOT NULL,
[Name] [nvarchar](400) NOT NULL,
[Phone] [nvarchar](max) NULL,
[Email] [nvarchar](max) NULL,
[Contacts1] [nvarchar](max) NULL,
[Contacts2] [nvarchar](max) NULL,
[Contacts3] [nvarchar](max) NULL,
[Contacts4] [nvarchar](max) NULL,
[Address] [nvarchar](max) NULL,
CONSTRAINT PK_Id PRIMARY KEY (Id)
);
Phone
is a structured comma separated digits string like"77777777777, 88888888888"
Email
is structured emails string with commas like"email1@gmail.com, email2@gmail.com"
(or without commas at all like"email1@gmail.com"
)Contacts1, Contacts2, Contacts3, Contacts4
are text fields where users can specify contact details in free form. Like"John Smith +1 202 555 0156"
or"Bob, +1-999-888-0156, bob@company.com"
. These fields can contain emails and phones we want to search further.
Here we create full-text stuff
-- FULL TEXT SEARCH
CREATE FULLTEXT CATALOG NewCompanySearch AS DEFAULT;
CREATE FULLTEXT INDEX ON NewCompanies(Name, Phone, Email, Contacts1, Contacts2, Contacts3, Contacts4, Address)
KEY INDEX PK_Id
Here is a data sample
INSERT INTO NewCompanies(Id, Name, Phone, Email, Contacts1, Contacts2, Contacts3, Contacts4)
VALUES ('7BA05F18-1337-4AFB-80D9-00001A777E4F', 'PJSC Azimuth', '79001002030, 78005005044', 'regular@hotmail.com, s.m.s@gmail.com', 'John Smith', 'Call only at weekends +7-999-666-22-11', NULL, NULL)
Actually we have about 100 thousands of such records.
We expect users can specify a part of email like "@gmail.com" and this should return all the rows with Gmail email addresses in any of Email, Contacts1, Contacts2, Contacts3, Contacts4
fields.
The same for phone numbers. Users can search for a pattern like "70283" and a query should return phones with these digits in them. It's even for free form Contacts1, Contacts2, Contacts3, Contacts4
fields where we probably should remove all but digits and space characters firstly before searching.
We used to use LIKE
for the search when we had about 1500 records and it worked fine but now we have a lot of records and the LIKE
search takes infinite to get results.
This is how we try to get data from there:
SELECT * FROM NewCompanies WHERE CONTAINS((Email, Contacts1, Contacts2, Contacts3, Contacts4), '"s.m.s@gmail.com*"') -- this doesn't get the row
SELECT * FROM NewCompanies WHERE CONTAINS((Phone, Contacts1, Contacts2, Contacts3, Contacts4), '"6662211*"') -- doesn't get anything
SELECT * FROM NewCompanies WHERE CONTAINS(Name, '"zimuth*"') -- doesn't get anything