I am developing a web application for managing customers. I have a Customer entity made up of the usual fields such as first_name, last_name, age, etc.
I have a page where these customers are shown as a table. On the same page I have a search field, and I'd like to filter the customers and update the table while the user is typing in the search field, using Ajax. Here is how it should work:
Figure 1: The main page showing all of the customers:
Figure 2: As soon as the user types the letter "b", the table is updated with the results:
Given that partial text matching is not supported in GAE, I worked around it, starting from what is shown here: TL;DR: I have created a Customers Index that contains a Search Document for every customer (doc_id=customer_key). Each Search Document contains Atom Fields for every customer field I want to be able to search on (e.g. first_name, last_name). Each field is built like this: if the last_name is Berlusconi, the field is made up of the Atom Fields "b" "be" "ber" "berl" "berlu" "berlus" "berlusc" "berlusco" "berluscon" "berlusconi". This way, full text matching behaves like partial text matching: if I search for "Be", the Berlusconi customer is returned.
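To make the prefix scheme concrete, here is a minimal standalone sketch of how the tokens for a single value could be generated. The helper name `prefixes` is only for illustration and is not part of my application code:

```python
def prefixes(value):
    """Return every leading prefix of `value`, lowercased, shortest first."""
    value = value.lower()
    return [value[:end] for end in range(1, len(value) + 1)]


# Each prefix becomes one searchable token for the customer.
print(" ".join(prefixes("Berlusconi")))
```

A query for "be" (lowercased "Be") then matches because "be" is one of the stored tokens.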
The search is made through Ajax calls: whenever the user types in the search field, an Ajax call is made with the query string and a JSON object is returned. The call is delayed a little to see whether the user keeps typing, to avoid sending a burst of requests.
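The delay itself lives in the client-side JavaScript, which is not shown here. Purely to illustrate the idea, this is a rough Python sketch of the same "wait until typing stops" logic; the `Debouncer` class and the 0.05 s delay are assumptions made for the example, not my real code:

```python
import threading
import time


class Debouncer(object):
    """Run `callback` only after `delay` seconds without a new trigger."""

    def __init__(self, delay, callback):
        self._delay = delay
        self._callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self, *args):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # a newer keystroke supersedes the pending call
            self._timer = threading.Timer(self._delay, self._callback, args)
            self._timer.daemon = True
            self._timer.start()


sent = []  # stands in for the Ajax requests that would actually be sent
debouncer = Debouncer(0.05, sent.append)
for typed in ["b", "be", "ber"]:  # three quick keystrokes
    debouncer.trigger(typed)
time.sleep(0.3)  # wait past the delay so the last timer can fire
print(sent)
```

Only the query that is still pending when the user pauses should result in a request; the earlier keystrokes are cancelled.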
Now, things were working well while debugging, but I was testing with only a few people in the datastore. As soon as I added many people, search became very slow.
This is how I create search documents. This is called every time a new customer is put to the datastore.
from google.appengine.api import search

def put_search_document(cls, key):
    """
    Called by _post_put_hook in BaseModel
    """
    model = key.get()
    _fields = []
    if model:
        # to retrieve customers when no query string
        _fields.append(search.AtomField(name="empty", value=""))
        _fields.append(search.TextField(name="sort1", value=model.last_name.lower()))
        _fields.append(search.TextField(name="sort2", value=model.first_name.lower()))
        _fields.append(search.TextField(name="full_name", value=Customer.tokenize1(
            model.first_name.lower() + " " + model.last_name.lower()
        )))
        _fields.append(search.TextField(name="full_name_rev", value=Customer.tokenize1(
            model.last_name.lower() + " " + model.first_name.lower()
        )))
        # _fields.append(search.TextField(name="telephone", value=Customer.tokenize1(
        #     model.telephone.lower()
        # )))
        # _fields.append(search.TextField(name="email", value=Customer.tokenize1(
        #     model.email.lower()
        # )))
        document = search.Document(  # create new document with doc_id=key.urlsafe()
            doc_id=key.urlsafe(),
            fields=_fields)
        # not in try-except: defer will catch and retry.
        index = search.Index(name=cls._get_kind() + "Index")
        index.put(document)
@staticmethod
def tokenize1(string):
    # Build "b be ber ...": every leading prefix of the string, space-separated.
    return " ".join(string[0:i+1] for i in range(len(string)))
This is the search code:
@staticmethod
def search(ndb_model, query_phrase):
    # TODO: search returns a limited number of results(20 by default)
    # (See Search Results at https://cloud.google.com/appengine/docs/python/search/#Python_Overview)
    sort1 = search.SortExpression(expression='sort1', direction=search.SortExpression.ASCENDING,
                                  default_value="")
    sort2 = search.SortExpression(expression='sort2', direction=search.SortExpression.ASCENDING,
                                  default_value="")
    sort_opt = search.SortOptions(expressions=[sort1, sort2])
    results = search.Index(name=ndb_model._get_kind() + "Index").search(
        search.Query(
            query_string=query_phrase,
            options=search.QueryOptions(
                sort_options=sort_opt
            )
        )
    )
    print "----------------"
    res_list = []
    for r in results:
        obj = ndb.Key(urlsafe=r.doc_id).get()
        print obj.first_name + " " + obj.last_name
        res_list.append(obj)
    return res_list
Has anyone else had the same experience? If so, how did you solve it?
Thank you guys very much, Marco Galassi
EDIT: names, emails and phone numbers are obviously totally invented.
EDIT 2: I have now moved to TextField, which looks a little bit faster, but the problem still persists.