0

There's a simple text field and as a user types, an api call will be made that will fetch name of people based on what user types. For example, when a user presses a button 'A', immediately an API request will be made and all names starting with 'A' will be pulled from the database. The problem currently is that since we have around 1 million entries, it is taking around 8-10 seconds for every request. Is there a way to make this faster?

Here is the axios post request that I'm making.

  fetchingDataOnChange = () => {
    let nameValue = this.state.searchName;
    var myParams = {
        nameValue: nameValue
    }
    axios.post('/get-hcp-data', myParams) 
        .then((res) => {
            const hcps = res.data;
            this.setState({ hcps: hcps, hcpName: Object.values(hcps.hcp_details_concat) })
            console.log(res);
        }, (error) => {
            console.log(error);
        });
}

And here is the function in flask:

 @app.route('/get-hcp-data', methods=['GET', 'POST'])
 def get_hcp_data():
    value = request.get_json()
    newValue = value['nameValue']
    engine = connect_pg()
    sql = "select * from dim_hcp_details where lower(hcp_name) like lower('"+newValue+"%') limit 50"
    hcp_details = pd.read_sql_query(sql,engine)
    hcp_details['hcp_details_concat'] = "[" + hcp_details['hcp_id'] + "]" + hcp_details['hcp_name']
    hcp_dict = hcp_details.to_dict()
    return jsonify(hcp_dict)

enter image description here

ShridharK
  • 365
  • 2
  • 14
  • 2
    The most important thing here is not to use `+` on SQL query strings. It leads to injection vulnerabilities. Use `params` as described in https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html . – Klaus D. Mar 04 '21 at 10:53
  • 1
    I would cache it. Seems like something that it's called multiple times with same values and same response – Capie Mar 04 '21 at 11:00
  • 1
    Use the debouncing technique in your frontend, or preload the data by calling the API. and then filter that internally on the frontend side. – neilharia7 Mar 04 '21 at 11:04
  • If it's pandas dataframe you're querying then most likely it performs `'a' in row.hcp_name` check on every row in dataframe which can be really inefficient with large dataframe. I would try to optimize it with some kind of index but I'm not sure if if's possible with pandas. – kobuz Mar 04 '21 at 11:11
  • @KlausD. Can you give an elaborate example of that, a small code of an alternative way will suffice since I'm not very familiar with that. – ShridharK Mar 04 '21 at 11:20
  • You can find an example at https://stackoverflow.com/q/24408557/3929826. For more: https://www.google.com/search?q=pandas+read_sql_query+injection – Klaus D. Mar 04 '21 at 11:29
  • I executed this statement but I'm getting an error that the '?' is not supported. hcp_details = pd.read_sql_query(("select * from dim_hcp_details where lower(hcp_name) like lower(%?) limit 50"),engine,params={newValue}) – ShridharK Mar 04 '21 at 11:56

0 Answers0