I am iterating over flatProduct.Catalogs
slice and populating my productCatalog
concurrent map in golang. I am using upsert method so that I can add only unique productID's
into my productCatalog
map.
This uses a linear scan to check for duplicate product IDs but in my case I have more than 700k productId's so it is very slow for me. I am looking for ways to make it more efficient.
Below code is called by multiple goroutines in parallel that is why I am using concurrent map here to populate data into it.
var productRows []ClientProduct
err = json.Unmarshal(byteSlice, &productRows)
if err != nil {
return err
}
for i := range productRows {
flatProduct, err := r.Convert(spn, productRows[i])
if err != nil {
return err
}
if flatProduct.StatusCode == definitions.DONE {
continue
}
r.products.Set(strconv.Itoa(flatProduct.ProductId, 10), flatProduct)
for _, catalogId := range flatProduct.Catalogs {
catalogValue := strconv.FormatInt(int64(catalogId), 10)
// how can I improve below Upsert code for `productCatalog` map so that it can runs faster for me?
r.productCatalog.Upsert(catalogValue, flatProduct.ProductId, func(exists bool, valueInMap interface{}, newValue interface{}) interface{} {
productID := newValue.(int64)
if valueInMap == nil {
return []int64{productID}
}
oldIDs := valueInMap.([]int64)
for _, id := range oldIDs {
if id == productID {
// Already exists, don't add duplicates.
return oldIDs
}
}
return append(oldIDs, productID)
})
}
}
Above upsert code is very slow for me and it takes lot of time to add unique product id's as a value in my concurrent map. Here is how productCatalog
is defined.
productCatalog *cmap.ConcurrentMap
Here is the upsert
method which I am using - https://github.com/orcaman/concurrent-map/blob/master/concurrent_map.go#L56
This is how I am reading data from this cmap:
catalogProductMap := clientRepo.GetProductCatalogMap()
productIds, ok := catalogProductMap.Get("200")
var data = productIds.([]int64)
for _, pid := range data {
...
}