13

I know I can do like this:

func randomFunc() {
    // do stuff
    go destroyObjectAfterXHours(4, "idofobject")
    // do other stuff
}

func destroyObjectAfterXHours(hours int, id string) {
    time.Sleep(hours * time.Hour)
    destroyObject(id)
}

but if we imagine destroyObjectAfterXHours is called a million times within a few minutes, this solution will be very bad.

I was hoping someone could share a more efficient solution to this problem.

I've been thinking about a potential solution where destruction time and object id was stored somewhere, and then there would be one func that would traverse through the list every X minutes, destroy the objects that had to be destroyed and remove their id and time info from wherever that info was stored. Would this be a good solution?

I worry it would also be bad solution since you will then have to traverse through a list with millions of items all the time, and then have to efficiently remove some of the items, etc.

phuclv
  • 37,963
  • 15
  • 156
  • 475
fisker
  • 979
  • 4
  • 18
  • 28
  • Can you describe your use case a bit more? I am asking because there might be a more elegant solution. For example, MongoDB allows auto deletion of documents based on a TTL index on a timestamp. – Markus W Mahlberg Dec 31 '17 at 19:21

4 Answers4

29

The time.AfterFunc function is designed for this use case:

func randomFunc() {
    // do stuff
    time.AfterFunc(4*time.Hour, func() { destroyObject("idofobject") })
    // do other stuff
}

time.AfterFunc is efficient and simple to use.

As the documentation states, the function is called in a goroutine after the duration elapses. The goroutine is not created up front as in the question.

Charlie Tumahai
  • 113,709
  • 12
  • 249
  • 242
  • As far as I can understand, this will spawn its own goroutine, meaning a million requests would still result in a million goroutines being created? – fisker Dec 30 '17 at 18:33
  • 1
    Yes, the runtime creates a goroutine to run the function when the timer expires. As far as expiration of the timers is concerned, the runtime implementation is very efficient (the runtime maintains bucketed heaps of timers). – Charlie Tumahai Dec 30 '17 at 19:07
  • Only when the timer expires? Meaning `// do other stuff` will be called long before `destoryObject` happens? And if `destroyObject` only takes a milisecond then I won't need to worry about having a million goroutines running at the same time? If that's the case it sounds almost too good to be true – fisker Dec 30 '17 at 19:11
  • 1
    The goroutine is created after the timer expires. The `do other stuff` will be executed long before `destroyOjbect` is called. You don't need to worry about millions of goroutines if `destroyObject` is quick and the expirations are spread over time. – Charlie Tumahai Dec 30 '17 at 19:25
  • It's really blowing my mind.. can't recall ever seeing something that should be so complicated become so simple.. will try to test it.. thanks again :)! – fisker Dec 30 '17 at 19:35
  • 1
    Is `AfterFunc` a blocking call? – Abhijit Sarkar Mar 22 '22 at 08:39
  • @abhijit-sarkar, no. It returns immediately the pointer to the Timer. It wouldn't make sense otherwise – Tommaso Resti Nov 18 '22 at 16:49
4

So I'd agree with your solution #2 than number 1.

Traversing through a list of a million numbers is much easier than having a million separate Go Routines

Go Routines are expensive (compared to loops) and take memory and processing time. For eg. a million Go Routines take about 4GB of RAM.

Traversing through a list on the other hand takes very little space and is done in O(n) time.

A good example of this exact function is Go Cache which deletes its expired elements in a Go Routine it runs periodically

https://github.com/patrickmn/go-cache/blob/master/cache.go#L931

This is a more detailed example of how they did it:

type Item struct {
    Object     interface{}
    Expiration int64
}


func (item Item) Expired() bool {
    if item.Expiration == 0 {
        return false
    }
   return time.Now().UnixNano() > item.Expiration
}

func RemoveItem(s []Item, index int) []int {
     return append(s[:index], s[index+1:]...)
}

func deleteExpired(items []Item){ 
    var deletedItems  []int
    for k, v := range items {
        if v.Expired(){
            deletedItems = append(deletedItems, k)
        }
    }
    for key, deletedIndex := range deleteditems{ 
        items = RemoveItem(items, deletedIndex)
    }
}

The above implementation could definitely be improved with a linked list instead of an array but this is the general idea

cjds
  • 8,268
  • 10
  • 49
  • 84
  • 1
    By the way, as you said, a linked list would be better, I think you should use golang build in double linked list here, https://golang.org/pkg/container/list/ – John Balvin Arias May 09 '18 at 16:53
1

This is an interesting question. I come to a solution where it use a heap to maintain the queue of items to be destroyed and sleep for exactly time until the next item is up for destruction. I think it is more efficient, but the gain might be slim on some cases. Nonetheless, you can see the code here:

package main
import (
    "container/heap"
    "fmt"
    "time"
)

type Item struct {
    Expiration time.Time
    Object     interface{} // It would make more sence to be *interface{}, but not as convinient
}

//MINIT is the minimal interval for delete to run. In most cases, it is better to be set as 0
const MININT = 1 * time.Second

func deleteExpired(addCh chan Item) (quitCh chan bool) {
    quitCh = make(chan bool)
    go func() {
        h := make(ExpHeap, 0)
        var t *time.Timer

        item := <-addCh
        heap.Push(&h, &item)
        t = time.NewTimer(time.Until(h[0].Expiration))

        for {
            //Check unfinished incoming first
            for incoming := true; incoming; {
                select {
                case item := <-addCh:
                    heap.Push(&h, &item)
                default:
                    incoming = false
                }
            }
            if delta := time.Until(h[0].Expiration); delta >= MININT {
                t.Reset(delta)
            } else {
                t.Reset(MININT)
            }

            select {
            case <-quitCh:
                return
            //New Item incoming, break the timer
            case item := <-addCh:
                heap.Push(&h, &item)
                if item.Expiration.After(h[0].Expiration) {
                    continue
                }
                if delta := time.Until(item.Expiration); delta >= MININT {
                    t.Reset(delta)
                } else {
                    t.Reset(MININT)
                }
            //Wait until next item to be deleted
            case <-t.C:
                for !h[0].Expiration.After(time.Now()) {
                    item := heap.Pop(&h).(*Item)
                    destroy(item.Object)
                }
                if delta := time.Until(h[0].Expiration); delta >= MININT {
                    t.Reset(delta)
                } else {
                    t.Reset(MININT)
                }
            }
        }
    }()
    return quitCh
}

type ExpHeap []*Item

func (h ExpHeap) Len() int {
    return len(h)
}

func (h ExpHeap) Swap(i, j int) {
    h[i], h[j] = h[j], h[i]
}

func (h ExpHeap) Less(i, j int) bool {
    return h[i].Expiration.Before(h[j].Expiration)
}

func (h *ExpHeap) Push(x interface{}) {
    item := x.(*Item)
    *h = append(*h, item)
}

func (h *ExpHeap) Pop() interface{} {
    old, n := *h, len(*h)
    item := old[n-1]
    *h = old[:n-1]
    return item
}

//Auctural destroy code.
func destroy(x interface{}) {
    fmt.Printf("%v @ %v\n", x, time.Now())
}

func main() {
    addCh := make(chan Item)
    quitCh := deleteExpired(addCh)

    for i := 30; i > 0; i-- {
        t := time.Now().Add(time.Duration(i) * time.Second / 2)
        addCh <- Item{t, t}
    }
    time.Sleep(7 * time.Second)
    quitCh <- true
}

playground: https://play.golang.org/p/JNV_6VJ_yfK

By the way, there are packages like cron for job management, but I am not familiar with them so I cannot speak for their efficiency.

Edit: Still I haven't enough reputation to comment :( About performance: this code basically has less CPU usage as it only wake it self when necessary and only traverse items that is up for destruction instead of the whole list. Based on personal (actually ACM experience), roughly a mordern CPU can process a loop of 10^9 in 1.2 seconds or so, which means on a scale of 10^6, traversing the whole list takes about over 1 millisecond excluding auctual destruction code AND data copy (which will cost a lot on average on more than thousands of runs, to a scale of 100 millisecond or so). My code's approach is O(lg N) which on 10^6 scale is at least a thousand times faster (considering constant). Please note again all these calculation is based on experience instead of benchmarks (there was but I am not able to provide them).

Edit 2: With a second thought, I think the plain solution can use a simple optimization:

func deleteExpired(items []Item){ 
    tail = len(items)
    for index, v := range items { //better naming
        if v.Expired(){
            tail--
            items[tail],items[index] = v,items[tail]
        }
    }
    deleteditems := items[tail:]
    items:=items[:tail]
}

With this change, it no longer copy data unefficiently and will not allocate extra space.

Edit 3: changing code from here I tested the memoryuse of afterfunc. On my laptop it is 250 bytes per call, while on palyground it is 69 (I am curious at the reason). With my code, A pointer + a time.Time is 28 byte. At a scale of a million, the difference is slim. Using After Func is a much better option.

leaf bebop
  • 7,643
  • 2
  • 17
  • 27
  • Really appreciate it, but looks a bit too complicated for me :p.. I'll try to study it closer. You think the performance would be better than the one in cjds's answer? – fisker Dec 30 '17 at 18:36
  • Thanks a lot <3 I'll try to integrate the code if Cerise's `time.AfterFunc`doesn't work the way I hope it does (: – fisker Dec 30 '17 at 19:23
  • The heap is indeed a good solution to the problem, but why not use the runtime's heap of timers? – Charlie Tumahai Dec 30 '17 at 19:35
  • Well, somehow I magically gain the ability to comment :). time.AfterFunc spawns a gorountine everytime It is called and only release the gorountine after the timer stops. So basically it is the same with your orignal solution. The profermance analysis has been edited into the answer. – leaf bebop Dec 30 '17 at 19:38
  • @leafbebop time.AfterFunc does **not** spawn a goroutine when called. [The goroutine is started in `goFunc`](https://github.com/golang/go/blob/acce8268b611613bef4c422a861bd4863e85b0f1/src/time/sleep.go#L171-L173). The function `goFunc` is [called when the time elapses](https://github.com/golang/go/blob/acce8268b611613bef4c422a861bd4863e85b0f1/src/runtime/time.go#L246). – Charlie Tumahai Dec 30 '17 at 19:55
  • Whether a new timer start a new gorountine is unclear to me. I am going to test it. Be right back. – leaf bebop Dec 30 '17 at 20:06
  • Yes, the test result confirm it that a Timer while using a channel does not spawn a goroutine (or it does but a very tiny one). The test result has been added into the answer. Thank you for the insight. – leaf bebop Dec 30 '17 at 20:24
1

If it is a one-shot, this can be easily achieved with

// Make the destruction cancelable
cancel := make(chan bool)

go func(t time.Duration, id int){
 expired := time.NewTimer(t).C
 select {
   // destroy the object when the timer is expired
   case <-expired:
     destroyObject(id)
   // or cancel the destruction in case we get a cancel signal
   // before it is destroyed
   case <-cancel:
     fmt.Println("Cancelled destruction of",id)
     return
 }
}(time.Hours * 4, id)

if weather == weather.SUNNY {
  cancel <- true
}

If you want to do it every 4 h:

// Same as above, though since id may have already been destroyed
// once, I name the channel different
done := make(chan bool)

go func(t time.Duration,id int){

  // Sends to the channel every t
  tick := time.NewTicker(t).C

  // Wrap, otherwise select will only execute the first tick
  for{
    select {
      // t has passed, so id can be destroyed
      case <-tick:
        destroyObject(id)
      // We are finished destroying stuff
      case <-done:
        fmt.Println("Ok, ok, I quit destroying...")
        return
    }
  }
}()

if weather == weather.RAINY {
  done <- true
}

The idea behind it is to run a single goroutine per destruction job which can be cancelled. Say, you have a session and the user did something, so you want to keep the session alive. Since goroutines are extremely cheap, you can simply fire off another goroutine.

Markus W Mahlberg
  • 19,711
  • 6
  • 65
  • 89