I have a table in my MySQL database containing 200K records. Each record contains a URL that should be processed in some way. The URL processing in my case is not a trivial task, so I have chosen to use the Gearman queue to run these as background jobs.
So, for each record (URL) in my table, I plan to create a separate task and submit it to Gearman.
Also, the data in my table is not static: new URLs are added to it frequently.
According to my business logic, I need to process this list of URLs continuously. When I have finished processing the last record in my DB table, I should go back to the first one and repeat the process for all records again.
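
To make the loop I have in mind more concrete, here is a rough Python sketch of the "walk the table, then wrap around" logic. It is only an illustration under several assumptions: `sqlite3` stands in for MySQL so that the snippet is self-contained, the table name `urls` and the columns `id` and `url` are made up, and `submit` is a hypothetical placeholder for whatever call hands a background job to Gearman.

```python
import sqlite3
from itertools import islice
from typing import Callable, Iterator, Tuple


def cycle_urls(conn: sqlite3.Connection, batch_size: int = 1000) -> Iterator[Tuple[int, str]]:
    """Yield (id, url) rows in id order, wrapping to the first row after the last.

    Rows are read in batches keyed on the last seen id, so URLs inserted
    while the loop is running are picked up on the current or next pass.
    """
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, url FROM urls WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            if last_id == 0:
                return  # table is empty, nothing to cycle over
            last_id = 0  # reached the end of the table: start again
            continue
        for row_id, url in rows:
            yield row_id, url
        last_id = rows[-1][0]


def feed(conn: sqlite3.Connection, submit: Callable[[str], None],
         max_jobs: int, batch_size: int = 1000) -> int:
    """Submit at most max_jobs URLs through `submit` (a stand-in for a
    Gearman background-job call) and return how many were submitted."""
    sent = 0
    for _row_id, url in islice(cycle_urls(conn, batch_size), max_jobs):
        submit(url)
        sent += 1
    return sent


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT)")
    conn.executemany("INSERT INTO urls (url) VALUES (?)",
                     [("http://example.com/a",), ("http://example.com/b",)])
    feed(conn, print, max_jobs=5, batch_size=1)
```

In this sketch `max_jobs` caps how many tasks are handed over per call, which is the part I am unsure about: whether such a feeder should be run from cron or kept alive as a long-running process.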
So my questions:
- What is the best way to supply tasks to Gearman in this case?
- Should I use cron, or is it possible to set things up so that Gearman pulls tasks automatically?
- How many tasks can be submitted to Gearman at one time?
So, could you please tell me how best to implement this system?