1. When/how to execute the process
Rake task that syncs files with the DB and exits
Create a rake task that scans through your persistent directory and updates the database. The rake task can be executed manually or added to the system as a cron job. You can load the Rails environment by declaring the task with an :environment prerequisite: task :sync => :environment do ... end
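As a rough sketch of such a one-shot task (the file location, the SYNC_DIR variable and the data/xml default are my own placeholders, not something from your app), it could look like this:

```ruby
# lib/tasks/sync.rake (hypothetical location inside a Rails app)
require "rake" # a no-op under `rake`; lets the file also load as a plain script

# Rails defines :environment itself. Rake task definitions are additive, so
# declaring it here without a body only makes the sketch loadable outside Rails.
task :environment

desc "Scan SYNC_DIR for XML files, sync them to the DB, then exit"
task :sync => :environment do
  dir = ENV.fetch("SYNC_DIR", "data/xml") # hypothetical default directory
  Dir.glob(File.join(dir, "*.xml")).sort.each do |path|
    puts "syncing #{path}"
    # Hand each file to the real sync code here (a plain Ruby class)
  end
end
```

You would run it with rake sync, or put that same command in a crontab entry if you want it to run periodically.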
Rake task that keeps running
You could use a gem such as listen to detect file changes in this directory and sync them to the DB. Alternatively, your rake task could loop and sleep for a few minutes between scans. You could also run this task as a system service by creating an Upstart job.
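For the sleep-based variant, here is a minimal sketch using only the standard library. It polls modification times instead of using the listen gem; the method names, the 300-second interval and the max_rounds parameter are assumptions of mine for illustration (max_rounds exists only so the loop can be bounded when trying it out).

```ruby
# Returns the XML files in dir whose mtime differs from the one recorded in
# `seen` (a Hash of path => Time), and records the new mtimes as a side effect.
def changed_xml_files(dir, seen)
  Dir.glob(File.join(dir, "*.xml")).sort.select do |path|
    mtime = File.mtime(path)
    changed = seen[path] != mtime
    seen[path] = mtime
    changed
  end
end

# Polls dir and yields each new or modified file, sleeping between scans.
# Runs forever unless max_rounds is given.
def watch(dir, interval: 300, max_rounds: nil)
  seen = {}
  rounds = 0
  loop do
    changed_xml_files(dir, seen).each { |path| yield path }
    rounds += 1
    break if max_rounds && rounds >= max_rounds
    sleep interval
  end
end
```

Inside a rake task you would call something like watch(dir) { |path| ... } and sync each yielded file. Note that this sketch does not notice deleted files; you would need to compare the keys of seen against the current listing for that.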
Background job library
Use a background job library such as Resque, with resque-scheduler to run the sync job on a recurring schedule.
2. Code structure
I suggest that the rake task or background job serve only as a mechanism to kick off the sync process. Your actual syncing logic should still live in normal Ruby classes/modules.
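To illustrate that split, here is one possible shape. DirectorySync, SyncJob and the importer argument are names I made up; the only Resque conventions relied on are the @queue instance variable and the self.perform class method.

```ruby
# Plain Ruby object holding the sync logic; it knows nothing about rake or Resque.
class DirectorySync
  def initialize(dir, importer:)
    @dir = dir
    @importer = importer # any object responding to #call(path)
  end

  # Imports every XML file in the directory and returns how many were processed.
  def call
    paths = Dir.glob(File.join(@dir, "*.xml")).sort
    paths.each { |path| @importer.call(path) }
    paths.size
  end
end

# Thin Resque-style wrapper: it only kicks off the sync.
class SyncJob
  @queue = :sync # queue name is an arbitrary choice

  def self.perform(dir)
    # The lambda is a placeholder for the real per-file import code.
    DirectorySync.new(dir, importer: ->(path) { puts "importing #{path}" }).call
  end
end

# Enqueued elsewhere with: Resque.enqueue(SyncJob, "/path/to/xml")
```

A rake task would call DirectorySync the same way, so the logic can be unit tested without going through rake or a job queue.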
3. How to perform XML -> DB sync
If you can dictate what the XML looks like, I recommend using ActiveRecord's to_xml and from_xml to export/create/update records, or having a look at what they do internally (the attributes= method).
If you cannot dictate the format, use Nokogiri or XmlSimple, as mentioned in this post, to parse the XML files. Then use the normal ActiveRecord querying/creating/updating methods to update the database. Use ActiveRecord validations to keep the data in the database consistent. If you want to make sure that no one can insert invalid data through other database connectors (not ActiveRecord), you could add database-level constraints such as unique indexes.
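A minimal sketch of the parse-then-update step follows. To keep it runnable without extra gems it uses REXML, which ships with Ruby, in place of Nokogiri; the idea is the same. The products XML layout, the parse_products method and the Product model are all invented for illustration.

```ruby
require "rexml/document"

# Assumed input layout (hypothetical):
#   <products><product sku="A1"><name>Widget</name><price>9.50</price></product></products>
# Returns one attribute hash per <product> element; missing children become nil.
def parse_products(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//product").map do |node|
    {
      sku: node.attributes["sku"],
      name: node.elements["name"]&.text,
      price: node.elements["price"]&.text
    }
  end
end

# With a hypothetical ActiveRecord model Product, the update step could be:
#
#   parse_products(File.read(path)).each do |attrs|
#     record = Product.find_or_initialize_by(sku: attrs[:sku])
#     record.update!(attrs) # raises if a validation fails
#   end
```

Keeping the parsing separate from the ActiveRecord calls means you can swap the parser later without touching the update code.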