I've integrated the mailman gem into my Rails project. It fetches emails from Gmail successfully. My app has a Message model for emails, and the fetched emails are saved correctly as Message records.

The problem is that emails are sometimes saved multiple times, and I can't recognize a pattern. Some emails are saved once, some twice, and some three times.

I can't find the bug in my code.

Here is my mailman_server script:

script/mailman_server

#!/usr/bin/env ruby
# encoding: UTF-8
require "rubygems"
require "bundler/setup"
require File.expand_path(File.join(File.dirname(__FILE__), '..', 'config', 'environment'))
require 'mailman'

Mailman.config.ignore_stdin = true
#Mailman.config.logger = Logger.new File.expand_path("../../log/mailman_#{Rails.env}.log", __FILE__)

if Rails.env == 'test'
  Mailman.config.maildir = File.expand_path("../../tmp/test_maildir", __FILE__)
else
  Mailman.config.logger = Logger.new File.expand_path("../../log/mailman_#{Rails.env}.log", __FILE__)
  Mailman.config.poll_interval = 15
  Mailman.config.imap = {
    server: 'imap.gmail.com',
    port: 993,  # standard port for IMAP over SSL (995 is POP3 over SSL)
    ssl: true,
    username: 'my@email.com',
    password: 'my_password'
  }
end

Mailman::Application.run do
  default do
    begin
      Message.receive_message(message)
    rescue StandardError => e  # StandardError, so signals and exit requests are not swallowed
      Mailman.logger.error "Exception occurred while receiving message:\n#{message}"
      Mailman.logger.error [e, *e.backtrace].join("\n")
    end
  end
end

The email is processed inside my Message class:

  def self.receive_message(message)
    if message.from.first == "my@email.com"
      Message.save_bcc_mail(message)
    else
      Message.save_incoming_mail(message)
    end
  end

  def self.save_incoming_mail(message)
    part_to_use = message.html_part || message.text_part || message
    if Kontakt.where(:email => message.from.first).empty?
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.from.first, inbound: true, time: message.date
    else
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.from.first, inbound: true, time: message.date, messageable_type: 'Company', messageable_id: Kontakt.where(:email => message.from.first).first.year.id
    end
  end

  def self.save_bcc_mail(message)
    part_to_use = message.html_part || message.text_part || message
    if Kontakt.where(:email => message.to.first).empty?
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.to.first, inbound: false, time: message.date
    else
      encoding = part_to_use.content_type_parameters['charset']
      Message.create topic: message.subject, message: part_to_use.body.decoded.force_encoding(encoding).encode('UTF-8'), communication_partner: message.to.first, inbound: false, time: message.date, messageable_type: 'Company', messageable_id: Kontakt.where(:email => message.to.first).first.year.id
    end
  end

I have daemonized the mailman_server with this script:

script/mailman_daemon

#!/usr/bin/env ruby

require 'rubygems'  
require "bundler/setup"  
require 'daemons'

Daemons.run('script/mailman_server') 

I deploy with capistrano.

These are the parts responsible for stopping, starting and restarting my mailman_server:

script/deploy.rb

set :rails_env, "production" #added for delayed job  
after "deploy:stop",    "delayed_job:stop"
after "deploy:start",   "delayed_job:start"
after "deploy:restart", "delayed_job:restart"
after "deploy:stop",    "mailman:stop"
after "deploy:start",   "mailman:start"
after "deploy:restart", "mailman:restart"

namespace :deploy do
  desc "Make the mailman script executable"
  task :mailman_executable, :roles => :app do
   run "chmod +x #{current_path}/script/mailman_server"
  end

  desc "Make the mailman daemon executable"
  task :mailman_daemon_executable, :roles => :app do
   run "chmod +x #{current_path}/script/mailman_daemon"
  end
end

namespace :mailman do  
  desc "Mailman::Start"
  task :start, :roles => [:app] do
   run "cd #{current_path};RAILS_ENV=#{fetch(:rails_env)} bundle exec script/mailman_daemon start"
  end

  desc "Mailman::Stop"
  task :stop, :roles => [:app] do
   run "cd #{current_path};RAILS_ENV=#{fetch(:rails_env)} bundle exec script/mailman_daemon stop"
  end

  desc "Mailman::Restart"
  task :restart, :roles => [:app] do
   mailman.stop
   mailman.start
  end
end

Could it be that multiple instances of the mailman server are started at nearly the same time during my deploy, and that each instance then polls at nearly the same time? In that case the second and third instances would poll before the first instance has marked the email as read, and would process the same email as well.
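One way to rule out overlapping instances is to take an exclusive, non-blocking file lock when the server script starts. The following is only a sketch using Ruby's standard File#flock; it is not something Mailman provides, and the helper name acquire_single_instance_lock and the lock path in the usage comment are hypothetical.

```ruby
# Try to become the only running instance by locking a well-known file.
# Returns the open File (keep it open for the lifetime of the process) when
# the lock was obtained, or nil when another holder already has the lock.
def acquire_single_instance_lock(path)
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  # LOCK_NB makes flock return false immediately instead of waiting.
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    file
  else
    file.close
    nil
  end
end

# Possible usage at the top of script/mailman_server (hypothetical path):
#
#   lock = acquire_single_instance_lock('/tmp/mailman_server.lock')
#   abort 'Another mailman_server instance is already running.' unless lock
```

The operating system releases the lock when the process exits, so a crashed server does not leave a stale lock behind the way a stale pid file can.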

Update 30.01.

I set the polling interval to 60 seconds, but that changed nothing.

I checked the folder where the mailman pid file is stored. There is only one mailman pid file, so there is definitely only one mailman server running. I checked the log file and can see that the messages are fetched multiple times:

Mailman v0.7.0 started
IMAP receiver enabled (my@email.com).
Polling enabled. Checking every 60 seconds.
Got new message from 'my.other@email.com' with subject 'Test nr 0'.
Got new message from 'my.other@email.com' with subject 'Test nr 1'.
Got new message from 'my.other@email.com' with subject 'test nr 2'.
Got new message from 'my.other@email.com' with subject 'test nr 2'.
Got new message from 'my.other@email.com' with subject 'test nr 3'.
Got new message from 'my.other@email.com' with subject 'test nr 4'.
Got new message from 'my.other@email.com' with subject 'test nr 4'.
Got new message from 'my.other@email.com' with subject 'test nr 4'.

So it seems to me that the problem is definitely in my mailman server code.
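To quantify how often each message is fetched, the log can be tallied per sender and subject. This is a rough sketch that assumes the log lines look like the "Got new message" lines shown above; the constant LOG_LINE and the helper duplicate_fetches are names made up for this example.

```ruby
# Matches lines such as:
#   Got new message from 'my.other@email.com' with subject 'test nr 2'.
LOG_LINE = /Got new message from '(?<from>[^']*)' with subject '(?<subject>.*)'\.\s*\z/

# Counts fetches per [sender, subject] pair and returns only the pairs that
# were fetched more than once.
def duplicate_fetches(log_lines)
  counts = Hash.new(0)
  log_lines.each do |line|
    match = LOG_LINE.match(line)
    counts[[match[:from], match[:subject]]] += 1 if match
  end
  counts.select { |_key, count| count > 1 }
end

# When a log file path is given on the command line, print a summary.
if ARGV.first
  duplicate_fetches(File.foreach(ARGV.first)).each do |(from, subject), count|
    puts "#{count}x  #{from}  #{subject}"
  end
end
```

Two different emails with the same sender and subject would be counted together, so this only gives a first impression; the Message-ID header is the more reliable key.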

Update 31.1.

It seems to have something to do with my production machine. When I test this in development with exactly the same configuration as on the production machine (I changed my local database from SQLite to MySQL this morning to test it), I don't get duplicates. My code is probably fine, but there is a problem with the production machine. I will ask my hoster whether they can see a solution. As a workaround I will go with Ariejan's suggestion.

The solution: I found the problem. I deploy to a machine where the tmp directory is shared between all releases. I forgot to define the path where the pid file of the mailman_daemon should be saved, so it was saved in the script directory instead of the /tmp/pids directory. Because each release has its own script directory, the old mailman_daemon could not be stopped after a new deploy. That led to an army of running mailman_daemons, all polling my mail account... After killing all these processes, everything went well! No more duplicates!

tshepang
coderuby
    Hi railsnewbie, seems like a tough question. I took the liberty of sponsoring your question on CodersClan over here - http://www.codersclan.net/ticket/211 – Dror Jan 30 '14 at 09:42

2 Answers

This may be a concurrency/timing issue, e.g. new mails are imported again before the ones currently being processed have been saved.


Edit: Just noticed you have Mailman.config.poll_interval set to 15. This means it will check for new messages every 15 seconds. Try increasing this value to the default 60 seconds. Regardless of this setting, it might be a good idea to add the deduplication code I mentioned below.


My tip would be to also store the message_id from each email, so you can easily spot duplicates.

Instead of:

Message.create(...)

do one of the following:

# Option 1: keep the stored record in sync with the latest pulled version.
# (find_or_create_by is the Rails 4+ form; Rails 3 uses find_or_create_by_message_id.)
record = Message.find_or_create_by(message_id: message.message_id)
record.update_attributes(...)

# Option 2: import each message only once, then ignore further duplicates.
if !Message.where(message_id: message.message_id).exists?
  Message.create(...)
end

For more info on message_id: http://rdoc.info/github/mikel/mail/Mail/Message#message_id-instance_method

Remember that email and IMAP are not meant to be consistent data stores the way you'd expect Postgres or MySQL to be. Hope this helps you sort out the duplicate mails.
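To illustrate the "import only once" idea in isolation, here is a minimal in-memory sketch. MessageStore is a hypothetical stand-in for the Message model, not part of Rails or Mailman; in the real app the lookup would be a database query on a message_id column.

```ruby
# In-memory stand-in for the messages table, keyed by the Message-ID header.
class MessageStore
  def initialize
    @by_message_id = {}
  end

  # Stores the attributes only if this Message-ID has not been seen before.
  # Returns true when a new entry was stored, false for a duplicate.
  def import(message_id, attributes)
    return false if @by_message_id.key?(message_id)

    @by_message_id[message_id] = attributes
    true
  end

  # Number of distinct messages stored so far.
  def count
    @by_message_id.size
  end
end

store = MessageStore.new
store.import('<abc@mail.example>', topic: 'test nr 2') # stored
store.import('<abc@mail.example>', topic: 'test nr 2') # ignored as duplicate
```

Note that a check followed by a create can still race when several processes import at the same time. A unique database index on message_id is what actually prevents the second insert in that case.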

Ariejan
  • I have set the interval to 60, but that changes nothing. I will go with your suggestion to store the message_id and check it before I save a new message. – coderuby Jan 30 '14 at 20:13
  • I will leave this question open until tomorrow. I can't believe that this is normal behaviour for the mailman server. I'm updating my question with some new information now. If nobody has posted an explanation of why the mails are imported multiple times by tomorrow, I will accept your answer. Thanks for your effort! – coderuby Jan 30 '14 at 20:20
  • I updated my code according to your suggestion, but I still have duplicate mails. I looked at your code and understand this part: if !Message.where(message_id: message.message_id).exists? Message.create(...) end. But doesn't the first part do mostly the same? First it finds a Message object with the same message_id if one is in the database, and if not, it instantiates a new one. In both cases it updates the object and saves it to the database. How does it make sure that I have the latest pulled version? It only checks the database, doesn't it? – coderuby Jan 31 '14 at 14:44
  • I'm a Rails beginner, so perhaps it's not obvious to me. Won't the line `Message.where(message_id: message.message_id).exists?` always return `true` since we are finding or creating such a record in the above line `Message.find_or_create(message_id: message.message_id)`. In other words, the block with `Message.create` should not be needed. – Dennis Mar 25 '14 at 16:13

I found the problem. I deploy to a machine where the tmp directory is shared between all releases. I forgot to define the path where the pid file of the mailman_daemon should be saved, so it was saved in the script directory instead of the /tmp/pids directory. Because each release has its own script directory, the old mailman_daemon could not be stopped after a new deploy. That led to an army of running mailman_daemons, all polling my mail account... After killing all these processes, everything went well! No more duplicates!
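For anyone hitting the same thing, this is roughly what an explicit pid directory looks like in script/mailman_daemon. It is a sketch, assuming the daemons gem's :dir_mode and :dir options and a tmp/pids directory under the application root that is shared between releases; the helper name mailman_daemon_options is made up for this example.

```ruby
#!/usr/bin/env ruby

# Application root, two levels up from this script (script/mailman_daemon).
APP_ROOT = File.expand_path('../..', __FILE__)

# Options for Daemons.run. With dir_mode :normal, :dir is used as the
# directory for the pid file, instead of the directory of the script.
def mailman_daemon_options(app_root)
  {
    app_name: 'mailman_server',
    dir_mode: :normal,
    dir: File.join(app_root, 'tmp', 'pids') # assumed to be shared between releases
  }
end

# Only hand over to the daemons gem when called with one of its commands.
if %w[start stop restart run status].include?(ARGV.first)
  require 'rubygems'
  require 'bundler/setup'
  require 'daemons'

  Daemons.run(File.join(APP_ROOT, 'script', 'mailman_server'),
              mailman_daemon_options(APP_ROOT))
end
```

With the pid file in a directory that survives deploys, the stop command run from a new release can find and stop the daemon that was started from the previous one.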

coderuby