2

I'm pulling data from several remote DataSources, restructuring to fit my models schema and finally passing the array to MyModel::saveAll();

I'd like to avoid importing duplicate records (ie, don't import if MyModel.external_id = 120 & MyModel.external_type = 'basecamp.comment' already exists in db).

What's the most efficient way of going about this?

Sample data:

$data['MyModel'] = [
    [
        'title' => 'foo',
        'created' => '2013-12-18 11:29:06',
        'external_id' => 120,
        'external_type' => 'github.commit'
    ],
    [
        'title' => 'bar',
        'created' => '2013-12-18 13:22:06',
        'external_id' => 120,
        'external_type' => 'basecamp.comment'
    ]
];

NB: Notice that MyModel.external_id isn't unique on it's own.

Mystic
  • 9
  • 2
Timm
  • 157
  • 2
  • 8

2 Answers2

4

This is where validation comes into play. In your MyModel class, add the following:

public $validate = array(
    'external_type' => array(
        'rule' => 'idAndTypeUnique',
        'message' => "Type and ID already exist"
    )
);

public function idAndTypeUnique() 
{
    $existing = $this->find('first', array(
        'conditions' => array(
            'external_id' => $this->data[$this->name]['external_id'],
            'external_type' => $this->data[$this->name]['external_type']
         )
    ));

    return (count($existing) == 0);
}

Your saveAll() call would look like:

$this->MyModel->saveAll($data, array('validate' => true));
Derek
  • 4,575
  • 3
  • 22
  • 36
  • This works if I loop though the `$data` first and try saving individual records, but it won't work with my original data structure (since one failed validation will stop the `saveAll()` from executing). Correct? – Timm Mar 23 '14 at 13:41
  • 1
    Check the docs on the various validation options in the saveAll() method (not knowing what version of cake PHP you are using I based mine off of 2.4) to be sure, but I believe it only stops valid ones from saving (when invalid ones fail) when the validate option is 'first'. Just depends on how you want to handle invalid data and alert the user. All or nothing, let some save while some fail... Up to your specific use case. – Derek Mar 24 '14 at 04:04
3

The easiest way is to make a unique index on those two fields.

alter table my_model add unique index(external_id, external_type);

This forces the constraint in the database level.

If you want to force this constraint in the cake layer, then check this out: cakephp isUnique for 2 fields?

Community
  • 1
  • 1
Ben Hitchcock
  • 1,368
  • 10
  • 12
  • The result here is that the `saveAll()` call fails with a ´SQLSTATE[23000]: Integrity constraint violation´ error. That stops the duplicates from ending up in the database, but it doesn't save the unique rows (using the original dataset, or looping though and saving each row separately) – Timm Mar 23 '14 at 14:05