I'm assuming your use cases are:
- Retrieving addresses by their account_id
- Retrieving account_ids by an address
- Finding accounts in a particular city/state/zip
I would recommend the following two things:
Index each address as a separate document
I would index each address as a separate document. Having a separate
doc for each address preserves the relationships between its fields
(which city goes with which state and zip). You would lose that pairing
if each account were one document holding an array of cities and an
array of states.
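To make the lost-relationship problem concrete, here is a minimal sketch in plain Python. It does not call any search service; the two document layouts and the `matches_*` helpers are hypothetical stand-ins for how a query against each layout would behave.

```python
# One account with two addresses (illustrative data only).
addresses = [
    {"account_id": "123456", "city": "Davenport", "state": "IA"},
    {"account_id": "123456", "city": "Lincoln", "state": "NE"},
]

# Alternative layout: one document per account with parallel arrays.
per_account = {
    "account_id": "123456",
    "city": ["Davenport", "Lincoln"],
    "state": ["IA", "NE"],
}


def matches_per_account(doc, city, state):
    # Each array is checked on its own, so the city/state pairing is lost.
    return city in doc["city"] and state in doc["state"]


def matches_per_address(doc, city, state):
    # City and state come from the same address, so the pairing is kept.
    return doc["city"] == city and doc["state"] == state


# "Lincoln, IA" is not a real address for this account.
false_hit = matches_per_account(per_account, "Lincoln", "IA")
real_hits = [d for d in addresses if matches_per_address(d, "Lincoln", "IA")]
```

With the array layout the account matches a query for Lincoln, IA even though no such address exists; with one document per address nothing matches.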
Index each field separately
I would index each field (city, state, etc.) separately. Breaking out each field will enable you to search them independently (e.g. get all the addresses in Cleveland, OH), use them as facets, boost scores based on them, etc.
Here's an example of some documents in my proposed schema:
[
  {"type": "add",
   "id": "<see below>",
   "fields": {
     "account_id": "123456",
     "name": "John Smith",
     "address_1": "1 Main St",
     "address_2": "Apt 1",
     "city": "Davenport",
     "state": "IA",
     "zip": 52081
   }
  },
  {"type": "add",
   "id": "<see below>",
   "fields": {
     "account_id": "123456",
     "name": "John Smith",
     "address_1": "2 Elm St",
     "city": "Lincoln",
     "state": "NE",
     "zip": 23452
   }
  }
]
Generating Document IDs:
Note that you'd need a deterministic (non-random) way to construct unique document IDs: unique per account + address, not just per account. Something like the account_id plus a hash of the address, city, state, and zip would work, or you could add another column to your table to uniquely identify each address (I prefer the latter).
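The hash-based option could look roughly like the sketch below. The function name `make_document_id`, the `|` separator, the trim-and-lowercase normalization, and the choice of SHA-1 are all assumptions made for illustration. Check what characters and length your search service allows in document IDs before relying on this format.

```python
import hashlib


def make_document_id(account_id, address_1, address_2, city, state, zip_code):
    """Build a deterministic ID that is unique per account + address."""
    # Normalize so trivial differences (case, stray spaces) do not
    # produce a second document for the same address. This rule is an
    # assumption; adjust it to match how your data is cleaned.
    parts = [address_1, address_2 or "", city, state, str(zip_code)]
    normalized = "|".join(p.strip().lower() for p in parts)
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return f"{account_id}_{digest}"


id_main = make_document_id("123456", "1 Main St", "Apt 1", "Davenport", "IA", 52081)
id_elm = make_document_id("123456", "2 Elm St", None, "Lincoln", "NE", 23452)
```

Because the ID is derived only from the account and address fields, re-indexing the same row overwrites the existing document instead of creating a duplicate. The downside is that editing an address changes its ID, so the old document has to be deleted explicitly, which is one reason a dedicated ID column is easier to manage.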