Sorry if this is a bit of an abstract question; I'll try to provide more details.
I run "experiments" (eg test runs of various software), each experiment has its own set of metadata (basically key/value pairs, like start time, end time, name, resource cardinality, system type, etc) and one or more time-series data related to various performance metrics (eg CPU and memory usage from start to end at 10 seconds intervals). The amount of data will not be huge; at most some gigabytes per month.
I'd like to store this data in a single system (i.e. not metadata in MySQL and performance data in some specialized time-series database). Would Elasticsearch be a good fit for this? How would I best index the data?
EDIT: to be clearer, here are some thoughts on how I could organize the data. For the metadata, use a metadata index; for example, the document for experiment aa_12 could look like this:
{
  "_id": "aa_12",
  "_source": {
    "name": "aa_12",
    "start": 1420070400001,
    "end": 1420097400001,
    "system": "cluster-1",
    "nodes": 6,
    ...
  }
}
Having the experiment name as the _id should make the occasional updates easier (I suppose).
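As a rough sketch of what I mean (assuming a recent Elasticsearch with the typeless _doc/_update endpoints; the index name metadata is just my working name), indexing with an explicit _id and later patching a single field would look something like:

PUT metadata/_doc/aa_12
{
  "name": "aa_12",
  "start": 1420070400001,
  "system": "cluster-1",
  "nodes": 6
}

POST metadata/_update/aa_12
{
  "doc": { "end": 1420097400001 }
}

That way a re-run or a late correction just overwrites or patches the same document instead of creating a duplicate.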
Then, for the time series associated with this experiment, use a perfdata index with documents like the following:
{
  "_source": {
    "host": "cluster-1-1",
    "experiment": "aa_12",
    "cpu1": 44,
    "cpu5": 40,
    "cpu15": 41,
    "memtot": 16384,
    "memused": 5025,
    ... rest of metrics
    "time": 1420070410001
  }
}
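For that index I'm assuming a mapping along these lines (the field types are my guesses: keyword for host and experiment so they can be filtered on exactly, and date for time, since Elasticsearch accepts epoch milliseconds for date fields by default):

PUT perfdata
{
  "mappings": {
    "properties": {
      "host":       { "type": "keyword" },
      "experiment": { "type": "keyword" },
      "cpu1":       { "type": "float" },
      "cpu5":       { "type": "float" },
      "cpu15":      { "type": "float" },
      "memtot":     { "type": "long" },
      "memused":    { "type": "long" },
      "time":       { "type": "date" }
    }
  }
}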
With the data indexed that way, I could query, for example, "give me metric X for host Y for the duration of experiment Z" and get metric graphs using Kibana/Timelion; a sketch of such a query is below. My concern at this point is that the perfdata index could grow to contain a lot of documents (not very big in overall size, but still a few hundred thousand to a few million entries). Does the above make sense?
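Here is the kind of query I have in mind (values taken from the examples above; it assumes the keyword/date mapping sketched earlier, and in practice the experiment's start/end would first be looked up in the metadata index, since the two indices are only linked by the experiment name):

GET perfdata/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "experiment": "aa_12" } },
        { "term":  { "host": "cluster-1-1" } },
        { "range": { "time": { "gte": 1420070400001, "lte": 1420097400001 } } }
      ]
    }
  },
  "sort": [ { "time": "asc" } ],
  "_source": [ "time", "cpu1" ],
  "size": 10000
}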