I am trying to run an instance of Weaviate, but I am running into an issue with memory consumption. Weaviate is running in a Docker container with 16GB of memory, which, based on the documentation, should be enough for well over 1M records (I am using 384-dimensional vectors, just like in the documentation's example).
The application connected to Weaviate is constantly inserting and querying data. Memory usage keeps climbing until the container runs out of memory and dies, and this happens at only around 20k records.
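For reference, a rough back-of-envelope for the raw vector data alone (assuming float32 vectors; this ignores the HNSW graph, inverted index, object payloads, and runtime overhead) shows why 16GB seemed like plenty:

```python
# Back-of-envelope: raw vector storage only (assumes float32 vectors;
# ignores HNSW graph links, inverted index, object payloads, and Go runtime overhead).
dims = 384
bytes_per_float32 = 4
per_vector = dims * bytes_per_float32  # 1,536 bytes per vector

for n in (20_000, 1_000_000):
    print(f"{n:>9,} vectors -> {n * per_vector / 1024**3:.2f} GiB of raw vector data")
#    20,000 vectors -> ~0.03 GiB
# 1,000,000 vectors -> ~1.43 GiB
```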
Is this a problem with garbage collection never happening?
UPDATE:
The Weaviate version in question is 1.10.1, with no modules enabled. Incoming records already have vectors, so no vectorizer is used. For each incoming record, the application searches for similar records using a metadata (where) filter combined with a nearVector search, then inserts the record (see the sketch below). I will be upgrading to 1.12.1 to see if that helps; in the meantime, here are some of the suggested memory measurements.
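The per-record flow looks roughly like this sketch (Python client v3 syntax; the class name "Record", the property names, the where filter, and the certainty threshold are placeholders, not the real schema):

```python
import weaviate

client = weaviate.Client("http://localhost:8080")

def handle_incoming(record: dict, vector: list):
    # 1. Look for existing records similar to the incoming one,
    #    filtered on some metadata plus a nearVector search.
    similar = (
        client.query.get("Record", ["title", "source"])          # placeholder properties
        .with_where({
            "path": ["source"],                                   # placeholder metadata filter
            "operator": "Equal",
            "valueString": record["source"],
        })
        .with_near_vector({"vector": vector, "certainty": 0.9})   # placeholder threshold
        .with_limit(10)
        .do()
    )

    # 2. Insert the incoming record with its pre-computed vector
    #    (no vectorizer module is configured).
    client.data_object.create(
        data_object=record,
        class_name="Record",
        vector=vector,
    )
    return similar
```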
7k records:
docker stats memory usage: 2.56GB / 16GB
gc 1859 @750.550s 0%: 0.33+33+0.058 ms clock, 26+1.2/599/1458+4.6 ms cpu, 2105->2107->1102 MB, 2159 MB goal, 80P
gc 1860 @754.322s 0%: 0.17+34+0.094 ms clock, 13+1.0/644/1460+7.5 ms cpu, 2150->2152->1126 MB, 2205 MB goal, 80P
gc 1861 @758.598s 0%: 0.39+35+0.085 ms clock, 31+1.4/649/1439+6.8 ms cpu, 2197->2199->1151 MB, 2253 MB goal, 80P
11k records:
docker stats memory usage: 5.46GB / 16GB
gc 1899 @991.964s 0%: 1.0+65+0.055 ms clock, 87+9.9/1238/3188+4.4 ms cpu, 4936->4939->2589 MB, 5062 MB goal, 80P
gc 1900 @999.496s 0%: 0.17+58+0.067 ms clock, 13+2.8/1117/3063+5.3 ms cpu, 5049->5052->2649 MB, 5178 MB goal, 80P
gc 1901 @1008.717s 0%: 0.38+65+0.072 ms clock, 30+2.7/1242/3360+5.7 ms cpu, 5167->5170->2710 MB, 5299 MB goal, 80P
17k records:
docker stats memory usage: 11.25GB / 16GB
gc 1932 @1392.757s 0%: 0.37+110+0.019 ms clock, 30+4.6/2130/6034+1.5 ms cpu, 10426->10432->5476 MB, 10694 MB goal, 80P
gc 1933 @1409.740s 0%: 0.14+108+0.052 ms clock, 11+0/2075/5666+4.2 ms cpu, 10679->10683->5609 MB, 10952 MB goal, 80P
gc 1934 @1427.611s 0%: 0.31+116+0.10 ms clock, 25+4.6/2249/6427+8.2 ms cpu, 10937->10942->5745 MB, 11218 MB goal, 80P
20k records:
docker stats memory usage: 15.22GB / 16GB
gc 1946 @1658.985s 0%: 0.13+136+0.077 ms clock, 10+1.1/2673/7618+6.1 ms cpu, 14495->14504->7600 MB, 14866 MB goal, 80P
gc 1947 @1681.090s 0%: 0.28+148+0.045 ms clock, 23+0/2866/8142+3.6 ms cpu, 14821->14829->7785 MB, 15201 MB goal, 80P
GC forced
gc 16 @1700.012s 0%: 0.11+2.0+0.055 ms clock, 8.8+0/20/5.3+4.4 ms cpu, 3->3->3 MB, 7MB goal, 80P
gc 1948 @1703.901s 0%: 0.41+147+0.044 ms clock, 33+0/2870/8153+3.5 ms cpu, 15181->15186->7973 MB, 15570 MB goal, 80P
gc 1949 @1728.327s 0%: 0.29+156+0.048 ms clock, 23+18/3028/8519+3.9 ms cpu, 15548->15553->8168 MB, 15946 MB goal, 80P
pprof heap profile:
flat flat% sum% cum cum%
7438.24MB 96.88% 96.88% 7438.74MB 96.88% github.com/semi-technologies/weaviate/adapters/repos/db/inverted.(*Searcher).docPointersInvertedNoFrequency.func1
130.83MB 1.70% 98.58% 7594.13MB 98.91% github.com/semi-technologies/weaviate/adapters/repos/db/inverted.(*Searcher).DocIDs
1MB 0.013% 98.59% 40.55MB 0.53% github.com/semi-technologies/weaviate/adapters/repos/vector/hnsw.(*hnsw).Add
0 0% 98.59% 65.83MB 0.86% github.com/go-openapi/runtime/middleware.NewOperationExecutor.func1
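To make the growth explicit: in each gctrace line, the three numbers in the "A->B->C MB" field are the heap size when the GC cycle started, the heap size when it ended, and the live heap remaining afterwards. The live heap grows from ~1.1 GB at 7k records to ~8.2 GB at 20k records, i.e. roughly 400 KB of live heap per record, and the pprof profile attributes nearly all of it to the inverted index searcher (docPointersInvertedNoFrequency). A quick sketch pulling those live-heap figures out of the pasted lines (plain Python, just for illustration):

```python
import re

# gctrace lines pasted above (first from the 7k-record sample, last from the 20k sample)
lines = [
    "gc 1859 @750.550s 0%: 0.33+33+0.058 ms clock, 26+1.2/599/1458+4.6 ms cpu, 2105->2107->1102 MB, 2159 MB goal, 80P",
    "gc 1949 @1728.327s 0%: 0.29+156+0.048 ms clock, 23+18/3028/8519+3.9 ms cpu, 15548->15553->8168 MB, 15946 MB goal, 80P",
]

for line in lines:
    # A->B->C MB = heap at GC start -> heap at GC end -> live heap after the cycle
    start, end, live = map(int, re.search(r"(\d+)->(\d+)->(\d+) MB", line).groups())
    print(f"live heap after this GC cycle: {live} MB (heap was {start} MB when the cycle started)")
```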
UPDATE 2:
The problem still exists after upgrading to 1.12.1.