First of all, let me mention that I think METIS is the wrong tool here, because it is used for graph partitioning, where the emphasis is on minimizing the number of edges between partitions while keeping the partitions balanced (more or less equal sizes)
What you probably want to do is community detection within social networks, i.e. the search for clusters which maximize internal connectivity (large number of edges between nodes from the same cluster) and minimize external connectivity (small number of edges between different clusters).
This can be achieved by maximizing the so-called Modularity of the clustering
There are several approaches to tackle this problem, a popular heuristic being Label propagation.
If you don't want to implement the algorithm yourself, I would recommend using a framework like NetworKit (unfortunately, I don't know any other such frameworks yet), which implements Label propagation, some modularity-based algorithms and many helpful tools.
But back to your original question:
What is -ptype=rb/kway
?
There are multiple ways how you can approach the graph partitioning problem: You can either try to partition the graph into your desired number of partitions directly (k-way partitioning) or you can split the graph in half repeatedly until you have the desired number of partitions (recursive bisection, rb)
What is the jth constraint?
METIS allows you to try and optimize multiple balance constraints at the same time, i.e. if you have multiple types of calculations on the graph that should all be more or less balanced among the compute nodes.
See the manual:
Many important types of multi-phase and multi-
physics computations require that multiple quantities be load balanced simultaneously.
[...]
METIS includes partitioning routines that can be used to partition a graph in the presence of such multiple balancing
constraints. Each vertex is now assigned a vector of m weights and the objective of the partitioning routines is
to minimize the edge-cut subject to the constraints that each one of the m weights is equally distributed among the
domains.
EDIT: Since you clarified that you wanted to look at a fixed number of clusters, I see how graph partitioning could be helpful here. Let me illustrate what ufactor
means:
The imbalance of a partitioned graph is (in this simple case) computed as the maximum of the imbalance for each partition, which is roughly the quotient partition size / average partition size
. So if we allow a maximum imbalance of 2, this means that the largest partition is twice as big as the average partition. Note however that ufactor
doesn't specify the imbalance directly, it specifies how many permille away from 1 the imbalance is allowed to be.
So ufactor=143
actually means that your maximal allowed imbalance is 1.143, which makes sense since your clusters are not that far from each other. So in your case, you will probably use larger values for ufactor to allow the groups to be of quite different sizes.
Consequences of large imbalance
If your imbalance is too large, it might happen that all the strongly-connected parts land in the same partition while only isolated nodes are put in the other partitions. This is due to the fact that the algorithm tries to minimize the number of cut edges between different partitions, which will be lower if we put all the high-degree nodes in the same partition.
What about spectral partitioning, ...?
The general approach of METIS works as follows:
Most input graphs are too large to partition directly, which is why so-called multilevel methods are used:
- The graph is first coarsened (nodes are combined while trying to preserve the graph structure) until its size becomes feasible to partition directly
- The coarsest graph is partitioned using an initial partitioning technique, where we could use a variety of approaches (combinatorial bisection, spectral bisection, exact solutions using ILPs, ...).
- The graph is then uncoarsened, where in each step a small number of nodes are moved from partition to partition in a local search to improve the overall edge cut.
My personal recommendation
I should however note that while graph partitioning might be a valid model for your case, METIS itself might not be the ideal implementation for you:
As you can read on the METIS homepage, it is mostly used for rather sparse graphs ('finite element methods, linear programming, VLSI, and transportation'), whereas social networks are much denser and have a different structure (degrees follow a power-law distribution)
The coarsening approach of METIS uses heavy edge matching to combine nodes which are somehow close together, which works great for the intended applications, for social networks however, clustering-based coarsening techniques might prove more efficient.
Another library that is a bit slower in general, but implements some presets especially for social networks is KaHIP, see the manual for details.
(I should mention however that I am biased in this regard, since I worked extensively with this library ;-) )