This module contains algorithms for the computation of community structure on graphs.
| community_structure (g, n_iter, n_spins[, gamma, corr, spins, ...]) | Obtain the community structure for the given graph, used a Potts model approach. |
| modularity (g, prop[, weight]) | Calculate Newman’s modularity. |
| condensation_graph (g, prop[, weight]) | Obtain the condensation graph, where each vertex with the same ‘prop’ value is condensed in one vertex. |
Obtain the community structure for the given graph, used a Potts model approach.
| Parameters: | g : Graph
n_iter : int
n_spins : int
gamma : float (optional, default: 1.0)
corr : string (optional, default: “erdos”)
spins : PropertyMap
weight : PropertyMap (optional, default: None)
t_range : tuple of floats (optional, default: (100.0, 0.01))
verbose : bool (optional, default: False)
history_file : string (optional, default: None)
|
|---|---|
| Returns: | spins : PropertyMap
|
See also
Notes
The method of community detection covered here is an implementation of what was proposed in [reichard-statistical-2006]. It consists of a simulated annealing algorithm which tries to minimize the following hamiltonian:
where p_{ij} is the probability of vertices i and j being connected, which reduces the problem of community detection to finding the ground states of a Potts spin-glass model. It can be shown that minimizing this hamiltonan, with \gamma=1, is equivalent to maximizing Newman’s modularity ([newman-modularity-2006]). By increasing the parameter \gamma, it’s possible also to find sub-communities.
It is possible to select three policies for choosing p_{ij} and thus choosing the null model: “erdos” selects a Erdos-Reyni random graph, “uncorrelated” selects an arbitrary random graph with no vertex-vertex correlations, and “correlated” selects a random graph with average correlation taken from the graph itself. Optionally a weight property can be given by the weight option.
The most important parameters for the algorithm are the initial and final temperatures (t_range), and total number of iterations (max_iter). It normally takes some trial and error to determine the best values for a specific graph. To help with this, the history option can be used, which saves to a chosen file the temperature and number of spins per iteration, which can be used to determined whether or not the algorithm converged to the optimal solution. Also, the verbose option prints the computation status on the terminal.
Note
If the spin property already exists before the computation starts, it’s not re-sampled at the beginning. This means that it’s possible to continue a previous run, if you saved the graph, by properly setting t_range value, and using the same spin property.
If enabled during compilation, this algorithm runs in parallel.
References
| [reichard-statistical-2006] | Joerg Reichardt and Stefan Bornholdt, “Statistical Mechanics of Community Detection”, Phys. Rev. E 74 (2006) 016110, arXiv:cond-mat/0603718 |
| [newman-modularity-2006] | M. E. J. Newman, “Modularity and community structure in networks”, Proc. Natl. Acad. Sci. USA 103, 8577-8582 (2006), arXiv:physics/0602124 |
Examples
>>> from pylab import *
>>> from numpy.random import seed
>>> seed(42)
>>> g = gt.load_graph("community.xml")
>>> pos = (g.vertex_properties["pos_x"], g.vertex_properties["pos_y"])
>>> spins = gt.community_structure(g, 10000, 20, t_range=(5, 0.1),
... history_file="community-history1")
>>> gt.graph_draw(g, pos=pos, pin=True, vsize=0.3, vcolor=spins,
... output="comm1.png", size=(10,10))
<...>
>>> spins = gt.community_structure(g, 10000, 40, t_range=(5, 0.1),
... gamma=2.5,
... history_file="community-history2")
>>> gt.graph_draw(g, pos=pos, pin=True, vsize=0.3, vcolor=spins,
... output="comm2.png", size=(10,10))
<...>
>>> clf()
>>> xlabel("iterations")
<...>
>>> ylabel("number of communities")
<...>
>>> a = loadtxt("community-history1").transpose()
>>> plot(a[0], a[2])
[...]
>>> savefig("comm1-hist.png")
>>> clf()
>>> xlabel("iterations")
<...>
>>> ylabel("number of communities")
<...>
>>> a = loadtxt("community-history2").transpose()
>>> plot(a[0], a[2])
[...]
>>> savefig("comm2-hist.png")
The community structure with \gamma=1:
The community structure with \gamma=2.5:
Calculate Newman’s modularity.
| Parameters: | g : Graph
prop : PropertyMap
weight : PropertyMap (optional, default: None)
|
|---|---|
| Returns: | modularity : float
|
See also
Notes
Given a specific graph partition specified by prop, Newman’s modularity [newman-modularity-2006] is defined by:
where e_{rs} is the fraction of edges which fall between vertices with spin s and r.
If enabled during compilation, this algorithm runs in parallel.
References
| [newman-modularity-2006] | M. E. J. Newman, “Modularity and community structure in networks”, Proc. Natl. Acad. Sci. USA 103, 8577-8582 (2006), arXiv:physics/0602124 |
Examples
>>> from pylab import *
>>> from numpy.random import seed
>>> seed(42)
>>> g = gt.load_graph("community.dot")
>>> spins = gt.community_structure(g, 10000, 10)
>>> gt.modularity(g, spins)
0.53531418856240398
Obtain the condensation graph, where each vertex with the same ‘prop’ value is condensed in one vertex.
| Parameters: | g : Graph
prop : PropertyMap
weight : PropertyMap (optional, default: None)
|
|---|---|
| Returns: | condensation_graph : Graph
vcount : PropertyMap
ecount : PropertyMap
|
See also
Notes
Each vertex in the condensation graph represents one community in the original graph (vertices with the same ‘prop’ value’), and the edges represent existent edges between vertices of the respective communities in the original graph.
Examples
>>> from pylab import *
>>> from numpy.random import poisson, seed
>>> seed(42)
>>> g = gt.random_graph(1000, lambda: poisson(3), directed=False)
>>> spins = gt.community_structure(g, 10000, 100)
>>> ng = gt.condensation_graph(g, spins)
>>> size = ng[0].new_vertex_property("double")
>>> size.a = log(ng[1].a+1)
>>> gt.graph_draw(ng[0], vsize=size, vcolor=size, splines=True,
... eprops={"len":20, "penwidth":10}, vprops={"penwidth":10},
... output="comm-network.png", size=(10,10))
<...>
Community network of a random graph. The color and sizes of the nodes indicate the size of the corresponding community.