networks.cluster.py

class apparent.networks.cluster.NetworkClusterer(clusterer=None, **kwargs)

A class to cluster networks using a pairwise distance matrix or user-provided cluster labels.

This class provides a single interface for clustering networks from their pairwise distance matrix. It accepts a clustering algorithm instance or one of three string identifiers (‘kmeans’, ‘agglomerative’, ‘dbscan’), and it can take manual cluster assignments in place of a distance matrix.

Parameters:

clusterer (object or str, optional) – A clustering algorithm instance or string identifier. If None, uses AgglomerativeClustering with 2 clusters and ‘average’ linkage. If string, must be one of ‘kmeans’, ‘agglomerative’, or ‘dbscan’.
**kwargs (dict) – Additional keyword arguments passed to the clustering algorithm.

model

The clustering model (e.g., KMeans, AgglomerativeClustering, or DBSCAN).

Type: object

labels_

Cluster labels for each graph.

Type: np.ndarray

Examples

>>> import numpy as np
>>> from apparent.networks import NetworkClusterer
>>>
>>> # Create sample distance matrix
>>> distances = np.array([[0, 1, 2], [1, 0, 1.5], [2, 1.5, 0]])
>>>
>>> # Cluster with default agglomerative clustering
>>> clusterer = NetworkClusterer()
>>> labels = clusterer.fit(pairwise_distances=distances)
>>> print(f"Cluster labels: {labels}")

__init__(clusterer=None, **kwargs)

Initialize the NetworkClusterer.

Parameters:

clusterer (object or str, optional) – A clustering algorithm instance or string identifier. If None, uses AgglomerativeClustering with 2 clusters and ‘average’ linkage. If string, must be one of ‘kmeans’, ‘agglomerative’, or ‘dbscan’.
**kwargs (dict) – Additional keyword arguments to initialize the clustering algorithm. For AgglomerativeClustering, common parameters include ‘n_clusters’, ‘linkage’, and ‘metric’.

Raises:

ValueError – If ‘ward’ linkage is specified with precomputed distances, or if an unsupported clustering method string is provided.

Examples

>>> from apparent.networks import NetworkClusterer
>>> clusterer = NetworkClusterer()  # Default agglomerative
>>> clusterer = NetworkClusterer('kmeans', n_clusters=3)
>>> clusterer = NetworkClusterer('dbscan', eps=0.5)

fit(pairwise_distances=None, manual_labels=None)

Fit the clustering model to the pairwise distance matrix, or accept manual labels.

This method performs clustering on the provided distance matrix or assigns manual labels. For algorithms that support precomputed distances, the distance matrix is used directly. Otherwise, the matrix is flattened to create feature vectors.

Parameters:

pairwise_distances (np.ndarray, optional) – A symmetric pairwise distance matrix (n x n) between graphs. Required unless manual_labels is provided.
manual_labels (np.ndarray, optional) – User-defined cluster labels for each graph. If provided, clustering is skipped and these labels are used directly.

Returns:

labels_ – Cluster labels for each graph. Also stored in self.labels_.

Return type:

np.ndarray

Raises:

ValueError – If neither pairwise_distances nor manual_labels is provided, or if the distance matrix is not square or not symmetric.
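The input checks listed above could look roughly like the following. The function name `validate_fit_inputs` and the tolerance `atol` are hypothetical, introduced only for this sketch; the library's own checks and error messages may differ.

```python
import numpy as np


def validate_fit_inputs(pairwise_distances=None, manual_labels=None, atol=1e-8):
    """Hypothetical validator; returns (distance_matrix, manual_labels)."""
    if manual_labels is not None:
        # Manual labels take precedence and skip clustering entirely.
        return None, np.asarray(manual_labels)
    if pairwise_distances is None:
        raise ValueError("Provide either pairwise_distances or manual_labels")
    d = np.asarray(pairwise_distances, dtype=float)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        raise ValueError(f"Distance matrix must be square, got shape {d.shape}")
    if not np.allclose(d, d.T, atol=atol):
        raise ValueError("Distance matrix must be symmetric")
    return d, None
```

Comparing against the transpose with a small tolerance avoids rejecting matrices whose asymmetry comes only from floating-point rounding.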

Examples

>>> import numpy as np
>>> from apparent.networks import NetworkClusterer
>>>
>>> distances = np.array([[0, 1, 2], [1, 0, 1.5], [2, 1.5, 0]])
>>> clusterer = NetworkClusterer(n_clusters=2)
>>> labels = clusterer.fit(pairwise_distances=distances)
>>> print(f"Cluster labels: {labels}")