Title: Motif-Based Spectral Clustering of Weighted Directed Networks
Version: 0.2.3
Description: Tools for spectral clustering of weighted directed networks using motif adjacency matrices. Methods perform well on large and sparse networks, and random sampling methods for generating weighted directed networks are also provided. Based on methodology detailed in Underwood, Elliott and Cucuringu (2020) <doi:10.48550/arXiv.2004.01293>.
URL: https://github.com/wgunderwood/motifcluster
Language: en-US
BugReports: https://github.com/wgunderwood/motifcluster/issues
License: GPL-3
Encoding: UTF-8
RoxygenNote: 7.2.1
Depends: R (≥ 3.6.0)
Imports: igraph (≥ 1.2.5), Matrix (≥ 1.2), RSpectra (≥ 0.16.0)
Suggests: covr (≥ 3.5.0), knitr (≥ 1.28), mclust (≥ 5.4.6), rmarkdown (≥ 2.1), testthat (≥ 2.3.2)
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2022-11-18 04:29:33 UTC; will
Author: William George Underwood [aut, cre]
Maintainer: William George Underwood <wgu2@princeton.edu>
Repository: CRAN
Date/Publication: 2022-11-18 08:10:02 UTC

Compute a right-multiplication with the ones matrix

Description

Compute a * (b %*% one_mat) where a, b, ones_mat are square matrices of the same size, and ones_mat contains all entries equal to one. The product * is an entry-wise (Hadamard) product, while %*% represents matrix multiplication. This method is more efficient than the naive approach when a or b are sparse.

Usage

a_b_one(a, b)

Arguments

a, b

Square matrices.

Value

The square matrix a * (b %*% one_mat).


Compute a left-multiplication with the ones matrix

Description

Compute a * (one_mat %*% b) where a, b, ones_mat are square matrices of the same size, and ones_mat contains all entries equal to one. The product * is an entry-wise (Hadamard) product, while %*% represents matrix multiplication. This method is more efficient than the naive approach when a or b are sparse.

Usage

a_one_b(a, b)

Arguments

a, b

Square matrices.

Value

The square matrix a * (one_mat %*% b).


Build sparse adjacency matrix

Description

Build the sparse adjacency matrix G from a graph adjacency matrix.

Usage

build_G(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

The adjacency matrix G in sparse form.


Build double-edge adjacency matrix

Description

Build the sparse double-edge adjacency matrix Gd from a graph adjacency matrix.

Usage

build_Gd(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A double-edge adjacency matrix Gd in sparse form.


Build product adjacency matrix

Description

Build the sparse product adjacency matrix Jp from a graph adjacency matrix.

Usage

build_Gp(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A product adjacency matrix Jp in sparse form.


Build single-edge adjacency matrix

Description

Build the sparse single-edge adjacency matrix Gs from a graph adjacency matrix.

Usage

build_Gs(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A single-edge adjacency matrix Gs in sparse form.


Build identity matrix

Description

Build the sparse identity matrix Id from a graph adjacency matrix.

Usage

build_Id(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

An identity matrix Id in sparse form.


Build directed indicator matrix

Description

Build the sparse directed indicator matrix J from a graph adjacency matrix.

Usage

build_J(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A directed indicator matrix J in sparse form.


Build missing-edge indicator matrix

Description

Build the missing-edge indicator matrix J0 from a graph adjacency matrix.

Usage

build_J0(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A missing-edge indicator matrix J0.


Build double-edge indicator matrix

Description

Build the sparse double-edge indicator matrix Jd from a graph adjacency matrix.

Usage

build_Jd(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A double-edge indicator matrix Jd in sparse form.


Build edge-and-diagonal indicator matrix

Description

Build the sparse edge-and-diagonal indicator matrix Je from a graph adjacency matrix.

Usage

build_Je(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

An edge-and-diagonal indicator matrix Je in sparse form.


Build vertex-distinct indicator matrix

Description

Build the vertex-distinct indicator matrix Jn from a graph adjacency matrix.

Usage

build_Jn(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A vertex-distinct indicator matrix Jn.


Build single-edge indicator matrix

Description

Build the sparse single-edge indicator matrix Js from a graph adjacency matrix.

Usage

build_Js(adj_mat)

Arguments

adj_mat

The original adjacency matrix.

Value

A single-edge indicator matrix Js in sparse form.


Build a Laplacian matrix

Description

Build a Laplacian matrix (combinatorial Laplacian or random-walk Laplacian) from a symmetric (weighted) graph adjacency matrix.

Usage

build_laplacian(adj_mat, type_lap = c("comb", "rw"))

Arguments

adj_mat

Symmetric adjacency matrix from which to build the Laplacian.

type_lap

Type of Laplacian to build. One of "comb" (combinatorial) or "rw" (random-walk).

Value

The specified Laplacian matrix.

Examples

adj_mat <- matrix(c(1:9), nrow = 3)
build_laplacian(adj_mat, "rw")

Build a motif adjacency matrix

Description

Build a motif adjacency matrix from an adjacency matrix.

Usage

build_motif_adjacency_matrix(
  adj_mat,
  motif_name,
  motif_type = c("struc", "func"),
  mam_weight_type = c("unweighted", "mean", "poisson"),
  mam_method = c("sparse", "dense")
)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_name

Motif used for the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build. One of "func" or "struc".

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse". The sparse formulation avoids generating large dense matrices so tends to be faster for large sparse graphs.

Details

Entry (i, j) of a motif adjacency matrix is the sum of the weights of all motifs containing both nodes i and j. The motif is specified by name and the type of motif instance can be one of:

The weighting scheme can be one of:

Value

A motif adjacency matrix.

Examples

adj_mat <- matrix(c(1:9), nrow = 3)
build_motif_adjacency_matrix(adj_mat, "M1", "func", "mean")

Get cluster assignments from spectrum using k-means++

Description

Get a vector of cluster assignments from a spectrum, using k-means++ and num_clusts clusters.

Usage

cluster_spectrum(spectrum, num_clusts)

Arguments

spectrum

A list containing $vects; the matrix of eigenvectors to pass to k-means++.

num_clusts

The number of clusters to find.

Value

A length-nrow(spectrum$vects) vector of integers from 1 to num_clusts, representing cluster assignments.


Generate a small graph for demonstrations

Description

Generate the sparse and dense adjacency matrices of a small weighted directed graph, for demonstrating methods and running tests.

Usage

demonstration_graph()

Value

A list with two entries: adj_mat_dense is the adjacency matrix in dense form, and adj_mat_sparse is the adjacency matrix in sparse form.


Set diagonal entries to zero and sparsify

Description

Set the diagonal entries of a matrix to zero and convert it to sparse form.

Usage

drop0_killdiag(some_mat)

Arguments

some_mat

A square matrix.

Value

A sparse-form copy of some_mat with its diagonal entries set to zero.


Compute first few eigenvalues and eigenvectors of a matrix

Description

Compute the first few eigenvalues (by magnitude) and associated eigenvectors of a matrix.

Usage

get_first_eigs(some_mat, num_eigs)

Arguments

some_mat

Matrix for which eigenvalues and eigenvectors are to be calculated.

num_eigs

Number of eigenvalues and eigenvectors to calculate.

Value

A list with two entries: vals contains a length-num_eigs vector of the first few eigenvalues, and vects contains an nrow(some_mat) by num_eigs matrix of the associated eigenvectors.


Get largest connected component

Description

Get the indices of the vertices in the largest connected component of a graph from its adjacency matrix.

Usage

get_largest_component(adj_mat)

Arguments

adj_mat

An adjacency matrix of a graph.

Value

A vector of indices corresponding to the vertices in the largest connected component.

Examples

adj_mat <- matrix(c(0, 1, 0, 0, 0, 0, 0, 0, 0), nrow = 3)
get_largest_component(adj_mat)

Get common motif names

Description

Get the names of some common motifs as strings.

Usage

get_motif_names()

Value

A vector of names (strings) of common motifs.


kmeans++ clustering

Description

Use the kmeans++ algorithm to cluster points into k clusters, as implemented in the deprecated LICORS package, using the built-in function kmeans.

Usage

kmeanspp(data, k = 2, iter.max = 100, nstart = 10, ...)

Arguments

data

An N \times d matrix, where there are N samples in dimension d.

k

The number of clusters.

iter.max

The maximum number of iterations.

nstart

The number of restarts.

...

Additional arguments passed to kmeans.

Value

A list with 9 entries:

References

Arthur, D. and S. Vassilvitskii (2007). “k-means++: The advantages of careful seeding.” In H. Gabow (Ed.), Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms [SODA07], Philadelphia, pp. 1027-1035. Society for Industrial and Applied Mathematics.

See Also

kmeans

Examples

set.seed(1984)
n <- 100
X = matrix(rnorm(n), ncol = 2)
Y = matrix(runif(length(X)*2, -1, 1), ncol = ncol(X))
Z = rbind(X, Y)
cluster_Z = kmeanspp(Z, k = 5)

Perform the motif adjacency matrix calculations for motif M1

Description

Perform the motif adjacency matrix calculations for motif M1

Usage

mam_M1(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M10

Description

Perform the motif adjacency matrix calculations for motif M10

Usage

mam_M10(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M11

Description

Perform the motif adjacency matrix calculations for motif M11

Usage

mam_M11(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M12

Description

Perform the motif adjacency matrix calculations for motif M12

Usage

mam_M12(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M13

Description

Perform the motif adjacency matrix calculations for motif M13

Usage

mam_M13(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M2

Description

Perform the motif adjacency matrix calculations for motif M2

Usage

mam_M2(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M3

Description

Perform the motif adjacency matrix calculations for motif M3

Usage

mam_M3(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M4

Description

Perform the motif adjacency matrix calculations for motif M4

Usage

mam_M4(adj_mat, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M5

Description

Perform the motif adjacency matrix calculations for motif M5

Usage

mam_M5(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M6

Description

Perform the motif adjacency matrix calculations for motif M6

Usage

mam_M6(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M7

Description

Perform the motif adjacency matrix calculations for motif M7

Usage

mam_M7(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M8

Description

Perform the motif adjacency matrix calculations for motif M8

Usage

mam_M8(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif M9

Description

Perform the motif adjacency matrix calculations for motif M9

Usage

mam_M9(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif Mcoll

Description

Perform the motif adjacency matrix calculations for motif Mcoll

Usage

mam_Mcoll(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif Md

Description

Perform the motif adjacency matrix calculations for motif Md

Usage

mam_Md(adj_mat, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif Mexpa

Description

Perform the motif adjacency matrix calculations for motif Mexpa

Usage

mam_Mexpa(adj_mat, motif_type, mam_weight_type, mam_method)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

mam_method

Which formulation to use. One of "dense" or "sparse".

Value

A motif adjacency matrix.


Perform the motif adjacency matrix calculations for motif Ms

Description

Perform the motif adjacency matrix calculations for motif Ms

Usage

mam_Ms(adj_mat, motif_type, mam_weight_type)

Arguments

adj_mat

Adjacency matrix from which to build the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to build.

mam_weight_type

The weighting scheme to use. One of "unweighted", "mean" or "product".

Value

A motif adjacency matrix.


Build a random sparse matrix

Description

Build a sparse matrix of size m * n with non-zero probability p. Edge weights can be unweighted, constant-weighted or Poisson-weighted.

Usage

random_sparse_matrix(m, n, p, sample_weight_type = "constant", w = 1)

Arguments

m, n

Dimension of matrix to build is (m, n).

p

Probability that each entry is non-zero (before weighting).

sample_weight_type

Type of weighting scheme.

w

Weight parameter.

Value

A random sparse matrix.


Run Laplace embedding

Description

Run Laplace embedding on a symmetric (weighted) adjacency matrix with a specified number of eigenvalues and eigenvectors.

Usage

run_laplace_embedding(adj_mat, num_eigs, type_lap = c("comb", "rw"))

Arguments

adj_mat

Symmetric adjacency matrix to be embedded.

num_eigs

Number of eigenvalues and eigenvectors for the embedding.

type_lap

Type of Laplacian for the embedding. One of "comb" (combinatorial) or "rw" (random-walk).

Value

A list with two entries: vals contains the length-num_eigs vector of the first few eigenvalues of the Laplacian, and vects contains an nrow(adj_mat) by num_eigs matrix of the associated eigenvectors.

Examples

adj_mat <- matrix(c(1:9), nrow = 3)
run_laplace_embedding(adj_mat, 2, "rw")

Run motif-based clustering

Description

Run motif-based clustering on the adjacency matrix of a (weighted directed) network, using a specified motif, motif type, weighting scheme, embedding dimension, number of clusters and Laplacian type.

Usage

run_motif_clustering(
  adj_mat,
  motif_name,
  motif_type = c("struc", "func"),
  mam_weight_type = c("unweighted", "mean", "product"),
  mam_method = c("sparse", "dense"),
  num_eigs = 2,
  type_lap = c("comb", "rw"),
  restrict = TRUE,
  num_clusts = 2
)

Arguments

adj_mat

Adjacency matrix to be embedded.

motif_name

Motif used for the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to use. One of "func" or "struc".

mam_weight_type

Weighting scheme for the motif adjacency matrix. One of "unweighted", "mean" or "product".

mam_method

The method to use for building the motif adjacency matrix. One of "sparse" or "dense".

num_eigs

Number of eigenvalues and eigenvectors for the embedding.

type_lap

Type of Laplacian for the embedding. One of "comb" or "rw".

restrict

Whether or not to restrict the motif adjacency matrix to its largest connected component before embedding.

num_clusts

The number of clusters to find.

Value

A list with 8 entries:

Examples

adj_mat <- matrix(c(1:16), nrow = 4)
run_motif_clustering(adj_mat, "M1", "func")

Run motif embedding

Description

Calculate a motif adjacency matrix for a given motif and motif type, restrict it to its largest connected component, and then run Laplace embedding with specified Laplacian type and number of eigenvalues and eigenvectors.

Usage

run_motif_embedding(
  adj_mat,
  motif_name,
  motif_type = c("struc", "func"),
  mam_weight_type = c("unweighted", "mean", "product"),
  mam_method = c("sparse", "dense"),
  num_eigs = 2,
  type_lap = c("comb", "rw"),
  restrict = TRUE
)

Arguments

adj_mat

Adjacency matrix to be embedded.

motif_name

Motif used for the motif adjacency matrix.

motif_type

Type of motif adjacency matrix to use. One of "func" or "struc".

mam_weight_type

Weighting scheme for the motif adjacency matrix. One of "unweighted", "mean" or "product".

mam_method

The method to use for building the motif adjacency matrix. One of "sparse" or "dense".

num_eigs

Number of eigenvalues and eigenvectors for the embedding.

type_lap

Type of Laplacian for the embedding. One of "comb" or "rw".

restrict

Whether or not to restrict the motif adjacency matrix to its largest connected component before embedding.

Value

A list with 7 entries:

Examples

adj_mat <- matrix(c(1:9), nrow = 3)
run_motif_embedding(adj_mat, "M1", "func")

Sample a bipartite stochastic block model (BSBM)

Description

Sample the (weighted) adjacency matrix of a (weighted) bipartite stochastic block model (BSBM) with specified parameters.

Usage

sample_bsbm(
  source_block_sizes,
  dest_block_sizes,
  bipartite_connection_matrix,
  bipartite_weight_matrix = NULL,
  sample_weight_type = c("unweighted", "constant", "poisson")
)

Arguments

source_block_sizes

A vector containing the size of each block of source vertices.

dest_block_sizes

A vector containing the size of each block of destination vertices.

bipartite_connection_matrix

A matrix containing the source block to destination block connection probabilities.

bipartite_weight_matrix

A matrix containing the source block to destination block weight parameters. Unused for sample_weight_type = "constant". Defaults to NULL.

sample_weight_type

The type of weighting scheme. One of "unweighted", "constant" or "poisson".

Value

A randomly sampled (weighted) adjacency matrix of a BSBM.

Examples

source_block_sizes <- c(10, 10)
dest_block_sizes <- c(10, 10, 10)
bipartite_connection_matrix <- matrix(c(0.8, 0.5, 0.1, 0.1, 0.5, 0.8),
      nrow = 2, byrow = TRUE)
bipartite_weight_matrix = matrix(c(20, 10, 2, 2, 10, 20),
      nrow = 2, byrow = TRUE)
sample_bsbm(source_block_sizes, dest_block_sizes,
      bipartite_connection_matrix, bipartite_weight_matrix, "poisson")

Sample a directed stochastic block model (DSBM)

Description

Sample the (weighted) adjacency matrix of a (weighted) directed stochastic block model (DSBM) with specified parameters.

Usage

sample_dsbm(
  block_sizes,
  connection_matrix,
  weight_matrix = NULL,
  sample_weight_type = c("unweighted", "constant", "poisson")
)

Arguments

block_sizes

A vector containing the size of each block of vertices.

connection_matrix

A matrix containing the block-to-block connection probabilities.

weight_matrix

A matrix containing the block-to-block weight parameters. Unused for sample_weight_type = "constant". Defaults to NULL.

sample_weight_type

The type of weighting scheme. One of "unweighted", "constant" or "poisson".

Value

A randomly sampled (weighted) adjacency matrix of a DSBM.

Examples

block_sizes <- c(10, 10)
connection_matrix <- matrix(c(0.8, 0.1, 0.1, 0.8), nrow = 2, byrow = TRUE)
weight_matrix <- matrix(c(10, 3, 3, 10), nrow = 2, byrow = TRUE)
sample_dsbm(block_sizes, connection_matrix, weight_matrix, "poisson")