Type: Package
Title: Predict Information Cascade by Self-Exciting Point Process
Version: 1.1
Date: 2022-05-20
Author: Hera He, Murat Erdogdu, Qingyuan Zhao
Maintainer: Qingyuan Zhao <qingyzhao@gmail.com>
Description: An implementation of a self-exciting point process model for information cascades, which occur when many people engage in the same acts after observing the actions of others (e.g. post resharings on Facebook or Twitter). It provides functions to estimate the infectiousness of an information cascade and predict its popularity given the observed history. See http://snap.stanford.edu/seismic/ for more information and datasets.
URL: http://snap.stanford.edu/seismic/
License: GPL-3
NeedsCompilation: no
RoxygenNote: 7.1.2
Packaged: 2022-05-20 20:56:45 UTC; qyzhao
Repository: CRAN
Date/Publication: 2022-05-20 21:30:02 UTC

Estimate the infectiousness of an information cascade

Description

Estimate the infectiousness of an information cascade

Usage

get.infectiousness(
  share.time,
  degree,
  p.time,
  max.window = 2 * 60 * 60,
  min.window = 300,
  min.count = 5
)

Arguments

share.time

observed resharing times, sorted in increasing order, with share.time[1] = 0

degree

observed node degrees

p.time

equally spaced vector of times at which to estimate the infectiousness, with p.time[1] = 0

max.window

maximum span of the locally weighted kernel

min.window

minimum span of the locally weighted kernel

min.count

the minimum number of resharings included in the window

Details

Use a triangular kernel whose shape changes over time. At time p.time, use a triangular kernel with slope = min(max(1/(p.time/2), 1/min.window), max.window).
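The time-varying slope can be sketched in a few lines of R. This is a minimal illustration, not the package's code: kernel.slope is a hypothetical helper, and it assumes that max.window and min.window bound the kernel span, so that the slope 1/(p.time/2) is clamped between 1/max.window and 1/min.window. The package source remains authoritative.

```r
# Hypothetical helper (not part of the package): slope of the triangular
# kernel at each estimation time, ASSUMING the span is half the elapsed
# time, limited to the interval [min.window, max.window].
kernel.slope <- function(p.time, max.window = 2 * 60 * 60, min.window = 300) {
  slope <- 1 / (p.time / 2)  # Inf at p.time = 0; clamped on the next line
  pmin(pmax(slope, 1 / max.window), 1 / min.window)
}

# Early times hit the min.window bound, late times the max.window bound.
slopes <- kernel.slope(c(0, 60, 1200, 36000))
print(1 / slopes)  # implied kernel spans in seconds
```

Under this assumption the kernel span grows with the elapsed time but never falls below min.window or exceeds max.window.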

Value

a list of three vectors, one of which, infectiousness, holds the estimated infectiousness at each time in p.time.

Examples

data(tweet)
pred.time <- seq(0, 6 * 60 * 60, by = 60)
infectiousness <- get.infectiousness(tweet[, 1], tweet[, 2], pred.time)
plot(pred.time, infectiousness$infectiousness)

Integration with respect to locally weighted kernel

Description

Integration with respect to locally weighted kernel

Usage

linear.kernel(t1, t2, ptime, slope, c = 0.0006265725)

power.kernel(
  t1,
  t2,
  ptime,
  share.time,
  slope,
  theta = 0.2314843,
  cutoff = 300,
  c = 0.0006265725
)

integral.memory.kernel(
  p.time,
  share.time,
  slope,
  window,
  theta = 0.2314843,
  cutoff = 300,
  c = 0.0006265725
)

Arguments

t1

a vector of lower limits of integration

t2

a vector of upper limits of integration

ptime

the time (a scalar) at which to estimate infectiousness and predict popularity

slope

slope of the linear kernel

c

the constant density when t is less than the cutoff

share.time

observed resharing times, sorted in increasing order, with share.time[1] = 0

theta

exponent of the power law

cutoff

the cutoff value where the density changes from constant to power law

p.time

equally spaced vector of times at which to estimate the infectiousness, with p.time[1] = 0

window

size of the linear kernel

Value

linear.kernel returns the integral from vector t1 to vector t2 of c*[slope(t-ptime) + 1].

power.kernel returns the integral from vector t1 to vector t2 of c*((t-share.time)/cutoff)^(-(1+theta))[slope(t-ptime) + 1].

integral.memory.kernel returns the vector whose ith entry is integral_-inf^inf phi_share.time[i]*kernel(t-p.time).
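The integrand given for linear.kernel has a simple closed form, which the following sketch checks against numerical quadrature. The helper linear.integral and the chosen values of ptime, slope, and the limits are illustrative assumptions, not the package's implementation.

```r
# Hypothetical helper (not the package's linear.kernel): closed form of the
# integral from t1 to t2 of c * (slope * (t - ptime) + 1).
linear.integral <- function(t1, t2, ptime, slope, c = 0.0006265725) {
  c * (slope * ((t2^2 - t1^2) / 2 - ptime * (t2 - t1)) + (t2 - t1))
}

# Illustrative values: a kernel spanning 300 seconds that ends at ptime = 600.
ptime <- 600
slope <- 1 / 300
closed <- linear.integral(300, 600, ptime, slope)

# Cross-check against numerical integration of the same integrand.
quad <- integrate(
  function(t) 0.0006265725 * (slope * (t - ptime) + 1),
  lower = 300, upper = 600
)$value

print(c(closed = closed, quad = quad))
```

With these values the weight rises linearly from 0 at t = 300 to 1 at t = 600, so the integral equals c times the triangle area of 150.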

See Also

memory.pdf


Memory kernel

Description

Probability density function and complementary cumulative distribution function for the human reaction time.

Usage

memory.pdf(t, theta = 0.2314843, cutoff = 300, c = 0.0006265725)

memory.ccdf(t, theta = 0.2314843, cutoff = 300, c = 0.0006265725)

Arguments

t

time

theta

exponent of the power law

cutoff

the cutoff value where the density changes from constant to power law

c

the constant density when t is less than the cutoff

Details

The default values are measured from a real Twitter data set.

Value

memory.pdf returns the density at t. memory.ccdf returns the complementary cumulative distribution function at t (the probability of a reaction time greater than t).
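The shape of the memory kernel can be illustrated with a short sketch. The functions sketch.pdf and sketch.ccdf are hypothetical stand-ins written from the argument descriptions (constant density c below the cutoff, power law with exponent theta above it). They assume the density integrates to one and are not the package's memory.pdf and memory.ccdf.

```r
# Hypothetical re-implementation for illustration only; the package's
# memory.pdf and memory.ccdf are authoritative.
sketch.pdf <- function(t, theta = 0.2314843, cutoff = 300, c = 0.0006265725) {
  # Constant density before the cutoff, power-law decay afterwards.
  ifelse(t < cutoff, c, c * (t / cutoff)^(-(1 + theta)))
}

sketch.ccdf <- function(t, theta = 0.2314843, cutoff = 300, c = 0.0006265725) {
  # ASSUMES the density integrates to one, so P(T > t) = 1 - c * t before
  # the cutoff; the tail is the integral of the power law from t to infinity.
  ifelse(t < cutoff, 1 - c * t, c * cutoff / theta * (t / cutoff)^(-theta))
}

# Total mass implied by the defaults: constant part plus power-law tail.
total.mass <- 0.0006265725 * 300 * (1 + 1 / 0.2314843)
print(total.mass)

# Survival probabilities at 0 seconds, 5 minutes, and 1 hour.
print(sketch.ccdf(c(0, 300, 3600)))
```

The total mass is very close to one for the default parameters, which is consistent with c acting as the normalizing constant for the given theta and cutoff.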


Predict the popularity of an information cascade

Description

Predict the popularity of an information cascade

Usage

pred.cascade(
  p.time,
  infectiousness,
  share.time,
  degree,
  n.star = 100,
  features.return = FALSE
)

Arguments

p.time

equally spaced vector of times at which to estimate the infectiousness, with p.time[1] = 0

infectiousness

a vector of estimated infectiousness, returned by get.infectiousness

share.time

observed resharing times, sorted in increasing order, with share.time[1] = 0

degree

observed node degrees

n.star

the average node degree in the social network

features.return

if TRUE, returns a matrix of features to be used to further calibrate the prediction

Value

a vector of predicted popularity at each time in p.time.

Examples

data(tweet)
pred.time <- seq(0, 6 * 60 * 60, by = 60)
infectiousness <- get.infectiousness(tweet[, 1], tweet[, 2], pred.time)
pred <- pred.cascade(pred.time, infectiousness$infectiousness, tweet[, 1], tweet[, 2], n.star = 100)
plot(pred.time, pred)

Predicting information cascades with a self-exciting point process model

Description

This package implements a self-exciting point process model for information cascades. An information cascade occurs when many people engage in the same acts after observing the actions of others. Typical examples are post/photo resharings on Facebook and retweets on Twitter. The package provides functions to estimate the infectiousness of an information cascade and predict its popularity given the observed history. For more information, see http://snap.stanford.edu/seismic/.

References

SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity by Q. Zhao, M. Erdogdu, H. He, A. Rajaraman, J. Leskovec, ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2015.


An example information cascade

Description

A dataset containing the (relative) resharing times and node degrees for all resharings of a tweet. The original Twitter ID is 127001313513967616.

Format

A data frame with 15563 rows and 2 columns

Details