Type: Package
Title: Predict Information Cascade by Self-Exciting Point Process
Version: 1.1
Date: 2022-05-20
Author: Hera He, Murat Erdogdu, Qingyuan Zhao
Maintainer: Qingyuan Zhao <qingyzhao@gmail.com>
Description: An implementation of a self-exciting point process model for information cascades, which occur when many people engage in the same acts after observing the actions of others (e.g. post resharings on Facebook or Twitter). It provides functions to estimate the infectiousness of an information cascade and predict its popularity given the observed history. See http://snap.stanford.edu/seismic/ for more information and datasets.
URL: http://snap.stanford.edu/seismic/
License: GPL-3
NeedsCompilation: no
RoxygenNote: 7.1.2
Packaged: 2022-05-20 20:56:45 UTC; qyzhao
Repository: CRAN
Date/Publication: 2022-05-20 21:30:02 UTC
Estimate the infectiousness of an information cascade
Description
Estimate the infectiousness of an information cascade
Usage
get.infectiousness(
share.time,
degree,
p.time,
max.window = 2 * 60 * 60,
min.window = 300,
min.count = 5
)
Arguments
share.time | observed resharing times, sorted, with share.time[1] = 0
degree | observed node degrees
p.time | equally spaced vector of times at which to estimate the infectiousness, with p.time[1] = 0
max.window | maximum span of the locally weighted kernel
min.window | minimum span of the locally weighted kernel
min.count | the minimum number of resharings included in the window
Details
Use a triangular kernel whose shape changes over time. At time p.time, use a triangular kernel with slope = min(max(1/(p.time/2), 1/max.window), 1/min.window), so that the kernel span p.time/2 is kept between min.window and max.window.
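The clamping described above can be sketched in base R. This is a minimal illustration, assuming the slope is the reciprocal of the kernel span and that the span p.time/2 is clamped to [min.window, max.window]. kernel.span is a hypothetical helper, not a function exported by the package, and the sketch ignores min.count, so the package's internal computation may differ.

```r
# Hypothetical helper (not part of the package): kernel span at each
# estimation time, assuming span = p.time / 2 clamped to
# [min.window, max.window].
kernel.span <- function(p.time, max.window = 2 * 60 * 60, min.window = 300) {
  pmin(pmax(p.time / 2, min.window), max.window)
}

# Estimation times from 0 seconds to 6 hours
p.time <- c(0, 60, 600, 3600, 6 * 3600)
span <- kernel.span(p.time)

# Assumed relation between slope and span for a triangular kernel
slope <- 1 / span

data.frame(p.time, span, slope)
```

Under these assumptions, early estimation times all share the narrowest kernel (min.window), and late ones are capped at max.window.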
Value
a list of three vectors:
infectiousness: the estimated infectiousness
p.up: the upper bound of the approximate 95 percent confidence interval
p.low: the lower bound of the approximate 95 percent confidence interval
Examples
data(tweet)
pred.time <- seq(0, 6 * 60 * 60, by = 60)
infectiousness <- get.infectiousness(tweet[, 1], tweet[, 2], pred.time)
plot(pred.time, infectiousness$infectiousness)
Integration with respect to locally weighted kernel
Description
Integration with respect to locally weighted kernel
Usage
linear.kernel(t1, t2, ptime, slope, c = 0.0006265725)
power.kernel(
t1,
t2,
ptime,
share.time,
slope,
theta = 0.2314843,
cutoff = 300,
c = 0.0006265725
)
integral.memory.kernel(
p.time,
share.time,
slope,
window,
theta = 0.2314843,
cutoff = 300,
c = 0.0006265725
)
Arguments
t1 | a vector of lower limits of integration
t2 | a vector of upper limits of integration
ptime | the time (a scalar) at which to estimate infectiousness and predict popularity
slope | slope of the linear kernel
c | the constant density when t is less than the cutoff
share.time | observed resharing times, sorted, with share.time[1] = 0
theta | exponent of the power law
cutoff | the cutoff value where the density changes from constant to power law
p.time | equally spaced vector of times at which to estimate the infectiousness, with p.time[1] = 0
window | size of the linear kernel
Value
linear.kernel
returns the integral from vector t1 to vector t2 of c * [slope * (t - ptime) + 1] dt;
power.kernel
returns the integral from vector t1 to vector t2 of c * ((t - share.time)/cutoff)^(-(1 + theta)) * [slope * (t - ptime) + 1] dt;
integral.memory.kernel
returns the vector whose i-th entry is the integral from -inf to inf of phi_share.time[i](t) * kernel(t - p.time) dt.
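The linear.kernel integrand is a polynomial of degree one in t, so its integral has a simple closed form. The sketch below is a hypothetical re-derivation in base R, not the package's implementation: linear.kernel.sketch is an illustrative name, and the example limits are chosen arbitrarily so that the integrand stays positive. It compares the closed form against base R's integrate().

```r
# Hypothetical re-derivation (not the package's code): closed form of
#   integral from t1 to t2 of c * (slope * (t - ptime) + 1) dt
linear.kernel.sketch <- function(t1, t2, ptime, slope, c = 0.0006265725) {
  c * ((t2 - t1) + slope * ((t2 - ptime)^2 - (t1 - ptime)^2) / 2)
}

# Arbitrary example values: a kernel of span 300 seconds ending at ptime = 600
t1 <- 350
t2 <- 550
ptime <- 600
slope <- 1 / 300

closed.form <- linear.kernel.sketch(t1, t2, ptime, slope)

# Numerical check of the same integral
numeric.value <- integrate(
  function(t) 0.0006265725 * (slope * (t - ptime) + 1),
  lower = t1, upper = t2
)$value

all.equal(closed.form, numeric.value)
```

Because the closed form uses only arithmetic on t1 and t2, it also works elementwise when both are vectors of equal length.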
Functions
-
power.kernel
: Power-law kernel -
integral.memory.kernel
: Integral of the kernel
Memory kernel
Description
Probability density function and complementary cumulative distribution function for the human reaction time.
Usage
memory.pdf(t, theta = 0.2314843, cutoff = 300, c = 0.0006265725)
memory.ccdf(t, theta = 0.2314843, cutoff = 300, c = 0.0006265725)
Arguments
t | time
theta | exponent of the power law
cutoff | the cutoff value where the density changes from constant to power law
c | the constant density when t is less than the cutoff
Details
The default values are measured from a real Twitter data set.
Value
the density at t
memory.pdf
returns the density function at t.
memory.ccdf
returns the ccdf (the probability of a reaction time greater than t).
Functions
-
memory.ccdf
: Complementary cumulative distribution function
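The argument descriptions suggest a density that is constant (c) below the cutoff and follows a power law with exponent theta above it. The following base R sketch spells out that reading for t >= 0. The names memory.pdf.sketch and memory.ccdf.sketch are hypothetical, and the exact functional form is an assumption inferred from the arguments and from the power.kernel integrand, not a copy of the package's code.

```r
# Hypothetical re-implementation for illustration (assumes t >= 0);
# use memory.pdf() and memory.ccdf() from the package in practice.
memory.pdf.sketch <- function(t, theta = 0.2314843, cutoff = 300,
                              c = 0.0006265725) {
  # constant density below the cutoff, power-law decay at and above it
  ifelse(t < cutoff, c, c * (t / cutoff)^(-(1 + theta)))
}

memory.ccdf.sketch <- function(t, theta = 0.2314843, cutoff = 300,
                               c = 0.0006265725) {
  # total mass of the power-law tail beyond the cutoff
  tail.mass <- c * cutoff / theta
  ifelse(t < cutoff,
         c * (cutoff - t) + tail.mass,        # rest of the flat part plus the tail
         tail.mass * (t / cutoff)^(-theta))   # integral of the power law from t to Inf
}

# With the default parameters the total mass is very close to 1
memory.ccdf.sketch(0)
```

Under this reading the default c is not a free parameter: it is the value that makes the density integrate to one for the given theta and cutoff.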
Predict the popularity of an information cascade
Description
Predict the popularity of an information cascade
Usage
pred.cascade(
p.time,
infectiousness,
share.time,
degree,
n.star = 100,
features.return = FALSE
)
Arguments
p.time | equally spaced vector of times at which to estimate the infectiousness, with p.time[1] = 0
infectiousness | a vector of estimated infectiousness, returned by
share.time | observed resharing times, sorted, with share.time[1] = 0
degree | observed node degrees
n.star | the average node degree in the social network
features.return | if TRUE, returns a matrix of features to be used to further calibrate the prediction
Value
a vector of predicted popularity at each time in p.time.
Examples
data(tweet)
pred.time <- seq(0, 6 * 60 * 60, by = 60)
infectiousness <- get.infectiousness(tweet[, 1], tweet[, 2], pred.time)
pred <- pred.cascade(pred.time, infectiousness$infectiousness, tweet[, 1], tweet[, 2], n.star = 100)
plot(pred.time, pred)
Predicting information cascade by self-exciting point process model
Description
This package implements a self-exciting point process model for information cascades. An information cascade occurs when many people engage in the same acts after observing the actions of others. Typical examples are post/photo resharings on Facebook and retweets on Twitter. The package provides functions to estimate the infectiousness of an information cascade and predict its popularity given the observed history. For more information, see http://snap.stanford.edu/seismic/.
References
SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity by Q. Zhao, M. Erdogdu, H. He, A. Rajaraman, J. Leskovec, ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2015.
An example information cascade
Description
A dataset containing all the (relative) resharing times and node degrees of a tweet. The original Twitter ID is 127001313513967616.
Format
A data frame with 15563 rows and 2 columns
Details
relative_time_second: resharing time in seconds
number_of_followers: number of followers
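A cascade from another source would presumably need the same layout before it can be passed to the functions in this package: times sorted in increasing order, with the first time equal to 0. The sketch below builds such a data frame from made-up raw observations. The raw data frame, its column names posted_at and followers, and all values are hypothetical.

```r
# Hypothetical raw observations: absolute posting times in seconds and
# follower counts, listed out of order.
raw <- data.frame(
  posted_at = c(1000120, 1000000, 1000045, 1000300),
  followers = c(150, 20000, 320, 75)
)

# Sort by time, then shift so that the original post is at time 0
raw <- raw[order(raw$posted_at), ]
cascade <- data.frame(
  relative_time_second = raw$posted_at - raw$posted_at[1],
  number_of_followers = raw$followers
)

cascade
```

The two columns can then be used in the same positions as tweet[, 1] and tweet[, 2] in the examples.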