#

# PLFIT(x) estimates x_min and alpha according to the goodness-of-fit
# based method described in Clauset, Shalizi, Newman (2007). x is a
# vector of observations of some quantity to which we wish to fit the
# power-law distribution p(x) ~ x^-alpha for x >= xmin.
# PLFIT automatically detects whether x is composed of real or integer
# values, and applies the appropriate method. For discrete data, if
# min(x) > 1000, PLFIT uses the continuous approximation, which is
# reliable in this regime.

#

# The fitting procedure works as follows:

# 1) For each possible choice of x_min, we estimate alpha via the
#    method of maximum likelihood, and calculate the Kolmogorov-Smirnov
#    goodness-of-fit statistic D.

# 2) We then select as our estimate of x_min the value that gives the
#    minimum value of D over all values of x_min.
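#
# A minimal sketch of the two steps above, for continuous data only. This is
# an illustration, not the body of plfit(): the helper name
# sketch_plfit_continuous is hypothetical, and the formulas used (the MLE
# alpha = 1 + n / sum(log(z / xmin)) and a KS distance taken against the
# empirical CDF (0:(n-1))/n) are assumed from the continuous case.
sketch_plfit_continuous <- function(x) {
  x <- sort(x)
  # Drop the largest unique value as a candidate so that every fit uses at
  # least two distinct observations.
  xmins <- unique(x)
  xmins <- xmins[-length(xmins)]
  D <- numeric(length(xmins))
  alphas <- numeric(length(xmins))
  for (i in seq_along(xmins)) {
    z <- x[x >= xmins[i]]
    n <- length(z)
    alphas[i] <- 1 + n / sum(log(z / xmins[i]))    # step 1: MLE of alpha
    fitted <- 1 - (xmins[i] / z)^(alphas[i] - 1)   # fitted CDF at each z
    empirical <- (seq_len(n) - 1) / n              # empirical CDF
    D[i] <- max(abs(fitted - empirical))           # step 1: KS statistic D
  }
  best <- which.min(D)                             # step 2: minimise D
  list(xmin = xmins[best], alpha = alphas[best], D = D[best])
}
# Possible usage (the loop is quadratic in the number of unique values):
#   fit <- sketch_plfit_continuous((1 - runif(5000))^(-1 / (2.5 - 1)))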

#

# Note that this procedure gives no estimate of the uncertainty of the
# fitted parameters, nor of the validity of the fit.

#

# Example:

# x <- (1-runif(10000))^(-1/(2.5-1))

# plfit(x)
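#
# The example above draws from a continuous power law with alpha = 2.5 and
# x_min = 1 by inverse-transform sampling. A small sketch generalising it to
# other parameters follows; the helper name sketch_rpowerlaw and its
# arguments are hypothetical and not part of plfit.
sketch_rpowerlaw <- function(n, alpha, xmin = 1) {
  # If U is uniform on (0, 1), then xmin * (1 - U)^(-1 / (alpha - 1)) has
  # density proportional to x^-alpha for x >= xmin (requires alpha > 1).
  xmin * (1 - runif(n))^(-1 / (alpha - 1))
}
# Possible usage:
#   x <- sketch_rpowerlaw(10000, alpha = 2.5, xmin = 1)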

#

#

# Version 1.0 (2008 February)

# Version 1.1 (2008 February)

# - correction: division by zero if limit >= max(x), because the R function
#   unique() does not sort its output, whereas the Matlab function does...

# Version 1.1 (minor correction 2009 August)

# - correction: the zdiff calculation at line 230 was wrong when xmin=0
#   (thanks to Naoki Masuda)
# - GPL version updated to v3.0 (asked by Felipe Ortega)

# Version 1.2 (2011 August)

# - correction for method "limit" thanks to David R. Pugh:
#   xmins <- xmins[xmins<=limit] is now xmins <- xmins[xmins>=limit]
# - "fixed" method added for xmins from David R. Pugh

# - modifications by Alan Di Vittorio:

# - correction: zdiff calculation was wrong when xmin==1
# - the previous zdiff correction was incorrect

# - correction: x has to have at least two unique values
# - additional discrete x input test: discrete x cannot contain the value 0
# - added option to truncate the continuous xmin search when the number
#   of observations gets small
#

# Copyright (C) 2008,2011 Laurent Dubroca laurent.dubroca_at_gmail.com
# (Stazione Zoologica Anton Dohrn, Napoli, Italy)

# Distributed under GPL 3.0

# http://www.gnu.org/copyleft/gpl.html

# PLFIT comes with ABSOLUTELY NO WARRANTY

# Matlab to R translation based on the original code of Aaron Clauset (Santa Fe Institute)
# Source: http://www.santafe.edu/~aaronc/powerlaws/

#

# Notes:

#

# 1. In order to implement the integer-based methods in Matlab, the numeric
#    maximization of the log-likelihood function was used. This requires
#    that we specify the range of scaling parameters considered. We set
#    this range to be seq(1.5,3.5,0.01) by default. This vector can be
#    set by the user like so,

#

# a <- plfit(x,"range",seq(1.001,5,0.001))
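#
# A rough sketch of the grid search described in note 1, assuming the
# discrete log-likelihood
#   -alpha * sum(log(z)) - n * log(zeta(alpha, xmin)),
# where zeta(alpha, xmin) is the sum of k^-alpha over integers k >= xmin.
# Both helper names are hypothetical. The zeta sum is approximated here by a
# truncated sum plus an integral tail correction, which is a choice made for
# this sketch to stay in base R and is not claimed to be what plfit() does.
sketch_hzeta <- function(alpha, xmin, K = 1e5) {
  k <- xmin:(xmin + K - 1)
  upper <- xmin + K
  # Terms below `upper` are summed exactly; the remainder is approximated by
  # the integral of x^-alpha from `upper` plus half of the first omitted term.
  sum(k^(-alpha)) + upper^(1 - alpha) / (alpha - 1) + 0.5 * upper^(-alpha)
}
sketch_discrete_alpha <- function(x, xmin, range = seq(1.5, 3.5, 0.01)) {
  z <- x[x >= xmin]
  n <- length(z)
  slogz <- sum(log(z))
  # Evaluate the log-likelihood at every alpha in `range`, keep the best one.
  loglik <- vapply(range, function(a) {
    -a * slogz - n * log(sketch_hzeta(a, xmin))
  }, numeric(1))
  range[which.max(loglik)]
}
# The estimate can only be as fine as the grid, so a narrower step in
# `range` trades run time for resolution.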

#

# 2. PLFIT can be told to limit the range of values considered as estimates
#    for xmin in two ways. First, it can be instructed to sample these
#    possible values like so,

#

# a <- plfit(x,"sample",100)

#

# which uses 100 uniformly spaced values from the sorted list of
# unique values in the data set. Alternatively, it can simply omit all
# candidates below a hard limit, like so

#

# a <- plfit(x,"limit",3.4)

#

# In the case of discrete data, it rounds the limit to the nearest
# integer.
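#
# A sketch of how the "sample" and "limit" options in note 2 might thin the
# list of candidate xmin values. The helper name and its arguments are
# hypothetical, and applying the limit before the sampling is an assumption
# of this sketch, not a statement about the order used inside plfit().
sketch_candidate_xmins <- function(x, n_sample = NULL, limit = NULL) {
  xmins <- sort(unique(x))
  if (!is.null(limit)) {
    xmins <- xmins[xmins >= limit]        # omit candidates below the limit
  }
  if (!is.null(n_sample) && length(xmins) > n_sample) {
    # Pick evenly spaced positions along the sorted unique values.
    idx <- unique(round(seq(1, length(xmins), length.out = n_sample)))
    xmins <- xmins[idx]
  }
  xmins
}
# Possible usage:
#   sketch_candidate_xmins(x, n_sample = 100)
#   sketch_candidate_xmins(x, limit = 3.4)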

#

# Finally, if you wish to force the threshold parameter to take a specific value
# (useful for bootstrapping), simply call plfit() like so
#

# a <- plfit(x,"fixed",3.5)
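#
# To illustrate why a fixed threshold is convenient for bootstrapping, here
# is a small sketch that resamples x with replacement and re-estimates alpha
# at the same xmin each time. It covers continuous data only, uses the
# closed-form MLE directly instead of calling plfit(), and the helper name
# sketch_bootstrap_alpha and its arguments are hypothetical.
sketch_bootstrap_alpha <- function(x, xmin, B = 200) {
  alpha_hat <- function(v) {
    z <- v[v >= xmin]
    1 + length(z) / sum(log(z / xmin))    # continuous MLE at fixed xmin
  }
  # One alpha estimate per bootstrap resample of the data.
  replicate(B, alpha_hat(sample(x, replace = TRUE)))
}
# The spread of the returned values, e.g. sd(sketch_bootstrap_alpha(x, 1)),
# gives a rough idea of the uncertainty in alpha for that threshold.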

#

# 3. When the input sample size is small (e.g., < 100), the estimator is
#    known to be slightly biased (toward larger values of alpha). To
#...