Title: Descriptive Statistics
Depends: R(≥ 3.6)
Version: 1.0.3
Description: Exploratory analysis of a data set. The functions in this package make it possible to filter a data set by detecting atypical values (outliers) and to perform exploratory analysis through visual inspection or dispersion measures. You can explore the structure of your data using several parameters at the same time, combining statistical parameters with different graphics. Finally, this package helps to confirm or reject the hypothesis that your data follow a normal distribution. It is therefore useful for gaining preliminary insight into your data before carrying out statistical analysis.
Imports: stats (≥ 3.6.3), graphics (≥ 3.6.3)
License: GPL-3
URL: https://github.com/William-HC/DescriptiveWH
Encoding: UTF-8
RoxygenNote: 7.1.1
NeedsCompilation: no
Packaged: 2021-06-14 21:24:24 UTC; Edgar
Author: William Herrera-Caceres [aut, cre]
Maintainer: William Herrera-Caceres <williamandresh@hotmail.com>
Repository: CRAN
Date/Publication: 2021-06-17 05:50:16 UTC

Descriptive statistics

Description

This function is useful to see the basic statistical information of a variable.

Usage

destatWH(x, dec = NULL, op = FALSE)

Arguments

x

Numeric. A vector containing the data set.

dec

Numeric. The number of decimals used in the returned values.

op

Logical. If TRUE, returns a plot, a histogram, a boxplot and a qqnorm plot of the data set.

Details

This function reports summary statistics of the data and, optionally, several visual tools.

Value

If op=FALSE (the default), the function returns a matrix (1 row, 9 columns) with the mean, the standard deviation, the minimum, the median, the maximum, the variance, the coefficient of variation, the first quartile and the third quartile of the data set. If op=TRUE, the function returns four graphics: 1) A plot with the number of samples on the X axis and the values of the data set on the Y axis, 2) The density histogram of the data set with a red line representing the shape of the distribution, 3) A classical boxplot of the data, and 4) The qqnorm plot, including the qqline, used to check for a normal distribution through visual inspection.
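For orientation, the nine statistics listed above can be reproduced with base R. The sketch below is not the package's implementation: the helper name describe_sketch is hypothetical, the coefficient of variation is assumed to be sd/mean (the package may scale it differently, for example as a percentage), and the quartiles use R's default quantile type.

```r
# Hypothetical helper (not part of the package): builds a 1 x 9 summary matrix.
describe_sketch <- function(x, dec = 2) {
  # Quartiles with R's default method (type 7); names dropped for clean labels
  q <- stats::quantile(x, probs = c(0.25, 0.75), names = FALSE)
  out <- c(
    mean   = mean(x),
    sd     = stats::sd(x),
    min    = min(x),
    median = stats::median(x),
    max    = max(x),
    var    = stats::var(x),
    cv     = stats::sd(x) / mean(x),  # assumption: CV defined as sd / mean
    q1     = q[1],
    q3     = q[2]
  )
  # One row, nine named columns, rounded to 'dec' decimals
  round(matrix(out, nrow = 1, dimnames = list(NULL, names(out))), dec)
}

set.seed(1)
x <- rnorm(1000, 30, 10)
describe_sketch(x, dec = 2)
```

Comparing this output with destatWH(x, dec = 2) on the same vector is a quick way to check which definitions the package actually uses.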

Examples

#Example 1
data=rnorm(1000,30,10)
destatWH( x=data ,dec=2 )
#Example 2
destatWH( x=data , op=TRUE )
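The four graphics produced with op=TRUE can be approximated with base graphics. This is a sketch under assumptions, not the package's code: the helper name four_panel_sketch is hypothetical, the 2 x 2 layout is a guess, and the red line is assumed to be a kernel density estimate.

```r
# Hypothetical helper (not part of the package): draws four diagnostic panels.
four_panel_sketch <- function(x) {
  old_par <- graphics::par(mfrow = c(2, 2))  # assumed 2 x 2 layout
  on.exit(graphics::par(old_par))            # restore settings on exit

  # 1) Values in sample order
  graphics::plot(x, xlab = "Sample", ylab = "Value", main = "Data")
  # 2) Density histogram with an assumed kernel density curve in red
  graphics::hist(x, freq = FALSE, main = "Density histogram")
  graphics::lines(stats::density(x), col = "red")
  # 3) Boxplot
  graphics::boxplot(x, main = "Boxplot")
  # 4) Normal Q-Q plot with reference line
  stats::qqnorm(x)
  stats::qqline(x)

  invisible(TRUE)
}

set.seed(1)
four_panel_sketch(rnorm(1000, 30, 10))
```

Points that follow the qqline closely in the last panel are consistent with a normal distribution; systematic curvature suggests skewness or heavy tails.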

Distribution of records

Description

This function is useful to see the frequency of the data together with descriptive statistics.

Usage

distrWH(x, dec = FALSE)

Arguments

x

Numeric. A vector containing the data set.

dec

Numeric. The number of decimals used in the returned values.

Details

This function shows statistical descriptive values of a vector.

Value

A list including a table with the absolute frequency of the data, as well as the mean, median, mode and range of the data.

Examples

#Example 1
dat=c(1,5,2,4,5,1,3,5,3,5)
distrWH(x=dat,dec=3)
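As a rough cross-check of Example 1, the same quantities can be computed with base R. The list layout and element names below are assumptions for illustration, and the mode is taken as the most frequent value (all of them if there are ties); the package's output may be organised differently.

```r
dat <- c(1, 5, 2, 4, 5, 1, 3, 5, 3, 5)

# Absolute frequency of each distinct value
freq <- table(dat)

# Mode: the value(s) with the highest frequency (assumed definition)
mode_val <- as.numeric(names(freq)[freq == max(freq)])

# Hypothetical list layout, not the package's actual return structure
summary_sketch <- list(
  frequency = freq,
  mean      = round(mean(dat), 3),
  median    = stats::median(dat),
  mode      = mode_val,
  range     = range(dat)
)
summary_sketch
```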

Distribution of frequency

Description

Function used to compute the absolute and relative frequency of a categorical variable.

Usage

freqWH(x, op = FALSE)

Arguments

x

Numeric or Character. A vector of the data set.

op

Logical. Optional argument to display the data structure graphically.

Details

This function shows a table with the frequency of the values in a categorical vector. The vector can contain numeric values, character values, or a mix of both. The table also includes the number of categories into which the data are distributed. The absolute and relative frequencies can be shown as a histogram.

Value

A data frame is returned by default (op=FALSE), including the absolute and relative frequencies of the data set. Cumulative frequencies are also included in this table. When op=TRUE, the function returns only the absolute and relative frequencies, shown graphically as a histogram.

Examples

#Example 1
data1=c("A","A","A","A",2,3,2,3,"B","B","B")   
freqWH(data1)
#Example 2
data2=c("A","A","B","A","C")
freqWH(data2)
#Example 3
data3=c(5,5,3,1,5,2,3,6,9,6,6)
freqWH(data3)
#Example 4
data4=c("A1","A2","B3","A2","C")
freqWH(data4)
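A frequency table of the kind described above can be sketched with table(), prop.table() and cumsum(). The column names are assumptions and need not match the data frame returned by freqWH. Note that in R, c() coerces mixed input such as c("A", 2) to character, so numbers in a mixed vector are counted as category labels.

```r
x <- c("A", "A", "B", "A", "C")

# Absolute counts per category, in sorted category order
abs_freq <- table(x)
rel_freq <- prop.table(abs_freq)

# Hypothetical column names, chosen for illustration only
freq_tab <- data.frame(
  category     = names(abs_freq),
  absolute     = as.vector(abs_freq),
  relative     = as.vector(rel_freq),
  cum_absolute = cumsum(as.vector(abs_freq)),
  cum_relative = cumsum(as.vector(rel_freq))
)
freq_tab

# Number of categories found in the data
n_categories <- nrow(freq_tab)
```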

Arithmetic, Harmonic, Geometric and Weighted Mean

Description

Function used to compute different kinds of mean of a variable.

Usage

meanWH(x, meth = NULL, narm = FALSE, dec = NULL)

Arguments

x

Numeric. A vector of the data set.

meth

Character. The mean to compute: arithmetic (meth="a"), harmonic ("h"), geometric ("g") or weighted ("w"). For the weighted mean only, an additional column containing the weight of each value is required.

narm

Logical. Optional argument used to compute the mean removing NA values.

dec

Numeric. The number of decimals used in the returned value.

Details

This function returns the mean of the data set as a numerical value. For the weighted mean, the data set must include an additional column containing the weight assigned to each value. The weighted mean is computed taking the total of the weights into account, so it can be used whether the weights sum to 100% or to a different total (e.g. 90%, 120%, etc).

Value

This function returns a numerical value representing the mean computed with the selected method.

Examples

#Example 1
data1=c(7,6,NA,8,NA)
meanWH(x=data1,meth="a",narm=TRUE)
#Example 2
data2=rnorm(10,50,3)
meanWH(x=data2,meth="a",dec=2)
#Example 3
data3=rnorm(100,50,11)
meanWH(x=data3,meth="g")
#Example 4
data4=rnorm(10,20,4)
meanWH(data4,meth="h")
#Example 5  (total weight sums to 100%)
df1=data.frame(value=c(9,8,7,8),weight=c(0.5,0.2,0.2,0.1))
meanWH(df1,meth="w")
#Example 6  (total weight does not sum to 100%)
df2=data.frame(value=c(9,8,7,8),weight=c(0.5,0.2,0.2,0.2))
meanWH(df2,meth="w")
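The four means can be written out with the standard textbook formulas in base R. This is a sketch, not the package's code: in particular, dividing by the sum of the weights is an assumption about how meanWH handles weights that do not total 100%. The harmonic and geometric means below require strictly positive values.

```r
x <- c(2, 4, 8, NA)
x <- x[!is.na(x)]                     # drop NA values, as narm=TRUE would

arithmetic <- mean(x)                 # sum(x) / n
harmonic   <- length(x) / sum(1 / x)  # n / sum(1 / x)
geometric  <- exp(mean(log(x)))       # n-th root of the product

# Weighted mean with weights that sum to 1.1 rather than 1
df2 <- data.frame(value = c(9, 8, 7, 8), weight = c(0.5, 0.2, 0.2, 0.2))

# Assumption: normalise by the total weight, as stats::weighted.mean does
weighted <- sum(df2$value * df2$weight) / sum(df2$weight)

c(arithmetic = arithmetic, harmonic = harmonic,
  geometric = geometric, weighted = weighted)
```

With this normalisation, rescaling every weight by the same factor leaves the result unchanged, which is why totals such as 90% or 120% pose no problem.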

Outliers detection

Description

Function used to detect atypical values in a data set.

Usage

outlierWH(x, lower = NULL, upper = NULL, dec = NULL)

Arguments

x

Numeric. A vector of the data set.

lower

Numeric. A value representing the lower limit of outliers.

upper

Numeric. A value representing the upper limit of outliers.

dec

Numeric. The number of decimals used in the returned values.

Details

This function returns a list containing the putative outliers and the data without them. If lower and upper are not supplied, the default limits are based on the standard deviation of "x" multiplied by two.

Value

A list including three elements: 1) A data frame including the mean, the standard deviation, and the upper and lower limits for outliers, 2) A numeric vector with the data not considered outliers, and 3) A numeric vector including the putative outliers.

Examples

#Example 1
data=rnorm(100,30,15)
outlierWH( x=data ,lower=15 ,upper=35 )
#Example 2
outlierWH( x=data )
#Example 3
outlierWH( x=data ,lower=15 )
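The splitting logic described in Details and Value can be sketched as follows. The helper split_outliers_sketch and its element names are hypothetical, and the default limits are assumed to be the mean minus and plus two standard deviations; the documentation only states that the default uses twice the standard deviation, so the package may centre the interval differently.

```r
# Hypothetical helper (not part of the package): splits x by lower/upper limits.
split_outliers_sketch <- function(x, lower = NULL, upper = NULL, dec = 2) {
  m <- mean(x)
  s <- stats::sd(x)
  # Assumed defaults: mean -/+ 2 standard deviations
  if (is.null(lower)) lower <- m - 2 * s
  if (is.null(upper)) upper <- m + 2 * s

  inside <- x >= lower & x <= upper
  list(
    limits   = round(data.frame(mean = m, sd = s,
                                lower = lower, upper = upper), dec),
    kept     = x[inside],    # values within the limits
    outliers = x[!inside]    # putative outliers
  )
}

x <- c(10, 12, 11, 13, 12, 50)
split_outliers_sketch(x)                          # default limits
split_outliers_sketch(x, lower = 11, upper = 13)  # user-defined limits
```

Supplying only one limit, as in Example 3, leaves the other at its default in this sketch.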