Type: | Package |
Title: | Hedonic and Multilateral Index Methods for Real Estate Price Statistics |
Version: | 1.0.0 |
Maintainer: | Vivek Gajadhar <v.gajadhar@cbs.nl> |
Description: | Compute price indices using various Hedonic and multilateral methods, including Laspeyres, Paasche, Fisher, and HMTS (Hedonic Multilateral Time series re-estimation with splicing). The central function calculate_price_index() offers a unified interface for running these methods on structured datasets. This package is designed to support index construction workflows across a wide range of domains — including but not limited to real estate — where quality-adjusted price comparisons over time are essential. The development of this package was funded by Eurostat and Statistics Netherlands (CBS), and carried out by Statistics Netherlands. The HMTS method implemented here is described in Ishaak, Ouwehand and Remøy (2024) <doi:10.1177/0282423X241246617>. For broader methodological context, see Eurostat (2013, ISBN:978-92-79-25984-5, <doi:10.2785/34007>). |
License: | EUPL-1.2 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
Depends: | R (≥ 4.4.0) |
Imports: | dplyr, stats, KFAS, stringr, lmtest |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
VignetteBuilder: | knitr |
URL: | https://github.com/vivekag7/REPS |
BugReports: | https://github.com/vivekag7/REPS/issues |
NeedsCompilation: | no |
Packaged: | 2025-07-28 06:52:08 UTC; VGAR |
Author: | Farley Ishaak [aut], Pim Ouwehand [aut], David Pietersz [aut], Liu Nuo Su [aut], Cynthia Cao [aut], Mohammed Kardal [aut], Odens van der Zwan [aut], Vivek Gajadhar [aut, cre] |
Repository: | CRAN |
Date/Publication: | 2025-07-30 08:10:02 UTC |
REPS: Hedonic and Multilateral Index Methods for Real Estate Price Statistics
Description
Compute price indices using various Hedonic and multilateral methods, including Laspeyres, Paasche, Fisher, and HMTS (Hedonic Multilateral Time series re-estimation with splicing). The central function calculate_price_index() offers a unified interface for running these methods on structured datasets. This package is designed to support index construction workflows across a wide range of domains — including but not limited to real estate — where quality-adjusted price comparisons over time are essential. The development of this package was funded by Eurostat and Statistics Netherlands (CBS), and carried out by Statistics Netherlands. The HMTS method implemented here is described in Ishaak, Ouwehand and Remøy (2024) doi:10.1177/0282423X241246617. For broader methodological context, see Eurostat (2013, ISBN:978-92-79-25984-5, doi:10.2785/34007).
Author(s)
Maintainer: Vivek Gajadhar v.gajadhar@cbs.nl
Authors:
Farley Ishaak
Pim Ouwehand
David Pietersz
Liu Nuo Su
Cynthia Cao
Mohammed Kardal
Odens van der Zwan
See Also
Useful links:
https://github.com/vivekag7/REPS
Report bugs at https://github.com/vivekag7/REPS/issues
Calculate direct index according to the Fisher hedonic double imputation method
Description
The parameters 'dependent_variable', 'numerical_variables' and 'categorical_variables' define a regression model. With this model, a direct series of index figures is estimated using hedonic regression.
Usage
calculate_fisher(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period = NULL,
number_of_observations = FALSE
)
Arguments
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the table with periods |
dependent_variable |
usually the sale price |
numerical_variables |
vector with quality determining numeric variables (no dummies) |
categorical_variables |
vector with quality determining categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
number_of_observations |
number of observations per period (default = FALSE) |
Details
N.B.: the independent variables must be supplied already transformed. Hence, do not pass log(floor_area); transform the variable in advance and then provide log_floor_area. This does not apply to the dependent variable, which should be entered untransformed.
It is not necessary to filter the data on relevant variables or complete records beforehand. This is taken care of in the function.
Value
table with index, imputation averages, number of observations and confidence intervals per period
Author(s)
Farley Ishaak
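For orientation: in index number theory the Fisher index is the geometric mean of the Laspeyres and Paasche indices. The sketch below illustrates that textbook relationship with made-up figures. It is an assumption, not something this page confirms, that calculate_fisher() combines its imputation-based Laspeyres and Paasche figures in exactly this way.

```r
# Made-up Laspeyres and Paasche index figures for three periods (illustration only)
laspeyres <- c(100, 103.0, 106.5)
paasche   <- c(100, 102.2, 105.1)

# Textbook Fisher index: geometric mean of the two, period by period
fisher <- sqrt(laspeyres * paasche)
round(fisher, 2)
```

By construction each Fisher figure lies between the corresponding Laspeyres and Paasche figures.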
Calculate the geometric average of a series of values
Description
The equation for the calculation is: exp(mean(log(series_values)))
Usage
calculate_geometric_average(values)
Arguments
values |
series with numeric values |
Value
geometric average
Author(s)
Farley Ishaak
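The formula in the description can be tried out directly in base R. The helper name geometric_average below is hypothetical and only mirrors the stated equation; it is not the package function itself.

```r
# Hypothetical helper that mirrors the documented formula exp(mean(log(x)))
geometric_average <- function(values) {
  exp(mean(log(values)))
}

# The geometric average of 1, 10 and 100 is 10 (the cube root of 1000)
result <- geometric_average(c(1, 10, 100))
result
```

Because the logarithm is used, all values must be strictly positive.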
Calculate Growth Rates
Description
Computes period-over-period growth rates from a numeric index vector.
Usage
calculate_growth_rate(values)
Arguments
values |
A numeric vector representing index values. |
Value
A numeric vector of growth rates, with 1 as the initial value.
Author(s)
Vivek Gajadhar
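A minimal sketch of period-over-period growth rates, assuming they are expressed as ratios (value in period t divided by value in period t-1) with 1 in the first position, as the Value section suggests. The variable names are illustrative.

```r
# Illustrative index values for three periods
index_values <- c(100, 102, 105.06)

# Ratio of each value to its predecessor; the first period gets 1 (assumption)
growth_rates <- c(1, index_values[-1] / index_values[-length(index_values)])
growth_rates

# Chaining the growth rates reproduces the original index
rebuilt_index <- index_values[1] * cumprod(growth_rates)
```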
Calculate imputation averages with the 1st period as base period
Description
Prices are estimated based on a provided hedonic model. The model values are calculated for each period in the data. With these values, new prices of base period observations are estimated. With this function, imputations according to the Laspeyres and Paasche method can be estimated.
Usage
calculate_hedonic_imputation(
dataset_temp,
period_temp,
dependent_variable_temp,
independent_variables_temp,
number_of_observations_temp,
period_list_temp
)
Arguments
dataset_temp |
table with data |
period_temp |
'period' |
dependent_variable_temp |
usually the sale price |
independent_variables_temp |
vector with quality determining variables |
period_list_temp |
list with all available periods |
Value
Table with imputation averages per period
Author(s)
Farley Ishaak
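The idea described above can be sketched in base R: fit a hedonic model per period, then use each period's model to re-price the dwellings observed in the base period. The simulated data, the single regressor and the use of a geometric mean for the "imputation average" are all assumptions made for illustration; the package function may differ in these details.

```r
set.seed(42)
periods <- c("2020Q1", "2020Q2", "2020Q3")

# Simulated toy data: log price depends on log floor area plus a period drift
toy <- data.frame(
  period = rep(periods, each = 40),
  log_floor_area = rnorm(120, mean = 4.5, sd = 0.3)
)
drift <- c(0, 0.03, 0.06)[match(toy$period, periods)]
toy$price <- exp(8 + 0.9 * toy$log_floor_area + drift + rnorm(120, sd = 0.05))

# Dwellings observed in the base (first) period
base_obs <- toy[toy$period == periods[1], ]

# For each period: estimate the model on that period, impute base-period dwellings
imputation_averages <- sapply(periods, function(p) {
  fit <- lm(log(price) ~ log_floor_area, data = toy[toy$period == p, ])
  exp(mean(predict(fit, newdata = base_obs)))  # geometric mean of imputed prices
})

# Laspeyres-type index relative to the base period
laspeyres_type_index <- imputation_averages / imputation_averages[1] * 100
round(laspeyres_type_index, 1)
```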
Calculate a matrix with hedonic imputation averages, re-estimated time series imputation averages and corresponding index series.
Description
Based on a hedonic model, a series of imputed values is calculated in the steps below:
1: For every period, average imputed prices are estimated with the 1st period as base period.
2: Step 1 is repeated for each possible base period. This results in as many series as there are periods.
3: All series are re-estimated with a time series model (state space). This step can optionally be skipped with a parameter (state_space_model = NULL).
4: The series of imputed values are transformed into index series.
The resulting matrix can be used for an index calculation according to the HMTS method.
Usage
calculate_hedonic_imputationmatrix(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
periods_in_year,
number_of_observations = TRUE,
production_since = NULL,
number_preliminary_periods
)
Arguments
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the dataset with the period |
dependent_variable |
usually the sale price |
numerical_variables |
vector with quality-determining continuous variables (numeric, no dummies) |
categorical_variables |
vector with categorical variables (also dummy) |
periods_in_year |
if month, then 12. If quarter, then 4, etc. (default = 4) |
number_of_observations |
number of observations per period (default = TRUE) |
production_since |
1 period in the format of the period_variable. See Details (default = NULL) |
number_preliminary_periods |
number of periods that the index is preliminary. Only works if production_since is not NULL. default = 3 |
Details
Parameter 'production_since': To simulate a production setting, in which one period is added at a time, a starting period in the past can be chosen manually. All periods up to this period are imputed at once. After that, periods are added one at a time.
Value
$Matrix_HMTS_index table with index series based on estimations with time series re-estimations
$Matrix_HMTS table with estimated values based on time series re-estimations
$Matrix_HMS_index table with index series based on estimations with the hedonic model
$Matrix_HMS table with estimated values based on the hedonic model
$Matrix_HMTS_analysis table with analysis values of the time series model per base period
Author(s)
Farley Ishaak
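Steps 1, 2 and 4 of the description can be sketched as a small matrix computation in base R. This is an illustration under assumptions (simulated data, one regressor, geometric means, no time series re-estimation), not the package implementation.

```r
set.seed(42)
periods <- c("2020Q1", "2020Q2", "2020Q3")

# Simulated toy data with a small upward price drift
toy <- data.frame(
  period = rep(periods, each = 40),
  log_floor_area = rnorm(120, mean = 4.5, sd = 0.3)
)
drift <- c(0, 0.03, 0.06)[match(toy$period, periods)]
toy$price <- exp(8 + 0.9 * toy$log_floor_area + drift + rnorm(120, sd = 0.05))

# Rows = base period (whose dwellings are re-priced), columns = model period
imputation_matrix <- sapply(periods, function(model_period) {
  fit <- lm(log(price) ~ log_floor_area, data = toy[toy$period == model_period, ])
  sapply(periods, function(base_period) {
    exp(mean(predict(fit, newdata = toy[toy$period == base_period, ])))
  })
})

# Step 4: turn every row (one series per base period) into an index, first period = 100
index_matrix <- imputation_matrix / imputation_matrix[, 1] * 100
round(index_matrix, 1)
```

Each row is one series of imputed values for a fixed base period, so the matrix has as many rows as there are periods.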
Calculate HMTS index (Hedonic Multilateral Time series re-estimation Splicing)
Description
Based on a hedonic model, an index is calculated in the steps below. See also Ishaak, Ouwehand, Remøy & De Haan (2023).
1: For each period, average imputed prices are calculated with the first period as base period.
2: Step 1 is repeated for every possible base period. This results in as many series of imputed values as there are periods.
3: All series with imputed prices are re-estimated with a Kalman filter (also known as a time series model or state space model). This step can be turned off with a parameter.
4: The series of imputed values are transformed into index series.
5: A window of index figures, specified by a parameter, is selected to continue in the calculation. This step can be turned off with a parameter.
6: Of the remaining index figures, the geometric average per period is calculated. The resulting figures form the final index.
Usage
calculate_hmts(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period,
periods_in_year,
production_since = NULL,
number_preliminary_periods,
number_of_observations,
resting_points
)
Arguments
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the dataset with the period |
dependent_variable |
usually the sale price |
numerical_variables |
vector with quality-determining continuous variables (numeric, no dummies) |
categorical_variables |
vector with categorical variables (also dummy) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
periods_in_year |
if month, then 12. If quarter, then 4, etc. (default = 4) |
production_since |
1 period in the format of the period_variable. See Details (default = NULL) |
number_preliminary_periods |
number of periods that the index is preliminary. Only works if production_since is not NULL. default = 3 |
number_of_observations |
number of observations per period (default = TRUE) |
resting_points |
should analysis values be returned? (default = FALSE) |
Details
Parameter 'production_since': To simulate a production setting, in which one period is added at a time, a starting period in the past can be chosen manually. All periods up to this period are imputed at once. After that, periods are added one at a time.
Parameter 'resting_points': If TRUE, the output is a list of tables. These tables can be accessed with a $ after the output object:
$Index table with periods, index and number of observations
$Window table with the index figures within the chosen window
$Chosen_index_series table with index series before the window splice
$Matrix_HMTS_index table with index series based on re-estimated imputations (time series model)
$Matrix_HMTS table with re-estimated imputations (time series model)
$Matrix_HMS_index table with index series based on estimated imputations (hedonic model)
$Matrix_HMS table with estimated imputations (hedonic model)
$Matrix_HMTS_analyse table with diagnostic values of the time series model per base period
Value
table with periods, index (and optional confidence intervals) and number of observations. If resting_points = TRUE, a list with tables is returned instead (see Details). This list includes:
$Matrix_HMTS_index table with index series based on estimations with time series re-estimations
$Matrix_HMTS table with estimated values based on time series re-estimations
$Matrix_HMS_index table with index series based on estimations with the hedonic model
$Matrix_HMS table with estimated values based on the hedonic model
$Matrix_HMTS_analysis table with analysis values of the time series model per base period
Author(s)
Farley Ishaak
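Step 6 of the description (the geometric average per period across the available index series) can be illustrated with a small hand-made matrix. The figures are invented, and the sketch deliberately leaves out the Kalman re-estimation and the window selection, so it is not a full HMTS calculation.

```r
# Invented index series: one row per base period, one column per period
index_matrix <- rbind(
  base_2020Q1 = c(100, 102, 105),
  base_2020Q2 = c(100, 103, 104),
  base_2020Q3 = c(100, 101, 106)
)
colnames(index_matrix) <- c("2020Q1", "2020Q2", "2020Q3")

# Geometric average per period (column-wise): exp of the mean of the logs
final_index <- exp(colMeans(log(index_matrix)))
round(final_index, 2)
```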
Calculate HMTS index only (Hedonic Multilateral Time series re-estimation Splicing)
Description
Based on a hedonic model, an index is calculated in the steps below. See also Ishaak, Ouwehand, Remøy & De Haan (2023).
1: For each period, average imputed prices are calculated with the first period as base period.
2: Step 1 is repeated for every possible base period. This results in as many series of imputed values as there are periods.
3: All series with imputed prices are re-estimated with a Kalman filter (also known as a time series model or state space model). This step can be turned off with a parameter.
4: The series of imputed values are transformed into index series.
5: A window of index figures, specified by a parameter, is selected to continue in the calculation. This step can be turned off with a parameter.
6: Of the remaining index figures, the geometric average per period is calculated. The resulting figures form the final index.
Usage
calculate_hmts_index(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period,
periods_in_year,
production_since = NULL,
number_preliminary_periods,
number_of_observations = NULL,
resting_points
)
Arguments
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the dataset with the period |
dependent_variable |
usually the sale price |
numerical_variables |
vector with quality-determining continuous variables (numeric, no dummies) |
categorical_variables |
vector with categorical variables (also dummy) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
periods_in_year |
if month, then 12. If quarter, then 4, etc. (default = 4) |
production_since |
1 period in the format of the period_variable. See Details (default = NULL) |
number_preliminary_periods |
number of periods that the index is preliminary. Only works if production_since is not NULL. default = 3 |
number_of_observations |
number of observations per period (default = NULL) |
resting_points |
should analysis values be returned? (default = FALSE) |
Details
Parameter 'production_since': To simulate a production setting, in which one period is added at a time, a starting period in the past can be chosen manually. All periods up to this period are imputed at once. After that, periods are added one at a time.
Parameter 'resting_points': If TRUE, the output is a list of tables. These tables can be accessed with a $ after the output object:
$Index table with periods, index and number of observations
$Window table with the index figures within the chosen window
$Chosen_index_series table with index series before the window splice
$Matrix_HMTS_index table with index series based on re-estimated imputations (time series model)
$Matrix_HMTS table with re-estimated imputations (time series model)
$Matrix_HMS_index table with index series based on estimated imputations (hedonic model)
$Matrix_HMS table with estimated imputations (hedonic model)
$Matrix_HMTS_analyse table with diagnostic values of the time series model per base period
Value
table with periods, index and number of observations. If resting_points = TRUE, a list with tables is returned instead (see Details). This list includes:
$Matrix_HMTS_index table with index series based on estimations with time series re-estimations
$Matrix_HMTS table with estimated values based on time series re-estimations
$Matrix_HMS_index table with index series based on estimations with the hedonic model
$Matrix_HMS table with estimated values based on the hedonic model
$Matrix_HMTS_analysis table with analysis values of the time series model per base period
Author(s)
Farley Ishaak
Transform series into index
Description
The index can be calculated in two ways:
from a series of values
from a series of mutations (from_growth_rate = TRUE)
Usage
calculate_index(periods, values, reference_period = NULL)
Arguments
periods |
vector/variable with periods (numeric/string) |
values |
vector/variable with to be transformed values (numeric) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
Details
N.B. with from_growth_rate: The series of mutations must be as long as the series of values. The vector should, therefore, also contain a mutation for the first period (this is likely 1). In the calculation, this first mutation is not used.
N.B. for the reference period: By default, the first value is set to 100. A different reference period can be provided in the parameter. The reference period can also be a part of a period. E.g. if the series contains months (2019jan, 2019feb), the reference period can be a year (2019).
Value
Index series
Author(s)
Farley Ishaak
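The reference-period behaviour described in the details can be sketched as follows. The helper rebase_index is hypothetical. Two assumptions are made: a partial reference period such as "2019" is matched as a prefix of the period labels, and the base value is the arithmetic mean of the values in the matched periods. The package function may use a different average.

```r
# Hypothetical helper: rescale a series so that the reference period equals 100
rebase_index <- function(periods, values, reference_period = NULL) {
  if (is.null(reference_period)) {
    base_value <- values[1]                      # default: first value set to 100
  } else {
    in_reference <- startsWith(as.character(periods), as.character(reference_period))
    stopifnot(any(in_reference))                 # reference period must occur in the data
    base_value <- mean(values[in_reference])     # assumption: arithmetic mean
  }
  values / base_value * 100
}

periods <- c("2019jan", "2019feb", "2020jan")
values  <- c(50, 150, 200)

index_default <- rebase_index(periods, values)          # 100, 300, 400
index_2019    <- rebase_index(periods, values, "2019")  # 50, 150, 200
```

With reference period "2019", the two 2019 months average to 100, which matches the usual convention that a reference year averages to 100.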
Calculate direct index according to the Laspeyres hedonic double imputation method
Description
The parameters 'dependent_variable', 'numerical_variables' and 'categorical_variables' define a regression model. With this model, a direct series of index figures is estimated using hedonic regression.
Usage
calculate_laspeyres(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period = NULL,
number_of_observations = FALSE,
imputation = FALSE
)
Arguments
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the table with periods |
dependent_variable |
usually the sale price |
numerical_variables |
vector with quality determining numeric variables (no dummies) |
categorical_variables |
vector with quality determining categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
number_of_observations |
number of observations per period (default = FALSE) |
imputation |
display the underlying average imputation values? (default = FALSE) |
Details
N.B.: the independent variables must be supplied already transformed. Hence, do not pass log(floor_area); transform the variable in advance and then provide log_floor_area. This does not apply to the dependent variable, which should be entered untransformed.
Within the data, it is not necessary to filter the data on relevant variables or complete records. This is taken care of in the function.
Value
table with index, imputation averages, number of observations and confidence intervals per period
Author(s)
Farley Ishaak
Calculate direct index according to the Paasche hedonic double imputation method
Description
The parameters 'dependent_variable', 'numerical_variables' and 'categorical_variables' define a regression model. With this model, a direct series of index figures is estimated using hedonic regression.
Usage
calculate_paasche(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period = NULL,
number_of_observations = FALSE,
imputation = FALSE
)
Arguments
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the table with periods |
dependent_variable |
usually the sale price |
numerical_variables |
vector with quality determining numeric variables (no dummies) |
categorical_variables |
vector with quality determining categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
number_of_observations |
number of observations per period (default = FALSE) |
imputation |
display the underlying average imputation values? (default = FALSE) |
Details
N.B.: the independent variables must be supplied already transformed. Hence, do not pass log(floor_area); transform the variable in advance and then provide log_floor_area. This does not apply to the dependent variable, which should be entered untransformed.
Within the data, it is not necessary to filter the data on relevant variables or complete records. This is taken care of in the function.
Value
table with index, imputation averages, number of observations and confidence intervals per period
Author(s)
Farley Ishaak
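A Paasche-type double imputation can be sketched in base R as the mirror image of the Laspeyres case: the dwellings sold in the current period are priced with both the current-period model and the base-period model. The simulated data, the single regressor and the geometric averaging are assumptions for illustration only.

```r
set.seed(42)
periods <- c("2020Q1", "2020Q2", "2020Q3")

# Simulated toy data with a small upward price drift
toy <- data.frame(
  period = rep(periods, each = 40),
  log_floor_area = rnorm(120, mean = 4.5, sd = 0.3)
)
drift <- c(0, 0.03, 0.06)[match(toy$period, periods)]
toy$price <- exp(8 + 0.9 * toy$log_floor_area + drift + rnorm(120, sd = 0.05))

# Hedonic model of the base (first) period
base_fit <- lm(log(price) ~ log_floor_area, data = toy[toy$period == periods[1], ])

paasche_type_index <- sapply(periods, function(p) {
  current_obs <- toy[toy$period == p, ]
  current_fit <- lm(log(price) ~ log_floor_area, data = current_obs)
  # Both prices are imputed (double imputation) for the current-period dwellings;
  # the ratio of geometric means equals exp of the mean log difference
  log_ratio <- predict(current_fit, newdata = current_obs) -
    predict(base_fit, newdata = current_obs)
  100 * exp(mean(log_ratio))
})
round(paasche_type_index, 1)
```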
Calculate index based on specified method (Fisher, Laspeyres, Paasche, HMTS, Time Dummy, Rolling Time Dummy, Repricing)
Description
Central hub function to calculate index figures using different methods.
Usage
calculate_price_index(
dataset,
method,
period_variable,
dependent_variable,
numerical_variables = NULL,
categorical_variables = NULL,
reference_period = NULL,
number_of_observations = TRUE,
periods_in_year = 4,
production_since = NULL,
number_preliminary_periods = 3,
resting_points = FALSE,
imputation = FALSE,
window_length = 5
)
Arguments
dataset |
Data frame with input data |
method |
One of: "fisher", "laspeyres", "paasche", "hmts", "timedummy", "rolling_timedummy", "repricing" |
period_variable |
A string with the name of the column containing time periods. Values must follow a consistent format such as "2020Q1" (quarterly), "2020M01" (monthly), "202001" (YYYYMM), "2020W01" (weekly), or "2020" (yearly). Mixed or irregular formats (e.g., "Q1_2020", "Jan2020") are not supported. |
dependent_variable |
Usually the price |
numerical_variables |
Vector with numeric quality-determining variables |
categorical_variables |
Vector with categorical variables (also dummies) |
reference_period |
Period or group of periods that will be set to 100 |
number_of_observations |
Logical, whether to show number of observations (default = TRUE) |
periods_in_year |
(HMTS only) Number of periods per year (e.g. 12 for months) |
production_since |
(HMTS only) Start period for production simulation |
number_preliminary_periods |
(HMTS only) Number of preliminary periods |
resting_points |
(HMTS only) Whether to return detailed outputs (default = FALSE) |
imputation |
(Laspeyres/Paasche only) Include imputation values? Default = FALSE |
window_length |
(Rolling Time Dummy only) Window size in number of periods |
Value
A data.frame (or list for HMTS with resting_points = TRUE; or named list if multiple methods are used)
Author(s)
Vivek Gajadhar
Examples
# Example: Time Dummy index
Tbl_TD <- calculate_price_index(
method = "timedummy",
dataset = data_constraxion,
period_variable = "period",
dependent_variable = "price",
numerical_variables = "floor_area",
categorical_variables = "neighbourhood_code",
reference_period = "2015",
number_of_observations = FALSE
)
head(Tbl_TD)
# Example: Multiple methods (Fisher, Paasche, Laspeyres)
multi_result <- calculate_price_index(
method = c("fisher", "paasche", "laspeyres"),
dataset = data_constraxion,
period_variable = "period",
dependent_variable = "price",
numerical_variables = "floor_area",
categorical_variables = "neighbourhood_code",
reference_period = "2015",
number_of_observations = FALSE
)
head(multi_result$fisher)
head(multi_result$paasche)
head(multi_result$laspeyres)
Calculate regression diagnostics by period
Description
For each period in the data, fits a log-linear model and computes diagnostics:
Normality test (Shapiro-Wilk)
Adjusted R-squared
Breusch-Pagan test for heteroscedasticity
Durbin-Watson test for autocorrelation
Usage
calculate_regression_diagnostics(
dataset,
period_variable,
dependent_variable,
numerical_variables = NULL,
categorical_variables = NULL
)
Arguments
dataset |
A data.frame with input data |
period_variable |
Name of the period variable (string) |
dependent_variable |
Name of the dependent variable (string) |
numerical_variables |
Vector of numerical independent variables (default = NULL) |
categorical_variables |
Vector of categorical independent variables (default = NULL) |
Value
A data.frame with diagnostics by period
Author(s)
Mohammed Kardal, Vivek Gajadhar
Examples
diagnostics <- calculate_regression_diagnostics(
dataset = data_constraxion,
period_variable = "period",
dependent_variable = "price",
numerical_variables = c("floor_area", "dist_trainstation"),
categorical_variables = c("dummy_large_city", "neighbourhood_code")
)
head(diagnostics)
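To show what "diagnostics by period" means in practice, the following base R sketch fits a log-linear model per period on simulated data and collects two of the listed diagnostics. The column names are hypothetical and need not match the package output. The Breusch-Pagan and Durbin-Watson tests are omitted here to keep the sketch free of extra packages; lmtest::bptest() and lmtest::dwtest() could be applied to the same fitted model.

```r
set.seed(7)

# Simulated toy data for two periods
toy <- data.frame(
  period = rep(c("2020Q1", "2020Q2"), each = 60),
  log_floor_area = rnorm(120, mean = 4.5, sd = 0.3)
)
toy$price <- exp(8 + 0.9 * toy$log_floor_area + rnorm(120, sd = 0.05))

# One log-linear model per period, one row of diagnostics per model
diagnostics_sketch <- do.call(rbind, lapply(split(toy, toy$period), function(d) {
  fit <- lm(log(price) ~ log_floor_area, data = d)
  data.frame(
    period = d$period[1],
    normality_p_value = shapiro.test(residuals(fit))$p.value,  # Shapiro-Wilk
    adjusted_r_squared = summary(fit)$adj.r.squared
  )
}))
diagnostics_sketch
```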
Calculate repricing index based on hedonic model (geometric adjustment)
Description
For each pair of subsequent periods, this method compares the observed geometric mean price with the predicted mean price from a hedonic regression model. The ratio of these two values forms the basis of the repricing growth rate, which is then accumulated into an index.
Usage
calculate_repricing(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period = NULL,
number_of_observations = FALSE,
periods_in_year = 4
)
Arguments
dataset |
a data frame containing the data |
period_variable |
character name of the time period variable |
dependent_variable |
character name of the dependent variable (e.g., sale price) |
numerical_variables |
character vector of numeric quality-determining variables |
categorical_variables |
character vector of categorical variables (including dummies) |
reference_period |
reference period (numeric or string) to normalize index to 100 |
number_of_observations |
logical, if TRUE, adds number of observations column |
periods_in_year |
if month, then 12. If quarter, then 4, etc. (default = 4) |
Value
a data.frame with columns: period, Index, (optionally number_of_observations)
Author(s)
Vivek Gajadhar, Farley Ishaak
Calculate Rolling Time Dummy Index
Description
Estimates a price index using rolling windows of time dummy regressions.
Usage
calculate_rolling_timedummy(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period,
window_length,
number_of_observations = FALSE
)
Arguments
dataset |
data frame with input data |
period_variable |
name of the time variable (string) |
dependent_variable |
name of the dependent variable (usually price, assumed unlogged) |
numerical_variables |
vector of numeric quality-determining variables |
categorical_variables |
vector of categorical variables |
reference_period |
period to be normalized to index = 100 (e.g., "2015") |
window_length |
length of each rolling window (integer) |
number_of_observations |
logical, whether to return number of observations per period (default = FALSE) |
Value
data frame with period, Index, and optionally number_of_observations
Author(s)
Vivek Gajadhar
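A rolling time dummy index can be sketched in base R as follows. The page does not say how the windows are linked, so this sketch assumes a movement splice: each new window contributes only the price change between its last two periods. The simulated data and all names are illustrative, and the package function may link windows differently.

```r
set.seed(11)
periods <- c("2020Q1", "2020Q2", "2020Q3", "2020Q4", "2021Q1", "2021Q2")

# Simulated toy data with 2 percent log-price growth per period
toy <- data.frame(
  period = rep(periods, each = 40),
  log_floor_area = rnorm(240, mean = 4.5, sd = 0.3)
)
drift <- 0.02 * (match(toy$period, periods) - 1)
toy$price <- exp(8 + 0.9 * toy$log_floor_area + drift + rnorm(240, sd = 0.05))

# Time dummy regression on one window; returns log index with first period = 0
window_log_index <- function(d) {
  fit <- lm(log(price) ~ factor(period) + log_floor_area, data = d)
  estimates <- coef(fit)
  c(0, unname(estimates[grepl("^factor\\(period\\)", names(estimates))]))
}

window_length <- 3
# The first window supplies the index for its own periods
log_index <- window_log_index(toy[toy$period %in% periods[1:window_length], ])

# Each later window adds the movement between its last two periods (movement splice)
for (t in (window_length + 1):length(periods)) {
  in_window <- periods[(t - window_length + 1):t]
  w <- window_log_index(toy[toy$period %in% in_window, ])
  log_index <- c(log_index, log_index[t - 1] + w[window_length] - w[window_length - 1])
}

rolling_index <- data.frame(period = periods, Index = 100 * exp(log_index))
rolling_index
```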
Calculate Time Dummy Index
Description
Estimates a price index using a single regression with time dummy variables.
Usage
calculate_time_dummy(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period = NULL,
number_of_observations = FALSE
)
Arguments
dataset |
data frame with input data |
period_variable |
name of the time variable (string) |
dependent_variable |
name of the dependent variable (usually price, assumed unlogged) |
numerical_variables |
vector of numeric quality-determining variables |
categorical_variables |
vector of categorical variables |
reference_period |
period to be normalized to index = 100 (e.g., "2015") |
number_of_observations |
logical, whether to return number of observations per period (default = FALSE) |
Value
data frame with period, Index, and optionally number_of_observations
Author(s)
Vivek Gajadhar
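The time dummy approach can be illustrated with a single base R regression: the log price is regressed on period dummies and quality characteristics, and the exponentiated dummy coefficients give the index relative to the first period. The simulated data and variable names are assumptions for illustration; the package function may handle details such as the reference period differently.

```r
set.seed(1)
periods <- c("2020Q1", "2020Q2", "2020Q3")

# Simulated toy data: true period effects on log price are 0, 0.02 and 0.05
toy <- data.frame(
  period = rep(periods, each = 50),
  log_floor_area = rnorm(150, mean = 4.5, sd = 0.3)
)
period_effect <- c(0, 0.02, 0.05)[match(toy$period, periods)]
toy$price <- exp(8 + 0.9 * toy$log_floor_area + period_effect + rnorm(150, sd = 0.05))

# One pooled regression with a dummy per period (first period is the baseline)
fit <- lm(log(price) ~ factor(period) + log_floor_area, data = toy)
estimates <- coef(fit)
dummy_coefs <- estimates[grepl("^factor\\(period\\)", names(estimates))]

# Index: baseline period = 100, other periods = 100 * exp(dummy coefficient)
td_index <- data.frame(
  period = sort(unique(toy$period)),
  Index = 100 * exp(c(0, unname(dummy_coefs)))
)
td_index
```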
Calculate the trend line for a provided time series of numeric values
Description
Calculate the trend line with the state space method for a provided time series (chronological order is assumed). The series are calculated with the package KFAS.
Usage
calculate_trend_line_kfas(original_series, periodicity, resting_points)
Arguments
original_series |
time series with values in chronological order |
periodicity |
if month, then 12. If quarter, then 4, etc. (default = 4) |
resting_points |
should analyses values be returned? (default = FALSE) |
Value
Trend line
Author(s)
Pim Ouwehand, Farley Ishaak
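To give a feel for state space trend estimation without requiring KFAS, the sketch below uses stats::StructTS() and stats::tsSmooth() from base R instead. This is a deliberate substitution: it fits a simple local level model to a simulated series and is not the model specification used by the package.

```r
set.seed(3)

# Simulated quarterly series: upward random walk plus observation noise
noisy_series <- ts(100 + cumsum(rnorm(40, mean = 0.5)) + rnorm(40, sd = 2),
                   frequency = 4)

# Local level model (a basic state space model) fitted by maximum likelihood
fit <- StructTS(noisy_series, type = "level")

# Kalman smoother: forward and backward pass over the series
trend <- tsSmooth(fit)

head(cbind(noisy_series, trend))
```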
Default update function
Description
This function is used in the function: calculate_trend_line_kfas()
Usage
custom_update_function(params, model)
Arguments
params |
start values |
model |
state space model number |
Value
New model
Author(s)
Vivek Gajadhar
A real estate example dataframe
Description
A subset of data from a fictitious real estate data frame containing transaction prices and some categorical and numerical characteristics of each dwelling.
Usage
data_constraxion
Format
A data frame with 7,800 rows and 6 columns:
- period
A (string) vector indicating a time period
- price
A numeric vector indicating the transaction price of the dwelling
- floor_area
A real-valued vector of (the logarithm of) the floor area of the dwelling
- dist_trainstation
A real-valued vector of (the logarithm of) the distance of the dwelling to the nearest train station
- neighbourhood_code
A categorical code/string referring to the neighbourhood the dwelling belongs to
- dummy_large_city
A vector indicating whether the dwelling belongs to a large city or not
Source
A fictitious dataset for illustration purposes
Examples
data(data_constraxion)
head(data_constraxion)
Determine_initial_parameters
Description
Determine start values within state space models. This function is used in the function: calculate_trend_line_kfas()
Usage
determine_initial_parameters(
model,
initial_values,
FUN = custom_update_function
)
Arguments
model |
model values as output of the function select_state_space_model() |
initial_values |
$initial.values as output of the model |
FUN |
function called: custom_update_function |
Value
New initial start values
Author(s)
Pim Ouwehand, Farley Ishaak
Estimate time series parameters
Description
Estimate the parameters needed to estimate trend lines. This function is used in the function: calculate_trend_line_kfas()
Usage
estimate_ts_parameters(model, initial_values)
Arguments
model |
model values as output of the function select_state_space_model() |
initial_values |
$initial.values as output of the model |
Value
Parameter for the time series model
Author(s)
Pim Ouwehand, Farley Ishaak
Plot index output from calculate_price_index
Description
Static price index plot using base R graphics with grid lines and external legend.
Usage
plot_price_index(index_output, title = NULL)
Arguments
index_output |
A data.frame or named list of data.frames (from calculate_price_index()) |
title |
Optional plot title |
Details
Supports both single index data.frame and named list of multiple methods. X-axis shows only first period of each year with rotated labels to avoid clutter.
Value
None. Draws plots in the active graphics device.
Author(s)
Vivek Gajadhar
Plot diagnostics output from calculate_regression_diagnostics as a multi-panel grid (base R)
Description
Creates a static 3x2 grid of base R plots showing regression diagnostics:
Normality (Shapiro-Wilk)
Linearity (Adjusted R-squared)
Heteroscedasticity (Breusch-Pagan)
Autocorrelation (Durbin-Watson)
Autocorrelation (p-value DW)
Usage
plot_regression_diagnostics(diagnostics, title = "Regression Diagnostics")
Arguments
diagnostics |
A data.frame as returned by calculate_regression_diagnostics() |
title |
Optional overall title for the entire plot grid (default: "Regression Diagnostics") |
Value
None. Produces plots in the active graphics device.
Author(s)
Vivek Gajadhar
Examples
plot_regression_diagnostics(
calculate_regression_diagnostics(
dataset = data_constraxion,
period_variable = "period",
dependent_variable = "price",
numerical_variables = c("floor_area", "dist_trainstation"),
categorical_variables = c("dummy_large_city", "neighbourhood_code")
)
)
Select the state space model type
Description
This function is used in the function: calculate_trend_line_kfas()
Usage
select_state_space_model(series, initial_values_all)
Arguments
series |
time series with values in chronological order |
initial_values_all |
start values for 5 hyperparameters: meas, level, slope, seas, scaling |
Value
model values (level, slope) of the chosen state space model and the provided time series
Author(s)
Pim Ouwehand
Set starting values for hyperparameters in state space models
Description
Set starting values for hyperparameters in state space models
Usage
set_startvalues(a, b, c, d, e)
Value
starting values for hyperparameters
Run forward and backward pass of time series estimation
Description
Calculate a trend line based on a provided model. This function is used in the function: calculate_trend_line_kfas()
Usage
smooth_ts(fittedmodel)
Arguments
fittedmodel |
model values as output of the function estimate_ts_parameters() |
Value
A list containing multiple elements; sub-list signalsubconf[, 1] provides the estimated trend line.
Author(s)
Pim Ouwehand, Farley Ishaak
Validate Input Data for Hedonic Index Calculation
Description
This function checks whether the dataset contains all required variables, whether the dependent and numerical variables are numeric, and whether the period variable is formatted correctly (e.g., "2020Q1", "2020M01", or just "2015"). It also performs soft-matching to adjust a provided reference_period to align with the dataset.
Usage
validate_input(
dataset,
period_variable,
dependent_variable,
numerical_variables,
categorical_variables,
reference_period = NULL
)
Arguments
dataset |
A data.frame containing the dataset to be validated. |
period_variable |
A string specifying the name of the period variable column. |
dependent_variable |
A string specifying the name of the dependent variable (usually the sale price). |
numerical_variables |
A character vector with names of numeric quality-determining variables. |
categorical_variables |
A character vector with names of categorical variables (including dummies). |
reference_period |
Optional string for the base period to normalize index values (e.g., "2015", "2020Q1"). |
Author(s)
David Pietersz, Vivek Gajadhar
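The period format check described above can be approximated with a regular expression. The helper is_valid_period and its pattern are hypothetical. They cover the formats named on this page (yearly, quarterly, monthly with "M", YYYYMM and weekly) and are not taken from the package source.

```r
# Hypothetical check for the period formats named in the documentation:
# "2020", "2020Q1", "2020M01", "202001" and "2020W01"
is_valid_period <- function(x) {
  pattern <- "^[0-9]{4}(Q[1-4]|M(0[1-9]|1[0-2])|W[0-9]{2}|0[1-9]|1[0-2])?$"
  grepl(pattern, x)
}

candidates <- c("2020", "2020Q1", "2020M01", "202001", "2020W01", "Q1_2020", "Jan2020")
check <- setNames(is_valid_period(candidates), candidates)
check
```

The last two candidates are the irregular formats that the calculate_price_index() documentation lists as unsupported, and the pattern rejects them.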