Title: | Process and Visualise Concept Mapping Data |
Version: | 1.0.0 |
Author: | Jos Hageman [aut, cre], Jarl Kampen [aut] |
Maintainer: | Jos Hageman <jos.hageman@wur.nl> |
Description: | Processing and visualizing concept mapping data. Concept maps are versatile tools used across disciplines to enhance understanding, teaching, brainstorming, and information organization. The analysis of concept mapping data involves the sequential use of cluster analysis (for sorting participants and statements), multidimensional scaling (for positioning statements in a conceptual space), and visualization techniques, including point cluster maps and dendrograms. |
License: | GPL-3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Imports: | cluster, ggplot2, factoextra, clue, igraph, stringr, pheatmap |
NeedsCompilation: | no |
Packaged: | 2025-03-11 14:44:40 UTC; hagem011 |
Repository: | CRAN |
Date/Publication: | 2025-03-13 12:50:05 UTC |
Abbreviate Statements in Concept Map Data
Description
This function processes the "statement" column of a data frame containing concept map data by converting each statement to lowercase, removing stopwords, and truncating the statement to a specified maximum length. It allows for the abbreviation of long statements while maintaining their core meaning by removing unnecessary words.
Usage
abbreviateStatements(CMData, max_length = 30)
Arguments
CMData |
A data frame containing concept map data. The data should have at least the following column:
|
max_length |
An integer specifying the maximum number of characters for the abbreviated statement. Default is 30. |
Details
This function performs several preprocessing steps on the "statement" column:
Converts statements to lowercase.
Removes punctuation and stopwords from the statements.
Truncates statements to a specified maximum length.
Removes any rows with empty statements after processing.
Stopwords are predefined and include common English words (e.g., "the", "and", "is", "a", etc.) that do not contribute much meaning to the core idea of the statement.
Value
A data frame with the same structure as the input, but with an updated "statement" column containing the abbreviated statements.
Examples
# Create a sample data frame with concept map data
conceptMapData <- data.frame(
id = c(1, 2, 3),
statement = c(
"The quick brown fox jumps over the lazy dog",
"This is a simple concept map example",
"Data science involves analyzing datasets"
)
)
# Apply the abbreviateStatements function with a maximum length of 20
result <- abbreviateStatements(conceptMapData, max_length = 20)
print(result)
Validate Concept Map Data
Description
Checks whether the provided data frame meets the requirements for concept map data.
Usage
checkConceptMapData(CMData)
Arguments
CMData |
A data frame containing concept map data. This data frame must include the required columns: "sorterID", "statement", and "stackID". |
Details
This function performs the following checks on the input data:
Verifies that
CMData
is a data frame.Ensures the presence of required columns:
"sorterID"
,"statement"
, and"stackID"
.Confirms that there are at least 2 unique values in the
"stackID"
column.Confirms that there are at least 2 unique values in the
"sorterID"
column.Confirms that there are at least 2 unique values in the
"statement"
column.
Value
Returns TRUE
if all checks pass. If any check fails, an error is raised
with a descriptive message.
Examples
# Example of valid data
validData <- data.frame(
sorterID = c("resp1", "resp1", "resp1", "resp2",
"resp2", "resp2", "resp3", "resp3", "resp3"),
statement = c("London", "Frankfurt", "Berlin", "London",
"Frankfurt", "Berlin", "London", "Frankfurt", "Berlin"),
stackID = c("capital city", "city", "capital city", 1, 2, 2, "A", "B", "A")
)
checkConceptMapData(validData) # Should return TRUE
# Example of invalid data (missing columns)
invalidData <- data.frame(
sorterID = c(1, 2),
stackID = c(1, 2)
)
# checkConceptMapData(invalidData) # Will return False as an error
Perform Concept Mapping Analysis
Description
This function conducts a concept mapping analysis on sorter data, producing a concept map based on one of three clustering methods: k-means, network analysis, or classical multidimensional scaling (CMDS).
Usage
conceptMapping(
CMData,
method = "network",
numberOfClusters = "auto",
verbose = TRUE,
rangeNumberOfClusters = 2:15,
graph = FALSE,
numberOfKmeansRestarts = 100,
backgroundColor = "black",
main = NULL,
...
)
Arguments
CMData |
A data frame containing concept mapping data.
It must include the columns:
|
method |
A string specifying the clustering method to use. Options are:
|
numberOfClusters |
Either a character string ( |
verbose |
Logical, if |
rangeNumberOfClusters |
A vector of integers specifying the range of clusters to evaluate
when |
graph |
Logical. If |
numberOfKmeansRestarts |
An integer specifying the number of restarts for k-means clustering.
Only relevant if |
backgroundColor |
A string specifying the background color of network plots. Default is |
main |
A string specifying the title for plots. Default is |
... |
Additional arguments, such as |
Details
The function supports three methods for clustering:
-
"kmeans"
: Uses k-means clustering with an optional silhouette-based determination of cluster count. -
"network"
: Generates a network plot using modularity-based clustering. -
"cmds"
: Applies classical multidimensional scaling (CMDS) and clusters the results.
Heatmaps are created for all methods, while additional visualizations depend on the chosen method and graph
parameter.
Value
An object of class conceptMap
, containing:
allStatements
A data frame with statement numbers and text.
CMData
The original concept mapping data.
method
The clustering method used.
numberOfClusters
The number of clusters identified.
clusterResults
A vector indicating cluster assignments for each statement.
heatmapPlot
A heatmap visualizing co-occurrence patterns.
silhouettePlot
(If applicable) A silhouette plot for
"kmeans"
or"cmds"
.networkPlot
(If applicable) A network plot for
"network"
.cmdsPlot
(If applicable) A CMDS plot for
"cmds"
.
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Concept mapping on sorter cluster 3 using default "network" method
myCMAnalysis3 <- conceptMapping(myCMDataBySorters[[3]])
# Concept mapping using default network method using 3 clusters
myCMAnalysis3b <- conceptMapping(myCMDataBySorters[[3]], numberOfCluster = 3)
# Concept mapping using kmeans clustering and 3 clusters
myCMAnalysis3c <- conceptMapping(myCMDataBySorters[[3]], method = "kmeans",
numberOfCluster = 3)
Create an Overview of Statements in Concept Map Data
Description
This function generates an overview of all unique statements in a given concept mapping dataset. The function assigns a unique statement number to each distinct statement and returns a summary data frame.
Usage
createStatementOverview(CMData)
Arguments
CMData |
A data frame containing concept map data. This must include a column named |
Details
The function first checks if the provided dataset is suitable for concept mapping using the
checkConceptMapData
function. If the data is valid, it assigns a numeric statement number
to each distinct statement, ordered by their appearance in the dataset. The function then generates
an overview where each statement is paired with its corresponding statement number.
Value
A data frame with two columns: StatementNumber
and Statement
.
Each row represents a unique statement with its corresponding number.
Examples
# Example of valid data
CMData <- data.frame(
sorterID = c("resp1", "resp1", "resp1", "resp2",
"resp2", "resp2", "resp3", "resp3", "resp3"),
statement = c("London", "Frankfurt", "Berlin", "London",
"Frankfurt", "Berlin", "London", "Frankfurt", "Berlin"),
stackID = c("capital city", "city", "capital city", 1, 2, 2, "A", "B", "A")
)
# Create an overview of the statements
createStatementOverview(CMData)
Cross-Cluster Mapping Between Concept Maps
Description
This function compares two concept maps by aligning their clustering results and visualizing the correspondence between clusters. It identifies matches between clusters from the two maps and highlights differences visually.
Usage
crossClusterMap(conceptMap1, conceptMap2)
Arguments
conceptMap1 |
An object of class |
conceptMap2 |
An object of class |
Details
The function aligns clusters between two concept maps using an optimal matching
algorithm. It first creates a matching matrix based on the overlap between clusters
in the two maps. Then, it uses the Hungarian algorithm (via the solve_LSAP
function
from the clue
package) to find an optimal alignment of clusters.
The output is a plot that shows the alignment of clusters from the two concept maps, with connecting lines colored to indicate matches or mismatches. Statements not clustered in both maps are highlighted in grey.
Value
The function does not return a value but generates a ggplot2
visualization.
See Also
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Subject sorter cluster 1 to concept mapping using default "network" method
myCMAnalysis1 <- conceptMapping(myCMDataBySorters[[1]])
# Subject sorter cluster 3 to concept mapping using default "network" method
myCMAnalysis3 <- conceptMapping(myCMDataBySorters[[3]])
# Visualise comparison of results of two sorter clusters
crossClusterMap(myCMAnalysis1, myCMAnalysis3)
Count the Number of Sorters in Concept Map Data
Description
This function calculates the number of unique sorters (users) in a given concept mapping dataset.
Usage
numberOfSorters(CMData, verbose = TRUE)
Arguments
CMData |
A data frame containing concept map data. This must include a column named |
verbose |
A logical, if |
Details
The function first checks if the provided dataset is suitable for concept mapping using the
checkConceptMapData
function. If the data is valid, it calculates and returns the number
of unique sorterID
s.
Value
An integer representing the number of unique sorters in the dataset.
Examples
# Example of valid data
CMData <- data.frame(
sorterID = c("resp1", "resp1", "resp1", "resp2",
"resp2", "resp2", "resp3", "resp3", "resp3"),
statement = c("London", "Frankfurt", "Berlin", "London",
"Frankfurt", "Berlin", "London", "Frankfurt", "Berlin"),
stackID = c("capital city", "city", "capital city", 1, 2, 2, "A", "B", "A")
)
# Count the number of sorters silently
numberOfSorters(CMData, verbose = FALSE)
# Count the number of sorters with message
numberOfSorters(CMData)
Count the Number of Statements in Concept Map Data
Description
This function calculates the number of unique statements in a given concept mapping dataset.
Usage
numberOfStatements(CMData, verbose = TRUE)
Arguments
CMData |
A data frame containing concept map data. This must include a column named |
verbose |
A logical, if |
Details
The function first checks if the provided dataset is suitable for concept mapping using the
checkConceptMapData
function. If the data is valid, it calculates and returns the number
of unique statement
s.
Value
An integer representing the number of unique statements in the dataset.
Examples
# Example of valid data
CMData <- data.frame(
sorterID = c("resp1", "resp1", "resp1", "resp2",
"resp2", "resp2", "resp3", "resp3", "resp3"),
statement = c("London", "Frankfurt", "Berlin", "London",
"Frankfurt", "Berlin", "London", "Frankfurt", "Berlin"),
stackID = c("capital city", "city", "capital city", 1, 2, 2, "A", "B", "A")
)
# Count the number of statements silently
numberOfStatements(CMData, verbose = FALSE)
# Count the number of statements with message
numberOfStatements(CMData)
Plot a Concept Map
Description
This function generates visualizations for an object of class conceptMap
.
Depending on the method used to create the concept map, different types of plots
(e.g., heatmap, silhouette, network, or CMDS plots) are available.
Usage
## S3 method for class 'conceptMap'
plot(x, whichPlot = "all", ...)
Arguments
x |
An object of class |
whichPlot |
A character string specifying which plot to display. Options depend on the method used to create the concept map:
The default is |
... |
arguments to be passed to methods |
Details
The function behaves differently depending on the method used to create
the concept map. The conceptMap
object must include attributes such as
method
(e.g., "kmeans", "network", or "cmds") and the corresponding plot
objects (e.g., heatmapPlot
, silhouettePlot
, etc.).
The following methods are supported:
-
"kmeans"
: Supports"heatmap"
and"silhouette"
plots. -
"network"
: Supports"heatmap"
and"network"
plots. -
"cmds"
: Supports"heatmap"
,"silhouette"
, and"cmds"
plots.
Value
This function displays the specified plot(s) in the current graphical device.
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Subject sorter cluster 3 to concept mapping using default "network" method
myCMAnalysis3 <- conceptMapping(myCMDataBySorters[[3]])
# Visualise the concept map
plot(myCMAnalysis3)
Print a Concept Map Object
Description
This function prints a detailed summary of a concept map object, including information about statements, sorters, clusters, and the specific statements within selected clusters.
Usage
## S3 method for class 'conceptMap'
print(x, whichCluster = NULL, ...)
Arguments
x |
An object of class |
whichCluster |
A vector of cluster numbers to display. If |
... |
arguments to be passed to methods |
Details
The function first checks if the input object is of class "conceptMap"
and validates the
requested clusters (if specified). It provides an overview of the number of statements, sorters,
and clusters. For each requested cluster, the function lists the statements included in that cluster.
Value
This function does not return a value; it prints the details of the concept map to the console.
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Subject sorter cluster 3 to concept mapping using default "network" method
myCMAnalysis3 <- conceptMapping(myCMDataBySorters[[3]])
# Print content of concept map of sorter cluster 3
print(myCMAnalysis3)
Print and Return Unique Sorters in Concept Map Data
Description
This function retrieves and optionally prints the unique sorters (users) in a given concept mapping dataset.
Usage
printSorters(CMData, verbose = TRUE)
Arguments
CMData |
A data frame containing concept map data. This must include a column named |
verbose |
A logical, if |
Details
The function first checks if the provided dataset is suitable for concept mapping using the
checkConceptMapData
function. If the data is valid, it retrieves the unique sorter IDs
from the sorterID
column. If verbose = TRUE
, the function prints the sorter IDs.
Value
A vector of unique sorter IDs.
Examples
# Example of valid data
CMData <- data.frame(
sorterID = c("resp1", "resp1", "resp1", "resp2",
"resp2", "resp2", "resp3", "resp3", "resp3"),
statement = c("London", "Frankfurt", "Berlin", "London",
"Frankfurt", "Berlin", "London", "Frankfurt", "Berlin"),
stackID = c("capital city", "city", "capital city", 1, 2, 2, "A", "B", "A")
)
# Retrieve unique sorters without printing
printSorters(CMData, verbose = FALSE)
# Retrieve and print unique sorters to console
printSorters(CMData)
Select Significant Statements from Concept Map Data
Description
This function selects statements from a concept map dataset based on their significance in terms of co-occurrence. It applies a chi-squared test on the co-occurrence matrix of the statements to identify those that are statistically significant (i.e., those that co-occur more frequently than would be expected by chance).
Usage
selectStatements(CMData, significanceThreshold = 0.05, verbose = TRUE, ...)
Arguments
CMData |
A data frame containing concept map data. The data should have at least the following columns:
|
significanceThreshold |
A numeric value representing the significance threshold for the chi-squared test. Statements with p-values less than this threshold are considered significant. Default is 0.05. |
verbose |
Logical, if |
... |
Additional arguments to be passed to the chi-squared test (optional). |
Value
A data frame with the same structure as the input, but with non-significant statements removed (if any). If all statements are significant, the original data frame is returned unchanged.
See Also
chisq.test
for chi-squared test functionality.
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.70, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Select significant statements
mySelectedStatementsSorterCluster3 <- selectStatements(myCMDataBySorters[[1]])
Simulate Card Sorting Data
Description
This function simulates card sorting data based on user-specified parameters, such as the number of sorters, the probability of correct sorting, and the weights for different card attributes.
Usage
simulateCardData(
nSorters = 40,
pCorrect = 0.95,
attributeWeights = c(1, 1, 1, 1),
verbose = TRUE
)
Arguments
nSorters |
An integer specifying the number of sorters to simulate. Default is 40. |
pCorrect |
A numeric value between 0 and 1 specifying the probability that a card is sorted correctly. Default is 0.95. |
attributeWeights |
A numeric vector of length 4 specifying the weights
for the card attributes (e.g., |
verbose |
Logical, if |
Details
The function simulates a card sorting experiment where cards are
sorted by multiple sorters based on one of four attributes. The probability
of sorting a card correctly is determined by pCorrect
, and errors are
introduced randomly for each sorter. The attribute weights determine how many
sorters focus on each attribute, and a warning is issued if the weights do
not align with the total number of sorters.
The function returns a data frame containing simulated card sorting data for all sorters, including sorter IDs, card IDs, and assigned stacks.
Value
A data frame with columns:
sorterID
A unique identifier for each sorter.
statement
The ID of the card being sorted.
stackID
The stack assigned to the card by the sorter.
See Also
Examples
# Simulate data with default parameters:
set.seed(1)
myCMData <- simulateCardData()
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
Cluster Sorters in Concept Mapping Data
Description
This function performs clustering of sorters in concept mapping data based on their sorting behavior. It uses hierarchical clustering and allows the automatic determination of the optimal number of clusters or a user-defined number.
Usage
sorterMapping(
CMData,
numberOfSorterClusters = "auto",
verbose = TRUE,
rangeNumberOfClusters = 2:15,
graph = TRUE
)
Arguments
CMData |
A data frame containing concept mapping data. It must include the columns:
|
numberOfSorterClusters |
Either a character string ( |
verbose |
Logical, if |
rangeNumberOfClusters |
A vector of integers specifying the range of clusters to evaluate
when |
graph |
Logical. If |
Details
This function clusters sorters based on their sorting behavior using hierarchical clustering
with Ward's method. If numberOfSorterClusters = "auto"
, the silhouette method is used
to determine the optimal number of clusters within the range specified by rangeNumberOfClusters
.
Each cluster's data is validated for its suitability for concept mapping, and cluster-specific data
is returned as a list of data frames. Graphical output includes a dendrogram and silhouette plot
if graph = TRUE
.
Value
A list of data frames, each representing the concept mapping data for a cluster of sorters.
If only one cluster is found, the original CMData
is returned.
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Subject the data to sorter cluster analysis with a predefined number of sorter clusters
myCMDataBySorters <- sorterMapping(myCMData, numberOfSorterClusters=2)
Summary of a Concept Map Object
Description
This function provides a summary of a concept map object, including the number of statements, sorters, clusters, and the distribution of statements across clusters.
Usage
## S3 method for class 'conceptMap'
summary(object, ...)
Arguments
object |
An object of class |
... |
arguments to be passed to methods |
Details
The function verifies that the input object is of class "conceptMap"
and extracts key information from it.
It summarizes the number of statements, sorters, and clusters, and details the distribution of statements across
the identified clusters.
Value
This function does not return a value; it prints a summary of the concept map object to the console.
Examples
# Simulate data with custom parameters:
set.seed(1)
myCMData <- simulateCardData(nSorters=40, pCorrect=.90, attributeWeights=c(1,1,1,1))
# Subject the data to sorter cluster analysis
myCMDataBySorters <- sorterMapping(myCMData)
# Subject sorter cluster 3 to concept mapping using default "network" method
myCMAnalysis3 <- conceptMapping(myCMDataBySorters[[3]])
# Generate summary of concept map of sorter cluster 3
summary(myCMAnalysis3)