Title: XOR Pattern Detection and Visualization
Version: 0.1.0
Description: Provides tools for detecting XOR-like patterns in variable pairs in two-class data sets. Includes visualizations for pattern exploration and reporting capabilities with both text and HTML output formats.
License: GPL-3
Encoding: UTF-8
LazyData: true
Imports: dplyr (≥ 1.1.0), ggplot2 (≥ 3.4.0), ggh4x (≥ 0.2.3), tibble (≥ 3.1.8), reshape2 (≥ 1.4.4), glue (≥ 1.6.0), magrittr (≥ 2.0.0), stats, ggthemes, DescTools (≥ 0.99.50), utils, methods, grDevices, knitr, kableExtra, htmltools, base64enc
Suggests: testthat (≥ 3.0.0), rmarkdown, doParallel, foreach, parallel (≥ 4.2.0), future (≥ 1.28.0), future.apply (≥ 1.10.0), pbmcapply (≥ 1.5.0)
RoxygenNote: 7.2.3
SystemRequirements: GNU make
Depends: R (≥ 3.5.0)
URL: https://github.com/JornLotsch/detectXOR
BugReports: https://github.com/JornLotsch/detectXOR/issues
NeedsCompilation: no
Packaged: 2025-06-24 05:54:01 UTC; joern
Author: Jorn Lotsch
Maintainer: Jorn Lotsch <j.lotsch@em.uni-frankfurt.de>
Repository: CRAN
Date/Publication: 2025-06-27 13:00:06 UTC
XOR Pattern Detection and Visualization
Description
Provides tools for detecting XOR-like patterns in variable pairs in two-class data sets. Includes visualizations for pattern exploration and reporting capabilities with both text and HTML output formats.
Details
Core Features:
- Statistical detection using chi-squared tests and Kendall's tau
- Spaghetti plots and XY plots for pattern visualization
Main Functions:
- detect_xor: Core detection algorithm
- generate_spaghetti_plot_from_results: Line plots
- generate_xy_plot_from_results: Plot for pattern visualization
Author(s)
Jorn Lotsch <j.lotsch@em.uni-frankfurt.de>
References
Methodological foundations:
Pattern detection in machine learning
Statistical dependency measures (Kendall's tau)
See Also
Useful links:
Report bugs at https://github.com/JornLotsch/detectXOR/issues
Examples
# Basic workflow with included dataset
data(XOR_data)
# Detect XOR patterns
results <- detect_xor(XOR_data, class_col = "class")
# Generate visualizations
generate_spaghetti_plot_from_results(
  results$results_df,
  XOR_data,
  class_col = "class"
)
generate_xy_plot_from_results(
  results$results_df,
  XOR_data,
  class_col = "class"
)
Synthetic XOR Pattern Dataset
Description
Simulated classification dataset containing 400 observations with 5 features demonstrating XOR patterns, linear class differences, and random noise.
Usage
data("XOR_data")
Format
A data frame with 400 rows and 6 variables:
- class
Binary class labels (1 or 2)
- Variable_A
Normally distributed with subtle class difference (delta mu=0.25)
- Variable_B
High-variance normal distribution (sigma=3) with moderate class separation (delta mu=-0.7)
- Variable_C
XOR pattern component 1 (mu=3 vs 10 between classes)
- Variable_D
XOR pattern component 2 (mu=3 vs 10 between classes)
- Variable_E
Uniform noise (1-10)
Source
Synthetic data generated with rnorm() and runif()
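A dataset with this structure could be produced along the following lines. This is a minimal sketch, not the script used to build XOR_data: the seed, the even split of 200 rows per class, the unit standard deviations, the baseline means of 0, and the way the XOR component means are assigned to classes are all assumptions, and the name XOR_sketch is hypothetical.

```r
# Hypothetical re-creation of a dataset shaped like XOR_data (not the original script).
set.seed(42)                       # assumed seed, for reproducibility only
n_per_class <- 200                 # assumed: 400 rows split evenly
class <- rep(1:2, each = n_per_class)

# Assumed XOR layout: in class 1, C and D are both low or both high;
# in class 2, exactly one of them is high.
c_high <- rep(c(FALSE, TRUE), times = n_per_class)   # alternate low/high for C
d_high <- ifelse(class == 1, c_high, !c_high)        # D agrees with C only in class 1

XOR_sketch <- data.frame(
  class      = class,
  Variable_A = rnorm(400, mean = ifelse(class == 1, 0, 0.25), sd = 1),
  Variable_B = rnorm(400, mean = ifelse(class == 1, 0, -0.7), sd = 3),
  Variable_C = rnorm(400, mean = ifelse(c_high, 10, 3), sd = 1),
  Variable_D = rnorm(400, mean = ifelse(d_high, 10, 3), sd = 1),
  Variable_E = runif(400, min = 1, max = 10)
)

str(XOR_sketch)
```

With this layout neither Variable_C nor Variable_D separates the classes on its own, because both classes contain low and high values of each; only the combination of the two does.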
Examples
data(XOR_data)
str(XOR_data)
summary(XOR_data)
Detect XOR Patterns in Variable Pairs
Description
Identifies XOR-shaped relationships between variables using statistical tests and pattern detection.
Usage
detect_xor(
  data,
  class_col = "class",
  check_tau = TRUE,
  compute_axes_parallel_significance = TRUE,
  p_threshold = 0.05,
  tau_threshold = 0.3,
  abs_diff_threshold = 20,
  split_method = "quantile",
  max_cores = 1,
  extreme_handling = "winsorize",
  winsor_limits = c(0.05, 0.95),
  scale_data = TRUE,
  use_complete = TRUE
)
Arguments
- data
Data frame containing features and class column
- class_col
Name of class column (default: "class")
- check_tau
Logical; compute classwise tau coefficients (default: TRUE)
- compute_axes_parallel_significance
Logical; compute Wilcoxon tests (default: TRUE)
- p_threshold
Significance threshold (default: 0.05)
- tau_threshold
Tau coefficient threshold (default: 0.3)
- abs_diff_threshold
Absolute difference threshold for patterns (default: 20)
- split_method
Method for splitting data ("quantile" or "range") (default: "quantile")
- max_cores
Maximum cores for parallel processing (default: 1, as shown in Usage; NULL = automatic)
- extreme_handling
Method for handling extreme values; options include "winsorize" or "none" (default: "winsorize")
- winsor_limits
Numeric vector of length 2 specifying lower and upper quantiles for winsorization (default: c(0.05, 0.95))
- scale_data
Logical; whether to scale/standardize the data before analysis (default: TRUE)
- use_complete
Logical; whether to use only complete cases (default: TRUE)
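The preprocessing arguments (extreme_handling, winsor_limits, scale_data, use_complete) can be illustrated in base R. The sketch below is an assumption about what such preprocessing typically looks like, not detectXOR's internal code: the helper names winsorize_sketch() and preprocess_sketch() are hypothetical, and the order of the steps (complete cases, then winsorization, then scaling) is assumed.

```r
# Hypothetical helpers illustrating the preprocessing options; not detectXOR internals.
winsorize_sketch <- function(x, limits = c(0.05, 0.95)) {
  q <- stats::quantile(x, probs = limits, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])        # clamp values outside the quantile limits
}

preprocess_sketch <- function(data, class_col = "class",
                              winsor_limits = c(0.05, 0.95),
                              scale_data = TRUE, use_complete = TRUE) {
  # Keep only rows without missing values, if requested
  if (use_complete) data <- data[stats::complete.cases(data), , drop = FALSE]
  features <- setdiff(names(data), class_col)
  for (v in features) {
    x <- winsorize_sketch(data[[v]], winsor_limits)
    if (scale_data) x <- as.numeric(scale(x))   # centre to mean 0, scale to sd 1
    data[[v]] <- x
  }
  data
}

# Toy input with one outlier (100) and one missing value
toy <- data.frame(class = rep(1:2, each = 5),
                  x = c(1:8, 100, NA),
                  y = c(10:2, 1))
prepped <- preprocess_sketch(toy)
prepped
```

Winsorizing before scaling keeps a single extreme value such as 100 from dominating the mean and standard deviation used for standardization.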
Details
This function analyzes pairwise variable relationships in two-class data sets to detect XOR-like patterns. The analysis pipeline includes:
- Data preprocessing (winsorization, scaling, complete cases)
- Tile pattern analysis using chi-squared tests
- Classwise Kendall's tau correlation analysis
- Group-wise Wilcoxon significance tests
The function handles parallel processing automatically when multiple cores are available (up to max_cores) and returns both a summary data frame (results_df) and detailed per-pair results (pair_list) for further analysis.
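The statistical steps of the pipeline can be illustrated with base R functions on simulated data. This is a sketch under stated assumptions, not the package's algorithm: the data are invented, each variable is split into two tiles at its median (the package's own tiling, controlled by split_method, may differ), and all object names are hypothetical.

```r
# Illustrative only: simulated XOR data and base R tests, not detectXOR internals.
set.seed(1)
n <- 200
x_high <- rep(c(FALSE, TRUE), each = n / 2)
y_high <- rep(c(FALSE, TRUE), times = n / 2)
x <- rnorm(n, mean = ifelse(x_high, 10, 3))
y <- rnorm(n, mean = ifelse(y_high, 10, 3))
# XOR: the class depends on the pair (x, y), not on x or y alone
class <- factor(ifelse(x_high == y_high, 1, 2))

# 1. Tile analysis: cut each variable at its median (assumed split) and
#    test whether class membership depends on the tile.
tile <- interaction(x > stats::median(x), y > stats::median(y))
chi <- stats::chisq.test(table(tile, class))

# 2. Classwise Kendall's tau between x and y.
tau_by_class <- sapply(levels(class), function(k) {
  stats::cor(x[class == k], y[class == k], method = "kendall")
})

# 3. Axis-parallel check: Wilcoxon rank-sum test of each variable between classes.
p_x <- stats::wilcox.test(x ~ class)$p.value
p_y <- stats::wilcox.test(y ~ class)$p.value

chi$p.value    # very small: the tiles separate the classes
tau_by_class   # opposite signs in the two classes
c(p_x, p_y)    # per-variable p-values; by construction x and y have the same distribution in both classes
```

The combination of a significant tile test, classwise correlations of opposite sign, and no clear per-variable class difference is the signature that distinguishes an XOR-like pair from a pair with an ordinary linear class difference.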
Value
List containing:
- results_df
Data frame with detection results for all variable pairs
- pair_list
Detailed analysis results for each variable pair
See Also
generate_spaghetti_plot_from_results for spaghetti plot visualization, generate_xy_plot_from_results for scatter plot visualization, generate_xor_reportConsole for console reporting, generate_xor_reportHTML for HTML report generation, XOR_data for the example dataset
Examples
# Load example data
data(XOR_data)
# Run XOR detection
results <- detect_xor(data = XOR_data, class_col = "class")
# View summary of detected patterns
print(results$results_df["xor_shape_detected"])
# Generate visualizations
spaghetti_plot <- generate_spaghetti_plot_from_results(
  results = results,
  data = XOR_data,
  class_col = "class"
)
print(spaghetti_plot)
xy_plot <- generate_xy_plot_from_results(
  results = results,
  data = XOR_data,
  class_col = "class"
)
print(xy_plot)
# Generate console report (doesn't write files)
generate_xor_reportConsole(results, XOR_data, "class", show_plots = FALSE)
# View detailed results for detected pairs
detected_pairs <- results$results_df[results$results_df$xor_shape_detected == TRUE, ]
print(detected_pairs)
Generate XOR Spaghetti Plots
Description
Creates connected line plots for variable pairs showing XOR patterns.
Usage
generate_spaghetti_plot_from_results(
  results,
  data,
  class_col,
  scale_data = TRUE
)
Arguments
- results
Either the results_df data frame from detect_xor() or the full results object returned by detect_xor()
- data
Original dataset containing variables and classes
- class_col
Character string specifying the name of the class column
- scale_data
Logical indicating whether to scale variables before plotting (default: TRUE)
Details
This function creates spaghetti plots (connected line plots) for variable pairs that have been flagged as showing XOR patterns by detect_xor(). The function automatically handles both original and rotated XOR patterns, applying the appropriate coordinate transformation when necessary.
The function accepts either the full results object returned by detect_xor() or just the results_df component extracted from it. Variable pairs are separated using "||" as the delimiter in plot labels.
If no XOR patterns are detected, an empty plot with an appropriate message is returned.
To save the plot, use ggplot2::ggsave() or other standard R plotting save methods.
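The idea behind a spaghetti plot for a single variable pair can be sketched in base R: each observation becomes one line joining its value on the first variable to its value on the second, coloured by class. The helper spaghetti_sketch() and the toy data are hypothetical, and the sketch uses graphics::matplot() rather than ggplot2, so it only illustrates the concept and is not the package's plotting code.

```r
# Hypothetical base-graphics sketch of a spaghetti plot;
# the package function itself returns a ggplot object.
spaghetti_sketch <- function(data, var1, var2, class_col, scale_data = TRUE) {
  vals <- as.matrix(data[, c(var1, var2)])
  if (scale_data) vals <- scale(vals)            # put both variables on a common scale
  cls <- as.integer(factor(data[[class_col]]))   # class -> colour index
  # matplot() draws one line per column, so transpose: rows = the two variables
  graphics::matplot(t(vals), type = "l", lty = 1, col = cls,
                    xaxt = "n", xlab = "", ylab = "value",
                    main = paste(var1, var2, sep = " || "))
  graphics::axis(1, at = 1:2, labels = c(var1, var2))
  invisible(t(vals))                             # the 2 x n matrix that was plotted
}

# Toy XOR-like data: class 1 lines stay level, class 2 lines cross
toy <- data.frame(class = rep(1:2, each = 4),
                  a = c(1, 1.2, 9, 9.1, 1.1, 0.9, 9.2, 8.8),
                  b = c(0.9, 1.1, 9.3, 8.9, 9, 9.1, 1, 1.2))
plotted <- spaghetti_sketch(toy, "a", "b", "class")
dim(plotted)
```

In an XOR-like pair, the lines of one class run roughly level (low to low, high to high) while the lines of the other class cross (low to high, high to low), which is what makes this display useful for spotting the pattern.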
Value
Returns a ggplot object. No files are saved automatically.
See Also
detect_xor for XOR pattern detection, generate_xy_plot_from_results for scatter plots
Examples
# Using full results object (recommended)
data(XOR_data)
results <- detect_xor(data = XOR_data, class_col = "class")
spaghetti_plot <- generate_spaghetti_plot_from_results(
  results = results,
  data = XOR_data,
  class_col = "class"
)
# Display the plot
print(spaghetti_plot)
# Save the plot if needed
# ggplot2::ggsave("my_spaghetti_plot.png", spaghetti_plot)
# Using extracted results_df (also works)
spaghetti_plot_df <- generate_spaghetti_plot_from_results(
  results = results$results_df,
  data = XOR_data,
  class_col = "class"
)
Generate XOR Detection Report (Console-friendly)
Description
Creates a console-friendly report with a formatted table and plots for XOR pattern detection results.
Usage
generate_xor_reportConsole(
  results,
  data,
  class_col,
  scale_data = TRUE,
  show_plots = TRUE,
  quantile_lines = c(1/3, 2/3),
  line_method = "quantile"
)
Arguments
- results
Either the results_df data frame from detect_xor() or the full results object returned by detect_xor().
- data
Original dataset containing variables and classes.
- class_col
Character specifying the class column name.
- scale_data
Logical indicating whether to scale variables in plots. Default: TRUE.
- show_plots
Logical indicating whether to display plots. Default: TRUE.
- quantile_lines
Numeric vector of quantiles for reference lines in XY plots. Default: c(1/3, 2/3).
- line_method
Method for boundary calculation ("quantile" or "range"). Default: "quantile".
Value
Invisibly returns a list containing the formatted table and plots (if generated).
See Also
detect_xor for XOR pattern detection, generate_xor_reportHTML for HTML report generation
Generate XOR Detection HTML Report
Description
Creates an HTML report with a formatted table and plots for XOR pattern detection results.
Usage
generate_xor_reportHTML(
  results,
  data,
  class_col,
  output_file = "xor_detection_report.html",
  open_browser = TRUE,
  scale_data = TRUE,
  quantile_lines = c(1/3, 2/3),
  line_method = "quantile"
)
Arguments
- results
Either the results_df data frame from detect_xor() or the full results object returned by detect_xor().
- data
Original dataset containing variables and classes.
- class_col
Character specifying the class column name.
- output_file
Character specifying the output HTML file name. Default: "xor_detection_report.html".
- open_browser
Logical indicating whether to open the report in browser automatically. Default: TRUE.
- scale_data
Logical indicating whether to scale variables in plots. Default: TRUE.
- quantile_lines
Numeric vector of quantiles for reference lines in XY plots. Default: c(1/3, 2/3).
- line_method
Method for boundary calculation ("quantile" or "range"). Default: "quantile".
Value
Invisibly returns the file path of the generated HTML report.
See Also
detect_xor for XOR pattern detection, generate_xor_reportConsole for text-based report generation
Generate XOR Scatter Plots
Description
Creates scatter plots with decision boundaries for variable pairs showing XOR patterns.
Usage
generate_xy_plot_from_results(
  results,
  data,
  class_col,
  scale_data = TRUE,
  quantile_lines = c(1/3, 2/3),
  line_method = "quantile"
)
Arguments
- results
Either the results_df data frame from detect_xor() or the full results object returned by detect_xor()
- data
Original dataset containing variables and classes
- class_col
Character string specifying the name of the class column
- scale_data
Logical indicating whether to scale variables before plotting (default: TRUE)
- quantile_lines
Numeric vector of length 2 specifying quantiles for reference lines (default: c(1/3, 2/3))
- line_method
Character string specifying the boundary calculation method, either "quantile" or "range" (default: "quantile")
Details
This function creates scatter plots for variable pairs that have been flagged as showing XOR patterns by detect_xor(). The plots include dashed reference lines that help visualize the decision boundaries used in XOR pattern detection.
The function automatically handles both original and rotated XOR patterns, applying the appropriate coordinate transformation when necessary. Variable pairs are separated using "||" as the delimiter in plot labels.
The line_method parameter controls how reference lines are calculated:
- "quantile": Lines are placed at the specified quantiles of the data distribution
- "range": Lines divide the data range into three equal parts
If no XOR patterns are detected, an empty plot with an appropriate message is returned.
To save the plot, use ggplot2::ggsave() or other standard R plotting save methods.
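The difference between the two line_method options can be shown with a small base R sketch. The helper reference_lines_sketch() is hypothetical and only mirrors the two rules as described in this section; it is not the package's internal implementation.

```r
# Hypothetical helper showing the two reference-line rules described above.
reference_lines_sketch <- function(x, line_method = c("quantile", "range"),
                                   quantile_lines = c(1/3, 2/3)) {
  line_method <- match.arg(line_method)
  if (line_method == "quantile") {
    # Lines at the requested quantiles of the observed values
    stats::quantile(x, probs = quantile_lines, na.rm = TRUE, names = FALSE)
  } else {
    # Lines that cut the observed range into three equal parts
    r <- range(x, na.rm = TRUE)
    r[1] + diff(r) * c(1, 2) / 3
  }
}

x <- c(0, 1, 2, 3, 4, 5, 30)            # skewed toy data
reference_lines_sketch(x, "quantile")   # 2 and 4
reference_lines_sketch(x, "range")      # 10 and 20
```

On skewed data the two methods can disagree substantially: quantile lines follow where the observations are concentrated, while range lines are pulled toward extreme values.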
Value
Returns a ggplot object. No files are saved automatically.
See Also
detect_xor for XOR pattern detection, generate_spaghetti_plot_from_results for spaghetti plots
Examples
# Using full results object (recommended)
data(XOR_data)
results <- detect_xor(data = XOR_data, class_col = "class")
xy_plot <- generate_xy_plot_from_results(
  results = results,
  data = XOR_data,
  class_col = "class"
)
# Display the plot
print(xy_plot)
# Using different boundary method
xy_plot_range <- generate_xy_plot_from_results(
  results = results,
  data = XOR_data,
  class_col = "class",
  line_method = "range"
)
# Save the plot if needed
# ggplot2::ggsave("my_xy_plot.png", xy_plot)
# Using extracted results_df (also works)
xy_plot_df <- generate_xy_plot_from_results(
  results = results$results_df,
  data = XOR_data,
  class_col = "class"
)