get_writer_profiles() and
plot_writer_profiles() so that they are exported
functions.
Created get_cluster_fill_rates() to calculate the
proportion of graphs assigned to each cluster in a cluster
template.
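As a rough illustration of what a cluster fill rate is, the sketch below computes counts and proportions in base R from a made-up vector of cluster assignments. It does not call get_cluster_fill_rates(); the variable names and the five-cluster template are assumptions for the example only, and the package's actual output format may differ.

```r
# Hypothetical cluster assignments for the graphs in one document,
# using a template with 5 clusters (illustrative values only).
n_clusters <- 5
assignments <- c(1, 3, 3, 5, 1, 3, 2, 3)

# Count graphs per cluster, keeping clusters that received no graphs.
fill_counts <- table(factor(assignments, levels = seq_len(n_clusters)))

# Fill rates are the counts divided by the total number of graphs.
fill_rates <- as.numeric(fill_counts) / sum(fill_counts)
names(fill_rates) <- names(fill_counts)

print(fill_rates)
```

Using a factor with explicit levels keeps empty clusters in the result with a rate of zero, so every document yields a vector of the same length.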
Created get_writer_profiles() to estimate writer
profiles for all handwritten documents in the input directory. Writer
profiles are the cluster fill counts or the cluster fill rates. This
function runs process_batch_dir() to split the handwriting
into component shapes called graphs. Then, it runs
get_clusters_batch() to assign each graph to a cluster in a
cluster template. Lastly, it runs get_cluster_fill_counts()
or get_cluster_fill_rates() depending on whether the user
chooses counts or rates.
Changed the writer_indices and
doc_indices arguments in get_clusters_batch()
to be optional. The main reason for this change is so that users do not
need to use a naming convention for their handwriting files in order to
use handwriterRF::calculate_slr() and
handwriterRF::compare_documents() which call
get_clusters_batch().
Fixed bug in get_clusters_batch() so that the
warning “closing unused connection” no longer occurs if the user runs
the function in parallel.
The new function plotGraphs(), added in version 3.2.0, was
mistakenly not exported. The function has been renamed to
plot_graphs() and is now available to users.
Fixed a bug in get_clusters_batch() so it no longer
returns the error message, “Unable to get cluster assignments for one or
more documents,” if the file all_clusters.rds exists in the output
directory.
Fixed a bug in get_clusters_batch() so it no longer
mistakenly returns the error message, “Unable to get cluster assignments
for one or more documents,” even if cluster assignments were
successfully processed for all input documents.
If the graphs from the model training documents do not populate
all available clusters in the cluster template, one or more of the
graphs from the questioned document(s) could potentially be assigned to
a cluster in the template that doesn’t contain any model training
graphs. Previously, analyze_questioned_documents() threw an
error if this occurs. Now, it only throws an error if more than 10% of
the graphs from the questioned documents are assigned to clusters that
are not populated by model training graphs. The mismatch between
clusters used by the model document graphs and clusters used by the
questioned document(s) graphs only becomes an issue if it occurs for a
large number of graphs.
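The 10% rule described above can be sketched in base R as follows. This is an illustration of the check only, not the code inside analyze_questioned_documents(); the helper prop_unpopulated() and the example cluster vectors are hypothetical.

```r
# Hypothetical helper: proportion of questioned graphs assigned to clusters
# that contain no model training graphs.
prop_unpopulated <- function(questioned_clusters, model_clusters) {
  mean(!(questioned_clusters %in% unique(model_clusters)))
}

# Illustrative cluster assignments; cluster 4 has no model training graphs.
model_clusters <- c(1, 1, 2, 3, 3, 5)
questioned_ok  <- c(1, 2, 2, 3, 5, 5, 3, 1, 2, 1,
                    3, 5, 1, 2, 3, 5, 1, 2, 3, 4)  # 1 of 20 graphs in cluster 4
questioned_bad <- c(1, 4, 4, 3, 5)                 # 2 of 5 graphs in cluster 4

threshold <- 0.10  # assumed from the description above

# An error would be raised only when the proportion exceeds the threshold.
ok_exceeds  <- prop_unpopulated(questioned_ok, model_clusters) > threshold
bad_exceeds <- prop_unpopulated(questioned_bad, model_clusters) > threshold
```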
The new function plot_cluster_centers() creates a
plot of the cluster centers from a cluster template. The cluster
centers are displayed as orange shapes. The function also plots all
graphs in each cluster as grey shapes with 5% transparency to depict the
variability of graph shapes within each cluster.
The new function plotGraphs() plots every graph in a
document processed with processDocument().
Fixed bug in processDocument() when the writing in
the document is a single connected component, such as a single word
written in cursive. Previously, the output of
processDocument() for this kind of document was formatted
incorrectly.
Fixed bug in get_credible_intervals() and
plot_credible_intervals() where the model writers were
numbered sequentially. Now these functions use the writer IDs.
Fixed bug in format_template_data() where the
function coerced writer IDs to integers even if the writer IDs contained
characters.
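The following base R snippet shows why integer coercion is lossy for writer IDs that contain letters or leading zeros. It illustrates the general problem only; the IDs are made up, and this is not the code from format_template_data().

```r
# Hypothetical writer IDs, some containing letters or leading zeros.
writer_ids <- c("0012", "w0035", "A17")

# as.integer() drops leading zeros and returns NA for IDs with letters.
coerced <- suppressWarnings(as.integer(writer_ids))

# Keeping the IDs as character strings preserves them exactly.
kept <- as.character(writer_ids)
```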
Fixed bug in get_clusters_batch() where the function
would stall but not return an error message if a document had a graph
with a large number of edges (paths). Now the function ignores graphs
with more than 30 edges.
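A minimal sketch of this kind of filter, assuming each graph is stored as a list with a vector of edges. The data structure and names here are hypothetical and are not taken from the package internals.

```r
# Hypothetical graphs, each with a vector of edge (path) identifiers.
graphs <- list(
  list(id = "g1", edges = 1:4),
  list(id = "g2", edges = 1:45),
  list(id = "g3", edges = 1:30)
)

max_edges <- 30

# Keep only graphs with at most max_edges edges; larger graphs are ignored.
keep <- vapply(graphs, function(g) length(g$edges) <= max_edges, logical(1))
kept_graphs <- graphs[keep]
kept_ids <- vapply(kept_graphs, function(g) g$id, character(1))
```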
Fixed bug in fit_model() where the function saved
the same data in two separate files: “model_clusters.rds” and
“all_clusters.rds”. The argument ‘save_master_file’ was added to
get_clusters_batch(). If TRUE, a data frame of all cluster
assignments, “all_clusters.rds”, will be saved. The default is
FALSE.
fit_model() and
analyze_questioned_documents() now allow writer IDs that
contain numbers and letters.
Increased the speed of processHandwriting() by
changing the function to process a handwritten document in sections
instead of all at once. Nodes created by
processHandwriting() in version 3.1.0 might differ slightly
in placement from previous versions.
Fixed bugs in fit_model() and
analyze_questioned_documents() introduced by the changes to
process_batch_list() in version 3.0.0.
Major reductions were made to the memory required by
process_batch_list() so it can now process paragraph-length
documents from the CSAFE Handwriting Database on machines with 8 GB of
RAM.
process_batch_list() now skips to the next document
if an error is returned while trying to process a document. A log file
records the document name(s) and error message(s) of any problem
documents. If the user reruns process_batch_list(), they now
have the option to either try a second time to process problem documents
or skip them entirely.
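The skip-and-log behavior can be sketched in base R with tryCatch(). This is a generic pattern, not the implementation of process_batch_list(); process_one(), the file names, and the log format are hypothetical.

```r
# Hypothetical stand-in for per-document processing; fails on "bad" files.
process_one <- function(doc) {
  if (grepl("bad", doc)) stop("could not process ", doc)
  toupper(doc)
}

docs <- c("doc1.png", "bad_doc.png", "doc2.png")
log_file <- tempfile(fileext = ".txt")
results <- list()

for (doc in docs) {
  out <- tryCatch(
    process_one(doc),
    error = function(e) {
      # Record the document name and error message, then skip the document.
      cat(doc, ": ", conditionMessage(e), "\n",
          sep = "", file = log_file, append = TRUE)
      NULL
    }
  )
  if (!is.null(out)) results[[doc]] <- out
}

log_lines <- readLines(log_file)
```

On a rerun, the logged document names could be used either to retry the problem documents or to exclude them.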
New function get_clusters_batch() calculates cluster
assignments of all files in a directory.
fit_model() allows the user to fit a statistical
model to known handwriting samples collected from a closed set of
persons.
analyze_questioned_documents() in conjunction with a
model created by fit_model() allows a user to calculate the
posterior probability that each known writer in the model is the writer
of the questioned document(s).
analyze_questioned_documents() is valid only when the
questioned document(s) must have been written by one of the model
writers. This function must NOT be used if someone other than one of the
model writers could have written the questioned document(s), as it could
yield misleading results.
processDocument() is a new wrapper function that
runs readPNGBinary(), thinImage(), and
processHandwriting() so the user doesn’t need to run these
functions individually.
plotImage(), plotImageThinned(), and
plotNodes() now only need one input, a document processed
with processDocument().
processHandwriting() no longer crashes when the
input writing is a single letter or word.
read_and_process() is superseded in favor of
processDocument().
extractGraphs() is superseded in favor of
process_batch_dir().