Type: | Package |
Title: | Data Science Box of Pandora Miscellaneous |
Version: | 0.3.3 |
Date: | 2020-09-11 |
Description: | Tool collection for common and not so common data science use cases. This includes custom-made algorithms for data management as well as value calculations that are hard to find elsewhere because of their specificity but would be a waste to lose nonetheless. Currently available functionality: find sub-graphs in an edge list data.frame, find mode or modes in a vector of values, extract (a) specific regular expression group(s), generate ISO time stamps that play well with file names, or generate URL parameter lists by expanding value combinations. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Imports: | Rcpp (≥ 1.0.1), stringr |
LinkingTo: | Rcpp |
RoxygenNote: | 7.1.1 |
Encoding: | UTF-8 |
SystemRequirements: | C++11 |
Suggests: | covr, testthat, spelling |
Language: | en-US |
NeedsCompilation: | yes |
Packaged: | 2020-09-11 18:57:01 UTC; peter |
Author: | Peter Meissner [aut, cre] |
Maintainer: | Peter Meissner <retep.meissner@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2020-09-12 05:40:13 UTC |
df_defactorize
Description
Converts the factor columns of a data.frame into character columns.
Usage
df_defactorize(df)
Arguments
df | a data.frame-like object |
Value
returns the same data.frame except that factor columns have been transformed into character columns
Examples
df <-
  data.frame(
    a = 1:2,
    b = factor(c("a", "b")),
    c = as.character(letters[3:4]),
    stringsAsFactors = FALSE
  )
vapply(df, class, "")
df_df <- df_defactorize(df)
vapply(df_df, class, "")
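The conversion described under Value can be sketched in base R. This is only an illustration under assumptions: `defactorize_sketch` is a hypothetical name and not the package's implementation.

```r
# Hypothetical helper, not part of the package: replace every factor
# column by its character representation and keep all other columns.
defactorize_sketch <- function(df) {
  # df[] <- ... assigns column by column, so the result stays a data.frame
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

df <- data.frame(
  a = 1:2,
  b = factor(c("a", "b")),
  c = letters[3:4],
  stringsAsFactors = FALSE
)
vapply(defactorize_sketch(df), class, "")
```

Only column `b` changes its class here; the integer and character columns pass through untouched.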
Subgraphs in Undirected Graphs/Networks
Description
Finds and indexes the subgraphs (connected components) of an undirected graph.
Usage
graphs_find_subgraphs(id_1, id_2, verbose = 1L)
Arguments
id_1 | vector of integers indicating ids |
id_2 | vector of integers indicating ids |
verbose | an integer indicating the amount of verbosity; good for long running tasks or to get more information about the workings of the algorithm; currently accepted values: 0, 1, 2 |
Details
Input is given as two vectors where each pair of node ids 'id_1[i]' - 'id_2[i]' indicates an edge between two nodes.
Value
An integer vector with subgraph ids such that each distinct subgraph - i.e. all nodes are reachable within the graph and no node outside the subgraph is reachable - gets a distinct integer value. Integer values are assigned via
Examples
graphs_find_subgraphs(c(1,2,1,5,6,6), c(2,3,3,4,5,4), verbose = 0)
graphs_find_subgraphs(c(1,2,1,5,6,6), c(2,3,3,4,5,4), verbose = 2)
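To illustrate what a subgraph id means for an edge list, the following base R sketch labels connected components with a simple union-find. It is not the package's algorithm (the package uses compiled code). The name `find_subgraphs_sketch`, the choice to return one id per edge, and the numbering of ids in order of first appearance are all assumptions made for this example.

```r
# Hypothetical helper, not part of the package: label each edge with the
# id of the connected component it belongs to, using a basic union-find.
find_subgraphs_sketch <- function(id_1, id_2) {
  stopifnot(length(id_1) == length(id_2))
  nodes  <- unique(c(id_1, id_2))
  parent <- seq_along(nodes)      # every node starts as its own root
  a <- match(id_1, nodes)         # node positions for the first edge ends
  b <- match(id_2, nodes)         # node positions for the second edge ends

  find_root <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  # Merge the two components joined by each edge.
  for (k in seq_along(a)) {
    root_a <- find_root(a[k])
    root_b <- find_root(b[k])
    if (root_a != root_b) {
      parent[max(root_a, root_b)] <- min(root_a, root_b)
    }
  }

  # Assumption: ids are numbered 1, 2, ... in order of first appearance.
  roots <- vapply(a, find_root, integer(1))
  match(roots, unique(roots))
}

find_subgraphs_sketch(c(1, 2, 1, 5, 6, 6), c(2, 3, 3, 4, 5, 4))
# 1 1 1 2 2 2
```

The first three edges connect nodes 1, 2 and 3, the last three connect nodes 4, 5 and 6, so the edge list splits into two subgraphs.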
Mode
Description
Calculates the mode, i.e. the most frequent value, of a vector.
Usage
stats_mode(x, multimodal = FALSE, warn = TRUE)
Arguments
x | vector to get mode for |
multimodal | whether or not all modes should be returned in case of more than one |
warn | should the function warn about multimodal outcomes? |
Value
vector of mode or modes
Mode Allowing for Multi Modal Mode
Description
Function calculating the mode, allowing for multiple modes in case of equal frequencies.
Usage
stats_mode_multi(x)
Arguments
x | vector to get mode for |
Value
vector with all modes
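A minimal base R sketch of a mode calculation that keeps ties may help to make "all modes" concrete. `mode_multi_sketch` is a hypothetical name, and the sketch is not necessarily how the package computes its result.

```r
# Hypothetical helper, not part of the package: return every value that
# occurs with the highest frequency.
mode_multi_sketch <- function(x) {
  values <- unique(x)                     # distinct values, in order of appearance
  counts <- tabulate(match(x, values))    # frequency of each distinct value
  values[counts == max(counts)]
}

mode_multi_sketch(c(1, 2, 2, 3, 3))
# 2 3
mode_multi_sketch(c("a", "b", "a"))
# "a"
```

Taking only the first element of such a result would be one way to reduce ties to a single mode; whether the package resolves ties that way is not stated here.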
Extract Regular Expression Groups
Description
Extracts one or more regular expression capture groups from strings.
Usage
str_group_extract(string, pattern, group = NULL, nas = TRUE)
Arguments
string | string to extract from |
pattern | pattern with groups to match |
group | groups to extract |
nas | return NA values (TRUE) or filter them out (FALSE) |
Value
string vector or string matrix
Examples
strings <- paste(LETTERS, seq_along(LETTERS), sep = "_")
str_group_extract(strings, "([\\w])_(\\d+)")
str_group_extract(strings, "([\\w])_(\\d+)", 1)
str_group_extract(strings, "([\\w])_(\\d+)", 2)
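For readers who want to see the idea without the package, capture groups can also be pulled out with base R's `regexec()` and `regmatches()`. The sketch below is an assumption-laden illustration: `group_extract_sketch` is a hypothetical name, it always keeps non-matching strings as `NA`, and it does not reproduce the `nas` argument.

```r
# Hypothetical helper, not part of the package: return capture groups as a
# character matrix (one row per string), or as a vector for a single group.
group_extract_sketch <- function(string, pattern, group = NULL) {
  m <- regmatches(string, regexec(pattern, string, perl = TRUE))
  n_groups <- max(lengths(m), 1L) - 1L   # first element is the full match

  # Non-matching strings yield character(0); fill those rows with NA.
  rows <- lapply(m, function(x) {
    if (length(x) == 0L) rep(NA_character_, n_groups) else x[-1L]
  })
  res <- do.call(rbind, rows)

  if (is.null(group)) res else res[, group]
}

strings <- c("A_1", "B_2", "no match")
group_extract_sketch(strings, "([\\w])_(\\d+)")      # 3 x 2 character matrix
group_extract_sketch(strings, "([\\w])_(\\d+)", 1)   # "A" "B" NA
```

Selecting a single group drops the matrix to a plain character vector, which mirrors the "string vector or string matrix" distinction in the Value section.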
Time Stamps for File Names
Description
Generates ISO time stamps that are ready to be used in file names.
Usage
time_stamp(ts = Sys.time(), sep = c("-", "_", "_"))
Arguments
ts | one or more POSIX time stamps |
sep | separators to be used for formatting |
Value
Returns timestamp string in format yyyy-mm-dd_HH_MM_SS ready to be used safely in file names on various operating systems.
Examples
time_stamp()
time_stamp( Sys.time() - 10000 )
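The documented output format can be approximated with base R's `format()`. In this sketch `time_stamp_sketch` is a hypothetical name, and the mapping of the three `sep` elements (date separator, separator between date and time, time separator) is an assumption inferred from the format string in the Value section.

```r
# Hypothetical helper, not part of the package: build a strftime-style
# format from the separators and apply it to the time stamps.
time_stamp_sketch <- function(ts = Sys.time(), sep = c("-", "_", "_")) {
  # Assumed roles: sep[1] within the date, sep[2] between date and time,
  # sep[3] within the time.
  fmt <- paste0(
    "%Y", sep[1], "%m", sep[1], "%d",
    sep[2],
    "%H", sep[3], "%M", sep[3], "%S"
  )
  format(ts, fmt)
}

ts <- as.POSIXct("2020-09-11 18:57:01", tz = "UTC")
time_stamp_sketch(ts)
# "2020-09-11_18_57_01"
```

Avoiding colons and spaces is what makes such a string usable as part of a file name on common operating systems.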
URL Parameter Combinations
Description
Generate URL parameter combinations from sets of parameter values.
Usage
web_gen_param_list_expand(..., sep_1 = "=", sep_2 = "&")
Arguments
... | multiple vectors passed on as named arguments or a single list or a data.frame |
sep_1 | first separator to use between key and value |
sep_2 | second separator to use between key-value pairs |
Value
string vector with assembled query string parameter combinations
Examples
web_gen_param_list_expand(q = "beluga", lang = c("de", "en"))
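Expanding value combinations into query strings can be sketched with base R's `expand.grid()`. `param_expand_sketch` is a hypothetical name; the sketch only handles named vectors, performs no URL encoding, and the order of the combinations it produces is an assumption that may differ from the package.

```r
# Hypothetical helper, not part of the package: one query string per
# combination of the supplied parameter values.
param_expand_sketch <- function(..., sep_1 = "=", sep_2 = "&") {
  # All combinations of the named vectors, one row per combination.
  grid <- expand.grid(..., stringsAsFactors = FALSE)

  # Build "key=value" strings column by column.
  pairs <- Map(
    function(key, values) paste(key, values, sep = sep_1),
    names(grid), grid
  )

  # Join the key-value pairs of each row with the second separator.
  do.call(paste, c(unname(pairs), sep = sep_2))
}

param_expand_sketch(q = "beluga", lang = c("de", "en"))
# "q=beluga&lang=de" "q=beluga&lang=en"
```

Each additional vector multiplies the number of resulting strings, so two parameters with two values each would yield four query strings.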