Help for package summarytabl

Type:

Package

Title:

Generate Summary Tables for Continuous, Ordinal, and Categorical Data

Version:

0.1.0

Maintainer:

Ama Nyame-Mensah <ama@anyamemensah.com>

URL:

https://anyamemensah.github.io/summarytabl/

Description:

Provides functions for tabulating and summarizing continuous, ordinal, and categorical variables in data frames. The package was designed to streamline exploratory data analysis and simplify the creation of summary tables for reports and other purposes.

License:

MIT + file LICENSE

Encoding:

UTF-8

LazyData:

true

Imports:

dplyr, purrr, rlang, stats, tibble, tidyr

RoxygenNote:

7.3.3

Suggests:

knitr, rmarkdown, testthat (≥ 3.0.0)

VignetteBuilder:

knitr

Depends:

R (≥ 4.1.0)

NeedsCompilation:

Packaged:

2025-09-30 02:56:26 UTC; AmaNM

Author:

Ama Nyame-Mensah [aut, cre]

Repository:

CRAN

Date/Publication:

2025-10-06 08:00:02 UTC

summarytabl: Generate Summary Tables for Continuous, Ordinal, and Categorical Data

Description

Author(s)

Maintainer: Ama Nyame-Mensah ama@anyamemensah.com

Summarize a categorical variable by a grouping variable

Description

cat_group_tbl() presents frequency counts and percentages (count, percent) for nominal or categorical variables by some grouping variable. Relative frequencies and percentages of each level of the primary categorical variable (row_var) within each level of the grouping variable (col_var) can be returned. Missing data for either variable can be excluded from the calculations. By default, the table is returned in the long format.

Usage

cat_group_tbl(
  data,
  row_var,
  col_var,
  na.rm.row_var = FALSE,
  na.rm.col_var = FALSE,
  only = NULL,
  ignore = NULL,
  pivot = "longer"
)

Arguments

data

A data frame.

row_var

A character string of the name of a column in data containing categorical data. This is the primary categorical variable. When pivoted to the wider format, the categories of this variable will appear in the rows of the table.

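col_var

A character string of the name of a column in data containing categorical data. This is the grouping variable.

na.rm.row_var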

A logical value indicating whether missing values for row_var should be removed before calculations. Default is FALSE.

na.rm.col_var

A logical value indicating whether missing values for col_var should be removed before calculations. Default is FALSE.

only

A character string or vector of strings indicating the types of summary data to return. The default is NULL, which includes both counts and percentages. To return only one type, specify count or percent. Percentages are calculated column-wise, grouped by col_var.

ignore

A named character vector or list containing values to ignore from row_var and col_var.

pivot

A character string specifying the format of the returned summary table. The default is longer, which returns the data in long format. To return the data in wide format, use wider.

Value

A tibble displaying relative frequency counts and/or percentages of row_var, grouped by col_var. When the output is in wider format, columns prefixed with count_ and percent_ contain the frequency and proportion, respectively, for each distinct response value of row_var within each level of col_var.

Author(s)

Ama Nyame-Mensah

Examples

cat_group_tbl(data = nlsy,
              row_var = "gender",
              col_var = "bthwht",
              pivot = "wider",
              only = "count")

cat_group_tbl(data = nlsy,
              row_var = "birthord",
              col_var = "breastfed",
              pivot = "longer")
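
To illustrate what column-wise percentages mean here, the following base R sketch computes counts and within-group percentages for a small made-up data frame. The data frame toy and its column names are hypothetical, and the code is not the package's internal implementation.

```r
# Hypothetical toy data; the column names are illustrative only.
toy <- data.frame(
  row_var = c("a", "a", "b", "b", "b", "a"),
  col_var = c("x", "x", "x", "y", "y", "y")
)

# Cross-tabulate: rows are levels of row_var, columns are levels of col_var.
counts <- table(toy$row_var, toy$col_var)

# Column-wise percentages: within each level of col_var, values sum to 100.
percents <- prop.table(counts, margin = 2) * 100

counts
percents
```

Each column of percents sums to 100, which is the sense in which percentages are grouped by col_var.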

Summarize a categorical variable

Description

cat_tbl() presents frequency counts and percentages (count, percent) for nominal or categorical variables. Missing data can be excluded from the calculations.

Usage

cat_tbl(data, var, na.rm = FALSE, only = NULL, ignore = NULL)

Arguments

data

A data frame.

var

A character string of the name of a variable in data containing categorical data.

na.rm

A logical value indicating whether missing values should be removed before calculations. Default is FALSE.

only

A character string, or vector of character strings, of the types of summary data to return. Default is NULL, which returns both counts and percentages. To return only counts or percentages, use count or percent, respectively.

ignore

An optional vector that contains values to exclude from the data. Default is NULL, which includes all present values.

Value

A tibble displaying the relative frequency counts and/or percentages of var.

Author(s)

Ama Nyame-Mensah

Examples

cat_tbl(data = nlsy, var = "gender")

cat_tbl(data = nlsy, var = "race", only = "count")

cat_tbl(data = nlsy,
        var = "race",
        ignore = "Hispanic",
        only = "percent",
        na.rm = TRUE)
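
The combined effect of ignore and na.rm can be pictured with a base R sketch. It assumes that ignored and missing values are dropped before counts and percentages are computed; the vector x is made up, and the code only approximates the behavior described above.

```r
# Hypothetical vector with one missing value and one value to ignore.
x <- c("Black", "Hispanic", NA, "Black", "Non-Black, Non-Hispanic")

# Drop the ignored value, then drop missing values (as with na.rm = TRUE).
kept <- x[!(x %in% "Hispanic")]
kept <- kept[!is.na(kept)]

# Counts and percentages over the values that remain.
counts <- table(kept)
result <- data.frame(
  value = names(counts),
  count = as.vector(counts),
  percent = as.vector(counts) / sum(counts) * 100
)
result
```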

Check a named vector

Description

This function checks a named list or vector in four ways: whether it contains invalid values (such as NULL or NA), whether it has invalid names (such as missing or empty names), whether the number of valid names matches the number of supplied values, and whether the valid names match the supplied names. If any check fails, the default value is returned.

Usage

check_named_vctr(x, names, default)

Arguments

x

A named vector.

names

A character vector specifying the names to be matched.

default

The value to return if any check fails.

Value

Either the original object, x, or the default value.

Author(s)

Ama Nyame-Mensah

Examples


# returns NULL
check_named_vctr(x = c(one = 1, two = 2, 3), 
                 names = c("one", "two", "three"),
                 default = NULL)
                 
# returns x
check_named_vctr(x = list(one = 1, two = 2, three = 3), 
                 names = list("one", "two", "three"),
                 default = NULL)
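
The checks described above can be sketched in base R as follows. The function check_named_sketch is a hypothetical re-implementation written for illustration. It reflects one plausible reading of the description (for example, it compares names as sets) and is not the package's actual code.

```r
# Hypothetical sketch of the documented checks; not the package's code.
check_named_sketch <- function(x, names, default) {
  nms <- base::names(x)
  # Fail if there are no names, or if any name is missing or empty.
  if (is.null(nms) || anyNA(nms) || any(nms == "")) return(default)
  # Fail if any value is NULL or NA.
  bad_value <- vapply(x, function(v) is.null(v) || anyNA(v), logical(1))
  if (any(bad_value)) return(default)
  # Fail if the names do not line up with the expected names.
  expected <- unlist(names)
  if (length(nms) != length(expected) || !setequal(nms, expected)) {
    return(default)
  }
  x
}

# The third element is unnamed, so the default (NULL) comes back.
check_named_sketch(c(one = 1, two = 2, 3),
                   names = c("one", "two", "three"),
                   default = NULL)

# All names and values are valid, so the original list comes back.
check_named_sketch(list(one = 1, two = 2, three = 3),
                   names = list("one", "two", "three"),
                   default = NULL)
```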

Depressive Symptoms Data

Description

These data are a subset from the National Longitudinal Survey of Youth (NLSY) 1979 Children and Young Adults. The dataset includes information about depressive symptoms in children and young adults. The dataset has 11,551 observations and 12 variables.

For more information about the National Longitudinal Survey of Youth, visit https://www.nlsinfo.org/.

Usage

depressive

Format

A data.frame with 11,551 rows and 12 columns:

cid: Child identification number
race: race of child (1 = Hispanic, 2 = Black, 3 = Non-Black, Non-Hispanic)
gender: gender of child (1 = male, 2 = female)
yob: year of child's birth
dep_1: how often child feels sad and blue (1 = often, 2 = sometimes, 3 = hardly ever)
dep_2: how often child feels nervous, tense, or on edge (1 = often, 2 = sometimes, 3 = hardly ever)
dep_3: how often child feels happy (1 = often, 2 = sometimes, 3 = hardly ever)
dep_4: how often child feels bored (1 = often, 2 = sometimes, 3 = hardly ever)
dep_5: how often child feels lonely (1 = often, 2 = sometimes, 3 = hardly ever)
dep_6: how often child feels tired or worn out (1 = often, 2 = sometimes, 3 = hardly ever)
dep_7: how often child feels excited about something (1 = often, 2 = sometimes, 3 = hardly ever)
dep_8: how often child feels too busy to get everything (1 = often, 2 = sometimes, 3 = hardly ever)

Summarize continuous variables by group

Description

mean_group_tbl() presents descriptive statistics (mean, sd, minimum, maximum, number of non-missing observations) for interval (e.g., test scores) and ratio level (e.g., age) variables with the same variable stem by some grouping variable. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables 'belong_belongStem_w1' and 'belong_belongStem_w2' share the variable stem 'belong_belongStem' (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.

Usage

mean_group_tbl(
  data,
  var_stem,
  group,
  escape_stem = FALSE,
  ignore_stem_case = FALSE,
  group_type = "variable",
  group_name = NULL,
  escape_group = FALSE,
  ignore_group_case = FALSE,
  remove_group_non_alnum = TRUE,
  na_removal = "listwise",
  only = NULL,
  var_labels = NULL,
  ignore = NULL
)

Arguments

data

A data frame.

var_stem

A character string of a variable stem or the full name of a variable in data.

group

A character string of a variable in data or a pattern to use to search for variables in data.

escape_stem

A logical value indicating whether to escape var_stem. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.

group_type

A character string that defines the type of grouping variable. Should be one of pattern or variable. Default is variable, in which case the variable matching the group string will be searched for within data.

group_name

A character string piped to the final table to replace the name of group.

escape_group

A logical value indicating whether to escape the string supplied to group. Default is FALSE.

ignore_group_case

A logical value indicating whether group is case-insensitive. Default is FALSE.

remove_group_non_alnum

A logical value indicating whether to remove all non-alphanumeric characters (anything that is not a letter or number) from group. Default is TRUE.

na_removal

A character string specifying how to remove missing values. Should be one of pairwise or listwise. Default is listwise.

only

A character string or vector of character strings of the types of summary data to return. Default is NULL, which returns mean (mean), standard deviation (sd), minimum value (min), maximum value (max), and non-missing responses (nobs).

var_labels

An optional named character vector or list where each element maps labels to variable names. If any element is unnamed or if any labels do not match variables returned from data, all labels will be ignored and the table will be printed without them.

ignore

An optional named vector or list specifying values to exclude from the dataset and analysis. By default, NULL includes all available values. To omit values from variables returned by var_stem, use the provided stem as the name. To exclude values from both var_stem variables and a grouping variable in data, supply a list.

Value

A tibble presenting summary statistics (e.g., mean, standard deviation, minimum value, maximum, number of non-missing observations) for a set of variables sharing the same variable stem. The results are grouped by either a grouping variable in the data or by a pattern matched with variable names.

Author(s)

Ama Nyame-Mensah

Examples

mean_group_tbl(data = stem_social_psych,
               var_stem = "belong_welcomedStem",
               group = "_w\\d",
               group_type = "pattern",
               na_removal = "pairwise",
               var_labels = c(belong_welcomedStem_w1 = "I feel welcomed in STEM workplaces",
                              belong_welcomedStem_w2 = "I feel welcomed in STEM workplaces"),
               group_name = "wave")

mean_group_tbl(data = social_psy_data,
               var_stem = "belong",
               group = "gender",
               group_type = "variable",
               na_removal = "pairwise",
               var_labels = c(belong_1 = "I feel like I belong at this institution",
                              belong_2 = "I feel like part of the community",
                              belong_3 = "I feel valued by this institution"),
               group_name = "gender_identity")

grouped_data <-
  data.frame(
    symptoms.t1 = sample(c(1:5, -999), replace = TRUE, size = 50),
    symptoms.t2 = sample(c(NA, 1:5, -999), replace = TRUE, size = 50)
  )

mean_group_tbl(data = grouped_data,
               var_stem = "symptoms",
               group = ".t\\d",
               group_type = "pattern",
               escape_group = TRUE,
               na_removal = "listwise",
               ignore = c(symptoms = -999))
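
The way a variable stem and a group pattern can pick out columns and group labels is sketched below in base R. The column names are made up, and anchoring the stem at the start of the name is an assumption made for this illustration; the package may match differently.

```r
# Hypothetical column names following the stem + suffix convention.
cols <- c("id", "symptoms.t1", "symptoms.t2", "mood.t1")

# Columns whose names start with the stem "symptoms" (assumed anchoring).
stem_cols <- grep("^symptoms", cols, value = TRUE)

# Pull out the part of each name that matches the group pattern.
# The dot is escaped by hand here so that it matches a literal ".".
matches <- regmatches(stem_cols, regexpr("\\.t\\d", stem_cols))

# Strip non-alphanumeric characters, similar in spirit to
# remove_group_non_alnum = TRUE.
groups <- gsub("[^[:alnum:]]", "", matches)

stem_cols
groups
```

Without escaping, a dot in a regular expression matches any character, which is why arguments such as escape_stem and escape_group matter when names contain punctuation.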

Summarize continuous variables

Description

mean_tbl() presents descriptive statistics (mean, sd, minimum, maximum, number of non-missing observations) for interval (e.g., test scores) and ratio level (e.g., age) variables with the same variable stem. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables 'belong_belongStem_w1' and 'belong_belongStem_w2' share the variable stem 'belong_belongStem' (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.

Usage

mean_tbl(
  data,
  var_stem,
  escape_stem = FALSE,
  ignore_stem_case = FALSE,
  na_removal = "listwise",
  only = NULL,
  var_labels = NULL,
  ignore = NULL
)

Arguments

data

A data frame.

var_stem

A character string of a variable stem or the full name of a variable in data.

escape_stem

A logical value indicating whether to escape var_stem. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.

na_removal

A character string specifying how to remove missing values. Should be one of pairwise or listwise. Default is listwise.

only

A character string or vector of character strings of the kinds of summary statistics to return. Default is NULL, which returns mean (mean), standard deviation (sd), minimum value (min), maximum value (max), and non-missing responses (nobs).

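var_labels

An optional named character vector or list where each element maps labels to variable names. If any element is unnamed or if any labels do not match variables returned from data, all labels will be ignored and the table will be printed without them.

ignore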

An optional vector that contains values to exclude from the data. Default is NULL, which includes all present values.

Value

A tibble presenting summary statistics for a series of continuous variables with the same variable stem.

Author(s)

Ama Nyame-Mensah

Examples


mean_tbl(data = social_psy_data,
         var_stem = "belong")

mean_tbl(data = social_psy_data,
         var_stem = "belong",
         na_removal = "pairwise",
         var_labels = c(belong_1 = "I feel like I belong at this institution",
                        belong_2 = "I feel like part of the community",
                        belong_3 = "I feel valued by this institution"))
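
The difference between listwise and pairwise removal of missing values can be seen in a small base R sketch. The data frame toy is hypothetical, and the code illustrates the two strategies in general terms rather than the package's internals.

```r
# Hypothetical items sharing the stem "belong".
toy <- data.frame(
  belong_1 = c(1, 2, NA, 4),
  belong_2 = c(5, NA, 3, 4)
)

# Listwise: keep only rows that are complete on every item, then summarize.
complete <- toy[stats::complete.cases(toy), ]
listwise_nobs <- vapply(complete, function(v) sum(!is.na(v)), integer(1))
listwise_mean <- vapply(complete, mean, numeric(1))

# Pairwise: drop missing values separately for each item.
pairwise_nobs <- vapply(toy, function(v) sum(!is.na(v)), integer(1))
pairwise_mean <- vapply(toy, mean, numeric(1), na.rm = TRUE)

rbind(listwise_nobs, pairwise_nobs)
rbind(listwise_mean, pairwise_mean)
```

Listwise removal gives every item the same number of observations, while pairwise removal keeps as many observations per item as possible.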

National Longitudinal Survey of Youth (NLSY) Data

Description

These data are a subset from the National Longitudinal Survey of Youth (NLSY) 1979 Children and Young Adults. The data contain 2,976 observations and 11 variables.

For more information about the National Longitudinal Survey of Youth, visit https://www.nlsinfo.org/.

Usage

nlsy

Format

A tibble with 2,976 rows and 11 columns:

CID: Child identification number
race: race of child (Hispanic, Black, Non-Black, Non-Hispanic)
gender: gender of child (1 = male, 0 = female)
birthord: birth order of child
magebirth: Age of mother at birth of child
bthwht: whether child was born low birth weight (1 = yes, 0 = no)
breastfed: whether child was breastfed (1 = yes, 0 = no)
medu: Highest grade completed by child’s mother
math: PIAT Math Standard Score
read: PIAT Reading Recognition Standard Score
hhnum: Number of household members in household

Summarize multiple response variables by group

Description

select_group_tbl() presents frequency counts and percentages (count, percent) for binary (e.g., Unselected/Selected) and ordinal (e.g., strongly disagree to strongly agree) variables with the same variable stem by some grouping variable. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables belong_belongStem_w1 and belong_belongStem_w2 share the variable stem belong_belongStem (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.

Usage

select_group_tbl(
  data,
  var_stem,
  group,
  escape_stem = FALSE,
  ignore_stem_case = FALSE,
  group_type = "variable",
  group_name = NULL,
  escape_group = FALSE,
  ignore_group_case = FALSE,
  remove_group_non_alnum = TRUE,
  na_removal = "listwise",
  pivot = "longer",
  only = NULL,
  var_labels = NULL,
  ignore = NULL
)

Arguments

data

A data frame.

var_stem

A character string of a variable stem or the full name of a variable in data.

group

A character string of a variable in data or a pattern to use to search for variables in data.

escape_stem

A logical value indicating whether to escape var_stem. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.

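group_type

A character string that defines the type of grouping variable. Should be one of pattern or variable. Default is variable, in which case the variable matching the group string will be searched for within data.

group_name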

A character string piped to the final table to replace the name of group.

escape_group

A logical value indicating whether to escape the string supplied to group. Default is FALSE.

ignore_group_case

A logical value indicating whether group is case-insensitive. Default is FALSE.

remove_group_non_alnum

A logical value indicating whether to remove all non-alphanumeric characters (anything that is not a letter or number) from group. Default is TRUE.

na_removal

A character string specifying how to remove missing values. Should be one of pairwise or listwise. Default is listwise.

pivot

A character string specifying the format of the returned summary table. The default is longer, which returns the data in long format. To return the data in wide format, use wider.

only

A character string or vector of character strings of the kinds of summary data to return. Default is NULL, which returns counts (count) and percentages (percent).

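var_labels

An optional named character vector or list where each element maps labels to variable names. If any element is unnamed or if any labels do not match variables returned from data, all labels will be ignored and the table will be printed without them.

ignore

An optional named vector or list specifying values to exclude from the dataset and analysis. By default, NULL includes all available values. To omit values from variables returned by var_stem, use the provided stem as the name. To exclude values from both var_stem variables and a grouping variable in data, supply a list.

Value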

A tibble displaying frequency counts and/or percentages for each value of a set of variables sharing the same variable stem. The results are grouped by either a grouping variable in the data or by a pattern matched with variable names. When the output is in the wider format, columns beginning with count_value and percent_value prefixes report the count and percentage, respectively, for each distinct response value of the variable within each group.

Author(s)

Ama Nyame-Mensah

Examples

select_group_tbl(data = stem_social_psych,
                 var_stem = "belong_belong",
                 group = "\\d",
                 group_type = "pattern",
                 group_name = "wave",
                 na_removal = "pairwise",
                 pivot = "wider",
                 only = "count")

tas_recoded <-
  tas |>
  dplyr::mutate(sex = dplyr::case_when(
    sex == 1 ~ "female",
    sex == 2 ~ "male",
    TRUE ~ NA)) |>
  dplyr::mutate(dplyr::across(
    .cols = dplyr::starts_with("involved_"),
    .fns = ~ dplyr::case_when(
      .x == 1 ~ "selected",
      .x == 0 ~ "unselected",
      TRUE ~ NA)
  ))

select_group_tbl(data = tas_recoded,
                 var_stem = "involved_",
                 group = "sex",
                 group_type = "variable",
                 na_removal = "pairwise",
                 pivot = "wider")

depressive_recoded <-
  depressive |>
  dplyr::mutate(sex = dplyr::case_when(
    sex == 1 ~ "male",
    sex == 2 ~ "female",
    TRUE ~ NA)) |>
  dplyr::mutate(dplyr::across(
    .cols = dplyr::starts_with("dep_"),
    .fns = ~ dplyr::case_when(
      .x == 1 ~ "often",
      .x == 2 ~ "sometimes",
      .x == 3 ~ "hardly",
      TRUE ~ NA
    )
  ))

select_group_tbl(data = depressive_recoded,
                 var_stem = "dep",
                 group = "sex",
                 group_type = "variable",
                 na_removal = "listwise",
                 pivot = "wider",
                 only = "percent",
                 var_labels =
                   c("dep_1" = "how often child feels sad and blue",
                     "dep_2" = "how often child feels nervous, tense, or on edge",
                     "dep_3" = "how often child feels happy",
                     "dep_4" = "how often child feels bored",
                     "dep_5" = "how often child feels lonely",
                     "dep_6" = "how often child feels tired or worn out",
                     "dep_7" = "how often child feels excited about something",
                     "dep_8" = "how often child feels too busy to get everything"))

Summarize multiple response variables

Description

select_tbl() presents frequency counts and percentages (count, percent) for binary (e.g., Unselected/Selected) and ordinal (e.g., strongly disagree to strongly agree) variables with the same variable stem. A variable stem is a common prefix found in related variable names, often corresponding to similar survey items, that represents a shared concept before unique identifiers (like time points) are added. For example, in the stem_social_psych dataset, the two variables belong_belongStem_w1 and belong_belongStem_w2 share the variable stem belong_belongStem (e.g., "I feel like I belong in STEM"), with suffixes (_w1, _w2) indicating different measurement waves. By default, missing data are excluded from the calculations in a listwise fashion.

Usage

select_tbl(
  data,
  var_stem,
  escape_stem = FALSE,
  ignore_stem_case = FALSE,
  na_removal = "listwise",
  pivot = "longer",
  only = NULL,
  var_labels = NULL,
  ignore = NULL
)

Arguments

data

A data frame.

var_stem

A character string of a variable stem or the full name of a variable in data.

escape_stem

A logical value indicating whether to escape var_stem. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.

na_removal

A character string specifying how to remove missing values. Should be one of pairwise or listwise. Default is listwise.

pivot

A character string specifying the format of the returned summary table. The default is longer, which returns the data in long format. To return the data in wide format, use wider.

only

A character string or vector of character strings of the kinds of summary data to return. Default is NULL, which returns counts (count) and percentages (percent).

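var_labels

An optional named character vector or list where each element maps labels to variable names. If any element is unnamed or if any labels do not match variables returned from data, all labels will be ignored and the table will be printed without them.

ignore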

An optional vector that contains values to exclude from the data. Default is NULL, which includes all present values.

Value

A tibble displaying frequency counts and/or percentages for each value of a set of variables sharing the same variable stem. When the output is in the wider format, columns beginning with count_value and percent_value prefixes report the count and percentage, respectively, for each distinct response value of the variable.

Author(s)

Ama Nyame-Mensah

Examples

select_tbl(data = tas,
           var_stem = "involved_",
           na_removal = "pairwise")

select_tbl(data = depressive,
           var_stem = "dep",
           na_removal = "listwise",
           pivot = "wider",
           only = "percent")

var_label_example <-
  c("dep_1" = "how often child feels sad and blue",
    "dep_2" = "how often child feels nervous, tense, or on edge",
    "dep_3" = "how often child feels happy",
    "dep_4" = "how often child feels bored",
    "dep_5" = "how often child feels lonely",
    "dep_6" = "how often child feels tired or worn out",
    "dep_7" = "how often child feels excited about something",
    "dep_8" = "how often child feels too busy to get everything")

select_tbl(data = depressive,
           var_stem = "dep",
           na_removal = "pairwise",
           pivot = "longer",
           var_labels = var_label_example)

select_tbl(data = depressive,
           var_stem = "dep",
           na_removal = "pairwise",
           pivot = "wider",
           only = "count",
           var_labels = var_label_example)
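
The long and wide layouts can be compared with a base R sketch. The data frame toy is made up, and the wide column names are built to resemble the count_value prefix described under Value; the exact names returned by the package may differ.

```r
# Hypothetical items sharing the stem "dep".
toy <- data.frame(
  dep_1 = c(1, 2, 2, 3),
  dep_2 = c(3, 3, 1, 2)
)

# Long format: one row per variable/value combination.
long <- as.data.frame(
  table(variable = rep(names(toy), each = nrow(toy)),
        value = unlist(toy, use.names = FALSE)),
  responseName = "count",
  stringsAsFactors = FALSE
)

# Wide format: one row per variable, one count column per response value.
wide <- reshape(long, idvar = "variable", timevar = "value",
                direction = "wide", sep = "_value_")

long
wide
```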

Social Psychological (Generated) Data

Description

These data were generated to produce social psychological data applicable to real-world contexts.

Usage

social_psy_data

Format

A data.frame with 10,200 rows and 17 columns:

id: participant id number
belong_1: I feel like I belong at this institution (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
belong_2: I feel like part of the community (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
belong_3: I feel valued by this institution (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
identity_1: This institution is a big part of who I am (1=Strongly Disagree,2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
identity_2: I feel comfortable being myself in this setting (1=Strongly Disagree,2=Disagree,3=Neither agree nor disagree,4=Agree, 5=Strongly Agree)
identity_3: This institution is a big part of who I am (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
identity_4: I care about doing well at this institution (1=Strongly Disagree, 2=Disagree,3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_1: I am confident about A (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_2: I am confident about B (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_3: I am confident about C (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_4: I am confident about D (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_5: I am confident about E (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_6: I am confident about F (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
selfEfficacy_7: I am confident about G (1=Strongly Disagree,2=Disagree, 3=Neither agree nor disagree,4=Agree,5=Strongly Agree)
gender: Participant's gender identity (1=Woman,2=Man,3=Non-binary, 4=Self-identify,5=Transgender,6=Gender-queer/non-conforming)
citizen: Participant's citizenship status (1=U.S. citizen,2=Non-U.S. citizen with permanent residency,3=Non-U.S. citizen with temporary visa,4=Other)

STEM Social Psychological (Generated) Data

Description

These data were generated to produce social psychological data applicable to a subset of college students participating in a Science, Technology, Engineering, and Mathematics (STEM) intervention program.

Usage

stem_social_psych

Format

A data.frame with 786 rows and 37 columns:

id: student id number
belong_belongStem_w1: I feel like I belong in STEM (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
belong_outsiderStem_w1: I feel like an outsider in STEM (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
identity_identityStem_w1: STEM is a big part of who I am. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
belong_welcomedStem_w1: I feel welcomed in STEM workplaces (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
identity_noCommonStem_w1: I do not have much in common with the other students in my STEM classes. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
selfEfficacy_passStemCourses_w1: pass my STEM courses. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_learnConcepts_w1: learn the foundations and concepts of scientific thinking. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
selfEfficacy_stemField_w1: do well in a stem-related field. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
selfEfficacy_learnScience_w1: quickly learn new science areas, systems, techniques or concepts on my own. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
selfEfficacy_contributeProject_w1: contribute to a science project. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_commScience_w1: clearly communicate scientific problems and findings to varied audiences (1=Strongly disagree,2=Somewhat disagree, 3=Neither disagree nor agree, 4=Somewhat agree,5=Strongly agree)
selfEfficacy_scientist_w1: become a scientist. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
selfEfficacy_completeUG_w1: complete an undergraduate STEM degree. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_admitGrad_w1: get admitted to a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_successGrad_w1: be successful in a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
belong_belongStem_w2: I feel like I belong in STEM (1=Strongly disagree, 2=Somewhat disagree, 3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
belong_outsiderStem_w2: I feel like an outsider in STEM. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
identity_identityStem_w2: STEM is a big part of who I am. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
belong_welcomedStem_w2: I feel welcomed in STEM workplaces. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
identity_noCommonStem_w2: I do not have much in common with the other students in my STEM classes. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
selfEfficacy_passStemCourses_w2: pass my STEM courses. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_learnConcepts_w2: learn the foundations and concepts of scientific thinking. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
selfEfficacy_stemField_w2: do well in a stem-related field. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
selfEfficacy_learnScience_w2: quickly learn new science areas, systems, techniques or concepts on my own. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree, 4=Somewhat agree, 5=Strongly agree)
selfEfficacy_contributeProject_w2: contribute to a science project. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_commScience_w2: clearly communicate scientific problems and findings to varied audiences (1=Strongly disagree,2=Somewhat disagree, 3=Neither disagree nor agree, 4=Somewhat agree,5=Strongly agree)
selfEfficacy_scientist_w2: become a scientist. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree,5=Strongly agree)
selfEfficacy_completeUG_w2: complete an undergraduate STEM degree. (1=Strongly disagree, 2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_admitGrad_w2: get admitted to a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
selfEfficacy_successGrad_w2: be successful in a graduate STEM program. (1=Strongly disagree,2=Somewhat disagree,3=Neither disagree nor agree,4=Somewhat agree, 5=Strongly agree)
is_male: Participant's current sex (0=Not Male,1=Male)
has_disability: Whether participant has a disability (0=No, 1=Yes)
firstGen: Whether participant is a first generation college student (0=No, 1=Yes)
stemMajor: Whether participant is a STEM Major (0=No, 1=Yes)
expLearning: Whether student has participated in an experiential learning program, such as an internship, research, or leadership opportunity. (0=No, 1=Yes)
urm: Whether participant is Asian, Middle Eastern/Arab or White (0) vs. Black, Indigenous, Hispanic/Latino, or Mixed Race (1)

Panel Study of Income Dynamics (PSID) Transition into Adulthood Supplement (TAS) Data

Description

These data are a subset from the Panel Study of Income Dynamics (PSID) Transition into Adulthood Supplement. The data contain 2,526 observations and 8 variables.

For more information about the Panel Study of Income Dynamics, visit https://psidonline.isr.umich.edu/CDS/default.aspx.

Usage

tas

Format

A tibble with 2,526 rows and 8 columns:

pid: personal identification number
sex: sex of individual (1 = female, 2 = male)
involved_arts: whether the individual participated in any organized activities related to art, music, or the theater in the last 12 months (1 = yes, 0 = no)
involved_sports: whether the individual was a member of any athletic or sports teams in the last 12 months (1 = yes, 0 = no)
involved_schoolClubs: whether the individual was involved with any high school or college clubs or student government in the last 12 months (1 = yes, 0 = no)
involved_election: whether the individual voted in the national election in November 2016 that was held to elect the President (1 = yes, 0 = no)
involved_socialActionGrps: whether the individual was involved in any political groups, solidarity or ethnic-support groups or social-action groups in the last 12 months (1 = yes, 0 = no)
involved_volunteer: whether the individual was involved in any unpaid volunteer or community service work in the last 12 months (1 = yes, 0 = no)
