Help for package dauR

Version:

1.0

Title:

Datasets for "Sampling and Data Analysis Using R: Theory and Practice"

Maintainer:

Professor Dr. Mohammad Shahidul Islam <shahed.stat@gmail.com>

Description:

Provides several datasets used throughout the book "Sampling and Data Analysis Using R: Theory and Practice" by Islam (2025, ISBN:978-984-35-8644-5). The datasets support teaching and learning of statistical concepts such as sampling methods, descriptive analysis, estimation and basic data handling. These curated data objects allow instructors, students and researchers to reproduce examples, practice data manipulation and perform hands-on analysis using R.

License:

CC BY 4.0

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.3.3

Depends:

R (≥ 3.5)

NeedsCompilation:

Packaged:

2025-11-30 14:47:35 UTC; shahe

Author:

Professor Dr. Mohammad Shahidul Islam [aut, cre]

Repository:

CRAN

Date/Publication:

2025-12-04 15:10:08 UTC

This dataset was used in different chapters of the book "Sampling and Data Analysis Using R: Theory and Practice"

Description

This dataset consists of a sample of eighty laborers from a large factory. Four variables are reported: Gender, Age (in years), Diastolic blood pressure and BMI.

Usage

Health

Format

A data frame with 80 rows and 4 variables:

Gender: Gender of individuals
Age: Age of the respondents
Diastolic: Diastolic blood pressure of the respondents
BMI: BMI of the respondents

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(Health, package = "dauR")
head(Health)
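
As a minimal sketch of the descriptive analysis this dataset supports, the code below computes group means by gender. It runs on a simulated stand-in (`health_sim`, a hypothetical name) that only mimics the documented column names; the values and the gender labels are assumptions, not the packaged data.

```r
# Hypothetical stand-in with the documented columns; values are simulated
# and the Gender labels are an assumption, not taken from the package.
set.seed(8)
health_sim <- data.frame(
  Gender    = sample(c("Female", "Male"), 80, replace = TRUE),
  Age       = sample(20:60, 80, replace = TRUE),
  Diastolic = round(rnorm(80, mean = 80, sd = 10)),
  BMI       = round(rnorm(80, mean = 24, sd = 3), 1)
)

# Mean of each numeric variable within each gender
by_gender <- aggregate(cbind(Age, Diastolic, BMI) ~ Gender,
                       data = health_sim, FUN = mean)
print(by_gender)
```

Replacing `health_sim` with `Health` should give the same kind of summary for the real data, provided the column names match the Format section.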

This dataset corresponds to a case study in the book "Sampling and Data Analysis Using R: Theory and Practice"

Description

A family library contains 1100 books. The owner is interested in exploring some features of the existing books through a short survey. Thirty books have been randomly selected, and four characteristics have been measured: the number of pages (Page), weight in grams (Weight), surface area in square inches (Surface), and type of each book (Type).

Usage

case_study

Format

A data frame with 30 rows and 4 variables:

Page: Number of pages for each book.
Weight: Weight of the book in grams.
Surface: Surface area of each book in square inches.
Type: Categorical variable defining whether this is a religious, science, or story book.

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(case_study, package = "dauR")
head(case_study)
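
Because 30 books are drawn from a library of 1100, a simple random sampling estimate with a finite population correction is a natural illustration. The sketch below assumes simple random sampling without replacement and uses simulated page counts (`page_sim`, a hypothetical vector) in place of `case_study$Page`.

```r
# Simulated page counts standing in for case_study$Page (assumption).
set.seed(7)
N <- 1100                      # population size: books in the library
n <- 30                        # sample size
page_sim <- round(runif(n, min = 80, max = 600))

ybar      <- mean(page_sim)                          # sample mean
se_mean   <- sqrt((1 - n / N) * var(page_sim) / n)   # SE with finite population correction
total_hat <- N * ybar                                # estimated total pages in the library
se_total  <- N * se_mean

# Approximate 95% confidence interval for the mean number of pages
ci_mean <- ybar + c(-1, 1) * qt(0.975, df = n - 1) * se_mean

print(c(mean = ybar, se = se_mean, total = total_hat, se_total = se_total))
print(ci_mean)
```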

This dataset demonstrates the concept of survival analysis

Description

This dataset describes an engineering firm that monitors industrial machines over time to understand how long they operate before experiencing a mechanical failure. Each machine is tracked from installation until either it fails or the study period ends. Each record includes the machine’s age (in years) and its maintenance type (regular or on-demand).

Usage

eng_data

Format

A data frame with 223 rows and 4 variables:

time: Time to malfunction (in days)
status: Failure indicator (1 = still operational/censored, 2 = malfunctioned)
age: Age of machine at installation (in years)
maintenance: Maintenance type (1 = On-demand, 2 = Regular)

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(eng_data, package = "dauR")
head(eng_data)
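
One possible survival analysis for data of this shape is a Kaplan-Meier fit by maintenance type followed by a log-rank test. The sketch below uses the recommended `survival` package and a simulated stand-in (`eng_sim`, a hypothetical name) with the documented columns and coding. The event is written explicitly as `status == 2` so that the 1/2 coding is unambiguous.

```r
library(survival)

# Simulated stand-in with the documented columns and coding (assumption:
# values are invented for illustration, not taken from the package).
set.seed(1)
n <- 223
eng_sim <- data.frame(
  time        = ceiling(rexp(n, rate = 1 / 400)),            # days to malfunction
  status      = sample(c(1, 2), n, replace = TRUE,
                       prob = c(0.3, 0.7)),                  # 1 = censored, 2 = malfunctioned
  age         = round(runif(n, min = 0, max = 10), 1),       # years
  maintenance = sample(c(1, 2), n, replace = TRUE)           # 1 = on-demand, 2 = regular
)

# Kaplan-Meier survival curves, one per maintenance type
km_fit <- survfit(Surv(time, status == 2) ~ maintenance, data = eng_sim)
print(km_fit)

# Log-rank test comparing the two maintenance types
lr_test <- survdiff(Surv(time, status == 2) ~ maintenance, data = eng_sim)
print(lr_test)
```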

This dataset refers to the concept of ordinal data

Description

A survey on the teaching-learning environment of an institution was conducted among a group of 15 students. One part of the survey, comprising five questions, asked for opinions about one recently completed course.

Usage

likert_data

Format

A data frame with 15 rows and 5 variables, all on a Likert scale:

good_exp: Question about whether this was a good experience
disturb_free: Question about whether the respondent enjoyed a disturbance-free internet connection
support: Question about whether the respondent got support from the institution
instructors: Question about whether the instructors were good at distance teaching and new technology
better: Question about whether it is better than in-class teaching

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(likert_data, package = "dauR")
head(likert_data)
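
Ordinal items are usually summarised with frequency counts and medians rather than means. The sketch below assumes a 1 to 5 numeric coding of the responses, which this help page does not state, and uses a simulated stand-in (`likert_sim`, a hypothetical name) with the documented item names.

```r
# Simulated stand-in; the 1-5 coding is an assumption about likert_data.
set.seed(6)
items <- c("good_exp", "disturb_free", "support", "instructors", "better")
likert_sim <- as.data.frame(replicate(5, sample(1:5, 15, replace = TRUE)))
names(likert_sim) <- items

# Response counts per item; fixing the levels keeps unused categories as zeros
counts <- sapply(likert_sim, function(x) table(factor(x, levels = 1:5)))
print(counts)

# Median response per item, a summary suited to ordinal data
medians <- sapply(likert_sim, median)
print(medians)
```

If the packaged responses are stored as labelled factors instead of integers, the `levels` argument would need to list those labels in order.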

This dataset demonstrates the concept of Mediation, Moderation & Confounding

Description

The dataset contains 200 observations and five variables, namely reading_time, vocab, test_score, access_resources and read_motiv.

Usage

reading

Format

A data frame with 200 rows and five variables:

reading_time: Time spent in reading
vocab: Measure of richness of vocabulary
test_score: Score from the
access_resources: Access to Resources
read_motiv: Measure of motivation for reading

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(reading, package = "dauR")
head(reading)

This dataset is used to illustrate the concept of a contingency table

Description

This dataset shows the smoking habits of 198 drivers of three types of vehicles: Bus, Truck and Taxi. The aim is to find whether there is any association between smoking habit and occupation type, represented here by the type of vehicle driven.

Usage

smoke_class

Format

A data frame with 198 rows and 2 variables:

Class: Type of vehicle driven
SmokingStatus: Categorical variable showing whether the driver is a smoker or a non-smoker

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(smoke_class, package = "dauR")
head(smoke_class)
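
The association question in the Description can be examined with a contingency table and Pearson's chi-squared test. The sketch below uses a simulated stand-in (`smoke_sim`, a hypothetical name); the level labels for `SmokingStatus` are assumptions, since the help page does not list them.

```r
# Simulated stand-in; SmokingStatus labels are assumed for illustration.
set.seed(2)
smoke_sim <- data.frame(
  Class         = sample(c("Bus", "Truck", "Taxi"), 198, replace = TRUE),
  SmokingStatus = sample(c("Smoker", "Non-smoker"), 198, replace = TRUE)
)

# 3 x 2 contingency table of vehicle type by smoking status
tab <- table(smoke_sim$Class, smoke_sim$SmokingStatus)
print(tab)

# Pearson chi-squared test of independence, with (3 - 1) * (2 - 1) = 2 degrees of freedom
res <- chisq.test(tab)
print(res)
```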

This dataset demonstrates the concepts of PCA and factor analysis

Description

The dataset contains nine variables on 20 recent graduates from universities and colleges. For a high-school job, the candidates sat for exams in 7 subjects and then appeared for an oral (viva-voce) exam. The subjects are Math, Physics, Chemistry, Statistics, Bengali literature, English literature and History.

Usage

students_data

Format

A data frame with 20 rows and 9 variables:

ID: Serial number
Group: Whether a student is from science or humanities background
Math: Score in Mathematics
Physics: Score in Physics
Chemistry: Score in Chemistry
Statistics: Score in Statistics
English: Score in English
Bengali: Score in Bengali
History: Score in History
Viva: Score in Viva-voce

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(students_data, package = "dauR")
head(students_data)
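
A principal component analysis of the seven subject scores might look like the sketch below. It uses `prcomp()` on standardised scores and a simulated stand-in (`scores_sim`, a hypothetical name) that carries only the documented subject columns. Leaving out ID, Group and Viva is a choice made for this illustration, not something the help page prescribes.

```r
# Simulated stand-in for the seven subject columns (values are invented).
set.seed(5)
subjects <- c("Math", "Physics", "Chemistry", "Statistics",
              "English", "Bengali", "History")
scores_sim <- as.data.frame(
  matrix(round(runif(20 * 7, min = 30, max = 100)),
         nrow = 20, dimnames = list(NULL, subjects))
)

# PCA on centred and scaled scores, i.e. on the correlation matrix
pca <- prcomp(scores_sim, center = TRUE, scale. = TRUE)
print(summary(pca))

# Loadings of the first two components
print(round(pca$rotation[, 1:2], 2))
```

With scaled variables the component variances sum to the number of variables (seven here), which is a convenient sanity check on the output.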

This dataset illustrates the application of two-way ANOVA

Description

A researcher seeks to investigate whether an individual’s life satisfaction (happiness) is associated with gender and working status. The variable Gender includes two categories: Male and Female, while Working_Status comprises three categories: Self-employed, Student and Job. A random sample of ten participants was selected from each combination of gender and working status, giving 60 observations. Life satisfaction was measured on a scale ranging from 0 to 100, with higher scores indicating greater happiness. Therefore, the dependent variable is life satisfaction (happiness), and the independent variables are gender and working status.

Usage

twoway

Format

A data frame with 60 rows and 3 variables:

Happiness: Measurement of happiness
Gender: Gender of individuals
Working_Status: Working status of individuals. The classes are "Job", "Self_employed" and "Student"

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(twoway, package = "dauR")
head(twoway)
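
A two-way ANOVA with interaction for this design can be sketched with `aov()`. The code below builds a simulated balanced stand-in (`twoway_sim`, a hypothetical name) with ten observations in each of the six Gender by Working_Status cells. The happiness scores are random, so the fitted effects carry no meaning.

```r
# Simulated balanced design: 10 replicates x 2 genders x 3 working statuses.
# Scores are random placeholders, not the packaged data.
set.seed(3)
twoway_sim <- expand.grid(
  rep            = 1:10,
  Gender         = c("Male", "Female"),
  Working_Status = c("Job", "Self_employed", "Student")
)
twoway_sim$Happiness <- round(runif(nrow(twoway_sim), min = 0, max = 100))

# Main effects of both factors plus their interaction
fit <- aov(Happiness ~ Gender * Working_Status, data = twoway_sim)
anova_table <- summary(fit)[[1]]
print(anova_table)
```

The degrees of freedom should be 1 for Gender, 2 for Working_Status, 2 for the interaction and 54 for the residuals.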

The datasets weight1 and weight2 correspond to paired observations

Description

This dataset shows the weights of 16 individuals before taking medication.

Usage

weight1

Format

A vector of 16 observations:

weight1: weight of individuals

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(weight1, package = "dauR")
head(weight1)

The datasets weight1 and weight2 correspond to paired observations

Description

This dataset shows the weights of 16 individuals after taking medication.

Usage

weight2

Format

A vector of 16 observations:

weight2: weight of individuals

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(weight2, package = "dauR")
head(weight2)
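
Since weight1 and weight2 are measured on the same 16 individuals, a paired t-test is one way to compare them. The sketch below uses simulated vectors (`before_sim` and `after_sim`, hypothetical names) in place of the packaged data and assumes the two vectors are stored in the same individual order.

```r
# Simulated before/after weights for 16 individuals (invented values).
set.seed(4)
before_sim <- round(rnorm(16, mean = 70, sd = 8), 1)
after_sim  <- round(before_sim - rnorm(16, mean = 1, sd = 2), 1)

# Paired t-test on the within-individual differences (16 - 1 = 15 df)
res <- t.test(before_sim, after_sim, paired = TRUE)
print(res)

# Equivalent view: a one-sample test on the differences
diffs <- before_sim - after_sim
print(mean(diffs))
```

With the real data, the call would be `t.test(weight1, weight2, paired = TRUE)`.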