Version: 1.0
Title: Datasets for "Sampling and Data Analysis Using R: Theory and Practice"
Maintainer: Professor Dr. Mohammad Shahidul Islam <shahed.stat@gmail.com>
Description: Provides several datasets used throughout the book "Sampling and Data Analysis Using R: Theory and Practice" by Islam (2025, ISBN:978-984-35-8644-5). The datasets support teaching and learning of statistical concepts such as sampling methods, descriptive analysis, estimation and basic data handling. These curated data objects allow instructors, students and researchers to reproduce examples, practice data manipulation and perform hands-on analysis using R.
License: CC BY 4.0
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.3
Depends: R (>= 3.5)
NeedsCompilation: no
Packaged: 2025-11-30 14:47:35 UTC; shahe
Author: Professor Dr. Mohammad Shahidul Islam [aut, cre]
Repository: CRAN
Date/Publication: 2025-12-04 15:10:08 UTC

This dataset was used in different chapters of the book "Sampling and Data Analysis Using R: Theory and Practice"

Description

This dataset consists of a sample of eighty laborers from a large factory. Four variables are reported: Gender, Age (in years), Diastolic blood pressure and BMI.

Usage

Health

Format

A data frame with 80 rows and 4 variables:

Gender

Gender of individuals

Age

Age of the respondents

Diastolic

Diastolic blood pressure of the respondents

BMI

BMI of the respondents

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(Health, package = "dauR")
head(Health)
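
A natural next step after `head()` is a group-wise summary of the three numeric variables. The sketch below is not taken from the book. It runs on a small made-up stand-in that only borrows the documented column names; the values and the "Male"/"Female" labels are assumptions, chosen so that the code runs without the package installed.

```r
# Made-up stand-in with the documented columns of Health.
# Values and gender labels are illustrative only, not the package data.
set.seed(1)
health_demo <- data.frame(
  Gender    = rep(c("Male", "Female"), each = 5),
  Age       = round(runif(10, min = 20, max = 60)),
  Diastolic = round(rnorm(10, mean = 80, sd = 8)),
  BMI       = round(rnorm(10, mean = 24, sd = 3), 1)
)

# Mean of each numeric variable within each gender group
group_means <- aggregate(cbind(Age, Diastolic, BMI) ~ Gender,
                         data = health_demo, FUN = mean)
print(group_means)
```

With dauR installed, passing `data = Health` in place of `health_demo` should produce the same kind of table for the real data.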

This dataset corresponds to a case study in the book "Sampling and Data Analysis Using R: Theory and Practice"

Description

A family library contains 1100 books. The owner is interested in exploring some features of the existing books through a short survey. Thirty books have been randomly selected, and four characteristics have been measured: the number of pages (Page), weight in grams (Weight), surface area in square inches (Surface), and type of each book (Type).

Usage

case_study

Format

A data frame with 30 rows and 4 variables:

Page

Number of pages for each book.

Weight

Weight of the book in grams.

Surface

Surface area of each book in square inches.

Type

Categorical variable defining whether this is a religious, science, or story book.

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(case_study, package = "dauR")
head(case_study)
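
The description above gives a sample of n = 30 books from a population of N = 1100, which is the setting for estimating a population mean with a finite population correction. The sketch below assumes simple random sampling without replacement; `srs_mean()` is a hypothetical helper written for this illustration, and the page counts are made-up stand-ins for `case_study$Page`.

```r
# Estimate a population mean from a simple random sample drawn without
# replacement, applying the finite population correction (1 - n/N).
# srs_mean() is a hypothetical helper, not a function from dauR.
srs_mean <- function(y, N, conf_level = 0.95) {
  n <- length(y)
  stopifnot(n >= 2, N >= n)
  est <- mean(y)
  se <- sqrt((1 - n / N) * var(y) / n)
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  c(estimate = est, std_error = se,
    lower = est - t_crit * se, upper = est + t_crit * se)
}

# Made-up page counts standing in for case_study$Page (30 books)
set.seed(2)
page_demo <- round(runif(30, min = 80, max = 600))

res <- srs_mean(page_demo, N = 1100)
print(res)
```

With the package installed, `srs_mean(case_study$Page, N = 1100)` would apply the same calculation to the real sample.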

This dataset demonstrates the concept of Survival Analysis

Description

This dataset describes an engineering firm that monitors industrial machines over time to understand how long they operate before experiencing a mechanical failure. Each machine is tracked from installation until either it fails or the study period ends. Each record includes the machine’s operating age (in years) and its maintenance type (regular or on-demand).

Usage

eng_data

Format

A data frame with 223 rows and 4 variables:

time

Time to malfunction (in days)

status

Failure indicator (1 = still operational/censored, 2 = malfunctioned)

age

Age of machine at installation (in years)

maintenance

Maintenance type (1 = On-demand, 2 = Regular)

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(eng_data, package = "dauR")
head(eng_data)
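
The documented coding of `status` (1 = censored, 2 = malfunctioned) matches the 1/2 convention that `survival::Surv()` accepts. As a possible illustration, the sketch below fits Kaplan-Meier curves by maintenance type and runs a log-rank test. It is not the book's analysis, and it uses a made-up stand-in with the documented columns so that it runs without dauR.

```r
library(survival)  # recommended package distributed with R

# Made-up stand-in with the documented columns and codings of eng_data.
# Values are illustrative only.
set.seed(3)
n <- 40
eng_demo <- data.frame(
  time        = round(rexp(n, rate = 1 / 400)) + 1,  # time to malfunction, days
  status      = rep(c(1, 2, 2, 2), times = n / 4),   # 1 = censored, 2 = malfunctioned
  age         = round(runif(n, min = 0, max = 10), 1),
  maintenance = rep(c(1, 2), each = n / 2)           # 1 = On-demand, 2 = Regular
)

# Surv() reads a 1/2 status coding as 1 = censored, 2 = event
km_fit <- survfit(Surv(time, status) ~ maintenance, data = eng_demo)
print(km_fit)

# Log-rank test comparing the two maintenance types
lr_test <- survdiff(Surv(time, status) ~ maintenance, data = eng_demo)
print(lr_test)
```

Replacing `eng_demo` with `eng_data` should run the same steps on the packaged data, and `plot(km_fit)` draws the estimated survival curves.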

This dataset refers to the concept of ordinal data

Description

A survey on the teaching-learning environment of an institution was conducted among a group of 15 students. One part of the survey, comprising five questions, asked for opinions about one recently completed course.

Usage

likert_data

Format

A data frame with 15 rows and 5 variables, all on a Likert scale:

good_exp

Question about whether this was a good experience

disturb_free

Question about whether the student enjoyed a disturbance-free internet connection

support

Question about whether the student got support from the institution

instructors

Question about whether instructors were good at distance teaching and new technology

better

Question about whether it is better than in-class teaching

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(likert_data, package = "dauR")
head(likert_data)
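
Ordinal responses such as these are usually summarised with frequency tables and medians rather than means. The sketch below shows one way to do that in base R. It assumes responses coded 1 to 5, which the documentation does not confirm, and it uses a made-up stand-in with the documented item names.

```r
# Made-up stand-in: 15 respondents, five items, responses coded 1 to 5.
# The numeric coding is an assumption; the packaged data may use labels.
set.seed(4)
items <- c("good_exp", "disturb_free", "support", "instructors", "better")
likert_demo <- as.data.frame(
  setNames(replicate(5, sample(1:5, 15, replace = TRUE), simplify = FALSE),
           items)
)

# Frequency of each response category per item (rows = items, columns = 1..5)
freq <- t(sapply(likert_demo, function(x) table(factor(x, levels = 1:5))))
print(freq)

# Median response per item
item_medians <- sapply(likert_demo, median)
print(item_medians)
```

If the packaged columns turn out to be factors or labels, `lapply(likert_data, table)` gives the per-item counts without relying on the numeric coding.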

This dataset demonstrates the concept of Mediation, Moderation & Confounding

Description

The dataset contains 200 observations and five variables, namely reading_time, vocab, test_score, access_resources and read_motiv.

Usage

reading

Format

A data frame with 200 rows and five variables:

reading_time

Time spent in reading

vocab

Measure of richness of vocabulary

test_score

Score from the

access_resources

Access to Resources

read_motiv

Measure of motivation for reading

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(reading, package = "dauR")
head(reading)
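
A regression-based mediation analysis splits a total effect into a direct and an indirect part. The sketch below illustrates this with three `lm()` fits. The roles assigned to the variables (vocab mediating the effect of reading_time on test_score) are an assumption made for this example, not something stated in the documentation, and the data are simulated stand-ins.

```r
# Simulated stand-in. The causal roles used here (vocab as mediator between
# reading_time and test_score) are an assumption for illustration only.
set.seed(5)
n <- 200
reading_time <- rnorm(n, mean = 5, sd = 1.5)
vocab        <- 20 + 4 * reading_time + rnorm(n, sd = 3)
test_score   <- 30 + 1 * reading_time + 0.8 * vocab + rnorm(n, sd = 5)
reading_demo <- data.frame(reading_time, vocab, test_score)

# Three regressions: total effect, path a, and direct effect plus path b
fit_total  <- lm(test_score ~ reading_time, data = reading_demo)
fit_a      <- lm(vocab ~ reading_time, data = reading_demo)
fit_direct <- lm(test_score ~ reading_time + vocab, data = reading_demo)

a      <- coef(fit_a)[["reading_time"]]
b      <- coef(fit_direct)[["vocab"]]
total  <- coef(fit_total)[["reading_time"]]
direct <- coef(fit_direct)[["reading_time"]]

# For least-squares fits on the same sample, total = direct + a * b
effects <- c(total = total, direct = direct, indirect = a * b)
print(effects)
```

The same three fits can be run with `data = reading` once the package is installed, provided the three columns used here are numeric.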

This dataset is used to illustrate the concept of contingency table

Description

This dataset records the smoking habits of 198 drivers of three types of vehicle: Bus, Truck and Taxi. The aim is to find whether there is any association between smoking habit and occupation type, represented here by the type of vehicle driven.

Usage

smoke_class

Format

A data frame with 198 rows and 2 variables:

Class

Type of vehicle driven

SmokingStatus

Categorical variable showing whether the driver is a smoker or a non-smoker

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(smoke_class, package = "dauR")
head(smoke_class)
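
The association question posed in the description is commonly addressed by cross-tabulating the two variables and applying a chi-squared test of independence. The sketch below shows those two steps on a made-up stand-in with the documented column names. The "Smoker"/"Non-smoker" labels are assumed, and the values are random, so the result says nothing about the real data.

```r
# Made-up stand-in with the documented columns of smoke_class.
# Category labels for SmokingStatus are assumed.
set.seed(6)
smoke_demo <- data.frame(
  Class         = sample(c("Bus", "Truck", "Taxi"), 198, replace = TRUE),
  SmokingStatus = sample(c("Smoker", "Non-smoker"), 198, replace = TRUE)
)

# Contingency table: vehicle type by smoking status
tab <- table(smoke_demo$Class, smoke_demo$SmokingStatus)
print(tab)

# Pearson chi-squared test of independence
test <- chisq.test(tab)
print(test)

# Expected counts under independence
print(round(test$expected, 1))
```

With dauR installed, `table(smoke_class$Class, smoke_class$SmokingStatus)` builds the corresponding table from the packaged data.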

This dataset demonstrates the concepts of PCA and Factor Analysis

Description

The dataset contains nine variables on 20 recent graduates from universities and colleges. As part of the selection for a high-school job, the candidates sat exams in 7 subjects and then took an oral (viva-voce) exam. The subjects are Math, Physics, Chemistry, Statistics, Bengali literature, English literature and History.

Usage

students_data

Format

A data frame with 20 rows and 9 variables:

ID

Serial number

Group

Whether a student is from science or humanities background

Math

Score in Mathematics

Physics

Score in Physics

Chemistry

Score in Chemistry

Statistics

Score in Statistics

English

Score in English

Bengali

Score in Bengali

History

Score in History

Viva

Score in Viva-voce

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(students_data, package = "dauR")
head(students_data)
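
A principal component analysis of the score columns can be sketched with `prcomp()` from base R. The example below is only an illustration under stated assumptions: it uses randomly generated stand-in scores under the documented column names, so the loadings carry no meaning, and standardising the variables (`scale. = TRUE`) is a choice made here rather than one taken from the book.

```r
# Made-up stand-in: 20 candidates with the eight documented score columns.
# Scores are random, so no real structure should be expected.
set.seed(7)
subjects <- c("Math", "Physics", "Chemistry", "Statistics",
              "English", "Bengali", "History", "Viva")
scores_demo <- as.data.frame(
  setNames(replicate(length(subjects),
                     round(rnorm(20, mean = 60, sd = 12)),
                     simplify = FALSE),
           subjects)
)

# PCA on standardised scores
pca <- prcomp(scores_demo, center = TRUE, scale. = TRUE)

# Share of total variance explained by each component
var_share <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_share, 3))

# Loadings of the first two components
print(round(pca$rotation[, 1:2], 2))
```

For the packaged data, `prcomp(students_data[, subjects], scale. = TRUE)` would select the same columns by name, assuming they are stored as numeric.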

This dataset illustrates the application of two-way ANOVA

Description

A researcher seeks to investigate whether an individual’s life satisfaction (happiness) is associated with gender and working status. The variable Gender includes two categories: Male and Female, while Working_Status comprises three categories: Self-employed, Student and Job. A random sample of ten participants was selected from each category. Life satisfaction was measured on a scale ranging from 0 to 100, with higher scores indicating greater happiness. Therefore, the dependent variable is life satisfaction (happiness), and the independent variables are gender and working status.

Usage

twoway

Format

A data frame with 60 rows and 3 variables:

Happiness

Measurement of happiness

Gender

Gender of individuals

Working_Status

Working status of individuals. The classes are "Job", "Self_employed" and "Student"

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(twoway, package = "dauR")
head(twoway)
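
The design described above (two genders, three working statuses, ten participants per combination) corresponds to a balanced two-way ANOVA with interaction. The sketch below fits that model with `aov()` on a made-up stand-in that mirrors the documented columns and category labels. The happiness scores are random, so the resulting table is illustrative only.

```r
# Made-up stand-in: 2 x 3 design with 10 participants per cell (60 rows).
# Happiness values are random and illustrative only.
set.seed(8)
twoway_demo <- expand.grid(
  participant    = 1:10,
  Gender         = c("Male", "Female"),
  Working_Status = c("Job", "Self_employed", "Student")
)
twoway_demo$Happiness <- round(runif(nrow(twoway_demo), min = 30, max = 90))

# Two-way ANOVA with main effects and the Gender x Working_Status interaction
fit <- aov(Happiness ~ Gender * Working_Status, data = twoway_demo)
anova_table <- summary(fit)[[1]]
print(anova_table)
```

Swapping in `data = twoway` should fit the same model to the packaged data; `interaction.plot()` is one way to inspect the cell means visually.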

The datasets weight1 and weight2 correspond to paired observations

Description

This dataset shows the weights of 16 individuals before taking medication.

Usage

weight1

Format

A vector of 16 observations:

weight1

Weight of individuals

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(weight1, package = "dauR")
head(weight1)

The datasets weight1 and weight2 correspond to paired observations

Description

This dataset shows the weights of 16 individuals after taking medication.

Usage

weight2

Format

A vector of 16 observations:

weight2

Weight of individuals

Source

Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam

Examples

data(weight2, package = "dauR")
head(weight2)
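
Because weight1 and weight2 are measured on the same 16 individuals, a paired t-test is the usual way to compare them. The sketch below shows the test and the equivalent calculation on the within-person differences. It uses made-up before/after vectors of length 16 rather than the packaged values, and it assumes the two vectors are stored in the same person order.

```r
# Made-up stand-ins for weight1 (before) and weight2 (after), 16 individuals.
# Values are illustrative only; element i in both vectors is the same person.
set.seed(9)
before_demo <- round(rnorm(16, mean = 70, sd = 8), 1)
after_demo  <- round(before_demo + rnorm(16, mean = -1, sd = 2), 1)

# Paired t-test of the mean change (after minus before)
res <- t.test(after_demo, before_demo, paired = TRUE)
print(res)

# Same statistic computed by hand from the differences
d <- after_demo - before_demo
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))
print(t_manual)
```

With dauR installed, `t.test(weight2, weight1, paired = TRUE)` would apply the same test to the packaged vectors.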