Type: Package
Title: Data to Accompany Smithson & Merkle, 2013
Version: 1.2
Date: 2018-05-22
Author: Ed Merkle and Michael Smithson
Maintainer: Ed Merkle <merklee@missouri.edu>
Description: Contains data files to accompany Smithson & Merkle (2013), Generalized Linear Models for Categorical and Continuous Limited Dependent Variables.
License: GPL-2
NeedsCompilation: no
Packaged: 2018-05-22 16:32:15 UTC; merkle
Repository: CRAN
Date/Publication: 2018-05-22 16:38:40 UTC
Babies gaze data
Description
Gaze patterns of four babies in a group.
Usage
data("babies")
Format
A data frame with 1180 observations on the following 6 variables.
row
a numeric vector
time
a numeric vector indexing the target baby
id
a numeric vector indexing the observations
gaze
a factor indicating whether a baby was looked at, with levels
no
yes
babies
a factor indexing which baby was chosen to be looked at with levels
baby1
baby2
baby3
baby4
lookedat
a numeric vector registering whether gaze was initiated by the target baby, taking the value 0 for “no” and 1 for “yes”
Source
These are hypothetical data.
Examples
data("babies", package="smdata")
Car salesperson problem
Description
Replication of the car salesperson problem in See, Fox, and Rottenstreich (2006)
Usage
data("carsales")
Format
A data frame with 155 observations on the following 4 variables.
initial
a numeric vector taking the value 0 for the Car condition and 1 for the Salesperson condition
prob
a numeric vector recording the respondent's probability estimate that the car was purchased from Carlos
NFCC
a numeric vector recording respondents' scores on the Need for Certainty and Closure scale
ctrNFCC
a numeric vector that is NFCC standardized to have a mean of 0 and standard deviation of 1
Source
Data provided by Gurr, M. (2009).
References
Gurr, M. (2009). Partition dependence: Investigating the principle of insufficient reason, uncertainty and dispositional predictors. (Unpublished Honours thesis: The Australian National University, Canberra, Australia)
See, K. E., Fox, C. R., & Rottenstreich, Y. S. (2006). Between ignorance and truth: Partition dependence and learning in judgment under uncertainty. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 1385-1402.
Examples
data("carsales", package="smdata")
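The ctrNFCC column is described as NFCC standardized to a mean of 0 and a standard deviation of 1. A minimal sketch of that transformation follows; it assumes the sample standard deviation was used, and the scores are made up for illustration rather than taken from carsales.

```r
# Hypothetical NFCC scores (not taken from the carsales data)
nfcc <- c(2.8, 3.5, 4.1, 3.0, 3.9)

# Standardize: subtract the mean, then divide by the sample standard deviation
ctr_nfcc <- (nfcc - mean(nfcc)) / sd(nfcc)

# scale() performs the same computation and returns a one-column matrix
ctr_nfcc_alt <- as.numeric(scale(nfcc))
```

Both versions should agree up to floating-point error; whether the package authors used sd() or a population standard deviation is not stated in the source.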
Sex by method of cocaine ingestion
Description
Data from the 1991-1994 Drug Abuse Treatment Outcome Study on cocaine usage patterns.
Usage
data("cocaine")
Format
A data frame with 7592 observations on the following 2 variables.
sex
a factor with levels
female
male
mode
a factor recording self-reported method of cocaine ingestion with levels
crack
freebase
inhale
inject
Source
The data are extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("cocaine", package="smdata")
Sex and race by method of cocaine ingestion
Description
Data from the 1991-1994 Drug Abuse Treatment Outcome Study on cocaine usage patterns.
Usage
data("cocaineplus")
Format
A data frame with 7592 observations on the following 8 variables.
sexsrt
a factor with levels
FEMALE
MALE
age
a numeric vector
mstatstr
a factor with levels
BLANK
DIVORCED
LIVINGASMARRIED
MARRIED
NEVERMARRIED
SEPARATED
WIDOWED
modestr
a factor with levels
crack
freebase
inhale
inject
racestr
a factor with levels
AfroAmerican
Caucasian
Hispanic
Other
sex
a numeric vector that takes the value 1 if male and 0 if female
mode
a numeric vector that takes the value 1 if cocaine usage method is crack, 2 if method is freebase, 3 if method is inhale, and 4 if method is inject
race
a numeric vector that takes the value 1 if AfroAmerican, 2 if Caucasian, 3 if Hispanic, and 4 if Other
Source
The data were extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("cocaineplus", package="smdata")
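The numeric mode column mirrors the modestr factor through the codes listed under Format. As a rough sketch, the labels can be attached to numeric codes as shown here; the code vector is hypothetical and not drawn from cocaineplus.

```r
# Hypothetical numeric codes (not taken from the cocaineplus data)
mode_code <- c(1, 4, 2, 3, 1)

# Attach the labels listed under Format for the numeric mode column
mode_label <- factor(mode_code, levels = 1:4,
                     labels = c("crack", "freebase", "inhale", "inject"))

# Cross-tabulate codes against labels to check the mapping
table(mode_code, mode_label)
```

The same pattern would apply to the sex and race columns with their respective codes.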
Depression, Anxiety, and Stress
Description
Depression, Anxiety, and Stress Scale Data.
Usage
data("dass")
Format
A data frame with 166 observations on the following 3 variables.
depress
a numeric vector measuring depression, scored from 0 to 20
anxiety
a numeric vector measuring anxiety, scored from 0 to 20
stress
a numeric vector measuring stress, scored from 0 to 20
Source
Data from a pilot study by Michael Smithson.
References
Lovibond, P. F., & Lovibond, S. H. (1995). The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales with the Beck Depression and Anxiety Inventories. Behaviour Research and Therapy, 33, 335-343.
Examples
data("dass", package="smdata")
Dyslexic readers data
Description
Reading scores and nonverbal IQ scores for gender- and age-matched dyslexic and non-dyslexic readers.
Usage
data("dyslexic3")
Format
A data frame with 44 observations on the following 3 variables.
score
a numeric vector recording children's scores on a reading accuracy test
dys
a numeric vector taking the value 1 if dyslexic and 0 if not
ziq
a numeric vector recording children's nonverbal IQ scores, standardized to have a mean of 0 and standard deviation of 1
Details
The reading accuracy scores have a maximum of 1, indicating a perfect score on the test. In the Example 6.2 analysis, scores of 1 are recoded to .99, whereas in the 1-inflated model in Ch. 6 and the censored regression model in Ch. 7 they keep the value 1.
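The recoding of perfect scores mentioned for Example 6.2 can be sketched as follows. The score vector is hypothetical; only the rule (1 becomes .99) comes from the text.

```r
# Hypothetical reading accuracy scores (not taken from dyslexic3)
score <- c(0.46, 0.71, 1, 0.88, 1)

# Move perfect scores to .99 and leave all other values unchanged
score_recoded <- ifelse(score == 1, 0.99, score)
```

For the Ch. 6 and Ch. 7 models that keep the value 1, the original score column would be used without this step.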
Source
Data from Pammer and Kevan (2007), first analyzed in Smithson and Verkuilen (2006).
References
Pammer, K., & Kevan, A. (2007). The contribution of visual sensitivity, phonological processing, and nonverbal IQ to children's reading. Scientific Studies of Reading, 11, 33-53.
Smithson, M. J., & Verkuilen, J. (2006). A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods, 11, 54-71.
Examples
data("dyslexic3", package="smdata")
Marital Status and Email Usage
Description
Data from the U.S. General Social Surveys on marital status (ordinal; see details) and email usage.
Usage
data("email")
Format
A data frame with 3967 observations on the following 3 variables.
marital
Marital status, an ordered factor with levels never.married < married < divorced.
email.hrs
Reported weekly hours spent emailing.
z.email
Standardized version of email.hrs.
Details
In creation of this dataset, an additional GSS item (DIVORCE) was used to ensure that married people in the sample had not been previously divorced or widowed. Thus, the marital status variable in this dataset is truly ordinal, as individuals can only progress through the statuses in one order.
Source
The Survey Documentation and Analysis system hosted at UC Berkeley: http://sda.berkeley.edu/GSS/.
References
Smith, T. W., Marsden, P. V., Hout, M., & Kim, J. (2011). General Social Surveys, 1972 - 2010. Principal Investigator, Tom W. Smith; Co-Principal-Investigators, Peter V. Marsden and Michael Hout, NORC ed. Chicago: National Opinion Research Center, producer, 2005; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut, distributor. 1 data file (55,087 logical records) and 1 codebook (3,610 pp).
Examples
data("email", package="smdata")
Euthanasia Scale
Description
Euthanasia scale and Christian identification scale data.
Usage
data("euthan")
Format
A data frame with 351 observations on the following 3 variables.
mident
a numeric vector measuring the degree to which respondents identify themselves as Christian, on a scale from 0 to 1
teuth
a numeric vector measuring the degree to which respondents favor euthanasia, on a scale from 0 to 1
status
a numeric vector taking the value 0 if the observation is censored and 1 if not
Source
Data obtained from Mavor's (2004) study.
References
Mavor, K. (2004). Religious orientation, social identity and attitudes to homosexuality. Unpublished doctoral dissertation, School of Psychology, The Australian National University, Canberra, A.C.T., Australia.
Examples
data("euthan", package="smdata")
Exam data
Description
Grades achieved by second-year psychology students at The Australian National University in an introductory research methods course and the percentage marks they received in the laboratory component of that course.
Usage
data("exam")
Format
A data frame with 154 observations on the following 3 variables.
Labs
a numeric vector recording the percentage mark for the laboratory component of the course
Final
a numeric vector recording the percentage mark for the final exam
cens
a numeric vector taking the value 100 to indicate the value of censored observations
Source
Data obtained from Michael Smithson.
Examples
data("exam", package="smdata")
Confidence in financial knowledge
Description
Choice and confidence data from a study of financial knowledge involving U.S. undergraduates.
Usage
data("finance")
Format
A data frame with 4230 observations on the following 11 variables.
sub
Participant number.
jmeth
Experimental condition, with levels 1cd, 2ci, and 3ei (see details).
item
Item number.
easyfoil
Equals 1 if the foil (incorrect alternative) was easy, 0 if the foil was hard (see details).
targtop
Equals 1 if the correct alternative was the first one displayed (on top), 0 otherwise.
cho
Participant's choice (equals 1 for the first alternative, 0 for the second alternative).
corr
Participant's accuracy (essentially targtop==cho).
iproba
For conditions 2ci and 3ei, the participant's confidence in the first alternative.
iprobb
For conditions 2ci and 3ei, the participant's confidence in the second alternative.
probc
The participant's confidence in his/her choice (see details).
nchorev
The number of choice revisions that the participant made.
Details
The data come from Study 2 of Sieck, Merkle, and Van Zandt (2007). Experimental participants completed a 30-item, 2-alternative test of financial knowledge. For each item, the participant first chose an alternative and then made a confidence judgment.
The confidence elicitation method varied across three between-subjects conditions. For condition 1cd, participants reported confidence in their chosen alternative on a scale from 50% to 100%. For conditions 2ci and 3ei, participants reported independent confidence judgments for each alternative on scales from 0% to 100%. These independent confidence judgments are contained in iproba and iprobb. In these conditions, probc is obtained by normalizing confidence in the chosen alternative by the sum of independent judgments. In addition to reporting independent confidence judgments in condition 3ei, participants wrote an explanation in response to the question "Why is this option true?" prior to reporting each confidence judgment.
For each item, the incorrect alternative was manipulated to sometimes be easy (easyfoil==1) and sometimes be difficult (easyfoil==0). Foil difficulty was defined by the accuracy of an independent group of students on a four-alternative version of the financial knowledge test; see Sieck et al. for more detail.
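The normalization used for probc in conditions 2ci and 3ei can be illustrated with a short sketch. The values are hypothetical and stored as proportions, which is an assumption; because the result is a ratio, it would be the same on a 0 to 100 scale.

```r
# Hypothetical independent judgments as proportions (not taken from finance)
iproba <- c(0.80, 0.30, 0.60)   # confidence in the first alternative
iprobb <- c(0.40, 0.90, 0.60)   # confidence in the second alternative
cho    <- c(1, 0, 1)            # 1 = first alternative chosen, 0 = second

# Confidence in the chosen alternative, normalized by the sum of both judgments
chosen_conf <- ifelse(cho == 1, iproba, iprobb)
probc <- chosen_conf / (iproba + iprobb)
```

If both judgments were 0 for an item, the ratio would be undefined (NaN in R); how such cases were handled in the original study is not stated here.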
Source
Provided by Ed Merkle.
References
Sieck, W.R., Merkle, E.C., & Van Zandt, T. (2007). Option fixation: A cognitive contributor to overconfidence. Organizational Behavior and Human Decision Processes, 103, 68-83.
Examples
data("finance", package="smdata")
Word Color and Fixations
Description
Summary eyetracking data from a study examining the impact of text saliency on eye movements.
Usage
data("fixations")
Format
A data frame with 48 observations on the following 6 variables.
id
Participant ID label.
condition
Condition, signifying whether a channel had a red title (see details).
countleft
Count of fixations in the middle, left channel.
countright
Count of fixations in the middle, right channel.
gazetime
Total gaze time on the webpage.
rt.cond
Equals red if the middle, right channel title was red; black otherwise.
Details
The data are taken from Owens, Shrestha, & Chaparro (2009). A webpage was divided into 9 channels (sections), and the title colors of the "middle, left" and "middle, right" channels were manipulated.
The variable condition takes the value Control if all title colors were black; Left if the "middle, left" channel title was red; and Right if the "middle, right" channel title was red.
Source
Provided by Justin W. Owens.
References
Owens, J.W., Shrestha, S., & Chaparro, B.S. (2009). Effects of text saliency on eye movements while browsing a web portal. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting (Vol. 53, pp. 1257-1261).
Examples
data("fixations", package="smdata")
Grades and marks for an undergraduate course
Description
Lab percentage mark, letter grade, lower and upper grade thresholds, a censored variable value, and the final percentage course mark.
Usage
data("grades")
Format
A data frame with 165 observations on the following 6 variables.
lab
a numeric vector recording the percentage mark for the laboratory component of the course
gradecat
a factor denoting the letter grade for the course, with levels
CR
D
HD
N
P
lower
a numeric vector denoting the lower threshold for the corresponding letter grade
upper
a numeric vector denoting the upper threshold for the corresponding letter grade
cens
a numeric vector listing the censoring value of a mark, 3
finmark
a numeric vector recording the final percentage mark for the course
Source
Data obtained from Michael Smithson.
Examples
data("grades", package="smdata")
Study 1 judged probabilities of guilt
Description
Judged probabilities of guilt in a criminal trial scenario (Study 1).
Usage
data("guilt1")
Format
A data frame with 104 observations on the following 7 variables.
observ
a numeric vector indexing cases
crguilt
a numeric vector recording the judged probability of guilt in a criminal trial scenario
cigult
a numeric vector recording the judged probability of guilt in a civil trial scenario
crvd1
a numeric vector taking the value 1 if the respondent returned a “guilty” verdict in the criminal trial and 0 otherwise
crvd2
a numeric vector taking the value 1 if the respondent returned a “not guilty” verdict in the criminal trial and 0 otherwise
civd1
a numeric vector taking the value 1 if the respondent returned a “guilty” verdict in the civil trial and 0 otherwise
civd2
a numeric vector taking the value 1 if the respondent returned a “not guilty” verdict in the civil trial and 0 otherwise
Source
Data from Study 1 of Smithson, Deady and Gracik (2007).
References
Smithson, M., Gracik, L., & Deady, S. (2007). Guilty, not guilty, or ...? Multiple verdict options in jury verdict choices. Journal of Behavioral Decision Making, 20, 481-498.
Examples
data("guilt1", package="smdata")
Study 3 judged probabilities of guilt
Description
Judged probabilities of guilt in a criminal trial scenario (Study 3).
Usage
data("guilt3")
Format
A data frame with 96 observations on the following 3 variables.
pguilt
a numeric vector recording the judged probability of guilt in a criminal trial scenario
v1
a numeric vector taking the value 1 if the respondent returned a “guilty” verdict in the criminal trial and 0 otherwise
v2
a numeric vector taking the value 1 if the respondent returned a “not guilty” verdict in the criminal trial and 0 otherwise
Source
Data from Study 3 of Smithson, Deady and Gracik (2007).
References
Smithson, M., Gracik, L., & Deady, S. (2007). Guilty, not guilty, or ...? Multiple verdict options in jury verdict choices. Journal of Behavioral Decision Making, 20, 481-498.
Examples
data("guilt3", package="smdata")
Lower and upper probability estimates
Description
Lower and upper probability estimates provided by the Budescu et al. (2009) respondents in their interpretations of the phrase “very likely” in an IPCC report statement, along with dummy variables indicating the experimental condition.
Usage
data("intervalbeta")
Format
A data frame with 220 observations on the following 5 variables.
t
a numeric vector taking the value 1 if the respondent is in the Translation condition, and 0 otherwise
n
a numeric vector taking the value 1 if the respondent is in the Narrow condition, and 0 otherwise
w
a numeric vector taking the value 1 if the respondent is in the Wide condition, and 0 otherwise
y1
a numeric vector recording the respondent's lower probability estimate
y2
a numeric vector recording the respondent's upper probability estimate
Source
Data provided by D. V. Budescu from the Budescu et al. (2009) study.
References
Budescu, D.V., Broomell, S., & Por, H.-H. (2009). Improving the communication of uncertainty in the reports of the Intergovernmental Panel on Climate Change. Psychological Science, 20, 299-308.
Examples
data("intervalbeta", package="smdata")
Word and non-word response data
Description
Frequency with which respondents correctly identified 0, 1, 2, 3, or 4 letters (in correct versus incorrect order) of a word or non-word based on a cue.
Usage
data("phono")
Format
A data frame with 16 observations on the following 3 variables.
treeid
a numeric vector, a tree identification code needed by the R package for estimating MPT models
resp
a factor denoting whether a respondent correctly identified 0, 1, 2, 3, or 4 letters, with CO denoting the 4 letters were in the correct order and IO indicating that they were not, with levels
0L
1L
2L
3L
4LCO
4LIO
fr
a numeric vector recording the frequency of each response type
Source
These data are extracted from Maris (2002), Figure 7, p. 1421.
References
Maris, E. (2002). The role of orthographic and phonological codes in the word and the pseudoword superiority effect: An analysis by means of multinomial processing tree models. Journal of Experimental Psychology: Human Perception and Performance, 28, 1409-1431.
Examples
data("phono", package="smdata")
Censored response time data
Description
Response times for a task that timed out at 1200 ms, along with a prime indicator (respondents were primed to use either intuition or deliberation in the task).
Usage
data("rtime")
Format
A data frame with 300 observations on the following 3 variables.
RT
a numeric vector, response time in milliseconds
prime
a numeric vector taking the value 0 if primed to use intuition or 1 if primed to use deliberation
status
a numeric vector taking the value 0 if the observation is censored and 1 if not
Source
These are hypothetical data.
Examples
data("rtime", package="smdata")
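The status indicator marks which response times hit the 1200 ms time-out. A small sketch of how such an indicator relates to the response times, assuming timed-out trials are recorded at exactly 1200 ms (an assumption, not confirmed by the source) and using made-up values:

```r
# Hypothetical response times in milliseconds (not taken from rtime)
rt <- c(640, 1200, 985, 1200)

# 1 = response observed before the time-out, 0 = censored at 1200 ms
status <- as.numeric(rt < 1200)
```

This 0/1 coding follows the convention described under Format, where 0 denotes a censored observation.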
School Skipping
Description
Data from the U.S. National Survey on Drug Use and Health on the frequency with which individuals skip school and other covariates.
Usage
data("skipping")
Format
A data frame with 252 observations on the following 6 variables.
income
Reported household income, where 1 means < $20k; 2 means >= $20k and < $50k; 3 means >= $50k and < $75k; 4 means >= $75k.
irsex
Gender; 1 is male and 2 is female.
educatn2
Grade in school (see details).
schdskip
Reported number of school days skipped out of the past 30.
wrkhrsw2
Reported number of hours worked in the past week.
anyskip
A binary version of schdskip, signifying whether the respondent skipped any days of school out of the past 30.
Details
Variable names match those from the National Survey on Drug Use and Health, so more details can be obtained from the survey codebook. Missing data codes have been changed to NA. Additionally, educatn2 has been recoded to generally match the actual grade in which the respondent is enrolled. The only exceptions to this are that 14 means the second and third years in college, and 15 means the fourth or higher year in college.
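The relationship between schdskip and its binary version anyskip, together with the NA handling noted above, can be sketched as follows. The values are hypothetical, and the 0/1 coding of anyskip is an assumption since the source does not state how the binary variable is coded.

```r
# Hypothetical days skipped out of the past 30 (not taken from skipping)
schdskip <- c(0, 3, NA, 1, 0)

# 1 if any days were skipped, 0 if none; missing values stay NA
anyskip <- as.numeric(schdskip > 0)
```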
Source
Obtained from the Inter-University Consortium for Political and Social Research, University of Michigan, http://www.icpsr.umich.edu.
References
United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Center for Behavioral Health Statistics and Quality. National Survey on Drug Use and Health, 2010. ICPSR32722-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-12-05. doi:10.3886/ICPSR32722.v1
Examples
data("skipping", package="smdata")
Transportation mode choice
Description
Choice of transportation mode by gender.
Usage
data("trchoice")
Format
A data frame with 10 observations on the following 4 variables.
treeid
a numeric vector needed for identifying a tree in the MPT algorithm
sex
a numeric vector taking the value 1 if male and 0 if female
resp
a factor denoting the transport mode choice, where D denotes driving one's own vehicle, F denotes getting a ride with a friend, O denotes other, P denotes using public transport, and W denotes walking
fr
a numeric vector recording the frequency with which each transport mode is chosen
Source
The data are extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("trchoice", package="smdata")
Chest Pain Treatment Preferences
Description
Experimental data in which participants were presented with statistical information about two treatments for chest pain, then asked about their preference for a treatment.
Usage
data("treatment")
Format
A data frame with 235 observations on the following 4 variables.
cond
Condition, referring to the way that statistical information was presented (see details).
choice
Treatment preference on an ordinal, 6-level scale from "definitely angioplasty" to "definitely bypass".
effectiveness
Participant ratings of the importance of treatment effectiveness on treatment choice (1 is extremely unimportant; 6 is extremely important).
invasiveness
Participant ratings of the importance of treatment invasiveness on treatment choice (1 is extremely unimportant; 6 is extremely important).
Details
The data were taken from Hulsey (2010). Study participants were asked to make a hypothetical decision between two treatments for chest pain: bypass surgery or balloon angioplasty. Bypass is generally more effective, but it is also more invasive and has a longer recovery time.
Conditions were defined by the way participants received statistical information concerning the two treatments. In condition pictograph, participants viewed visual information via a pictograph. In condition statistics, participants viewed numerical information.
Source
Provided by Lukas Hulsey.
References
Hulsey, L. (2010). Testimonials and statistics in patient medical decision aids. Unpublished master's thesis, Wichita State University.
Examples
data("treatment", package="smdata")
Transportation mode choice, long format
Description
Choice of transportation mode by gender, in long format so that each choice occupies 5 rows.
Usage
data("trlong")
Format
A data frame with 31680 observations on the following 6 variables.
obs
a numeric vector
case
a numeric vector
sex
a numeric vector taking the value 1 if male and 0 if female
resp
a factor indicating the transport mode choice, where B denotes taking the bus, D denotes driving one's own vehicle, F denotes getting a ride with a friend, O denotes other, and W denotes walking
chosen
a numeric vector taking the value 1 if the transport mode was chosen and 0 if not
pubpriv
a numeric vector that takes a value of 1 if the transportation mode is private and 0 if it is public
Source
The data are extracted from the 1991-1994 Drug Abuse Treatment Outcome Study (2010) data (DATOS).
References
United States Department of Health and Human Services and National Institute of Health and National Institute on Drug Abuse (2010). Drug Abuse Treatment Outcome Study, 1991-1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research.
Examples
data("trlong", package="smdata")
Work Days Missed
Description
Data from the U.S. National Survey on Drug Use and Health on the frequency with which individuals miss work due to mental health issues and other covariates.
Usage
data("workdays")
Format
A data frame with 777 observations on the following 8 variables.
cigtry
Reported age that the respondent first smoked a cigarette.
impydays
Reported days in the past year the respondent was unable to work due to mental health (see details).
age2
Respondent age (see details).
service
Has the respondent been in the U.S. Armed Forces? (1=yes, 0=no)
health
Rating of overall health, where 1 is excellent and 5 is poor.
movespy2
Number of times the respondent moved in the past 12 months.
schenrl
Whether the respondent is enrolled in any school (1=yes, 0=no).
coutyp2
Type of county in which the respondent resides: large metro (large), small metro (small), nonmetro (nonmetro).
Details
Variable names match those from the National Survey on Drug Use and Health, so more details can be obtained from the survey codebook. Missing data codes have been changed to NA. Additionally, age2 is coded so that 7 means 18 years of age, 8 means 19 years of age, ..., 11 means 22 or 23 years of age, 12 means 24 or 25 years, 13 means 26-29, 14 means 30-34, 15 means 35-49, 16 means 50-64, and 17 means 65 and over.
The variable impydays contains responses to the question "About how many days out of 365 in the past 12 months were you totally unable to work or carry out your normal activities because of your emotions, nerves, or mental health?"
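The age2 coding described above can be turned into readable labels with a lookup vector. This is only a sketch: the codes 9 and 10 are inferred from the "..." in the description (an assumption), the label strings are illustrative, and the age2 values are hypothetical rather than taken from workdays.

```r
# Lookup from age2 code to age range, following the Details text
age2_labels <- c(
  "7"  = "18",    "8"  = "19",
  "9"  = "20",    "10" = "21",     # inferred from the "..." (assumption)
  "11" = "22-23", "12" = "24-25",
  "13" = "26-29", "14" = "30-34",
  "15" = "35-49", "16" = "50-64",
  "17" = "65+"
)

# Hypothetical age2 codes, including a missing value
age2 <- c(7, 11, 17, NA, 14)

# Index the lookup by code; codes not in the table (and NA) map to NA
age_group <- unname(age2_labels[as.character(age2)])
```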
Source
Obtained from the Inter-University Consortium for Political and Social Research, University of Michigan, http://www.icpsr.umich.edu.
References
United States Department of Health and Human Services. Substance Abuse and Mental Health Services Administration. Center for Behavioral Health Statistics and Quality. National Survey on Drug Use and Health, 2010. ICPSR32722-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011-12-05. doi:10.3886/ICPSR32722.v1
Examples
data("workdays", package="smdata")