Structural Equation Modeling
And Related Techniques
Handout 5
Factor Analysis Hands on
Michael Biderman
Department of Psychology
Factor analysis of Faked Goldberg scale testlets
The file testletdata040510 contains 15 Faking-condition testlets from the dataset described in the previous handouts.
Exercise 1. Perform an EFA of the Faking testlets, FETL1, FETL2, . . . FOTL3.
Don’t include the FJTL testlets; those are not personality testlets.
Suggested requests: Analyze -> Reduction -> Factor ...
fetl1 fetl2 fetl3 fatl1 fatl2 etc. etc.
Check the Eigenvalues over 1 box to let SPSS identify the number of factors.
Recall that we did that in the EFA of the Honest testlets, and we obtained 5 factors with eigenvalues >= 1.
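The eigenvalues-over-1 rule can be sketched outside SPSS. The following is a minimal illustration, not the handout's data: the testlet scores are simulated under an assumed clean 5-factor structure (3 testlets per factor), and the function name is hypothetical.

```python
import numpy as np

def count_eigenvalues_over_one(data):
    """Return the eigenvalues of the correlation matrix (largest first)
    and the number of them that exceed 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    return eigenvalues, int(np.sum(eigenvalues > 1.0))

# Simulated data (an assumption): 5 uncorrelated factors, 3 testlets each.
rng = np.random.default_rng(0)
n_people, n_factors, per_factor = 500, 5, 3
factors = rng.standard_normal((n_people, n_factors))
structure = np.kron(np.eye(n_factors), np.ones((1, per_factor)))  # 5 x 15 indicator matrix
noise = rng.standard_normal((n_people, n_factors * per_factor))
testlets = 0.8 * factors @ structure + 0.6 * noise

eigenvalues, n_retained = count_eigenvalues_over_one(testlets)
print(np.round(eigenvalues, 2))
print("factors retained by the eigenvalue > 1 rule:", n_retained)
```

With a clean structure like this one the rule recovers the number of simulated factors. Real data, as the exercises show, are often less cooperative.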
Exercise 1 Suggestions continued
Request an oblique (correlated factors) solution, since we already know the factors are correlated.
Output you should see . . .
[Figure: SPSS output]
Some of the Agreeableness (FATL1, FATL2, FATL3) and Openness (FOTL1, FOTL2, FOTL3) testlets had pretty low communalities.
Exercise 1 likely output continued
This output definitely does NOT suggest that there are 5 factors.
Inspection of eigenvalues > 1 suggests only three.
The scree plot suggests only 1 factor, perhaps 3, definitely not 5.
Exercise 1 likely output continued
Since the program automatically retained only 3 factors, the resulting pattern matrix doesn’t really make any sense.
Exercise 2. Redo this EFA, requesting 5 factors.
Some of the output you should see . . .
Note that the communality estimates were generally larger in this 5 factor solution. Five factors should account for more of the variance than three.
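Why communalities cannot shrink when factors are added can be seen in a small sketch. It uses principal-component loadings as a stand-in (the extraction method used in the exercise is not stated here), simulated data, and a hypothetical function name.

```python
import numpy as np

def pc_communalities(corr, n_factors):
    """Communalities (row sums of squared loadings) from the first
    n_factors principal components of a correlation matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1][:n_factors]  # largest eigenvalues first
    loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])
    return np.sum(loadings ** 2, axis=1)

# Simulated data (an assumption): 5 trait factors plus one general influence.
rng = np.random.default_rng(1)
n_people = 400
traits = rng.standard_normal((n_people, 5))
general = rng.standard_normal((n_people, 1))
structure = np.kron(np.eye(5), np.ones((1, 3)))  # 5 x 15 indicator matrix
testlets = (0.5 * traits @ structure + 0.6 * general
            + 0.6 * rng.standard_normal((n_people, 15)))
corr = np.corrcoef(testlets, rowvar=False)

h3 = pc_communalities(corr, 3)
h5 = pc_communalities(corr, 5)
print("3-factor communalities:", np.round(h3, 2))
print("5-factor communalities:", np.round(h5, 2))
```

Each extra factor adds a non-negative squared loading to every variable's communality, so the 5-factor values are at least as large as the 3-factor values.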
Exercise 2 Output: Faking testlets, 5 factor oblique EFA
The pattern of loadings onto 5 factors was generally what we would expect.
Exercise 2 Factor Correlations
The factors based on the Faked testlets were MUCH more highly correlated than those based on the Honest testlets. Can you guess why?
Exercise 3. Perform a CFA on the Faked testlets.
Download the partially complete AMOS model. Connect it with the testletdata040510 file and complete it.
The model you download should look like the following . . .
[Figure: partially complete AMOS model]
Add the appropriate curved arrows.
Fix values of appropriate single-headed arrows.
Exercise 3 Output: Faked Condition CFA: Standardized view
Your model should look like the following . . .
[Figure: standardized CFA solution for the Faked testlets]
The loadings are OK, as are the reliabilities of most of the testlets.
The goodness of fit isn’t terrible. The RMSEA is nearly “good”.
But look at those factor correlations!! Yikes!
What’s going on?
The Faking Factor
Summary of the situation:
The EFA doesn’t suggest 5 factors, even though we know that there should be some indication of 5 factors, since we’re dealing with the Big 5 personality dimensions.
The CFA gives only a fairly good 5 factor solution, and the correlations between the supposedly orthogonal Big 5 dimensions are HUGE.
One hypothesis that answers the above question is that there is a 6th factor affecting all the testlets: a faking factor.
Participants were instructed to fake every item. Some were good fakers. Others were not. The good fakers had higher testlet scores than the poor fakers. These differences in ability to fake affected all items and caused the scores on all the testlets to be (much) more highly correlated than they should have been due simply to the personality dimensions.
The individual differences in ability to fake represent a 6th influence on the testlet scores, over and above the influences of the 5 personality dimensions.
Note that this 6th factor influenced ALL the testlets, not just 3 as was the case with each of the personality dimensions.
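The argument above can be checked with a small simulation. This is a sketch under assumed values, not a reanalysis of the handout's data: the loadings (0.6 for traits, 0.7 for faking), the sample size, and the function name are all hypothetical.

```python
import numpy as np

def mean_scale_correlation(faking_loading, n_people=2000, seed=2):
    """Simulate 15 testlets from 5 uncorrelated traits plus an optional
    faking factor that loads on every testlet; return the mean correlation
    between the 5 dimension scores (sum of each dimension's 3 testlets)."""
    rng = np.random.default_rng(seed)
    traits = rng.standard_normal((n_people, 5))
    faking = rng.standard_normal((n_people, 1))
    structure = np.kron(np.eye(5), np.ones((1, 3)))  # 5 x 15 indicator matrix
    noise = 0.4 * rng.standard_normal((n_people, 15))
    testlets = 0.6 * traits @ structure + faking_loading * faking + noise
    scales = testlets @ structure.T  # one summed score per dimension
    corr = np.corrcoef(scales, rowvar=False)
    off_diagonal = corr[~np.eye(5, dtype=bool)]
    return float(off_diagonal.mean())

without_faking = mean_scale_correlation(0.0)
with_faking = mean_scale_correlation(0.7)
print("mean correlation between dimensions, no faking factor:", round(without_faking, 2))
print("mean correlation between dimensions, faking factor:   ", round(with_faking, 2))
```

Even though the simulated traits are uncorrelated, a single influence shared by all testlets is enough to make the dimension scores correlate substantially.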
The interesting aspect of this situation is that the suggestions of a 6th factor are very oblique.
The only indication from the EFA was the predominance of the 1st factor in the scree plot, suggesting not 6 factors, but 1.
The only indications in the CFA are the generally high correlations among the Big 5 factors.
Basically the CFA results suggest that there are correlations among the testlets from different dimensions that can’t be accounted for unless we assume that there are very high correlations among the dimension factors.
Inspection of the modification indices gives no hint either. There is not yet an “add-a-factor-affecting-all-observed-variables” modification index available in AMOS or any other SEM program.
Exercise 4: Add a faking factor to the model.
Your input model should look like this
[Figure: AMOS input model with the Faking Ability factor added]
All that remains is to fix one of the Faking Ability regression weights. Fix the one to FETL1 and run the model.
The Dark Side of CFA (and SEM)
Setting the regression of FETL1 onto Faking Ability to 1 results in a model that “doesn’t converge”. Argh!!
The reason is that the estimates of parameters are obtained using iterative “hill climbing” methods.
For some models, there is no “top” of the hill: all the ground is level, and the program probably would iterate forever.
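A toy version of the problem can be sketched in a few lines. This is not how AMOS estimates models. It is an assumed, simplified illustration of a related situation: a fit function with a level ridge rather than a single peak, so the answer an iterative search gives depends on where it starts. The function and parameter names are hypothetical.

```python
def fit(a, b, target=2.0, step=0.01, n_iter=5000):
    """Minimise (a*b - target)**2 by plain gradient descent.
    Only the product a*b is determined by the fit function, so every
    point with a*b == target fits equally well: a level ridge."""
    for _ in range(n_iter):
        error = a * b - target
        # Partial derivatives of (a*b - target)**2 are 2*error*b and 2*error*a.
        a, b = a - step * 2 * error * b, b - step * 2 * error * a
    return a, b

a1, b1 = fit(1.0, 1.0)   # one set of starting values
a2, b2 = fit(3.0, 0.5)   # another set of starting values
print("start (1, 1):   a = %.3f, b = %.3f, a*b = %.3f" % (a1, b1, a1 * b1))
print("start (3, 0.5): a = %.3f, b = %.3f, a*b = %.3f" % (a2, b2, a2 * b2))
```

Both runs reach a perfect fit, yet they report different parameter values. Fixing one parameter (here, setting a to a constant) removes the ridge, which is the same role the fixed regression weight plays in the exercise.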
Exercise 5. Experiment by setting different regression arrows from Faking Ability to the various testlets to 1. Set only one at a time. Record the result from each. If the model converges, write the chi-square value.
Testlet   Result (include chi-square if it’s printed, or write “Failure to converge”)
FETL1 _____________________________________
FETL2 _____________________________________
FETL3 _____________________________________
FATL1 _____________________________________
FATL2 _____________________________________
FATL3 _____________________________________
FCTL1 _____________________________________
FCTL2 _____________________________________
FCTL3 _____________________________________
FSTL1 _____________________________________
FSTL2 _____________________________________
FSTL3 _____________________________________
FOTL1 _____________________________________
FOTL2 _____________________________________
FOTL3 _____________________________________
Exercise 4: One of the successful solutions (FA -> FETL3 = 1.)
[Figure: AMOS solution with the Faking Ability factor]
Note
1) the chi-square is nearly “not significant”
2) the RMSEA value is in the “good fit” range.
3) the Faking ability factor loads positively on all testlets.
4) the correlations between the factors are much smaller than they were without the faking factor.
Life is good!
The moral of this story: Don’t give up. Try everything you can think of to get a usable solution.
Exercise 6. Factor Analysis of Caldwell Mor Barak scale
Perform an EFA on the Caldwell Mor Barak scale items.
The file is caldwellnm040516. It’ll have to be downloaded.
The questionnaire was responded to by 196 female African American managers who accessed it through a web site.
The items in each scale are
Perception of Organizational Fairness
OFF1R 13 Reverse of - I feel I have been treated differently here because of my race.
OFF2 14 Managers here have a track record of hiring and promoting em
OFF3 15 Managers here give feedback and evaluate employees fairly, r
OFF4 16 Manager's here make layoff decisions fairly, regardless of f
OFF5 17 Managers interpret human resource policies (such as sick lea
OFF6 18 Managers here give assignments based on the skills and abili
Perception of Organizational Inclusion
OIF1 19 Management here encourages the formation of employee network
OIF2 20 There is a mentoring program in use here that identifies and
OIF3 21 The old boys' network is alive and well here.
OIF3R 22 Reverse of - The old boys' network is alive and well here.
OIF4 23 The company spends enough money and time on diversity awaren
Personal Value for Diversity
PDVF1 24 Knowing more about cultural norms of diverse groups would he
PDVF2 25 I think that diverse viewpoints add value.
PDVF3 26 I believe diversity is a strategic business issue.
Personal Comfort with Diversity
PCF1 27 I feel at ease with people from backgrounds other than my ow
PCF2 28 I am afraid to disagree with members of other groups for fea
PCF2R 29 Reverse of - I am afraid to disagree with members of other g
PCF3 30 Diversity issues keep some work teams here from performing t
PCF3R 31 Reverse of - Diversity issues keep some work teams here from
See if an EFA gives the same factors. Be sure to investigate an oblique-factors solution.
Exercise 6: Some output you should get
There are 4 eigenvalues >= 1, which is as it should be given the knowledge of the items.
But the scree plot doesn’t really indicate 4 factors.
Exercise 6 output
[Figures: SPSS output tables]
Exercise 7: CFA on same data
Perform a CFA of the Caldwellnm data using the results from the EFA to guide your choice of loadings.
Use the CaldwellObliqueCFAStarter.amw file as a starting point. It should look like the following . . .
[Figure: CaldwellObliqueCFAStarter.amw path diagram]
Note that in this file, the residual variances are set equal to 1, rather than the residual regression arrows. This was done on orders from the dark side.
Exercise 7: Your result should look something like the following
[Figure: completed CFA model for the Caldwellnm data]
Note the correlations between error terms. These may represent specific similarities between items in wording or content. The “across factor” loadings were suggested by modification indices.
Entering Summary Data for Amos
Since the analyses in Amos are based on summary statistics - the variances and covariances between the variables - only the variances and covariances need be entered.
They can be entered
1) as variances and covariances along with means, or
2) as correlations along with means and standard deviations.
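The two forms carry the same information, because each covariance is the correlation multiplied by the two standard deviations. A minimal sketch of that conversion follows; the function name and the example values are hypothetical.

```python
import numpy as np

def correlations_to_covariances(corr, sds):
    """Convert a correlation matrix and standard deviations to covariances:
    cov[i, j] = corr[i, j] * sd[i] * sd[j]."""
    sds = np.asarray(sds, dtype=float)
    return np.asarray(corr, dtype=float) * np.outer(sds, sds)

# Made-up values for two variables.
corr = [[1.0, 0.3],
        [0.3, 1.0]]
sds = [2.0, 5.0]
cov = correlations_to_covariances(corr, sds)
print(cov)
```

The diagonal of the result holds the variances (the squared standard deviations).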
They can be entered into SPSS, into Excel, into Access, or even Word.
Example from SPSS (Data from AMOS 4.0 manual)
[Figure: summary data entered in SPSS]
Example from Excel (Data from Barrick et al. (2002), JAP)
[Figure: summary data entered in Excel]
Rules
I. Names of columns in the data file
A. First column’s name is rowtype_ (Underscore is important!!)
B. Second column’s name is varname_ (Again, the underscore!!)
C. 3rd and subsequent columns. The names of these columns are the names of the variables.
II. Rows of the data file
A. Row 1:
Contains the letter, n, in column 1.
Contains nothing in column 2.
Contains sample size in subsequent columns.
B. Row 2 through K+1, where K is the number of variables:
Column 1 contains either “corr” without the quotes or “cov”, depending on whether the entries are correlations or covariances.
Column 2 contains the variable names, in same order as listed across the top.
Columns 3 through K+2 contain correlations or covariances, depending on what you have, up to the diagonal of the matrix.
C. Row K+2
Contains the word, stddev, in column 1, nothing in column 2, and standard deviations in columns 3 through K+2.
D. Row K+3
Contains the word, mean, in column 1, nothing in column 2, and means in columns 3 through K+2.
If you want to analyze correlations, enter 1 for each standard deviation and 0 for each mean.
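The layout rules above can be sketched as a small helper that builds the rows before they are typed or pasted into SPSS or Excel. The function name, the variable names x1 to x3, and all numbers are hypothetical; the layout simply follows the rules as stated in this handout.

```python
def amos_summary_rows(names, matrix, sds, means, n, rowtype="corr"):
    """Build the summary-data layout as a list of rows: a header of column
    names, an n row, one corr/cov row per variable (filled up to the
    diagonal), then a stddev row and a mean row."""
    k = len(names)
    rows = [["rowtype_", "varname_"] + list(names)]
    rows.append(["n", ""] + [n] * k)
    for i, name in enumerate(names):
        lower = [matrix[i][j] for j in range(i + 1)]      # entries up to the diagonal
        rows.append([rowtype, name] + lower + [""] * (k - i - 1))
    rows.append(["stddev", ""] + list(sds))
    rows.append(["mean", ""] + list(means))
    return rows

# Made-up example with three variables.
names = ["x1", "x2", "x3"]
corr = [[1.0, 0.4, 0.2],
        [0.4, 1.0, 0.5],
        [0.2, 0.5, 1.0]]
rows = amos_summary_rows(names, corr, sds=[1.2, 0.9, 2.0],
                         means=[3.1, 2.8, 4.0], n=196)
for row in rows:
    print("\t".join(str(cell) for cell in row))
```

The printed output is tab-separated, so it can be pasted directly into a spreadsheet. Passing rowtype="cov" with a covariance matrix gives the other accepted form.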