Factor Analysis
Factor analysis is utilized after collecting the responses to the questions of the questionnaire. As the name implies, its main purpose is to see whether a smaller number of ‘common factors’ account for the pattern of responses to a set of questions or variables. It looks at the variance of variables (responses to questions) and quantifies how much is attributable to ‘common variance’ as opposed to ‘unique variance’. For purposes of creating scales, factor analysis is used to identify underlying patterns of variation common to the responses of the questions under consideration (De Vaus 2002:188; Shennan 1988:271). Essentially, do a set of responses to questions co-vary together because they have underlying factors in common? In this regard, factor analysis is closely related to correlation analysis, and more generally, to the pattern recognition work of structuralism.
To illustrate with an example: in creating a scale for the association of spirituality, I took eight questions which are conceptually related and/or possess a relationship indicated by informant interviews. The assumption at work is that if these questions do in fact elicit common responses concerning spirituality from respondents, then these responses should demonstrate a pattern, either affirming spirituality associations with Teotihuacan overall or deemphasizing such an association. A heterogeneous mix of responses would indicate that the questions are misunderstood, that spirituality is not an appropriate concept in relation to the values of the respondents, or that the concept, as formulated in the questions, is too broad to get at the particular spiritual associations held by respondents; in sum, that the questions, and thus the responses, are unrelated. This is where factor analysis helps to refine the concepts of the research. These questions are listed in the factor analysis results of Table A2.1, and range from “I visit Teotihuacan to collect energy” (a reason indicated by numerous practitioners of a Mexico City centered organization who regularly ascend the Pyramid of the Sun on Sundays) to “Archaeology is not the best manner of understanding Teotihuacan” (a general sentiment expressed by visitors who participate in rituals).
The evaluative justification of factor analysis is based upon parsimony. Any number of factors may be identified which group a series of variables together, but many of these will indicate very weak relationships. So most statistical programs employ a statistic called an ‘eigenvalue’ to determine, from a potentially large set of factors, which ones are best. They do this by determining which factors ‘explain’ the most variance amongst the variables (responses to the questions). The best model is the one that accounts for the most variance with the fewest factors.
The eigenvalue for each factor indicates how much of the total variance (of all the variables) that factor ‘explains’. As a rule of thumb, only those factors with an eigenvalue greater than 1.0, that is, those which ‘explain’ a large amount of the total variance, are retained for the subsequent step in factor analysis.
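The eigenvalue rule of thumb described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in this study: the function name `kaiser_retained` and the synthetic responses (six made-up questions driven by two hidden factors) are assumptions introduced for the example, and the eigenvalues are taken from the correlation matrix of the responses.

```python
import numpy as np

def kaiser_retained(responses):
    """Eigenvalues of the correlation matrix, largest first, plus the
    share of total variance each 'explains' and the number above 1.0."""
    corr = np.corrcoef(responses, rowvar=False)   # questions are columns
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sort descending
    share = eigenvalues / eigenvalues.sum()       # proportion of total variance
    n_retained = int(np.sum(eigenvalues > 1.0))   # rule of thumb: eigenvalue > 1.0
    return eigenvalues, share, n_retained

# Hypothetical data: 500 respondents, 6 questions, 2 underlying factors.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
noise = rng.normal(size=(n, 6))
responses = np.column_stack([
    f1 + 0.5 * noise[:, 0], f1 + 0.5 * noise[:, 1], f1 + 0.5 * noise[:, 2],
    f2 + 0.5 * noise[:, 3], f2 + 0.5 * noise[:, 4], f2 + 0.5 * noise[:, 5],
])

eigenvalues, share, n_retained = kaiser_retained(responses)
print("eigenvalues:", np.round(eigenvalues, 3))
print("share of variance:", np.round(share, 3))
print("factors retained:", n_retained)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1.0 means the factor accounts for more variance than any single question would on its own.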
Once the best factors are selected (which explain the most total variance), they are then included in a ‘rotated factor analysis’ to determine which variables (the questions of the questionnaire) most ‘belong’ to which factor. That is, before we were looking at the total variance for all of the questions pertaining to spirituality and which factors accounted for the majority of this variance. Now I want to discern which particular questions are best ‘explained’, or their variance accounted for, by which particular factors. There are multiple methods for rotated factor analysis, but I utilized ‘varimax rotation’ as it is included in the SPSS statistical package employed in this study. Table A2.1 shows the results of a rotated factor analysis for the spirituality association. This table lists the best factors (labeled ‘components’ in SPSS) and all of the variables (questions) selected in the form of a coefficient matrix. While there is no absolute rule as to what a threshold value should be for each coefficient, most statisticians in the social sciences would not use a variable with a coefficient less than 0.3 (De Vaus 2002:190). This matrix presents a mathematically rendered pattern, which must now be interpreted to be meaningful. So looking at the results for the first factor (column 1), only the questions pertaining to understanding Teotihuacan in non-educational or non-archaeological terms ‘hang together’ with coefficients greater than 0.3. This is termed ‘loading’, indicating that these questions co-vary together and share the underlying pattern common to factor 1. At this point, it is up to the researcher to come up with a conceptual commonality to account for this empirical commonality. That is, what does factor 1 relate to which could account for a shared pattern of responses to these two questions? 
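For readers without SPSS, the rotation step can be approximated with a commonly used SVD-based varimax routine. The sketch below is an assumption-laden illustration rather than a reproduction of the SPSS output: the function `varimax`, the toy loading matrix, and the 30-degree mixing angle are all hypothetical, and only the 0.3 cut-off is taken from the discussion above.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix.
    Returns the rotated loadings and the rotation matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated**3 - rotated @ np.diag(np.sum(rotated**2, axis=0)) / p
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                      # nearest orthogonal matrix
        new_criterion = s.sum()
        if criterion != 0 and new_criterion < criterion * (1 + tol):
            break                              # improvement is negligible
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical 'simple structure': two questions per factor.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.5]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix                       # blur the pattern by mixing the factors

rotated, rotation = varimax(unrotated)
salient = np.abs(rotated) > 0.3                # threshold discussed above
print(np.round(rotated, 3))
print(salient)
```

The rotation leaves each question's total explained variance (the row sum of squared coefficients) unchanged; it only redistributes that variance across factors so that each question ‘belongs’ as clearly as possible to one of them. Column order and signs of the rotated factors are arbitrary.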
Looking at the results of the other questions, those pertaining to visiting the site for purposes of health and collecting the beneficial energy of the Pyramid of the Sun ‘hang together’ under factor 2 (column 2), with strong coefficients greater than 0.5 (0.589 and 0.544 respectively). I interpreted the common pattern underlying these two questions as a conscious usage of Teotihuacan for reasons of spiritual health or well-being. That is, individuals consciously associate the site with properties beneficial to their personal well-being. I subsequently labeled this factor as the ‘Spiritual-personal’ factor.
Contrasting this usage of Teotihuacan for immediate and personal benefits with the results in column 1, I interpreted the latter as dealing with more abstract principles, specifically the idea that Teotihuacan is better understood in unconventional and non-institutional terms. As will be discussed further below in comparing the inter-relationships between the primary associations at Teotihuacan, the responses to these two questions indicate a consistent pattern as mathematized in Table A2.1: individuals either strongly disagreed with both of these statements or strongly agreed with both. Furthermore, those who generally disagreed with these statements (i.e. felt strongly that visiting Teotihuacan is educational) also generally affirmed questions relating Teotihuacan to scientific and rational understanding. By distinction, I decided to label the factor under column 1 ‘Spiritual-nonrational’ as a shorthand to identify this pattern of responses.
The questions loading onto the final factor (column 3), while informative for descriptive statistics, were dropped from consideration for inclusion on a scale as I felt they were conceptually unrelated: one pertains to visiting Teotihuacan to honor tradition and the other concerns allowing greater access to the site (roughly 90% of which is closed to the public). These two questions loaded strongly onto factor 3 (coefficients of 0.630 and 0.677 respectively), but the connection underlying this pattern of responses will have to wait for future study.
Having retained these four questions for two scales, the final aid of factor analysis is to assist in creating ‘weighted factor-based scales’. Weighted factor-based scales were created for two of the explanatory concepts (diversion and spirituality) which loaded onto several factors (Tables A2.1-2). The rationale behind weighted scales is to mitigate the caveat listed at the outset regarding the construction of scales and the worry of combining scores that are not equivalent. If a particular question loads onto a concept (such as spirituality) more strongly by exhibiting a higher coefficient in the rotated factor analysis, then the statistical assumption is that it ‘taps’ or measures that concept more effectively. So when combining the scores from each question into a summated scale, the reasoning is that some questions ought to count more than others. To make the ‘contributions’ of each question ‘equal’, the scores from each particular question are multiplied by their rotated factor analysis coefficients or ‘weights’. So for the Spiritual-nonrational scale, I weighted the scores using the equation:
((variable 1) x (factor loading)) + ((variable 2) x (factor loading)) = scale score
-or-
((‘score’ for “Archaeology is not best manner of understanding Teo.”) x (0.372))
+ ((‘score’ for “Visiting Teo. is not educational”) x (0.477)) = scale score for individual
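As a worked illustration of the weighting equation above, the following sketch applies the two weights (0.372 and 0.477) to made-up responses. The function name `weighted_scale_scores`, the three example respondents, and the 1-to-5 agreement coding are assumptions introduced only for this example; they are not data from the study.

```python
import numpy as np

# Rotated factor coefficients used as weights, in question order:
# "Archaeology is not best manner of understanding Teo." -> 0.372
# "Visiting Teo. is not educational"                      -> 0.477
WEIGHTS = np.array([0.372, 0.477])

def weighted_scale_scores(scores, weights=WEIGHTS):
    """Multiply each question's score by its weight and sum per respondent."""
    scores = np.asarray(scores, dtype=float)   # rows: respondents, cols: questions
    return scores @ weights

# Hypothetical respondents, assuming 1 = strongly disagree ... 5 = strongly agree.
example_scores = [
    [5, 5],   # strongly agrees with both statements
    [1, 1],   # strongly disagrees with both statements
    [4, 2],   # mixed response
]

scale = weighted_scale_scores(example_scores)
print(np.round(scale, 3))   # one weighted scale score per respondent
```

With these weights, the second question counts somewhat more toward the scale score than the first, which is the intended effect of weighting by coefficient.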