Monday, 15 June 2009

FACTOR ANALYSIS with MINITAB

Indo Yama Nasarudin

Factor analysis was chosen as the analytical tool for this study because the study tries to find the interrelationships among several variables that are initially treated as independent of one another. The variables can then be grouped into a set that is smaller than the original number of variables and is therefore easier for company management or policy makers to control.
The purpose of factor analysis is essentially twofold. The first is data summarization, that is, identifying the relationships among the variables being analyzed. The second is data reduction: once the correlations have been computed, a new set of variables, called factors, is created.
Factor analysis is a technique used to understand the dimensions or regularities that underlie a phenomenon. Its main purpose is to summarize the information contained in a large number of variables into a smaller group of factors. Statistically, the main goal is to determine linear combinations of the variables that help in investigating how those variables are interrelated. In other words, it is used to identify the variables or factors that explain the pattern of relationships within a set of variables. The technique is useful for reducing the amount of data in order to identify a small number of factors that explain most of the variance observed in a much larger group of variables.
The main use of factor analysis is data reduction, that is, summarizing a number of variables into a smaller number. The reduction is done by examining the interdependence among variables that can be combined into one, called a factor, so that the dominant or important variables or factors can be found and analyzed further.
The factor analysis procedure can also be used to generate hypotheses about causal mechanisms, or to screen variables for subsequent analysis, for example to identify collinearity before performing a linear regression analysis.
The factor analysis procedure offers a high degree of flexibility, including:
· Seven methods of factor extraction.
· Five methods of rotation, including direct oblimin and promax for non-orthogonal rotation.
· Three methods of computing factor scores, which can then be saved to a file for further analysis.
As an example, suppose that in a study we want to know which attitudes underlie people's answers to the questions in a political survey. The results show significant overlap among several subgroups of questions: questions about taxation tend to correlate with one another, questions about military issues correlate with one another, and so do questions about economic issues. In such a situation the problem is best handled with factor analysis. With this technique we can investigate the underlying factors and identify which factors represent the questions conceptually. We can also compute factor scores for each respondent and use them in subsequent analysis. For example, we can build a logistic regression model to predict voting behavior based on the factor scores.
To use this technique, the following requirements should be met:
· The data are quantitative, measured on an interval or ratio scale.
· The data should have a bivariate normal distribution for each pair of variables.
· The model specifies that every variable is determined by common factors (the factors estimated by the model) and unique factors (which do not overlap between the observed variables).
· The computed estimates are based on the assumption that all unique factors are uncorrelated with each other and with the common factors.
· The basic requirement for combining variables is that the correlation between variables is at least 0.5, because factor analysis rests on the existence of correlation among variables.
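The third and fourth requirements above can be written compactly as the usual common factor model. The notation below (x_i for an observed variable, F_j for a common factor, l_ij for a loading, e_i for a unique factor) is introduced here only for illustration and does not come from the original text; the last line additionally assumes uncorrelated common factors with unit variance.

```latex
% Each observed variable is a linear combination of m common factors
% plus one unique factor (p variables, m < p):
x_i = \mu_i + \sum_{j=1}^{m} l_{ij} F_j + e_i , \qquad i = 1, \dots, p

% Unique factors are uncorrelated with each other and with the common factors:
\operatorname{Cov}(e_i, e_k) = 0 \ (i \neq k), \qquad \operatorname{Cov}(e_i, F_j) = 0

% With uncorrelated, unit-variance common factors, the part of the variance
% of x_i explained by the common factors (its communality) is:
h_i^2 = \sum_{j=1}^{m} l_{ij}^2
```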

2.1. Steps for Performing Factor Analysis with Minitab
Use Minitab's multivariate analysis procedures to analyze your data when you have made multiple measurements on items or subjects. You can choose to:
- Analyze the data covariance structure to understand it or to reduce the data dimension
- Assign observations to groups
- Explore relationships among categorical variables
Because Minitab does not compare tests of significance for multivariate procedures, interpreting the results is somewhat subjective. However, you can make informed conclusions if you are familiar with your data.
Analysis of the data structure
Minitab offers two procedures for analyzing the data covariance structure:
- Principal Components helps you to understand the covariance structure in the original variables and/or to create a smaller number of variables using this structure.
- Factor Analysis, like principal components, summarizes the data covariance structure in a smaller number of dimensions. The emphasis in factor analysis is the identification of underlying "factors" that might explain the dimensions associated with large data variability.
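As a rough illustration of what "summarizing the covariance structure" means, the sketch below extracts unrotated principal-component loadings from a correlation matrix with NumPy. It is not Minitab's implementation; the function name `principal_component_loadings` and the synthetic data are assumptions made for this example only.

```python
import numpy as np

def principal_component_loadings(data, n_factors):
    """Unrotated principal-component loadings of the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # A loading is an eigenvector element scaled by the square root of its eigenvalue.
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    # Communality: share of each variable's variance explained by the kept factors.
    communalities = (loadings ** 2).sum(axis=1)
    return eigvals, loadings, communalities

# Synthetic data (hypothetical): 5 variables driven by 2 hidden factors plus noise.
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 2))
weights = rng.normal(size=(2, 5))
data = common @ weights + 0.5 * rng.normal(size=(200, 5))

eigvals, loadings, communalities = principal_component_loadings(data, n_factors=5)
print("eigenvalues:", np.round(eigvals, 4))
print("communalities with all 5 factors:", np.round(communalities, 3))
```

When as many factors as variables are kept, every communality equals 1 and the eigenvalues sum to the number of variables; keeping fewer factors lowers the communalities.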

Internal Consistency
- Item Analysis assesses how reliably multiple items in a survey or test measure the same construct.
Grouping observations
Minitab offers three cluster analysis methods and discriminant analysis for grouping observations:
- Cluster Observations groups or clusters observations that are "close" to each other when the groups are initially unknown. This method is a good choice when no outside information about grouping exists. The choice of final grouping is usually made according to what makes sense for your data after viewing clustering statistics.
- Cluster Variables groups or clusters variables that are "close" to each other when the groups are initially unknown. The procedure is similar to clustering of observations. You may want to cluster variables to reduce their number.
- Cluster K-Means, like clustering of observations, groups observations that are "close" to each other. K-means clustering works best when sufficient information is available to make good starting cluster designations.
- Discriminant Analysis classifies observations into two or more groups if you have a sample with known groups. You can use discriminant analysis to investigate how the predictors contribute to the groupings.
Correspondence Analysis
Minitab offers two methods of correspondence analysis to explore the relationships among categorical variables:
- Simple Correspondence Analysis explores relationships in a 2-way classification. You can use this procedure with 3-way and 4-way tables because Minitab can collapse them into 2-way tables. Simple correspondence analysis decomposes a contingency table similar to how principal components analysis decomposes multivariate continuous data. Simple correspondence analysis performs an eigen analysis of data, breaks down variability into underlying dimensions, and associates variability with rows and/or columns.
- Multiple Correspondence Analysis extends simple correspondence analysis to the case of 3 or more categorical variables. Multiple correspondence analysis performs a simple correspondence analysis on an indicator variables matrix in which each column corresponds to a level of a categorical variable. Rather than a 2-way table, the multi-way table is collapsed into 1 dimension.
There are three ways that you might carry out a factor analysis in Minitab. The usual way, described below, is to enter columns containing your measurement variables, but you can also use a matrix as input (See To perform factor analysis with a correlation or covariance matrix) or use stored loadings as input (See To perform factor analysis with stored loadings).
1 Choose Stat > Multivariate > Factor Analysis.
2 In Variables, enter the columns containing the measurement data.
3 If you like, use any dialog box options, then click OK.
You can choose to calculate the factor loadings and coefficients from a stored correlation or covariance matrix rather than the raw data. In this case, the raw data will be ignored. (Please note that this means scores can not be calculated.)
If it makes sense to standardize variables (usual choice when variables are measured by different scales), enter a correlation matrix; if you do not wish to standardize, enter a covariance matrix.
1 Choose Stat > Multivariate > Factor Analysis.
2 Click Options.
3 Under Matrix to Factor, choose Correlation or Covariance.
4 Under Source of Matrix, choose Use matrix and enter the matrix. Click OK.
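The choice between a correlation and a covariance matrix in step 3 comes down to whether the variables are standardized first. The short NumPy sketch below, using made-up data on two very different scales, shows that the correlation matrix is simply the covariance matrix of the standardized columns.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: two columns measured on very different scales.
raw = np.column_stack([
    rng.normal(5000.0, 1200.0, size=50),
    rng.normal(12.0, 2.0, size=50),
])

cov = np.cov(raw, rowvar=False)        # dominated by the large-scale column
corr = np.corrcoef(raw, rowvar=False)  # scale-free

# Standardize each column (subtract the mean, divide by the sample standard deviation).
standardized = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
cov_of_standardized = np.cov(standardized, rowvar=False)

print(np.round(cov, 2))
print(np.round(corr, 3))
print(np.round(cov_of_standardized, 3))
```

With a covariance matrix, the first column would dominate the extracted factors purely because of its units, which is why a correlation matrix is the usual choice when scales differ.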
If you store initial factor loadings from an earlier analysis, you can input these initial loadings to examine the effect of different rotations. You can also use stored loadings to predict factor scores of new data.
1. Click Options in the Factor Analysis dialog box.
2. Under Loadings for Initial Solution, choose Use loadings. Enter the columns containing the loadings. Click OK.
3. Do one of the following, and then click OK:
- To examine the effect of a different rotation method, choose an option under Type of Rotation. See Rotating the factor loadings for a discussion of the various rotations.
- To predict factor scores with new data, in Variables, enter the columns containing the new data.
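To make the idea of re-rotating stored loadings concrete, here is a minimal varimax sketch in NumPy. It uses a commonly published SVD-based iteration without row normalization, so it is only an approximation of the idea; Minitab's own routine may differ in details, and the result should not be expected to match Minitab output digit for digit. The input values are the unrotated two-factor loadings from the census tract example in this post.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation (raw, without Kaiser normalization)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                 # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                         # no further improvement
        criterion = new_criterion
    return loadings @ rotation, rotation

# Stored unrotated loadings (rows: Pop, School, Employ, Health, Home).
stored = np.array([
    [0.971, 0.160],
    [0.494, 0.833],
    [1.000, 0.000],
    [0.848, -0.395],
    [-0.249, 0.375],
])

rotated, rotation = varimax(stored)
print(np.round(rotated, 3))
print("communalities before:", np.round((stored ** 2).sum(axis=1), 3))
print("communalities after: ", np.round((rotated ** 2).sum(axis=1), 3))
```

Because the rotation matrix is orthogonal, the communalities are the same before and after rotation; only the way the explained variance is split between the factors changes.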
You record the following characteristics of 14 census tracts: total population (Pop), median years of schooling (School), total employment (Employ), employment in health services (Health), and median home value (Home) (data from [6], Table 8.2). You would like to investigate what "factors" might explain most of the variability. As the first step in your factor analysis, you use the principal components extraction method and examine an eigenvalues (scree) plot in order to help you to decide upon the number of factors.
1. Open the worksheet EXH_MVAR.MTW.
2. Choose Stat > Multivariate > Factor Analysis.
3. In Variables, enter Pop-Home.
4. Click Graphs and check Scree plot. Click OK in each dialog box.
Session window output
Factor Analysis: Pop, School, Employ, Health, Home
Principal Component Factor Analysis of the Correlation Matrix
Unrotated Factor Loadings and Communalities
Variable  Factor1  Factor2  Factor3  Factor4  Factor5  Communality
Pop         0.972    0.149   -0.006   -0.170    0.067        1.000
School      0.545    0.715    0.415    0.140   -0.001        1.000
Employ      0.989    0.005   -0.089   -0.083   -0.085        1.000
Health      0.847   -0.352   -0.344    0.200    0.022        1.000
Home       -0.303    0.797   -0.523   -0.005   -0.002        1.000
Variance   3.0289   1.2911   0.5725   0.0954   0.0121       5.0000
% Var       0.606    0.258    0.114    0.019    0.002        1.000

Factor Score Coefficients
Variable  Factor1  Factor2  Factor3  Factor4  Factor5
Pop         0.321    0.116   -0.011   -1.782    5.511
School      0.180    0.553    0.726    1.466   -0.060
Employ      0.327    0.004   -0.155   -0.868   -6.988
Health      0.280   -0.272   -0.601    2.098    1.829
Home       -0.100    0.617   -0.914   -0.049   -0.129


Interpreting the results
Five factors describe these data perfectly, but the goal is to reduce the number of factors needed to explain the variability in the data. Examine the % Var line of the Session window results or the eigenvalues plot. The proportion of variability explained by the last two factors is minimal (0.019 and 0.002, respectively), so these two factors can be dropped as unimportant. The first two factors together represent 86% of the variability, while three factors explain 98% of the variability. The question is whether to use two or three factors. The next step might be to perform separate factor analyses with two and three factors and examine the communalities to see how individual variables are represented. If there were one or more variables not well represented by the more parsimonious two factor model, you might select a model with three or more factors.
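The 86% and 98% figures can be reproduced from the Variance row of the unrotated output. The snippet below is a small sketch of that arithmetic in plain Python; the eigenvalues are copied from the Session window output above.

```python
# Variance (eigenvalue) row from the principal components output above.
eigenvalues = [3.0289, 1.2911, 0.5725, 0.0954, 0.0121]

total = sum(eigenvalues)  # equals the number of variables for a correlation matrix
proportions = [value / total for value in eigenvalues]

# Running total of the proportions: variability explained by the first k factors.
cumulative = []
running = 0.0
for proportion in proportions:
    running += proportion
    cumulative.append(running)

for k, (proportion, cum) in enumerate(zip(proportions, cumulative), start=1):
    print(f"{k} factor(s): proportion {proportion:.3f}, cumulative {cum:.3f}")
```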
Two factors were chosen as the number to represent the census tract data of the Example of Factor Analysis Using Principal Components. You perform a maximum likelihood extraction and varimax rotation to interpret the factors.
1. Open the worksheet EXH_MVAR.MTW.
2. Choose Stat > Multivariate > Factor Analysis.
3. In Variables, enter Pop-Home.
4. In Number of factors to extract, enter 2.
5. Under Method of Extraction, choose Maximum likelihood.
6. Under Type of Rotation, choose Varimax.
7. Click Graphs and check Loading plot for first 2 factors.
8. Click Results and check Sort loadings. Click OK in each dialog box.
Session window output
Factor Analysis: Pop, School, Employ, Health, Home
Maximum Likelihood Factor Analysis of the Correlation Matrix
* NOTE * Heywood case
Unrotated Factor Loadings and Communalities
Variable  Factor1  Factor2  Communality
Pop         0.971    0.160        0.968
School      0.494    0.833        0.938
Employ      1.000    0.000        1.000
Health      0.848   -0.395        0.875
Home       -0.249    0.375        0.202
Variance   2.9678   1.0159       3.9837
% Var       0.594    0.203        0.797
Rotated Factor Loadings and Communalities
Varimax Rotation
Variable  Factor1  Factor2  Communality
Pop         0.718    0.673        0.968
School     -0.052    0.967        0.938
Employ      0.831    0.556        1.000
Health      0.924    0.143        0.875
Home       -0.415    0.173        0.202
Variance   2.2354   1.7483       3.9837
% Var       0.447    0.350        0.797
Sorted Rotated Factor Loadings and Communalities
Variable  Factor1  Factor2  Communality
Health      0.924    0.143        0.875
Employ      0.831    0.556        1.000
Pop         0.718    0.673        0.968
Home       -0.415    0.173        0.202
School     -0.052    0.967        0.938
Variance   2.2354   1.7483       3.9837
% Var       0.447    0.350        0.797

Factor Score Coefficients
Variable  Factor1  Factor2
Pop        -0.165    0.246
School     -0.528    0.789
Employ      1.150    0.080
Health      0.116   -0.173
Home       -0.018    0.027

Interpreting the results
The results indicate that this is a Heywood case. There are three tables of loadings and communalities: unrotated, rotated, and sorted and rotated. The unrotated factors explain 79.7% of the data variability (see last line under Communality) and the communality values indicate that all variables but Home are well represented by these two factors (communalities are 0.202 for Home, 0.875-1.0 for other variables). The percent of total variability represented by the factors does not change with rotation, but after rotating, these factors are more evenly balanced in the percent of variability that they represent, being 44.7% and 35.0%, respectively.
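The communality and variance figures can be checked directly from the printed loadings: a communality is the sum of squared loadings across a row, and the variance of a factor is the sum of squared loadings down a column. The NumPy sketch below uses the values from the two tables above; small differences in the third decimal come from the rounding of the printed loadings.

```python
import numpy as np

names = ["Pop", "School", "Employ", "Health", "Home"]

# Loadings copied from the unrotated and rotated tables above.
unrotated = np.array([
    [0.971, 0.160],
    [0.494, 0.833],
    [1.000, 0.000],
    [0.848, -0.395],
    [-0.249, 0.375],
])
rotated = np.array([
    [0.718, 0.673],
    [-0.052, 0.967],
    [0.831, 0.556],
    [0.924, 0.143],
    [-0.415, 0.173],
])

# Row sums of squares: communality of each variable.
comm_unrotated = (unrotated ** 2).sum(axis=1)
comm_rotated = (rotated ** 2).sum(axis=1)

# Column sums of squares: variance represented by each factor.
var_unrotated = (unrotated ** 2).sum(axis=0)
var_rotated = (rotated ** 2).sum(axis=0)

for name, before, after in zip(names, comm_unrotated, comm_rotated):
    print(f"{name:7s} communality: unrotated {before:.3f}, rotated {after:.3f}")
print("variance unrotated:", np.round(var_unrotated, 4), "total", round(var_unrotated.sum(), 4))
print("variance rotated:  ", np.round(var_rotated, 4), "total", round(var_rotated.sum(), 4))
print("% Var rotated:     ", np.round(var_rotated / len(names), 3))
```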
Sorting is done by the maximum absolute loading for any factor. Variables that have their highest absolute loading on factor 1 are printed first, in sorted order. Variables with their highest absolute loadings on factor 2 are printed next, in sorted order, and so on. Factor 1 has large positive loadings on Health (0.924), Employ (0.831), and Pop (0.718), and a -0.415 loading on Home while the loading on School is small. Factor 2 has a large positive loading on School of 0.967 and loadings of 0.556 and 0.673, respectively, on Employ and Pop, and small loadings on Health and Home.
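The sorting rule described above can be sketched in a few lines of plain Python. The helper name `sort_loadings` is made up for this illustration; the loadings are the rotated values from the output.

```python
# Rotated loadings from the output above: variable -> (Factor1, Factor2).
rotated_loadings = {
    "Pop": (0.718, 0.673),
    "School": (-0.052, 0.967),
    "Employ": (0.831, 0.556),
    "Health": (0.924, 0.143),
    "Home": (-0.415, 0.173),
}

def sort_loadings(loadings):
    """Order variables by the factor they load on most, then by loading size."""
    def sort_key(item):
        _, row = item
        absolute = [abs(value) for value in row]
        top_factor = absolute.index(max(absolute))  # factor with the largest |loading|
        # Group by that factor, then sort by descending absolute loading.
        return (top_factor, -absolute[top_factor])
    return [name for name, _ in sorted(loadings.items(), key=sort_key)]

order = sort_loadings(rotated_loadings)
print(order)
```

Home is grouped with factor 1 because its largest absolute loading (0.415) is on that factor, even though the loading is negative.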
You can view the rotated loadings graphically in the loadings plot. What stands out for factor 1 are the high loadings on the variables Pop, Employ, and Health and the negative loading on Home. School has a high positive loading for factor 2 and somewhat lower values for Pop and Employ.
Let's give a possible interpretation to the factors. The first factor positively loads on population size and on two variables, Employ and Health, that generally increase with population size. It negatively loads on home value, but this may be largely influenced by one point. We might consider factor 1 to be a "health care - population size" factor. The second factor might be considered to be an "education - population size" factor. Both Health and School are correlated with Pop and Employ, but not much with each other.
In addition, Minitab displays a table of factor score coefficients. These show you how the factors are calculated. Minitab calculates factor scores by multiplying factor score coefficients and your data after they have been centered by subtracting means.
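The calculation described above can be sketched with NumPy as follows. The coefficients are the two-factor values from the output; the data rows are random placeholders, not the census tract data, and the function name `factor_scores` is hypothetical. The optional `scale` flag, which also divides by the standard deviation (as one might do when a correlation matrix was factored), is an assumption of this sketch and is not stated in the text.

```python
import numpy as np

# Factor score coefficients from the two-factor output
# (rows: Pop, School, Employ, Health, Home).
coefficients = np.array([
    [-0.165, 0.246],
    [-0.528, 0.789],
    [1.150, 0.080],
    [0.116, -0.173],
    [-0.018, 0.027],
])

def factor_scores(data, coefficients, scale=False):
    """Multiply centered (optionally standardized) data by the coefficients."""
    centered = data - data.mean(axis=0)
    if scale:
        # Assumption of this sketch: standardize as well as center.
        centered = centered / data.std(axis=0, ddof=1)
    return centered @ coefficients

# Placeholder data: 6 rows, 5 columns in the same order as the coefficients.
rng = np.random.default_rng(2)
placeholder_data = rng.normal(size=(6, 5))

scores = factor_scores(placeholder_data, coefficients)
print(np.round(scores, 3))
```

Each row of the result holds one observation's scores on factor 1 and factor 2; because the data are centered first, each column of scores averages to zero.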
You might repeat this factor analysis with three factors to see if it makes more sense for your data.
You can have three types of input data:
- Columns of raw data
- A matrix of correlations or covariances
- Columns containing factor loadings
The typical case is to use raw data. Set up your worksheet so that a row contains measurements on a single item or subject. You must have two or more numeric columns, with each column representing a different measurement (response). Minitab automatically omits rows with missing data from the analysis.
Usually the factor analysis procedure calculates the correlation or covariance matrix from which the loadings are calculated. However, you can enter a matrix as input data. You can also enter both raw data and a matrix of correlations or covariances. If you do, Minitab uses the matrix to calculate the loadings. Minitab then uses these loadings and the raw data to calculate storage values and generate graphs. See To perform factor analysis with a correlation or covariance matrix.
If you store initial factor loadings, you can later input these initial loadings to examine the effect of different rotations. You can also use stored loadings to predict factor scores of new data. See To perform factor analysis with stored loadings.
