Correspondence Analysis (CA)
| Inputs | Qualitative data (unlimited number) or specific CA matrix |
| Output | Qualitative data |
Display

Description
The aim of Correspondence Analysis is the same as that of the PCA (Principal Component Analysis) module: it looks for
a hierarchy in the data contained in a matrix of n rows (n being the number of entities in the map) and p columns
(p being the number of input data).
CA has the advantage of handling qualitative variables and of revealing structures that are not necessarily linear.
Warning: the different qualities are not considered in any particular order, so you should not use ordered qualities
such as "Heavy", "Medium", "Light". The module does not take this order into account in the computation, and the
result may be wrong.
Calculation
- standardisation of the data by a complete disjunctive coding,
- constitution of the frequency table,
- constitution of the inertia matrix according to the chi² metric,
- calculation of the coordinates (values of the projections onto the factorial axes), of the contributions (share of a
row point i or column point j in the inertia explained by the factor) and of the qualities of representation,
- calculation of the other parameters (see below).
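The steps listed above can be sketched as follows. This is a minimal illustration that assumes NumPy and the standard textbook formulation of CA through a singular value decomposition; it is not necessarily how the module itself is implemented. The function names `disjunctive_coding` and `correspondence_analysis` and the sample data are hypothetical.

```python
import numpy as np

def disjunctive_coding(columns):
    """Complete disjunctive coding: one 0/1 indicator column per category.

    `columns` is a list of qualitative variables, each given as a list
    of labels with one label per entity.
    """
    labels, blocks = [], []
    for j, values in enumerate(columns):
        categories = sorted(set(values))  # arbitrary order: CA ignores it
        labels += [f"var{j + 1}={cat}" for cat in categories]
        blocks.append([[1.0 if v == cat else 0.0 for cat in categories]
                       for v in values])
    return labels, np.hstack(blocks)

def correspondence_analysis(table, tol=1e-12):
    """CA of a non-negative table whose row and column sums are all > 0."""
    counts = np.asarray(table, dtype=float)
    P = counts / counts.sum()             # frequency table
    r = P.sum(axis=1)                     # relative weights of the rows
    c = P.sum(axis=0)                     # relative weights of the columns
    expected = np.outer(r, c)             # frequencies under independence
    # Standardised residuals: deviation from independence, chi-square metric
    S = (P - expected) / np.sqrt(expected)
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sigma > tol                    # drop the trivial (null) axes
    U, sigma, Vt = U[:, keep], sigma[keep], Vt[keep, :]
    roots = sigma ** 2                    # latent roots, largest first
    F = (U * sigma) / np.sqrt(r)[:, None]     # row coordinates
    G = (Vt.T * sigma) / np.sqrt(c)[:, None]  # column coordinates
    return {
        "latent_roots": roots,
        "total_inertia": roots.sum(),
        "row_weights": r,
        "col_weights": c,
        "row_coords": F,
        "col_coords": G,
        # Contribution: share of each point in the inertia of each axis
        "row_contrib": r[:, None] * F ** 2 / roots,
        "col_contrib": c[:, None] * G ** 2 / roots,
        # Quality of representation: squared cosine with each axis
        "row_cos2": F ** 2 / (F ** 2).sum(axis=1, keepdims=True),
        "col_cos2": G ** 2 / (G ** 2).sum(axis=1, keepdims=True),
    }

# Hypothetical data: six entities described by two qualitative variables
columns = [
    ["urban", "urban", "rural", "rural", "urban", "rural"],
    ["north", "south", "north", "south", "south", "north"],
]
labels, indicator = disjunctive_coding(columns)
result = correspondence_analysis(indicator)
print(labels)
print(result["latent_roots"])
```

Because the categories are only turned into indicator columns, any natural order among them is lost, which is why ordered qualities are not handled as such. A point whose profile equals the average profile has all coordinates at zero, so its quality of representation is undefined in this sketch.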
Interpretation
First determine the number of axes to work on using the latent roots (eigenvalues), then determine the meaning of
each axis using the coordinates, contributions and qualities of representation.
In a CA, the rows and columns play the same role, so only one graph is drawn. You can therefore graphically
interpret the proximity between two row-individuals or between two column-individuals. Two close row-individuals
reveal a similar behaviour (provided their qualities of representation are adequate). You cannot interpret the
proximity between a row-individual and a column-individual. However, you can explain the position of a row-individual
relative to all the column-individuals, or the position of a column-individual relative to all the row-individuals.
Graphs
Under each graph, there is a button that creates a new graphic window from the current drawing. This window lets you
compare several graphs, export them or print them.
Results
The table at the bottom left of the settings window lists the latent roots and the percentage of information taken
into account by the corresponding axis. The graph at the bottom right is the histogram of the latent roots. If you
click on a red rectangle (representing a latent root), the number of the selected latent root is displayed above the
graph.
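As an illustration of how the percentages in this table relate to the latent roots, here is a small sketch. The helper names and the 80 % threshold are assumptions made for the example, not settings of the module; the choice of how many axes to keep remains a matter of judgment.

```python
def inertia_percentages(latent_roots):
    """Return (percentage per axis, cumulative percentage per axis)."""
    total = sum(latent_roots)
    percentages = [100.0 * root / total for root in latent_roots]
    cumulative, running = [], 0.0
    for value in percentages:
        running += value
        cumulative.append(running)
    return percentages, cumulative

def axes_needed(latent_roots, threshold=80.0):
    """Smallest number of axes whose cumulative percentage reaches the threshold."""
    _, cumulative = inertia_percentages(latent_roots)
    for k, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return k
    return len(latent_roots)

# Hypothetical latent roots, sorted from largest to smallest
roots = [0.5, 0.25, 0.125, 0.125]
percentages, cumulative = inertia_percentages(roots)
print(percentages, cumulative, axes_needed(roots))
```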
Individuals and variables are represented in the graph in the upper right corner (as red and blue squares
respectively). They are displayed in the chosen factorial plane: the horizontal axis corresponds to the axis selected
in the "axis #1" field, and the vertical axis to the one selected in the "axis #2" field. The axis numbers range from
1 to the number of input data.
When you click on a red or blue square, the name of the individual or variable is displayed under the graph. To know
the precise coordinates of a variable or an individual in the factorial plane, click on the "save the results" button
and read the file it creates.
If you change the size of the window, every drawing automatically adapts to the new size.
If the option all the axes is checked, the calculations are made for all the axes; otherwise they are made only for
the two chosen axes, which reduces the processing time (especially when the module has many inputs) and the size of
the result file.
If you click on the Save in a file button, a text file is created containing all the results of the CA.
This file contains:
- The latent roots and latent vectors of the correlation matrix
- The total inertia of the point cloud
- The hierarchy between the axes (the percentage of information concentrated on each axis, given by its latent root)
- The sum of the hierarchy between the axes
- The information related to:
  - All the axes if you checked the option all the axes
  - The two chosen axes if you checked the option 2 axes
The information related to the axes is the following.
For each individual and each variable, these types of results are given:
- Their coordinates on the chosen axes, which locate them in the axis system.
- Their relative weight, indicating the importance of each of them in the analysis.
- Their contributions to the chosen axes, measuring the part played by each of them in forming the axis.
- Their qualities of representation on the chosen axes, measuring their proximity to the axes.
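For reference, the last two quantities are usually defined as below in standard CA. This is a sketch that assumes the module follows the usual definitions; the symbols are introduced here and are not taken from the module: $w_i$ is the relative weight of point $i$, $f_{ik}$ its coordinate on axis $k$, and $\lambda_k$ the latent root of axis $k$.

```latex
\[
\mathrm{CTR}_k(i) = \frac{w_i \, f_{ik}^{2}}{\lambda_k},
\qquad
\mathrm{QLT}_k(i) = \frac{f_{ik}^{2}}{\sum_{k'} f_{ik'}^{2}},
\qquad
\sum_i \mathrm{CTR}_k(i) = 1 .
\]
```

With these definitions, the contributions of all the individuals (or of all the variables) to a given axis sum to 1, and the qualities of representation of one point summed over all the axes also equal 1.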
Output
The table containing the individuals' coordinates on all the axes is provided as output, so that an Ascending
Hierarchical Classification can be run after this module. See the AHC module for further details about this
classification.
Notions of statistics
The detailed results presented above must be interpreted carefully: they are reliable only for the highest (and
lowest) values of the coordinates of the variables and individuals.
The closer a coordinate is to 0, the less significant the corresponding axis is for that point (the variable or the
individual takes less and less part in the structure revealed by the axis). To interpret the coordinates, focus on the
extreme values.
Moreover, a single axis (even the first one) does not give a clear idea of what characterises an individual.
Examining several axes, however, makes it possible to reconstruct accurately the specific characteristics of each
individual with respect to the set of variables.
For further information, refer to the principles and calculation details used by the module.
Source: Lecture on Quantitative Methods by François Bavaud, University of Lausanne, Switzerland
Script :
2 module untyped_list ""
3 mod_type integer "103"
3 mod_subtype integer "518"
3 mod_name string "AFC"
3 mod_dads integer_list ""
4 ? integer "4"
4 ? integer "5"
4 ? integer "9"
4 ? integer "6"
4 ? integer "7"
4 ? integer "8"
4 ? integer "14"
3 all_axis boolean "F"
3 axis1 integer "1"
3 axis2 integer "2"
3 ind_nb integer "20"
3 var_nb integer "21"