Correspondence Analysis (CA)

Inputs: qualitative data (unlimited number), or a specific CA matrix
Output: qualitative data

Display

Description

The aim of Correspondence Analysis is the same as that of the PCA (Principal Component Analysis) module: it looks for a hierarchy in the data contained in a matrix made of n rows (n being the number of entities in the map) and p columns (p being the number of input data).
CA has the advantage of handling qualitative variables and of revealing structures that are not necessarily linear.
Caution: the categories are not considered in any particular order, so you should not use ordered categories such as "Heavy", "Medium", "Light". The module ignores this order in the computation, and the result may be wrong.

Calculation

  • standardisation of the data by complete disjunctive coding,
  • construction of the frequency table,
  • construction of the inertia matrix according to the chi² metric,
  • calculation of the coordinates (values of the projections onto the factorial axes), of the contributions (share of a row point i or a column point j in the inertia explained by the factor) and of the qualities of representation,
  • calculation of the other settings (see below).
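
The first three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the module's actual code: the function names (disjunctive_coding, frequency_table, total_inertia) and the sample data are hypothetical, and the total inertia is taken here as the usual chi² distance between the observed frequencies and the product of the row and column weights.

```python
def disjunctive_coding(rows):
    """Complete disjunctive coding: one 0/1 indicator column per (variable, category)."""
    n_vars = len(rows[0])
    # Collect the categories of each variable, sorted for reproducibility.
    categories = [sorted({row[j] for row in rows}) for j in range(n_vars)]
    labels = [(j, c) for j in range(n_vars) for c in categories[j]]
    table = [[1 if row[j] == c else 0 for (j, c) in labels] for row in rows]
    return labels, table


def frequency_table(table):
    """Divide every cell by the grand total so that the cells sum to 1."""
    total = sum(sum(row) for row in table)
    return [[cell / total for cell in row] for row in table]


def total_inertia(freq):
    """Total inertia under the chi2 metric: sum of (f_ij - r_i c_j)^2 / (r_i c_j)."""
    row_mass = [sum(row) for row in freq]
    col_mass = [sum(col) for col in zip(*freq)]
    inertia = 0.0
    for i, row in enumerate(freq):
        for j, f in enumerate(row):
            expected = row_mass[i] * col_mass[j]
            if expected > 0:
                inertia += (f - expected) ** 2 / expected
    return inertia


# Hypothetical data: 4 entities described by 2 qualitative variables.
data = [("forest", "wet"), ("forest", "dry"), ("field", "dry"), ("urban", "dry")]
labels, indicator = disjunctive_coding(data)
freq = frequency_table(indicator)
print(labels)
print(indicator)
print(total_inertia(freq))
```

Each row of the indicator table contains exactly one 1 per variable, which is why the categories carry no order: "Heavy", "Medium" and "Light" would simply become three unrelated columns.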

Interpretation

You first have to determine the number of axes to work with, using the latent roots, and then determine the meaning of each axis using the coordinates, contributions and qualities.
In a CA, the rows and columns play the same role, so only one graph is drawn. We can therefore graphically interpret the proximity between two row-individuals or between two column-individuals. Two close row-individuals reveal a similar behaviour (provided their qualities of representation are sufficient). You cannot interpret the proximity between a row-individual and a column-individual. However, you can explain the position of a row-individual in comparison with all the column-individuals, or the position of a column-individual in comparison with all the row-individuals.

Graphs

Under each graph, a button lets you create a new graphic window from the current drawing. This window lets you compare several graphs, export them or print them.


Results

The table at the bottom left of the settings window lists the latent roots and the percentage of information accounted for by the corresponding axis. The graph at the bottom right is the histogram of the latent roots. If you click on a red rectangle (representing a latent root), the number of the selected latent root is displayed above the graph.
Individuals and variables are represented in the graph in the upper right corner (as red and blue squares respectively). They are displayed in the chosen factorial plane: the horizontal axis corresponds to the axis you selected in the "axis #1" field, and the vertical axis to the one selected in the "axis #2" field. The axis numbers must lie between 1 and the number of input data.
When you click on a red or blue square, the name of the individual or variable is displayed under the graph. If you want to know the exact coordinates of a variable or an individual in the factorial plane, click on the "save the results" button and read the file that is created.

If you change the size of the window, every drawing will automatically adapt to this new size.

If the option all the axes is checked, the calculations are made for all the axes; otherwise they are only made for the two chosen axes, which reduces the processing time (especially when the module has many inputs) and the size of the result file.

If you click on the Save in a file button, you create a text file containing all the results of the CA.

In this file, you will find:

  • The latent roots and latent vectors of the correlation matrix
  • The total inertia of the point cloud
  • The hierarchy between the axes (the percentage of the information concentrated on each axis, given by its latent root)
  • The sum of the hierarchy between the axes
  • The information related to:
    • All the axes if you checked the option all the axes
    • The two chosen axes if you checked the option 2 axes
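
The hierarchy between the axes can be illustrated with a short Python sketch. It assumes that the percentage attached to an axis is its latent root divided by the sum of all the latent roots; the function name axis_percentages and the latent roots below are made up for the example and do not come from the module.

```python
def axis_percentages(latent_roots):
    """Return the share of the total inertia carried by each axis, in percent."""
    total = sum(latent_roots)
    return [100.0 * root / total for root in latent_roots]


# Hypothetical latent roots, sorted in decreasing order.
roots = [0.5, 0.3, 0.2]
percentages = axis_percentages(roots)

# Cumulative percentages help to decide how many axes to keep.
cumulative = []
running = 0.0
for p in percentages:
    running += p
    cumulative.append(running)

print(percentages)
print(cumulative)
```

Under this assumption, the sum of the hierarchy reaches 100 when all the axes are taken into account, and is lower when only two axes are kept.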

The information related to the axes is the following:

For each individual and each variable, the following types of results are given:

  • Their coordinates on the chosen axes, which locate them in the axis system.
  • Their relative weight, indicating how much each of them counts in the analysis.
  • Their contribution to the chosen axes, measuring the role played by each of them in the formation of the axis.
  • Their representation qualities on the chosen axes, measuring their proximity to the axes.
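
As an illustration of these quantities, the sketch below computes contributions and representation qualities from coordinates and weights with the usual correspondence analysis formulas. The contribution of a point to an axis is its weight times its squared coordinate, divided by the inertia of that axis. Its quality of representation is its squared coordinate divided by its squared distance to the origin. The function names and numbers are hypothetical, and the module may present these values differently (for example as percentages).

```python
def axis_inertia(coords, weights, axis):
    """Inertia carried by one axis: weighted sum of the squared coordinates."""
    return sum(w * point[axis] ** 2 for point, w in zip(coords, weights))


def contributions(coords, weights, axis):
    """Share of the axis inertia due to each point (the shares sum to 1)."""
    inertia = axis_inertia(coords, weights, axis)
    return [w * point[axis] ** 2 / inertia for point, w in zip(coords, weights)]


def representation_quality(point, axis):
    """Squared cosine between the point and the axis (1 means the point lies on the axis).

    Assumes `point` holds the coordinates on all the axes; with fewer axes,
    the value is the quality within the subspace that was kept.
    """
    squared_distance = sum(c ** 2 for c in point)
    return point[axis] ** 2 / squared_distance


# Hypothetical coordinates of 3 points on 2 axes, with their relative weights.
coords = [(0.8, 0.1), (-0.4, 0.3), (-0.4, -0.4)]
weights = [0.2, 0.4, 0.4]

print(contributions(coords, weights, 0))
print([representation_quality(p, 0) for p in coords])
```

A point with a high contribution helps to build the axis, while a point with a high representation quality is well described by it; the two can differ, since a light point can be well represented without contributing much.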

Output

The table that contains the individuals' coordinates on all the axes is provided as output, so that an Ascending Hierarchical Classification can be run after this module. See the AHC module for further details about this classification.

Notions of statistics

The detailed results presented above must be interpreted carefully: they are reliable only for the highest (and lowest) values of the coordinates of the variables and individuals.
The closer a coordinate gets to 0, the less significant the corresponding axis is for that point (the variable or individual takes less and less part in the structure revealed by the axis). To interpret the coordinates, focus on the extreme values.
Moreover, a single axis (even the first one) cannot give a clear idea of what characterises an individual. Examining several axes, however, makes it possible to reconstruct accurately the specific characteristics of each individual with respect to the set of variables.

Should you need further information, you can refer to the principle and calculation details used by the module.

Source: lecture on Quantitative Methods by François Bavaud
University of Lausanne
Switzerland

Script :

2      module untyped_list ""
3        mod_type integer "103"
3        mod_subtype integer "518"
3        mod_name string "AFC"
3        mod_dads integer_list ""
4          ? integer "4"
4          ? integer "5"
4          ? integer "9"
4          ? integer "6"
4          ? integer "7"
4          ? integer "8"
4          ? integer "14"
3        all_axis boolean "F"
3        axis1 integer "1"
3        axis2 integer "2"
3        ind_nb integer "20"
3        var_nb integer "21"
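
For readers who want to inspect such a script programmatically, here is a minimal Python sketch. It assumes, from the listing above only, that each line holds a nesting depth, a field name, a type and a quoted value; the function name parse_script is hypothetical and this is not an official reader for the format.

```python
import re

# One script line: nesting depth, field name, type, quoted value (assumed layout).
LINE = re.compile(r'^\s*(\d+)\s+(\S+)\s+(\S+)\s+"([^"]*)"\s*$')


def parse_script(text):
    """Return a list of (depth, name, type, value) tuples, skipping blank lines."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognised script line: {line!r}")
        depth, name, kind, value = match.groups()
        entries.append((int(depth), name, kind, value))
    return entries


sample = '''
3        all_axis boolean "F"
3        axis1 integer "1"
3        axis2 integer "2"
'''
for entry in parse_script(sample):
    print(entry)
```

Values are kept as strings on purpose, since the meaning of each type is not documented here.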


Samples