Ascending Hierarchical Classification
| Inputs |
Continuous data (unlimited number)
or
1 Specific P.C.A. Matrix |
| Output |
Qualitative data |
Display :

Description :
This module enables an automatic classification of geographical entities to be obtained as a function of a number of
statistical data items.
The entities on the working map are called the individual values, and the input data are called the variables.
The Ascending Hierarchical Classification consists of carrying out progressive grouping of individual values in
accordance with their degrees of similarity to obtain a single class that groups them all.
Once this calculation has been made, the individual values are divided up into various classes; the class number
is fixed by the user.
This module can use a Principal Component Analysis module as an input: the data will then be
the coordinates of the individual values on the m first axes (m being the number of axes, varying from 1 to the
number of inputs from the P.C.A. module).
Calculations:
- Stage 1:
We consider the n individual values characterized by their values as compared with the p variables,
and we determine the 2 individual values that have the highest degree of similarity as compared with this set of
variables. These 2 individual values are grouped to form a class.
- Stage 2:
On this level, we have (n-1) classes, one made up of the 2 individual values grouped above, and the others
containing only a single individual value. We can then determine the 2 classes that have the highest degree of
similarity, and group them.
- Following stages:
We repeat the same operation until we obtain a single class grouping all the individual values.
This procedure entails 2 choices:
- Determining a criterion of similarity between the individual values. The euclidian distance is used.
- Determining a distance between classes. This process is called an aggregation criterion; we can propose
several criteria.
Aggregation criteria:
- nearest neighbour
The distance between 2 classes c1 and c2 is defined as the smallest distance separating an individual value in c1
from an individual value in c2:

- maximum diameter
The distance between 2 classes c1 et c2 is defined as the greatest distance separating an individual value in c1
from an individual value in c2:

- average distance
This criterion consists of calculating the average distance between all the elements of c1 and all the elements of
c2:
where ni = number of elements in class ci
- centers of gravity
The distance between 2 classes c1 and c2 is defined as the distance between their centers of gravity:
where Gi = center of gravity of class ci
- centered moment of order 2
This criterion consists of looking for a partition giving minimum internal variance within the classes and maximum
variance between the classes:
where refers to the inertia of
the partition under which c1 and c2 are separated
refers to the inertia of the partition under which c1 and c2 are together
Gi = center of gravity of class ci
i.e.:
where Gi = center of gravity of the class ci
G = center of gravity of the cloud
mi = relative mass of the class ci = 
The difficulty in the choice of the aggregation criterion is that these criteria can lead to differing results.
The most widely used criterion is that of the centered moment of order 2, applied to a centered, reduced matrix.
Numerical example:
Initial matrix:
| |
v1
|
v2
|
v3
|
|
A
|
4
|
2
|
4
|
|
B
|
2
|
1
|
6
|
|
C
|
3
|
4
|
4
|
|
D
|
4
|
5
|
2
|
|
E
|
2
|
4
|
5
|
Table of distances between individual values:
|
d2
|
A
|
B
|
C
|
D
|
E
|
|
A
|
0
|
|
|
|
|
|
B
|
9
|
0
|
|
|
|
|
C
|
5
|
14
|
0
|
|
|
|
D
|
13
|
36
|
6
|
0
|
|
|
E
|
9
|
10
|
2
|
14
|
0
|
The individual values C and E can clearly be seen to have the highest level of similarity with a squared distance of
2 (the lowest value in the table).
They are thus grouped together in a class, and the first phase of classification, independent of the pooling criterion,
has been finished.
Example of A.H.C. using the criterion of maximum diameter:
Table of distances between classes:
|
d2
|
A
|
B
|
{C,E}
|
D
|
|
A
|
0
|
|
|
|
|
B
|
9
|
0
|
|
|
|
{C,E}
|
9
|
14
|
0
|
|
|
D
|
13
|
36
|
14
|
0
|
The problem here is finding out which of the classes are closest: {A} and {B}, or {A} and {C,E}. No criterion other
than arbitrary choice enables us to find the best aggregation. We choose the distance that was calculated first:
grouping {A} and {B}.
Table of distances between classes:
|
d2
|
{A,B}
|
{C,E}
|
D
|
|
{A,B}
|
0
|
|
|
|
{C,E}
|
14
|
0
|
|
|
D
|
36
|
14
|
0
|
There is a problem here, too, to find the closest classes: {C,E}and {D}, or {A,B}and {C,E}. We choose, for example,
to group {C,E}and {D}.
There are thus 2 classes left:
{A,B} and {C,D,E}
The following hierarchical tree is obtained:

If we carry out a division in 2 classes, we obtain: {A,B} and {C,D,E}.
If we carry out a division in 3 classes, we obtain: {A,B}, {C,E} and {D}.
With the criterion of the nearest neighbour, we obtain the following tree:

If we carry out a division in 2 classes, we obtain: {A,C,D,E} and {B}.
If we carry out a division in 3 classes, we obtain: {A,C,E}, {B} and {D}.
The results are thus different depending on the pooling criterion chosen.
Elements of vocabulary:
A rather precise set of vocabulary is used within the framework of ascending hierarchical classification. We present it
here using the numerical example above.
The individual values A, B, C, D and E form classes with a single element, called terminal classes.
A group of 2 classes is called a node.
The first of these nodes carries the number of the total number of terminal classes +1 ( = n+1). Here,
the number is 6.
Each group gives rise to a new node whose number follows that of the preceding group.
The last node will thus have the number: 2 x n -1. Here, the number is thus 2 x 5 - 1 = 9.
Each node is thus made up of 2 classes that we call its successors; one is called the elder and the other
the junior (chosen arbitrarily).
By extension, we also designate the individual values using a node number between 1 and n. If you need to
know the exact node number corresponding to a particular individual value, you must click on the Save in a file
button and read, in the text file created, the paragraph entitled "The classes of the hierarchy".
With the criterion of the nearest neighbour, we thus obtain the following tree:
which is the
equivalent of 
Node 8 has thus as its successors classes {D} and 7. Arbitrarily, {D} is designated as the elder and 7 as the
junior.
The level indices enable us to measure and represent the importance of the difference between 2 classes that
we are grouping. They are defined by the distance between the 2 classes forming the node.
In this way, node 6 has the square root of 2 level index (distance between the elements C and E, see above).
These indices are very useful in studying the hierarchical tree.
Interpretation:
The graph shows the hierarchical tree obtained after classification, together with the histogram of the level indices.
It depends on the number of nodes drawn, which varies from 3 to 21. If you modify the number, the graph changes
as a result.
If you modify the window size, the drawing adapts itself automatically to the new
size.
Choice of class number:
The user must choose the desired class number. The number varies between 1 and the number of entities in the
working map.
A choice based on the level index is a judicious one. It is determined by reading off the histogram of level indices:
the user must notice large leaps in the values of the level index. If these leaps concern the k last nodes of the tree,
a division in (k+1) classes is appropriate.
For example, on the display copy at the top of the present document, we can see that the level index shows large leaps
for the last five nodes in the tree. The histogram is then more regular. A division in 6 classes would thus seem
appropriate.
To make it easier to read off the histogram and find large leaps in the level index, we advise you to increase to
the maximum the number of nodes to be drawn.
If you need to know the exact value of a level index, move the mouse pointer onto the corresponding histogram; the
value of the index is displayed in the text area below the graph.
Building up the classes:
To build up a division into 4 classes, we divide the hierarchy starting from the final node. The successors of this
node are the classes i and j. At this level, we thus have a division in 2 classes. We then subdivide the highest node
in the hierarchy (i for example). This node is made up of the 2 classes k and l, and at this level we have 3 classes:
j, k and l. The highest node is again subdivided into 2 classes to obtain the desired division into 4 classes.
Results:
By clicking on the Save in a file button, you create a tabulated text file in which the A.H.C. results are
inscribed.
In this file, you will find:
- Your choices concerning the parameters of the module that have an effect on the classification:
- The aggregation criterion selected.
- If the Center and reduce the matrix box has been checked, this choice will also be inscribed.
- The class number chosen.
- The list of the non-valid individual values, if any.
An individual value is considered to be non-valid if at least one of the data items of the input does not have a
value associated with the individual value.
- The overall inertia of the cloud.
- The classes of the hierarchy:
This table enables us to analyse in detail the resemblances between individual values. It also enables the content
of the classes corresponding to the division chosen to be established.
This is a table in which each line corresponds to a node in the hierarchy. For each line, we have the following
columns:
- Number of the class (if the node corresponds to a class)
- Number of the node
- Elder of the node
- Junior of the node
- Number of elements contained in the node
- Identifying elements of the individual values making up the node
- The parameters of the hierarchy:
This is a table in which each line corresponds to a node in the hierarchy. For each line, we have the following
columns:
- Number of the node
- Level index of the node
- Inertia rate of the node (e.g.: if it is shown as 230, this means that the node represents 23 % of the overall
inertia)
- Sum total of the inertia rates (e.g.: if it is shown as 548 for the kth last node in the tree, this
means that 54.8 % of the differences between individual values are expressed through this division into (k+1)
classes)
- Interpreting aids:
This table shows some general parameters on each of the nodes. It enables us to set out directly the characteristics
of each class of the division.
This is a table in which each line corresponds to a node in the hierarchy. For each line, we have the following
columns:
- Number of the class (if the node corresponds to a class)
- Number of the node
- Relative mass of the node
- Contribution of the node to the dispersion of the cloud
- Deviation separating the node from the general average
- In the following columns: deviation of the node as compared with the average of each variable. The highest values,
in absolute terms, of these columns indicate which variables best characterize the node (e.g.: for a given class, if
the variable V1 has a value of 1500, V2 2700, V3 100, V4 200, V5 -1200, then the characters contributing the most to
differentiate this class from the group of the other individual values are, in this order: the importance of the
characters V2 and then V1, and the relative weakness of the character V5; the other variables make only an
insignificant contribution to the specificity of this class).
Formulas used:
Total inertia of the cloud:
where G = center of gravity of the cloud
mi = relative mass of the individual value i
=
Rate of inertia of the node J:
where I(J) = level index of the node J
Relative mass of the class c:

Contribution of the class c to the dispersion of the cloud:
where mc = relative mass of the class c = 
Deviation separating the class c from the overall average:

Remark:
All the values given in the paragraphs "The parameters of the hierarchy" and "Interpreting aids" are multiplied by
1000, except RHO2 which is multiplied by 100.
This method has been adopted by most data analysis programs; its purpose is to simplify the presentation of the
results.
Graphics :
Under the graphic, you will find a button : that creates a "graphical window" from
current drawing. This window let you compare graphics, export them, or print them.
For further details, you can consult the following book:
L'analyse statistique des données en géographie. (Statistical data analysis in geography). Lena
Sanders. Collection Alidade. Groupement d'Intérêt Public Reclus. Montpellier. 1989.
Script:
2 module untyped_list ""
3 mod_type integer "103"
3 mod_subtype integer "527"
3 mod_name string "CAH"
3 mod_dads integer_list ""
4 ? integer "5"
4 ? integer "8"
4 ? integer "9"
4 ? integer "4"
4 ? integer "6"
4 ? integer "7"
3 class_nb integer "6"
3 cah_type integer "705"
3 center_matrix boolean "T"
3 draw_nb integer "17"
3 ind_nb integer "20"
3 var_nb integer "6"
3 acp_dad boolean "F"
3 axis_nb integer "1"
Values for typical hac:
nearest neighbour 701
maximum diameter 702
average distance 703
centers of gravity 704
centered moment of order 2 705
Samples
|