Reduction of search space for the mean partition problem

JYRKO CORREA-MORRIS

Mathemtics and Natural Sciences Department

Miami Dade College, Padron Campus

627 SW 27 Ave, Miami, Florida

USA

Abstract: The contributions of this paper are threefold. First, it conducts a formal comparison of the primary ap-

proaches to consensus clustering, using the concepts of agreement and consent. Secondly, it presents theoretical

evidence justifying the preference for mean-based methods, which rely on consent, over other agreement-based

procedural methods like the q-rule, which are now mostly used as quality evaluators in practice. Thirdly, the

paper computes the exact reduction achieved by criteria available in existing literature to assess the quality of

mean-based consensus solutions and reduce the search space’s size. Finally, it compiles the regions where con-

sensus functions associated with well-known dissimilarity measures, such as the Mirkin metric and Variation of

Information, accumulate their consensus solutions.

Key-Words: Lattice of partitions, Consensus, Mean partition, Quota Rules, Cluster ensemble, Search Space

Reduction

Received: May 25, 2023. Revised: August 26, 2023. Accepted: September 28, 2023. Published: October 20, 2023.

1 Introduction

Clustering problems are widespread and have led to

the development of numerous algorithms; however,

the lack of reliable performance information has cre-

ated “The User’s Dilemma,” [1], making algorithm

selection uncertain.

To address this, two approaches have emerged.

First, formal theories aim to establish generic con-

cepts and rules to classify and understand cluster-

ing algorithms, [2], [3], [4], [5]. Second, more ad-

vanced algorithms combine outcomes from multiple

traditional methods to produce integrated solutions,

known as “consensus methods,” [6], [7], [8], [9]. This

paper primarily focuses on consensus methods, par-

ticularly those based on the concept of a mean parti-

tion.

1.1 Consensus methods

Consensus algorithms take a set of partitions

p1,p2, . . . , pm, usually called an ensemble or a

profile, generated by standard clustering algorithms

and merge them into a holistic solution called the

“consensus partition.” These input partitions rep-

resent partial solutions, akin to expert judgments

from different perspectives. The consensus par-

tition aims to condense these partial perspectives

into a comprehensive representation. There are

several interesting axiomatic studies, many of them

published in journal of economics or social sci-

ences, [10], [11], [12], [13], [14], which address

partitions as equivalence relations and define the

consensus mechanisms as a function that receives an

m-tuple of equivalence relations (an ordered ensem-

ble) and returns the equivalence relation resulting of

taking the meet (see Section 2) of some subset of the

inputs determined by a fixed subset of the indexes

(e.g., the meet of the first, third, and fifth partitions

in the tuple), [11]. Similar mechanisms are obtained

if instead of the meet, the join operator is used. In

the first case, the consensus mechanism is called a

meet aggregator, while in the second case it is called

join aggregator. What varies from one aggregator

to another of the same kind is the set of indexes

to be considered. Apart from the axiomatic ap-

proaches, there exist two closely related fundamental

mechanisms of consensus in the ambit of partitions,

which have been often applied in pattern recognition

and machine learning tasks: quota rules [15] and

mean-based consensus, [16], [6], [7], [8].

Quota rules constitute a slightly more flexible join

aggregator. Given 0≤q≤m(mdenotes the total

number of partitions in the ensemble), the q-quota rule

is the finest partition that places in the same cluster

all those pairs of objects that were placed in the same

cluster of more than qpartitions of the ensemble, [15],

[17]. Given a subset Aof X, we said that a partition

of Xrefines Ais every cluster of this partition either

lies entirely within Aor entirely outside A. The dual

q-quota rule, 0≤q≤m, is the coarsest partition that

refines all subsets Aof Xthat are refined by more

than qpartitions of the ensemble.

In turn, the consensus solutions in the mean-based

approach are defined as the minimizers of the con-

sensus function φ(p) =



i=1

D(p,pi), where Dis

a dissimilarity measure for quantifying the resem-

blance between two partitions and p1,p2, . . . , pmare

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

736

Volume 22, 2023

the members of the ensemble.

When comparing different methodologies, quota

rules align nicely with the idea of agreement, offer-

ing straightforwardly interpretable results. However,

they are highly sensitive to factions within the en-

semble, which is common in practice. This sensitiv-

ity leads to fragmented partitions and limited insights

into data interrelations. Meet aggregators share sim-

ilar sensitivity issues. In contrast, dual quota rules

and join aggregators tend to produce compact parti-

tions dominated by a few large clusters, known as the

“chaining effect.”

Mean-based methods, favored for their robustness

in practical applications, handle diverse ensembles

and provide more informative outcomes compared to

quota rules, especially regarding cluster number and

size. However, accurately computing mean-based so-

lutions can be computationally expensive due to the

large search space, unless additional information is

available for effective reduction. Additionally, the ac-

curacy of approximations by commonly used heuris-

tic methods for averaging remains uncertain.

From a foundational perspective, there are uncer-

tainties about how the consensus function φthat com-

putes the mean using a dissimilarity measure aligns

meet and join operations in the lattice of partitions

with basic arithmetic operations on real numbers.

These unique characteristics raise questions concern-

ing the solutions quality and regarding the solutions

location.

1.1.1 Quality of consensus solutions

As mentioned earlier, quota rules and dual quota rules

align well with the concept of agreement, making

their results easily interpretable and practical. While

quota rules offer limited but consistently reliable in-

formation, dual quota rules might provide some noise

by potentially including some irrelevant information.

Quota rules may not be highly informative as consen-

sus solutions, but they capture essential information.

In contrast, dual quota rules encompass all the infor-

mation (even if it includes noise) that a quality con-

sensus solution should contain.

Consequently, these rules, although not serving as

primary consensus solutions, serve as quality evalua-

tors. They assess the quality of mean-based consen-

sus solutions as follows: a consensus solution is con-

sidered acceptable if it contains at least the informa-

tion provided by a quota rule and no more than what a

dual quota rule offers. These rules play a crucial role

in evaluating the effectiveness of mean-based consen-

sus solutions.

This situation has been empirically addressed by

several researchers, [18], [19], [20].

1.1.2 Location of consensus solutions

To tackle computational challenges, various ap-

proaches have proposed reducing the search space,

each with its own method. While these reductions

may appear intuitive, they often lack substantial ev-

idence or support from general principles governing

consensus solution behavior.

The exponential growth of the Bell sequence (the

nth Bell number represents the number of partitions of

a set with nelements) underscores the impossibility

of evaluating each individual partition of a given fi-

nite dataset to find consensus solutions. Hence, prac-

tical procedures frequently employ unverified search

methods. Pruning the search space becomes neces-

sary to handle computational constraints. This in-

volves identifying a smaller region where consensus

solutions are likely to be found, thereby improving

computational efficiency and approximation quality.

However, certain questions arise that warrant con-

sideration: Is there a common region where con-

sensus solutions, regardless of the consensus func-

tion, should be sought? For many dissimilarity mea-

sures, the answer is yes. As mentioned earlier, quota

rules and dual quota rules serve as lower and upper

bounds of consensus solutions, significantly reducing

the search space.

Another important question is whether all consen-

sus functions accumulate their solutions in the same

region of the search space. The answer is no. For ex-

ample, consensus functions associated with the lattice

metric and the symmetric difference metric accumu-

late their solutions in opposing regions of the partition

space, [15].

1.2 Contributions

Motivated by these considerations, this paper delves

into the distinction between “agreement” and “con-

sent” within the context of clustering consensus meth-

ods, focusing on significant subtleties and nuances,

and formalizes these concepts. Subsequently, the arti-

cle translates the primary agreement-based consensus

criteria into rules for evaluating the quality of mean-

based consensus solutions.

In addition to these criteria, the paper incorporates

other criteria derived from axiomatic approaches,

though it is important to note that not all of them are

equally applicable or desirable. These are necessary

steps to facilitate access to the main contribution of

the paper: the exact computation of the pruning capa-

bility of these criteria.

For each criterion, the paper investigates its po-

tential as a means to prune the search space by pre-

cisely calculating the number of partitions it elimi-

nates. This aspect of the methods has not been ex-

plored to the best of my knowledge, making it a novel

and unexplored area of research.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

737

Volume 22, 2023

By addressing these aspects, the paper aims to

deepen our understanding of clustering consensus

methods and provide valuable insights into their per-

formance and effectiveness.

2 The size of the lattice of partitions

Let X={x1, x2, . . . , xn}be a finite set. A parti-

tion p of Xis a collection p ={C1, C2, . . . , Cs}of

subsets of Xsuch that Ci∩Cj=∅for i=jand

∪s

i=1Ci=X. The subsets C1, C2, . . . , Csare called

the clusters or blocks of p. Throughout this paper,

mXdenotes the partition all whose clusters are sin-

gletons —mX:= {{x}:x∈X}—, while gXde-

notes the partition whose only cluster is the entire set

X—gX:= {X}. There is a natural partial order

between partition: we say that the partition p refines

the partition p′, in notation notation p ⪯p′, if the

following condition holds: for every pair of objects

x, x′∈X, if xand x′are in the same cluster of p,

then they are also in the same cluster of p′. Naturally,

p≺p′means that p strictly refines p′. We say that

a partition p covers a partition p′if (a) p and p′are

different partitions; (b) p′refines p; and (c) if p′re-

fines p′′ which in turn refines p, then either p′′ =p′

or p′′ =p (i.e., there is no partition in between p′and

p). The notation p′⊏p means p covers p′. For any

two partitions p and p′there are partitions that refine

both. The coarsest partition that satisfies this property

is denoted by p ∧p′and is called the meet of p and p′.

Two elements x, y ∈Xare placed in the same cluster

of p ∧p′if and only if they are simultaneously placed

in the same cluster of p and in the same cluster of p′.

Therefore the clusters of p ∧p′are all the possible

non-empty intersections of one cluster of p with one

cluster of p′. For any two partitions p and p′there are

partitions that are refined by both. The finest partition

that satisfies this property is denoted by p ∨p′and is

called the join of p and p′. Two elements x, y ∈X

are placed in the same cluster of p ∨p′if and only if

there is a sequence x=xi1, xi2, . . . , xik=ysuch

that, for all j∈ {1,2, . . . , k −1},xijand xij+1 are

either placed in the same cluster of p or in the same

cluster of p′. Thus, the cluster of p ∨p′are the subsets

of Xthat can be expressed both as unions of clusters

of p and as unions of clusters of p′, and are minimal

with respect to this property. Henceforth, PXdenotes

the lattice of all partitions of X.

The partitions with exactly n−1clusters, i.e.,

partitions all whose clusters are singletons except for

one which includes exactly two elements, are called

the atoms or atomic partitions of PX. From now

on, pxx′denotes the atomic partition whose only non-

singleton cluster is {x, x′}. Moreover, AX, or simply

A, denotes the set of atoms of PXwhile A(p)denotes

the set of all the atoms of PXthat refine the given

partition p. Analogously, the partitions with exactly

2 clusters are called the co-atoms or co-atomic parti-

tions of PX. Henceforth, pMdenotes the co-atomic

partition whose only clusters are Mand X−M.

CAXdenotes the set of co-atoms of PXand CA(p)

denotes the set of co-atoms of PXthat are refined the

given partition p. It is well-known that every partition

p=mXcan be expressed as the join of atom while

every partition p =gXcan be represented as the meet

of co-atoms.

The number of partitions in PXis given by the nth

Bell number, Bn, where nis the number of elements

in X. Bell numbers can be computed by using any of

the following equivalent formulas:

Bn=1

∞



k=0

k!(Dobinski’s formula); (1)

Bn=

n−1



k=0 n−1

kBk, B0= 1; (2)

Bn=



k=1

S(n, k).(3)

Equation (3) defines the Bell numbers as the sum

of the Stirling numbers of the second kind, S(n, k),

where, for specific values of nand k,S(n, k)gives

the number of partitions of nelements into kclusters.

In turn, Equations (1) and (2) together show that the

difference between two consecutive Bell numbers is

an exponential, which implies that Bnincreases ex-

tremely quickly as nincreases. Table 1 shows the first

ten Bell numbers, which are additionally compared to

2n−1, as well as the difference between any two con-

secutive of them, in order to numerically illustrate the

claim above.

Table 1: The first ten Bell numbers

n7 8 9 10

Bn877 4,140 21,147 115,975

Bn−Bn−1674 3,273 17,007 94,828

2n−164 128 256 512

3 Consensus by agreement vs.

consensus by consent: Quota rules

and mean-based consensus

This section commences by exploring the distinction

between two closely related concepts: ”agreement”

and ”consent.” The primary objective of this discus-

sion is to offer formal evidence regarding the dissimi-

larity between the two main approaches in consensus

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

738

Volume 22, 2023

methods: quota rules and mean-based methods. Fur-

thermore, it seeks to clarify why the latter approach

has gained more popularity in applied fields such as

pattern recognition and machine learning.

In essence, agreement can be viewed as what is

desired, whereas consent represents what is accepted.

Consequently, in an ensemble, a member may choose

to consent to a proposal that is not their top choice,

aiming to achieve a solution that might not be optimal

but is satisfactory to all members. As a result, obtain-

ing consent appears to be easier than achieving agree-

ment. This flexibility in consent-based methods leads

to more informative solutions with a better balance

between the size and number of clusters. On the con-

trary, agreement-based methods tend to reveal only a

limited amount of the underlying data structure, pri-

marily due to either a high level of fragmentation or a

high level of concentration. Organizing the lattice of

partitions, we observe that agreement methods often

produce partitions that lie either in the lower or higher

levels of the partition space.

Despite the limited information provided by

agreement-based methods, it is undeniable and can

be valuable as a source of information when seeking

to prune the search space in mean-based methods. In

this context, it plays a crucial role, even though it may

not suffice as a standalone consensus solution.

To formalize these ideas, I shall introduce the fol-

lowing definition:

Definition 1 In the process of creating the consensus

solution p∗, the jth member of the ensemble, pj, is

said to agree that the pair of objects xand x′to be

placed in the same cluster of p∗if and only if xand x′

are in the same cluster of pj. Thus, the ensemble E=

{p1,p2, . . . pm}is said to contain a quota q∈[0,1]

of agreement with the fact that the pair of objects x

and x′in Xto be placed in the same cluster of p∗if

and only if more than mq members of the ensemble E

agree that xand x′to be placed in the same cluster of

p∗. In other words, if

|{pj∈ E :pxx′⪯pj}| > mq.

In the process of creating the consensus solution

p∗, the jth member of the ensemble, pj, is said to con-

sent, according to the dissimilarity D,that the pair of

objects xand x′in Xto be placed in the same cluster

of p∗if and only if the partition obtained from p∗by

merging those clusters containing xand x′, respec-

tively, either maintains or lowers the dissimilarity

with respect to pj. In other words, D(pj,p∨pxx′)≤

D(pj,p). The entire ensemble Eis said to consent,

according to D, that the pair of objects xand x′in

Xto be placed in the same cluster of p∗if and only if

φ(p∗∨pxx′)≤φ(p∗), which amounts to saying that



i=1

D(pi,p∗∨pxx′)≤



i=1

D(pi,p∗).

The dissimilarity D(pj,p∗)can be thought as the

degree of consent that the jth member of the ensem-

ble, pj, assigns to the proposal “the partition p∗is the

consensus solution”.

This definition establishes that a consensus

reached by agreement is the solution of certain q-

quota rule, while a consensus obtained by consent is

the solution of a certain mean-based method. Now, it

is worthy, for the sake of completeness, to introduce

a more formal definition of quota rules.

Quota rules:Given an ensemble E=

{p1,p2, . . . pm}of partial partitionings of the

data set X, and a real number q∈[0,1], the q-quota

rule, in notation pq, is the finer partition from among

those that are refined by every atom that, in turn,

refines more than mq members of the ensemble.

By definition, the finer partition refined by some

given partitions is their join. It can be therefore con-

cluded that:

pq={pxx′∈ A :|{pi∈ E :pxx′⪯pi}| > mq}.

In the case that the set of such atoms is empty, then

pq=mX.

Among the q-quota rules, unanimity rule and ma-

jority rule stand out by being used the most in practice

(see, for instance, [15]).

Unanimity rule:The unanimity rule is the q-quota

rule that corresponds to q=m−1

m. (In this case,

mq =m−1, which means that the atoms to be con-

sidered are those that refine all the members of the

ensemble.) Accordingly, the unanimity consensus is

given by the meet of all the members of the ensemble:



i=1

pi.

Majority rule:The majority rule is the q-quota

rule that corresponds to q= 1/2.

In an effort to address the issue of quota rules tend-

ing to generate small clusters, the concept of Dual

Quota Rules has been introduced [15].

Dual quota rules:Given an ensemble E=

{p1,p2, . . . pm}of partial partitionings of the data set

Xand a real number q∈[0,1], the dual q-quota rule,

in notation pd

q, is the coarser partition from among

those that refine every co-atoms that, in turn, is re-

fined by more than mq members of the ensemble.

Since the coarser partition that refines some given

partitions is their meet, it can be concluded that:

q={pM∈CA,:|{pi∈ E :pi⪯pM}| > mq}.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

739

Volume 22, 2023

In the event that the set of such co-atoms is empty,

then pd

q=gX.

The Dual unanimity rule and the Dual majority

rule correspond to q=m−1

mand q=1

2, respectively.

Similar to quota rules, these dual rules are frequently

employed in practical applications.

Restated in terms of consent, quota rules and dual

quota serve as criteria designed to evaluate the in-

ner workings of other consensus methods that may

not be equally transparent. This assessment provides

valuable information about the quality and location of

their consensus solutions, which can then be utilized

to effectively prune the search space.

Quota rules for consent:Given an ensemble E=

{p1,p2, . . . pm}of partial partitionings of the data set

Xand a real number q∈[0,1], the q-quota rule for

consent establishes that any consensus partition, par-

ticularly the minimizers of the mean-based consensus

function φ(p) =



i=1

D(p,pi), must be refined by pq.

In essence, q-quota rules for consent stipulate that

pairs of objects appearing together in the same cluster

for more than qm partitions of the ensemble should

be grouped together in the same cluster of any con-

sensus partition. Consequently, this rule ensures that

any consensus solution must be at least as inclusive as

pq, rather than declaring pqitself as the consensus so-

lution. To illustrate, in terms of intervals in PX, these

rules propose replacing a larger interval, namely the

entire set PXof partitions X, with a smaller interval

defined as [pq,gX].

Similarly, the dual q-quota rules for consent assert

that any consensus solution lies within the interval

[mX,pq].

Dual quota rules for consent:Given an ensem-

ble E={p1,p2, . . . pm}of partial partitionings of

the data set Xand a real number q∈[0,1], the dual

q-quota rule for consent establishes that any consen-

sus partition, particularly the minimizers of the mean-

based consensus function φ(p) =



i=1

D(p,pi), must

refine pd

It is time to introduce the Undesired Atoms crite-

rion. Instead of listing the pairs that should appear

together in the same cluster of a consensus partition,

the following criterion has a prohibitive approach.

Undesired atoms:Given an ensemble E=

{p1,p2, . . . pm}of partial partitionings of the dataset

X, this rule establishes that any atomic parti-

tion pxx′for which there is no sequence x=

xi1, xi2, . . . , xik=x′such that every atom pxij,xij+1 ,

j= 1,2, . . . , j −1, refines at least one partition of the

ensemble E, must refine no consensus solution.

This rule finds its foundation in the notion that

there are essentially two fundamental reasons for

placing a pair of objects in the same cluster of a con-

sensus partition: (i) when the ensemble reaches an

agreement on this pairing, and (ii) when the ensem-

ble decides to do so through consent or cooperation.

For (i), this pair must appear together in a sufficient

number of ensemble members, ensuring the required

quota of agreement. Conversely, in (ii), this pair is

placed together in the same cluster of the consensus

partition due to the chaining effect resulting from co-

operation among the ensemble members. An unde-

sirable atom is one that lacks justification for its ele-

ments to be grouped together in the same cluster, nei-

ther through agreement nor cooperation.

To elaborate, if a partition p refined by an unde-

sirable atom pxx′is prohibited from being a consen-

sus solution, it implies that the union of the intervals

[pxx′,gX], where pxx′ranges over the set of undesired

atoms, does not contain any consensus partitions.

To conclude this section, I will introduce two novel

criteria that I have recently uncovered while inves-

tigating the influence of submodularity on the local-

ization of consensus partitions, [21], [22]. Submodu-

lar functions can be considered akin to discrete con-

vex functions, and as it is widely recognized, con-

vexity is one of the most advantageous scenarios in

continuous optimization. The exploration of the im-

pact of this seemingly natural property has garnered

increasing interest within several areas of mathemat-

ics and computer science, including optimization and

machine learning.

Unanimity rule for consent bounded from above

by the ensemble:Given an ensemble E=

{p1,p2, . . . pm}of partial partitionings of the dataset

X, this rule establishes that any consensus solution

must be refined by the meet of all the members of E,

m

j=1 pj, and refines at least one member of E.

Essentially, this rule asserts that any consensus

partition can be obtained by refining a certain parti-

tion of the ensemble to be discovered, ensuring that

no pairs placed together in the same cluster for any

partition in the ensemble are split apart. Formally,

this statement implies that all consensus partitions lie

within the union of the intervals 





j=1

pj,pi

, pi∈ E.

While this rule may impose a restrictive condition,

it is not intended to be satisfied by every consensus

function; rather, it has emerged naturally when inves-

tigating the behavior of consensus solutions produced

by submodular consensus functions.

In a similar vein and stemming from the same

source, the following and final rule posits that any

consensus partition can be identified by merging clus-

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

740

Volume 22, 2023

ters from a particular member of the ensemble to be

revealed, with the assurance that all new pairs are the

result of cooperation among the members of the en-

semble.

Dual unanimity rule for consent bounded from

below by the ensemble:Given an ensemble E=

{p1,p2, . . . pm}of partial partitionings of the dataset

X, this rule establishes that any consensus solution

must be refined by at least one the pi’s in Eand re-

fines m

j=1 pj, the join of all the members of E. Ac-

cordingly, any consensus partition lies in the union of

the intervals 

pi,



j=1

pj

, pi∈ E.

4 On the exact prune of the search

space

In this section, we calculate the exact reduction of-

fered by each criterion examined in the previous sec-

tion. The phrase ”reduction provided by the rule R”

implies that if an average consensus function satisfies

the rule R, the minimizers of φare among the parti-

tions of PXaccepted by R.

As each criterion provides a region that is either an

interval or a collection of intervals, it is convenient to

determine the number of partitions contained within

an interval [p,p′].

Lemma 1 Let pand p′={C1, C2, . . . , Ck}be par-

titions such that p≺p′. If the cluster Ci,1≤i≤k,

of p′is the union of niclusters of p, then the number

of partitions in the interval [p,p′]is



i=1

Bni.

Table 2 shows some illustrative numbers.

Table 2: Computing the number of partitions in [p,p′]

by applying Lemma 1

n k #p n1;. . . ;nk



i=1

BniBn

10 2 7 3; 4 75 115,975

10 2 8 4; 4 225 115,975

10 2 8 2; 6 406 115,975

12 3 10 2; 3; 5 520 4,213, 597

12 4 11 2; 2; 3; 4 300 4,213,597

15 3 12 3; 4; 5 3 900 1,382,958,545

15 4 10 3; 3; 4 375 1,382,958,545

PROOF 1 Any partition that refines the partition p′

can be thought of as an union of partitions of the clus-

ters of p′. Thus, as a first step toward our desired

conclusion, we could think that if we compute the to-

tal number aiof ways in which every cluster Ciof p′

can be partitioned, then the product of the product of

the ai’s would provide us with an estimate of the num-

ber that we are looking for. Notice that Ciis a set

itself, hence the number of partitions of Ciis the Bell

number corresponding to the number of elements in X

that form Ci. However, we do not want to count every

partition that refines p′, but only those that are refined

by pas well. To take into consideration the latter con-

dition, it suffices to count only those partitions of Ci

in which the elements lying in the same cluster of p

remain together. This means that the finest partition

of Cithat we can consider is that consisting of those

clusters of pthat are contained in Ci. This suggests

that, instead of considering the elements of Xthat

form Cito determine the Bell number that matches

ai, the clusters of pcontained in Cishould be con-

sidered as individual elements to determine such Bell

number. Since there are niof such clusters of p, the

number of ways of partitioning Ciunder the required

condition is Bni, the nith Bell number. Therefore, the

number of partitions in the interval [p,p′]is



i=1

Bni.

Now, we are poised to calculate the precise reduc-

tion offered by each criterion presented in the previ-

ous section. I will present them in the same order in

which they were introduced earlier.

Proposition 1 If pqhas kclusters, then the reduction

provided by q-quota rule for consent is of Bn−Bk

partitions.

PROOF 2 Remember that the q-quota rule for con-

sent basically establishes that any consensus partition

must lie in the interval pq,gX. Since the only cluster

of gXis the entire set X, which is the union of kclus-

ters of pq, it can be concluded, in view of Lemma 1,

that the number of partitions in the interval pq,gXis

Bk. Consequently, the reduction provided by q-quota

rule is of Bn−Bkpartitions.

Let us analyze now the reduction provided by the

dual q-quota rule.

Proposition 2 If the cluster Ciof pd

qhas nielements,

then the reduction provided by the dual q-quota rule

is of Bn−k

i=1 Bnipartitions.

PROOF 3 The dual q-quota rules basically states

that any consensus partition lies in the interval

mX,pd

q. Since all the clusters of mXare singletons,

each cluster of pd

qis the union of niclusters of mXif

and only if it contains nielements of X. The result is

now a direct consequence of Lemma 1.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

741

Volume 22, 2023

Now, let us direct our attention to the Undesired

Atoms rule. Before calculating the reduction pro-

vided by this rule, we must address a more funda-

mental question: How many undesirable atoms are

there? An atom pxx′is deemed undesirable if and only

if the elements xand x′belong to different clusters



j=1

pj—the join of all members in the ensemble

E. Thus, the number of undesirable atoms is equiva-

lent to the number of possibilities for selecting a pair

of elements from Xthat reside in distinct clusters of



j=1

pj.

Given this, each element in an arbitrary cluster Cu

of the join can be paired with each element in an-

other arbitrary cluster Cvof this partition, resulting

in #Cu·#Cvundesirable pairs. This proves the fol-

lowing lemma.

Lemma 2 If



j=1

pjhas kdistinct clusters

C1, C2, . . . , Ckwith n1, n2, . . . , nk, respectively,

then, according to E, there are



1≤u<v≤k

nunv

undesired atoms.

Lemma 3 The join of r≥1atoms consists of at least

n−rclusters (and at most n−1clusters).

PROOF 4 Given an arbitrary partition pand an ar-

bitrary atom pxx′, either xand x′are together in the

same cluster of por not. In the case that xand x′are

placed in the same cluster of p, then p∨pxx′=p. Oth-

erwise, when the join of pand pxx′is taken, the clus-

ters of pthat contain xand x′, respectively, merge into

a single cluster. Thus, p∨pxx′⊏p, which means that

p∨pxx′has one cluster less than p. Therefore, each

time the join of any partition and an atom is taken,

either the partition states the same or its number of

clusters decreases by one.

Suppose the join of ratoms is now considered.

We can think of the initial partition pas any of these

atoms, and then, we will take the join with the r−1

remaining atoms, one at a time. By virtue of the rea-

soning that we just made, after having performing the

join with all the r−1remaining atoms, the initial p

lost no more than r−1clusters.

Proposition 3 Let UA ={A1,A2, . . . , Aq}be the

set of all the undesired atoms according to the ensem-

ble E. The reduction provided by Undesired Atoms

rule is of

qBn−1+

n−1



j=2 





r=j

(−1)r+1prj 

Bn−j(4)

partitions, where prj is the number of sets consist-

ing of exactly rundesired atoms whose join has n−j

clusters (1≤j≤r).

PROOF 5 According to this rule, if a partition lies

in some of the intervals [Aj,gX], then it is prohib-

ited from being a consensus solution. As a result,

the union of these intervals must be removed from the

search space and the reduction provided by this rule

is the cardinality of such a union.

Applying the inclusion-exclusion principle, we get

that #



j=1

[Aj,gX]is equal to



r=1 

u∈Or

(−1)r+1#

pu,gX,(5)

where Or

q:= {u= (u1, u2, . . . , ur)∈Zr: 0 <

u1< u2< . . . < ur< q}and upushould be

understood as pu1∧pu2∧. . . ∧pur, provided u=

(u1, u2, . . . , ur)∈ Or

Now, by virtue of Lemma 3, the join of r≥2atoms

at least n−rclusters. Let us group all the possible

joins of rundesired atoms according to their number

of clusters. Using the fact that if a partition phas

n−jclusters, the interval [p,gX]has Bn−jparti-

tions, we can conclude that Aj1∨Aj2∨. . . Ajrcon-

tributes to (5) with (−1)r+1Bn−jprovided this join

has n−jclusters. Denoting now by prj the num-

ber of joins of ratoms that have n−jclusters, we

get that the total contribution such undesired atoms

is (−1)r+1prj Bn−j. Adding up these contributions,

the result follows.

Next, we examine the Unanimity Rule for Con-

sent Bounded from Above by the Ensemble. As

demonstrated earlier, this rule stipulates that consen-

sus solutions should lie in the union of the inter-

vals 





j=1

pj,pi

,i= 1,2, . . . , m. Applying again

the inclusion-exclusion principle combined with the

property that the intersection of two intervals of the

form [p,p′]and [p,p′′], respectively, is the interval

[p,p′∧p′′]because p′∧p′′ is the coarser partition that

simultaneously refines the partitions p′and p′′, it fol-

lows that



i=1 





j=1

pj,pi

is equal to:

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

742

Volume 22, 2023

m−1



r=1

(−1)r−1

u∈Or

#





j=1

pj,

pu

.

It suffices now to apply Lemma 1 to prove the fol-

lowing proposition.

Proposition 4 Given u∈ Or

m,r= 1,2, . . . , m −1,

let nu

sustand for the number of clusters of



j=1

whose union is the suth cluster of 

pu, where

pu1,pu2, . . . , purbelong to ensemble E. The re-

duction provided by Unanimity Rule for Consent

Bounded From Above by the Ensemble is of:

Bn−

m−1



r=1

(−1)r−1

u∈Or

m

Bnu

su.

Analogously to the previous rule, we can calculate

the reduction provided by the Dual Unanimity Rule

for Consent Bounded from Below by the Ensemble.

This rule essentially states that any consensus solu-

tion lies in the union of the intervals 

pi,



j=1

pj

,

i= 1,2, . . . , m. The cardinality of these intervals

can be determined using the inclusion-exclusion prin-

ciple coupled with the fact that the intersection of two

intervals of the form [p′,p]and [p′′,p]is the interval

[p′∨p′′,p]because p′∨p′′ is the finer partition that

simultaneously refines the partitions p′and p′′.

Thus, the cardinality #



i=1 

pi,



j=1

pj

is equal to:

m−1



r=1

(−1)r−1

u∈Or

#



pu,



j=1

pj

.

On account of Lemma 1 we get the following

proposition.

Proposition 5 Given u∈ Or

m,r= 1,2, . . . , m −

1, let mu

sstand for the number of clusters of 

pu,

pu1,pu2, . . . , pur∈ E, whose union is the sth cluster



j=1

pj,1≤s≤k. The reduction provided by Dual

Unanimity Rule for Consent Bounded From Below by

the Ensemble is of:

Bn−

m−1



r=1

(−1)r−1

u∈Or



s=1

Bmu

5 State-of-the-art dissimilarity

measures

The Table 3 compiles the most significant results

reported in literature for consensus functions associ-

ated to well-known dissimilarity functions.

Table 3: State-of-the-art measures’ compliance with

reduction criteria (Unanimity (U); Dual Unanimity

(DU); Majority (M); Dual Majority (DM)).

Measure U DU M DM

Var. of Inform. ✓ ✓

Mirkin metric ✓ ✓

Dual Symm. Diff. Metric ✓ ✓ ✓

Lattice Metric ✓ ✓ ✓

0-1 Disagreement meas. ✓

Split-merge meas. ✓ ✓

As previously mentioned, the works of

Barthélemy and Leclerc ensure that the consen-

sus functions linked to any metric constructed from

a lower valuation exhibit a behavior akin to the

consensus functions associated with the Variation

of Information and Mirkin metric, respectively.

Likewise, the consensus functions related to any

metric constructed from an upper valuation display a

behavior similar to the consensus function associated

with the lattice metric [15].

6 Conclusions and future work

This paper embarked on an exploration of lattice the-

ory to delve into the fundamental distinctions between

the two primary approaches to consensus within the

domain of partitional clustering. By presenting for-

mal evidence, we clarified the preference for average-

based methods over quota rules, despite the latter’s

intuitive and easily interpretable nature. We under-

scored that quota rules and similar criteria have not

diminished in importance; rather, they have assumed

a new role.

These methods are now instrumental in apprais-

ing the quality of average-consensus solutions and

furnishing valuable insights into the whereabouts of

average solutions. They assist in the efficient prun-

ing of the search space. The principal contribution

of this article is the compilation of key criteria for

evaluating average-consensus solutions and the quan-

tification of the exact reduction in the search space

achieved by each criterion. Moreover, these findings

offer insights into the pruning capabilities of each cri-

terion, enhancing our understanding of mean-based

consensus methods, which carry significant practical

implications.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

743

Volume 22, 2023

Looking ahead, further investigation is warranted

to analyze the various average consensus functions

corresponding to well-known dissimilarity measures.

While the Symmetric Difference measure has re-

ceived extensive scrutiny, other widely-used metrics

such as Variation of Information and Van Dongen’s

metric remain less explored. The results presented

here expand the horizons for in-depth exploration of

mean-based consensus methods in future research.

References:

[1] Anil K. Jain, Data clustering: 50 years beyond K-

means. Pattern Recognit. Lett. 31(8), 2010, 651-

666.

[2] Jon M. Kleinberg, An Impossibility Theorem for

Clustering, in Advances in Neural Information

Processing Systems 15 [Neural Information Pro-

cessing Systems, NIPS 2002, December 9-14,

2002, Vancouver, British Columbia, Canada],

MIT Press, (2002) 446–453.

[3] Margareta Ackerman, Shai Ben-David, Simina

Brânzei, David Loker: Weighted clustering: To-

wards solving the user’s dilemma. Pattern Recog-

nit. 120, 2021, 108152.

[4] Carlsson, Gunnar and Mamoli, Facundo, Classi-

fying Clustering Schemes, Foundations of Com-

putational Mathematics, 13 (2), (2013) 221–252.

[5] Jyrko Correa-Morris, An indication of unifica-

tion for different clustering approaches, Pattern

Recognit., 46 (9), (2013) 2548–2561.

[6] Tossapon Boongoen, Natthakan Iam-On, Cluster

ensembles: A survey of approaches with recent

extensions and applications, Computer Science

Review, 28, 2018, 1-25.

[7] Mustafa R. Kadhim, Guangyao Zhou, Wenhong

Tian, A novel self-directed learning framework

for cluster ensemble, Journal of King Saud Uni-

versity - Computer and Information Sciences,

34(10), Part A, 7841-7855.

[8] Hanan G. Ayad, Mohamed S. Kamel, On voting-

based consensus of cluster ensembles, Pattern

Recognition, 43(5), 2010, 1943-1953.

[9] Caroline X. Gao, Dominic Dwyer, Ye Zhu,

Catherine L. Smith, Lan Du, Kate M. Filia, Jo-

hanna Bayer, Jana M. Menssink, Teresa Wang,

Christoph Bergmeir, Stephen Wood, Sue M. Cot-

ton, An overview of clustering methods with

guidelines for application in mental health re-

search, Psychiatry Research, 327, 2023, 115265.

[10] Fishburn, Peter C., and Ariel Rubinstein, Aggre-

gation of equivalence relations, Journal of classi-

fication 3, 1986, 61-65.

[11] Dinko Dimitrov, Thierry Marchant and Debasis

Mishra, Separability and aggregation of equiva-

lence relations, Economic Theory, 51 (1), 2012,

212.

[12] Livieratos, John, Phokion G. Kolaitis, and Lef-

teris Kirousis, On the computational complexity

of non-dictatorial aggregation, Journal of Artifi-

cial Intelligence Research, 72, 2021, 137-183.

[13] Baronchelli Andrea, The emergence of con-

sensus: a primer, R. Soc. Open Sci., 2018,

5172189172189.

[14] Jeff E. Biddle; Daniel S. Hamermesh, Theory

and Measurement: Emergence, Consolidation,

and Erosion of a Consensus, History of Political

Economy, 49 (Supplement), 2017, 34–57.

[15] Jean-Pierre Barthélemy and Bruno Leclerc, The

Median Procedure for Partitions, Partitioning

Data Sets, American Mathematics Society, Series

in Discrete Math, 1995, 3-34.

[16] Sandro Vega-Pons, Jyrko Correa-Morris, José

Ruiz-Shulcloper: Weighted partition consensus

via kernels. Pattern Recognit. 43(8), 2010, 2712-

2724.

[17] Zoi Terzopoulou, Quota rules for incomplete

judgments, Mathematical Social Sciences, 107,

2020, 23-36.

[18] J. He, L. Cai and X. Guan, Differential Pri-

vate Noise Adding Mechanism and Its Applica-

tion on Consensus Algorithm, IEEE Transactions

on Signal Processing, 68, 2020, 4069-4082, doi:

10.1109/TSP.2020.3006760.

[19] Teixeira, P., Marques, M. M., Hagger, M.

S., Silva, M., Brunet, J., Duda, J., … Hag-

ger, M., Classification of techniques used in

self-determination theory-based interventions in

health contexts: an expert consensus study, 2019,

https://doi.org/10.31234/osf.io/z9wqu

[20] Adrianna Kozierkiewicz-Hetmańska, The anal-

ysis of expert opinions’ consensus quality, Infor-

mation Fusion, 34, 2017, 80-86.

[21] Jyrko Correa-Morris, Comparing Partitions:

Shortest Path Length Metrics and Submodularity,

Intern. J. of Math. Models and Meth. in Appl. Sci.,

13, 2019, 45-51.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

744

Volume 22, 2023

[22] Jyrko Correa-Morris: The median partition and

submodularity, Appl. Math. Comput. 410, 2021,

126450.

Contribution of Individual Authors to the

Creation of a Scientific Article

Jyrko Correa-Morris was responsible for the sections

of the paper.

Sources of Funding for Research Presented in a

Scientific Article or Scientific Article Itself

Partial support for this research has been provided

by the U.S. Department of Education under grant

P120A200005.

Conflicts of Interest

The authors have no conflicts of interest to

declare that are relevant to the content of this

article.

Creative Commons Attribution License 4.0

(Attribution 4.0 International , CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en

_US

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.81

Jyrko Correa-Morris

E-ISSN: 2224-2880

745

Volume 22, 2023