Two algorithms for finding optimal solutions of the Kemeny rank aggregation problem for full rankings


Abstract


The analysis of ranking data has recently received increasing attention in many fields (i.e. political sciences, computer sciences, social sciences, medical sciences, etc.).Typically when dealing with preference rankings one of the main issue is to find a ranking that best represents the set of input rankings.Among several measures of agreement proposed in the literature, the Kendall's distance is probably the most known. We propose a branch-and-bound algorithm to find the solution(s) even when we take into account a relatively large number of objects to be ranked. We also propose a heuristic variant of the branch-and-bound algorithm useful when the number of objects to rank is particularly high. We show how the solution(s) achieved by the algorithm can be employed in different analysis of rank data such as Mallow's phi model, mixtures of distance-based models, cluster analysis and so on.

DOI Code: 10.1285/i20705948v8n2p198

Keywords: Consensus ranking, Median ranking, Branch and bound, Mallows-phi model, exponential models.

References


Alvo, M. and Yu, P. L. (2014). Statistical Methods for Ranking Data. Springer.

Amodio, S., D'Ambrosio, A., and Siciliano, R. (2015). Accurate algorithms for identifying

the median ranking when dealing with weak and partial rankings under

the kemeny axiomatic approach. European Journal of Operational Research. doi:

http://dx.doi.org/10.1016/j.ejor.2015.08.048.

Bartholdi III, J., Tovey, C. A., and Trick, M. A. (1989). Voting schemes for which it can

be dicult to tell who won the election. Social Choice and welfare, 6(2):157-165.

Borg, I., Groenen, P. J., and Mair, P. (2012). Applied multidimensional scaling. Springer

Science & Business Media.

Brusco, M. J. and Stahl, S. (2006). Branch-and-bound applications in combinatorial data

analysis. Springer Science & Business Media.

Busing, F. M., Heiser, W. J., and Cleaver, G. (2010). Restricted unfolding: Preference

analysis with optimal transformations of preferences and attributes. Food quality and

preference, 21(1):82-92.

Carroll, J. D. (1972). Individual dierences and multidimensional scaling. Multidimensional scaling: Theory and applications in the behavioral sciences, 1:105-155.

Clausen, J. (1999). Branch and bound algorithms-principles and examples. Department

of Computer Science, University of Copenhagen, pages 1-30.

Condorcet, M. (1785). Essai sur l'application de l'analyse à la probabilité des décisions

rendues à la pluralitè des voix. L'imprimerie royale.

Contucci, P., Panizzi, E., Ricci-Tersenghi, F., and Sirbu, A. (2015). Egalitarianism in

the rank aggregation problem: a new dimension for democracy. Quality & Quantity,

pages 1-16.

Cook, W. D. (2006). Distance-based and ad hoc consensus models in ordinal preference

ranking. European Journal of Operational Research, 172(2):369-385.

Coombs, C. H. (1950). Psychological scaling without a unit of measurement. Psychological review, 57(3):145.

Coombs, C. H. (1964). A theory of data.

Critchlow, D. E., Fligner, M. A., and Verducci, J. S. (1991). Probability models on

rankings. Journal of mathematical psychology, 35(3):294-318.

Croon, M. A. (1989). Latent class models for the analysis of rankings. Advances in

psychology, 60:99-121.

D'Ambrosio, A. and Amodio, S. (2015). ConsRank, Compute the Median Ranking(s)

According to the Kemeny's Axiomatic Approach. R package version 1.0.2. https:

//cran.r-project.org/web/packages/ConsRank/index.html/.

D'Ambrosio, A. and Heiser, W. J. (2009). Decision trees for preference rankings. In

Procceedings of the 7th Meeting of the Classification and Data Analysis Group, Catania

- Italy, pages 133{136. CLEUP-Padova.

D'Ambrosio, A. and Heiser, W. J. (2015). A recursive partitioning method for the

prediction of preference rankings based upon kemeny distances. Technical report,

University of Naples Federico II. Submitted manuscript.

de Borda, J. C. (1781). Mèmoire sur les èlections au scrutin.

Deza, M. M. and Deza, E. (2009). Encyclopedia of distances. Springer.

Diaconis, P. (1988). Group representations in probability and statistics. Lecture Notes-

Monograph Series, pages i-192.

Emond, E. J. and Mason, D. W. (2002). A new rank correlation coecient with application

to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis,

(1):17-28.

Fligner, M. A. and Verducci, J. S. (1986). Distance based ranking models. Journal of

the Royal Statistical Society. Series B (Methodological), pages 359-369.

Fligner, M. A. and Verducci, J. S. (1988). Multistage ranking models. Journal of the

American Statistical association, 83(403):892-901.

Garca-Lapresta, J. L. and Pèrez-Romàn, D. (2010). Consensus measures generated by

weighted kemeny distances on weak orders. In Procceedings of the 10th International

Conference on Intelligent Systems Design and Applications, Cairo.

Gormley, I. C. and Murphy, T. B. (2008). Exploring voting blocs within the irish electorate:

A mixture modeling approach. Journal of the American Statistical Association,

(483):1014-1027.

Greenho, K. and MacFie, H. (1994). Preference mapping in practice. In Measurement

of food preferences, pages 137-166. Springer.

Heiser, W. J. (2004). Geometric representation of association between categories. Psychometrika, 69(4):513-545.

Heiser, W. J. and D'Ambrosio, A. (2013). Clustering and prediction of rankings within

a kemeny distance framework. In Algorithms from and for Nature and Life, pages

-31. Springer.

Heiser, W. J. and D'Ambrosio, A. (2014). K-median cluster component analysis. Technical report, University of Naples Federico II. Submitted manuscript.

Heiser, W. J. and De Leeuw, J. (1981). Multidimensional mapping of preference data.

Mathèmatiques et Sciences humaines, 73:39-96.

Jarvelin, K. and Kekalainen, J. (2002). Cumulated gain-based evaluation of ir techniques.

ACM Transactions on Information Systems (TOIS), 20(4):422-446.

Kemeny, J. G. (1959). Mathematics without numbers. Daedalus, 88(4):577-591.

Kendall, M. G. (1948). Rank correlation methods. Griffin.

Mallows, C. L. (1957). Non-null ranking models. i. Biometrika, pages 114-130.

Marden, J. I. (1996). Analyzing and modeling rank data. CRC Press.

Murphy, T. B. and Martin, D. (2003). Mixtures of distance-based models for ranking

data. Computational statistics & data analysis, 41(3):645-655.

Pagliuca, M. M. and Scarpato, D. (2011). Food quality, consumer perception and preferences:

an analysis on olive oil. Electronic Journal of Applied Statistical Analysis,

(2):215-226.

Shieh, G. S. (1998). A weighted kendall's tau statistic. Statistics & Probability Letters,

(1):17-24.

Shieh, G. S., Bai, Z., and Tsai, W.-Y. (2000). Rank tests for independence-with a

weighted contamination alternative. Statistica Sinica, 10(2):577-594.

Tarsitano, A. (2009). Comparing the eectiveness of rank correlation statistics. Working

Papers 200906, Università della Calabria, Dipartimento di Economia e Statistica.

Thompson, G. (1993). Generalized permutation polytopes and exploratory graphical

methods for ranked data. The Annals of Statistics, pages 1401-1430.

Thurstone, L. L. (1927). A law of comparative judgment. Psychological review, 34(4):273.

Vermunt, J. K. (2003). Multilevel latent class models. Sociological methodology,

(1):213-239.


Full Text: pdf


Creative Commons License
This work is licensed under a Creative Commons Attribuzione - Non commerciale - Non opere derivate 3.0 Italia License.