Wiki3DRank: a model for measuring the relevance of knowledge objects using quantitative data from Wikidata and Wikipedia

Authors

Pastor-Sánchez, J.-A.; Saorín, T.; Baños-Moreno, M.-J.

Keywords:

Wiki3DRank, Ranking, Wikidata, Wikipedia, Encyclopedic Knowledge, Domain analysis

Abstract

This research introduces Wiki3DRank, a model that combines quantitative data extracted in real time from Wikidata and Wikipedia to rank knowledge objects by a value that measures the relevance of one object compared with others in a specific domain. The model is based on the distribution of knowledge objects in a vector space whose components are based on three main variables: the number of statements about an item on Wikidata, the number of articles in different Wikipedia editions, and the length in words of these articles. These variables are associated, respectively, with the level of description of the Wikidata items, the dissemination of the referred knowledge objects across Wikipedia editions in different languages, and the degree of editorial elaboration of the corresponding Wikipedia articles. To demonstrate the viability of the model, a series of use cases across various domains is analysed: books, movies, cathedrals, earthquakes, rivers, and chemical elements. From the results obtained, it is possible to conclude that Wiki3DRank is a tool that allows the relevance of knowledge objects to be measured in the context of a knowledge domain. The operation of an open-source tool that enables the online calculation of Wiki3DRank is also presented. The results suggest that the proposed model can be applied to different contexts and domains, and that it is easy to expand by adding weighting elements and by extending it with new components based on other characteristics of the encyclopaedic data of the knowledge objects, while the base vector calculation system is maintained.
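
The abstract does not state the formula, so the following is only a minimal sketch of how a three-component vector ranking of this kind could be computed. It assumes that each variable is scaled by its maximum within the domain and that the score is the Euclidean length of the resulting vector; the names `KnowledgeObject` and `rank_by_vector_length`, the scaling step, and the sample figures are all hypothetical and are not taken from the article or from the authors' tool.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class KnowledgeObject:
    label: str
    statements: int  # number of Wikidata statements about the item
    editions: int    # number of Wikipedia editions with an article
    words: int       # length, in words, of those articles (assumed summed)


def rank_by_vector_length(objects):
    """Return (label, score) pairs sorted from highest to lowest score.

    Assumption (not from the article): each component is divided by its
    maximum in the domain, and the score is the Euclidean length of the
    resulting three-dimensional vector.
    """
    if not objects:
        return []
    # Fall back to 1 so a domain whose maximum is 0 does not divide by zero.
    max_s = max(o.statements for o in objects) or 1
    max_e = max(o.editions for o in objects) or 1
    max_w = max(o.words for o in objects) or 1
    scored = []
    for o in objects:
        vector = (o.statements / max_s, o.editions / max_e, o.words / max_w)
        scored.append((o.label, sqrt(sum(c * c for c in vector))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented figures for illustration only, not real Wikidata/Wikipedia data.
    sample = [
        KnowledgeObject("object A", 100, 50, 20000),
        KnowledgeObject("object B", 50, 25, 10000),
        KnowledgeObject("object C", 10, 5, 1000),
    ]
    for label, score in rank_by_vector_length(sample):
        print(f"{label}: {score:.3f}")
```

Because the score is computed relative to the other objects supplied, the same object can obtain a different value in a different domain, which matches the domain-specific notion of relevance described above. A weighting scheme of the kind the abstract mentions could be sketched by multiplying each scaled component by a factor before taking the length.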

Published

2024-06-14

How to Cite

Pastor-Sánchez, J.-A., Saorín, T., & Baños-Moreno, M.-J. (2024). Wiki3DRank: a model for measuring the relevance of knowledge objects using quantitative data from Wikidata and Wikipedia. Ibersid: Journal of Information and Documentation Systems (ISSNe 2174-081X; ISSN 1888-0967), 18(1), 55–70. Retrieved from https://ibersid.eu/ojs/index.php/ibersid/article/view/4967

Section

Articles