Ontology Mining by exploiting Machine Learning for Semantic Data Management (slides)
Wednesday 15 November, 09:00-10:00. Session chair: Amedeo Napoli (CNRS, LORIA & Inria Nancy).
Claudia d’Amato, University of Bari, Italy
In the Semantic Web view, ontologies play a key role. They act as shared vocabularies for semantically annotating Web resources, and they make it possible to perform deductive reasoning that makes explicit the knowledge implicitly contained within them. However, since the Web is a shared and distributed environment, noisy or inconsistent ontological knowledge bases may occur, so that deductive reasoning is no longer straightforwardly applicable.
Machine learning techniques, and specifically inductive learning methods, could be fruitfully exploited in this case. Additionally, machine learning methods, jointly with standard reasoning procedures, could be usefully employed for discovering new knowledge from an ontological knowledge base that is not logically derivable from it.
The focus of the talk will be on various ontology mining problems and on how machine learning methods could be exploited for coping with them. Ontology mining refers to all those activities that make it possible to discover hidden knowledge from ontological knowledge bases, possibly by using only a sample of the data.
Specifically, by exploiting the volume of the information within an ontology, machine learning methods could be of great help for semi-automatically enriching and refining existing ontologies, for detecting concept drift and novelties within ontologies and for discovering hidden knowledge patterns.
On the one hand, this means abandoning sound and complete reasoning procedures in favour of uncertain conclusions. On the other hand, it could make it possible to reason at large scale and to deal with the intrinsic uncertainty characterizing the Web, which, by its nature, may contain incomplete and/or contradictory information.
Claudia d’Amato is a research assistant (on the tenure track for an associate professorship) at the Computer Science Department of the University of Bari. Over the years, she has been an invited researcher at several universities and international research institutes, such as the University of Koblenz-Landau, the University of Oxford, INRIA at Sophia-Antipolis, the University of Poznan and Fondazione Bruno Kessler, among others.
Claudia d’Amato obtained her PhD in 2007 from the University of Bari, defending the thesis titled "Similarity Based Learning Methods for the Semantic Web", for which she was nominated as author of one of the Best Italian PhD Theses in Artificial Intelligence by the Artificial Intelligence Italian Commission for the AI*IA award 2007. She pioneered research on developing Machine Learning methods for ontology mining, which still represents her main research interest.
Her research activity has been disseminated through 19 journal papers, 12 book chapters, 55 papers in international collections, 27 papers in international workshop proceedings and 13 articles in national conference and workshop proceedings. She edited 27 books and proceedings and 3 journal special issues. During her research activity she also won several best paper awards.
Claudia d’Amato served/is serving as Program Chair at ISWC 2017, ESWC 2014, Vice-Chair at ISWC’09, Journal Track chair at WWW 2018, Machine Learning Track Chair at ESWC’12-’13-’16-’17, PhD Symposium chair at ESWC’15 and Workshop and Tutorial Chair at ISWC’12, EKAW’12, ICSC’12.
She served/is serving as a program committee member of a number of international conferences in the area of Artificial Intelligence, Machine Learning and Semantic Web such as AAAI, IJCAI, ECAI, ECML, ISWC, WWW, ESWC.
Swift Logics for Big Data (article)
Thursday 16 November, 12:00-13:00. Session chair: Pierre Genevès (CNRS, LIG).
Georg Gottlob, University of Oxford and TU Wien
Reasoning with and about big data, in particular massive web data, is a great challenge. On the one hand, we aim for powerful inference mechanisms that add value by creating knowledge from the data. Such mechanisms seem to require sophisticated logics with high expressive power. On the other hand, we need swift inference algorithms with acceptable computational complexity. In this talk, reasoning formalisms that achieve both are presented: we introduce and describe specific KRR formalisms for big data that belong to the Datalog+/- family of languages. These logical languages extend the well-known Datalog language with additional features (the "+") to gain expressive power, but simultaneously impose syntactic restrictions (the "-") so as to achieve tractability and scalability. After discussing the theoretical foundations of Datalog+/-, some applications to ontological reasoning, web data extraction, data wrangling, and general reasoning about data will be illustrated, among which are some recent commercial applications.
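As a small illustration of the plain Datalog core that Datalog+/- extends, the sketch below evaluates a two-rule recursive program bottom-up until a fixpoint is reached. It is only an assumed, minimal example: the predicate names `edge` and `path`, the function name and the sample facts are hypothetical, and neither the "+" features nor the "-" restrictions of Datalog+/- are modelled.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the Datalog program
         path(X, Y) :- edge(X, Y).
         path(X, Z) :- path(X, Y), edge(Y, Z).
    Facts are represented as a set of (source, target) tuples."""
    path = set(edges)  # first rule: every edge is a path
    while True:
        # second rule: join the current path facts with the edge facts
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        new_facts = derived - path
        if not new_facts:  # fixpoint: no rule derives anything new
            return path
        path |= new_facts


# Hypothetical sample facts.
edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edges)
print(sorted(closure))
```

Because every round can only add facts over a finite set of constants, the loop always terminates; this guaranteed fixpoint is the kind of property that the syntactic restrictions mentioned above aim to preserve in more expressive languages.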
Georg Gottlob is a Professor of Informatics at Oxford University and at TU Wien. His interests include KR, theory of data and knowledge bases, logic and complexity, problem decompositions, and, on the more applied side, web data extraction and database query processing. Gottlob has received the Wittgenstein Award from the Austrian National Science Fund, is an ACM Fellow, an ECCAI Fellow, a Fellow of the Royal Society, and a member of the Austrian Academy of Sciences, the German National Academy of Sciences, and the Academia Europaea. He chaired the Program Committees of IJCAI 2003 and ACM PODS 2000. He was the main founder of Lixto, a company that provides tools and services for semi-automatic web data extraction, which was acquired by McKinsey & Company in 2013. Gottlob was awarded an ERC Advanced Investigator’s Grant for the project "DIADEM: Domain-centric Intelligent Automated Data Extraction Methodology". Based on results of this project, he co-founded Wrapidity Ltd, a company that specializes in fully automated web data extraction and that was recently acquired by Meltwater, an international media intelligence firm.
Galois connections for dependencies in databases (slides)
Friday 17 November, 09:00-10:00. Session chair: Amedeo Napoli (CNRS, LORIA & Inria Nancy).
Sergei O. Kuznetsov, National Research University Higher School of Economics, Moscow
Dependencies have been an important issue since the first works on databases. Functional dependencies, multivalued dependencies, and other types of dependencies were used in database engineering and in the study of database decomposability; they were also used to define database schemes. Research in data mining and knowledge discovery prompted a new wave of interest in dependencies and their approximate versions, but from another point of view: they are now "mined" from databases rather than given in advance. In this talk I show that Galois connections, a construction from order and lattice theory, allow for a general view on dependencies, relating them to other tools of knowledge discovery, such as domain taxonomies and biclusters. Algorithmic issues of various problems related to the generation and inference of dependencies will be discussed.
Sergei O. Kuznetsov graduated in 1985 from the Moscow Physical-Technical Institute, Department of Control and Applied Mathematics, with a Diploma on Combinatorial and Logical Issues of a Plausible Reasoning System. From 1985 to 2006 he was a researcher at the All-Russian Institute for Scientific and Technical Information (VINITI) of the Russian Academy of Sciences (Moscow). In 1990 he defended his "Candidate of Science" (PhD equivalent) thesis "On algorithmic and knowledge representation issues of a machine learning system (JSM-method)" in Theoretical Foundations of Computer Science at VINITI (Moscow). In 2002 he defended his "Doktor Nauk" (habilitation) thesis "A Theory of Machine Learning in Concept Lattices" at the Computer Center of the Russian Academy of Sciences.
In 1999-2004 he was a Humboldt fellow and invited professor at the Department of Mathematics and Science, Dresden Technical University (Dresden, Germany). Since 2006 he has been Professor at the National Research University Higher School of Economics (HSE), Head of the Department of Data Analysis and Artificial Intelligence, and Head of the International Laboratory for Intelligent Systems and Structural Analysis, HSE (Moscow).
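One standard way to see the link between Galois connections and functional dependencies can be sketched as follows. It is an illustrative example under stated assumptions, not necessarily the construction presented in the talk. A set of attributes is mapped to the pairs of rows that agree on all of them, and a set of row pairs is mapped back to the attributes on which all of those pairs agree. Composing the two maps gives a closure operator, and a functional dependency X → Y holds in the table exactly when Y lies in the closure of X. The function names and the toy table are hypothetical.

```python
from itertools import combinations


def agree_pairs(rows, attrs):
    """Pairs of row indices whose rows agree on every attribute in attrs."""
    return {
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if all(rows[i][a] == rows[j][a] for a in attrs)
    }


def common_attrs(rows, pairs, all_attrs):
    """Attributes on which every given pair of rows agrees."""
    return {a for a in all_attrs
            if all(rows[i][a] == rows[j][a] for i, j in pairs)}


def closure(rows, attrs, all_attrs):
    """Closure of attrs under the composition of the two maps above."""
    return common_attrs(rows, agree_pairs(rows, attrs), all_attrs)


def fd_holds(rows, lhs, rhs, all_attrs):
    """The functional dependency lhs -> rhs holds iff rhs is in closure(lhs)."""
    return set(rhs) <= closure(rows, lhs, all_attrs)


# Hypothetical toy table.
ATTRS = {"city", "country", "zip"}
rows = [
    {"city": "Paris", "country": "FR", "zip": "75001"},
    {"city": "Paris", "country": "FR", "zip": "75002"},
    {"city": "Nancy", "country": "FR", "zip": "54000"},
]

print(fd_holds(rows, {"city"}, {"country"}, ATTRS))
print(fd_holds(rows, {"country"}, {"city"}, ATTRS))
```

In this toy table, rows with the same city always share a country, so city → country holds, whereas country → city fails because two rows with the same country name different cities. The check enumerates all row pairs, so it is quadratic in the number of rows and is meant only to make the definition concrete.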
Scope of interests: methods and algorithms of machine learning and knowledge discovery, formal concept analysis, algorithmic complexity.