Habilitation à diriger des recherches

Contributions to Pattern Discovery and Formal Concept Analysis

Abstract : The process of collecting and analyzing data to answer predictive, explanatory, and decision-making questions has been known as "data science" for more than thirty years. First used only by scientists, mainly statisticians, the term is now widespread in both academia and industry. This can be explained in two ways: (i) data is ubiquitous, large, and varied, and (ii) there is a growing awareness of the vast potential of data. This potential can be economic, societal, scientific, or related to health care, and it rests not only on the data that an entity holds but also on the data that it can obtain (sensors, social networks, open data, etc., freely or not). Data thus becomes a black oil that still needs algorithms, methods, and methodologies to be properly refined. One component of data science, Knowledge Discovery in Databases (KDD), deals in particular with the Data-Information-Knowledge process, with the aim of explaining relationships or discovering hidden properties. In contrast to a purely statistical approach, one family of methods has met with considerable success over the last twenty years: data mining, and especially pattern mining. Their goal is to describe, summarize, and raise hypotheses from data. In particular, pattern mining makes it possible to efficiently find regularities of various types (such as frequent patterns in a set of transactions, molecular subgraphs characteristic of toxicity, locally co-expressed gene groups, etc.). Where conventional approaches aim to validate or invalidate a hypothesis given a priori, pattern search is seen as a technique for enumerating all possible hypotheses (a set of exponential size w.r.t. the input data) that satisfy given constraints or maximize a certain interest for the expert. Once discovered, the best hypotheses can then be tested and, if validated, ultimately retained as units of knowledge.
My scientific adventure began with the study of a binary relation, very often illustrated by grocery store transaction data linking customers to the products they buy. How can we make this relation speak? What knowledge, behavioral habits, recommendations, etc. can we characterize? This initial question led me through different application fields (biology, neuroscience, social networks, and video game analytics), seeking to implement or adapt data mining methods to understand certain phenomena while formalizing data and patterns as rigorously as possible. This is the story told in this manuscript, organized along three main research axes: the formalism framing the methods (Formal Concept Analysis), the methodological and algorithmic aspects of data mining, and finally Knowledge Discovery "in practice" through several concrete applications encountered during collaborations with other scientists or industrial partners.
Document type : Habilitation à diriger des recherches

Cited literature [185 references]
Contributor : Mehdi Kaytoue
Submitted on : Sunday, March 1, 2020 - 8:00:21 PM
Last modification on : Friday, September 30, 2022 - 11:34:15 AM
Long-term archiving on : Sunday, May 31, 2020 - 12:40:46 PM


Files produced by the author(s)


  • HAL Id : tel-02495263, version 1


Mehdi Kaytoue. Contributions to Pattern Discovery and Formal Concept Analysis. Artificial Intelligence [cs.AI]. INSA LYON; Université Claude Bernard Lyon 1, 2020. ⟨tel-02495263⟩


