Publications

Thesis

[1]-À la recherche de l'échafaudage parfait : efficace, de qualité et garanti (in french)
Tom Davot
Thesis
2020
Abstractdoi logorg logo
Sequencing is a process in biology that determines the order of nucleotides in the DNA. It produces a set of fragments, called reads, in which the genetic information is known. Unfortunatly, the genomic sequence is decomposed in small pieces. In order to analyse it, it is necessary to reconstruct it using a number of computer processes. In this thesis, we studied two mathematical problems arising from this sequencing: the scaffolding and the linearization.The scaffolding is a process that takes place after the reads assembly into larger subsequences called contigs. It consists in the search of paths and cycles in a particular graph called scaffold graph. These paths and cycles represent the linear and circular chromosomes of the organism whose DNA has been sequenced. The linearization is a problem related to the scaffolding. When we take into account that contigs may appear several times in the genomic sequence, some ambiguities can arise. If this ambiguities are not deleted, then a chimeric sequence may be produced by the scaffolding. To solve this problem, a solution computed by the scaffolding should be wisely deteriorated. In any case, both problems can be modelized as optimization problems in a graph.In this document, we study both problems focusing on three aspects. The first aspect consists in the study of the complexity of these problems. The second aspect consists in the development of algorithms, exact or approximate, to solve these problems. Finally, the last aspect consists in implementing and testing these algorithms to look at their behaviors on real instances.

Journal

2023

[2]-On the Shared Transportation Problem: Computational Hardness and Exact Approach
Tom Davot, Rodolphe Giroudeau and Jean-Claude König
International Journal of Foundations of Computer Science
2023
Abstractdoi logorg logo
In our modern societies, a certain number of people do not own a car, by choice or by obligation. For some trips, there is no or few alternatives to the car. One way to make these trips possible for these people is to be transported by others who have already planned their trips. We propose to model this problem using as path-finding problem in a list edge-colored graph. This problem is a generalization of the s-t-path problem, studied by Böhmová et al. We consider two optimization functions: minimizing the number of color changes and minimizing the number of colors. We study for the previous problems, the classic complexity (polynomial-case, NP-completeness, hardness of approximation) and parameter complexity (W[2]-hardness) even in restricted cases. We also propose a lower bound for exact algorithm. On the positive side we provide a polynomial-time approximation algorithm and a FPT algorithm.
[3]-Ricochet Robots with Infinite Horizontal Board is Turing-complete
Samuel Masseport and Tom Davot and Rodolphe Giroudeau
Journal of Information Processing
2023
Abstractdoi logorg logo
This paper investigates the Ricochet Robots game problem from a complexity standpoint. The problem consists of moving robots in a rectangular square-tiled board, from initial tiles to reach specific target tiles. A robot can only move vertically or horizontally and when it starts to move in a given direction, the robot follows this direction, until being blocked by a wall or another robot. This paper proves that the corresponding decision problem to Ricochet Robots is Turing-complete for endless board game and an infinite number of robots. A reduction from a universal Turing machine to Ricochet Robots is exhibited.

2022

[4]-On a greedy approach for genome scaffolding
Tom Davot, Annie Chateau, Rohan Fossé, Rodolphe Giroudeau and Mathias Weller
Algorithms for Molecular Biology
2022
Abstractdoi logorg logo
*Background* Scaffolding is a bioinformatics problem aimed at completing the contig assembly process by determining the relative position and orientation of these contigs. It can be seen as a paths and cycles cover problem of a particular graph called the “scaffold graph”. *Results* We provide some NP-hardness and inapproximability results on this problem. We also adapt a greedy approximation algorithm on complete graphs so that it works on a special class aiming to be close to real instances. The described algorithm is the first polynomial-time approximation algorithm designed for this problem on non-complete graphs. *Conclusion* Tests on a set of simulated instances show that our algorithm provides better results than the version on complete graphs.

2021

[5]-Producing Genomic Sequences after Genome Scaffolding with Ambiguous Paths: Complexity, Approximation and Lower Bounds
Tom Davot, Annie Chateau, Rodolphe Giroudeau, Mathias Weller and Dorine Tabary
Algorithmica
2021
Abstractdoi logorg logo
Scaffolding is the final step in assembling Next Generation Sequencing data, in which pre-assembled contiguous regions (”contigs”) are oriented and ordered using information that links them (for example, mapping of paired-end reads). As the genome of some species is highly repetitive, we allow placing some contigs multiple times, thereby generalizing established computational models for this problem. We study the subsequent problems induced by the translation of solutions of the model back to actual sequences, proposing models and analyzing the complexity of the resulting computational problems. We find both polynomial-time and NP-hard special cases like planarity or bounded degree. Finally, we propose two polynomial-time approximation algorithms according to cut/weight score.
[6]-Efficient assembly consensus algorithms for divergent contig sets
Annie Chateau, Tom Davot and Manuel Laffond
Computational Biology and Chemistry
2021
Abstractdoi logorg logo
Assembly is a fundamental task in genome sequencing, and many assemblers have been made available in the last decade. Because of the wide range of possible choices, it can be hard to determine which tool or parameter to use for a specific genome sequencing project. In this paper, we propose a consensus approach that takes the best parts of several contigs datasets produced by different methods, and combines them into a better assembly. This amounts to orienting and ordering sets of contigs, which can be viewed as an optimization problem where the aim is to find an alignment of two fragmented strings that maximizes an arbitrary scoring function between matched characters. In this work, we investigate the computational complexity of this problem. We first show that it is NP-hard, even in an alphabet with only two symbols and with all scores being either 0 or 1. On the positive side, we propose an efficient, quadratic time algorithm that achieves approximation factor 3.

2019

[7]-A general decomposition theory for the 1-2-3 Conjecture and locally irregular decompositions
Olivier Baudon, Julien Bensmail, Tom Davot, Hervé Hocquard, Jakub Przybylo, Mohammed Senhaji, Éric Sopena and Mariusz Wozniak
Discrete Mathematics & Theoretical Computer Science
2019
Abstractdoi logorg logo
How can one distinguish the adjacent vertices of a graph through an edge-weighting? In the last decades, this question has been attracting increasing attention, which resulted in the active field of distinguishing labellings. One of its most popular problems is the one where neighbours must be distinguishable via their incident sums of weights. An edge-weighting verifying this is said neighbour-sum-distinguishing. The popularity of this notion arises from two reasons. A first one is that designing a neighbour-sum-distinguishing edge-weighting showed up to be equivalent to turning a simple graph into a locally irregular (i.e., without neighbours with the same degree) multigraph by adding parallel edges, which is motivated by the concept of irregularity in graphs. Another source of popularity is probably the influence of the famous 1-2-3 Conjecture, which claims that such weightings with weights in 1,2,3 exist for graphs with no isolated edge. The 1-2-3 Conjecture has recently been investigated from a decompositional angle, via so-called locally irregular decompositions, which are edge-partitions into locally irregular subgraphs. Through several recent studies, it was shown that this concept is quite related to the 1-2-3 Conjecture. However, the full connexion between all those concepts was not clear. In this work, we propose an approach that generalizes all concepts above, involving coloured weights and sums. As a consequence, we get another interpretation of several existing results related to the 1-2-3 Conjecture. We also come up with new related conjectures, to which we give some support.

Conference

2023

[8]-On the enumeration of non-dominated spanning trees with imprecise weights
Tom Davot and Sébastien Destercke and David Savourey
ECSQARU 2023, Arras (France)
2023
Abstractdoi logorg logo
Many works within robust combinatorial optimisation consider interval-valued costs or constraints. While most of these works focus on finding unique solutions such as minimax ones, a few consider the problem of characterising a set of non-dominated optimal solutions. This paper is situated within this line of work, and consider the problem of exactly enumerating the set of non-dominated spanning trees under interval-valued costs. We show in particular that each tree in this set can be obtained through a polynomial procedure, and provide an efficient algorithm to achieve the enumeration.
[9]-Degreewidth: a New Parameter for Solving Problems on Tournaments
Tom Davot, Lucas Isenmann and Sanjukta Roy and Jocelyn Thiebaut
WG, Fribourg (Switzerland)
2023
Abstractdoi logorg logo
In the paper, we define a new parameter for tournaments called degreewidth which can be seen as a measure of how far is the tournament from being acyclic. The degreewidth of a tournament T denoted by \Delta(T) is the minimum value k for which we can find an ordering \langle v_1, \dots, v_n \rangle of the vertices of T such that every vertex is incident to at most k backward arcs (ie an arc (v_i,v_j) such that j

2022

[10]-K-Partitioning with Imprecise Probabilistic Edges
Tom Davot, Sébastien Destercke and David Savourey
SMPS, Valladolid (Spain)
2022
Abstractdoi logorg logo
Partitioning a set of elements into disjoint subsets is a common problem in unsupervised learning (clustering) as well as in networks (e.g., social, ecological) where one wants to find heterogeneous subgroups such that the elements within each subgroup are homogeneous. In this paper, we are concerned with the case where we imprecisely know the probability that two elements should belong to the same partition, and where we want to search the set of most probable partitions. We study the corresponding algorithmic problem on graphs, showing that it is difficult, and propose heuristic procedures that we test on data sets.

2021

[11]-Complexity and Approximation Results on the shared transportation problem
Tom Davot, Rodolphe Giroudeau and Jean-Claude König
COCOA, Tianjin (China)
2021
Abstractdoi logorg logo
In our modern societies, a certain number of people do not own a car, by choice or by obligation. For some trips, there is no or few alternatives to the car. One way to make these trips possible for these people is to be transported by others who have already planned their trips. We propose to model this problem using as path-finding problem in a list edge-colored graph. This problem is a generalization of the s-t-path problem, studied by Böhmová et al. We study two optimization functions: minimizing the number of color changes and minimizing the number of colors. We study the complexity and the approximation of this problem. We show the existence of polynomial cases. We show that this problem is NP-complete and hard to approximate, even in restricted cases. Finally, we provide an approximation algorithm.
[12]-On the approximation hardness of geodetic set and its variants
Tom Davot, Lucas Isenmann and Jocelyn Thiebaut:
COCOON, Taina (Taiwan)
2021
Abstractdoi logorg logo
Given a graph, a geodetic set (resp. edge geodetic set) is a subset of its vertices such that every vertex (resp. edge) of the graph is on a shortest path between two vertices of the subset. A strong geodetic set is a subset S of vertices and a choice of a shortest path for every pair of vertices of S such that every vertex is on one of these shortest paths. The geodetic number (resp. edge geodetic number) of a graph is the minimum size of a geodetic set (resp. edge geodetic set) and the strong geodetic number is the minimum size of a strong geodetic set. We first prove that, given a subset of vertices, it is NP-hard to determine whether it is a strong geodesic set. Therefore, it seems natural to study the problem of maximizing the number of covered vertices by a choice of a shortest path for every pair of a provided subset of vertices. We provide a tight 2-approximation algorithm to solve this problem. Then, we show that there is no 781/780-polynomial-time approximation algorithm for edge geodetic number and strong geodetic number on subcubic bipartite graphs with arbitrarily high girth. We also prove that geodetic number and edge geodetic number are both LOG-APX-hard, even on subcubic bipartite graphs with arbitrarily high girth. Finally, we disprove a conjecture of Iršiš and Konvalinka by proving that the strong geodetic number can be computed in polynomial time in complete multipartite graphs.

2020

[13]-Linearizing Genomes: Exact Methods and Local Search
Tom Davot, Annie Chateau, Rodolphe Giroudeau and Mathias Weller
SOFSEM, Limassol (Cyprus)
2020
Abstractdoi logorg logo
In this article, we address the problem of genome linearization from the perspective of Polynomial Local Search, a complexity class related to finding local optima. We prove that the linearization problem, with a neighborhood structure, the neighbor slide, is PLS-complete. On the positive side, we develop two exact methods, one using tree decompositions with an efficient dynamic programming, the other using an integer linear programming. Finally, we compare them on real instances.

2019

[14]-New Polynomial-Time Algorithm Around the Scaffolding Problem
Tom Davot, Annie Chateau, Rodolphe Giroudeau and Mathias Weller
ALCOB, Berkely (USA)
2019
Abstractdoi logorg logo
We describe in this paper an approximation algorithm for the scaffolding problem, which is part of genome inference in bioinformatics. The aim of the problem is to find a maximum weighted collection of disjoint alternating cycles and paths covering a particular graph called scaffold graph. The problem is known to be NP-complete, and we describe further result concerning a special class of graphs aiming to be close to real instances. The described algorithm is the first polynomial-time approximation algorithm designed for this problem on non-complete graphs.

2018

[15]-New Results About the Linearization of Scaffolds Sharing Repeated Contigs
Dorine Tabary, Tom Davot, Mathias Weller, Annie Chateau and Rodolphe Giroudeau
COCOA, Atlanta (USA)
2018
Abstractdoi logorg logo
Solutions to genome scaffolding problems can be represented as paths and cycles in a “solution graph”. However, when working with repetitions, such solution graphs may contain branchings and, thus, they may not be uniquely convertible into sequences. Having introduced various ways of extracting the unique parts of such solutions, we extend previously known NP-hardness results to the case that the solution graph is planar, bipartite, and subcubic, and show that there is no PTAS in this case.
[16]-On the Hardness of Approximating Linearization of Scaffolds Sharing Repeated Contigs
Tom Davot, Annie Chateau, Rodolphe Giroudeau and Mathias Weller:
RECOMB-CG, Sherbrook (Canada)
2018
Abstractdoi logorg logo
Solutions to genome scaffolding problems can be represented as paths and cycles in a “solution graph”. However, when working with repetitions, such solution graph may contain branchings and they may not be uniquely convertible into sequences. Having introduced, in a previous work, various ways of extracting the unique parts of such solutions, we extend previously known NP-hardness results to the case that the solution graph is planar, bipartite, and subcubic, and show the APX-completeness in this case. We also provide some practical tests.