LPAR23:Papers with Abstracts

Papers
Abstract. We shed new light on the Literal Block Distance (LBD) and glue-based heuristics used in current SAT solvers. For this, we first introduce the concept of stickiness: given a run of a CDCL SAT solver, for each pair of literals we define, by a real value between 0 and 1, how sticky they are, basically, how frequently they are set at the same decision level.
By means of a careful and detailed experimental setup and analysis, we confirm the following quite surprising fact: given a SAT instance, when running different CDCL SAT solvers on it, no matter their settings or random seeds, the stickiness relation between literals is always very similar, in a precisely defined sense.
We analyze how quickly stickiness stabilizes in a run (quite quickly), and show that it is stable even under different encodings of cardinality constraints. We then describe how and why these solid new insights lead to heuristics refinements for SAT (and extensions, such as SMT) and improved information sharing in parallel solvers.
Abstract. In this paper, we study the parameter synthesis problem for probabilistic hyperproper- ties. A probabilistic hyperproperty stipulates quantitative dependencies among a set of executions. In particular, we solve the following problem: given a probabilistic hyperprop- erty ψ and discrete-time Markov chain D with parametric transition probabilities, compute regions of parameter configurations that instantiate D to satisfy ψ, and regions that lead to violation. We address this problem for a fragment of the temporal logic HyperPCTL that allows expressing quantitative reachability relation among a set of computation trees. We illustrate the application of our technique in the areas of differential privacy, probabilistic nonintereference, and probabilistic conformance.
Abstract. Logic-based approaches to AI have the advantage that their behaviour can in principle be explained by providing their users with proofs for the derived consequences. However, if such proofs get very large, then it may be hard to understand a consequence even if the individual derivation steps are easy to comprehend. This motivates our interest in finding small proofs for Description Logic (DL) entailments. Instead of concentrating on a specific DL and proof calculus for this DL, we introduce a general framework in which proofs are represented as labeled, directed hypergraphs, where each hyperedge corresponds to a single sound derivation step. On the theoretical side, we investigate the complexity of deciding whether a certain consequence has a proof of size at most n along the following orthogonal dimensions: (i) the underlying proof system is polynomial or exponential; (ii) proofs may or may not reuse already derived consequences; and (iii) the number n is represented in unary or binary. We have determined the exact worst-case complexity of this decision problem for all but one of the possible combinations of these options. On the practical side, we have developed and implemented an approach for generating proofs for expressive DLs based on a non-standard reasoning task called forgetting. We have evaluated this approach on a set of realistic ontologies and compared the obtained proofs with proofs generated by the DL reasoner ELK, finding that forgetting-based proofs are often better w.r.t. different measures of proof complexity.
Abstract. We introduce a Curry–Howard correspondence for a large class of intermediate logics characterized by intuitionistic proofs with non-nested applications of rules for classical disjunctive tautologies (1-depth intermediate proofs). The resulting calculus, we call it λ∥, is a strongly normalizing parallel extension of the simply typed λ-calculus. Although simple, the λ∥ reduction rules can model arbitrary process network topologies, and encode interesting parallel programs ranging from numeric computation to algorithms on graphs.
Abstract. We propose a number of powerful dynamic-epistemic logics for multi-agent information sharing and acts of publicly or privately accessing other agents’ information databases. The static base of our logics is obtained by adding to standard epistemic logic comparative epistemic assertions for groups or individuals, as well as a common distributed knowledge operator (that combines features of both common knowledge and distributed knowledge). On the dynamic side, we introduce actions by which epistemic superiority can be acquired: “sharing all one knows” (by e.g. giving access to one’s information database to all or some of the other agents), as well as more complex informational events, such as hacking. We completely axiomatize several such logics and prove their decidability.
Abstract. Given an unsatisfiable Boolean Formula F in CNF, i.e., a set of clauses, one is often interested in identifying Maximal Satisfiable Subsets (MSSes) of F or, equivalently, the complements of MSSes called Minimal Correction Subsets (MCSes). Since MSSes (MC- Ses) find applications in many domains, e.g. diagnosis, ontologies debugging, or axiom pinpointing, several MSS enumeration algorithms have been proposed. Unfortunately, finding even a single MSS is often very hard since it naturally subsumes repeatedly solving the satisfiability problem. Moreover, there can be up to exponentially many MSSes, thus their complete enumeration is often practically intractable. Therefore, the algorithms tend to identify as many MSSes as possible within a given time limit. In this work, we present a novel MSS enumeration algorithm called RIME. Compared to existing algorithms, RIME is much more frugal in the number of performed satisfiability checks which we witness via an experimental comparison. Moreover, RIME is several times faster than existing tools.
Abstract. We present a system that utilizes machine learning for tactic proof search in the Coq Proof Assistant. In a similar vein as the TacticToe project for HOL4, our system predicts appropriate tactics and finds proofs in the form of tactic scripts. To do this, it learns from previous tactic scripts and how they are applied to proof states. The performance of the system is evaluated on the Coq Standard Library. Currently, our predictor can identify the correct tactic to be applied to a proof state 23.4% of the time. Our proof searcher can fully automatically prove 39.3% of the lemmas. When combined with the CoqHammer system, the two systems together prove 56.7% of the library’s lemmas.
Abstract. This paper presents a novel algorithm for automatically learning recursive shape pred- icates from memory graphs, so as to formally describe the pointer-based data structures contained in a program. These predicates are expressed in separation logic and can be used, e.g., to construct efficient secure wrappers that validate the shape of data structures exchanged between trust boundaries at runtime. Our approach first decomposes memory graph(s) into sub-graphs, each of which exhibits a single data structure, and generates candidate shape predicates of increasing complexity, which are expressed as rule sets in Prolog. Under separation logic semantics, a meta-interpreter then performs a systematic search for a subset of rules that form a shape predicate that non-trivially and concisely captures the data structure. Our algorithm is implemented in the prototype tool ShaPE and evaluated on examples from the real-world and the literature. It is shown that our approach indeed learns concise predicates for many standard data structures and their implementation variations, and thus alleviates software engineers from what has been a time-consuming manual task.
Abstract. Mathematical induction is a fundamental tool in computer science and mathematics. Henkin [12] initiated the study of formalization of mathematical induction restricted to the setting when the base case B is set to singleton set containing 0 and a unary generating function S. The usage of mathematical induction often involves wider set of base cases and k−ary generating functions with different structural restrictions. While subsequent studies have shown several Induction Models to be equivalent, there does not exist precise logical characterization of reduction and equivalence among different Induction Models. In this paper, we generalize the definition of Induction Model and demonstrate existence and construction of S for given B and vice versa. We then provide a formal characterization of the reduction among different Induction Models that can allow proofs in one Induction Models to be expressed as proofs in another Induction Models. The notion of reduction allows us to capture equivalence among Induction Models.
Abstract. The entailment between separation logic formulæ with inductive predicates, also known as sym- bolic heaps, has been shown to be decidable for a large class of inductive definitions [7]. Recently, a 2-EXPTIME algorithm was proposed [10, 14] and an EXPTIME-hard bound was established in [8]; however no precise lower bound is known. In this paper, we show that deciding entailment between predicate atoms is 2-EXPTIME-hard. The proof is based on a reduction from the membership problem for exponential-space bounded alternating Turing machines [5].
Abstract. Based on our formal framework for CDCL (conflict-driven clause learning) using the proof assistant Isabelle/HOL, we verify an extension of CDCL computing cost-minimal models called OCDCL. It is based on branch and bound and computes models of minimal cost with respect to total valuations. The verification starts by developing a framework for CDCL with branch and bound, called CDCLBnB, which is then instantiated to get OCDCL. We then apply our formalization to three different applications. Firstly, through the dual rail encoding, we reduce the search for cost-optimal models with respect to partial valuations to searching for total cost-optimal models, as derived by OCDCL. Secondly, we instantiate OCDCL to solve MAX-SAT, and, thirdly, CDCLBnB to compute a set of covering models. A large part of the original CDCL verification framework was reused without changes to reduce the complexity of the new formalization. To the best of our knowledge, this is the first rigorous formalization of CDCL with branch and bound and its application to an optimizing CDCL calculus, and the first solution that computes cost-optimal models with respect to partial valuations.
Abstract. The paper describes a deep reinforcement learning framework based on self-supervised learning within the proof assistant HOL4. A close interaction between the machine learning modules and the HOL4 library is achieved by the choice of tree neural networks (TNNs) as machine learning models and the internal use of HOL4 terms to represent tree structures of TNNs. Recursive improvement is possible when a task is expressed as a search problem. In this case, a Monte Carlo Tree Search (MCTS) algorithm guided by a TNN can be used to explore the search space and produce better examples for training the next TNN. As an illustration, term synthesis tasks on combinators and Diophantine equations are specified and learned. We achieve a success rate of 65% on combinator synthesis problems outperforming state-of-the-art ATPs run with their best general set of strategies. We set a precedent for statistically guided synthesis of Diophantine equations by solving 78.5% of the generated test problems.
Abstract. NACRE, for Nogood And Clause Reasoning Engine, is a constraint solver written in C++. It is based on a modular architecture designed to work with generic constraints while implementing several state-of-the-art search methods and heuristics. Interestingly, its data structures have been carefully designed to play around nogoods and clauses, making it suit- able for implementing learning strategies. NACRE was submitted to the CSP MiniTrack of the 2018 and 2019 XCSP3 [8] competitions where it took the first place. This paper gives a general description of NACRE as a framework. We present its kernel, the available search algorithms, and the default settings (notably, used for XCSP3 competitions), which makes NACRE efficient in practice when used as a black-box solver.
Abstract. Deep neural networks (DNNs) are revolutionizing the way complex systems are de- signed, developed and maintained. As part of the life cycle of DNN-based systems, there is often a need to modify a DNN in subtle ways that affect certain aspects of its behav- ior, while leaving other aspects of its behavior unchanged (e.g., if a bug is discovered and needs to be fixed, without altering other functionality). Unfortunately, retraining a DNN is often difficult and expensive, and may produce a new DNN that is quite different from the original. We leverage recent advances in DNN verification and propose a technique for modifying a DNN according to certain requirements, in a way that is provably minimal, does not require any retraining, and is thus less likely to affect other aspects of the DNN’s behavior. Using a proof-of-concept implementation, we demonstrate the usefulness and potential of our approach in addressing two real-world needs: (i) measuring the resilience of DNN watermarking schemes; and (ii) bug repair in already-trained DNNs.
Abstract. In the last years, several works were concerned with identifying classes of programs
where termination is decidable. We consider triangular weakly non-linear loops
(twn-loops) over a ring Z ≤ S ≤ R_A , where R_A is the set of all real algebraic
numbers. Essentially, the body of such a loop is a single assignment
(x_1, ..., x_d) ← (c_1 · x_1 + pol_1, ..., c_d · x_d + pol_d)
where each x_i is a variable, c_i ∈ S, and each pol_i is a (possibly non-linear)
polynomial over S and the variables x_{i+1}, ..., x_d. Recently, we showed that
termination of such loops is decidable for S = R_A and non-termination is
semi-decidable for S = Z and S = Q.

In this paper, we show that the halting problem is decidable for twn-loops over any
ring Z ≤ S ≤ R_A. In contrast to the termination problem, where termination on all
inputs is considered, the halting problem is concerned with termination on a given
input. This allows us to compute witnesses for non-termination.

Moreover, we present the first computability results on the runtime complexity of
such loops. More precisely, we show that for twn-loops over Z one can always
compute a polynomial f such that the length of all terminating runs is bounded
by f( || (x_1, ..., x_d) || ), where || · || denotes the 1-norm. As a corollary, we
obtain that the runtime of a terminating triangular linear loop over Z is
at most linear.
Abstract. We study light-weight techniques for preprocessing of WSkS formulae in an automata- based decision procedure as implemented, e.g., in Mona. The techniques we use are based on antiprenexing, i.e., pushing quantifiers deeper into a formula. Intuitively, this tries to alleviate the explosion in the size of the constructed automata by making it happen sooner on smaller automata (and have the automata minimization reduce the output). The formula transformations that we use to implement antiprenexing may, however, be applied in different ways and extent and, if used in an unsuitable way, may also cause an explosion in the size of the formula and the automata built while deciding it. Therefore, our approach uses informed rules that use an estimation of the cost of constructing automata for WSkS formulae. The estimation is based on a model learnt from runs of the decision algorithm on various formulae. An experimental evaluation of our technique shows that antiprenexing can significantly boost the performance of the base WSkS decision procedure, sometimes allowing one to decide formulae that could not be decided before.
Abstract. In biology, Boolean networks are conventionally used to represent and simulate gene regulatory networks. The attractors are the subject of special attention in analyzing the dynamics of a Boolean network. They correspond to stable states and stable cycles, which play a crucial role in biological systems. In this work, we study a new representation of the dynamics of Boolean networks that are based on a new semantics used in answer set programming (ASP). Our work is based on the enu- meration of all the attractors of asynchronous Boolean networks having interaction graphs which are circuits. We show that the used semantics allows to design a new approach for computing exhaustively both the stable cycles and the stable states of such networks. The enumeration of all the attractors and the distinction between both types of attractors is a significant step to better understand some critical aspects of biology. We applied and evaluated the proposed approach on randomly generated Boolean networks and the obtained results highlight the benefits of this approach, and match with some conjectured results in biology.
Abstract. The triguarded fragment of first-order logic is an extension of the guarded fragment in which quantification for subformulas with at most two free variables need not be guarded. Thus, it unifies two prominent decidable logics: the guarded fragment and the two-variable fragment. Its satisfiability problem is known to be undecidable in the presence of equality, but becomes decidable when equality is forbidden. We consider an extension of the tri- guarded fragment without equality by transitive relations, allowing them to be used only as guards. We show that the satisfiability problem for the obtained formalism is decidable and 2-ExpTime-complete, that is, it is of the same complexity as for the analogous exten- sion of the classical guarded fragment. In fact, in our satisfiability test we use a decision procedure for the latter as a subroutine. We also show how our approach, consisting in exploiting some existing results on guarded logics, can be used to reprove some known facts, as well as to derive some other new results on triguarded logics.
Abstract. Fault Tree Analysis (FTA) is a prominent technique in industrial and scientific risk assessment. Repairable Fault Trees (RFT) enhance the classical Fault Tree (FT) model by introducing the possibility to describe complex dependent repairs of system components. Usual frameworks for analyzing FTs such as BDD, SBDD, and Markov chains fail to assess the desired properties over RFT complex models, either because these become too large, or due to cyclic behaviour introduced by dependent repairs. Simulation is another way to carry out this kind of analysis. In this paper we review the RFT model with Repair Boxes as introduced by Daniele Codetta-Raiteri. We present compositional semantics for this model in terms of Input/Output Stochastic Automata, which allows for the modelling of events occurring according to general continuous distribution. Moreover, we prove that the semantics generates (weakly) deterministic models, hence suitable for discrete event simulation, and prominently for rare event simulation using the FIG tool.
Abstract. Satisfiability (SAT) solving has become an important technology in computer-aided mathematics with various successes in number and graph theory. In this paper we apply SAT solvers to color infinitely long strips in the plane with a given height and number of colors. The coloring is constrained as follows: two points that are exactly unit distance apart must be colored differently. To finitize the problem, we tile the strips and all points on a tile have the same color. We evaluated our approach using two different tile shapes: squares and hexagons. The visualization of bounded height strips using 3 to 6 colors reveal patterns that are similar to the best known lower bounds for infinite strips. Our method can be a useful tool for mathematicians to search for patterns that can be generalized to infinite strips and allowed us to increase the lower bound for the strip height with 5 colors to an improved height of 1.700084.
Abstract. Symbolic-heap separation logic with inductive definitions is a popular formalism for reasoning about heap-manipulating programs. The fragment SLIDbtw introduced by Iosif, Rogalewicz and Simacek, is one of the most expressive fragments with a decidable entailment problem. In recent work, we improved on the original decidability proof by providing a direct model-theoretic construction, obtaining a 2-Exptime upper bound. In this paper, we investigate separation logics built on top of the inductive definitions from SLIDbtw, i.e., logics that feature the standard Boolean and separation-logic operators. We give an almost tight delineation between decidability and undecidabilty. We establish the decidability of the satisfiability problem (in 2-Exptime) of a separation logic with conjunction, disjunction, separating conjunction and guarded forms of negation, magic wand, and septraction. We show that any further generalization leads to undecidabilty (under mild assumptions).
Abstract. In this work we develop a new learning-based method for selecting facts (premises) when proving new goals over large formal libraries. Unlike previous methods that choose sets of facts independently of each other by their rank, the new method uses the notion of state that is updated each time a choice of a fact is made. Our stateful architecture is based on recurrent neural networks which have been recently very successful in stateful tasks such as language translation. The new method is combined with data augmentation techniques, evaluated in several ways on a standard large-theory benchmark and compared to state-of-the-art premise approach based on gradient boosted trees. It is shown to perform significantly better and to solve many new problems.
Abstract. Inprocessing techniques have become one of the most promising advancements in SAT solving over the last decade. Some inprocessing techniques modify a propositional formula in non model-perserving ways. These operations are very problematic when Craig inter- polants must be extracted: existing methods take resolution proofs as an input, but these inferences require stronger proof systems; state-of-the-art solvers generate DRAT proofs. We present the first method to transform DRAT proofs into resolution-like proofs by elim- inating satisfiability-preserving RAT inferences. This solves the problem of extracting interpolants from DRAT proofs.
Abstract. Controlling access to knowledge plays a crucial role in many multi-agent systems. In- deed, it is related to different central aspects in interactions among agents such as privacy, security, and cooperation. In this paper, we propose a framework for dealing with access to knowledge that is based on the inference process in classical propositional logic: an agent has access to every piece of knowledge that can be derived from the available knowledge using the classical inference process. We first introduce a basic problem in which an agent has to hide pieces of knowledge, and we show that this problem can be solved through the computation of maximal consistent subsets. In the same way, we also propose a coun- terpart of the previous problem in which an agent has to share pieces of knowledge, and we show that this problem can be solved through the computation of minimal inconsis- tent subsets. Then, we propose a generalization of the previous problem where an agent has to share pieces of knowledge and hide at the same time others. In this context, we introduce several concepts that allow capturing interesting aspects. Finally, we propose a weight-based approach by associating integers with the pieces of knowledge that have to be shared or hidden.
Abstract. Motivated by Gromov’s subgroup conjecture (GSC), a fundamental open conjecture in the area of geometric group theory, we tackle the problem of the existence of partic- ular types of subgroups—arising from so-called periodic apartments—for a specific set of hyperbolic groups with respect to which GSC is currently open. This problem is equiv- alent to determining whether specific types of graphs with a non-trivial combination of properties exist. The existence of periodic apartments allows for ruling the groups out as some of the remaining potential counterexamples to GSC. Our approach combines both automated reasoning techniques—in particular, Boolean satisfiability (SAT) solving—with problem-specific orderly generation. Compared to earlier attempts to tackle the problem through computational means, our approach scales noticeably better, and allows for both confirming results from a previous computational treatment for smaller parameter values as well as ruling out further groups out as potential counterexamples to GSC.
Abstract. Globalization of integrated circuits manufacturing has led to increased security con- cerns, notably theft of intellectual property. In response, logic locking techniques have been developed for protecting designs, but many of these techniques have been shown to be vulnerable to SAT-based attacks. In this paper, we explore the use of Boolean sensi- tivity to analyze these locked circuits. We show that in typical circuits there is an inverse relationship between input width and sensitivity. We then demonstrate the utility of this relationship for deobfuscating circuits locked with a class of “provably secure” logic lock- ing techniques. We conclude with an example of how to resist this attack, although the resistance is shown to be highly circuit dependent.
Abstract. Isabelle/HOL augments classical higher-order logic with ad-hoc overloading of constant definitions— that is, one constant may have several definitions for non-overlapping types. In this paper, we present a mechanised proof that HOL with ad-hoc overloading is consistent. All our results have been formalised in the HOL4 theorem prover.
Abstract. Kleene Algebra and variants thereof have been successfully used in verification of se- quential programs. The leap to concurrent programs offers many challenges, both in terms of devising the right foundations to study concurrent variants of Kleene Algebra but also in finding the right models to enable effective verification of relevant programs. In this talk, we will review existing and ongoing work on concurrent Kleene Algebra with a focus on a variant called partially observable concurrent Kleene algebra (POCKA). POCKA offers an algebraic framework to reason about concurrent programs with control structures, such as conditionals and loops. We will show how a previously developed technique for com- pleteness of Kleene Algebra can be lifted to prove that POCKA is a sound and complete axiomatization of a model of partial observations. We illustrate the use of the framework in the analysis of sequential consistency, i.e., whether programs behave as if memory accesses taking place were interleaved and executed sequentially.
The work described in this invited talk is based on [1, 2, 3], and it is joint with a won- derful group of people: Paul Brunet, Simon Docherty, Tobias Kapp ́e, Jurriaan Rot, Jana Wagemaker, and Fabio Zanasi.