Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/146182
|Analysis and applications of orthogonal approaches to simplify Mixed Boolean-Arithmetic expressions
|Gàmez Montolio, Arnau
|Hernández Jiménez, Enric
mixed boolean-arithmetic expressions
|Universitat Oberta de Catalunya (UOC)
|A Mixed Boolean-Arithmetic (MBA) expression is composed of integer arithmetic operators and bitwise operators. MBA expressions can be leveraged to obfuscate the data-flow of code by iteratively applying rewrite rules and function identities that complicate (obfuscate) the initial expression while preserving its semantic behavior. This possibility is motivated by the fact that the combination of operators from these different fields do not interact well together: we have no rules (distributivity, factorization...) or general theory to deal with this mixing of operators. In this project, we study and explore approaches to MBA simplification which are orthogonal to current techniques relying on a combination of symbolic execution and program synthesis. The main idea is to be able to extract some information from the underlying mathematical structure of such expressions. Then, this information can be used either alone or in combination with other techniques to aid in the task of MBA simplification. The focus is set upon recent research literature that develops such ideas, mainly addressing the problem of providing a normalized representation of (a subset of) linear MBA expressions as linear combinations with respect to an arbitrary set of minimal operators and expressions that form a basis of an ad hoc vector space where such linear MBA expressions live. We analyze the contributions, flaws and limitations of recent research in this regard, and provide practical applications. In particular, we leverage a semantics preserving transformation which reduces the alternation of arithmetic and bitwise operators of an MBA expression in the context of program synthesis based code deobfuscation. This transformation is then used to aid in the problem of verifying the semantic correctness of a synthesized candidate expression, thus improving the soundness of such technique.
|Appears in Collections:
|Trabajos finales de carrera, trabajos de investigación, etc.
Files in This Item:
|Report of TFM