PhD Thesis Defenses


PhD thesis defenses are a public affair and open to anyone who is interested. Attending them is a great way to get to know the work being done by your peers in the various research groups. On this page you will find a list of upcoming and past defense talks.


Please go here for electronic access to most of the doctoral dissertations from Saarbrücken Computer Science going back to about 1990.








High Dynamic Range Imaging: Problems of Video Exposure Bracketing, Luminance Calibration and Gloss Editing

(Advisor: Dr.-Ing. habil. Karol Myszkowski)

Fri, 02.06.2017, 10:00h, building E1 4, room 0.19


Two-dimensional, conventional images are gradually losing their hegemony, leaving room for novel formats. Among these, 8-bit images are giving way to high dynamic range (HDR) images, which improve the color gamut and the visibility of details in dark and bright areas, leading to a more immersive viewing experience. At the same time, light field scene representation is also gaining importance, further propelled by the recent reappearance of virtual reality. Light field data allows the camera position, aperture, or focal length to be changed in post-production. It facilitates object insertion and simplifies the visual effects workflow by integrating the 3D nature of visual effects with the 3D nature of light fields. Content generation is one of the stumbling blocks in these realms. Sensor limitations do not allow a wide dynamic range to be captured with a conventional camera. The “HDR mode” often found on mobile devices relies on a technique called image fusion and partially overcomes the limited range of the sensor. HDR video, however, remains a challenging problem. A solution for capturing HDR video on a conventional capturing device will be presented. Further, HDR content visualization often requires the input to be in absolute values, whether the target medium is an HDR or a standard/low dynamic range display. To this end, a calibration algorithm is proposed that can be applied to existing imagery and does not require any additional measurement hardware. Finally, as the use of multidimensional scene representations becomes more common, a key challenge is the ability to edit or modify the appearance of the objects in the light field. A multidimensional filtering approach is described in which the specular highlights are filtered in the spatial and angular domains to target a desired increase of the material roughness.



Interactive On-Skin Devices for Expressive Touch-based Interactions

(Advisor: Prof. Jürgen Steimle)

Thu, 01.06.2017, 10:00h, building E1 7, room 001


Skin has been proposed as a large, always-available, and easy-to-access input surface for mobile computing. However, it is fundamentally different from prior rigid devices: skin is elastic, highly curved, and provides tactile sensation. This thesis advances the understanding of skin as an input surface and contributes novel skin-worn devices and their interaction techniques.

We present the findings from an elicitation study on how and where people interact on their skin. The findings show that participants use various body locations for on-skin interaction. Moreover, they show that skin allows for expressive interaction using multi-touch input and skin-specific modalities.

We contribute three skin-worn device classes and their interaction techniques to enable expressive on-skin interactions: iSkin investigates multi-touch and pressure input on various body locations. SkinMarks supports touch, squeeze, and bend sensing with co-located visual output. The devices' conformality to skin enables interaction on highly challenging body locations. Finally, ExpressSkin investigates expressive interaction techniques using fluid combinations of high-resolution pressure, shear, and squeeze input. Taken together, this thesis contributes towards expressive on-skin interaction with multi-touch and skin-specific input modalities on various body locations.





Analyzing DNA Methylation Signatures of Cell Identity

(Advisor: Prof. Thomas Lengauer)

Wed, 31.05.2017, 16:00h, building E1 5, room 029


Although virtually all cells in an organism share the same genome, regulatory mechanisms give rise to hundreds of different, highly specialized cell types. These mechanisms are governed by epigenetic patterns, such as DNA methylation, which determine DNA packaging, spatial organization of the genome, interactions with regulatory enzymes as well as RNA expression and which ultimately reflect the state of each individual cell. In my dissertation, I have developed computational methods for the interpretation of epigenetic signatures inscribed in DNA methylation and implemented these methods in software packages for the comprehensive analysis of genome-wide datasets. Using these tools, we characterized epigenetic variability among mature blood cells and described DNA methylation dynamics during immune memory formation and during the reprogramming of leukemia cells to a pluripotent cell state. Furthermore, we profiled the methylomes of stem and progenitor cells of the human blood system and dissected changes during cell lineage commitment. We employed machine learning methods to derive statistical models that are capable of accurately inferring cell identity and to identify methylation signatures that describe the relatedness between cell types, from which we derived a data-driven reconstruction of human hematopoiesis. In summary, the methods and software tools developed in this work provide a framework for interpreting the epigenomic landscape spanned by DNA methylation and for understanding epigenetic regulation involved in cell differentiation and disease.


Maksim LAPIN

Image Classification with Limited Training Data and Class Ambiguity

(Advisor: Prof. Bernt Schiele)

Mon, 22.05.2017, 16:00h, building E1 4, room 024


Modern image classification methods are based on supervised learning algorithms that require labeled training data. However, only a limited amount of annotated data may be available in certain applications due to scarcity of the data itself or high costs associated with human annotation. Introduction of additional information and structural constraints can help improve the performance of a learning algorithm. In this talk, we study the framework of learning using privileged information and demonstrate its relation to learning with instance weights. We also consider multitask feature learning and develop an efficient dual optimization scheme that is particularly well suited to problems with high-dimensional image descriptors. Scaling annotation to a large number of image categories leads to the problem of class ambiguity, where a clear distinction between the classes is no longer possible. Many real-world images are naturally multilabel, yet the existing annotation might only contain a single label. In this talk, we propose and analyze a number of loss functions that allow for a certain tolerance in the top-k predictions of a learner. Our results indicate consistent improvements over the standard loss functions, which put more penalty on the first incorrect prediction than the proposed losses do.



Generating and Grounding of Natural Language Descriptions for Visual Data

(Advisor: Prof. Bernt Schiele)

Mon, 15.05.2017, 16:00h, building E1 4, room 024


Generating natural language descriptions for visual data links computer vision and computational linguistics. Being able to generate a concise and human-readable description of a video is a step towards visual understanding. At the same time, grounding natural language in visual data provides disambiguation for the linguistic concepts, necessary for many applications. This thesis focuses on both directions and tackles three specific problems.

First, we develop recognition approaches to understand video of complex cooking activities. We propose an approach to generate coherent multi-sentence descriptions for our videos. Furthermore, we tackle the new task of describing videos at a variable level of detail. Second, we present a large-scale dataset of movies and aligned professional descriptions. We propose an approach that learns from videos and sentences to describe movie clips, relying on robust recognition of visual semantic concepts. Third, we propose an approach to ground textual phrases in images with little or no localization supervision, which we further improve by introducing Multimodal Compact Bilinear Pooling for combining language and vision representations. Finally, we jointly address the task of describing videos and grounding the described people.

To summarize, this thesis advances the state-of-the-art in automatic video description and visual grounding and also contributes large datasets for studying the intersection of computer vision and computational linguistics.



Analysis and Improvement of the Visual Object Detection Pipeline

(Advisor: Prof. Bernt Schiele)

Mon, 02.05.2017, 16:00h, building E1 4, room 024


Visual object detection has seen substantial improvements in recent years due to the possibilities enabled by deep learning. In this thesis, we analyse and improve different aspects of the commonly used detection pipeline: (1) We analyse ten years of research on pedestrian detection and find that improvement of feature representations was the driving factor. (2) Motivated by this finding, we adapt an end-to-end learned detector architecture from general object detection to pedestrian detection. (3) A comparison between human performance and state-of-the-art pedestrian detectors shows that pedestrian detectors still have a long way to go before reaching human-level performance and diagnoses failure modes of several top-performing detectors. (4) We analyse detection proposals as a preprocessing step for object detectors. By examining the relationship between localisation of proposals and final object detection performance, we define and experimentally verify a metric that can be used as a proxy for detector performance. (5) We analyse a common postprocessing step in virtually all object detectors: non-maximum suppression (NMS). We present two learnable approaches that overcome the limitations of the most common approach to NMS. The introduced paradigm paves the way to true end-to-end learning of object detectors without any post-processing.





Cross-lingual Transfer of Semantic Role Labeling Models

(Advisor: Prof. Ivan Titov, now Amsterdam)

Mon, 24.04.2017, 14:00h, building C7 4, room 1.17


Semantic role labeling is an important step in natural language understanding, offering a formal representation of events and their participants as described in natural language, without requiring the event or the participants to be grounded. Extensive annotation efforts have enabled statistical models capable of accurately analyzing new text in several major languages. Unfortunately, manual annotation for this task is complex and requires training and calibration even for professional linguists, which makes the creation of manually annotated resources for new languages very costly. The process can be facilitated by leveraging existing resources for other languages using techniques such as cross-lingual transfer and annotation projection.

This work addresses the problem of improving semantic role labeling models or creating new ones using cross-lingual transfer methods. We investigate different approaches to adapt to the availability and nature of the existing target-language resources. Specifically, cross-lingual bootstrapping is considered for the case where some annotated data is available for the target language, but using an annotation scheme different from that of the source language. In the more common setup, where no annotated data is available for the target language, we investigate the use of direct model transfer, which requires no sentence-level parallel resources. Finally, for cases where the parallel resources are of limited size or poor quality, we propose a novel method, referred to as feature representation projection, combining the strengths of direct transfer and annotation projection.




Kristian SONS

XML3D: Interactive 3D Graphics for the Web

(Advisor: Prof. Philipp Slusallek)

Thu, 30.03.2017, 15:00h, building D3 4, room -1.63 (VisCenter)


The web provides the basis for worldwide distribution of digital information but has also established itself as a ubiquitous application platform. The idea of integrating interactive 3D graphics into the web has a long history, but eventually both fields developed largely independently of each other.

In this thesis we develop the XML3D architecture, our approach to integrating a declarative 3D scene description into existing web technologies without sacrificing required flexibility or tying the description to a specific rendering algorithm. XML3D extends HTML5 and leverages related web technologies including Cascading Style Sheets (CSS) and the Document Object Model (DOM). On top of this seamless integration, we present novel concepts for a lean abstract model that provides the essential functionality to describe 3D scenes, a more general description of dynamic effects based on declarative dataflow graphs, more general yet programmable material descriptions, assets which are still configurable during instantiation from a 3D scene, and a container format for efficient delivery of binary 3D content to the client.


Sourav DUTTA

Efficient knowledge management for named entities from text

(Advisor: Prof. Gerhard Weikum)

Thu, 09.03.2017, 16:00h, building E1 4, room 0.24


The evolution of search from keywords to entities has necessitated the efficient harvesting and management of entity-centric information for constructing knowledge bases catering to various applications such as semantic search, question answering, and information retrieval. The vast amounts of natural language texts available across diverse domains on the Web provide rich sources for discovering facts about named entities such as people, places, and organizations. A key challenge, in this regard, entails the need for precise identification and disambiguation of entities across documents for extraction of attributes/relations and their proper representation in knowledge bases. Additionally, the applicability of such repositories not only involves the quality and accuracy of the stored information, but also storage management and query processing efficiency. This dissertation aims to tackle the above problems by presenting efficient approaches for entity-centric knowledge acquisition from texts and its representation in knowledge repositories. This dissertation presents a robust approach for identifying text phrases pertaining to the same named entity across huge corpora, and their disambiguation to canonical entities present in a knowledge base, by using enriched semantic contexts and link validation encapsulated in a hierarchical clustering framework. This work further presents language and consistency features for classification models to compute the credibility of obtained textual facts, ensuring quality of the extracted information. Finally, an encoding algorithm, using frequent term detection and improved data locality, to represent entities for enhanced knowledge base storage and query performance is presented.





Populating knowledge bases with temporal information

(Advisor: Prof. Gerhard Weikum)

Tues, 28.02.2017, 15:00h, building E1 4, room 0.24


Recent progress in information extraction has enabled the automatic construction of large knowledge bases. Knowledge bases contain millions of entities (e.g. persons, organizations, events, etc.), their semantic classes, and facts about them. Knowledge bases have become a great asset for semantic search, entity linking, deep analytics, and question answering. However, a common limitation of current knowledge bases is the poor coverage of temporal knowledge. First of all, knowledge bases have so far focused on popular events and ignored long-tail events such as political scandals, local festivals, or protests. Secondly, they do not cover the textual phrases denoting events and temporal facts at all.

The goal of this dissertation, thus, is to automatically populate knowledge bases with this kind of temporal knowledge. The dissertation makes the following contributions to address the aforementioned limitations. The first contribution is a method for extracting events from news articles. The method reconciles the extracted events into canonicalized representations and organizes them into fine-grained semantic classes. The second contribution is a method for mining the textual phrases denoting the events and facts. The method infers the temporal scopes of these phrases and maps them to a knowledge base. Our experimental evaluations demonstrate that our methods yield high quality output compared to state-of-the-art approaches, and can indeed populate knowledge bases with temporal knowledge.



Provably Sound Semantics Stack for Multi-Core System Programming with Kernel Threads

(Advisor: Prof. Wolfgang Paul)

Fri, 24.02.2017, 15:15h, building E1 7, room 0.01


Operating systems and hypervisors (e.g., Microsoft Hyper-V) for multi-core processor architectures are usually implemented in high-level stack-based programming languages integrated with mechanisms for multi-threaded task execution as well as access to low-level hardware features. Guaranteeing the functional correctness of such systems is considered to be a challenge in the field of formal verification because it requires a sound concurrent computational model comprising programming semantics and steps of specific hardware components visible to the system programmer.

In this doctoral thesis we address the aforementioned issue and present a provably sound concurrent model of kernel threads executing C code mixed with assembly, and basic thread operations (i.e., creation, switch, exit, etc.), needed for thread management in OS and hypervisor kernels running on industrial-like multi-core machines. To justify the model, we establish a semantics stack, at the bottom of which the multi-core instruction set architecture, performing arbitrarily interleaved steps, executes the binary code of the guests/processes being virtualized and the compiled source code of the kernel linked with a library implementing the threads. After extending an existing general theory for concurrent system simulation and by utilising the order reduction of steps under certain safety conditions, we apply sequential compiler correctness to the concurrent mixed programming semantics, connect the adjacent layers of the model stack, show that the required properties transfer between them, and provide a paper-and-pencil proof of correctness for the kernel threads implementation with lock-protected operations and the efficient thread switch based on stack substitution.



Minimal Assumptions in Cryptography

(Advisor: Prof. Dominique Schröder, now Erlangen)

Thurs, 09.02.2017, 16:00h, building E9 1, room 0.01


Virtually all of modern cryptography relies on unproven assumptions. This is necessary, as the existence of cryptography would have wide-ranging implications. In particular, it would imply that P ≠ NP, which is not known to be true. Nevertheless, there clearly is a risk that the assumptions may be wrong. Therefore, an important field of research explores which assumptions are strictly necessary under different circumstances.

This thesis contributes to this field by establishing lower bounds on the minimal assumptions in three different areas of cryptography. First, we establish that assuming the existence of physically unclonable functions (PUFs), a specific kind of secure hardware, is not by itself sufficient to allow for secure two-party computation protocols without trusted setup. Specifically, we prove that unconditionally secure oblivious transfer can in general not be constructed from PUFs. Secondly, we establish a bound on the potential tightness of security proofs for Schnorr signatures. Essentially, no security proof based on virtually arbitrary non-interactive assumptions defined over an abstract group can be significantly tighter than the known forking-lemma-based proof. Thirdly, for very weak forms of program obfuscation, namely approximate indistinguishability obfuscation, we prove that they cannot exist with statistical security and that computational assumptions are therefore necessary. This result holds unless the polynomial hierarchy collapses or one-way functions do not exist.



Distributed Querying of Large Labeled Graphs

(Advisor: Prof. Gerhard Weikum)

Mon, 06.02.2017, 15:00h, building E1 5, room 0.29


"Labeled Graph", where vertices and edges are labeled, is an important adaptation of "graph" with many practical applications. An enormous research effort has been invested into the task of managing and querying graphs, yet many challenges are left unsolved. In this thesis, we advance the state-of-the-art for the following query models by proposing a distributed solution to process them in an efficient and scalable manner.

• Set Reachability: A generalization of the basic notion of graph reachability, set reachability deals with finding all reachable pairs for given source and target sets. We present a non-iterative distributed solution that takes only a single round of communication for any set reachability query.

• Basic Graph Patterns (BGP): BGP queries are a common mode of querying knowledge graphs, biological datasets, etc. We present a novel distributed architecture that relies on the concepts of asynchronous executions, join-ahead pruning, and a multi-threaded query processing framework to process BGP queries in an efficient and scalable manner.

• Generalized Graph Patterns (GGP): These queries combine the semantics of pattern matching (BGP) and navigational queries. We present a distributed solution with a bimodal indexing layout that individually supports efficient and scalable processing of BGP queries and navigational queries.

We propose a prototype distributed engine, coined “TriAD” (Triple Asynchronous and Distributed) that supports all the aforementioned query models. We also provide a detailed empirical evaluation of TriAD in comparison to several state-of-the-art systems over multiple real-world and synthetic datasets.





r-Symmetry for Triangle Meshes: Detection and Applications

(Advisor: Prof. Philipp Slusallek)

Tues, 31.01.2017, 12:15h, building D3 2, DFKI VisCenter


In this thesis, we attempt to give insights into how rigid partial symmetries can be efficiently computed and used in the context of inverse modeling of shape families, shape understanding, and compression.

We investigate a certain type of local similarities between geometric shapes. We analyze the surface of a shape and find all points that are contained inside identical, spherical neighborhoods of a radius r. This allows us to decompose surfaces into canonical sets of building blocks, which we call microtiles. We show that the microtiles of a given object can be used to describe a complete family of related shapes. Each of these shapes is locally similar to the original, but can have completely different global structure. This allows for using r-microtiling for inverse modeling of shape variations and we develop a method for shape decomposition into rigid, 3D manufacturable building blocks that can be used to physically assemble shape collections. Furthermore, we show how the geometric redundancies encoded in the microtiles can be used for triangle mesh compression.


Xiaokun WU

Structure-aware content creation – Detection, retargeting and deformation

(Advisor: Prof. Hans-Peter Seidel)

Fri, 20.01.2017, 14:15h, building E1 4, room 019


Nowadays, access to digital information has become ubiquitous, while three-dimensional visual representation is becoming indispensable to knowledge understanding and information retrieval. Three-dimensional digitization plays a natural role in bridging connections between the real and virtual world, which prompts a huge demand for massive three-dimensional digital content. But reducing the effort required for three-dimensional modeling has been a practical problem and a long-standing challenge in computer graphics and related fields.

In this talk, we propose several techniques for lightening up the content creation process, which have the common theme of being structure-aware, i.e. maintaining global relations among the parts of a shape. We are especially interested in formulating our algorithms such that they make use of symmetry structures, because their concise yet highly abstract principles are universally applicable to most regular patterns.

We introduce our work from three different aspects in this thesis. First, we characterized spaces of symmetry-preserving deformations, and developed a method to explore this space in real-time, which significantly simplified the generation of symmetry-preserving shape variants. Second, we empirically studied three-dimensional offset statistics, and developed a fully automatic retargeting application, which is based on verified sparsity. Finally, we made a step forward in solving the approximate three-dimensional partial symmetry detection problem, using a novel co-occurrence analysis method, which could serve as the foundation for high-level applications.




2016 archive 

2015 archive 

2014 archive

2013 archive

2012 archive

2011 archive

