description: The Internet of Things (IoT) has already revolutionized our lives and strengthened our daily interaction with information that is sensed from our environment and presented in an informative way to act upon, or combined from multiple sources to understand a situation. The use cases are many; one that is constantly being researched is home automation. For this seminar topic, it is required to create a taxonomy of the sensors/actuators in use and the quantities they measure (e.g. temperature, luminous intensity, etc.). Additionally, further alternatives should be investigated for expressing the same measured quantity by combining different information.
Topic 02: Operational dynamics in Object Identification
description: Governments in many countries have introduced surveillance systems, for reasons such as: coordination of traffic on busy roads, surveillance cameras in usually crowded places, or for private purposes (e.g. banks, house surroundings, etc.). Although such installations in public places are promoted as harmless, functions such as object identification could foment classification and further tracking of the object. For this seminar topic, it is required to investigate different ways to dynamically track an object with cameras and other sensors, and to examine which functions can be run upon them (e.g. face/eye recognition, classification based on clothes, posture recognition, etc.).
description: Human tracking has always been a taboo, as it implies a violation of privacy. However, there are many use cases in which it is meant to be useful, such as: emergency health detection, tracking a possible threat in one's surroundings, etc. The topic requires identifying further use cases that foresee human tracking in their actions, classifying them, and describing the functionality of existing platforms, algorithms, etc.
description: In cloud computing, physical servers are usually overbooked by virtual machines, meaning that more virtual resources are sold than physical resources are available. This model improves the utilisation of physical servers, since the virtual machines don't use all of their virtual resources most of the time. Still, situations in which the combined demand exceeds the physical capacity are likely, and they harm the quality of experience of cloud customers. The question is how the overbooking of each resource (CPU, network, memory, disk) can be efficiently detected at runtime, in order to inform the cloud provider about this concern. A technical solution should be sketched for detecting resource bottlenecks.
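One conceivable detection approach could be sketched as follows: sample per-host utilisation of the four resources periodically and flag a resource as a bottleneck when it stays saturated for most of a sliding window. This is only a minimal illustration; the class name, thresholds, and window logic are invented here, not taken from any existing tool.

```python
# Hypothetical sketch: flag a host resource as overbooked when its
# utilisation stays above a threshold for most of a sliding window.
from collections import deque

class BottleneckDetector:
    def __init__(self, window=10, threshold=0.9, ratio=0.8):
        self.window = window        # number of recent samples to keep
        self.threshold = threshold  # utilisation considered "saturated"
        self.ratio = ratio          # fraction of saturated samples that raises an alarm
        self.samples = {r: deque(maxlen=window) for r in ("cpu", "net", "mem", "disk")}

    def record(self, cpu, net, mem, disk):
        """Record one utilisation sample (each value in [0, 1]) per resource."""
        for name, value in zip(("cpu", "net", "mem", "disk"), (cpu, net, mem, disk)):
            self.samples[name].append(value)

    def bottlenecks(self):
        """Return the resources that were saturated in most recent samples."""
        result = []
        for name, values in self.samples.items():
            if len(values) == self.window:
                saturated = sum(1 for v in values if v >= self.threshold)
                if saturated / self.window >= self.ratio:
                    result.append(name)
        return result
```

The sliding window avoids alarms on short utilisation spikes, which are normal even on healthy hosts.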
Topic 05: Differences between the distributed file systems CEPH-FS, GlusterFS and DFS
description: Distributed file systems act like local file systems (ext4, NTFS) towards an application, but store the elements to be persisted distributed across multiple servers. This makes larger data volumes and faster processing times achievable, since multiple servers operate in parallel. In addition, replication increases fault tolerance.
This seminar topic shall examine three of these distributed file systems in detail and compare them with each other. Comparison criteria can be the system architecture, existing performance measurements, or functional capabilities.
Topic 06: Scalable Monitoring Frameworks for Large Setups
description: Monitoring your infrastructure is a necessary task to detect
failures and to understand the performance of your hardware and
the requirements of your applications. This applies to IoT devices
as well as to cloud data centres. A great deal of monitoring software exists,
as well as comparisons between the tools. The questions to be answered in this
topic regarding these monitoring solutions are as follows:
How scalable are these systems? How do they scale?
How are huge numbers of metrics aggregated and analysed into something meaningful?
How can optimisation tools be connected to the system?
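The aggregation question above can be illustrated with a minimal sketch: raw per-second samples are rolled up into fixed time windows, each summarised by min/avg/max. The function and its parameters are invented for illustration and are not taken from any monitoring product.

```python
# Illustrative sketch: reduce large numbers of raw metric samples to a few
# meaningful values per fixed time window (min/avg/max per window).
def aggregate(samples, window):
    """samples: list of (timestamp, value) pairs; window: size in seconds."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(ts // window, []).append(value)
    return {
        bucket * window: {
            "min": min(vals),
            "avg": sum(vals) / len(vals),
            "max": max(vals),
        }
        for bucket, vals in buckets.items()
    }
```

Real monitoring systems perform this kind of downsampling hierarchically (e.g. per agent, then per region) so the central instance never sees every raw sample, which is one path to scalability.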
description: The evolution of cloud computing has also led to changes in the database landscape. While traditional relational databases were deployed in a monolithic setup on dedicated hardware, recent database systems such as NoSQL systems promise high performance, scalability and elasticity by running on commodity hardware in the cloud. Yet, NoSQL systems sacrifice consistency in order to achieve scalability and elasticity. An even more recent type of database system, namely NewSQL databases, promises to be as scalable and elastic as NoSQL databases while still offering strong consistency. One of the first NewSQL databases was [Google Spanner](http://dl.acm.org/citation.cfm?id=2491245), and open-source representatives such as VoltDB or CockroachDB followed. The term NewSQL and its capabilities are already discussed in the research community, but not yet classified. The objective of this topic is to provide a classification of the term NewSQL and its correlation to NoSQL. A starting point is provided by this paper.
description: The evolution of cloud computing and also of IoT has led to the continuous evolution of data centres with respect to resource sharing.
Besides hypervisor-based virtualisation, containers (e.g. Docker) have gained momentum in recent years.
Compared to virtual machines, containers promise to be more flexible and lightweight. As containers are mainly exploited to transform monolithic
systems into stateless micro-services, it is still not clear if and how containers can be exploited to manage stateful applications such as databases.
Within this topic, an overview of existing approaches and evaluations for containerised database systems should be provided.
A starting point could be this paper.
Topic 09: Resource Centric Database Evaluation in the Cloud
description: The evolution of cloud computing has also led to changes in the database landscape. While traditional relational databases were deployed in a monolithic setup on dedicated hardware, recent database systems such as NoSQL and NewSQL systems promise high performance, scalability and elasticity by running on commodity hardware in the cloud. Yet, cloud resources tend to be quite heterogeneous at the physical level, which might influence the efficiency of a database distributed in the cloud. The goal of this topic is to identify studies that address the deployment and evaluation of distributed databases in the cloud, with a focus on cloud resource selection. A starting point could
be this paper.
Topic 10: Storage solutions for time series sensor data
description: The evolution of IoT has led to a tremendous increase in sensor data, which needs to be persisted and aggregated for further processing in the back ends, which typically run in the cloud. As sensor data typically comprises a timestamp, the actual sensor value and optional metadata, new storage solutions have evolved that focus on exactly this kind of data. These storage solutions are commonly referred to as Time Series Databases (TSDBs), and common representatives are InfluxDB, Prometheus, OpenTSDB or Druid. Yet, the architectures of these databases differ significantly, and thorough evaluations of such TSDBs are rare. In the scope of this topic, an overview of TSDB storage solutions should be provided by identifying challenges and analysing existing TSDB storage solutions.
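The data model described above (timestamp, value, optional metadata) can be sketched in a few lines. The class name and API below are invented for illustration and do not correspond to any real TSDB, which would additionally handle compression, retention, sharding, and indexing by tags.

```python
# Minimal sketch of the time-series data model: each point carries a
# timestamp, a value, and optional metadata (tags); queries select a
# time range within one named series.
class TinyTSDB:
    def __init__(self):
        self.series = {}  # series name -> list of (timestamp, value, tags)

    def insert(self, name, timestamp, value, tags=None):
        self.series.setdefault(name, []).append((timestamp, value, tags or {}))
        self.series[name].sort(key=lambda p: p[0])  # keep points time-ordered

    def range_query(self, name, start, end):
        """Return all points of a series with start <= timestamp <= end."""
        return [p for p in self.series.get(name, []) if start <= p[0] <= end]
```

Even this toy shows why TSDBs differ from general-purpose databases: the primary access pattern is append-mostly writes plus time-range reads, which the storage layout can be optimised for.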
The rise of cloud computing requires a change in the way business applications are deployed to their infrastructure. The sheer number of possible configurations and the error-proneness of manual deployments necessitate deployment and management automation. A common approach to achieving this kind of automation is the use of model-driven approaches, where the user starts the deployment with a model describing the application, which is afterwards automatically deployed to the target (cloud) infrastructure and furthermore managed during runtime.
The Topology and Orchestration Specification for Cloud Applications (TOSCA) is an OASIS standard for describing cloud applications in a platform-independent way. Around TOSCA, an ecosystem has evolved offering tool support for managing applications described using TOSCA, e.g. OpenTOSCA, Cloudify or Alien4Cloud.
The target of this topic is to provide an overview of the concepts of the TOSCA modeling language and its tool support by analyzing the offered features, but also the tools' adherence to the TOSCA standard and cross-tool features like interoperability.
Cloud computing has emerged as the leading technology to provide on-demand computing services, which can be delivered as Software (SaaS), Infrastructure (IaaS) or Platforms (PaaS). To be able to continually improve this already wide-spread ecosystem, methods to evaluate new algorithms prior to real-world usage have to be found. As the usage of real-world testbeds is not only costly in terms of money and time, but also limits experiments to the scale of the available hardware, the usage of simulation environments becomes necessary. Using e.g. discrete event simulation, the time and resource usage for evaluation can be greatly reduced.
The task of this topic is to compare and evaluate existing simulation environments for cloud computing, like CloudSim. An overview of existing simulation software is given in the referenced literature. The result of this topic should evaluate the support of the different layers of cloud computing (physical layer to application layer) by each tool, and discuss the features of the tools based on those layers.
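The discrete event simulation technique mentioned above can be sketched in its most reduced form: events sit in a priority queue ordered by timestamp, and simulated time jumps directly from one event to the next instead of advancing in real time. The class and its API are invented for illustration; real simulators such as CloudSim build entity, resource, and network models on top of exactly this core loop.

```python
# Minimal discrete-event simulation core: pop events in timestamp order,
# advance the clock to each event's time, and run its callback. Callbacks
# may schedule further events.
import heapq

class Simulator:
    def __init__(self):
        self.now = 0.0
        self._queue = []  # heap of (time, sequence, callback)
        self._seq = 0     # tie-breaker so equal-time callbacks never get compared

    def schedule(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, callback = heapq.heappop(self._queue)
            callback(self)
```

Because no wall-clock time passes between events, hours of simulated data-centre activity can be evaluated in milliseconds, which is precisely the cost advantage over real testbeds.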
While cloud computing, due to its nearly unlimited on-demand resources, allows unhindered adaptation of one's application to the end users' needs, this only holds true if the application is designed and programmed in a way that lets it take advantage of those resources. One architectural style that achieves the required loose coupling between application components is Representational State Transfer (REST). To increase performance and reduce the amount of data transferred to the ever-increasing number of mobile devices, Facebook recently published GraphQL as open-source software. The task of this topic is to give a brief introduction to the GraphQL fundamentals and to compare it to traditional REST, focusing on the important aspects of loose coupling, implementation complexity and performance. Additionally, it should be researched whether real-world implementations for well-established programming languages exist.
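The performance argument above can be made concrete with a contrived example: a REST endpoint returns a fixed representation of a resource, while a GraphQL query lets the client name exactly the fields it needs. The schema, field names, and endpoint here are all invented for illustration.

```python
# Illustrative contrast (invented schema): REST returns a fixed
# representation; GraphQL lets the client select fields.
rest_response = {          # GET /users/42 returns everything, always
    "id": 42,
    "name": "Ada",
    "email": "ada@example.org",
    "friends": ["..."],    # possibly large, transferred even if unneeded
}

graphql_query = """
{
  user(id: 42) {
    name        # the client asks only for the fields it needs,
    friends {   # which reduces the amount of data transferred
      name
    }
  }
}
"""
```

For a mobile client on a slow link, the difference between always receiving the full representation and receiving only the selected fields is exactly the data-volume aspect the seminar topic asks to evaluate.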
To ease the management of large and distributed data centres, the term data centre operating system has arisen. Such systems provide resource management features for clusters of computers, similar to what a "classic" operating system does for a single computer. To this end, they provide a scheduling system in which the users' jobs/tasks are mapped to the available resources. The task of this topic is to research the principle behind the term data centre operating system and to provide an overview of existing implementations.
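The core mapping step described above can be sketched as a first-fit placement: each task's resource demand is placed on the first node with enough free capacity. Function and parameter names are invented; real data centre operating systems use far richer resource models, constraints, and fairness policies.

```python
# Hypothetical sketch of the scheduling step: map tasks to nodes first-fit
# by a single resource dimension (CPU cores).
def schedule(tasks, nodes):
    """tasks: list of (task_id, cpus); nodes: dict node -> free cpus.
    Returns a placement dict and a list of tasks that could not be placed."""
    placement, pending = {}, []
    for task_id, cpus in tasks:
        for node, free in nodes.items():
            if free >= cpus:
                placement[task_id] = node
                nodes[node] = free - cpus  # reserve the capacity
                break
        else:
            pending.append(task_id)        # no node has enough free capacity
    return placement, pending
```

Even this toy exposes the central design questions of such systems: which resource dimensions to model, how to order tasks and nodes, and what to do with tasks that cannot currently be placed.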
The following resources may provide a good starting point:
description: Energy efficiency plays one of the most important roles in every aspect of human life today. Cloud and High Performance Computing also try to be not only powerful but also efficient, and to optimise energy consumption, especially when the load is low. That is why there are trends towards using low-power devices in data centres. Is it possible? Is it economically reasonable? Which low-power technologies exist today, and could they be combined with traditional powerful solutions? All these questions can be answered by investigating this topic.
description: Schedulers in HPC can also be considered a communication mechanism between users and clusters: to run an application on the cluster, the user has to submit a job; the scheduler then puts this job into a queue, where the job waits until suitable resources are available. When that happens, the scheduler starts the job on a compute node and, once the job is finished, reports back to the user.
There are different scheduling systems available on the market. E.g. on the bwForCluster JUSTUS in Ulm, Moab is installed as the scheduler and Torque as the resource manager; on the bwUniCluster, Slurm plays the role of the resource manager.
The main idea of this topic is to gain new knowledge about scheduling systems in HPC, work out the difference between a scheduler and a resource manager, and analyse which scheduling systems are available on the market, which of them are used, and where.
As an extra point, it must be researched whether it is possible to use common HPC scheduling systems on low-power computers like the Raspberry Pi, and what kind of adaptations would be needed for that.
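The submit/queue/run/report lifecycle described above can be condensed into a toy FIFO scheduler. All names are invented for illustration; real systems such as Slurm or Moab add priorities, backfilling, and multi-dimensional resource management.

```python
# Sketch of the HPC job lifecycle: submit -> wait in queue -> run when
# resources suffice -> report completion. FIFO order, nodes as the only resource.
from collections import deque

class Scheduler:
    def __init__(self, free_nodes):
        self.free_nodes = free_nodes
        self.queue = deque()      # jobs waiting for resources
        self.running = {}         # job -> nodes allocated to it
        self.finished = []        # completion "reports" to the user

    def submit(self, job, nodes_needed):
        self.queue.append((job, nodes_needed))
        self._dispatch()

    def _dispatch(self):
        # Start queued jobs in FIFO order while the head job fits.
        while self.queue and self.queue[0][1] <= self.free_nodes:
            job, nodes = self.queue.popleft()
            self.free_nodes -= nodes
            self.running[job] = nodes

    def job_done(self, job):
        self.free_nodes += self.running.pop(job)
        self.finished.append(job)   # report back to the user
        self._dispatch()            # waiting jobs may start now
```

The split the topic asks about is visible even here: the queueing policy (`_dispatch` order) is the scheduler's concern, while tracking and allocating `free_nodes` is what a resource manager does.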
description: Words such as "blockchain" and "bitcoin" are very popular today; almost everyone is talking about them. Interestingly, the idea of the blockchain was born in the 1990s, yet only in 2008 did the first official realisation of a distributed blockchain appear. Since then, this technology has constantly been in rotation. Today, blockchain is also a direction that industry, business and financial organisations are looking at. One example of this is the meeting of the blockchain network this summer in Stuttgart, where this technology was discussed in the scope of programming, business models and IoT.
The idea of this topic is to gain more knowledge about the blockchain: what is hidden behind this technology, what are the main principles of its operation, and how can we use it today; is it only the mining of bitcoins, or something else too? The links below could be a useful starting point for research.
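One of the main principles hidden behind the technology can be shown in a few lines: each block stores the hash of its predecessor, so tampering with any block invalidates the rest of the chain. This sketch deliberately leaves out mining, consensus, and networking, and its function names are invented for illustration.

```python
# Minimal hash-chain sketch: a block commits to its predecessor's hash,
# so modifying any block breaks verification of the chain.
import hashlib
import json

def make_block(data, prev_hash):
    block = {"data": data, "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps({"data": data, "prev_hash": prev_hash}, sort_keys=True).encode()
    ).hexdigest()
    return block

def chain_valid(chain):
    for prev, block in zip(chain, chain[1:]):
        recomputed = make_block(block["data"], block["prev_hash"])
        if block["prev_hash"] != prev["hash"] or block["hash"] != recomputed["hash"]:
            return False
    return True
```

What makes a real blockchain distributed is everything this sketch omits: many parties hold copies of the chain and need a consensus mechanism (e.g. proof of work) to agree on which extension is valid.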
description: Software Defined Networking (SDN) is a trending concept based on separating the data plane from the control plane. In SDN systems such as large-scale data centre networks, an essential part of network management is the continuous monitoring of different performance metrics. One example is link utilization, which enables faster adaptation of forwarding rules to dynamic workloads. The statistical results from monitoring have to be accurate and timely. Current flow-based network monitoring tools produce considerable overhead, since the statistics for the overall network are generated at the central controller. To gain high accuracy and low overhead, some concepts have been developed:
The evaluation should consist of an analysis of the above options, including their pros and cons, as well as a conclusion determining the best solution for reduced overhead and high accuracy. The question that should be tackled after investigating the above options is: 'How can accuracy be increased and network overhead decreased when aggregating network statistics at the SDN controller or a monitoring device?'
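As a small illustration of the link-utilization metric mentioned above, utilization can be derived from periodic byte-counter samples of a switch port, as a flow-based monitor polling the controller might compute it. The function name and parameters are invented for illustration.

```python
# Illustrative sketch: derive link utilisation from two successive
# byte-counter samples of a switch port.
def utilisation(prev_bytes, curr_bytes, interval_s, capacity_bps):
    """Fraction of link capacity used between two counter samples."""
    bits = (curr_bytes - prev_bytes) * 8
    return bits / (interval_s * capacity_bps)
```

The monitoring trade-off the topic targets is visible here: a shorter polling interval gives timelier values but multiplies the number of counter requests the controller must issue across the whole network.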
description: Security is a big concern in current cloud data centre networks. Different kinds of attacks can be experienced, such as Distributed Denial of Service (DDoS) attacks, worm propagation, port scans, etc.
Flow-based and entropy-based anomaly detection have proved very effective in cloud network infrastructures. Also, many useful anomaly detection algorithms have been developed, such as statistical, machine-learning-based, and data-mining-based anomaly detection (1). On the other hand, research has been done on developing Intrusion Detection as a Service in the cloud infrastructure network, which provides security services to users as well as to network administrators.
This seminar topic requires an extensive analysis of different existing solutions for detecting anomalies in cloud data centre networks, which should lead to finding the most appropriate solution(s) with respect to present and future cloud network architectures.
description: In September 2017, the Java Development Kit was released in its 9th version. The goal of this topic is to get an overview of the new features of JDK 9 at both the library and the language (syntactic) level. Special focus shall be put on the new module system, Jigsaw.
description: State machine replication is a common approach to ensure the reliability of a (remote) service with no fail-over time in case of failures. It was proposed by Leslie Lamport in the 1970s and has been a topic of research ever since. Besides some use in specialised domains such as aviation, it nevertheless lacks wide-spread use due to its complex implementation and the very high overhead caused by the sequential execution of requests. This situation can be improved by applying deterministic scheduling to state machine implementations. This seminar thesis shall present the state machine concept and describe its shortcomings. It shall also present the idea of deterministic scheduling and how it solves some of these problems. Master students are expected to research, present, and compare two deterministic scheduling algorithms.
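The state machine concept can be reduced to one invariant: replicas that apply the same deterministic commands in the same order reach the same state. The sketch below illustrates only that invariant, with invented names; agreeing on the order across machines (consensus) is the hard part and is assumed here, and the strictly sequential application of commands is exactly the overhead mentioned above.

```python
# Minimal state-machine-replication sketch: identical replicas, identical
# ordered command log, hence identical resulting state.
class Replica:
    def __init__(self):
        self.state = 0

    def apply(self, command, arg):
        # Commands must be deterministic: same input -> same state change.
        if command == "add":
            self.state += arg
        elif command == "mul":
            self.state *= arg

def replicate(log, replicas):
    """Apply an agreed-upon, ordered command log to every replica, in order."""
    for command, arg in log:
        for replica in replicas:
            replica.apply(command, arg)
```

Deterministic scheduling attacks the sequential bottleneck: if a scheduler makes concurrent execution reproducible, replicas may run independent commands in parallel and still end up in identical states.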
description: Non-volatile RAM is flash-based memory that can be added to computers (and servers) in the form of slow, but non-volatile memory modules. These have the very same properties as other flash-based storage media such as flash-based SSDs and can also be used for running file systems.
The goal of this topic is to identify application domains for NVRAM and the hindrances current computer architectures pose to implementing it.
Topic 23: Algorithms for Application-aware Caching of File System Access
description: Caching of file system accesses is a standard technique in almost any operating system to boost, in particular, read access to the file system. This approach, however, is implemented in the operating system and is usually agnostic of the running application. This thesis shall perform research on existing application-aware caching algorithms for file system access.
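The contrast between agnostic and application-aware caching can be sketched with a toy LRU cache extended by an application hint that pins blocks the application knows it will reuse. The hint API and class name are invented for illustration; real application-aware schemes are considerably more varied (prefetching from access patterns, per-application partitions, etc.).

```python
# Sketch: LRU file-block cache with an application-supplied "keep" hint.
from collections import OrderedDict

class HintedLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # block -> data, kept in recency order
        self.pinned = set()         # blocks the application asked to keep

    def hint_keep(self, block):
        self.pinned.add(block)

    def access(self, block, read_from_disk):
        if block in self.cache:
            self.cache.move_to_end(block)       # mark as recently used
        else:
            self.cache[block] = read_from_disk(block)
            while len(self.cache) > self.capacity:
                # Evict the least-recently-used block that is not pinned.
                victim = next(b for b in self.cache if b not in self.pinned)
                del self.cache[victim]
        return self.cache[block]
```

A plain OS page cache corresponds to this class with `pinned` always empty; the hint is the minimal form of application knowledge that pure LRU cannot exploit.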
Topic 24: A Comparison of Lifecycle Handling of Orchestration Tools
description: In the last decade, the software engineering world has seen a boost of orchestration tools and platforms (Docker, Ansible, Puppet), but also containers (Docker, Singularity) and others (Heat, Murano). This thesis shall provide an overview of existing tools, and in particular of the language formats and lifecycle support these environments provide. Master students should additionally propose an abstraction layer for them.
description: Resilience for IoT architectures and infrastructures is a rapidly emerging research field that encompasses the programming of devices and other parts of the infrastructure, as well as the design of the hardware and the networking facilities. This thesis shall present a high-level overview of existing techniques and discuss two approaches in more detail.
description: Unikernels are special-purpose applications built to include only the minimal runtime required to provide their functionality. This approach offers several improvements regarding performance and security compared to traditional solutions. In this topic, you are going to research the basic concepts of unikernels, compare them to existing solutions, and discuss their viability for IoT and edge computing.
description: Software-defined networking (SDN) is becoming as important as cloud computing: physical resources in a data centre are shared with customers as smaller sets of virtual resources. In this context, SDN is responsible for providing virtual networks, virtual private routers, and virtual IP addresses for virtual machines or containers; SDN in this area stays inside a single data centre. Google, on the other hand, brings SDN to its wide area network (WAN), which connects its data centres worldwide. This seminar topic is about the interplay between SDN in cloud data centres and SDN in WANs between regions. As a starting point, the literature about the Google SDN WAN should be considered. Further, the relationship of SDN from WAN to LAN should be discussed.