description: The Internet of Things (IoT) has already changed our lives and strengthened our daily interaction with information that is sensed from our environment and presented in an informative way, so that we can act on it or combine multiple sources of information to understand a situation. The use cases are many; one that is being researched constantly is home automation. For this seminar topic, the student is required to create a taxonomy of the sensors/actuators in use and the quantities they measure (e.g. temperature, luminous intensity). Additionally, alternatives for expressing the same measured quantity by combining different information should be investigated.
description: With the advancement of IoT, there is no doubt that more and more devices will be connected to the Internet in the near future. Cisco has estimated that the number of IoT devices will exceed 50 billion by 2020. Such a high number of devices needs IP addresses to be connected to the Internet. Along with IoT devices, the growth of cloud computing increases the demand for IP addresses. As IPv4 can provide only about 4.3 billion IP addresses and few of them are left to be used, the focus is now shifting towards adopting IPv6. However, the deployment of IPv6 in modern networks brings new challenges and complexities, such as compatibility issues with IPv4 and security. The 20-year-old protocol needs further improvement to cope with the requirements of current and future networks. Furthermore, security and privacy are big concerns in IoT, which raises the question of whether IPv6 is capable of dealing with such security vulnerabilities in the interconnected network.
The seminar topic includes an analysis of the challenges of using IPv6 in modern networks as well as its pros and cons with respect to IoT networks. The student should further check whether the current protocol design is good enough to deal with security and privacy threats in IoT networks and what specific enhancements the protocol needs. The question to be answered in this topic is whether the current status of IPv6 makes it suitable for deployment in next-generation networks.
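The address-space argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the topic description: the helper name `addresses_in` is hypothetical, and `2001:db8::/64` is simply a subnet taken from the IPv6 documentation prefix for illustration.

```python
import ipaddress

# Address width in bits for each protocol version.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS  # 4,294,967,296, i.e. about 4.3 billion
ipv6_total = 2 ** IPV6_BITS  # roughly 3.4e38

# Device estimate quoted in the text (more than 50 billion by 2020).
estimated_devices = 50_000_000_000


def addresses_in(prefix: str) -> int:
    """Return the number of addresses covered by a CIDR prefix."""
    return ipaddress.ip_network(prefix).num_addresses


print(f"IPv4 addresses: {ipv4_total:,}")
print(f"Estimated devices per IPv4 address: {estimated_devices / ipv4_total:.1f}")
print(f"Addresses in a single IPv6 /64 subnet: {addresses_in('2001:db8::/64'):,}")
```

Under these numbers, the estimated device count exceeds the whole IPv4 space by roughly an order of magnitude, while a single IPv6 /64 subnet already holds 2^64 addresses.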
description: A data centre operator aims to maximise its resource utilisation with a minimum number of servers, energy and resources in order to increase profit. Users aim to optimise at the Cloud application level by scaling up and down according to user demand. Another aspect of optimisation can be seen in cross-data-centre solutions, where Cloud applications are distributed across several vendors in order to avoid vendor lock-in and increase reliability.
The placement and migration problem of VMs in a Cloud data centre is formulated as a framework based on mathematical optimisation and objective function minimisation. VM allocation is considered an NP-complete problem, as it requires solving a combinatorial optimisation problem to achieve the targets. The key challenge for an optimisation algorithm is to deliver a good solution in a very short time. Furthermore, the configuration found must remain stable for long enough to avoid continuous migration of VMs across the data centre.
There is a variety of optimisation formulations and algorithms, such as convex optimisation, minimum k-cut/balanced minimum k-cut, the knapsack problem, and metaheuristic-based approaches such as Ant Colony Optimisation (ACO) and Simulated Annealing (SA). It is important to select an optimisation technique which can be used to validate any resource allocation algorithm in a data centre before its real deployment.
The topic requires an in-depth study of the optimisation algorithms/techniques which are currently being used to place VMs in the cloud infrastructure. The student should consider the following questions.
What makes the algorithms applicable to the VM placement scenario?
What are their pros and cons?
How close to optimal are the placements these algorithms produce?
The ultimate goal is to find a set of best-suited algorithms which can be used in a simulation environment to validate any VM placement algorithm.
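To make the question of "how close to optimal" concrete, the following minimal sketch treats VM placement as one-dimensional bin packing and compares a fast heuristic with an exhaustive search. It is an illustration under simplifying assumptions, not one of the algorithms named above: First-Fit Decreasing stands in for the heuristic side, the single resource dimension (e.g. memory in GB) and the function names are hypothetical, and every demand is assumed to fit on an empty host.

```python
from itertools import product


def first_fit_decreasing(demands, capacity):
    """Heuristic: place the largest VMs first, each on the first host with room."""
    free = []  # remaining free capacity of each opened host
    for demand in sorted(demands, reverse=True):
        for i, room in enumerate(free):
            if demand <= room:
                free[i] = room - demand
                break
        else:
            free.append(capacity - demand)  # no host had room: open a new one
    return len(free)


def optimal_hosts(demands, capacity):
    """Exhaustive search over all VM-to-host assignments (exponential time)."""
    n = len(demands)
    best = n  # one host per VM is always feasible
    for assignment in product(range(n), repeat=n):
        load = [0] * n
        for vm, host in enumerate(assignment):
            load[host] += demands[vm]
        if max(load) <= capacity:
            best = min(best, len(set(assignment)))
    return best


demands = [4, 4, 3, 3, 3, 3]  # hypothetical VM sizes
print(first_fit_decreasing(demands, capacity=10))  # heuristic: 3 hosts
print(optimal_hosts(demands, capacity=10))         # optimum: 2 hosts
```

The exhaustive search is only usable for a handful of VMs, which is exactly why heuristics and metaheuristics are of interest; on small instances like this one, however, it gives a reference value against which a placement algorithm can be validated.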
description: Over the last years, the adoption of Cloud infrastructure has not only increased in numbers; more and more resource-demanding and business-critical applications, such as High Performance Computing (HPC) simulations or data-intensive computing (DIC) applications, are also being moved from dedicated infrastructure towards shared Cloud-based solutions. These applications require large amounts of compute and storage resources and are often executed as distributed or even parallel applications involving a significant amount of low-latency communication among the hosted Virtual Machines. In order to handle the heavy network traffic generated by these resource-demanding applications and to ensure their scalability and performance in cloud environments, an intelligent and efficient resource allocation algorithm is necessary.
A common approach is to define a set of predefined flavors, each with a predefined number of virtual CPUs and a virtual memory capacity as static parameters. When a user has chosen a specific flavor, the deployment algorithm searches for the first fitting or a randomly selected host that meets the demand in terms of free memory and virtual CPUs below the maximum allowed overbooking factor. In some cases, storage capacity and type are also considered when selecting an appropriate host. Other parameters such as network load or usage pattern are commonly not considered as decision parameters, as they are assumed to be sufficiently addressed by over-provisioning bandwidth in the network. However, the network should be taken into account when designing a resource allocation algorithm for a cloud data centre, in order to minimise network traffic and to guarantee optimal usage of compute and network resources along with a reduction in energy and cost.
In recent years, research on network-aware VM placement and migration schemes has emerged. The seminar topic requires an intensive study of past and present algorithms and techniques for VM placement and migration in cloud data centres. The following questions need to be answered.
What network properties have been considered in those algorithms?
What are the placement types and constraints?
What are the objectives of those algorithms?
What performance metrics are used to evaluate them?
How effective are the algorithms?
The student should compare the placement and migration algorithms to find the most fitting algorithm for today’s large and highly scalable cloud data centres.
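The flavor-based first-fit deployment described above can be sketched as follows. This is a minimal illustration, not the scheduler of any particular cloud platform: the class and function names are hypothetical, and the sketch assumes that only virtual CPUs are overbooked while memory is not. Note that no network property enters the decision, which is the gap that network-aware algorithms aim to close.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Flavor:
    vcpus: int
    memory_mb: int


@dataclass
class Host:
    name: str
    cores: int
    memory_mb: int
    used_vcpus: int = 0
    used_memory_mb: int = 0

    def fits(self, flavor: Flavor, overbooking: float) -> bool:
        # Assumption: vCPUs may be overbooked up to the factor, memory may not.
        vcpu_limit = self.cores * overbooking
        return (self.used_vcpus + flavor.vcpus <= vcpu_limit
                and self.used_memory_mb + flavor.memory_mb <= self.memory_mb)


def first_fit(hosts: List[Host], flavor: Flavor,
              overbooking: float = 2.0) -> Optional[Host]:
    """Reserve the flavor on the first host that fits; return None if none does."""
    for host in hosts:
        if host.fits(flavor, overbooking):
            host.used_vcpus += flavor.vcpus
            host.used_memory_mb += flavor.memory_mb
            return host
    return None


hosts = [Host("h1", cores=4, memory_mb=8192), Host("h2", cores=4, memory_mb=8192)]
medium = Flavor(vcpus=4, memory_mb=4096)
for _ in range(3):
    chosen = first_fit(hosts, medium)
    print(chosen.name if chosen else "no capacity")
```

With an overbooking factor of 2.0, the first host accepts two such VMs before the third one spills over to the second host.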
description: What defines an IoT network? Which technologies are used? IoT is more than a bunch of small nodes added to an existing WLAN, but what are the differences?
Many standards and protocols such as Bluetooth, ZigBee (RF4CE), Z-Wave, LoRaWAN, 6LoWPAN and Thread are in use today. Sort them and give an overview of current wireless network solutions and their dedicated use cases.
description: For HPC clusters and Cloud applications, using containers is becoming standard practice. In this seminar, the use of containers for machine and deep learning shall be evaluated, including a performance comparison of current frameworks, e.g. TensorFlow and Caffe2, with and without container solutions.
Topic 07: Generative Adversarial Networks (GANs) for Time-Series Workload Data Generation
description: GANs currently offer the possibility of artificial data generation. Given a dataset of a cluster or server workload, the feasibility and possibility of creating realistic synthetic data using GANs shall be researched and evaluated.
description: With the advancement of Cloud computing, a new era has begun in which a significant number of applications of different kinds are being migrated to Cloud data centres. Some of those applications generate a heavy amount of low-latency communication among the hosted Virtual Machines. To determine the actual network resource demand of such applications and how they impact the performance of the network, it is essential to model their communication behaviour. A widely used method is to monitor and evaluate the network traffic in order to determine and forecast the real network resource utilisation of the applications. Network traffic models such as the Poisson distribution model, Markov and embedded Markov models, and the Pareto distribution process can be used to better understand both the network and the applications using it.
The seminar topic requires a comprehensive study of the well-known traffic models to identify their features, pros and cons with respect to their use in cloud data centre networks. The ultimate goal is to find a set of best-suited traffic models which represent the communication behaviour of the applications in such networks most accurately.
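As an illustration of how two of the traffic models mentioned above can be put side by side, the following sketch generates arrivals from exponential inter-arrival times (a Poisson process) and from Pareto inter-arrival times with the same mean, then computes the index of dispersion of the per-window arrival counts. The rate, the shape parameter `ALPHA`, the window count and all function names are assumptions chosen for the example, not values from the topic description.

```python
import random
from statistics import mean, pvariance


def window_counts(interarrival, windows, window_len=1.0):
    """Count arrivals per fixed-length window for a given inter-arrival sampler."""
    counts = [0] * windows
    horizon = windows * window_len
    t = interarrival()
    while t < horizon:
        counts[int(t // window_len)] += 1
        t += interarrival()
    return counts


def dispersion(counts):
    """Index of dispersion (variance / mean); close to 1 for Poisson arrivals."""
    return pvariance(counts) / mean(counts)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
RATE = 10.0              # assumed mean arrivals per window
ALPHA = 1.5              # assumed Pareto shape (heavy tail, finite mean)

# Poisson process: exponentially distributed inter-arrival times.
poisson = window_counts(lambda: rng.expovariate(RATE), windows=2000)

# Pareto inter-arrival times, scaled so that their mean is also 1 / RATE.
scale = (ALPHA - 1) / (ALPHA * RATE)
pareto = window_counts(lambda: scale * rng.paretovariate(ALPHA), windows=2000)

print(f"Poisson dispersion: {dispersion(poisson):.2f}")
print(f"Pareto dispersion:  {dispersion(pareto):.2f}")
```

An index of dispersion near 1 is the signature of Poisson traffic; comparing it with the value measured for a heavy-tailed model, or for a real trace, is one simple way to judge how well a model captures burstiness.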
description: In September 2017, the Java Development Kit (JDK) was released in its 9th version. The goal of this topic is to give an overview of the new features of JDK 9 at both the library and the language (syntactic) level. Special focus shall be put on the new module system, Jigsaw.
description: State machine replication is a common approach to ensure the reliability of a (remote) service with no fail-over time in case of failures. It was proposed by Leslie Lamport in the 1970s and has been a topic of research ever since. Besides some use in specialised domains such as aviation, however, it lacks widespread use due to its complex implementation and the very high overhead caused by the sequential execution of requests. This situation can be improved by applying deterministic scheduling to state machine implementations. This seminar thesis shall present the state machine concept and describe its shortcomings. It shall also present the idea of deterministic scheduling and how it solves some of these problems. For master students, it is expected that two deterministic scheduling algorithms are researched, presented, and compared.
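The core idea of state machine replication can be shown in a few lines: if every replica is a deterministic state machine and applies the same commands in the same order, all replicas end up in the same state. The sketch below is a toy illustration with hypothetical names (`KeyValueStateMachine`, `replay`); it leaves out everything that makes real systems hard, such as agreeing on the order of commands and handling failures.

```python
class KeyValueStateMachine:
    """Deterministic state machine: same commands in same order give same state."""

    def __init__(self):
        self.state = {}

    def apply(self, command):
        op, key, *args = command
        if op == "set":
            self.state[key] = args[0]
        elif op == "incr":
            self.state[key] = self.state.get(key, 0) + args[0]
        elif op == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")


def replay(log):
    """Build a fresh replica and apply the log strictly sequentially."""
    machine = KeyValueStateMachine()
    for command in log:
        machine.apply(command)
    return machine.state


log = [("set", "x", 1), ("incr", "x", 4), ("set", "y", 7), ("delete", "y")]
replicas = [replay(log) for _ in range(3)]
print(replicas)  # three identical states

# Applying the same commands in a different order may yield a different state.
print(replay([("set", "x", 1), ("incr", "x", 4)]))
print(replay([("incr", "x", 4), ("set", "x", 1)]))
```

The strictly sequential loop in `replay` is the source of the overhead mentioned above: requests cannot run concurrently unless the scheduling of concurrent execution is itself made deterministic.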
description: Non-volatile RAM is flash-based memory that can be added to computers (and servers) in the form of slow but non-volatile memory modules. These modules have the very same properties as other flash-based storage media such as flash-based SSDs and can also be used for running file systems.
The goal of this topic is to identify application domains for NVRAM and the hindrances that current computer architectures pose to its adoption.
Topic 12: Algorithms for Application-aware Caching of File System Access
description: Caching of file system access is a standard technique in almost every operating system, used in particular to speed up read access to the file system. This approach, however, is implemented in the operating system and is usually agnostic to the running application. This thesis shall survey existing algorithms for application-aware caching of file system access.
Topic 13: A Comparison of Lifecycle Handling of Orchestration Tools
description: In the last decade, the software engineering world has seen a surge of orchestration tools and platforms (Docker, Ansible, Puppet), but also containers (Docker, Singularity) and others (Heat, Murano). This thesis shall provide an overview of existing tools and, in particular, of the language formats and lifecycle support these environments provide. Master students should additionally propose an abstraction layer for them.
description: Resilience for IoT architectures and infrastructures is an emerging research field that covers the programming of devices and other parts of the infrastructure as well as the design of the hardware and the networking facilities. This thesis shall present a high-level overview of existing techniques and further discuss two approaches in more detail.
Topic 15: Comparative Analysis of Distributed File Systems
description Distributed file systems support spreading data over pools of hardware in a way that is mostly transparent to the user. This thesis shall compare different types of distributed file systems and the trade-offs taken in their implementation.
Topic 16: The consensus problem in distributed computing
description: Consensus algorithms have a long tradition in theoretical and also applied computing. This thesis shall introduce the general problem of distributed consensus and present two famous consensus algorithms.
description: The evolution of cloud computing has also led to changes in the database landscape. While traditional relational databases were deployed in a monolithic setup on dedicated hardware, more recent database systems such as NoSQL systems promise high performance, scalability and elasticity by running on commodity hardware in the cloud. Yet NoSQL systems sacrifice consistency in order to achieve scalability and elasticity. An even more recent type of database system, namely NewSQL databases, promises to be as scalable and elastic as NoSQL databases while still offering strong consistency. One of the first NewSQL databases was [Google Spanner](http://dl.acm.org/citation.cfm?id=2491245), and open-source representatives such as VoltDB or CockroachDB followed. The term NewSQL and its capabilities are already discussed in the research community, but not yet classified. The objective of this topic is to provide a classification of the term NewSQL and its relation to NoSQL. A starting point is provided by this paper.
Topic 18: Storage solutions for time series sensor data
description: The evolution of IoT has led to a tremendous increase in sensor data, which needs to be persisted and aggregated for further processing in the back ends, which typically run in the cloud. As sensor data typically comprises a timestamp, the actual sensor value and optional metadata, new storage solutions have evolved that focus on exactly this kind of data. These storage solutions are commonly referred to as Time Series Databases (TSDBs); common representatives are InfluxDB, Prometheus, OpenTSDB and Druid. Yet the architectures of these databases differ significantly, and thorough evaluations of such TSDBs are rare. In the scope of this topic, an overview of TSDB storage solutions should be provided by identifying challenges and analysing existing TSDB storage solutions.
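The data shape described above (timestamp, value, optional metadata) and the kind of aggregation a back end performs on it can be sketched as follows. This is a generic in-memory illustration with hypothetical names (`DataPoint`, `downsample`); it does not reflect the data model or API of any of the TSDBs named above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class DataPoint:
    timestamp: int                            # seconds since the Unix epoch
    value: float                              # the actual sensor reading
    tags: dict = field(default_factory=dict)  # optional metadata


def downsample(points, window_s):
    """Average the values per fixed time window; returns {window_start: mean}."""
    buckets = defaultdict(list)
    for p in points:
        window_start = p.timestamp - p.timestamp % window_s
        buckets[window_start].append(p.value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


readings = [
    DataPoint(0, 20.0, {"sensor": "t1", "unit": "celsius"}),
    DataPoint(30, 22.0, {"sensor": "t1", "unit": "celsius"}),
    DataPoint(60, 21.0, {"sensor": "t1", "unit": "celsius"}),
    DataPoint(90, 23.0, {"sensor": "t1", "unit": "celsius"}),
]
print(downsample(readings, window_s=60))  # {0: 21.0, 60: 22.0}
```

Real TSDBs differ mainly in how they store, index and compress such points at scale, which is where the architectural comparison asked for in this topic begins.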
The rise of cloud computing requires a change in the way business applications are deployed to their infrastructure. The sheer number of possible configurations and the error-proneness of manual deployments necessitate deployment and management automation. A common way to achieve this kind of automation is the use of model-driven approaches, where the user starts the deployment with a model describing their application, which is then automatically deployed to the target (cloud) infrastructure and managed during runtime.
The Topology and Orchestration Specification for Cloud Applications (TOSCA) is an OASIS standard for describing cloud applications in a platform-independent way. Around TOSCA, an ecosystem has evolved offering tool support for managing applications described using TOSCA, e.g. OpenTOSCA, Cloudify or Alien4Cloud.
The target of this topic is to provide an overview of the concepts of the TOSCA modeling language and its tool support by analyzing the offered features, but also their adherence to the TOSCA standard and cross-tool features like interoperability.
While Cloud Computing, due to its nearly unlimited on-demand resources, allows unhindered adaptation of one’s application to the end users’ needs, this only holds true if the application is designed and programmed in a way that it can take advantage of those resources. One architectural style that achieves the required loose coupling between application components is Representational State Transfer (REST). To increase performance and reduce the amount of data transferred for the ever-increasing number of mobile devices, Facebook recently published GraphQL as open-source software. The task of this topic is to give a brief introduction to the GraphQL fundamentals and to compare it to traditional REST, focusing on the important aspects of loose coupling, implementation complexity and performance. Additionally, it should be researched whether real-world implementations for well-established programming languages exist.
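The point about reducing the amount of data transferred can be illustrated with a toy comparison. The sketch below is not GraphQL and does not use any GraphQL library: `select` is a hypothetical helper that mimics client-driven field selection on a plain dictionary, and `USER` is invented sample data. A REST-style endpoint is modelled as returning the whole resource.

```python
import json

# Invented sample resource, as a REST endpoint might return it in full.
USER = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.org",
    "posts": [
        {"id": 10, "title": "Hello", "body": "A long article body the client does not need."},
        {"id": 11, "title": "Again", "body": "Another long body that is not displayed."},
    ],
}


def select(resource, selection):
    """Return only the fields named in `selection` (nested dicts select sub-fields)."""
    if isinstance(resource, list):
        return [select(item, selection) for item in resource]
    result = {}
    for field_name, sub in selection.items():
        value = resource[field_name]
        result[field_name] = value if sub is True else select(value, sub)
    return result


full = USER  # REST-style: the server decides the shape of the response
partial = select(USER, {"name": True, "posts": {"title": True}})  # client decides

print(partial)
print(len(json.dumps(full)), "bytes vs", len(json.dumps(partial)), "bytes")
```

The trade-off to examine in the seminar is what this flexibility costs on the server side in terms of implementation complexity and coupling.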
To ease the management of large and distributed data centres, the term data centre operating system has arisen. Such systems provide resource management features for clusters of computers, similar to what a "classic" operating system does for a single computer. To this end, they provide a scheduling system in which the users' jobs/tasks are mapped to the available resources. The task of this topic is to research the principles behind the term data centre operating system and to provide an overview of existing implementations.
The following resources may provide a good starting point: