Student Colloquium 2013

Gert-Jan van Dijk is director at Target Holding. He will talk about big data, and how Target Holding uses it for Cultural Heritage, Dike Safety, Entertainment and Smart Grids.

Graph bundling is a technique used to improve the quality of visualizations of very large graphs. The problem with visualizing very large graphs by simply drawing straight lines between the nodes is that many lines intersect or overlap each other, resulting in clutter that decreases the readability of such graphs. Many methods have been developed to reduce this clutter. They all aim to group spatially close, related edges of the graph into curved, dense bundles, thereby reducing clutter and making the overall graph structure easier to see. The different algorithms seem to produce similar results when used on similar datasets; however, they differ in computational speed, simplicity of implementation, and the quality of the drawings they output.

Our research covers the following four graph bundling algorithms: hierarchical edge bundles, force-directed edge bundling, kernel density estimation[3] and winding roads[4]. We will compare and discuss them from the perspective of desirable quality criteria such as the generation of uncluttered drawings, scalability to large graphs, computational speed and simplicity of implementation. In this paper we will describe the biggest and most evident differences between these four algorithms. We will also look at what the algorithms have in common.

Smart grids are dynamic, decentralized electrical grids in which customers are able not only to consume energy but also to produce it. The customer’s excess energy is returned to the smart grid, resulting in a situation where consumers and producers interact with each other. To facilitate these interactions, a smart grid architecture has to be in place. A common solution for supporting these interactions is a service-oriented architecture.

In our research, two service-oriented smart grid architectures are analyzed and compared to uncover similarities and differences between them. The two architectures are the Integration and Energy Management system supported by the NOBEL project and the smart grid architecture supported by the SmartE project.

Recent research indicates that one of the challenges of the smart grid is to secure privacy-sensitive data, and that only limited research has been done on this topic. Our research will identify privacy-sensitive data and investigate strategies and techniques that can be used to increase the information security of this data. These strategies and techniques will be evaluated based on their usefulness and applicability in service-oriented smart grid architectures.

2D inpainting is the automated filling in of missing image regions in such a way that the result looks natural. The concept can be applied to restore damaged images, or to remove unwanted objects (e.g. tourists in nature photography). A famous application is the content-aware filling tool that was introduced in Adobe Photoshop CS5. The ‘missing’ region is usually defined by a human operator, but the method can also be used for advanced automated noise removal. Various techniques using highly different approaches have been developed to perform inpainting, the most prevalent of which are based on:

Geometric smoothness: in order to preserve structure, isophotes (lines of equal brightness) that arrive at the boundary of the missing region should be extrapolated throughout the region as smoothly as possible.
Texture synthesis: in many images, information is stored in multiple places (e.g. waves in a sea, clouds in the sky). Such information is called texture, and if it borders the missing region, it can be used to synthesize a natural-looking texture in the missing region.
Sparse representations: this relatively recent approach uses a dictionary of basis images, constructed from a database. We will not treat it in our research.

In our research we will focus on geometric smoothness and present and compare some of its algorithms. Our comparison will be based on domain of application and visual results, simplicity and ease of implementation, and computational performance. Our results will provide a quick overview of the field and point out the limitations of each technique. Finally, we will take a brief look at methods that combine geometric smoothness with texture synthesis.

Since most articles provide images to demonstrate their algorithm and report computational performance, we will use the results as reported in the literature for our comparison.

Many processes today generate extreme amounts of data, sometimes on petabyte scale. Exploration of that data often involves extraction and visualization of specific features. For example, in neuroscience, electron microscopy (EM) volumes of brain tissue are produced by physically cutting very thin sections of about 25–50 nm, and imaging each section at 35 nm pixel resolution. Even sub-millimeter tissue blocks imaged at such resolutions comprise terabytes of raw data, and neuroscientists target much larger volumes of several petabytes.

In the proposed paper we will discuss and evaluate the latest advancements in interactive visualization methods for extremely large data. We will first introduce well-established methods for visualization-driven, intelligent data streaming from larger but slower memory units. These methods make it possible to handle large scenes that would otherwise never fit in fast GPU memory. Then we will discuss a method for scaling data manipulation operations on large data using the MapReduce framework. Together, both approaches enable a researcher to interactively manipulate and visualize extremely large volumetric data.

Ubiquitous computing, and specifically the smart environment, ranging from the smart home to more complex settings such as the smart city, is an emerging paradigm for interactions between people and computers. Its aim is to break away from desktop computing and provide computational services to a user when and where required. Large numbers of heterogeneous computing devices provide new functionality, enhance user productivity, and ease everyday tasks. In home, office, and public spaces, ubiquitous computing will unobtrusively augment work or recreational activities with information technology that optimizes the environment for people’s needs. These smart environments demand efficient interoperation mechanisms among heterogeneous sensors, including the discovery and management of these devices. The diverse application domains also require interoperation among themselves. Middleware plays a key role in achieving this interoperation. From previous work it is clear that there are various kinds of devices and different integration methods.

The rationale behind this paper is to study the existence or lack of suitable middleware infrastructure for the development of applications and services in ubiquitous computing environments in smart homes. We do this by surveying the state of the art in the area and analyzing the requirements that ubiquitous computing places on middleware. We then discuss current middleware practices and compare them against these requirements. The result is that there are some good smart home infrastructures and practices that address some of the needs of ubiquitous computing. The area of networked embedded systems in smart homes is growing at a rapid pace, and several industrial and academic research activities are underway. This paper therefore helps to survey the existing work, and it shows the gaps and the requirements for a future vision for which no such work exists at the moment.

Computers with multi-core processors are common today and feature specialized co-processors (like GPUs) in addition to traditional CPUs. While most applications use only one type of processor, peak performance can only be achieved when all available processors are used at the same time. This gives rise to the field of task scheduling on heterogeneous systems, that is, scheduling tasks over different types of processors. Several implementations of heterogeneous scheduling systems exist today and heterogeneous scheduling is already used, but currently only in games and scientific applications; it is very likely that in the near future this will be expanded to all applications. The development of fast and efficient scheduling algorithms is thus very relevant.

The main focus of our research will be studying the various scheduling algorithms that can be used for heterogeneous task scheduling systems and comparing their performance. We will research which scheduling algorithms are available and how well they perform, and discuss how suitable they are for heterogeneous scheduling.

We will also carry out actual performance measurements, interpret the results and discuss the implications for common applications. The measurements will be run on everyday computers of the kind available to most people. We will run CUDA and OpenCL applications on the GPU and compare the results to CPU results to estimate the performance increase from rescheduling tasks from the CPU to the GPU. Lastly, we will discuss the results and provide our vision on how much heterogeneous computing will gain us in the future.

In a little over two decades, we have witnessed the emergence of information and communication technology (ICT) in private enterprises (PE). Recently, hybrid cloud technology has been considered a promising opportunity to enhance the use of ICT in PE operations in order to reduce the cost of the IT infrastructure and centralize control of technology management, while still keeping security risks to a minimum. In this paper we will research practices for implementing hybrid cloud computing technologies, focusing on the security level and the privacy policies that can be applied. The goal is to find a promising approach for private enterprises that aim to adopt a hybrid cloud model for their IT processes.

In web development, push messages are used to enable bidirectional communication. WebSockets are a fairly new way of delivering these push messages. However, many mobile phone browsers do not support this technique yet. There are several alternatives to WebSockets, like Comet and AJAX.

To find out what the best alternative for real-world applications is, our research will review and compare several push service techniques based on performance, ease of use and mobile phone browser support.

We will discuss and judge three push service techniques: WebSockets, Comet and AJAX. We expect to find the best push service technique for real world application in mobile web development.

A time series is a sequence of data points measured successively over time. Time series appear in various fields such as quantitative finance, medical observations and sensory systems. Many machine learning tasks posed on time series, like indexing, anomaly detection and classification, consider each data point in time as a dimension. This easily results in high-dimensional data with highly correlated features and high levels of noise. Therefore the raw representation does not allow these tasks to scale well in terms of storage and computational costs. A solution that has been proposed is the Symbolic Aggregate approXimation (SAX) representation, which aims to reduce dimensionality while still preserving the characteristics of the data.

Compared to many other representations, SAX distinguishes itself by using a symbolic representation instead of continuous values. Unlike non-symbolic representations, SAX can use the efficient algorithms and data structures developed for discrete data, e.g. suffix trees and hashing. SAX aims to assign symbols to ranges separated by breakpoints in such a way that each symbol occurs with equal probability. Crucial for this is the probability density function (PDF) assumed for the time series values, e.g. a normal or empirical distribution function. Lin et al. chose the standard normal distribution as their PDF, supported by empirical evidence. To validate their assumption they randomly selected normalized segments of different time series in the UCR data set, and showed that on average the distribution of values resembles a standard normal distribution.
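The equiprobable breakpoints described above can be illustrated with a minimal sketch. It assumes the standard normal distribution as the PDF, as chosen by Lin et al., and a piecewise averaging step over equal-width segments before symbols are assigned. The function names `sax_breakpoints` and `sax_word`, the four-letter alphabet and the example series are illustrative assumptions and are not taken from the paper.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_breakpoints(alphabet_size):
    """Breakpoints that split the standard normal into equiprobable ranges."""
    normal = NormalDist()  # mean 0, standard deviation 1
    return [normal.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax_word(series, segments, alphabet="abcd"):
    """Z-normalize, average over equal-width segments, map averages to symbols.

    Assumptions of this sketch: len(series) is divisible by `segments`,
    and the series is not constant (its standard deviation is non-zero).
    """
    mu, sigma = mean(series), pstdev(series)
    normalized = [(x - mu) / sigma for x in series]
    width = len(normalized) // segments
    averages = [mean(normalized[i * width:(i + 1) * width]) for i in range(segments)]
    breakpoints = sax_breakpoints(len(alphabet))
    # The number of breakpoints at or below an average is the index of its symbol.
    return "".join(alphabet[bisect_right(breakpoints, a)] for a in averages)


if __name__ == "__main__":
    print(sax_breakpoints(4))
    print(sax_word([1, 2, 3, 4, 5, 6, 7, 8], segments=4))
```

If the time series values are not normally distributed, these breakpoints no longer yield equiprobable symbols, which is the situation the verification in this paper is concerned with.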

In this paper we verify the assumption made by Lin et al. that for all mining tasks the time series values are normally distributed. We evaluate the distribution of time series values on the UCR time series database [1] for two specific mining tasks: indexing and anomaly detection. We expect that assuming a normal distribution of time series values is valid for indexing but invalid for anomaly detection.

In this paper we present a novel approach to interactive online gesture learning in real time for domestic service robots with the use of a single structured light 3D-scanner. The whole-body pose recognition approach uses the joint coordinate data from the processed depth images. The domestic service robot learns new gestures with the use of a layered control architecture. The continuous interaction between the human and the robot provides the start and ending conditions of a typical learned gesture. Together with an already implemented speech recognition system that provides vocal cues for labelling dynamic gestures, it forms a Multimodal Human-Robot Interaction system. The system empowers robots to continuously learn new interactions from their human companions.

The research covered in this paper is part of a continuous effort of the BORG team of the University of Groningen to develop an autonomous domestic service robot. The team participates in the @Home league of the international RoboCup benchmark with its main focus on domestic applications. One of the key research fields, which will be the focus of this paper, is that of Human-Robot Interaction. The performance of the existing speech-based Human-Robot Interaction system is improved by adding gestures as an additional modality. Labelling and verification of gestures is performed with the use of interactive behaviours. The different methods to perform online gesture learning with the underlying machine learning algorithms are discussed. In addition, the performance in terms of computational complexity of the machine learning algorithm is assessed. Ultimately, the improvement of the overall system with a number of different interaction behaviours is determined by laboratory experiments. In addition, the performance of the system is benchmarked at several RoboCup contests. The benchmarking of the system at the contests is used to gather quantitative and qualitative results in a real-life domestic environment. The conclusions follow from the comparison of the state of the art with our novel approach in online gesture learning. The main criteria for assessment of the performance of the system are the computational complexity and the percentage of correctly classified learned gestures.

The extraction of curve skeletons of 3D shapes is a fundamental problem with many different applications. These include, for example, virtual navigation, animation and visualization improvement. Because there is such broad interest in this field, many different techniques have been developed over the years. Although skeleton extraction techniques have roughly the same purpose, their fundamental ideas can be very different. For example, they can work with either geometric or volumetric methods. Geometric methods work with polygon meshes or point sets, whereas volumetric methods take regularly partitioned voxelized representations or discretized field functions as input data.

First we study the existing literature to find out which criteria for skeletonization exist, such as robustness, reconstruction properties or connectivity. Then we compare these criteria by identifying which important properties they represent and classifying them into contradicting or similar properties. Based on this identification, a list of relevant criteria for comparing skeletonization methods can be created.

The next step is to study existing skeletonization techniques, how they work and what characteristics their results show. We also investigate their advantages and disadvantages, such as computational complexity or memory usage. Finally, we create a comparison of the skeletonization techniques based on our criteria. It should then be possible to distinguish between the different techniques in such a way that an appropriate technique can be identified on the basis of a set of criteria that are important for a specific application.

Speakers

Keynote

Gert-Jan van Dijk

Edge Bundling in Graphs

Jurgen Jans and Ralph Kiers

New Architecture of the Power Grid: The Smart Grid

Brian Setz and Ruurtjan Pul

Image Inpainting

Jelle Nauta and Sander Feringa

Interactive Rendering and Visualization of Extremely Large Data

Sardar Yumatov

Towards Ubiquitous Computing: a Middleware and Device Discovery Framework for Smart Home

Fatimah Alsaif

Heterogeneous Scheduling of GPU and CPU

Robert Witte and Christiaan Arnoldus

Security and Cloud Computing

Aurelian Buzdugan and Luu Tuan

Mobile Browser Communication

Erik Bakker and Mark Kloosterhuis

Road to improving Detection of Anomalies in a Time Series

Harm de Vries and Herbert Kruitbosch

Online Gesture Learning in Domestic Service Robots

Rutger Alders and Tim van Elteren

Curve Skeletonization

David Otterbein and Martien Scheepens