Archive for January, 2014

First semester in Master’s

Friday, January 24th, 2014

I haven’t written anything here for a while and that’s because my first semester in Master’s studies of Computer Science has been quite busy. Because it’s been so busy, I thought that I will write a little about what I’ve been doing. Also this post will be in English because most of my courses are in English and perhaps some non-Estonian-speaking people will also be interested.

Mat-inf teaduskond

Actually there would be a whole lot to talk about, but I’ll limit this to 3 courses that included projects. Because I did all the projects as browser apps then everyone can try them out and get a more visual idea about the topics. Overall I had 6 courses, each worth 6 ECTS (1 ECTS = 26h of work). So formally I did around 7.8 hours of work each day over the course of 4 months (including weekends). I’ll let readers do the conclusions about that themselves. As a side note I could mention that I’m playing Neverwinter MMO regularly and just bought Minecraft.

For the links below I recommend using Chrome, because it calculates faster and when your tab crashes, other tabs will keep running.

Advanced Algorithmics

Anyway. One of the most important courses in the Master’s curriculum is Advanced Algorithmics. It’s important because (at least for me) this is the basis of everything we do in CS. In Bachelor’s there is also a similar course called Algorithmics and Data Structures and that’s probably equally important. But the whole field is so wide that it will take several algorithmics courses to get the main idea about basics.
As for those who know something about CS, algorithmics isn’t just about programming. Programming (and the corresponding programming courses) are mainly just technical aspects on how to write down some logic or system. How to do loops or if-statements in some or other language, what are the variables etc. Of course there are some more complicated structures introduced in programming, but mainly it’s about how to write down your system for the computer.
Algorithmics on the other hand is about creating even more complicated and/or faster systems. For example you could learn how to program a class called Dog in programming. Dog will have a name, age, pedigree and so on. Let’s say that dog also knows about other dogs (s)he has interacted with, let’s call them dog’s friends. Now imagine a case that dogs can speak to each other and some dog called Muttley has some news he wants to share with his friends. To simulate the spread of this news one way would to do a breadth-first search on the dog friends graph starting from Muttley and upon visiting each other dog, we pass on the news to him/her. Breadth-first search is one of the basic methods described in algorithmics courses.

That said, the Advanced Algorithmics course was probably one of the most time-consuming courses (besides Robotex of course) I’ve taken. Each week there were around 6 exercises that took about 2 days to complete. Furthermore we had an essay, the project and of course the exam.

Apart from the project (that I will get to in a moment) I also did couple of homeworks in JavaScript (others were in PHP and not too fun to show to others).

First homework I’ll present here is to show how a binary search tree (without any balancing) does versus a treap in regard to balance with different inputs.

In the app you can set the input to be totally random, batched random (1-10 random, then 11-20 random etc) or ordered. Result is that when we randomize the input (either by shuffling the input or using a treap that will assign random priorities and keep a heap structure on top of the binary search tree) we can get a more balanced structure on average.
Why would we want a balanced structure? So that our search would be with time complexity log(n) of course.


Second homework is about dynamic programming and the Levenshtein distance of words.

It will illustrate a way to quickly search for similar words. Or perhaps it would be more correct to say that to compare the similarity of two words in our case. It is often so that people make mistakes when typing some words. We can take the user inputted word and compare it to some known words in a database. In case we don’t find an exact match, we could offer some nearly similar words. Levenshtein distance is a metric that will allow to specify different weights for a transposition of two letters, adding, deleting or substituting one letter. If we know that people are more prone to type letters in mixed order in contrast to leaving some letters out altogether, then we could assign corresponding weights and offer better suggestions on what the incorrectly typed word might actually be.

Levenshtein distance

Now finally the Advanced Algorithmics project. Because the article I read for the essay was about Locality-sensitive hashing (LSH), we decided to do the project about the nearest neighbour search. That is what LSH is meant to do fast. Although the main idea about LSH is to overcome the curse of dimensionality – fact that the complexity of a search will go from log(n) back to n for most structures as the dimensionality of our universe increases. On the other hand we wanted to make the result very accessible to future students and other people who are interested some ideas behind the nearest neighbour search, and this was most efficiently achieved with a 2D case. So we ended up doing the nearest neighbour search in 2D with visualizations of 4 different algorithms besides the brute-force linear search.

Hopefully this app is intuitive. One can use the Sel Node tool to do searches in the 2D space depicted on the main canvas. Implemented structures are: Quadtree, K-Dimensional tree, Random projection tree and LSH.
If you want to read more about what is done there, we have several pages on the site available and also a poster by Kristel:

Locality-sensitive hashing

There are more interesting projects done in that course by others:

Computer Graphics

We also had a Computer Graphics course that was given by Konstantin Tretjakov after some 6 years of pause. In my opinion this is too one of the most important courses ever. If algorithmics is a subject on its own, fields of mathematics are subjects on their own and so on, computer graphics is a field that binds it all together and does it in a fun way. For example let’s take quicksort: in algorithmics it is thought how to it’s supposed to work, in discrete math it is thought how to analyze it, but at the end of the day it’s still just some abstract idea… until ju actually implement it to do something cool. And that something cool isn’t taking a randomly generated array of a million floating point values and sorting it. That’s like constructing a crossword puzzle for yourself and then trying to enjoy solving it. I’m not saying that it’s not necessary, of course it will give you an estimate on how well the algorithm works and it’s good to compare it to other sorting algorithms… But what can you say about it to your friends? How was your day?; Awesome, I just sorted a million random floating point values.

Computer graphics on the other hand is a field that will take those same random space partitioning algorithms from algorithmics; matrix operations, transformation matrices and vector operations from algebra and geometry courses; derivatives, trigonometry and arithmetics from calculus and puts it all together to create something visible and relatable for everybody. How was your day?; Check out this or this.

Maybe this is just may taste in CS, but I don’t wanna do something in a corner that most people wouldn’t understand. Of course there are some such things that have to be done, but overall goal should be to in the end create something that will directly benefit most people. That something might even be inspiring an artist with a shader-rendered mandlebulb set.

Anyway, I could go on forever how computer graphics is a totally useful and in my mind an essential subject for any CS student. There are some fellow students who have missing links between the math and application. I totally see that happening if all the result you get from your hard work is some numbers on a console. But ok.. let’s get to the things we’re really here for.

There was a lecture about sampling in our course. One of the things mentioned there was that if you sample a signal with a too wide sample step, you might get Moire artefacts in your data. On the lecture slides there was a picture about a 2D cosine function with increasing period and when the sample step got bigger, the artefacts started to show. This seemed quite implausible for me and so I decided to try it out.

On the left I have the cosine wave with increasing period in a 2D image (you could imagine that you’re looking down on a cosine wave). Center is the image spectrum and on the right there is the image created from the spectrum. Actually the right image is only to check that I didn’t too anything weird with the spectrum and the reverse frequency domain transform I borrowed from some Chinese guy worked. Most important is the first image, when we increase the sample step size, there will be quite astonishing artefacts emerging.

Moire artefacts

Now, for the project I wanted to do something fun (of course) and new for me, but it had to be web-based. So in the end we decided to do an audio visualization project. We’re streaming some songs from SoundCloud and wanted to create visualizations that change regarding to the spectrum.

I did the visualizations called Angels, Hedgehog and Lotus. The main idea behind all of them is to map the spectrum onto a 3D sphere and mess around with rotation and light positions according to the average values of the spectrum’s high and low ends. Quite a simple approach, but it was extremely fun to do as a first step ever in sound analyzis and visualization. Also I very much like the result. This is kind of how I visualize music in my mind.
Credit for the other visualizations that also include beat detection, go to Karl-Aksel Puulmann. Furthermore it was a very good choice for a project, because we each worked on our own visualizations and thus ran into very little conflicts. Also the scalability was manageable, at first we had many more ideas in mind, but decided to go one step at a time and even with a scaled down scope, the result was good enough.

Angels visualization Lotus visualization

Here are some other CG projects:

Scientific Computing

Third project that I’ll discuss here is from a Scientific Computing course. That was a very math-heavy course and many people (including me) felt that the mathematical background isn’t that good to get the most out of it. We solved problems that included differential equations, for example how is heat distributed in a room with a heat source. Or solved an optimization problem of finding an equilibrium distance between atoms in a molecule. The whole subject behind the course is extremely interesting, the methods of solving these kinds of problems are also cool, but.. yeah, I wished to get more understanding out of it.

For the final project I chose to model a wave equation in 2D space. This is probably the most unstable project posted so far, so be vary of it.

There were some topics given out for projects and the wave equation was one of them. I think that because of the lack of understanding it was good that the topics were immensely covered by the instructor and quite fixed. Scope was to model the wave using the conjugate gradient method, but it was mentioned in the instructions that for a more better result it might be good to model the wave using both implicit and explicit methods. One requirement was also to send some calculations to C (overall project was meant to be done in some scripting language, preferably in Python, but I decided on JavaScript – otherwise other people might not have a chance to check it out). For doing the calculations in C, I decided to try out Chrome’s Native Client (NaCl for short) technology that allows C code to be run in a sandbox on the browser. Unfortunately for me this approach turned out to be asynchronous and so I had to make a quite messy code with many callbacks. There were two things that were supposed to be done in C: calculating a nested sum and applying a Gauss-Seidel preconditioner in the conjugate gradient method. Calculating the nested sum in C turned out to be more slower than in JS, probably because transferring the data just takes that much time. But because applying the Gauss-Seidel preconditioner includes quite costly loops over large arrays, this resulted in some speed gain over JS.

After a lot of debugging with the instructor, we got this monster to calculate the wave equation correctly and I think the result is quite ok.

Wave equation simulation

Well that’s about the things I wanted to cover here. Have to say, it was a killer semester. Lots of homeworks, projects and new material. So I hope all of you forgive me for not being able to post some concert reviews in the mean time. If not sooner than the summer festivals will of course be here.

Until then, here is a song that I think fits perfectly for people who are studying something: