rolisz consulting (https://rolisz.ro/)

Machine Learning stories: Misunderstood suggestions
https://rolisz.ro/2020/11/30/machine-learning-stories/ (Mon, 30 Nov 2020 17:00:59 GMT)

A couple of years ago I was working on a calendar application, on the machine learning team, to make it smarter. We had many great ideas, one of them being that once you indicated you wanted to meet with a group of people, the app would automatically suggest a time slot for the meeting.

We worked on it for several months. We couldn't just use simple hand-coded rules, because we wanted to do things like learn every user's working hours, which could vary based on many things. In the end, we implemented this feature using a combination of hand-coded rules (to avoid some bad edge cases) and machine learning. We did lots of testing, both automated and manual, within our team.

Once the UI was ready, we did some user testing, where the new prototype was put in front of real users, unrelated to our team, who were recorded while they tried to use it and then were asked questions about the product. When the reports came in, the whole team banged their heads against the desk: most users thought we were suggesting times when the meeting couldn't take place!

What happened? If you included many people, or even just one very busy person, there would be no empty slot that worked for everyone. So our algorithm would make three suggestions, noting for each one the person who might not be able to make the meeting.

In our own testing, it was obvious to us what was happening, so we didn't consider it a big problem. But users who didn't know the system found it confusing and kept going to the classic grid to manually find a slot for the meeting.

Lesson: machine learning algorithms are never perfect and every project needs to be prepared to deal with mistakes.

How will your machine learning project handle failures? How will you explain to the end users the decisions the algorithm made? If you need help answering these questions, let's talk.

My next steps
https://rolisz.ro/2020/11/29/my-next-steps/ (Sun, 29 Nov 2020 19:19:40 GMT)

Consulting is something I have dreamed of for a long time. I have done a little bit in the past, on the side, but now the time has come to pursue this full time.

I have used machine learning to solve a large variety of problems such as:

  • text recognition (OCR) from receipts
  • anomaly detection on monitoring data
  • understanding how people use calendar software
  • room booking recommendations
  • chatbots
  • real time surveillance video analysis
  • time series forecasting
  • personally identifiable information (PII) detection

So if you have a hairy machine learning problem and you need advice on how to move forward, I can help you find the best way.

If you are a software company that wants to start developing machine learning projects, I can provide training for your team so that they can develop these projects. I can also give presentations and explain to managers and executives how machine learning projects are developed, what can be done with it (it's not a silver bullet that will solve all known problems) and how ML projects are different from traditional software projects, both during development and in deployment.

If you are a company that wants to know if machine learning is the right solution for a problem you have, such as automating a process that currently is very labor intensive, I can help you make this decision and develop a strategy for making the transition towards automation.

Are you a company that is looking to acquire a machine learning solution and you want some independent appraisal of the cost, duration and feasibility of the project? I can help you with this as well.

So if you need a machine learning advisor, consultant or trainer, feel free to reach out to me.

Tailscale
https://rolisz.ro/2020/11/15/tailscale/ (Sun, 15 Nov 2020 20:18:28 GMT)

I want to share an awesome piece of technology that I started using recently: Tailscale. It's a sort of VPN, meaning it sets up a private network for you, but only between your devices, without a central server.

It does so without you having to open any ports on your router or configure firewalls. It's pretty much powered by JustWorks™️ technology. Each device you register gets an IP address of the form 100.x.y.z. Using this address, you can then connect to that device from anywhere in the world, as long as both devices are connected to the Internet, because Tailscale automatically performs NAT traversal.

My use case for Tailscale is to connect my various devices (desktop, laptop, NAS, Raspberry PI and Android phone) and be able to access them, regardless of where I am.
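To give an idea of what that looks like in practice, here is a minimal sketch using the Linux command line client (the IP address below is just a placeholder; the real one shows up in tailscale status and in the admin console):

# list your devices and their 100.x.y.z addresses
tailscale status

# then reach, say, the NAS over its Tailscale address from anywhere
ssh admin@100.101.102.103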

For my NAS I had set up a DynDNS system, but occasionally it would still glitch and I would lose connectivity. With Tailscale, I now have a connection pretty much always. I still keep the DNS address, because I have everything set up for it, but I know that debugging stuff will be much easier.

Similarly, the Raspberry Pi at my parents' place is behind a crappy router which sometimes doesn't do the port forwarding properly. Now I can log in via Tailscale and fix the issue.

There is a bit of overhead with using Tailscale. When I set up the initial backup to the RPi, I tried going through it, but the bandwidth was only 3 MB/s, while if I connected directly it would be 10 MB/s. Because borg encrypts the data, I don't need the additional security provided by Tailscale.

All in all: I can strongly recommend Tailscale. It's a great product if you have any sort of home servers. It's developed by some great guys, including Avery Pennarun, David Crawshaw and Brad Fitzpatrick. I wish more startups did cool stuff like this, that works so well.

Multiple cursors
https://rolisz.ro/2020/10/26/multiple-cursors/ (Mon, 26 Oct 2020 09:22:16 GMT)

Multiple cursors are a feature that has been around for several years. I first heard of them from SublimeText fans, but I always thought I didn't need them. At the time, I was a huge Vim fan and I thought I could get away with crafting fancy regular expressions, XKCD style.

But I'm converted now: it turns out that multiple cursors are more useful and easier to use than regexes. Can I craft a regex for what I'm searching for? Yes. Will it take a lot longer? Oh yes. Is using multiple cursors super simple? Absolutely. Do they work in a lot more places? Yes.

What convinced me was the fact that it works in Obsidian too. Several UI toolkits for the web offer this out of the box, so it works on many websites as well. I can use the same "shortcut" in PyCharm, Visual Studio Code, etc., without having to set up a Vim mode.

What other cool shortcuts do you use?

Backing up 4: The Raspberry Pi
https://rolisz.ro/2020/10/11/raspi-backups/ (Sun, 11 Oct 2020 20:31:12 GMT)

More than a year ago I described how I used Syncthing to back up folders from my NAS to an external hard drive attached to my parents' PC. This was supposed to be my offline backup. Unfortunately, it didn't prove to be a very reliable solution. The PC ran Windows, I had trouble getting SSH to work reliably, and I often had to fix stuff through TeamViewer. Often the PC would not be turned on for days, so I couldn't even do the backups without asking them to turn it on. And Syncthing turned out to be finicky and sometimes didn't sync.

Then it finally dawned on me: I have two Raspberry Pi 3s at home that are just collecting dust. How about I put one of them to good use?

So I took one of the Pis, set it up at my parents place and after some fiddling, it works. Here's what I did:

I used the latest Raspbian image. It sits at my parents' home, which has a dynamic IP address. The address usually changes only if the router is restarted, but it can still cause issues. At first I thought I would set up a reverse SSH tunnel from the Raspberry Pi to my NAS, but I couldn't get autossh to work with systemd.

Then I tried another option: I set up a Dynamic DNS entry on a subdomain, with ddclient on the Raspberry Pi updating the IP address regularly. I had to open a port on my parents' router for this. I added public key authentication for SSH, while restricting password-based authentication to LAN networks only, in /etc/ssh/sshd_config:

PasswordAuthentication no
ChallengeResponseAuthentication no

Match Address 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
    PasswordAuthentication yes
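After editing sshd_config, the SSH daemon has to be restarted for the new Match block to take effect. A quick sanity check (just a sketch, not part of the original setup; the user, hostname and port are the same placeholders used below) is to force password authentication from outside the LAN and confirm it gets refused:

> sudo systemctl restart ssh
> ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password -p port user@pi_hostname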

It has worked for two weeks, so that's pretty good.

Now that I had a stable connection to the Pi, it was time to set up the actual backups. I looked around and there are several options. I ended up choosing BorgBackup. It has built-in encryption for the archive, so I don't need to muck around with full disk encryption. It also does deduplication, compression and deltas, so after an initial full backup it only transfers changes, which makes it quite efficient.

BorgBackup is quite simple to use. First you have to initialize a repository, which will contain your backups:

 > borg init ssh://user@pi_hostname:port/path/to/backups -e authenticated

This will prompt you for a passphrase. It will also generate a keyfile, which you should export and keep safe on other machines:

> borg key export ssh://user@pi_hostname:port/path/to/backups

Then, to start the actual backup process:

> borg create --stats --progress --exclude "pattern_to_exclude*" ssh://user@pi_hostname:port/path/to/backups::archive_name ./folder1 ./folder2 ./folder3

The archive_name corresponds to one instance when you backed up everything. If the next day you rerun the command with archive_name2, it will compare all the chunks and transmit only the ones that are new or have changed. You will then be able to restore from either archive, with BorgBackup doing the right thing in the background to show you only the items that were backed up in that archive.

The cool thing about Borg is that if a backup stops while in progress, it can easily resume at any time.
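Restoring (or just checking that an archive is usable) is also straightforward: you list the archives in the repository and extract from the one you want. A quick sketch with the same placeholder repository path as above; note that borg extract restores into the current directory:

> borg list ssh://user@pi_hostname:port/path/to/backups
> borg extract ssh://user@pi_hostname:port/path/to/backups::archive_name folder1/some/file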

I added the command to a cron job (actually, the Synology Task Scheduler) to run it daily and now I have daily, efficient backups.

#!/bin/sh
# Archive name schema
DATE=$(date --iso-8601)
echo "Starting backups for $DATE"
export BORG_PASSCOMMAND="cat ~/.borg-passphrase"
/usr/local/bin/borg create --stats --exclude "pattern_to_exclude*" ssh://user@pi_hostname:port/path/to/backups::$DATE ./folder1 ./folder2 ./folder3

The .borg-passphrase file contains my passphrase and has its permissions set to 400 (readable only by my user). Borg runs the command from the BORG_PASSCOMMAND environment variable and uses its output as the passphrase, so no user input is necessary.

Now I get the following report by email every morning:

Duration: 4 minutes 22.54 seconds
Number of files: 281990

			Original size      Compressed size    Deduplicated size
This archive:              656.97 GB            646.90 GB             12.51 MB

Not bad. Borg sweeps 656 GB of data in 4.5 minutes, determines that there is only 13 MB of new data and sends only that over the network.

I feel much more confident about this solution than about the previous one! Here's to not changing it too often!

Playing Codenames in Rust with word vectors
https://rolisz.ro/2020/09/26/playing-codenames-in-rust-with-word-vectors/ (Sat, 26 Sep 2020 18:32:01 GMT)

In a previous post I implemented the game of Codenames in Rust, allowing a human player to interact with the computer playing randomly. Now let's implement a smarter computer agent, using word vectors.

Word vectors (or word embeddings) are a way of converting words into a high dimensional vector of numbers. This means that each word will have a long list of numbers associated with it and those numbers aren't completely random. Words that are related usually have values closer to each other in the vector space as well. Getting those numbers from raw data takes a long time, but there are many pretrained embeddings on the internet you can just use and there are also libraries that help you find other words that are close to a target word.

Word vectors in Rust

Machine learning has embraced the Python programming language, so most ML tools, libraries and frameworks are in Python, but some are starting to show up in Rust as well. Rust's performance focus attracts people, because ML is usually computationally intensive.

There is one library in Rust that does exactly what we want: FinalFusion. It has a set of pretrained word vectors (quite fancy ones, with subword embeddings) and it has a library to load them and to make efficient nearest neighbor queries.

The pretrained embeddings come in a 5 GB file, because they pretty much literally have everything, including the kitchen sink, so the download will take a while. Let's start using the library (after adding it to our Cargo.toml file) to get the nearest neighboring words for "cat":

use std::io::BufReader;
use std::fs::File;
use finalfusion::prelude::*;
use finalfusion::similarity::WordSimilarity;

fn main() {
    let mut reader = BufReader::new(File::open("resources/ff.fifu").unwrap());

    // Read the embeddings.
    let embeddings: Embeddings<VocabWrap, StorageViewWrap> =
        Embeddings::read_embeddings(&mut reader)
            .unwrap();
    println!("{:?}", embeddings.word_similarity("cat", 5).unwrap());
}

After running it we get the following output:

[WordSimilarityResult { similarity: NotNan(0.81729543), word: "cats" },
WordSimilarityResult { similarity: NotNan(0.812261), word: "kitten" },
WordSimilarityResult { similarity: NotNan(0.7768222), word: "feline" },
WordSimilarityResult { similarity: NotNan(0.7760824), word: "kitty" },
WordSimilarityResult { similarity: NotNan(0.7667354), word: "dog" }]
Sidenote:

Loading a 5GB file from disk will take some time. If you have enough RAM, it should be in the OS's file cache after the first run, so it will load faster. Also, compiling this program with --release (turning on optimizations and removing debug information) will speed it up significantly.
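For anyone new to Rust, that just means passing the flag to cargo (standard cargo usage, nothing specific to this project):

# build with optimizations and run
cargo run --release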

One of the rules of Codenames is that hints can't be any of the words on the board or direct derivatives of them (such as plural forms). The finalfusion library has support for masking some words out, but to get plural forms I resorted to another library called inflector, which has a method called to_plural that does exactly what it says on the tin.

use std::io::BufReader;
use std::fs::File;
use finalfusion::prelude::*;
use finalfusion::similarity::EmbeddingSimilarity;
use std::collections::HashSet;
use inflector::string::pluralize::to_plural;

fn main() {
    let mut reader = BufReader::new(File::open("resources/ff.fifu").unwrap());

    // Read the embeddings.
    let embeddings: Embeddings<VocabWrap, StorageViewWrap> =
        Embeddings::read_embeddings(&mut reader)
            .unwrap();
    let word = "cat";
    let embed = embeddings.embedding(word).unwrap();
    let mut skip: HashSet<&str> = HashSet::new();
    skip.insert(&word);
    let pluralized = to_plural(word);
    skip.insert(&pluralized);
    let words = embeddings.embedding_similarity_masked(embed.view(), 5, &skip).unwrap();
    println!("{:?}", words);
}

This is a slightly lower level interface: we first obtain the embedding of the word, build the set of words to skip, and then search for the words most similar to the vector we give it. The output is:

[WordSimilarityResult { similarity: NotNan(0.812261), word: "kitten" },
WordSimilarityResult { similarity: NotNan(0.7768222), word: "feline" },
WordSimilarityResult { similarity: NotNan(0.7760824), word: "kitty" }, 
WordSimilarityResult { similarity: NotNan(0.7667354), word: "dog" }, 
WordSimilarityResult { similarity: NotNan(0.7471396), word: "kittens" }]

It's better. Ideally, we could also somehow remove all compound words based on words from the board, but that's a bit more complicated.

This can be wrapped in a function, because it's a common use case:

fn find_similar_words<'a>(word: &str, embeddings: &'a Embedding, limit: usize) -> Vec<WordSimilarityResult<'a>> {
    let embed = embeddings.embedding(&word).unwrap();
    let mut skip: HashSet<&str> = HashSet::new();
    skip.insert(&word);
    let pluralized = to_plural(&word);
    skip.insert(&pluralized);
    embeddings.embedding_similarity_masked(embed.view(), limit, &skip).unwrap()
}

Implementing the first spymaster

Let's implement our first spymaster which uses word vectors! First, let's define a type alias for the embedding type, because it's long and we'll use it many times.

type Embedding = Embeddings<VocabWrap, StorageViewWrap>;

Our spymaster will have two fields: the embeddings and the color of the player. The Spymaster trait requires only the give_hint function to be implemented.

pub struct BestWordVectorSpymaster<'a> {
    pub embeddings: &'a Embedding,
    pub color: Color,
}

impl Spymaster for BestWordVectorSpymaster<'_> {
    fn give_hint(&mut self, map: &Map) -> Hint {
        let enemy_color = opposite_player(self.color);
        let remaining_words = map.remaining_words_of_color(enemy_color);
        let mut best_sim = NotNan::new(-1f32).unwrap();
        let mut best_word = "";
        for word in remaining_words {
            let words = find_similar_words(&word, self.embeddings, 1);
            let hint = words.get(0).unwrap();
            if hint.similarity > best_sim {
                best_sim = hint.similarity;
                best_word = hint.word;
            }
        }
        return Hint{count: 1, word: best_word.to_string()};
    }
}

This spymaster uses a simple greedy algorithm. It takes each word that has to be guessed and finds the most similar word to it, while keeping track of the similarity. It returns as its hint the word that had the highest similarity to any of the words belonging to the opposite team.

How does it do? I drew some random boards with a fixed seed and ran this spymaster on them. If you hover over the hint, it shows the words it's based on.

We have a problem: the word embeddings we use are a bit too noisy. Word embeddings are usually trained on large text corpora crawled from the internet, such as Wikipedia, the Common Crawl project or the CoNLL 2017 dataset (this is the one used above). The problem with these large corpora is that they are not perfectly cleaned. For example, "-pound" is considered a word. Let's try the CC embeddings:

Unfortunately, the CC embeddings give even worse results.

Cleaning up the embeddings

My fix for this was to write a script to prune down the embeddings to only "real" words (ones made of only letters). First, I had to get a set of all these words.

    let words = embeddings.vocab().words();
    let mut total = 0;
    let mut lowercase = 0;
    let mut select = HashSet::new();
    for w in words {
        total += 1;
        if w.chars().all(char::is_lowercase) {
            lowercase +=1;
            select.insert(w.clone());
        }
    }
    println!("{} {}", total,  lowercase);

Then I had to get the embeddings and the norms for each of these words:

    let mut selected_vocab = Vec::new();
    let mut selected_storage = Array2::zeros((select.len(), embeddings.dims()));
    let mut selected_norms = Array1::zeros((select.len(),));

    for (idx, word) in select.into_iter().enumerate() {
        match embeddings.embedding_with_norm(&word) {
            Some(embed_with_norm) => {
                selected_storage
                    .row_mut(idx)
                    .assign(&embed_with_norm.embedding);
                selected_norms[idx] = embed_with_norm.norm;
            }
            None => panic!("Cannot get embedding for: {}", word),
        }

        selected_vocab.push(word);
    }

And finally write the now much smaller embedding file:

    let new_embs = Embeddings::new(
        None,
        SimpleVocab::new(selected_vocab),
        NdArray::from(selected_storage),
        NdNorms::new(selected_norms),
    );
    let f = File::create("resources/smaller_embs.fifu").unwrap();
    let mut writer = BufWriter::new(f);
    new_embs.write_embeddings(&mut writer);

On the embeddings trained on the CoNLL dataset the reduction is about 6x: from 1336558 to 233453.

Let's give our Spymaster another shot with these embeddings, simply by changing the file from which we load the embeddings:

"superheroic" and "marched" look kinda against the rules, being too close to one of the words on the board, but "movie" is a really good one word hint.

Implementing a field operative

Now let's implement the other part of the AI team: the field operative which has to guess which words from the board belong to the enemy, based on the hints the spymaster gave.

pub struct SimpleWordVectorFieldOperative<'a> {
    embeddings: &'a Embedding,
}

impl FieldOperative for SimpleWordVectorFieldOperative<'_> {
    fn choose_words<'a>(&mut self, hint: &Hint, words: &[&'a str]) -> Vec<&'a str> {
        let hint_emb = self.embeddings.embedding(&hint.word).unwrap();
        let hint_embedding = hint_emb.view();
        let mut similarities = vec![];
        for w in words {
            let new_embed = self.embeddings.embedding(&w).unwrap();
            let similarity: f32 = new_embed.view().dot(&hint_embedding);
            similarities.push((w, similarity));
        }
        similarities.iter()
            .sorted_by(|(_, e), (_, e2)| e.partial_cmp(e2).unwrap())
            .rev().take(hint.count).map(|x| *x.0).collect()
    }
}

The field operative is even simpler: we go through all the words that are still on the board and get a similarity score between them and the hint. Sort the words by the similarity and take the top "count" ones.

Let's see how it does on the same maps as before. If you hover over the field operative, you can see the guesses it makes.

It looks like the two AI players are a good fit for each other: the field operative always guesses the word that the spymaster based the hint on. Now, let's try to make the spymaster give hints for multiple words.

Improving the spymaster

My first idea is to generate the top n closest words (by embedding similarity) for every word of a color and see if any of them are in common. n will be a tunable parameter: lower values will give hints that are closer to the words but match fewer of them, while higher values will match more words, potentially with worse hints.

impl Spymaster for DoubleHintVectorSpymaster<'_> {
    fn give_hint(&mut self, map: &Map) -> Hint {
        let enemy_color = opposite_player(self.color);
        let remaining_words = map.remaining_words_of_color(enemy_color);

        let mut sim_words = HashMap::new();
        for word in remaining_words {
            let words = find_similar_words(&word, self.embeddings, self.n);
            for w in words {
                let count = sim_words.entry(w.word).or_insert(0);
                *count +=1;
            }
        }
        let mut best_word = sim_words.iter()
        		.max_by_key(|(&x, &y)| y).unwrap();

        return Hint{count: *best_word.1 as usize, word: best_word.0.to_string()};
    }
}

We store how often each suggested word shows up across all suggestions in a hashmap, with the key being the word and the value being the occurrence count. After we add all the words there, we simply read out the maximum by occurrence count. In Rust, the iteration order of a HashMap is nondeterministic, so if there are multiple words that occur the same number of times, which one will be returned is not guaranteed.

Around n=40, we start seeing hints for multiple words. At n=80 we have hints for two words on all three maps. At n=400 we have triple hints for two of the maps. But starting with n=80, the field operative no longer guesses all of the source words correctly. Sometimes it's because the associations between the words are weird, but more often it's because the spymaster only takes into account the words for which it should suggest related hints and not the words from which the hint should be dissimilar.

There are several ways to address this issue, from simply seeing if the hint word is too close to a bad word and rejecting it, to more complex approaches such as finding the max-margin plane that separates the good words from the bad words and looking for hint words near it. But this post is already long, so this will come in part 3.

The whole code I have so far can be seen on Github.

Tenet
https://rolisz.ro/2020/09/17/tenet/ (Thu, 17 Sep 2020 21:44:09 GMT)

After many delays, the movie Tenet has finally come to the cinema. It's amazing. I loved every single bit of it. Almost 2.5 hours of pure awesomeness. #christophernolan4president This is what a good movie is supposed to be like. Go on, go see it. The rest of the post will wait.

At least, that's my opinion. I went to see it at the cinema with 5 friends. One left halfway through the movie. One fell asleep. One admitted he lost track of what was going on. One said it was a good movie. The last one is as excited as I am.

[Image: This happens after 5 minutes]

Tenet is a packed action movie. It starts with 10 seconds of logos, then you have actual footage, and 15 seconds later the shooting starts. And it keeps a similar pace for the rest of the movie, sometimes stopping to try and explain what's going on.

It's a philosophical movie. I mean, it's about going forwards and backwards in time. It screws with causality. It explores the grandfather paradox. It does make your mind bend, even more so than previous Nolan movies. Just think about the nested temporal double pincers.

[Image: Bullet holes from bullets that haven't been fired yet]

The heists/break-ins/missions are really well thought out and you actually get to see some of them twice :D

[Image: This happened for real]

The visuals are stunning. I mean Nolan bought a plane and blew it up for real, rather than doing it with CGI. The sequences where there are both normal and inverted people (so going both forward and backward in time) are... wow. The locations where they filmed are gorgeous.

The soundtrack is brilliant. Ludwig Göransson managed to capture the feel of the movie perfectly. The score fits perfectly to the scenes. He made the melodies by researching retrograde composition, so they would sound (approximately) the same forward and backward.

The acting is great. I don't know where Nolan found John David Washington, the actor who plays "The Protagonist", but he made a really great choice. I'm looking forward to seeing him in more movies. And it makes me want to watch BlacKkKlansman.

[Image: Occasionally the movie slows down to make a joke about suits]

Did I mention I love this movie? Is it obvious from my review? No? Well, score: 11/10. There, if it wasn't clear enough so far.

This is a movie that needs to be seen in cinemas. There are big explosions, bullets flying and beautiful landscapes, all of which benefit from a big screen and a good sound system. So go watch it in the cinema, so that Christopher Nolan will get money to make more awesome movies.

Visiting Romania: The Black Sea
https://rolisz.ro/2020/09/10/visiting-romania-the-black-sea/ (Thu, 10 Sep 2020 20:08:33 GMT)

Fun fact about me: until this year, I had never been to the Black Sea for touristy, sunbathing and relaxing purposes. I've been there 3 times before: the first time in an April, for a Math Olympiad, when it was cold, the second time in a February, for a Physics Olympiad, when it was extremely cold, and a third time in August, on a tour of historic Christian sites, which was fun but exhausting, and I didn't have time to relax.

Last weekend, I finally got the chance to enjoy a couple of lazy days and the shade of an umbrella on the beach. And more excitingly for me: I got to fly again. I so missed it.


We flew from Oradea to Bucharest, rented a car and drove 3 hours to Neptun. Surprise: TAROM is an airline just like any other. It might have been worse in the past, but the flight was perfectly fine. The airport in Oradea is growing: it now has two terminals. Too bad it doesn't have many flights 😄

[Image: Driving on the not so sunny "Sun Highway"]

I rented the car from FMNRent and I was very impressed. It was raining pretty badly when we arrived and I didn't know where their office was, so I was afraid I'd have to walk quite a bit in the rain. Well, they waited for us at the airport, took us to the office by car, I signed the papers quickly and we were on our way. Really awesome customer service!

On our way to Neptun we made a small detour to have lunch at Forest M. The pictures on Google Maps don't do it justice. Really nice location, in the middle of a forest, great atmosphere and great food as well. I want to go to the sea again just so I can stop and eat there again.


And then we finally got to the seaside. I didn't expect Neptun to be so green. Just 2 minutes away from the beach there are nice parks and forests and lots of places to hide from the sun. I liked it. It also has an interesting mix of older communist style buildings and more modern hotels. It used to be Ceaușescu's seaside retreat, so it has a lot of fancy older villas and that's why there are so many parks.


The sea was quite rough, with many waves. Two observations: first, it's really fun to jump into waves. I don't know why, but it's just fun. Second: there is a certain mystical/transcendent quality to waves washing up on the shore, only for others to come again and again. While I know the theory behind waves and tides, I can't help but wonder how marvelous and inspiring they must have been for people living ages ago.

[Image: I'm not really sure what it was, but even the garlic was delicious]

Before going, I heard from multiple people that the Romanian seaside is expensive and that Bulgaria, Greece or Croatia are cheaper. Well, we found a restaurant 2 minutes away from the beach with great food and normal prices. I don't know where everyone else went.

[Image: View from the hotel. Eastern Europe in one picture.]

The hotel was meh. To quote Dyatlov, not great, not terrible. It was 5 minutes from the beach, which was the important part.

And then our short trip ended. We flew back, circled the Oradea airport several times because of unfavorable air currents, saw from the air our house that is still under construction, and went back to work.

Visiting Romania: Șuior
https://rolisz.ro/2020/08/11/visiting-romania-suior/ (Tue, 11 Aug 2020 19:37:26 GMT)

I have to admit that I haven't seen too much of my home country. I've set foot in at most half of the counties and I've actually visited and done touristy things in even fewer of them. I've visited more countries than that as a tourist.

But after a recent trip to Maramureș, a county in the northern part of Romania, I was so impressed by how beautiful it was that I want to fix this deficiency.

[Image: Lake Bodi]

Last weekend we went to the Șuior area with some friends who had been there before. First we stopped by Lake Bodi to have lunch there. It's a very calm lake, with very nice surroundings, good for sunbathing, swimming or taking a nice stroll.

Then we went to the Șuior peak. It's a short drive from the lake. During winter, it's a ski resort, so they have a chairlift, which is also operating during summer.

I always enjoy going on chairlifts. It's so quiet up there, as you slowly go up the mountain, with just the trees around you.

Reaching the peak, where there is a weather station, takes a 40-minute hike on a 30% slope. As we were going up, we started hearing thunder. Looking around, we saw some dark clouds forming. Should we go back or should we press on? Despite me barely catching my breath, we decided to reach the peak, no matter what. I was soaking wet anyway, so what's some rain going to do?

[Image: Fresh bilberries]

We got to the top safely, with only a couple of drops falling on us. We could see in the distance where the rain was flooding everything, but we were safe. We picked some bilberries and then we started going back.

[Image: Me walking up barefoot]

It rained on us a little bit going down and the grass was more slippery, but we got down safely. When we got to the bottom, the chairlift operator told us that the previous group had ridden the chairlift during the rain and got soaking wet. There was even hail where we had left our car. But fortunately, we avoided all of that.

After finishing the leftovers from lunch, we started the two hour drive back. The other great thing about Maramureș is that even the roads are really scenic. At least the roads we took were really good and often went through forests.

All in all, it was a really relaxing day. Thank you, our dear "godparents", for taking us out for this fun day, and thank you for all the really good talks as well!

I’m publishing this as part of 100 Days To Offload - Day 34.

Boardgames Party: Azul
https://rolisz.ro/2020/08/03/boardgames-party-azul/ (Mon, 03 Aug 2020 19:38:54 GMT)

Last weekend I played a fun game called Azul, which has won many awards, among them the famed Spiel des Jahres in 2018. The name comes from "azulejos", Portuguese tiles made by the Moors. A Portuguese king fell in love with them, and the game tasks the players with making the most beautiful decoration in his palace.

[Image: The factories making the tiles]

There are a number of factories "making" four tiles each. On their turn, players take all the tiles of one color from a factory and put the others in the middle. Using these tiles, they have to prepare to decorate the wall.

[Image: Top: scoring area. Left: staging area. Right: the wall to be decorated. Bottom: penalty area]

Take too many tiles that don't fit in the staging area and you have to put the surplus ones in the penalty area. Don't gather enough tiles of one color to fill a row in the staging area and you are not able to decorate the wall.

[Image: Second row is filled and will decorate the wall. Fourth row is missing one tile, so it won't be used this turn.]

The game lasts until someone manages to fill one horizontal row on the wall. You get extra points if you complete vertical columns or if you manage to place all the tiles of one color.

The game is quite simple to explain and it doesn't last long. On our first play, including reading the instructions and understanding the rules, it took one hour, of which the actual game play was half. Despite its simplicity, there are several interesting strategies to explore: whether to go for the shorter rows first or for the longer ones, whether to try to get extra points from the bonuses, or whether to try to finish first. It was a nice brain teaser for a Friday night.

Score: nine

[Image: The tiles are kept in this really cute bag]

I’m publishing this as part of 100 Days To Offload - Day 33.

Getting rid of vmmem
https://rolisz.ro/2020/07/31/getting-rid-of-vmmem/ (Fri, 31 Jul 2020 17:52:17 GMT)

A while ago I noticed in Task Manager that there was a process called Vmmem running and it was using about 5-15% CPU constantly. A quick duckduckgoing revealed that it's a virtual process that represents the total system usage by VMs.

Alright, so it's not malware. But I was not running any VMs. Where did it come from? My first suspicion was Docker, which runs Linux containers in a VM. But I closed the Docker for Windows application, and Vmmem was still chugging along, burning CPU. Then I suspected that the Windows Subsystem for Linux might be doing something funny, but no, it wasn't that, because I'm still using version 1, not version 2, which is the one that runs in a VM.

Well, after some more searching, it turns out that in some cases, Docker doesn't clean up properly when quitting and it leaves the VM open. To kill it, you must open the application called Hyper-V Manager and turn off the VM there manually.
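If you prefer not to click through the UI, the same thing can be done from an elevated PowerShell prompt with the Hyper-V cmdlets (a sketch; the VM name varies, so list the VMs first - "DockerDesktopVM" below is just an example):

# see which VMs Hyper-V knows about and their state
Get-VM
# turn off the one Docker left behind (use the name Get-VM showed)
Stop-VM -Name "DockerDesktopVM"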


To paraphrase Homer Simpson, "Docker, the cause of, and solution to, all our problems".

I’m publishing this as part of 100 Days To Offload - Day 32.

Note taking with Obsidian
https://rolisz.ro/2020/07/28/obsidian/ (Tue, 28 Jul 2020 07:37:56 GMT)

After starting to use spaced repetition more actively and being more consistent with my journaling, I've now tackled improving my note-taking skills.

Over the last couple of months, I've read about various note-taking methods, different approaches and diverse goals for them and they helped me change my perspective on collecting ideas.

My previous approaches to taking notes

While I was still in school, I didn't take very good notes. I guess the Romanian educational system is not set up to encourage the kind of critical thinking that leads to taking good notes. In most classes, the teacher would dictate a lesson, I would write it down and then memorize it, without needing to create my own summarized version of the lesson.

Then I took notes in Google Keep. When it came out, I was impressed by how simple and snappy it was, compared to Evernote for example. I've used it for around 6 years, so I have a lot of notes jotted down there. But Keep doesn't offer good ways to organize notes, having only tags and search. And to search, you have to remember something to look for, so it's almost impossible to find notes which you have forgotten.

As I started moving my notes from Keep to Obsidian, I found a lot of old notes which I had totally forgotten about. Some of those notes were ideas I had but never followed up on. Looking back, they were good ideas: I know because in the meantime other people have implemented them.

Goals of a note taking system

A goal of a note taking system should be to store information, so that your brain can do more fun/useful stuff.

But another goal is to help you connect ideas, even from different domains. As you read things in books, blogs, or come up with ideas on your own, eventually some of them will be related and can be combined to make something even better. The note taking system should facilitate creating these connections.

It should also help you process your notes. A note is not a static thing. It's not something you just write down and then never touch again. Almost all ideas will eventually need refining. When you read a book and find something noteworthy, you write some notes, but as you understand things better, you find you can reword them better. For your own ideas, over time you add something here, cut something there, hopefully making them better, and then you create something with them.

Lastly, the note organizing system should help you organize and browse your notes. It should encourage serendipitous rediscovery of forgotten notes. It should help visualize them. It should make it easy to see all notes that are related to a certain note or to a certain topic.

Enter Obsidian

[Image: Meta: writing this post in Obsidian. Left: folder pane; middle: raw markdown; right: rendered note]

Obsidian is a new desktop app for creating "a second brain, for you, forever". It's quite new: it was released publicly two months ago and it's not even at version 1.0, but development is progressing quite quickly, with a new version coming out almost weekly.

Obsidian is built around Markdown files, which means that the notes are portable and won't get stuck in an old program, should the company go out of business. It enhances the Markdown syntax with some shortcuts for creating connections between pages, using the [[name of other page]] syntax. For each page, you can see which other pages link back to it. There's also a way to embed one file (or part of a file) into another one.

[Image: The graph view]

One cool looking feature in Obsidian is the graph view, where you can visualize how your notes are connected. This leads to some interesting looking "constellations".

There are also plugins available. For now, the API is internal, so all the plugins are made by the same company, but they say that once they reach 1.0, they will make the API public.

Because Obsidian is based on Markdown files, you can store them wherever you want and you can sync them across devices with your favorite syncing tool. So far, I've used GitHub.

One disadvantage is that it doesn't have a mobile app so far (or a remote interface in general). On one hand, it's not a big issue for me, because my goals for notes require a big screen and a nice keyboard. On the other hand, I still like to have access to my notes on the go, so I just use an Android Git client with a Markdown editor. Maybe one day I'll change this to expose my notes as a static website.

My note taking workflow

My workflow is inspired by (but only inspired by, not fully copied from) the Zettelkasten method of Niklas Luhmann, which is the trending note-taking framework du jour, sprinkled with ideas from Tiago Forte and others. When my note-taking workflow grows up, it wants to be like Andy Matuschak's notes.

I have a loose categorization of my notes into folders. I'm not very strict about them and I don't want to spend much time organizing a hierarchy. Some notes also have tags (simply words prefixed with # and Obsidian is smart enough to start a search for them if you click on one).

Many of my notes don't start their life in Obsidian, but in my bullet journal. When I hear something new or read something interesting, my bullet journal is often closer to me, so I jot down the main ideas there. Then, when I get to my computer and have time, I copy them into Obsidian and flesh out the ideas fully. Then I try to find any other notes that are relevant and add connections to them.

One thing that I try to do is to make my notes my own. This means that if I read an interesting article, I don't just copy paste the interesting parts, but I actually reword them and write them down as I understood them. This helps both comprehension and retention.

For certain topics, I create index maps, where I list all the notes I have related to that topic. Some notes belong to multiple index maps, because they are relevant in different areas.

As I add new notes and create connections between them, I end up revisiting old notes and updating them. Sometimes it's with a negative update (it didn't pan out, it was a wrong idea), but sometimes it's a positive one (a further development or something similar someone else has done).

The most interesting part is when notes from very different areas start to "touch" each other. Because of the graph view, it's easy to see how close notes are to each other. I also use the graph view to see what notes are isolated and then I try to find them a place. As you can see in the above screenshot, I still have a lot of work to do.

So far I have added 160 notes in Obsidian, so it's a small knowledge base. I still have many notes in Google Keep to move over. But I feel like Obsidian has already helped me (I feel on top of my notes) and I hope something nice (and useful) will come out of it.

Top photo by Pogány Péter - Egen Wark.

I’m publishing this as part of 100 Days To Offload - Day 31.

Battlestar Galactica
https://rolisz.ro/2020/07/25/bsg/ (Sat, 25 Jul 2020 18:15:03 GMT)

I finally did it: I watched one of the best TV shows made in the last 20 years (at least according to a list made by the New York Times). For some reason, I didn't like the concept back when it aired, but I decided to give it a shot after I finished Travelers. And I was hooked.

Battlestar Galactica (BSG) is nominally a sci-fi show about a war between humans and the robots they created. But the show is actually more about all kinds of philosophical, political, religious and metaphysical debates.

It's a 15 year old show, but somehow I had avoided spoilers and had pretty much no idea about anything that would happen. But this post will have spoilers :P

One of the recurring themes in the show is the cyclical repetition of history: "All this has happened before, and all this will happen again" is a motto oft repeated in the show. There is a cycle of humans creating robots (Cylons), the Cylons rebelling against their makers and almost wiping out the humans, and then the whole thing repeating again several thousand years later. The show tries to end on an optimistic note, that maybe this time the cycle might be broken, due to the "law of large numbers".

[Image: The eponymous battlestar]

BSG also explores how a civilization should be led. The humans are initially governed as a democracy, with elections, fair trials and so on. But those things tend to get in the way of quick and decisive action, which is needed during war. There are several military coups, rebellions, sham trials and so on. They get pretty close to exploring communism, to get rid of class warfare. They are willing to commit genocide against the Cylons, initially unable to even consider a peaceful agreement with them. Even though the two leaders, Admiral Adama and President Roslin, are very likeable at first, after four seasons during which they had to make many questionable decisions, they lose a lot of their charisma.

Religion has an important role for the people (and machines) of BSG. Humans start out with a polytheistic religion ("coincidentally", with names from the Roman and Greek pantheons). Cylons have a monotheistic religion, I'd say a bit inspired by Christianity. There are religious writings that make prophecies which (seem to) come true and which guide the humans in their search for a new home. I am a bit conflicted about this part. On one hand, my personal beliefs are somewhat similar to what the ending of the show suggests ("it's part of God's plan"), but at the same time, I like my sci-fi with less religion.

Over the course of 4 seasons, all the characters evolve. As I said before, the leadership gets stained by all the hard decisions they've had to make and by the end of the show they are tired and sick of it all. Apollo is hilarious as he gets fat and lazy and then has to work extra hard to get back in shape (though maybe I shouldn't be laughing at this...). But the character development of Gaius Baltar is a bit too extreme for me. He goes from a completely selfish and narcissistic person to a guy preaching love and forgiveness who's ready to give his life for the greater good. I don't know, it feels too fishy. On the other hand, Starbuck being revealed to be an "angel" or whatever she was... scratches head, confused.

[Image: A cylon centurion]

The cliffhangers in the show are great. I had the privilege of binge watching, but I imagine that 13 years ago the midseason breaks and the season finales were brutal. The one in the middle of the 4th season, when they find Earth and it's a radioactive pile of rubble, wow, that gave some really nice twists to everything, especially about the origins of the Final Five Cylons.

The technology used in-universe is weird. On one hand, they have very advanced stuff, like faster-than-light travel; on the other hand, they use really old-school analog scales, you know, the ones where you have to adjust the weights. The computers are not networked, but at least they have a good explanation for that: so that Cylon viruses can't spread from one system to another. And their papers are odd: they have their corners cut off. The process for refining tyllium (their fuel) is extremely manual, almost like coal mining 100 years ago.

The show was made before Netflix came and changed the format of TV shows. While the first season is short (10 episodes), the others have 20 episodes. I have to say, I'm glad TV shows nowadays are shorter. Seasons 2 and 3 have a lot of filler episodes. Season 4 is better, because the writers knew the series was going to end, so they could plan accordingly.

The acting is quite good. In particular, James Callis does a remarkable job with Gaius Baltar, exhibiting over four seasons a comprehensive range of human emotions. Tricia Helfer also has a quite challenging role, having to play the many clones of the Number 6 Cylon, in different places and different postures.

I really liked this show. It is one of the most captivating shows I've watched in the last 2-3 years. It has some flaws, but it's a really great space opera.

Grade: 10

As a side note: I watched this on Amazon Prime and the English subtitles are so bad. Sometimes the subtitles are off by 1 second (I haven't had this issue anywhere else in the last 10 years) and sometimes it seems like the subtitles were written by ear by someone who doesn't have good English.

I’m publishing this as part of 100 Days To Offload - Day 30.

Showing current Kubernetes cluster in Powershell prompt
https://rolisz.ro/2020/07/14/showing-current-kubernetes-cluster-in-powershell-prompt/ (Tue, 14 Jul 2020 19:28:49 GMT)

After nearly clearing the wrong Kubernetes cluster, I decided to add the name of the currently active cluster to my Powershell prompt. There are plenty of plugins to do this in Bash/Zsh/fish, but not as many for Powershell.

It's not hard to do, but the syntax and tools you use are definitely different from Unix tools.

First, let's get the name of the currently active cluster. We can look through the ~/.kube/config file for the field called current-context. On Linux, I would use grep to extract this; on Windows we use Select-String, which receives a regex to match and outputs the matches as objects. We take the first match and its second group (which is the first and only capture group in our regex) and put its value in the $ctx variable. This should output the current cluster name.

$K8sContext=$(Get-Content ~/.kube/config | Select-String -Pattern "current-context: (.*)")
$ctx=$K8sContext.Matches[0].Groups[1].Value
Write-Host $ctx

And now, to edit the prompt, you must modify your PowerShell profile. If you don't have one, create the file C:\Users\<USERNAME>\Documents\WindowsPowerShell\profile.ps1. If you already have a profile (which might be a global one or a per-user one), edit it and add the following:

function _OLD_PROMPT {""}
copy-item function:prompt function:_OLD_PROMPT
function prompt {
	$K8sContext=$(Get-Content ~/.kube/config | Select-String -Pattern "current-context: (.*)")
	If ($K8sContext) {
		$ctx=$K8sContext.Matches[0].Groups[1].Value
		# Set the prompt to include the cluster name
		Write-Host -NoNewline -ForegroundColor Yellow "[$ctx] "
	}
	_OLD_PROMPT
}

The prompt function is called to write out the prompt. First we save a copy of the original prompt into the _OLD_PROMPT function and then we define our new prompt function. In the new function we do the above snippet, with an added check so that we only add something to the prompt if there was a match for our regex. I put the name of the cluster in square brackets to make it visually distinct from the Python virtual environment name, which comes afterward in parentheses.
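To pick up the change without opening a new terminal, you can dot-source the profile in the current session (standard PowerShell, nothing specific to this setup):

. $PROFILE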

The result is as follows:

[Image: the resulting prompt]

Good luck with not nuking the wrong Kubernetes cluster!

I’m publishing this as part of 100 Days To Offload - Day 29.

Going camping
https://rolisz.ro/2020/07/11/going-camping/ (Sat, 11 Jul 2020 15:21:03 GMT)

I have never been camping in my life. I slept once in a tent in my aunt's backyard, with my cousins, but that doesn't count. This year I've decided it's about time to fill that gap in my life and actually sleep in a tent in the forest.

I actually wanted to do this last year. I even bought a tent, sleeping bags and other necessary items. Unfortunately, my wife and I were quite busy last summer, so we didn't get around to going camping.

But, this weekend, the stars aligned, and the group of guys I go hiking with decided to go camping. We wanted to do Via Ferrata and rafting too, but the water level was too low for rafting :( But at least I got to sleep in a tent in the forest.

Of course, first I tested my tent in my in-laws' backyard. I didn't want to figure out how to assemble it out in the field, while bugs were biting me and there wasn't much sunlight left. Only then did I dare to set it up in the forest.

[Image: Our camping spot, before it was cleared out]

We had eyed a camping spot about 30 metres from the river, next to a rock cliff. My friends arrived there first and they cleared it out, because it was quite overrun by vegetation. We made a fire, but we forgot to gather enough wood while it was still light out, so we couldn't keep it going for too long. It didn't matter too much, because by midnight all of us were sleepy and we headed to our tents.

And I have to say I loved it. I was warned that I'd be shivering towards the morning - with a proper sleeping bag, that's not a problem at all. I was told it would be uncomfortable and I wouldn't be able to find a good position to sleep in - well, the others in my group did report such problems in the morning, but I had a foam pad and a self-inflating pad, so I slept really well, with no aches in the morning, only a fresh feeling.

[Image: The narrow path to our spot, almost completely overgrown by vegetation]

I'm a man who enjoys comfort, so if I tried to do a hiking + camping trip, where I would have to carry my tent on foot for a long distance, I would probably enjoy things much less. But if I can stuff the car full of things that make it more comfortable, I think camping is really great, no need to go back to the "stone ages".

I can't wait to go camping again, this time with my wife!

I’m publishing this as part of 100 Days To Offload - Day 28.
