<![CDATA[rolisz's blog]]>https://rolisz.ro/https://rolisz.ro/favicon.pngrolisz's bloghttps://rolisz.ro/Ghost 4.37Fri, 25 Mar 2022 10:59:21 GMT60<![CDATA[On the usefulness of a little bit of programming]]>While I am a professional software developer (or machine learning consultant, to be more exact) and I have started learning programming more than 10 years ago, I went to college to study it, I spend a lot of my free time learning more about this domain (because I love it)

]]>
https://rolisz.ro/2022/03/15/on-the-usefulness-of-a-little-bit-of-programming/62307a8aecbf7104750afc7dTue, 15 Mar 2022 14:07:44 GMT

While I am a professional software developer (or machine learning consultant, to be more exact) who started learning programming more than 10 years ago, went to college to study it, and spends a lot of free time learning more about this domain (because I love it), I'm starting to realize that knowing even a little bit of programming is a superpower from which many people could benefit, if only they knew just a bit of coding.

For example, about 6 years ago I wrote a two-line JavaScript bookmarklet for a recruiter that reduced the number of clicks he had to do from 10 to 3 for each of several thousand candidates. It was a simple script that looked for an HTML tag, extracted a value and put it in the clipboard (from where he pasted it into a spreadsheet). The guy was so happy that he gave me a peer bonus, because it saved him hours of boring work.

Another example is when I have to replace something many times in a text document and can quickly write a regex to do it. My dad sometimes calls me to do that for him when he is typesetting books.
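As a sketch of what such a one-off looks like (a hypothetical example, not the exact script I used), Python's re.sub can swap every pair of straight quotes for the Romanian-style quotes used in typesetting:

```python
import re

text = 'He said "hello" and she said "goodbye".'
# Replace each straight-quoted span with Romanian-style low/high quotes
fixed = re.sub(r'"([^"]*)"', '„\\1”', text)
print(fixed)  # He said „hello” and she said „goodbye”.
```

The same pattern (capture group plus backreference) covers most of these quick find-and-replace jobs.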

This week I wanted to read a transcribed sermon that came in .doc format. For more pleasant reading, I copy-pasted it into a Markdown file, quickly fixed some Markdown issues (mostly lists and headers), ran it through Pandoc and got beautiful HTML that was easy on the eyes. I sprinkled in some copy-pasted JavaScript and had automatic popups for Bible verses when hovering over references.

Julia Evans recently wrote about some tiny personal programs she wrote, each of which added a bit of utility to her life and brought her joy while she was writing it.

Doing this kind of programming doesn't require knowing a lot about programming, much less how to develop software professionally. It does require knowing what is possible with code, plus either low enough friction to get started (such as a text editor with built-in regex search and replace) or high enough motivation (such as wanting to avoid tens of hours of boring work).

But unfortunately, I don't know if this is something that can be learned quickly. All the examples I gave took me 5-10 minutes to implement, but that's because I already knew a fair bit of JavaScript, had processed Markdown in the past, had written regexes many times for work and had a general awareness of Pandoc, even if I hadn't used it much before. If I had to search Kagi for all of these (or if I didn't even know what to search for), it would take a lot longer.

In my opinion, this approach is much more useful than the no-code approach that is popular today (at least among startups), and it would be great to see more startups trying to reduce the friction of doing just a bit of coding, rather than eliminating coding altogether. Grist seems like a pretty good approach in this direction.

What's your favorite quick "hack" where you used a bit of coding knowledge to bring quality of life improvements for yourself?

]]>
<![CDATA[Facilitating Code Retreat in Oradea]]>After 8 years, I finally went again to a Code Retreat, but this time as a facilitator, not as a participant. As far as I know, it was the first time it was done in Oradea, so kudos to the Oradea Tech Hub team for organizing it.

My co-facilitator was

]]>
https://rolisz.ro/2022/02/27/facilitating-code-retreat-in-oradea/621a73fee4d316d11048ac95Sun, 27 Feb 2022 14:47:00 GMT

After 8 years, I finally went again to a Code Retreat, but this time as a facilitator, not as a participant. As far as I know, it was the first time it was done in Oradea, so kudos to the Oradea Tech Hub team for organizing it.

My co-facilitator was Bogdan Bota, who is the co-founder of OptiOffer and with whom I share a surprising number of views on technology and programming. Among other things, neither of us is a big fan of OOP or TDD, so we didn't push that angle too much.

One surprise for me was that the mix of languages that people knew had changed a lot. Java and C# were much rarer, while JavaScript was ubiquitous. There were a couple of high school students who were working in C++ (because that's what's taught in programming classes in Romania).


The participants enjoyed the different challenges we gave them, even if some of them were annoying (but realistic), such as changing requirements or partners mid-session. Bogdi and I had the most fun though, guiding them through this.

We got really good feedback: people had fun, said they learned a lot, and almost everyone was asking when the next event is going to be. Naomi, I hope you'll organize many more great events!

]]>
<![CDATA[To changes]]>During the last year I've had a lot of changes in my life. I became an independent machine learning consultant. Other people consider me an entrepreneur. I'm still in disbelief about that label, but it's becoming true, especially since I launched my first

]]>
https://rolisz.ro/2022/01/01/to-change/61d0896bbb23f815486168e0Sat, 01 Jan 2022 20:32:00 GMT

During the last year I've had a lot of changes in my life. I became an independent machine learning consultant. Other people consider me an entrepreneur. I'm still in disbelief about that label, but it's becoming true, especially since I launched my first side-project (or is it still a side project if it's just one of the many things you do?). Sometimes I feel like I barely recognize myself and how I think.

The machine learning consultant part has changed a lot as well. When I started off a year ago, I had many ideas of what I would do: trainings, online courses, consulting, training lots of ML models. Taking stock at the end of the year revealed that I did few of the things I thought I would do and that, actually, the most profitable work was work I never thought I would do. I barely did any teaching and I did some consulting, but developing a full-stack proof of concept for a startup was by far the best thing financially (and really fun too). Hmmmm.... maybe I should update the list of services I offer.

I started out as a machine learning consultant. But almost all of my work has been related to natural language processing (aka: working with text). Hmmmm.... maybe I should niche down to just NLP.

Some things didn't change. I still have a pathological fear of picking up the phone to call someone. Luckily, most of my clients came to me, instead of me having to go to them.

I thought I would have to do a lot of marketing, lots of blog posts, tweets, Youtube videos. I had one good blog post that brought in two clients. All the other clients came through referral and word of mouth. Hmmmm.... who knew networking is that important? And maybe, just maybe, my blog is not as important.

For a long time I was a lurker on forums, rarely posting anything. This year I've discovered several communities where for some reason, I started being more active. Hmmm... maybe I'm not that introverted.

The biggest change was becoming a daddy. I now have a very different reason to look forward to every day: Gloria. Of course, it comes with its own challenges, including some curveballs, but boy, is it great.

My beautiful giggly daughter, who melts my heart whenever I look at her

There are some negative changes as well: it was the first year in a long time when I didn't fly at all :(

Here's to more changes in 2020 v2 and to being nimble, as God leads me!

]]>
<![CDATA[Learning in Public: Exploring the BT iPay API]]>Dan Luu and Jamie Brandon have argued quite successfully that increasing your productivity and velocity as a developer can lead to good return on investment. So I've been thinking about doing the same and I was inspired by Michael Lynch to record myself while coding and then to

]]>
https://rolisz.ro/2021/11/16/learning-in-public-exploring-the-bt-ipay-api/6193efe8bb23f8154861688cTue, 16 Nov 2021 19:09:28 GMT

Dan Luu and Jamie Brandon have argued quite successfully that increasing your productivity and velocity as a developer can lead to good return on investment. So I've been thinking about doing the same and I was inspired by Michael Lynch to record myself while coding and then to analyze the mistakes I've made.

That was quite fun and I got some very useful feedback from it, so I thought I'd share this video, as a way of learning in public.

First lesson: my green screen doesn't play nice with my IKEA chair that has a mesh in the back. My apologies for the awful looking webcam overlay.

And some development lessons:

  • I go with the mouse several times to the menu to select "Format code". I should learn the shortcut for that, or even better, I should set PyCharm to auto format the file when saving.
  • I spent a lot of time figuring out the parameters for the first call and formatting them as a dictionary. I could have sped up understanding how to send the parameters by URL-decoding the provided example, and for the formatting I could have used multi-cursor editing.
  • BurpSuite/HTTP Toolkit was recommended as a way to explore HTTP APIs.
  • When I was writing the client.register_payment call I had to hover a lot over the function definition to see the order of parameters. Copying the function definition would have been faster and it would have made it easier to define keyword arguments, which are clearer for a function with so many parameters.
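The last point can be illustrated with a sketch (the function below is hypothetical; the real client.register_payment signature in the BT iPay API may differ):

```python
# A hypothetical payment call with many parameters, standing in for
# client.register_payment from the video.
def register_payment(order_id, amount, currency, return_url,
                     description=None, language="en"):
    return {"order_id": order_id, "amount": amount, "currency": currency,
            "return_url": return_url, "description": description,
            "language": language}

# Positional arguments: easy to get the order wrong with this many parameters.
payment = register_payment("1234", 500, "RON", "https://example.com/done")

# Keyword arguments: self-documenting, and the order no longer matters.
payment = register_payment(
    order_id="1234",
    amount=500,
    currency="RON",
    return_url="https://example.com/done",
    description="Test order",
)
```

Copying the parameter names from the function definition makes the keyword form almost free to write.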

That was quite fun and useful. Thank you Michael and Catalin for the feedback!

]]>
<![CDATA[The right time for every habit]]>I've written many times about various goals and plans I've had over the last couple of years. I blogged publicly about it because I had heard that precommitment helps with realizing goals. But, considering that many of the goals on those lists didn't get

]]>
https://rolisz.ro/2021/10/09/the-right-time-for-every-habit/616091a7e8b5e067fe3b8323Sat, 09 Oct 2021 09:34:50 GMT

I've written many times about various goals and plans I've had over the last couple of years. I blogged publicly about it because I had heard that precommitment helps with realizing goals. But, considering that many of the goals on those lists didn't get touched at all, even though I repeated them every year for three years, precommitment didn't give the promised results.

So since 2019, I haven't set big, public yearly goals, even though I still try to carve out some directions for myself every year.

But I am glad to report that despite focusing less on such goals, I've managed to create two habits that have been on my list for a long time, through very different means.

Working out


During the last 52 weeks (so 1 year basically), I have done 113 workouts, meaning just a bit more often than 2 times a week.

While I was working at Google, I managed to work out quite often, but there it was easier because I only had to go down 2 floors from my desk to get to the gym. After moving to Romania, I would have had to go to a gym. I had one attempt to do that, but my car didn't start just when I wanted to go, and it was raining so I couldn't go by bike, so I gave up. Then I bought barbells, dumbbells and a squat rack, but, for 2 years, I used them at most 10 times.

But last year in October, after much internal mulling, I realized I needed to change how I approach my body and my health. I needed to start prioritizing it, and not just in a "New Year's resolution" kind of way. My health has to be a priority, and that means other things have to go. For this, I had to learn how to work out and what exercises to do to achieve the outcomes that I wanted. It also meant closing my work laptop at 6 PM and getting into my gym clothes. Other times, instead of enjoying a slow morning, I would lift some weights and get sweaty. To make more efficient use of time, I would sometimes work out during the mid-week online church service (I still ask myself if that's good or bad).

The ongoing pandemic helped a bit, because I didn't travel so much, at most a couple of days here and there. But even then I would try to get in a bodyweight workout.

The conclusion remains the same: I need to prioritize my health, keeping an eye on the long term. I still have much work to do in this area, especially on the nutrition side, but even there I'm slowly learning how my body works and how I need to fuel it.

Memorizing Bible verses

Statistics from my Anki deck for Bible verses

This goal was on my list every year from 2016 to 2018 and every year I would utterly fail it. I would learn 1-2 verses and that's it.

But in 2019, someone from Brazil visited our church and told us how some young people from his church each memorized a book of the New Testament. And I decided, I'll do the same.

What had changed was that I had learned more about the science of memorization. I knew of people who had memorized vast amounts of information, so my first step was to find out the best way to do that. I put it into practice and I've never stopped. For two years I have been using Anki to memorize Bible verses. I have memorized all of Galatians and I'm halfway through Ephesians.

While I am convinced that this habit is very beneficial for my spiritual life, I think the key here was breaking the requirement into small pieces that can be done easily. There are very few days when this takes more than 10 minutes. It's something that I can easily do in bed, right before I go to sleep, or while waiting in a queue at a shop.

The other thing that helps is seeing how good my memory can become with enough practice. There are verses that Anki estimates I'll have to revisit only in 3 years. I notice that my short-term memory is better. Memorizing things is so useful that I have started memorizing other things as well, from my wife's phone number to PowerShell commands to my business's registration number.

More work to be done

Roland is still a work in progress. God still has much to do in my life. But I want to celebrate every win, even as I look forward to what I'll learn in the coming months.

]]>
<![CDATA[Opening Jupyter Notebooks in the right browser from WSL]]>I mentioned last year that I've slowly moved back to using Windows more and more. In the mean time, my transition is almost completely done. This year I've pretty much booted into ArchLinux to update it, about once a month and that's it. I

]]>
https://rolisz.ro/2021/09/23/opening-jupyter-notebooks-in-the-right-browser-from-wsl/614c740d56c059246e90e2f3Thu, 23 Sep 2021 19:10:57 GMT

I mentioned last year that I've slowly moved back to using Windows more and more. In the meantime, my transition is almost completely done. This year I've pretty much booted into ArchLinux only to update it, about once a month, and that's it. I am otherwise very happy with WSL1 for when I need to run Linux-only tools, such as auto-sklearn.

There was one small hiccup: when opening a Jupyter Notebook from WSL, it would try to open the notebook in the Linux environment, which is a CLI environment, so it opened in Lynx, not in the Firefox instance that runs on the Windows side of things. While Lynx is cute, it's not the most useful interface for a Jupyter Notebook.

Jupyter Notebook opening in Lynx

I could quit Lynx by pressing q and then CTRL-click on the link shown in the terminal, and Jupyter would open in Firefox. But hey, I'm a programmer and I don't want to do extra clicks. Today I learned how to fix this problem.

First, we need to tell WSL to use the browser from Windows. This can be done by setting the BROWSER environment variable to point to the location of Firefox in Windows, but with the path as seen by WSL:

 export BROWSER=/mnt/c/Program\ Files/Mozilla\ Firefox/firefox.exe

Running jupyter notebook after this will correctly open a window in Firefox, but it will open it with a Linux path towards a redirect file that does the authentication for Jupyter. Because Firefox runs in Windows, it can't access the path on the Linux side.

But there is a way to tell Jupyter to open the normal localhost links, not the ones that point to a local redirect file. For this, you have to create a Jupyter config (unless you already have one):

> jupyter notebook --generate-config
Writing default config to: /home/rolisz/.jupyter/jupyter_notebook_config.py

Then edit this file and change the use_redirect_file parameter to False (and uncomment it if needed), so that Jupyter prints the localhost link instead of the redirect-file path:

c.NotebookApp.use_redirect_file = False

From now on, running jupyter notebook in WSL will open properly!

]]>
<![CDATA[Half a year as an indie consultant]]>It's hard to believe it's been more than half a year since I started my own company and became an independent machine learning consultant. It's been a very interesting ride.

There have been plenty of moments where the predominant feeling was "what now?

]]>
https://rolisz.ro/2021/07/09/half-a-year-as-an-indie-consultant/60d58d48a0cb673c37e93d2bFri, 09 Jul 2021 14:57:00 GMT

It's hard to believe it's been more than half a year since I started my own company and became an independent machine learning consultant. It's been a very interesting ride.

There have been plenty of moments where the predominant feeling was "what now?". How am I going to find more clients? How do I negotiate with this client? The Dip, as Seth Godin calls it, is very real and very scary. When you tally up how much you've earned over six months... you start having serious doubts. Was it worth it? Wouldn't it have been better (and much easier) to just find a nice job?

But there are other moments: when I realize I have freedom to choose my clients and the projects that I work on; after working for a whole day on something that I love, ML, without any useless meetings; when deciding with almost complete freedom the tech stack which will be used to build the ML side of things; when I take a day off almost whenever I want, just because I don't feel like working on that particular project on that particular day. Or when I realize that I am a consultant, that my clients look to me for advice and that they actually take my advice seriously. If I say that the way they did things previously won't work and they should do things differently? They'll get to it right away.

And then there are moments when I realize I barely have time to read any state of the art machine learning papers and instead I have to learn the basics of marketing, branding, business development, communication, coaching, explaining, teaching - and to put all of this into practice. Most of my clients don't care if I'm using the latest state of the art Transformer architecture (and don't even know what on earth that is). They don't even know what machine learning is. But they need someone to explain it to them - to people who have built successful companies in their own fields - and to help them understand if it's something that they need or not.

I am thankful to God for guiding me on this new path, of which I have dreamed for a long time. Faith in his faithfulness is what has kept me steady when my knees wavered.

I am grateful to my dear wife who was willing to take this risk alongside me and has been very supportive all along the way.

I am very glad I have a good accountant who can help me with all the paperwork of the company.

I am grateful to the whole team from Oradea Tech Hub, who have helped me get my name out there, and especially to my friend David Achim with whom I did many rounds of business strategy discussions.

And I am thankful to many others who have cheered me on, who have encouraged me and who have put in a good word for me to potential clients.

]]>
<![CDATA[Happy 11th Birthday!]]>My blog has circled the Sun for another year. You got 37 more posts in the meantime. The Obsidian post was very popular, as was the Rust Codenames series. Vmmem issues are finding a solution on my blog as well.  The second half of last year was slower than

]]>
https://rolisz.ro/2021/06/08/happy-11th-birthday/60bfba22a3c3ed7839d6f32eTue, 08 Jun 2021 19:17:05 GMT

My blog has circled the Sun for another year. You got 37 more posts in the meantime. The Obsidian post was very popular, as was the Rust Codenames series. People are finding solutions to their Vmmem issues on my blog as well. The second half of last year was slower than the first one, but that's ok.

I kinda split my blog into two: personal posts stayed here, anything related to machine learning goes to my new domain, which is for my consulting business. I still want to post some technical content here and I do hope I'll make it to the front page of HN again :D

I haven't had as much time to write posts because I've been busy with all kinds of other content: an in-person machine learning course here in Oradea, and several presentations, some about machine learning, some about quick iteration, some local, some online. It turns out I only have so much creative juice in me every day.

I've resumed my goals to blog again, but at a much more humble rate. Sometimes I'm tempted to try daily blogging, but I'm a bit afraid of that commitment and of the quality of the posts that would result from that. Some people say that writing daily turns on the faucets of creativity and you'll have plenty of ideas. But for now I'll stick to a more reasonable goal of two posts per month.

]]>
<![CDATA[Working across multiple machines]]>Until this year, I usually had a laptop from my employer, on which I did work stuff and I had a personal desktop and laptop. The two personal devices got far too little usage coding wise, so I didn't really have a need to make sure I have

]]>
https://rolisz.ro/2021/05/19/working-across-multiple-machines/60a401e6a3c3ed7839d6f28eWed, 19 May 2021 10:00:11 GMT

Until this year, I usually had a laptop from my employer, on which I did work stuff, and I had a personal desktop and laptop. The two personal devices got far too little usage coding-wise, so I didn't really need to make sure I had access to the same files in both places.

But since becoming self-employed at the beginning of this year, I find myself using both the desktop and the laptop a lot more and I need to sync files between them. I go to work from a co-working space 2-3 days a week. Sometimes I go to have a meeting with a client at their office. My desktop has a GPU and is much more powerful, so when at home I strongly prefer to work from it, instead of from a laptop that gets thermal throttling pretty fast.

I could transfer code using GitHub, but I'd rather not have to do a WIP commit every time I get up from the desk. I also need to sync things like business files (PDFs) and machine learning models. The most common solution for this is to use Dropbox, OneDrive or something similar, but I would like to avoid sending all my files to a centralized service run by a big company.

Trying Syncthing again

I've tried using Syncthing in the past for backups, but it didn't work out at the time. Probably because it's not meant for backups. But it is meant for syncing files between devices!

I've been using Syncthing for this purpose for 3 months now and it just works™️. It does NAT punching really well and syncing is super speedy. I've had problems with files not showing up right away on my laptop only once and I'm pretty sure it was because my laptop's Wifi sometimes acts weird.

My setup

I have three devices talking to each other over Syncthing: my desktop, my laptop and my NAS. The NAS is there to be the always-on replica of my data, and it makes it easier to back things up. The desktop has the address of the NAS hardcoded because they are in the same LAN, but all the other devices use dynamic IP discovery to talk to each other.

I have several folders set up for syncing. Some of them go to all three devices, some of them are only between the desktop and the NAS.

For the programming folders I use ignore patterns generously: I don't sync virtual env folders or node_modules folders, because they usually don't play nice if they end up on a different device with different paths (or worse, a different OS). Because of this, I set up my environment on each device separately and only sync requirements.txt, then run pip install -r requirements.txt.
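For reference, a minimal .stignore along these lines (hypothetical contents, adjust to your own folders) keeps the usual suspects out of sync:

```
// .stignore in the root of the synced folder
// the (?d) prefix lets Syncthing delete these if they block a folder removal
(?d).venv
(?d)node_modules
__pycache__
*.pyc
```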

What do you use for synchronizing your workspace across devices? Do you have anything better than Syncthing?

]]>
<![CDATA[Productivity Tips: Time Blocks]]>As I've started my freelance machine learning consulting business this year, I found I need better ways to organize my time. When I was employed as a software engineer, there was a task board I would choose what to work on. The tasks would be mostly decided at

]]>
https://rolisz.ro/2021/04/04/productivity-tips/606a03ac88041c04f2008470Sun, 04 Apr 2021 19:51:13 GMT

As I've started my freelance machine learning consulting business this year, I've found I need better ways to organize my time. When I was employed as a software engineer, there was a task board from which I would choose what to work on. The tasks were mostly decided at the beginning of the sprint, so it was quite clear what to focus on most of the time. Of course, sometimes unexpected issues would come up, but usually those were urgent, so it was easy to decide to switch over to them.

But now, I have to juggle between working for different clients, talking to leads and doing marketing or administrative tasks. My to-do list just keeps growing longer and it's getting harder to pick something to work on. Should I write a new blog post? Should I work on a video? Should I do some exploratory data analysis for a client? Should I look into preparing an MLOps report for a client? Or maybe write a blog post so that my friends know I'm still alive?

Having to make this choice every time I want to start working is tiring, leading to choice paralysis. Often I have to work on 3-4 tasks a day, and if I context switch between them too often, my efficiency drops. So last month I started applying a variant of time blocking, which I read about from Cal Newport.

Blue events are meetings, green ones are time blocks

Instead of using a paper based method like he suggests, I create an event in Google Calendar when I want to block off some time. Ideally I schedule them the day before, but sometimes I either forget or something comes up and I have to change what I'll work on for the same day. I try to create blocks of one or two hours. Shorter blocks don't give you enough time to get immersed in deep work, while longer blocks are usually too tiring. I also make sure to leave some breaks between the time blocks.

I use a separate calendar so that I can easily toggle the visibility, leaving in the Calendar app only those events which have to take place at a given time (such as client meetings) and so that the time blocks don't interfere with Calendly, a meeting scheduling service I use.

I'm not very strict about the time blocks. If I find that I'm in the flow when a block ends, then I'll continue working on it. If something else is more urgent or I'm simply in a very strong mood for another task, I'll work on that and I'll simply move the calendar event to another time.

How do you organize your time and decide what to work on?

]]>
<![CDATA[Learning to machine learn]]>tl;dr: I'm launching an introductory course about machine learning in Romanian. It's aimed not just at developers, but at a more general audience.

For some time now I've been thinking about taking my content creation to the next level

]]>
https://rolisz.ro/2021/02/19/learning-to-machine-learn/602fdaddf2fbd3222b45f000Fri, 19 Feb 2021 15:51:29 GMT

tl;dr: I'm launching an introductory course about machine learning in Romanian. It's aimed not just at developers, but at a more general audience.

For some time now I've been thinking about taking my content creation to the next level. I've been blogging for 10 years and I love it. Some of the posts I've written about programming and machine learning were successful. So I thought I'd make a machine learning course.

There are plenty of machine learning resources online, courses of every kind. I learned from them myself, so there are good and even very good courses among them. But for a start, I'd like to make a course in Romanian, a language in which I don't think there are enough quality resources. Well, it will practically be in Romglish, since I can barely pronounce „învățare automată”; „machine learning” rolls off the tongue much more easily. Not to mention deep learning...

Another gap I've identified is that most courses are for programmers who write code every day and want to add the tool called machine learning to their toolbox. But there is a big lack of understanding of how machine learning and artificial intelligence work among managers and, why not, non-technical people. If you go only by what you read in the news, the Terminator scenario is just around the corner, when in reality all ML systems have big weaknesses that are easy to find.

This leads to unrealistic expectations from the management of some companies, who want to become more "hip" and use ML, but come with completely wrong ideas that can't be made to work well enough. I hope I can help such people too.

Many people believe you need very advanced technical knowledge to use artificial intelligence. But the barrier keeps getting lower, and applications are appearing even in creative fields, such as image or text generation, that can be used relatively easily once you understand the basic concepts.

If what you read above appeals to you, head over to the course page.

]]>
<![CDATA[Design patterns in real life]]>In programming there are so called design patterns, which are basically commonly repeated pieces of code that occur often enough that people thought it would be helpful to give them a name so that’s it’s easier to talk about them. One example is the iterator pattern,

]]>
https://rolisz.ro/2021/01/26/design-patterns-in-real-life/601075f3f896ad697fe0fe06Tue, 26 Jan 2021 20:07:52 GMT

In programming there are so-called design patterns, which are commonly repeated pieces of code that occur often enough that people thought it would be helpful to give them a name, so that it's easier to talk about them. One example is the iterator pattern, which describes an efficient method of traversing the elements of a container, whether it is an array, a hash table or something else. The builder pattern is used for building objects when we don't know all their required parameters upfront.
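As a minimal illustration of the builder pattern (a hypothetical example, not from any particular library), each method fills in one piece of configuration and returns the builder itself, so you only supply what you know:

```python
class ServerBuilder:
    """Accumulates configuration without requiring every parameter upfront."""

    def __init__(self):
        self._host = "localhost"
        self._port = 8080
        self._tls = False

    def host(self, host):
        self._host = host
        return self  # returning self enables method chaining

    def port(self, port):
        self._port = port
        return self

    def with_tls(self):
        self._tls = True
        return self

    def build(self):
        # Only here is the final, complete object assembled
        return {"host": self._host, "port": self._port, "tls": self._tls}

server = ServerBuilder().host("example.com").with_tls().build()
print(server)  # {'host': 'example.com', 'port': 8080, 'tls': True}
```

The chained calls read almost like a sentence, which is exactly why the pattern earned a name.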

Sometimes, if you don’t know about a pattern and you read code that uses it, it might seem strange. Why is this extra layer of abstraction here? Why is this API broken down into these pieces? After learning about the pattern, you might realize that the extra layer of abstraction is needed because the layer below it changes often, or that the API is broken into those specific pieces because this makes it easy to cover more use cases efficiently.

As I’ve started diving head first into the world of running my own consulting business, I’m starting to learn about a whole other world of “design patterns”, unrelated to programming. And suddenly many things that I’ve seen before started to make sense.

My friend David has been bugging me for almost two years to start a community for people passionate about machine learning in Oradea, where I live. For a long time I wondered: why does he push so much for this? Well, after taking Seth Godin’s Freelancer Workshop, now I know that being the person who organizes a community is one of the best ways to make yourself known.

Another example: I once saw a website offering a sort of business networking service for a very high membership cost (or at least it seemed expensive at the time). Why would anyone pay that? Then I learned about something called an alchemy network, and how, if it's done well, it can bring great value to its members.

All my friends who are freelancers charge by the hour. That's what I thought was normal. But then I heard about value-based pricing from Jonathan Stark: a different pricing "design pattern", which aligns the incentives of the client and of the service provider in a much better way. Let's see if I can pull it off though.

Just like in programming, design patterns help us find the correct solution faster and communicate more efficiently. The more patterns you know, the faster you can recognize a situation and react better to it.

What are your favorite design patterns?

]]>
<![CDATA[How to ML - Monitoring]]>As much as machine learning developers like to think that once they've got a good enough model, the job is done, it's not quite so.

The first couple of weeks after deployment are critical. Is the model really as good as offline tests said it is?

]]>
https://rolisz.ro/2021/01/22/how-to-ml-monitoring/600b34dcf896ad697fe0fdf3Fri, 22 Jan 2021 20:29:24 GMT

As much as machine learning developers like to think that once they've got a good enough model, the job is done, it's not quite so.

The first couple of weeks after deployment are critical. Is the model really as good as offline tests said it is? Maybe something is different in production than in all your test data. Maybe the data you collected for offline training includes pieces of information that are not available at inference time. For example, suppose you're trying to predict click-through rates for items in a list and use that to rank the items. When building the training dataset, it's easy to include each item's rank in the data, but the model won't have that value when making predictions, because the rank is exactly what you're trying to infer. Surprise: the model will perform very poorly in production.
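The leak described above can be sketched in a few lines. The column names are made up for illustration; the point is that any feature computed downstream of the prediction must be stripped before training.

```python
# Hypothetical training rows for a ranking model. "rank" is the item's
# display position — it only exists *after* ranking, so using it as an
# input feature is exactly the leak described above.
training_rows = [
    {"item_id": 1, "rank": 1, "clicks": 120, "impressions": 1000},
    {"item_id": 2, "rank": 2, "clicks": 40, "impressions": 900},
]

LEAKY = {"rank"}  # columns unavailable at inference time

def to_features(row):
    # drop leaky columns and identifiers before training
    return {k: v for k, v in row.items()
            if k not in LEAKY and k != "item_id"}

features = [to_features(r) for r in training_rows]
```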

Or maybe simply A/B testing reveals that the fancy ML model doesn't really perform better in production than the old rules written with lots of elbow grease by lots of developers and business analysts, using lots of domain knowledge and years of experience.

But even if the model does well at the beginning, will it continue to do so? Maybe there will be an external change in user behavior and they will start searching for other kinds of queries, which your model was not developed for. Or maybe your model will introduce a "positive" feedback loop: it suggests some items, users click on them, so those items get suggested more often, so more users click on them. This leads to a "rich get richer" kind of situation, but the algorithm is actually not making better and better suggestions.

Maybe you are on top of this and you keep retraining your model weekly to keep it in step with user behavior. But then you need to have a staggered release of the model, to make sure that the new one is really performing better across all relevant dimensions. Is inference speed still good enough? Are predictions relatively stable, meaning we don't recommend only action movies one week and then only comedies next week? Are models even comparable from one week to another or is there a significant random component to them which makes it really hard to see how they improved? For example, how are the clusters from the user post data built up? K-means starts with random centroids and clusters from one run have only passing similarity to the ones from another run. How will you deal with that?
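One way to deal with the arbitrary-labeling problem mentioned above is to compare the partitions that two clusterings induce, rather than the raw labels, since k-means numbers its clusters arbitrarily from run to run. A minimal sketch, with toy label lists standing in for real k-means output:

```python
# Compare two clusterings while ignoring arbitrary cluster numbering:
# convert each labeling into the set of index groups it induces.
def partition(labels):
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(idx)
    return {frozenset(g) for g in groups.values()}

run_a = [0, 0, 1, 1, 2]  # labels from this week's retrained model
run_b = [2, 2, 0, 0, 1]  # same grouping, different centroid numbering

same_clustering = partition(run_a) == partition(run_b)  # True
```

For real-sized data you would use a label-invariant score (e.g. adjusted Rand index) instead of exact equality, but the idea is the same.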

]]>
<![CDATA[GPT-3 and AGI]]>One of the most impressive/controversial papers from 2020 was GPT-3 from OpenAI. It's nothing particularly new, it's mostly a bigger version of GPT-2, which came out in 2019. It's a much bigger version, being by far the largest machine learning model at the

]]>
https://rolisz.ro/2021/01/21/gpt3-agi/5f3517c94f71eb12e0abb8bfThu, 21 Jan 2021 20:13:00 GMT

One of the most impressive/controversial papers from 2020 was GPT-3 from OpenAI. It's nothing particularly new, it's mostly a bigger version of GPT-2, which came out in 2019. It's a much bigger version, being by far the largest machine learning model at the time it was released, with 175 billion parameters.

It's a fairly simple algorithm: it learns to predict the next word in a text[1]. It learns to do this by training on several hundred gigabytes of text gathered from the Internet. To use it, you give it a prompt (a starting sequence of words) and it starts generating more words, until eventually it decides to finish the text by emitting a stop token.
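The generation loop described above can be sketched in a few lines. The `next_token` function here is a toy stand-in (the real model scores the whole vocabulary and samples from it); the surrounding autoregressive loop is the point:

```python
# A minimal sketch of autoregressive generation with a stop token.
STOP = "<eos>"

def next_token(tokens):
    # toy stand-in for the model: emit a canned continuation, then stop
    canned = ["world", STOP]
    return canned[min(len(tokens) - 1, len(canned) - 1)]

def generate(prompt_tokens, max_len=16):
    tokens = list(prompt_tokens)
    while len(tokens) < max_len:
        tok = next_token(tokens)
        if tok == STOP:  # the model decides the text is finished
            break
        tokens.append(tok)  # the output is fed back in as input
    return tokens

out = generate(["hello"])  # → ["hello", "world"]
```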

Using this seemingly stupid approach, GPT-3 is capable of generating a wide variety of interesting texts: it can write poems (not prize winning, but still), write news articles, imitate other well-known authors, make jokes, argue for its self-awareness, do basic math and, shockingly to programmers all over the world, who are now afraid the robots will take their jobs, it can code simple programs.

That's amazing for such a simple approach. The internet was divided upon seeing these results. Some were welcoming our GPT-3 AI overlords, while others were skeptical, calling it just fancy parroting, without a real understanding of what it says.

I think both sides have a grain of truth. On one hand, it's easy to find failure cases, make it say things like "a horse has five legs" and so on, where it shows it doesn't really know what a horse is. But are humans that different? Think of a small child who is being taught by his parents to say "Please" before his requests. I remember being amused by a small child saying "But I said please" when he was refused by his parents. The kid probably thought that "Please" is a magic word that can unlock anything. Well, not really: in real life we use it because society likes polite people, but saying please when wishing for a unicorn won't make it any more likely to happen.

And it's not just little humans who do that. Sometimes even grownups parrot stuff without thinking about it, because that's what they heard all their life and they never questioned it. It actually takes a lot of effort to think, to ensure consistency in your thoughts and to produce novel ideas. In this sense, expecting an artificial intelligence that is around human level might be a disappointment.

On the other hand, I believe there is a reason why this amazing result happened in the field of natural language processing and not say, computer vision. It has been long recognized that language is a powerful tool, there is even a saying about it: "The pen is mightier than the sword". Human language is so powerful that we can encode everything that there is in this universe into it, and then some (think of all the sci-fi and fantasy books). More than that, we use language to get others to do our bidding, to motivate them, to cooperate with them and to change their inner state, making them happy or inciting them to anger.

While there is a common ground in the physical world, often that is not very relevant to the point we are making: "A rose by any other name would smell as sweet". Does it matter what a rose is when the rallying call is to get more roses? As long as the message gets across and is understood in the same way by all listeners, no, it doesn’t. Similarly, if GPTx can effect the desired change in its readers, it might be good enough, even if it doesn't have a mythical understanding of what those words mean.


Technically, the next byte-pair-encoded token ↩︎

]]>
<![CDATA[How to ML - Deploying]]>So the ML engineer presented the model to the business stakeholders and they agreed that it performed well enough on the key metrics in testing that it's time to deploy it to production.

So now we have to make sure the model runs reliably in production. We have

]]>
https://rolisz.ro/2021/01/20/how-to-ml-deploying/60084bc7165bd14e3b33595dWed, 20 Jan 2021 15:28:54 GMT

So the ML engineer presented the model to the business stakeholders and they agreed that it performed well enough on the key metrics in testing that it's time to deploy it to production.

So now we have to make sure the model runs reliably in production. We have to answer some more questions, in order to make some trade-offs.

How important is latency? Is the model making an inference in response to a user action, so it's crucial to have the answer in tens of milliseconds? Then it's time to optimize the model: quantize weights, distill knowledge into a smaller model, prune weights and so on. Hopefully, your metrics won't go down due to the optimization.
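As a rough illustration of one of these optimizations, here is what the simplest form of post-training weight quantization looks like: store float32 weights as int8 values plus a scale factor, giving roughly 4x smaller storage and faster integer arithmetic. This is a back-of-the-envelope sketch, not a production quantizer:

```python
# Symmetric int8 quantization of a weight vector: map the float range
# [-max|w|, +max|w|] onto integers in [-127, 127].
def quantize(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]  # int8-range values
    return q, scale

def dequantize(q, scale):
    # approximate reconstruction used at inference time
    return [v * scale for v in q]

w = [0.5, -1.27, 0.03]
q, scale = quantize(w)
w_approx = dequantize(q, scale)  # close to w, within one scale step
```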

Can the results be precomputed? For example, if you want to make movie recommendations, maybe there can be a batch job that runs every night that does the inference for every user and stores the results in a database. Then when the user makes a request, the recommendations are simply loaded from the database, which is fast. This is possible only if you have a finite range of predictions to make.
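The precompute-then-lookup pattern above can be sketched like this; `recommend_for` is a placeholder for the real (slow) model inference, and a dict stands in for the database:

```python
# Nightly batch job fills a store keyed by user; request time is a read.
def recommend_for(user_id):
    # placeholder for expensive model inference
    return [f"movie_{user_id}_{i}" for i in range(3)]

def nightly_batch_job(user_ids, store):
    for uid in user_ids:  # runs offline, latency doesn't matter here
        store[uid] = recommend_for(uid)

def handle_request(user_id, store):
    # fast path: a single key lookup, no model in the loop
    return store.get(user_id, [])

store = {}
nightly_batch_job([1, 2], store)
recs = handle_request(1, store)
```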

Where are you running the model? On big beefy servers with a GPU? On mobile devices, which are much less powerful? Or on some edge devices that don't even have an OS? Depending on the answer, you might have to convert the model to a different format or optimize it to be able to fit in memory.

Even in the easy case where you are running the model on servers and latency can be several seconds, you still have to do the whole dance of making it work there. "Works on my machine" is all too often a problem. Maybe production runs a different version of Linux, which has a different BLAS library, and the security team won't let you update things. Simple, just use Docker, right? Right, better hope you are good friends with the DevOps team, who can help you out with setting up the CI/CD pipelines.

But you've killed all the dragons, now it's time to keep watch... aka monitoring the model's performance in production.

]]>