<![CDATA[rolisz's blog]]>https://rolisz.ro/https://rolisz.ro/favicon.pngrolisz's bloghttps://rolisz.ro/Ghost 4.18Tue, 16 Nov 2021 19:52:15 GMT60<![CDATA[Learning in Public: Exploring the BT iPay API]]>Dan Luu and Jamie Brandon have argued quite successfully that increasing your productivity and velocity as a developer can lead to good return on investment. So I've been thinking about doing the same and I was inspired by Michael Lynch to record myself while coding and then to

]]>
https://rolisz.ro/2021/11/16/learning-in-public-exploring-the-bt-ipay-api/6193efe8bb23f8154861688cTue, 16 Nov 2021 19:09:28 GMT

Dan Luu and Jamie Brandon have argued quite successfully that increasing your productivity and velocity as a developer can lead to good return on investment. So I've been thinking about doing the same and I was inspired by Michael Lynch to record myself while coding and then to analyze the mistakes I've made.

That was quite fun and I got some very useful feedback from it, so I thought I'd share this video, as a way of learning in public.

First lesson: my green screen doesn't play nice with my IKEA chair, which has a mesh back. My apologies for the awful-looking webcam overlay.

And some development lessons:

  • I went to the menu with the mouse several times to select "Format code". I should learn the shortcut for that or, even better, set PyCharm to auto-format the file on save.
  • I spent a lot of time figuring out the parameters for the first call and formatting them as a dictionary. I could have sped up understanding how to send the parameters by URL-decoding the provided example, and for the formatting I could have used multiple cursors for quicker editing.
  • BurpSuite/HTTP Toolkit was recommended as a way to explore HTTP APIs.
  • When I was writing the client.register_payment call I had to hover a lot over the function definition to see the order of parameters. Copying the function definition would have been faster and it would have made it easier to define keyword arguments, which are clearer for a function with so many parameters.
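The URL-decoding shortcut mentioned in the list above takes only a few lines in Python. The parameter names below are made up for illustration; they are not BT iPay's actual API fields:

```python
from urllib.parse import parse_qs, urlencode

# A provider's URL-encoded sample request (hypothetical parameter names)
example = "userName=test_user&amount=1000&currency=946&returnUrl=https%3A%2F%2Fexample.com%2Fdone"

# parse_qs returns {name: [values]}; flatten to a plain dict for single values
params = {k: v[0] for k, v in parse_qs(example).items()}
assert params["returnUrl"] == "https://example.com/done"

# urlencode goes the other way, rebuilding the encoded query string
assert "returnUrl=https%3A%2F%2Fexample.com%2Fdone" in urlencode(params)
```

This turns the provider's example straight into the dictionary a Python HTTP client expects, instead of transcribing parameters by hand.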

That was quite fun and useful. Thank you Michael and Catalin for the feedback!

]]>
<![CDATA[The right time for every habit]]>I've written many times about various goals and plans I've had over the last couple of years. I blogged publicly about it because I had heard that precommitment helps with realizing goals. But, considering that many of the goals on those lists didn't get

]]>
https://rolisz.ro/2021/10/09/the-right-time-for-every-habit/616091a7e8b5e067fe3b8323Sat, 09 Oct 2021 09:34:50 GMT

I've written many times about various goals and plans I've had over the last couple of years. I blogged publicly about it because I had heard that precommitment helps with realizing goals. But, considering that many of the goals on those lists didn't get touched at all, even though I repeated them every year for three years, precommitment didn't give the promised results.

So since 2019, I haven't set big, public yearly goals, even though I still try to carve out some directions for myself every year.

But I am glad to report that despite focusing less on such goals, I've managed to create two habits that have been on my list for a long time, through very different means.

Working out

The right time for every habit

During the last 52 weeks (so 1 year basically), I have done 113 workouts, meaning just a bit more often than 2 times a week.

While I was working at Google, I managed to work out quite often, but it was easier there because I only had to go down 2 floors from my desk to get to the gym. After moving to Romania, I would have had to go to a gym. I made one attempt to do that, but my car wouldn't start right when I wanted to go, and it was raining so I couldn't go by bike, so I gave up. Then I bought barbells, dumbbells and a squat rack but, for 2 years, I used them at most 10 times.

But last year in October, after much internal mulling, I realized I needed to change how I approach my body and my health. I needed to start prioritizing it, and not just in a "New Year's resolution" kind of way. My health has to be a priority, and that means other things have to go. For this, I had to learn how to work out and what exercises to do to achieve the outcomes I wanted. It also meant closing my work laptop at 6 PM and getting into my gym clothes. Other times, instead of enjoying a slow morning, I would lift some weights and get sweaty. To make more efficient use of time, I would sometimes work out during the mid-week online church service (I still ask myself if that's good or bad).

The ongoing pandemic helped a bit, because I didn't travel so much, at most a couple of days here and there. But even then I would try to get in a bodyweight workout.

The conclusion remains the same: I need to prioritize my health, keeping an eye on the long term. I still have much work to do in this area, especially on the nutrition side, but even there I'm slowly learning how my body works and how I need to fuel it.

Memorizing Bible verses

The right time for every habit
Statistics from my Anki deck for Bible verses

This goal was on my list every year from 2016 to 2018 and every year I would utterly fail it. I would learn 1-2 verses and that's it.

But in 2019, someone from Brazil visited our church and told us how some young people from his church had each memorized a book of the New Testament. And I decided I'd do the same.

What had changed was that I had learned more about the science of memorization. I knew of people who had memorized vast amounts of information, so my first step was to find out the best way to do that. I put it into practice and I've never stopped: for two years I have been using Anki to memorize Bible verses. I have memorized all of Galatians and I'm halfway through Ephesians.

While I am convinced that this habit is very beneficial for my spiritual life, I think the key here was breaking the requirement into small pieces that can be done easily. There are very few days when this takes more than 10 minutes. It's something that I can easily do in bed, right before I go to sleep. Or I can do while waiting in a queue at a shop.

The other thing that helps is seeing how good my memory can become with enough practice. There are verses that Anki estimates I'll have to revisit only in 3 years. I've noticed that my short-term memory is better. Memorizing is so useful that I have started memorizing other things as well, from my wife's phone number to PowerShell commands to my business's registration number.

More work to be done

Roland is still a work in progress. God still has much to do in my life. But I want to celebrate every win, even as I look forward to what I'll learn in the coming months.

]]>
<![CDATA[Opening Jupyter Notebooks in the right browser from WSL]]>I mentioned last year that I've slowly moved back to using Windows more and more. In the mean time, my transition is almost completely done. This year I've pretty much booted into ArchLinux to update it, about once a month and that's it. I

]]>
https://rolisz.ro/2021/09/23/opening-jupyter-notebooks-in-the-right-browser-from-wsl/614c740d56c059246e90e2f3Thu, 23 Sep 2021 19:10:57 GMT

I mentioned last year that I've slowly moved back to using Windows more and more. In the meantime, my transition is almost completely done. This year I've pretty much booted into ArchLinux only to update it, about once a month, and that's it. I am otherwise very happy with WSL1 for when I need to run Linux-only tools, such as auto-sklearn.

There was one small hiccup: when opening a Jupyter notebook from WSL, it would try to open it in the Linux environment, which is command-line only, so it opened in Lynx rather than in the Firefox instance running on the Windows side of things. While Lynx is cute, it's not the most useful interface for a Jupyter notebook.

Opening Jupyter Notebooks in the right browser from WSL
Jupyter Notebook opening in Lynx

I could quit Lynx by pressing q, then Ctrl-click on the link shown in the terminal, and Jupyter would open in Firefox. But hey, I'm a programmer and I don't want to do extra clicks. Today I learned how to fix this problem.

First, we need to tell WSL to use the browser from Windows. This can be done by setting the BROWSER environment variable to point to the location of Firefox in Windows, but with the path as seen by WSL:

 export BROWSER=/mnt/c/Program\ Files/Mozilla\ Firefox/firefox.exe

Running jupyter notebook after this will correctly open a window in Firefox, but it will open it with a Linux path to a redirect file that handles authentication for Jupyter. Because Firefox runs in Windows, it can't access that path on the Linux side.

But there is a way to tell Jupyter to open the normal localhost links, not the ones that point to a local redirect file. For this, you have to create a Jupyter config (unless you already have one):

> jupyter notebook --generate-config
Writing default config to: /home/rolisz/.jupyter/jupyter_notebook_config.py

Then edit this file and change the use_redirect_file parameter to False (and uncomment it if needed):

c.NotebookApp.use_redirect_file = False

From now on, running jupyter notebook in WSL will open properly!

]]>
<![CDATA[Half a year as an indie consultant]]>It's hard to believe it's been more than half a year since I started my own company and became an independent machine learning consultant. It's been a very interesting ride.

There have been plenty of moments where the predominant feeling was "what now?

]]>
https://rolisz.ro/2021/07/09/half-a-year-as-an-indie-consultant/60d58d48a0cb673c37e93d2bFri, 09 Jul 2021 14:57:00 GMT

It's hard to believe it's been more than half a year since I started my own company and became an independent machine learning consultant. It's been a very interesting ride.

There have been plenty of moments where the predominant feeling was "what now?". How am I going to find more clients? How do I negotiate with this client? The Dip, as Seth Godin calls it, is very real and very scary. When you tally it all up and see how much you've earned over six months... you start having serious doubts. Was it worth it? Wouldn't it have been better (and much easier) to just find a nice job?

But there are other moments: when I realize I have freedom to choose my clients and the projects that I work on; after working for a whole day on something that I love, ML, without any useless meetings; when deciding with almost complete freedom the tech stack which will be used to build the ML side of things; when I take a day off almost whenever I want, just because I don't feel like working on that particular project on that particular day. Or when I realize that I am a consultant, that my clients look to me for advice and that they actually take my advice seriously. If I say that the way they did things previously won't work and they should do things differently? They'll get to it right away.

And then there are moments when I realize I barely have time to read any state of the art machine learning papers and instead I have to learn the basics of marketing, branding, business development, communication, coaching, explaining, teaching - and to put all of this into practice. Most of my clients don't care if I'm using the latest state of the art Transformer architecture (and don't even know what on earth that is). They don't even know what machine learning is. But they need someone to explain it to them - to people who have built successful companies in their own fields - and to help them understand if it's something that they need or not.

I am thankful to God for guiding me on this new path, of which I have dreamed for a long time. Faith in his faithfulness is what has kept me steady when my knees wavered.

I am grateful to my dear wife who was willing to take this risk alongside me and has been very supportive all along the way.

I am very glad I have a good accountant who can help me with all the paperwork of the company.

I am grateful to the whole team from Oradea Tech Hub, who have helped me get my name out there, and especially to my friend David Achim with whom I did many rounds of business strategy discussions.

And I am thankful to many others who have cheered me on, who have encouraged me and who have put in a good word for me to potential clients.

]]>
<![CDATA[Happy 11th Birthday!]]>My blog has circled the Sun for another year. You got 37 more posts in the meantime. The Obsidian post was very popular, as was the Rust Codenames series. Vmmem issues are finding a solution on my blog as well.  The second half of last year was slower than

]]>
https://rolisz.ro/2021/06/08/happy-11th-birthday/60bfba22a3c3ed7839d6f32eTue, 08 Jun 2021 19:17:05 GMT

My blog has circled the Sun for another year. You got 37 more posts in the meantime. The Obsidian post was very popular, as was the Rust Codenames series. People are even finding solutions to their Vmmem issues on my blog. The second half of last year was slower than the first one, but that's okay.

I kinda split my blog into two: personal posts stayed here, anything related to machine learning goes to my new domain, which is for my consulting business. I still want to post some technical content here and I do hope I'll make it to the front page of HN again :D

I haven't had as much time to write posts because I've been busy with all kinds of other content: an in-person machine learning course here in Oradea, and several presentations, some about machine learning, some about quick iteration, some local, some online. It turns out I only have so much creative juice in me every day.

I've resumed my goals to blog again, but at a much more humble rate. Sometimes I'm tempted to try daily blogging, but I'm a bit afraid of that commitment and of the quality of the posts that would result from that. Some people say that writing daily turns on the faucets of creativity and you'll have plenty of ideas. But for now I'll stick to a more reasonable goal of two posts per month.

]]>
<![CDATA[Working across multiple machines]]>Until this year, I usually had a laptop from my employer, on which I did work stuff and I had a personal desktop and laptop. The two personal devices got far too little usage coding wise, so I didn't really have a need to make sure I have

]]>
https://rolisz.ro/2021/05/19/working-across-multiple-machines/60a401e6a3c3ed7839d6f28eWed, 19 May 2021 10:00:11 GMT

Until this year, I usually had a laptop from my employer, on which I did work stuff and I had a personal desktop and laptop. The two personal devices got far too little usage coding wise, so I didn't really have a need to make sure I have access to the same files on both places.

But since becoming self-employed at the beginning of this year, I find myself using both the desktop and the laptop a lot more and I need to sync files between them. I go to work from a co-working space 2-3 days a week. Sometimes I go to have a meeting with a client at their office. My desktop has a GPU and is much more powerful, so when at home I strongly prefer to work from it, instead of from a laptop that gets thermal throttling pretty fast.

I could transfer code using GitHub, but I'd rather not have to make a WIP commit every time I get up from the desk. I also need to sync things like business files (PDFs) and machine learning models. The most common solution for this is Dropbox, OneDrive or something similar, but I would like to avoid sending all my files to a centralized service run by a big company.

Trying Syncthing again

I've tried using Syncthing in the past for backups, but it didn't work out at the time. Probably because it's not meant for backups. But it is meant for syncing files between devices!

I've been using Syncthing for this purpose for 3 months now and it just works™️. It does NAT punching really well and syncing is super speedy. I've had problems with files not showing up right away on my laptop only once and I'm pretty sure it was because my laptop's Wifi sometimes acts weird.

My setup

I have three devices talking to each other on Syncthing: my desktop, my laptop and my NAS. The NAS is there to be the always-on replica of my data, and it makes it easier to back things up. The desktop has the address of the NAS hardcoded because they are on the same LAN, but all the other devices use dynamic IP discovery to talk to each other.

I have several folders set up for syncing. Some of them go to all three devices, some of them are only between the desktop and the NAS.

For the programming folders I use ignore patterns generously: I don't sync virtual env folders or node_modules folders, because they usually don't play nice if they end up on a different device with different paths (or worse, a different OS). Because of this, I set up my environment on each device separately and only sync requirements.txt, then run pip install -r requirements.txt.
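For reference, Syncthing reads these patterns from a .stignore file at the root of each synced folder, one pattern per line. A minimal sketch covering the cases above (the folder names are just examples from my setup) might look like:

```
// .stignore — patterns are relative to the synced folder root
(?d)node_modules
(?d).venv
__pycache__
*.pyc
```

The (?d) prefix tells Syncthing it may delete those entries when they would otherwise block removing a directory.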

What do you use for synchronizing your workspace across devices? Do you have anything better than Syncthing?

]]>
<![CDATA[Productivity Tips: Time Blocks]]>As I've started my freelance machine learning consulting business this year, I found I need better ways to organize my time. When I was employed as a software engineer, there was a task board I would choose what to work on. The tasks would be mostly decided at

]]>
https://rolisz.ro/2021/04/04/productivity-tips/606a03ac88041c04f2008470Sun, 04 Apr 2021 19:51:13 GMT

As I've started my freelance machine learning consulting business this year, I found I need better ways to organize my time. When I was employed as a software engineer, there was a task board from which I would choose what to work on. The tasks were mostly decided at the beginning of the sprint, so it was quite clear what to focus on most of the time. Of course, sometimes unexpected issues would come up, but those are usually urgent, so it's easy to decide to switch over to them.

But now, I have to juggle between working for different clients, talking to leads and doing marketing or administrative tasks. My to-do list just keeps growing longer and it's getting harder to pick something to work on. Should I write a new blog post? Should I work on a video? Should I do some exploratory data analysis for a client? Should I look into preparing an MLOps report for a client? Or maybe write a blog post so that my friends know I'm still alive?

Having to make a choice about this every time I want to start working is tiring and leads to choice paralysis. I often have to work on 3-4 tasks a day, and if I context switch between them too often, my efficiency drops. So last month I started applying a variant of time blocking, which I read about from Cal Newport.

Productivity Tips: Time Blocks
Blue events are meetings, green ones are time blocks

Instead of using a paper based method like he suggests, I create an event in Google Calendar when I want to block off some time. Ideally I schedule them the day before, but sometimes I either forget or something comes up and I have to change what I'll work on for the same day. I try to create blocks of one or two hours. Shorter blocks don't give you enough time to get immersed in deep work, while longer blocks are usually too tiring. I also make sure to leave some breaks between the time blocks.

I use a separate calendar so that I can easily toggle the visibility, leaving in the Calendar app only those events which have to take place at a given time (such as client meetings) and so that the time blocks don't interfere with Calendly, a meeting scheduling service I use.

I'm not very strict about the time blocks. If I find that I'm in the flow when a block ends, then I'll continue working on it. If something else is more urgent or I'm simply in a very strong mood for another task, I'll work on that and I'll simply move the calendar event to another time.

How do you organize your time and decide what to work on?

]]>
<![CDATA[Learning to machine learn]]>tl;dr: I'm launching an introductory course about machine learning in Romanian. It's aimed not just at developers, but at a more general audience.

For some time now I've been thinking about taking my content creation to the next level.

]]>
https://rolisz.ro/2021/02/19/learning-to-machine-learn/602fdaddf2fbd3222b45f000Fri, 19 Feb 2021 15:51:29 GMT

tl;dr: I'm launching an introductory course about machine learning in Romanian. It's aimed not just at developers, but at a more general audience.

For some time now I've been thinking about taking my content creation to the next level. I've been blogging for 10 years and I enjoy it. Some of the posts I've written about programming and machine learning have done well. So I thought I'd make a machine learning course.

There are plenty of machine learning resources online, courses of every kind. I've learned from them myself, so there are good and even excellent courses among them. But to start, I'd like to make a course in Romanian, where I don't think there are enough quality resources. Well, it will practically be Romglish, since I can barely pronounce "învățare automată"; "machine learning" rolls off the tongue much better. Not to mention deep learning...

Another gap I've identified is that most courses are aimed at programmers who write code every day and want to add the tool called machine learning to their toolbox. But there is a big lack of understanding of how machine learning and artificial intelligence work among managers and, why not, non-technical people in general. If you go only by what you read in the news, the Terminator scenario is just around the corner, when in reality all ML systems have big weaknesses that are easy to find.

This leads to unrealistic expectations from the management of some companies, who want to become more "hip" and use ML, but come up with completely wrong ideas that can't be made to work well enough. I hope I can help people like that as well.

Many people believe you need very advanced technical knowledge to use artificial intelligence. But the barrier keeps getting lower, and applications are appearing even in creative fields, such as image or text generation, that can be used relatively easily once you understand the basic concepts.

If you like what you've read above, check out the course page.

]]>
<![CDATA[Design patterns in real life]]>In programming there are so called design patterns, which are basically commonly repeated pieces of code that occur often enough that people thought it would be helpful to give them a name so that’s it’s easier to talk about them. One example is the iterator pattern,

]]>
https://rolisz.ro/2021/01/26/design-patterns-in-real-life/601075f3f896ad697fe0fe06Tue, 26 Jan 2021 20:07:52 GMT

In programming there are so-called design patterns: commonly repeated pieces of code that occur often enough that people thought it would be helpful to give them names, so that it's easier to talk about them. One example is the iterator pattern, which describes an efficient method for traversing the elements of a container, whether it's an array, a hash table or something else. The builder pattern is used for building objects when we don't know all their required parameters upfront.
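As a minimal sketch of the two patterns named above, in Python (the class and method names here are made up for illustration):

```python
class ReportBuilder:
    """Builder: collect parameters step by step, construct the object at the end."""

    def __init__(self):
        self._parts = []

    def add_section(self, title):
        self._parts.append(title)
        return self  # returning self enables method chaining

    def build(self):
        return {"sections": list(self._parts)}

report = ReportBuilder().add_section("Intro").add_section("Results").build()
assert report == {"sections": ["Intro", "Results"]}

# Iterator: Python bakes the pattern into the language. Any object implementing
# __iter__/__next__ can be traversed the same way, regardless of container type:
assert list(iter({"a": 1})) == ["a"]   # a dict iterates over its keys
assert list(iter([1, 2])) == [1, 2]    # a list iterates over its elements
```

The caller never sees the half-built state of the report, and the same for-loop syntax walks any container: that uniformity is the whole point of naming the pattern.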

Sometimes, if you don’t know about a pattern and you read code that uses it, it might seem strange. Why is this extra layer of abstraction here? Why is this API broken down into these pieces? After learning about the pattern, you might learn that the extra layer of abstraction is needed because the layer that’s below changes often. Or that the API is broken into those specific pieces because this makes it easy to cover more use cases in an efficient way.

As I’ve started diving head first into the world of running my own consulting business, I’m starting to learn about a whole other world of “design patterns”, unrelated to programming. And suddenly many things that I’ve seen before started to make sense.

My friend David has been bugging me to start a community for people passionate about machine learning in Oradea, where I live, for almost two years. For a long time I was thinking, why does he push so much for this? Well, after taking Seth Godin’s Freelancer Workshop, now I know that being the person who organizes a community is one of the best ways to make yourself known.

Another example: I saw a website offering a sort of business networking club for a very high membership fee (or at least it seemed expensive at the time). Why would anyone pay that? Then I learned about something called an alchemy network and how, done well, it can bring great value to its members.

All my friends who are freelancers charge by the hour. That's what I thought was normal. But then I heard about value-based pricing from Jonathan Stark: a different pricing "design pattern", one which aligns the incentives of the client and of the service provider much better. Let's see if I can pull it off, though.

Just like in programming, design patterns help us find the correct solution faster and communicate more efficiently. The more patterns you know, the faster you can recognize a situation and react better to it.

What are your favorite design patterns?

]]>
<![CDATA[How to ML - Monitoring]]>As much as machine learning developers like to think that once they've got a good enough model, the job is done, it's not quite so.

The first couple of weeks after deployment are critical. Is the model really as good as offline tests said they are?

]]>
https://rolisz.ro/2021/01/22/how-to-ml-monitoring/600b34dcf896ad697fe0fdf3Fri, 22 Jan 2021 20:29:24 GMT

As much as machine learning developers like to think that once they've got a good enough model, the job is done, it's not quite so.

The first couple of weeks after deployment are critical. Is the model really as good as the offline tests said it was? Maybe something is different in production than in all your test data. Maybe the data you collected for offline training includes pieces of information that are not available at inference time. For example, if you're trying to predict click-through rates for items in a list and use that to rank the items, it's easy to include the rank of the item when building the training dataset, but the model won't have the rank when making predictions, because the ranking is exactly what it's being used to produce. Surprise: the model will perform very poorly in production.
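The guard against that kind of leakage can be sketched in a few lines (the feature names below are hypothetical): build training rows only from the fields that will actually exist at inference time.

```python
# Features the model will actually see when making predictions.
SERVING_FEATURES = ["price", "category", "past_ctr"]

def make_training_row(logged_item: dict) -> dict:
    # The logged item includes fields like "rank" that only exist after ranking
    # happened; keep only the features available at inference time.
    return {k: logged_item[k] for k in SERVING_FEATURES}

row = make_training_row({
    "price": 9.99, "category": "book", "past_ctr": 0.04,
    "rank": 1,          # leaks the thing we're trying to produce
    "clicked": True,    # the label, stored separately
})
assert "rank" not in row
```

Keeping a single explicit list of serving-time features, shared between the training pipeline and the inference code, makes this class of bug much harder to introduce.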

Or maybe simply A/B testing reveals that the fancy ML model doesn't really perform better in production than the old rules written with lots of elbow grease by lots of developers and business analysts, using lots of domain knowledge and years of experience.

But even if the model does well at the beginning, will it continue to do so? Maybe there will be an external change in user behavior and they will start searching for other kinds of queries, which your model was not developed for. Or maybe your model will introduce a "positive" feedback loop: it suggests some items, users click on them, so those items get suggested more often, so more users click on them. This leads to a "rich get richer" kind of situation, but the algorithm is actually not making better and better suggestions.
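The feedback loop shows up even in a toy simulation (a Pólya-urn style sketch, not any real recommender): all items below are equally good, yet showing items in proportion to their past clicks lets a few of them run away with the traffic.

```python
import random

def simulate(n_items=5, rounds=2000, seed=42):
    """Each round, the 'recommender' shows one item with probability
    proportional to its past clicks, and the user clicks it."""
    rng = random.Random(seed)
    clicks = [1] * n_items  # pseudo-count so every item starts out equal
    for _ in range(rounds):
        shown = rng.choices(range(n_items), weights=clicks)[0]
        clicks[shown] += 1  # more clicks -> shown even more next round
    return clicks

clicks = simulate()
# Despite identical quality, the distribution of clicks ends up very skewed.
print(sorted(clicks, reverse=True))
```

The skew comes purely from the exposure loop, which is why raw click counts alone are a poor signal of item quality.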

Maybe you are on top of this and you keep retraining your model weekly to keep it in step with user behavior. But then you need to have a staggered release of the model, to make sure that the new one is really performing better across all relevant dimensions. Is inference speed still good enough? Are predictions relatively stable, meaning we don't recommend only action movies one week and then only comedies next week? Are models even comparable from one week to another or is there a significant random component to them which makes it really hard to see how they improved? For example, how are the clusters from the user post data built up? K-means starts with random centroids and clusters from one run have only passing similarity to the ones from another run. How will you deal with that?
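The k-means instability is easy to reproduce with a minimal 1-D implementation (a sketch for illustration, not production code): different random initializations can converge to different centroids, so clusters from one weekly retrain need not line up with the previous week's.

```python
import random

def kmeans_1d(points, k, seed, iters=20):
    """Plain Lloyd's algorithm on 1-D data, seeded random initialization."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # assign each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: abs(p - centroids[i]))].append(p)
        # move each centroid to the mean of its cluster (keep it if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

points = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1]
print(kmeans_1d(points, 3, seed=0))
print(kmeans_1d(points, 3, seed=1))  # may land in a different local optimum
```

Even when two runs find similar centroids, the cluster labels themselves are arbitrary, so comparing "cluster 2 this week" with "cluster 2 last week" is meaningless without an explicit matching step.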

]]>
<![CDATA[GPT-3 and AGI]]>One of the most impressive/controversial papers from 2020 was GPT-3 from OpenAI. It's nothing particularly new, it's mostly a bigger version of GPT-2, which came out in 2019. It's a much bigger version, being by far the largest machine learning model at the

]]>
https://rolisz.ro/2021/01/21/gpt3-agi/5f3517c94f71eb12e0abb8bfThu, 21 Jan 2021 20:13:00 GMT

One of the most impressive/controversial papers from 2020 was GPT-3 from OpenAI. It's nothing particularly new; it's mostly a bigger version of GPT-2, which came out in 2019. A much bigger version: it was by far the largest machine learning model at the time it was released, with 175 billion parameters.

It's a fairly simple algorithm: it's learning to predict the next word in a text[1]. It learns to do this by training on several hundred gigabytes of text gathered from the Internet. Then to use it, you give it a prompt (a starting sequence of words) and then it will start generating more words and eventually it will decide to finish the text by emitting a stop token.
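In miniature, the idea looks like this: a toy bigram model (a deliberately tiny stand-in for GPT-3's transformer) that learns next-word counts from a few sentences, then samples from a prompt until a stop token.

```python
import random

# "Train" on a tiny corpus: record which word follows which.
corpus = "the cat sat on the mat <stop> the cat ran <stop>".split()
table = {}
for a, b in zip(corpus, corpus[1:]):
    table.setdefault(a, []).append(b)

def generate(prompt, rng, max_len=20):
    """Repeatedly sample a plausible next word until <stop> (or a length cap)."""
    words = prompt.split()
    while words[-1] != "<stop>" and len(words) < max_len:
        words.append(rng.choice(table.get(words[-1], ["<stop>"])))
    return " ".join(words)

print(generate("the cat", random.Random(0)))
```

GPT-3 does the same thing in spirit, just with byte-pair tokens instead of words, a context of thousands of tokens instead of one, and 175 billion learned parameters instead of a lookup table.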

Using this seemingly stupid approach, GPT-3 is capable of generating a wide variety of interesting texts: it can write poems (not prize-winning ones, but still), write news articles, imitate well-known authors, make jokes, argue for its self-awareness, do basic math and, shockingly to programmers all over the world, who are now afraid the robots will take their jobs, it can code simple programs.

That's amazing for such a simple approach. The internet was divided upon seeing these results. Some were welcoming our GPT-3 AI overlords, while others were skeptical, calling it just fancy parroting, without a real understanding of what it says.

I think both sides have a grain of truth. On one hand, it's easy to find failure cases, to make it say things like "a horse has five legs" and so on, showing it doesn't really know what a horse is. But are humans that different? Think of a small child being taught by his parents to say "please" before his requests. I remember being amused by a small child saying "But I said please!" when he was refused by his parents. The kid probably thought "please" was a magic word that could unlock anything. Not really: in real life we use it because society likes polite people, but saying please when wishing for a unicorn won't make it any more likely to happen.

And it's not just little humans who do that. Sometimes even grownups parrot stuff without thinking about it, because that's what they heard all their life and they never questioned it. It actually takes a lot of effort to think, to ensure consistency in your thoughts and to produce novel ideas. In this sense, expecting an artificial intelligence that is around human level might be a disappointment.

On the other hand, I believe there is a reason why this amazing result happened in the field of natural language processing and not say, computer vision. It has been long recognized that language is a powerful tool, there is even a saying about it: "The pen is mightier than the sword". Human language is so powerful that we can encode everything that there is in this universe into it, and then some (think of all the sci-fi and fantasy books). More than that, we use language to get others to do our bidding, to motivate them, to cooperate with them and to change their inner state, making them happy or inciting them to anger.

While there is a common ground in the physical world, oftentimes it is not very relevant to the point we are making: "A rose by any other name would smell as sweet". Does it matter what a rose is when the rallying call is to get more roses? As long as the message gets across and is understood the same way by all listeners, no, it doesn't. Similarly, if GPTx can effect the desired change in its readers, it might be good enough, even if it doesn't have a mythical understanding of what those words mean.


Technically, the next byte pair encoded token ↩︎

]]>
<![CDATA[How to ML - Deploying]]>So the ML engineer presented the model to the business stakeholders and they agreed that it performed well enough on the key metrics in testing that it's time to deploy it to production.

So now we have to make sure the models run reliably in production. We have

]]>
https://rolisz.ro/2021/01/20/how-to-ml-deploying/60084bc7165bd14e3b33595dWed, 20 Jan 2021 15:28:54 GMT

So the ML engineer presented the model to the business stakeholders and they agreed that it performed well enough on the key metrics in testing that it's time to deploy it to production.

So now we have to make sure the models run reliably in production. We have to answer some more questions, in order to make some trade-offs.

How important is latency? Is the model making an inference in response to a user action, so it's crucial to have the answer in tens of milliseconds? Then it's time to optimize the model: quantizing weights, distilling knowledge into a smaller model, pruning weights, and so on. Hopefully, your metrics won't go down due to the optimization.
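As a toy illustration of one of those optimizations, here is a minimal sketch of symmetric int8 weight quantization using plain numpy. The array shapes and the quantization scheme are assumptions for illustration, not a production recipe:

```python
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 plus a single scale factor (symmetric quantization)."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# rounding error is bounded by half a quantization step
max_err = np.abs(w - w_hat).max()
```

Real frameworks do this for you, with per-channel scales and calibration data, but the idea is the same: trade a little precision for a 4x smaller, faster model.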

Can the results be precomputed? For example, if you want to make movie recommendations, maybe there can be a batch job that runs every night that does the inference for every user and stores the results in a database. Then, when the user makes a request, the results are simply loaded from the database. This is possible only if you have a finite range of predictions to make.
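The precompute-then-lookup pattern can be sketched in a few lines of Python; the `recommend` function and the dict standing in for a database table are hypothetical placeholders:

```python
def recommend(user_id):
    # stand-in for a slow model inference (would take seconds per user)
    return [f"movie_{(user_id + i) % 5}" for i in range(3)]

# nightly batch job: run inference for every known user, store the results
user_ids = [1, 2, 3]
cache = {uid: recommend(uid) for uid in user_ids}  # "database" table

# request time: a fast lookup instead of running the model
def serve(user_id):
    return cache.get(user_id, [])  # unseen users get an empty (or fallback) list
```

The trade-off is staleness: a user who joined after last night's run gets the fallback until the next batch job.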

Where are you running the model? On big beefy servers with a GPU? On mobile devices, which are much less powerful? Or on some edge devices that don't even have an OS? Depending on the answer, you might have to convert the model to a different format or optimize it to be able to fit in memory.

Even in the easy case where you are running the model on servers and latency can be several seconds, you still have to do the whole dance of making it work there. "Works on my machine" is all too often a problem. Maybe production runs a different version of Linux, which has a different BLAS library, and the security team won't let you update things. Simple, just use Docker, right? Right, better hope you are good friends with the DevOps team to help you out with setting up the CI/CD pipelines.

But you've killed all the dragons, now it's time to keep watch... aka monitoring the model's performance in production.

]]>
<![CDATA[How to ML - Models]]>So we finally got our data and we can get to machine learning. Without the data, there is no machine learning, there is at best human learning, where somebody tries to write an algorithm by hand to do the task at hand.

This is the part that most people who

]]>
https://rolisz.ro/2021/01/18/how-to-ml-models/6005e7293e8fc062a027dbe3Mon, 18 Jan 2021 19:55:44 GMT

So we finally got our data and we can get to machine learning. Without the data, there is no machine learning, there is at best human learning, where somebody tries to write an algorithm by hand to do the task at hand.

This is the part that most people who want to do machine learning are excited about. I read Bishop's and Murphy's textbooks, watched Andrew Ng's online course about ML and learned about different kinds of ML algorithms and I couldn't wait to try them out and to see which one is the best for the data at hand.

You start off with a simple one, a linear or logistic regression, to get a baseline. Maybe you even play around with the hyperparameters. Then you move on to a more complicated model, such as a random forest. You spend more time fiddling with it, getting 20% better results. Then you switch to the big guns, neural networks. You start with a simple one, with just 3 layers, and progressively end up with 100 ReLU and SIREN layers, dropout, batchnorm, ADAM, convolutions, attention mechanism and finally you get to 99% accuracy.

And then you wake up from your nice dream.

In practice, playing around with ML algorithms is just 10% of the job for an ML engineer. You do try out different algorithms, but you rarely write new ones from scratch. For most production projects, if it's not in scikit-learn, TensorFlow or PyTorch, it won't fly. For proof of concept projects you might try to use the GitHub repo that accompanies a paper, but that path is full of pain, trying to find all the dependencies of undocumented code and to make it work.
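A typical real-world session looks less like writing algorithms and more like this kind of scikit-learn boilerplate; the dataset and model choices here are illustrative only:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# simple baseline first...
baseline = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
baseline_acc = baseline.score(X_te, y_te)

# ...then something fancier, only if the baseline isn't good enough
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
forest_acc = forest.score(X_te, y_te)
```

Note that the interesting differences between the two models are a few constructor arguments, not hundreds of lines of novel code.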

For hyperparameter tuning, there are libraries to help you with that, and anyway, for any real-life dataset, the time it takes to finish the training runs is much larger than the time you spend coding them up.
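As a sketch of what library-assisted tuning looks like, here is scikit-learn's `GridSearchCV`; the parameter grid is an arbitrary example:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# the library handles the cross-validated loop over all combinations
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 50], "max_depth": [2, None]},
    cv=3,
)
grid.fit(X, y)

best = grid.best_params_  # e.g. which depth/size combination won
```

On a toy dataset this finishes instantly; on a real one, those `grid.fit` training runs are where the days go, not the dozen lines above.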

And in practice, you run into many issues with the data. You'll find that some of the columns in the data have lots of missing values. Or some of the datapoints that come from different sources have different meanings for the same columns. You'll find conflicting or invalid labels. And that means going back to the data pipelines and fixing the bugs that occur there.
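Two of those checks — counting missing values and spotting conflicting labels — take only a few lines with pandas; the toy DataFrame is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "text":  ["a", "b", "a", "a"],
    "label": ["spam", "ham", "ham", "spam"],  # "a" is labeled both ways
    "age":   [25, None, 40, 31],              # one missing value
})

# missing values per column
missing = df.isna().sum()

# inputs that appear with more than one distinct label
conflicts = df.groupby("text")["label"].nunique()
conflicting_inputs = sorted(conflicts[conflicts > 1].index)
```

Checks like these are cheap to run on every pipeline refresh, which is exactly when these bugs tend to sneak back in.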

If you do get a model that is good enough, it's time to deploy it, which comes with its own fun...

]]>
<![CDATA[2020 in Review]]>2020 might have been a bad year outside, but it was a good year for my blog. I wrote 62 posts, almost as many as in the previous 3 years combined (63). Part of it was due to more time because of Covid, part of it was because of the

]]>
https://rolisz.ro/2020/12/31/2020-in-review/5fee192e08f8f65d8ac7c91cThu, 31 Dec 2020 23:02:17 GMT

2020 might have been a bad year outside, but it was a good year for my blog. I wrote 62 posts, almost as many as in the previous 3 years combined (63). Part of it was due to more time because of Covid, part of it was because of the 100 Days to Offload Challenge (which I didn't finish), and part of it was because I want to take my blog in a new direction, to help get leads for my consulting business.

Visits were up: 60,000 sessions compared to 10,000 in 2019. Most sessions came from unique visitors: around 53,000 of them, compared to 8,700. Pageviews are at 73,500, versus 16,300.

Most of this is due to a few posts that got very popular. The Moving away from Gmail post is now my most popular blog post ever, dethroning the neural network post that is 7 years old and still gets 2,000 views per year. It was on the front page of Hacker News and got 36,000 pageviews in 3 days. The Obsidian post was also quite popular, having been suggested in the Google app, reaching 8,000 views. My Rust posts all got over 800 views, with the web crawler one getting over 2,400. Surprisingly, how to bridge networks with a Synology NAS is a very interesting topic, because that post also got 1,000 views.

The Ghost platform has worked OK during the last year, but it has some small friction points, so I'm thinking about changing again. But regardless of how I'll post, I definitely plan to keep posting more content.

]]>
<![CDATA[World's best phone case]]>Yesterday I enjoyed the Australian Șuncuiuș Christmas weather while doing another Via Ferrata trail. It was much harder than the one I did last year. But as I finished the vertical ascent that is seen in the top picture, my phone slipped from my pocket, and fell about

]]>
https://rolisz.ro/2020/12/31/worlds-best-phone-case/5fedc94d08f8f65d8ac7c8c8Thu, 31 Dec 2020 13:10:44 GMT

Yesterday I enjoyed the Australian Șuncuiuș Christmas weather while doing another Via Ferrata trail. It was much harder than the one I did last year. But as I finished the vertical ascent that is seen in the top picture, my phone slipped from my pocket, and fell about 15m.

I immediately thought I'd have to buy myself a late Christmas present. After we finished the hike, we went to search for the phone. The case had come off the phone and we found it pretty quickly. The phone was on vibrate, so calling it didn't help. It had slipped under some rocks, so we had to look harder for it. But after we found it, we were all shocked that it was intact, without a scratch on it.

The case has some very minor scratches on it. Ladies and gentlemen, if until now I was a big fan of SupCase Unicorn Beetle Pro cases, from now on I probably won't buy a phone without a case from them. Kudos to the SupCase team!

]]>