<![CDATA[rolisz consulting]]>
https://rolisz.ro/
https://rolisz.ro/favicon.png
Ghost 3.38
Sat, 20 Feb 2021 14:35:36 GMT

<![CDATA[Learning to machine learn]]>
https://rolisz.ro/2021/02/19/learning-to-machine-learn/
Fri, 19 Feb 2021 15:51:29 GMT

tl;dr: I'm launching an introductory course about machine learning in Romanian. It's aimed not just at developers, but at a more general audience.

For some time now I've been thinking about taking my content creation to the next level. I've been blogging for 10 years and I enjoy it. Some of the posts I've written about programming and machine learning have done well. So I thought I'd make a machine learning course.

There are plenty of machine learning resources on the internet, courses of every kind. I learned from them myself, so there are good and even excellent courses among them. But to start with, I'd like to make a course in Romanian, where I don't think there are enough quality resources. Well, in practice it will be Romglish, because I can barely pronounce „învățare automată”; „machine learning” rolls off the tongue much more easily. Not to mention deep learning...

Another gap I've identified is that most courses are aimed at programmers who write code every day and want to add the tool called machine learning to their toolbox. But there is a big lack of understanding of how machine learning and artificial intelligence work among managers and, why not, non-technical people. If you go only by what you read in the news, the Terminator scenario is right around the corner, when in reality all ML systems have big, easy-to-find weaknesses.

This leads to unrealistic expectations from the leadership of some companies, who want to become more „hip” and use ML, but come in with completely wrong ideas that can't be made to work well enough. I hope I can help such people too.

Many people believe that you need very advanced technical knowledge to use artificial intelligence. But the barrier keeps getting lower, and applications are appearing even in creative fields, such as image or text generation, which can be used relatively easily once you understand the basic concepts.

If you like the sound of what you've read above, head over to the course page.

]]>
<![CDATA[Design patterns in real life]]>
https://rolisz.ro/2021/01/26/design-patterns-in-real-life/
Tue, 26 Jan 2021 20:07:52 GMT

In programming there are so-called design patterns: commonly recurring pieces of code that come up often enough that people found it helpful to give them names, so that it's easier to talk about them. One example is the iterator pattern, which provides an efficient way of traversing the elements of a container, whether it's an array, a hash table or something else. The builder pattern is used for building objects when we don't know all their required parameters upfront.
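To make the two patterns concrete, here is a minimal Python sketch (my own illustration, not from any particular library):

```python
# Iterator pattern: traverse a container without exposing its internals.
class LinkedList:
    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        # Python bakes the pattern into the language: any object with
        # __iter__/__next__ can be used in a for loop.
        return iter(self._values)


# Builder pattern: assemble an object step by step, when we don't know
# all the required parameters upfront.
class ReportBuilder:
    def __init__(self):
        self._parts = []

    def add_title(self, title):
        self._parts.append(f"# {title}")
        return self  # returning self allows chaining

    def add_section(self, text):
        self._parts.append(text)
        return self

    def build(self):
        return "\n\n".join(self._parts)


for value in LinkedList([1, 2, 3]):
    print(value)

report = ReportBuilder().add_title("Q1").add_section("Sales were up.").build()
```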

Sometimes, if you don't know about a pattern and you read code that uses it, it might seem strange. Why is this extra layer of abstraction here? Why is this API broken down into these pieces? After learning about the pattern, you might discover that the extra layer of abstraction is needed because the layer below it changes often. Or that the API is broken into those specific pieces because this makes it easy to cover more use cases efficiently.

As I’ve started diving head first into the world of running my own consulting business, I’m starting to learn about a whole other world of “design patterns”, unrelated to programming. And suddenly many things that I’ve seen before started to make sense.

My friend David has been bugging me for almost two years to start a community for people passionate about machine learning in Oradea, where I live. For a long time I wondered why he pushed so hard for it. Well, after taking Seth Godin's Freelancer Workshop, I now know that being the person who organizes a community is one of the best ways to make yourself known.

Another example: I once saw a website offering a sort of business networking service for a very high membership cost (or at least it seemed expensive at the time). Why would anyone pay that? Then I learned about a thing called an alchemy network and how, done well, it can bring great value to its members.

All my friends who are freelancers charge by the hour. That's what I thought was normal. But then I heard about value-based pricing from Jonathan Stark. A different pricing “design pattern”, which aligns the incentives of the client and of the service provider in a much better way. Let's see if I can pull it off though.

Just like in programming, design patterns help us find the correct solution faster and communicate more efficiently. The more patterns you know, the faster you can recognize a situation and react better to it.

What are your favorite design patterns?

]]>
<![CDATA[How to ML - Monitoring]]>
https://rolisz.ro/2021/01/22/how-to-ml-monitoring/
Fri, 22 Jan 2021 20:29:24 GMT

As much as machine learning developers like to think that once they've got a good enough model the job is done, it's not quite so.

The first couple of weeks after deployment are critical. Is the model really as good as the offline tests said it was? Maybe something is different in production than in all your test data. Maybe the data you collected for offline training includes pieces of information that are not available at inference time. For example, if you're trying to predict click-through rates for items in a list and use them to rank the items, it's easy to include each item's rank when building the training dataset, but the model won't have that feature when making predictions, because the ranking is exactly what you're trying to infer. Surprise: the model will perform very poorly in production.
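A minimal sketch of that kind of leakage (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical logged data for click-through-rate prediction.
df = pd.DataFrame({
    "item_age_days": [1, 30, 7],
    "past_ctr": [0.10, 0.02, 0.05],
    "rank_in_list": [1, 3, 2],  # leaked: derived from the score we predict
    "clicked": [1, 0, 0],
})

# "rank_in_list" exists in the logs, but at inference time the rank is
# computed FROM the model's output, so it must not be used as a feature.
X = df.drop(columns=["clicked", "rank_in_list"])
y = df["clicked"]
```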

Or maybe simply A/B testing reveals that the fancy ML model doesn't really perform better in production than the old rules written with lots of elbow grease by lots of developers and business analysts, using lots of domain knowledge and years of experience.

But even if the model does well at the beginning, will it continue to do so? Maybe there will be an external change in user behavior and they will start searching for other kinds of queries, which your model was not developed for. Or maybe your model will introduce a "positive" feedback loop: it suggests some items, users click on them, so those items get suggested more often, so more users click on them. This leads to a "rich get richer" kind of situation, but the algorithm is actually not making better and better suggestions.

Maybe you are on top of this and you keep retraining your model weekly to keep it in step with user behavior. But then you need a staggered release of the model, to make sure that the new one really performs better across all relevant dimensions. Is inference speed still good enough? Are predictions relatively stable, meaning we don't recommend only action movies one week and then only comedies the next? Are models even comparable from one week to another, or is there a significant random component that makes it really hard to see whether they improved? For example, how are the clusters built from the user data? K-means starts with random centroids, and the clusters from one run have only passing similarity to the ones from another run. How will you deal with that?
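A quick illustration of that k-means instability, on toy data (in practice you would fix the seed or match clusters between runs):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # toy user feature vectors

labels_a = KMeans(n_clusters=4, n_init=1, random_state=1).fit_predict(X)
labels_b = KMeans(n_clusters=4, n_init=1, random_state=2).fit_predict(X)

# Cluster IDs are arbitrary: cluster 0 this week may be cluster 3 next
# week, and the partitions themselves can differ between runs.
print((labels_a == labels_b).mean())  # often far from 1.0
```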

]]>
<![CDATA[GPT-3 and AGI]]>
https://rolisz.ro/2021/01/21/gpt3-agi/
Thu, 21 Jan 2021 20:13:00 GMT

One of the most impressive/controversial papers of 2020 was GPT-3 from OpenAI. It's nothing particularly new: it's mostly a bigger version of GPT-2, which came out in 2019. A much bigger version, though, being by far the largest machine learning model at the time it was released, with 175 billion parameters.

It's a fairly simple algorithm: it learns to predict the next word in a text[1]. It learns to do this by training on several hundred gigabytes of text gathered from the Internet. To use it, you give it a prompt (a starting sequence of words), and it will generate more words until it eventually decides to finish the text by emitting a stop token.
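The generation loop looks roughly like this (a sketch: `model`, `tokenizer` and their methods are hypothetical stand-ins, not OpenAI's actual API):

```python
import random

def generate(model, tokenizer, prompt, max_tokens=100):
    # Autoregressive decoding: predict one token at a time and append it
    # to the context, until the model emits its stop token.
    tokens = tokenizer.encode(prompt)
    for _ in range(max_tokens):
        probs = model.next_token_probabilities(tokens)  # dict: token -> prob
        token = random.choices(list(probs), weights=list(probs.values()))[0]
        if token == tokenizer.stop_token:
            break
        tokens.append(token)
    return tokenizer.decode(tokens)
```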

Using this seemingly stupid approach, GPT-3 is capable of generating a wide variety of interesting texts: it can write poems (not prize-winning, but still), write news articles, imitate well-known authors, make jokes, argue for its self-awareness, do basic math and, shockingly to programmers all over the world, who are now afraid the robots will take their jobs, it can code simple programs.

That's amazing for such a simple approach. The internet was divided upon seeing these results. Some were welcoming our GPT-3 AI overlords, while others were skeptical, calling it just fancy parroting, without a real understanding of what it says.

I think both sides have a grain of truth. On one hand, it's easy to find failure cases, make it say things like "a horse has five legs" and so on, where it shows it doesn't really know what a horse is. But are humans that different? Think of a small child who is being taught by his parents to say "Please" before his requests. I remember being amused by a small child saying "But I said please" when he was refused by his parents. The kid probably thought that "Please" is a magic word that can unlock anything. Well, not really, in real life we just use it because society likes polite people, but saying please when wishing for a unicorn won't make it any more likely to happen.

And it's not just little humans who do that. Sometimes even grownups parrot stuff without thinking about it, because that's what they heard all their life and they never questioned it. It actually takes a lot of effort to think, to ensure consistency in your thoughts and to produce novel ideas. In this sense, expecting an artificial intelligence that is around human level might be a disappointment.

On the other hand, I believe there is a reason why this amazing result happened in the field of natural language processing and not say, computer vision. It has been long recognized that language is a powerful tool, there is even a saying about it: "The pen is mightier than the sword". Human language is so powerful that we can encode everything that there is in this universe into it, and then some (think of all the sci-fi and fantasy books). More than that, we use language to get others to do our bidding, to motivate them, to cooperate with them and to change their inner state, making them happy or inciting them to anger.

While there is a common ground in the physical world, oftentimes it is not very relevant to the point we are making: "A rose by any other name would smell as sweet". Does it matter what a rose is when the rallying call is to get more roses? As long as the message gets across and is understood in the same way by all listeners, no, it doesn't. Similarly, if GPT-x can effect the desired change in its readers, it might be good enough, even if it doesn't have a mythical understanding of what those words mean.


[1] Technically, the next byte-pair-encoded token.

]]>
<![CDATA[How to ML - Deploying]]>
https://rolisz.ro/2021/01/20/how-to-ml-deploying/
Wed, 20 Jan 2021 15:28:54 GMT

So the ML engineer presented the model to the business stakeholders and they agreed that it performed well enough on the key metrics in testing that it's time to deploy it to production.

So now we have to make sure the model runs reliably in production. We have to answer some more questions, in order to make some trade-offs.

How important is latency? Is the model making an inference in response to a user action, so it's crucial to have the answer within tens of milliseconds? Then it's time to optimize the model: quantize weights, distill the knowledge into a smaller model, prune weights and so on. Hopefully, your metrics won't go down because of the optimization.
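As an example of one of those knobs, here is a minimal dynamic-quantization sketch in PyTorch (the tiny model is just a placeholder for a real trained network):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Dynamic quantization: weights stored as int8, activations quantized on
# the fly. Linear and recurrent layers benefit the most.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```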

Can the results be precomputed? For example, if you want to make movie recommendations, maybe there can be a batch job that runs every night, does the inference for every user and stores the results in a database. Then when the user makes a request, the results are simply loaded from the database. This is possible only if you have a finite range of predictions to make.
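Schematically, the nightly-batch-plus-lookup pattern could look like this (`model` and `all_users` are placeholders; any key-value store would do instead of SQLite):

```python
import sqlite3

def nightly_batch_job(model, all_users, db):
    # Precompute recommendations offline, once per night.
    db.execute("CREATE TABLE IF NOT EXISTS recs (user_id TEXT PRIMARY KEY, items TEXT)")
    with db:
        for user_id, features in all_users:
            items = ",".join(model.predict(features))
            db.execute("REPLACE INTO recs VALUES (?, ?)", (user_id, items))

def handle_request(db, user_id):
    # At request time, "inference" is just a fast key-value lookup.
    row = db.execute("SELECT items FROM recs WHERE user_id = ?", (user_id,)).fetchone()
    return row[0].split(",") if row else []
```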

Where are you running the model? On big beefy servers with a GPU? On mobile devices, which are much less powerful? Or on some edge devices that don't even have an OS? Depending on the answer, you might have to convert the model to a different format or optimize it to be able to fit in memory.

Even in the easy case where you are running the model on servers and latency can be several seconds, you still have to do the whole dance of making it work there. "Works on my machine" is all too often a problem. Maybe production runs a different version of Linux, which has a different BLAS library, and the security team won't let you update things. Simple, just use Docker, right? Right, better hope you are good friends with the DevOps team who can help you set up the CI/CD pipelines.

But you've killed all the dragons; now it's time to keep watch... aka monitoring the model's performance in production.

]]>
<![CDATA[How to ML - Models]]>
https://rolisz.ro/2021/01/18/how-to-ml-models/
Mon, 18 Jan 2021 19:55:44 GMT

So we finally got our data and we can get to machine learning. Without data there is no machine learning; there is at best human learning, where somebody writes an algorithm by hand to solve the task.

This is the part that most people who want to do machine learning are excited about. I read Bishop's and Murphy's textbooks, watched Andrew Ng's online course about ML, learned about the different kinds of ML algorithms, and I couldn't wait to try them out and see which one was best for the data at hand.

You start off with a simple one, a linear or logistic regression, to get a baseline. Maybe you even play around with the hyperparameters. Then you move on to a more complicated model, such as a random forest. You spend more time fiddling with it and get 20% better results. Then you switch to the big guns, neural networks. You start with a simple one, with just 3 layers, and progressively end up with 100 ReLU and SIREN layers, dropout, batchnorm, Adam, convolutions, an attention mechanism, and finally you get to 99% accuracy.

And then you wake up from your nice dream.

In practice, playing around with ML algorithms is just 10% of the job for an ML engineer. You do try out different algorithms, but you rarely write new ones from scratch. For most production projects, if it's not in one of the sklearn, Tensorflow or Pytorch libraries, it won't fly. For proof of concept projects you might try to use the GitHub repo that accompanies a paper, but that path is full of pain, trying to find all the dependencies of undocumented code and to make it work.
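In practice, "trying out different algorithms" usually amounts to a few lines per model, like this toy sklearn sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: a simple baseline.
baseline = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("baseline:", baseline.score(X_te, y_te))

# Step 2: a more complex model.
forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print("forest:", forest.score(X_te, y_te))
```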

As for hyperparameter tuning, there are libraries to help you with that, and anyway, for any real-life dataset, the time it takes for the training runs to finish is much larger than the time you spend coding them up.
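For example, with sklearn a basic search is just a few more lines (a toy sketch; the expensive part is the fit call, not the code):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    cv=3,
)
search.fit(X, y)  # the slow part: 12 training runs plus a final refit
print(search.best_params_)
```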

And in practice, you run into many issues with the data. You'll find that some of the columns in the data have lots of missing values. Or that datapoints coming from different sources have different meanings for the same columns. You'll find conflicting or invalid labels. And that means going back to the data pipelines and fixing the bugs that occur there.

If you do get a model that is good enough, it's time to deploy it, which comes with its own fun...

]]>
<![CDATA[2020 in Review]]>
https://rolisz.ro/2020/12/31/2020-in-review/
Thu, 31 Dec 2020 23:02:17 GMT

2020 might have been a bad year outside, but it was a good year for my blog. I wrote 62 posts, almost as many as in the previous 3 years combined (63). Part of it was due to having more time because of Covid, part of it was the 100 Days to Offload challenge (which I didn't finish), and part of it was my interest in taking the blog in a new direction, to help get leads for my consulting business.

Visits were up: 60,000 sessions compared to 10,000 in 2019. Most of those sessions were from unique visitors, around 53,000 of them, compared to 8,700. Pageviews are at 73,500, versus 16,300.

Most of this is due to some posts that got very popular. The Moving away from Gmail post is now my most popular blog post ever, dethroning the neural network post that is 7 years old and still gets 2,000 views per year. It was on the front page of Hacker News and got 36,000 pageviews in 3 days. The Obsidian post was also quite popular, having been suggested in the Google app, and got to 8,000 views. My Rust posts all got over 800 views, with the web crawler one getting over 2,400. Surprisingly, bridging networks with a Synology NAS turns out to be a very interesting topic, because that post also got 1,000 views.

The Ghost platform has worked okay during the last year, but it has some small friction points, so I'm thinking about changing again. But regardless of how I'll post, I definitely plan to keep posting more content.

]]>
<![CDATA[World's best phone case]]>
https://rolisz.ro/2020/12/31/worlds-best-phone-case/
Thu, 31 Dec 2020 13:10:44 GMT

Yesterday I enjoyed the Australian Șuncuiuș Christmas weather while doing another Via Ferrata trail. It was much harder than the one I did last year. But as I finished the vertical ascent that is seen in the top picture, my phone slipped from my pocket, and fell about 15m.

I immediately thought I'd have to buy myself a late Christmas present. After we finished the hike, we went to search for the phone. The case had come off the phone and we found it pretty quickly. The phone was on vibrate, so calling it didn't help. It had slipped under some rocks, so we had to look harder for it. But after we found it, we were all shocked that it was intact, without a scratch on it.

The case has some very minor scratches on it. Ladies and gentlemen, if until now I was a big fan of SupCase Unicorn Beetle Pro cases, from now on I probably won't buy a phone without a case from them. Kudos to the SupCase team!

]]>
<![CDATA[How to ML - Data]]>
https://rolisz.ro/2020/12/29/how-to-ml-data/
Tue, 29 Dec 2020 18:22:09 GMT

So we've decided what metrics we want to track for our machine learning project. Because ML needs data, we need to get it.

In some cases we get lucky and we already have it. Maybe we want to predict the failure of pieces of equipment in a factory. There are already lots of sensors measuring the performance of the equipment, and there are service logs saying what was replaced on each machine. In theory, all we need is a bit of a big data processing pipeline, say with Apache Spark, and we can get the data in the form of (input, output) pairs that can be fed into a machine learning classifier that predicts whether a machine will fail based on the last 10 values measured by its sensors. In practice, we'll find that sensors of the same type that come from different manufacturers have different ranges of possible values, so they will all have to be normalized. Or that the service logs are filled out differently by different people, so those will have to be standardized as well. Or worse, the sensor data is good, but it's kept for only 1 month to save on storage costs, so we have to fix that and wait a couple of months for more training data to accumulate.
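The per-manufacturer normalization step might look like this (a sketch with hypothetical column names):

```python
import pandas as pd

readings = pd.DataFrame({
    "manufacturer": ["A", "A", "B", "B"],
    "vibration": [0.2, 0.8, 120.0, 480.0],  # same sensor type, different scales
})

# Bring every manufacturer's readings to zero mean and unit variance,
# so the model sees comparable features.
readings["vibration_norm"] = (
    readings.groupby("manufacturer")["vibration"]
    .transform(lambda s: (s - s.mean()) / s.std())
)
```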

The next best case is that we don't have the data, but we can get it somehow. Maybe there are already datasets on the internet that we can download for free. This is the case for most face recognition applications: there are plenty of annotated face datasets out there, with various licenses. In other cases the dataset must be bought. For example, if we want to start a new ad network, there are plenty of datasets of personal data about everyone available online, which can then be used to predict the likelihood of clicking on an ad. That's the business model of many startups...

The worst case is that we don't have the data and we can't find it out there. Maybe it's because we have a very specific niche, such as finding defects in the manufacturing process of our specific widgets, so we can't use random images from the internet to learn this. Or maybe we want to do something really new (or very valuable), in which case we will have to gather the data ourselves. If we want to solve something in the physical world, that will mean installing sensors to gather data. After we get the raw data, such as images of our widgets coming off the production line, we will have to annotate those images. This means getting them in front of humans who know how to tell whether a widget is good or defective. There needs to be a QA process in this, because even humans have an error rate, so each image will have to be labeled by at least three people. We need several thousand samples, so this will take some time to set up, even if we can use crowdsourcing platforms such as AWS Mechanical Turk to distribute the tasks to many workers across the world.
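Reconciling the three labels can be as simple as a majority vote (a minimal sketch):

```python
from collections import Counter

def consensus_label(labels, min_agreement=2):
    # labels: e.g. ["good", "good", "defective"] from three annotators.
    label, count = Counter(labels).most_common(1)[0]
    if count >= min_agreement:
        return label
    return None  # no consensus: send the image back for another review

print(consensus_label(["good", "good", "defective"]))  # -> "good"
```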

Once all this is done, we finally have data. Time to start doing the actual ML...

]]>
<![CDATA[How to ML - Metrics]]>
https://rolisz.ro/2020/12/28/how-to-ml/
Mon, 28 Dec 2020 18:19:07 GMT

We saw that machine learning algorithms process large amounts of data to find patterns. But how exactly do they do that?

The first step in a machine learning project is establishing metrics. What exactly do we want to do and how do we know we're doing it well?

Are we trying to predict a number? How much will Bitcoin cost next year? That's a regression problem. Are we trying to predict who will win an election? That's a binary classification problem (at least in the USA). Are we trying to recognize objects in an image? That's a multi-class classification problem.
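Each problem type then comes with its own metrics; in sklearn terms (a toy sketch):

```python
from sklearn.metrics import accuracy_score, mean_absolute_error

# Regression: how far off are the predicted numbers, on average?
print(mean_absolute_error([30000, 45000], [28000, 52000]))  # 4500.0

# Classification: how often is the predicted class right?
print(accuracy_score(["win", "lose", "win"], ["win", "win", "win"]))  # 0.66...
```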

Another question that has to be answered is what kind of mistakes are worse. Machine learning is not all-knowing, so it will make mistakes, but there are trade-offs to be made. Maybe we are building a system to find tumors in X-rays: in that case it might be better to cry wolf too often and have false positives, rather than miss a tumor. Or maybe it's the opposite: we are implementing a facial recognition system. If the system identifies a burglar incorrectly, the wrong person will get sent to jail, which is a very bad consequence for a mistake made by "THE algorithm".
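The usual lever for this trade-off is the decision threshold (a schematic sketch; in reality `scores` would come from a trained model):

```python
import numpy as np

y_true = np.array([0, 0, 1, 1, 0, 1])                # 1 = tumor present
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.6, 0.55])  # model confidence

for threshold in (0.3, 0.5, 0.7):
    y_pred = scores >= threshold
    missed = ((~y_pred) & (y_true == 1)).sum()  # false negatives
    alarms = (y_pred & (y_true == 0)).sum()     # false positives
    print(threshold, "missed tumors:", missed, "false alarms:", alarms)
```

Lowering the threshold misses fewer tumors at the cost of more false alarms; which side to favor is a business decision, not a technical one.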

These are not just theoretical concerns; they actually matter a lot in building machine learning systems. Because of this, many ML projects are human-in-the-loop, meaning the model doesn't decide by itself what to do; it merely makes a suggestion which a human will then confirm. In many cases that is valuable enough, because it makes the human much more efficient. For example, the security guard doesn't have to look at 20 screens at once, but can look only at the footage that was flagged as anomalous.

Tomorrow we'll look at the next step: gathering the data.

]]>
<![CDATA[What is ML? part 3]]>
https://rolisz.ro/2020/12/24/what-is-ml-3/
Thu, 24 Dec 2020 15:18:51 GMT

Yesterday we saw that machine learning is behind some successful products and it does have the potential to bring many more changes to our life.

So what is it?

Well, the textbook definition is that it's the building of algorithms that can perform tasks they were not explicitly programmed to do. In practice, this means that we have algorithms that analyze large quantities of data to learn some patterns in the data, which can then be used to make predictions about new data points.

This is in contrast with the classical way of programming computers, where a programmer would either use their domain knowledge or analyze the data themselves, and then write a program that produces the correct output.

So one of the crucial distinctions is that in machine learning, the machine has to learn from the data. If a human being figures out the pattern and writes a regular expression to find addresses in text, that's human learning, and we all go to school to do that.
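The contrast, in toy code (my illustration; the "dataset" is absurdly small, just to show the shape of the two approaches):

```python
import re

# Human learning: a person studies the data and hand-writes the pattern.
ADDRESS_RE = re.compile(r"\d+ \w+ (Street|Avenue|Blvd)")
print(bool(ADDRESS_RE.search("Ship to 10 Downing Street")))  # True

# Machine learning: the algorithm infers the pattern from labeled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["10 Downing Street", "I like turtles", "1 Main Avenue", "hello there"]
labels = [1, 0, 1, 0]  # 1 = contains an address
clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 4)),
    LogisticRegression(),
)
clf.fit(texts, labels)
print(clf.predict(["5 Baker Street"]))
```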

Now does that mean that machine learning is a solution for everything? No. In some cases, it's easier or cheaper to have a data analyst or a programmer find the pattern and code it up.

But there are plenty of cases where, despite decades-long efforts of big teams of researchers, humans haven't been able to find an explicit pattern. The simplest example of this would be recognizing dogs in pictures. 99.99% of humans over the age of 5 have no problem recognizing a dog, whether a puppy, a golden retriever or a Saint Bernard, but they have zero insight into how they do it, what makes a bunch of pixels on the screen a dog and not a cat. And this is where machine learning shines: you give it a lot of photos (several thousand at least), pair each photo with a label of what it contains, and the neural network will learn by itself what makes a dog a dog and not a cat.

Machine learning is just one of the tools at our disposal, among many others. It's a very powerful tool, and one that gets "sharpened" all the time, with lots of research being done all around the world to find better algorithms, to speed up their training and to make them more accurate.

Come back tomorrow to find out how the sausage is made, on a high level.

]]>
<![CDATA[What is ML? part 2]]>
https://rolisz.ro/2020/12/23/what-is-ml-part-2/
Wed, 23 Dec 2020 08:42:31 GMT

Yesterday I wrote about how AI made big promises in the past but failed to deliver, and claimed that this time it's different.

What's changed?

Well, now we have several products that work well with machine learning. My favorite examples are Google Photos, Synology Moments and PhotoPrism. They are all photo management applications which use machine learning to automatically recognize all the faces in your pictures (easy, we've had this for 15 years), automatically figure out which pictures are of the same person (hard, but doable by hand if you had too much time) and, more than that, index photos by all kinds of objects found in them, so that you can search by what items appear in your photos (really hard, nobody had time to do that manually).

I have more than 10 years of photos uploaded to my Synology, and one of my favorite party tricks when talking to someone is to whip out my phone and show them all the photos I have of them: since they were kids, or the last time we met, or that funny thing that happened to them that I have photographic evidence of. Everyone is amazed by that (and some are horrified and deny that they looked like that when they were children). And there are not one but at least three options to do this, one of which is open source, so anyone can run it at home on their own computer, for free. Clearly there is demand for such a product.

Other successful examples are in the domain of recommender systems, YouTube being a good one. I have a love/hate relationship with it: on one hand, I've lost so many hours of my life to the recommendations it makes (which is proof of how good it is at making personalized suggestions); on the other hand, I've found plenty of cool videos with it. This deep-learning-based recommender system is one of the factors behind the growth of watch time on YouTube, which is basically the key metric behind revenue (more watch time, more ads).

These are just two examples that are available for everyone to use, and which serve as evidence that machine learning based AI now is not just hot air.

But I still haven't answered the question what is ML... tomorrow, I promise.

]]>
<![CDATA[What is ML?]]>
https://rolisz.ro/2020/12/21/what-is-ml/
Mon, 21 Dec 2020 17:17:34 GMT

Machine learning is everywhere these days. Mostly in newspapers, but it's also seeping into many real-life use cases. But what is it actually?

If you read only articles on TechCrunch, Forbes, Business Insider or even MIT Technology Review, you'd think it's something that will soon bring the T-800 to life, or that it will cure cancer and make radiologists useless, or that it will enable humans to upload their minds to the cloud and live forever, or that it will bring fully self-driving cars by the end of the year (every year, for the last 5 years).

Many companies want to get in on the ML bandwagon. It's understandable: 1) that's where the money is (some 10 billion dollars were invested in it in 2018) and 2) correctly done, applied to the right problems, ML can actually be really valuable, either by automating things that were previously done with manual labor or even by enabling things that were previously unfeasible.

But at the same time, a lot of ML projects make unrealistic promises, eat a lot of money and then deliver something that doesn't work well enough to have a positive ROI. The ML engineers and researchers are happy: they got paid, analyzed the data, played around with building ML models and maybe even published a paper or two. But the business is not happy, because it is not better off in any way.

This is not a new phenomenon. Artificial Intelligence, of which Machine Learning is a subdomain, has been plagued by similar bubbles ever since it was founded. AI has already gone through several AI winters, in the 60s, the 80s and the late 90s. Big promises, few results.

To paraphrase Battlestar Galactica, "All this has happened before, all this will happen again but this time it's different". But why is it different? More about that tomorrow.

]]>
<![CDATA[Machine Learning stories: Misunderstood suggestions]]>
https://rolisz.ro/2020/11/30/machine-learning-stories/
Mon, 30 Nov 2020 17:00:59 GMT

A couple of years ago I was working on a calendar application, on the machine learning team, to make it smarter. We had many great ideas, one of them being that once you indicated you wanted to meet with a group of people, the app would automatically suggest a time slot for the meeting.

We worked on it for several months. We couldn't just use simple hand-coded rules, because we wanted to do things like learn every user's working hours, which could vary based on many factors. In the end, we implemented the feature using a combination of hand-coded rules (to avoid some bad edge cases) and machine learning. We did lots of testing, both automated and manual, within our team.

Once the UI was ready, we did some user testing, where the new prototype was put in front of real users, unrelated to our team, who were recorded while they tried to use it and then were asked questions about the product. When the reports came in, the whole team banged their heads against the desk: most users thought we were suggesting times when the meeting couldn't take place!

What happened? If you included many people, or even just one very busy person, there would be no empty slot that worked for everyone. So our algorithm would make three suggestions, noting for each one that a different person might not be able to make the meeting.

In our own testing it was obvious to us what was happening, so we didn't consider it a big problem. But users who didn't know the system found it confusing and kept going back to the classic grid to manually find a slot for the meeting.

Lesson: machine learning algorithms are never perfect and every project needs to be prepared to deal with mistakes.

How will your machine learning project handle failures? How will you explain to the end users the decisions the algorithm made? If you need help answering these questions, let's talk.

]]>
<![CDATA[My next steps]]>
https://rolisz.ro/2020/11/29/my-next-steps/
Sun, 29 Nov 2020 19:19:40 GMT

Consulting is something I have dreamed of for a long time. I have done a little bit in the past, on the side, but now the time has come to pursue this full time.

I have used machine learning to solve a large variety of problems such as:

  • text recognition (OCR) from receipts
  • anomaly detection on monitoring data
  • understanding how people use calendar software
  • room booking recommendations
  • chatbots
  • real time surveillance video analysis
  • time series forecasting
  • personally identifiable information (PII) detection

So if you have a hairy machine learning problem and you need advice on how to move forward, I can help you find the best way.

If you are a software company that wants to start developing machine learning projects, I can provide training for your team so that they can develop these projects. I can also give presentations to managers and executives, explaining how machine learning projects are developed, what ML can and cannot do (it's not a silver bullet that will solve all known problems) and how ML projects are different from traditional software projects, both during development and in deployment.

If you are a company that wants to know if machine learning is the right solution for a problem you have, such as automating a process that currently is very labor intensive, I can help you make this decision and develop a strategy for making the transition towards automation.

Are you a company looking to acquire a machine learning solution that wants an independent appraisal of the cost, duration and feasibility of the project? I can help you with this as well.

So if you need a machine learning advisor, consultant or trainer, feel free to reach out to me.

]]>