rolisz's site

The Master Algorithm

Having a goal to read books really helps with reading them. I finally started reading through the mountain of machine learning books that have been sitting on my shelves for half a year.

The Master Algorithm by Pedro Domingos, a professor at the University of Washington, is a user-friendly book that introduces lay people to machine learning. It contains maybe 2-3 lines of math and no code, but a lot of explanations.

Professor Domingos divides all of machine learning into five "tribes": symbolists, connectionists, evolutionaries, Bayesians and analogizers. He describes various algorithms from each tribe, usually managing to make them quite understandable. However, for some reason, ...


As I promised a few days ago, I will present the project with which I took part in the Imprezzio Software Contest and which I will probably also present for my bachelor's thesis.

The idea for the application started in my first year of university, when during the first months I kept being amazed at how fast my money disappeared, so I started writing down all my expenses in Excel. Over time, the Excel sheet became more and more complex, and in the end I decided I would be better off building my own application, one that lets me scan receipts and runs OCR on them.

The receipt-scanning part... did not turn out as well as I would ...

Character segmentation overfitting

I'm working on a project that does OCR on receipts, and today, while trying to do character segmentation, I made a pretty stupid mistake that led to my model overfitting almost perfectly (in some cases I got 100% classification accuracy).

I already had my own data about letters (with the help of my parents, I labeled 7000 letters, with their bounding boxes, in about 25 receipts), and my classifier (a simple linear SVM) did pretty well on individual letters: between 90% and 94% accuracy. For something obtained with almost no fiddling, that's pretty good, and good enough for my purposes. Also, it's pretty much impossible to tell apart 0 ...

Neural Networks in Python

Artificial neural networks are a mathematical model, inspired by the brain, that is often used in machine learning. The model was first proposed in the '40s and attracted some interest at first, but that interest soon waned because of inefficient training algorithms and a lack of computing power. More recently, however, neural networks have come back into use, especially since the introduction of autoencoders, convolutional nets, dropout regularization and other techniques that improve their performance significantly.

Here I will present a simple multi-layer perceptron, implemented in Python using numpy.

Neural networks are formed by neurons that are connected to each other and that send each other signals. If the number of ...
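
To make the idea of connected neurons passing signals a bit more concrete, here is a minimal sketch of the forward pass of a multi-layer perceptron in numpy. The layer sizes, the sigmoid activation, the random initialization and the function names (`sigmoid`, `init_mlp`, `forward`) are all assumptions chosen for illustration, and training is left out entirely, so this is not necessarily how the full implementation looks.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation: squashes each value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def init_mlp(layer_sizes, seed=0):
    """Create small random weights and zero biases for each pair of layers.

    layer_sizes is a hypothetical list such as [4, 5, 3]:
    4 inputs, one hidden layer of 5 neurons, 3 outputs.
    """
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = rng.normal(0.0, 0.1, size=(n_in, n_out))
        biases = np.zeros(n_out)
        params.append((weights, biases))
    return params


def forward(params, inputs):
    """Send a batch of inputs through every layer in turn.

    Each neuron computes a weighted sum of the signals it receives,
    adds its bias and applies the activation function.
    """
    activations = inputs
    for weights, biases in params:
        activations = sigmoid(activations @ weights + biases)
    return activations


params = init_mlp([4, 5, 3])
batch = np.ones((2, 4))           # two example inputs with 4 features each
outputs = forward(params, batch)
print(outputs.shape)              # (2, 3): one row of 3 outputs per example
```

Each row of `outputs` holds the activations of the three output neurons for one input. With untrained random weights these values carry no meaning yet; a training step such as backpropagation would be needed to make them useful.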