rolisz's site

Indexing IM logs with Elasticsearch

Remember my old project for processing instant messaging logs? Probably not, because I wrote about it five years ago. Well, the project is only mostly dead: every once in a while I still work on it.

I mostly use it as an excuse to learn technologies that are used outside of the Google bubble. One thing that really impressed me, both with how well it works and with how easy it is to set up, was Elasticsearch. Elasticsearch is a search engine: you give it your documents, it indexes them, and it lets you query them fast. There are other projects that do this for you, but ES can continue.
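To make "index the documents, then query them fast" a bit more concrete, here is a toy inverted index in plain Python. It is only a sketch of the general idea, not Elasticsearch's API or implementation; the function names and the sample messages are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map every term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of the documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample messages, keyed by an arbitrary id.
logs = {
    1: "see you at lunch tomorrow",
    2: "lunch was great",
    3: "see you later",
}
index = build_index(logs)
matches = search(index, "see you")
```

A query only touches the sets stored for its own terms, so it never has to scan every message; that lookup structure is what makes searching fast once the indexing work has been done up front.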

Searching for something?

This post is dedicated to Ciprian from Bistrita, who has been asking for a search feature for some time now.

I didn't have search on my blog for quite some time, because it's a static website without any dynamic backend (except for comments, but those are well isolated on a subdomain). But JavaScript and browsers are gaining features every day, so it's now possible to do all of this client-side. You just have to go to the search page (also linked in the menu).

I had three options:

  • Acrylamid has a built-in feature that builds up at compilation some continue.

Acrylamid image gallery

Acrylamid doesn't have image galleries by default. As I occasionally like to post the pictures I take, especially now that I'm visiting nice places in Zurich and the surrounding area, this is quite annoying, because I don't want to create all the links for the photos manually.

In the official docs, there is a suggestion to use Jinja2 inside posts to create a local gallery. While it does save you from writing all the image links yourself, you still have to do this for every post in which you want to insert a gallery.

So I wanted to make something more generic, where I could insert with one continue.

Automating using Python

A friend of mine who has a restaurant kept asking me to update the menu on his site. Each week he would send me the new menu in .doc format. I would upload it to Google Drive (I don't have Office on my computer), take 3 screenshots (one for each page), rename them and upload them to his server over FTP.

But, being a programmer, possibly the only profession in the world where laziness is a quality, not a defect, I decided to automate this as much as possible (teaching him to write the tables in HTML and upload files over FTP was too hard).

At continue.

Neural Networks in Python

Artificial neural networks are a mathematical model, inspired by the brain, that is often used in machine learning. The model was first proposed in the '40s and attracted some early interest, but that soon waned because of the inefficient training algorithms used and the lack of computing power. More recently, however, neural networks have come back into use, especially since the introduction of autoencoders, convolutional nets, dropout regularization and other techniques that significantly improve their performance.

Here I will present a simple multi-layer perceptron, implemented in Python using numpy.
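As a rough idea of what such a perceptron looks like in numpy, here is a minimal forward pass. It is a sketch under my own assumptions (sigmoid activations, small random initial weights, hypothetical function names), not the code from the full post, and it leaves out training entirely.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(layer_sizes, seed=0):
    """Create one (weights, biases) pair per pair of consecutive layers."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(params, x):
    """Propagate a batch of inputs (one row per sample) through all layers."""
    activation = x
    for weights, biases in params:
        activation = sigmoid(activation @ weights + biases)
    return activation


# Assumed layout: 2 inputs, one hidden layer of 3 neurons, 1 output.
params = init_params([2, 3, 1])
batch = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
outputs = forward(params, batch)
```

Each layer is just a matrix multiplication followed by a nonlinearity, which is why numpy is a good fit: a whole batch of samples goes through a layer in a single `@` operation.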

Neural networks are formed by neurons that are connected to each other and that send each other signals. If the number of continue.

Single table inheritance in Camelot

Recently, while working on a class project, I had a small problem with inheritance in Camelot, an excellent framework for rapid application development in Python.

In the application we needed to be able to define different kinds of projects, with similar needs, but which were each available only to some user groups. Single table inheritance sounds like a natural fit for this.

I couldn't find anything in the Camelot documentation about this, but I figured that since it uses SQLAlchemy for its models, I could try the example given there:

class Employee(Base):
    __tablename__ = 'employee'
    id = Column(Integer, primary_key=True)

Camelot tutorial

I'm not Arthur or Merlin, so I won't be talking about the city of Camelot, but about the Python framework, which is pretty much everything but the kitchen sink.

Since this tutorial is mostly for the classmates I'm working with on the team project, I will assume that Camelot is already installed.

The project we will create is a small thing I've been wanting to build for a while, and this is the perfect opportunity: we will write a program that lets me track when certain things happen, in the hope that later on I can extract useful information from when those things happen. What I plan to measure continue.

Processing IM logs

For a few years now, I've kept all my IM archives. I didn't really have a purpose; I just thought it might be fun to look back one day and see what kind of discussions I had. Well, now I have 150 MB of logs from Digsby, Trillian and Pidgin, and there is no way I'm ever going to read all of that again. But in light of a few things I learned recently (the Coursera NLP and ML courses), I am going to try to visualize and analyze my archives in a mathematical way. That's right, I'm reducing you to numbers. :D At least what we've discussed continue.

Simple calculator in Python - part 3

This post is part of a series in which I build a small calculator in Python.

Last time we built the syntax tree corresponding to the expression, and now we will evaluate it. This is much simpler than parsing, so let's get it over with. At the end we will also build a REPL (read, evaluate, print loop), that is, a "console" for our calculator. It is useful because it lets us benefit from assigning values, which would otherwise be lost after each run.
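To illustrate why a REPL matters for assignments, here is a small sketch of a loop in which variables survive from one line to the next. It is not the series' code: Python's own `ast` module stands in for the tokenizer, parser and interpreter built in these posts, and `evaluate`, `run_line` and `repl` are hypothetical names.

```python
import ast
import operator

# Stand-in for the series' interpreter: the supported binary operators.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node, env):
    """Recursively evaluate an arithmetic syntax tree using variables in env."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        return _OPERATORS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand, env)
    raise ValueError("unsupported expression")


def run_line(line, env):
    """Evaluate one input line; assignments are stored in env."""
    statement = ast.parse(line.strip()).body[0]
    if isinstance(statement, ast.Assign) and isinstance(statement.targets[0], ast.Name):
        value = evaluate(statement.value, env)
        env[statement.targets[0].id] = value
        return value
    if isinstance(statement, ast.Expr):
        return evaluate(statement.value, env)
    raise ValueError("unsupported statement")


def repl():
    """Read, evaluate, print, loop; env lives as long as the session."""
    env = {}
    while True:
        try:
            line = input("calc> ")
        except EOFError:
            break
        if line.strip() == "quit":
            break
        if not line.strip():
            continue
        try:
            print(run_line(line, env))
        except (ValueError, KeyError, SyntaxError, ZeroDivisionError) as error:
            print("error:", error)
```

Calling `repl()` starts the loop. The important detail is that `env` is created once, outside the loop, so a value assigned with `x = 2` is still there when a later line asks for `x*3`.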

Let's start with the tests:

from interpreter import interpreter
from tokenizer import Tokenizer
from tree import parseTree

# Build the pipeline: tokenizer -> parse tree -> interpreter
pT = parseTree()
tok = Tokenizer()
interp = interpreter()
# A simple addition
Tree = pT.buildParseTree(tok.tokenize("1+2"))
assert interp.evaluate(Tree) == 3
# Nested parentheses and operator precedence
Tree = pT.buildParseTree(tok.tokenize("(5+(2*3+2))-3*((5+6)/2-4)"))
assert interp.evaluate(Tree) == 8.5
# An assignment
Tree = pT.buildParseTree(tok.tokenize("x = 2"))

Simple calculator in Python - part 2

This post is part of a series in which I build a small calculator in Python. Last time we dealt with splitting the input string into elementary pieces. Now we move on to the next part: parsing. This will give us the structure of the expression we want to evaluate. Whereas tokenizing gave us errors if the string contained invalid characters, now we will get errors if our tokens are not in the right order (two operators one after another, for example) or if there are not enough operands (the number of opening parentheses does not match the number of closing ones). It is important to note that for now we only assign syntax to the tokens, not semantics. The rules for the order of operations continue.