rolisz's site

Regular Expressions for Objects

For work I recently needed to do something that is very similar to regexes, but with a twist: it should operate on lists of objects, not only on strings. Luckily, Python came to the rescue with REfO, a library for doing just this.

My use case was selecting phrases from Part-of-Speech (POS) annotated text. The text was lemmatized and tagged using spaCy, which resulted in lists of the following form:

s = [['i', 'PRON'], ['look', 'VERB'], ['around', 'ADP'], ['me', 'PRON'],
 ['and', 'CCONJ'], ['see', 'VERB'], ['that', 'ADP'], ['everyone', 'NOUN'],
 ['be', 'VERB'], ['run', 'VERB'], ['around', 'ADV'], ['in', 'ADP'],
 ['a', 'DET'], ['hurry', 'NOUN']]
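REfO's own API is not shown in this excerpt, so the following is only a rough stand-in for the idea of running a regex over a list of objects, using nothing but the standard library. It encodes every POS tag as a single character, runs an ordinary `re` pattern over the encoded string, and maps each match back to the matching slice of the list. The `TAG_CODES` table, the `find_tag_patterns` helper and the example pattern (one or more verbs followed by an adposition or adverb) are all made up for illustration.

```python
import re

s = [['i', 'PRON'], ['look', 'VERB'], ['around', 'ADP'], ['me', 'PRON'],
     ['and', 'CCONJ'], ['see', 'VERB'], ['that', 'ADP'], ['everyone', 'NOUN'],
     ['be', 'VERB'], ['run', 'VERB'], ['around', 'ADV'], ['in', 'ADP'],
     ['a', 'DET'], ['hurry', 'NOUN']]

# Hypothetical one-character codes for the tags that appear in `s`.
TAG_CODES = {'PRON': 'P', 'VERB': 'V', 'ADP': 'A', 'CCONJ': 'C',
             'NOUN': 'N', 'ADV': 'R', 'DET': 'D'}


def find_tag_patterns(tokens, pattern):
    """Return the sub-lists of `tokens` whose tag sequence matches `pattern`."""
    # One character per token, so string offsets are also list offsets.
    encoded = ''.join(TAG_CODES[tag] for _, tag in tokens)
    return [tokens[m.start():m.end()] for m in re.finditer(pattern, encoded)]


# Illustrative pattern: one or more verbs followed by an adposition or adverb.
matches = find_tag_patterns(s, r'V+[AR]')
phrases = [' '.join(lemma for lemma, _ in match) for match in matches]
print(phrases)
```

This trick only works when the interesting property of each object fits in one character. A library that matches on arbitrary predicates does not have that limitation.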

From these sentences we want to extract human continue.

Real estate cohort analysis

In the last post we looked at how to get the data; now we are going to start analyzing it. The first question we are interested in is how quickly houses sell. We don't have access to actual contracts, so we will use a proxy to measure this: how long an advertisement for a house stays displayed. We are going to assume that this is roughly the time it takes to sell a house.

We will do a cohort analysis, where each cohort is composed of the ads that were shown for the first time on the same day, and we will track what continue.
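As a rough sketch of the cohort idea, assume each ad is stored with the first and the last day it was seen. Ads are grouped by their first day, and for each group we compute the share that was still displayed a given number of days later. The records below are made up, and `survival_by_cohort` is a hypothetical helper, not code from the analysis itself.

```python
from collections import defaultdict
from datetime import date

# Made-up records: (ad id, first day seen, last day seen).
ads = [
    ("a1", date(2018, 1, 1), date(2018, 1, 3)),
    ("a2", date(2018, 1, 1), date(2018, 1, 10)),
    ("a3", date(2018, 1, 2), date(2018, 1, 2)),
    ("a4", date(2018, 1, 2), date(2018, 1, 5)),
]


def survival_by_cohort(records, days_after):
    """Share of each cohort's ads still displayed `days_after` days after first being seen."""
    cohorts = defaultdict(list)
    for _, first_seen, last_seen in records:
        # The cohort key is the day the ad first appeared.
        cohorts[first_seen].append((last_seen - first_seen).days)
    return {
        day: sum(1 for lifetime in lifetimes if lifetime >= days_after) / len(lifetimes)
        for day, lifetimes in cohorts.items()
    }


still_shown = survival_by_cohort(ads, days_after=7)
for day, share in sorted(still_shown.items()):
    print(day, share)
```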

Scraping for houses

Having moved back to Romania, I decided I would need a place to live in, ideally to buy. So we started looking online for various places, and we went to see a lot of them. Lots of work, especially footwork. But, being the data nerd that I am, I wanted to get smart about it and analyze the market.

For that, I needed data. For data, I turned to scraping. For scraping, I turned to Scrapy. While I did write a scraper 5 years ago, I didn't want to reinvent the wheel yet again, so I turned to Scrapy because it's a well-known, widely used scraping framework in continue.

Indexing IM logs with Elasticsearch

Remember my old project for processing instant messaging logs? Probably not, because I wrote about it five years ago. Well, the project is only mostly dead: every once in a while I still work on it.

I mostly use it as an excuse to learn technologies that are used outside of the Google bubble. One thing that really impressed me, both with how well it works and how easy it is to set up, was Elasticsearch. Elasticsearch is a search engine. You give it your documents and it indexes them and enables you to query them fast. There are other projects that do this for you, but ES can continue.

Searching for something?

This post is dedicated to Ciprian from Bistrita, who has been asking for a search feature for some time now.

I didn't have a search on my blog for quite some time, because it's a static website, without any dynamic backend (except for comments, but those are well isolated, on a subdomain). But JavaScript and the browsers are getting more and more features every day, so it's now possible to do all of this client-side. You just have to go to the search page (also linked in the menu).

I had three options:

  • Acrylamid has a built-in feature that builds up at compilation some continue.

Acrylamid image gallery

Acrylamid by default doesn't have image galleries. As I occasionally like to post the images that I take, especially now that I'm visiting nice places in Zurich and the surrounding area, this is quite annoying, as I don't want to manually create all the links for photos.

In the official docs, there is a suggestion to use Jinja2 inside posts to create a local gallery. While it does save you from writing all the image links yourself, you still have to do this for every post in which you want to insert a gallery.

So I wanted to make something more generic, where I could insert with one continue.

Automating using Python

A friend of mine who has a restaurant kept asking me to update the menu on his site. Each week he would send me the new menu in .doc format. I would upload it to Google Drive (I don't have Office on my computer), take 3 screenshots (one for each page), rename them and upload them to his server using FTP.

But, being a programmer, possibly the only profession in the world where laziness is a quality, not a defect, I decided to automate this as much as possible (teaching him how to write the tables in HTML and uploading files to FTP was too hard).

At continue.

Neural Networks in Python

Artificial Neural Networks are a mathematical model, inspired by the brain, that is often used in machine learning. The model was first proposed in the '40s and attracted some interest at the time, but that interest soon waned because of the inefficient training algorithms used and the lack of computing power. More recently, however, they have started to be used again, especially since the introduction of autoencoders, convolutional nets, dropout regularization and other techniques that improve their performance significantly.

Here I will present a simple multi-layer perceptron, implemented in Python using numpy.

Neural networks are formed by neurons that are connected to each other and that send each other signals. If the number of continue.
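The post's own implementation is not reproduced in this excerpt. As a rough sketch of what "neurons sending each other signals" can look like in numpy, the forward pass of a small multi-layer perceptron might be written as below. The layer sizes, the sigmoid activation, the random initialization and the names `sigmoid` and `forward` are all assumptions made for illustration, and training is left out entirely.

```python
import numpy as np


def sigmoid(x):
    """Squash each value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def forward(x, weights, biases):
    """Propagate a batch of inputs through every layer in turn."""
    activation = x
    for w, b in zip(weights, biases):
        # Each neuron sums its weighted inputs, adds a bias and applies the sigmoid.
        activation = sigmoid(activation @ w + b)
    return activation


# Assumed architecture: 2 inputs, one hidden layer of 3 neurons, 1 output.
layer_sizes = [2, 3, 1]
rng = np.random.default_rng(0)
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

inputs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
outputs = forward(inputs, weights, biases)
print(outputs.shape)
```

With untrained random weights the outputs carry no meaning yet; the sketch only shows how signals flow from one layer to the next.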

Single table inheritance in Camelot

Recently, while working on a class project, I had a small problem with inheritance in Camelot, an excellent framework for rapid application development in Python.

In the application we needed to be able to define different kinds of projects, with similar needs, but which were each available only to some user groups. Single table inheritance sounds like a natural fit for this.

I couldn't find anything in the Camelot documentation about this, but I figured that since it uses SQLAlchemy for its models, I could try the example given there:

class Employee(Base):
    __tablename__ = 'employee'
    id = Column(Integer, primary_key=True)

Camelot Tutorial

I am not Arthur or Merlin, so I won't talk about the city of Camelot, but about the Python framework, which is pretty much everything but the kitchen sink.

Since this tutorial is mostly for the groupmates I'm working with on our team project, I will assume that Camelot is already installed.

The project we are going to create is a small thing I have wanted to build for a while, and this is the perfect opportunity: we will write a program that lets me track when certain things happen, in the hope that later on I can extract useful information from when those things happen. What I plan to measure continue.