rolisz's site

Regular Expressions for Objects

For work I recently needed to do something that is very similar to regexes, but with a twist: it should operate on lists of objects, not only on strings. Luckily, Python came to the rescue with REfO, a library for doing just this.

My use case was selecting phrases from Part-of-Speech (POS) annotated text. The text was lemmatized and tagged using spaCy, which produced lists of [lemma, tag] pairs of the following form:

s = [['i', 'PRON'], ['look', 'VERB'], ['around', 'ADP'], ['me', 'PRON'],
 ['and', 'CCONJ'], ['see', 'VERB'], ['that', 'ADP'], ['everyone', 'NOUN'],
 ['be', 'VERB'], ['run', 'VERB'], ['around', 'ADV'], ['in', 'ADP'],
 ['a', 'DET'], ['hurry', 'NOUN']]
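
To make "regexes over lists" concrete before turning to REfO, here is a minimal sketch that uses only the standard `re` module. This is not how REfO works. It is a workaround that maps every POS tag to one character, matches a normal regex against the resulting string, and translates the match positions back into list indices. The `TAG_CODES` table, the `find_tag_patterns` helper and the example pattern are all made up for illustration.

```python
import re

# The POS-tagged sentence from above: [lemma, tag] pairs.
s = [['i', 'PRON'], ['look', 'VERB'], ['around', 'ADP'], ['me', 'PRON'],
     ['and', 'CCONJ'], ['see', 'VERB'], ['that', 'ADP'], ['everyone', 'NOUN'],
     ['be', 'VERB'], ['run', 'VERB'], ['around', 'ADV'], ['in', 'ADP'],
     ['a', 'DET'], ['hurry', 'NOUN']]

# Hypothetical one-letter code per tag, so that one token == one character.
TAG_CODES = {'PRON': 'P', 'VERB': 'V', 'ADP': 'A', 'CCONJ': 'C',
             'NOUN': 'N', 'ADV': 'R', 'DET': 'D'}


def find_tag_patterns(tokens, pattern):
    """Return the token sub-lists whose tag sequence matches `pattern`."""
    # Unknown tags become '_' so that string positions stay aligned with tokens.
    encoded = ''.join(TAG_CODES.get(tag, '_') for _, tag in tokens)
    # Because each token is exactly one character, match offsets are list indices.
    return [tokens[m.start():m.end()] for m in re.finditer(pattern, encoded)]


# Illustrative pattern: a pronoun or noun followed by one or more verbs.
matches = find_tag_patterns(s, r'[PN]V+')
for phrase in matches:
    print(' '.join(lemma for lemma, _ in phrase))
```

The trick breaks down as soon as a condition depends on more than one attribute of a token (for example, tag and lemma together), or when there are more distinct tags than convenient single characters. That is the gap a library operating directly on objects is meant to fill.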

From these sentences we want to extract human continue.