rolisz's site en Wed, 30 Jan 2019 21:26:00 GMT acrylamid 0.7.10 NAS Outage #1 <dl> <dt>Incident scope</dt> <dd>NAS user not able to access Web Interface for NAS</dd> <dt>Incident duration</dt> <dd>At least 30 hours, up to 72 hours</dd> <dt>Incident resolution</dt> <dd>Change router port </dd> </dl> <p>Last weekend I was not at home. Some time on Saturday afternoon, I wanted to access some of the services I self-host on my NAS. I entered the URL and waited. I knew that sometimes the first request takes some time (maybe the NAS powers down the hard disks), so after waiting some time, I just hit refresh. Still nothing. I tried another service. Nothing. I tried all of them. Nothing. I tried going to the IP address of the NAS directly. I got the default page from the web server, which comes up when I don't use the right hostname. </p> <p>Okay, we have a problem. My first suspicion is that the DNS of my domain is broken somehow. I do a DNS query with <code>dig</code>. I freak out a bit when I see two IP addresses there, but after some searching, I find that one of them belongs to my DNS provider and is needed because I have DynamicDNS. I log into my provider, <a href="">Namecheap</a>, from whom I bought the domain and who is managing my DNS. Everything looks normal. </p> <p>Then I try to ssh into the NAS. This works pretty well, although it seems to be pretty slow. I try to look around in the logs, nothing suspicious. I restart the nginx server. I restart several other packages on the NAS that I think might have anything to do with this. Nothing. </p> <p>Then I think about trying to run <code>curl</code> to download the main page from my NAS. And lo and behold, I see the HTML of the login page to the Web Interface appear. Okay, let's try again in the browser. After several minutes, the loading stops, but nothing shows up. I check out the DOM Inspector and indeed things do show up there as well. What? 
There are no errors in the console, but the network tab does show that everything is really slow. </p> <p>After several hours of investigations, I give up and enjoy my time with my in-laws. I put my SRE hat back on only after I get back home the next day. Unsurprisingly, I still can't connect even from the LAN. It's actually a bit better, because some things open, but it's very flaky. And even more surprisingly, I discovered that I had Monica open on my phone browser and I can navigate it there! What is going on? </p> <p>I had recently moved the NAS to another room and I had tested the speed of my home network with <a href="">iperf3</a>. I decided to test it again. On the NAS I ran the following command: </p> <pre><code>sudo docker run -it --rm -p 5201:5201 networkstatic/iperf3 -s </code></pre> <p>And on my desktop I ran:</p> <pre><code>iperf3 -c -p 5201 -t 30 </code></pre> <p>The first time I tested it, in the first room, it was around 900 Mbits/sec. After I moved it to the other room, it dropped to around 500 Mbits/sec, but I assumed that's because of the lower quality cable that had been placed in the wall. But now, when I ran it again, I got around 40-50 Mbits/sec, sometimes even 0 Mbit/sec for several seconds. I ran the test several times and then I noticed something that looked suspicious: a column called "Retr" with values like "165", "229", "66", "160" in it. "Retr" looks very much like "Retries". My hunch is confirmed when looking in the iperf manual: some TCP packets are being retransmitted, which means I have packets dropped somewhere or packet corruption. </p> <p>Running <code>netstat -i</code> confirms that yes, I have errors only among the received packets. Searching for this issue reveals that the most common issue is a bad cable. I had crimped the Ethernet cable in the other room when I moved the NAS there, so I thought that's the problem. I recrimped it, but no luck. Then I thought that maybe the other end is bad. Nope. 
Then I tried plugging it into the other empty LAN port in the router. Bam. No more packet retries. Speed between my desktop and NAS is back to 900 Mbits/sec. I can access all the services. Outage over. The problem was a faulty port on my ISP-provided router. </p> <p>This was the biggest outage I have had so far with my NAS. All previous ones were several hours long at most, when the internet went out. I have a UPS, so even short electrical outages don't affect it. I didn't like the router I had from my ISP, but now I have an even bigger reason to get it replaced as soon as possible. A lesson I learned from this is that intermittent network errors can cause very weird issues, which can be partially masked because TCP has a lot of redundancies and retries built in.</p> Wed, 30 Jan 2019 21:26:00 GMT,2019-01-30:/2019/01/30/nas-outage-1 Books of 2018 <p>Inspired by many of the posts I saw in the blogosphere about how people did in their reading challenge in 2018, I decided to do a short recap as well.</p> <p>I started 12 books in 2018 and I finished 10 of them. This is unusual for me, because I'm quite picky about what books I start reading and I like to finish what I start. But I couldn't stand the author's style in one of the books, even though the topic sounded really interesting (The Sovereignty of God by A.W. Pink), and I found the other one to be too slow (The Progress of Doctrine in the New Testament by Thomas Dehany Bernard).</p> <div style="float: left; margin-right:20px; margin-bottom: 10px; max-height: 400px"> <a target="_blank" href=";camp=1789&amp;creative=9325&amp;creativeASIN=1785651560&amp;linkCode=as2&amp;tag=rolisz-20&amp;linkId=ee1b96f27cdb5a1dae4f4719f1a9d058"> <img border="0" src="/static/images/2019/01/books/nexus.png"> </a> </div> <p>An interesting shift for me was that I read only one fiction book last year, while before that most of the books that I read were fiction, usually sci-fi. 
I'm not sure why my tastes have changed like this over the last two years. Maybe I want to get more value out of books, rather than just read for entertainment value? Anyway, I ended last year with a sci-fi book, <a href=";camp=1789&amp;creative=9325&amp;creativeASIN=1785651560&amp;linkCode=as2&amp;tag=rolisz-20&amp;linkId=43ffec31c18ca59e0d08a47aaf542be0">Mass Effect: Nexus Uprising</a>, which I chose because I like the game series (even <a href="">Mass Effect: Andromeda</a>), and I actually liked the book too! It's a prequel to the latest game in the series and provides some pretty nice background information on what happens right upon the Nexus' arrival in Andromeda and how everything went wrong there. Sloane Kelly is fleshed out as a character and you find out some information about Jarun Tann and the Krogan Nakmor clan. One comment I have about the book is that sometimes bad things happen too unexpectedly (a sort of reverse Deus Ex Machina), especially the deterioration of relationships between different people. But still, I can't wait to read the two sequels for the book.</p> <p>Another "unusual" fact was that I read three books in Romanian in 2018. Among my last 50 read books, these are the only 3 ones in Romanian, there is another one in Hungarian and all the others are in English, because I prefer reading in the original language, which is English most of the time. One of these books was <a href=";aff_code=de74d8a57&amp;unique=184f69294&amp;redirect_to=http%253A//">Disconfort Residence</a>, which is a book written by a Romanian architect (so Romanian was the original language), about what to look out for when buying an apartment. This comes in very handy as I'm looking to get my own place. The other two books were borrowed from my housemate: one was Seth Godin's <a href=";aff_code=de74d8a57&amp;unique=184f69294&amp;redirect_to=http%253A//">We're all weird</a>. I liked it, but I prefer Seth's style in English (which I know from reading his blog). 
And also, "We're all weird" rolls off the tongue better than "Toti suntem ciudati", doesn't it? :D The other one was <a href=";aff_code=de74d8a57&amp;unique=184f69294&amp;redirect_to=http%253A//">The Little Book That Saves Your Assets</a>, about investing. I didn't really like the book, because it recommends too active an approach to investing, while I believe more in a passive, long term one. </p> <div style="float:right; margin-left: 20px; max-height:400px"> <a href=";tag=rolisz-20&amp;camp=1789&amp;creative=9325&amp;linkCode=as2&amp;creativeASIN=9814021725&amp;linkId=f6265f518a607718a92b15885fbd4138" target="_blank"> <img src="/static/images/2019/01/books/unknowable.jpg" height="400px"> </a> </div> <p>I found two books to be very interesting and full of new and challenging ideas for me. The first one was <a href=";tag=rolisz-20&amp;camp=1789&amp;creative=9325&amp;linkCode=as2&amp;creativeASIN=9814021725&amp;linkId=f6265f518a607718a92b15885fbd4138">The Unknowable</a> by Gregory Chaitin, who started the field of algorithmic information theory and made contributions to metamathematics. The book describes how you cannot find the shortest program that writes out a given string, which is one of the significant provably unprovable facts in mathematics and computer science, after Gödel's Incompleteness Theorem and Turing's Halting Problem. While I kinda understood the proof Chaitin presents here in layman's terms, I still find it hard to believe sometimes. The description of the problem is very simple, yet Chaitin builds up a fairly short proof that you cannot calculate it for an arbitrary string. This problem, in turn, is related to a number Chaitin discovered, which can be described mathematically and several of its properties are known, but it's still uncomputable. And not just because we don't know how to compute it, but we actually know that it's impossible to do so by anyone in this universe. 
The question is then can God do it?</p> <p>The other thought-provoking one is the <a href=";camp=1789&amp;creative=9325&amp;creativeASIN=019932218X&amp;linkCode=as2&amp;tag=rolisz-20&amp;linkId=72ffb09de99e92e8b884f9d01869e1cf">Economics of Good and Evil</a>. The book does a whirlwind tour of economics, starting with the Book of Gilgamesh, the Bible, ancient Greeks, medieval times to modern days, and shows how even in an epic story like the one about Gilgamesh, you can find economic reasoning, especially one related to what's good and what's evil. The author, who used to be the economic advisor to the Czech president, questions the current prevailing mindset that growth must happen at all costs, because anything else is bad. </p> <div style="float:left; margin-right: 20px; max-height:400px; margin-bottom: 10px"> <a href=";tag=rolisz-20&amp;camp=1789&amp;creative=9325&amp;linkCode=as2&amp;creativeASIN=0764216171&amp;linkId=f16f59728858f6226d52b8c69d1bebef" target="_blank"> <img src="/static/images/2019/01/books/faith.jpg" height="400px"> </a> </div> <p>The other books that I read were Christian books. Two of them stand out: <a href=";camp=1789&amp;creative=9325&amp;creativeASIN=0310517826&amp;linkCode=as2&amp;tag=rolisz-20&amp;linkId=41111829c2b6610e989156b12c8d8fa2">How to Read the Bible for All Its Worth</a> by Gordon Fee and Douglas Stuart. The book is exactly about what the title says: to help people read the Bible better, by providing some necessary "lenses" with which to understand the text. It explains how different books of the Bible have different styles, which means they need to be read differently. For example, you don't read poems, such as the Psalms, in the same way as you read a letter, like the Epistles of Paul, or like you read the narratives from the Old Testament. I loved this book and I try to put it into practice as often as I can, while doing my daily devotional reading. 
The other Christian book that I loved was <a href=";tag=rolisz-20&amp;camp=1789&amp;creative=9325&amp;linkCode=as2&amp;creativeASIN=0764216171&amp;linkId=f16f59728858f6226d52b8c69d1bebef">A Disruptive Faith</a> by A.W. Tozer. What he says in the first chapter stands out for me, how we are a fixture on God's mind and that He cannot not think about us. He loves us, despite our frailty and sin and he is grieved by our bad actions. He is crushed by this burden. And that's what drives his actions, not some cold calculated plan to fulfill his purpose. The rest of the book just expands on this :)</p> <p>This was a short review of some of the books that I read last year. I hope this year I will be able to read more!</p> Mon, 14 Jan 2019 07:36:00 GMT,2019-01-14:/2019/01/14/books-of-2018 Man (and car) versus nature 2 <p><img alt="View from the hike" src="" width="100%" style="max-width:600px"/></p> <p>Almost <a href="">a year later</a>, I went again on a winter hike, with the same group of friends, to the "Biserica Moțului" Peak, close to Padiș. This time I was one of the drivers, so I went with our little Skoda, with 3 other guys. </p> <figure> <p><img alt="The church at the peak" src="" width="100%" style="max-width:600px"/> <figcaption>The Moțului Church at the peak</figcaption></p> </figure> <p>Because this hike was shorter, we decided it's enough if we leave one hour later than last time. It was a very good decision, which would have been better if we had left one more hour later. </p> <p>The evening before we left, it started snowing. We decided that we will still go, no matter what. The road until Boga, 15 kilometers away from Padis, was ok. There we ran into the snow plower, which was going in front of us. At some point, it let us go ahead. 
The other car from our group was an all-wheel-drive Touareg, so they forged right ahead and I followed in their tracks.</p> <figure> <p><img alt="Following the snow plow" src="" width="100%" style="max-width:600px"/> <figcaption>Following the snow plow</figcaption></p> </figure> <p>But the snow and the slope were too much for the little Skoda Fabia. After less than 500 meters, it just couldn't continue, so the guys pushed the car to the side and we waited for the snow plow. We finished the last 9.2 kilometers to Padis in one and a half hours. Behind us we could see other cars catching up to us and then turning back when they saw the glacial pace at which we were moving. </p> <figure> <p><img alt="Closed road sign" src="" width="100%" style="max-width:600px"/> <figcaption>The road is closed!</figcaption></p> </figure> <p>Once we got to Padis, we started the actual hike. The visibility was not too great, so in the first kilometer we did some zig zagging until we found the "path", which was under 1 meter of snow. Or so the GPS said. </p> <p>The hike was easier and shorter than last year's, but the end was much harder for me. While I complained last year that the snow kept breaking under us and we would fall in knee deep, this year was worse. I gained 4 extra kilos, so the snow would give in until I was waist deep. And it's much, much, much harder to climb out when you are so deep and the snow keeps collapsing with you. I made a third of the last 100 meters mostly crawling. </p> <p><img alt="Closed road sign" src="" width="100%" style="max-width:600px"/></p> <p>But I survived and the hike back was much easier, unlike last year! And I still want to repeat this!</p> Sun, 06 Jan 2019 14:32:00 GMT,2019-01-06:/2019/01/06/man-and-car-versus-nature-2 2018 in Review <p>In 2018 I finally managed to buck the trend of writing fewer and fewer posts: I wrote 22 posts, just like in the previous year! 
That's a bit less than what I would have liked (at least two a month), but personal life keeps getting busier and busier. </p> <p>The number of sessions continued dropping, by almost 30%, down to 10000. Pageviews dropped a bit below 20000, but session length grew by 6%. </p> <p>The day with the most pageviews was the 1st of February, with 369 views, when I posted about <a href="">leaving Google</a>. The second most popular day was October 17, after I posted about our trip to <a href="">Hong Kong</a>, when I had 285 page views. </p> <p>My most popular page remains the same: the neural networks one. From this year, the most popular one was the one about leaving Google, with 442 views, followed by the <a href="">Synology and Docker</a> one, with 379 views. </p> <p>I hope I will manage to get an increase in numbers next year. I have quite a few ideas, so I hope I can write more blog posts, both more fun ones and more technical ones, and who knows, maybe I'll even have a guest writer :)</p> Mon, 31 Dec 2018 23:46:00 GMT,2019-01-01:/2019/01/01/2018-in-review Boardgames Party: Sushi Go <p><img alt="Sushi Go box" src="" width="100%" style="max-width:600px"/></p> <p>This year I caught the bug of boardgames. I bought a lot of them, I tried a lot of them and I played some a loooot of times. My lovely wife also enjoys this hobby very much, so we often have friends over to play various boardgames. Almost all of our friends enjoy playing boardgames, I even managed to convince my parents to play once, but my dad confessed that he just... doesn't see the point in playing any kinds of games. </p> <p>I want to start reviewing some of the boardgames in my collection. 
Today I will start with <a href=";aff_code=de74d8a57&amp;unique=184f69294&amp;redirect_to=http%253A//">Sushi Go</a> (<a href=";camp=1789&amp;creative=9325&amp;creativeASIN=B00J57VU44&amp;linkCode=as2&amp;tag=rolisz-20&amp;linkId=184d1eeb866051ca1ea355ddd4085cd9">Amazon</a> link), a very simple and quick card game which revolves around sushi, obviously. </p> <p><img alt="A hand of cards in Sushi Go" src="" width="100%" style="max-width:600px"/></p> <p>Every player gets six cards and they have to choose one, passing the others to the player on their left, and then repeat. In this way, you build a set of six cards which will give you points in the end. </p> <p>However, you have to be careful, because some cards only give points if they are in pairs (such as tempura), if you have the most of them, or if they are combined with something else. </p> <div class="gallery clearfix" style="max-width:600px"> <img title="The more dumplings you have, the more points each gives. " src="" style="display:block" width="100%" style="max-width:600px"/> <img title="You only get points from Makis if you have the most Makis. " src="" style="display:block" width="100%" style="max-width:600px"/> <script type="application/json">[{"src": "/static/images/2018/12/sushi_go/dumplings.jpg", "title": "The more dumplings you have, the more points each gives.\n", "msrc": "/static/images/2018/12/sushi_go/thumbs/dumplings.jpg", "w": 1440, "h": 1920}, {"src": "/static/images/2018/12/sushi_go/maki.jpg", "title": "You only get points from Makis if you have the most Makis.\n", "msrc": "/static/images/2018/12/sushi_go/thumbs/maki.jpg", "w": 1440, "h": 1920}, {"src": "/static/images/2018/12/sushi_go/nigiri.jpg", "title": "Nigiris are not worth much on their own...\n", "msrc": "/static/images/2018/12/sushi_go/thumbs/nigiri.jpg", "w": 1440, "h": 1920}, {"src": "/static/images/2018/12/sushi_go/wasabi.jpg", "title": "... 
but with Wasabi, they are worth 3x as much.\n", "msrc": "/static/images/2018/12/sushi_go/thumbs/wasabi.jpg", "w": 1440, "h": 1920}, {"src": "/static/images/2018/12/sushi_go/sashimi.jpg", "title": "Sashimi gives you points only if you have 3 of them.\n", "msrc": "/static/images/2018/12/sushi_go/thumbs/sashimi.jpg", "w": 1440, "h": 1920}]</script> </div> <p>It's a very quick game, not taking more than 15 minutes. Explaining is also easy. With everyone learning the game, it took 5 minutes to read and explain the rules, the next time it will probably take only 2 minutes. </p> <p>Score: 9</p> <p>And as an added bonus, it's about Sushi, which I really love and somehow, after lots of persuasion, my wife also came to love! And I want to give a shoutout to <a href="">Sushi 101</a>, the best sushi place in Oradea! </p> <p><img alt="Sushi plate from Sushi 101" src="" width="100%" style="max-width:600px"/></p> Sat, 29 Dec 2018 22:38:00 GMT,2018-12-30:/2018/12/30/boardgames-party-sushi-go Synology Moments <p><img alt="First pictures" src="" width="100%" style="max-width:600px"/></p> <p>I have almost 60000 pictures (about 195 GB), going back to about 2007 and I like to keep things organized. Of course, I also don't want to manually tag 60000 pictures. Until last year, the only option to avoid that was Google Photos, which would automatically recognize objects, locations and faces in pictures and give you the option to search by them. But Google Photos has the downside that you have to upload all your pictures to a 3rd party and that you have to pay a monthly fee for that storage. 
</p> <p>Well, at the end of last year, Synology (the company that makes <a href="">my NAS</a>) launched a beta for Synology Moments, an app that does exactly what I want: it takes all my pictures, analyzes them with deep learning, and gives me a nice UI to search and browse them, by object, location or faces.</p> <p><img alt="Object browsing" src="" width="100%" style="max-width:600px"/></p> <p>While it was in beta, Moments had some issues, but they have since been resolved (or at least the documentation has been made clearer about those issues). In the last half a year it has worked well for me, which is why I'm finally writing this post.</p> <p>One of the quirks of Moments is that photos have to be in a certain folder and you can't specify multiple folders all over the disk to be indexed. Some people have had luck with symlinks, but I just copied everything to one folder and that's it. It means I have a little bit of duplication (for example, between my automatic backups from Google Drive and the ones from Moments), but I still have pleeenty of space in my RAID6 array, so it's fine. </p> <p>The server that does all the heavy lifting lives on my NAS, but Moments also has mobile apps, for both Android and iOS, which enable you to automatically back up every photo and video that you take to the server. The same apps also allow you to search and browse your gallery of pictures, in almost the same way as the web app (except that you don't have a folder view). </p> <p><img alt="Face clustering" src="" width="100%" style="max-width:600px"/></p> <p>The clustering of faces (the process where it tries to group faces that it thinks belong to the same person) is not as good as the one that Google Photos has. I spent several hours merging clusters that were of the same person but that Moments thought were of different people. </p> <p>The object recognition part is pretty good, but that's something I use much less frequently, so I don't notice as many errors. 
</p> <p>Moments also does reverse geocoding, so it tries to identify where every picture was taken, from GPS coordinates in EXIF metadata, and then it shows them around the pictures.</p> <p><img alt="Timeline" src="" width="100%" style="max-width:600px"/></p> <p>The backup has worked flawlessly so far from my phone. Pictures I take with my camera are saved only here. Pictures taken with my phone are backed up to both Google Photos and Moments. </p> <p>One of the biggest downsides of Moments is its availability: it works only on Synology NAS devices, so if you don't have one of these devices, you can't get it :(</p> <p>I am really happy that I found this app to organize all my photos, on a device that is under my control, so I strongly recommend it!</p> Sun, 16 Dec 2018 19:03:00 GMT,2018-12-16:/2018/12/16/synology-moments Blogs I follow - part 2 <p>After a <a href="">year</a>, prompted by a question Catalin asked me, I finally made time to write about a couple more great writers I follow.</p> <dl> <dt><a href="">Seth Godin</a></dt> <dd>Seth does many things, but he is mainly a marketing person. However, he doesn't do marketing in the annoying, in your face spammy way, but in a thoughtful, creative way. Instead of trying to go for the masses, to appeal to everyone, to bring the product to the lowest common denominator, he tries to broaden the market, to find new people, to win people over by quality, not by the lowest price and by having an eye on the long term. I have too many posts from him saved in Wallabag (about which I'll write at another time), so here are two: <a href="">Avoiding the GIGO trap</a> and <a href="">What would happen</a>.</dd> <dt><a href="">DataGenetics</a></dt> <dd>DataGenetics is a blog written by Nick Berry, a data scientist. On his blog, he tackles a lot of interesting questions, with lots of math and physics. I enjoy trying to work out solutions before reading his post. 
Some of my favorites are about <a href="">Kolmogorov randomness</a> (which is necessary for the whole algorithmic randomness concept), <a href="">Wind Turbine Efficiency</a> and <a href="">Cake Cutting</a>.</dd> <dt><a href="">Itchy Feet</a></dt> <dd>It's not a blog per se, but it's a comic I follow via RSS, so meh. It speaks to my heart, given that my feet are also itching to travel, because I haven't flown since September :( I have actually met the author while at Google and I have one of his books with! He's really nice. Some of my favorite comics from him: <a href="">Slovakia vs Slovenia</a>, <a href="">Relatively inclined</a> and <a href="">Sedentary Workout</a> (the last two I've experienced myself).</dd> </dl> Fri, 30 Nov 2018 21:15:00 GMT,2018-11-30:/2018/11/30/blogs-i-follow-part-2 Monitoring GPU usage with StackDriver <p>At work we use Google Cloud Platform to run our machine learning jobs on multiple machines. GCP has a monitoring platform called Stackdriver which can be used to view all kinds of metrics about your VMs. Unfortunately, it doesn't collect any metrics about GPUs, neither usage nor memory. The good news is that it is extensible and you can "easily" set up a new kind of metric and monitor it. </p> <p>To get GPU metrics, we can use the <code>nvidia-smi</code> program, which is installed when you get all the necessary drivers for your graphics card. If you call it without any arguments, it will give you the following output:</p> <div class="highlight"><pre>&gt; nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.66       Driver Version: 410.66       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   43C    P8    17W / 250W |   1309MiB / 11177MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       700      G   /usr/lib/Xorg                                 40MiB |
|    0       733      G   /usr/bin/gnome-shell                         110MiB |
|    0       931      G   /usr/lib/Xorg                                371MiB |
|    0      1119      G   /usr/lib/firefox/firefox                       2MiB |
|    0      1279      G   /usr/lib/firefox/firefox                       3MiB |
|    0     23585      G   /usr/lib/firefox/firefox                      24MiB |
+-----------------------------------------------------------------------------+
</pre></div> <p>This is a bit convoluted, hard to parse and has too many details. But, with the right flags, you can get just what you want in CSV format: </p> <div class="highlight"><pre>&gt; nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv,noheader,nounits
10, 35
</pre></div> <p>The first value is the GPU utilization, as a percentage, and the second value is the memory utilization of the GPU (how busy the GPU memory was during the sample period), also as a percentage.</p> <p>We are going to write a Python process that opens a subprocess to call nvidia-smi once a second and aggregates statistics, on a per minute basis. We have to do this, because we cannot write to Stackdriver metrics more than once a minute, per label (labels are a sort of identifier for these time series). 
</p> <div class="highlight"><pre><span class="kn">from</span> <span class="nn">subprocess</span> <span class="kn">import</span> <span class="n">Popen</span><span class="p">,</span> <span class="n">PIPE</span> <span class="kn">import</span> <span class="nn">os</span> <span class="kn">import</span> <span class="nn">time</span> <span class="kn">import</span> <span class="nn">sys</span> <span class="k">def</span> <span class="nf">compute_stats</span><span class="p">():</span> <span class="n">all_gpu</span> <span class="o">=</span> <span class="p">[]</span> <span class="n">all_mem</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span> <span class="n">p</span> <span class="o">=</span> <span class="n">Popen</span><span class="p">([</span><span class="s">&quot;nvidia-smi&quot;</span><span class="p">,</span><span class="s">&quot;--query-gpu=utilization.gpu,utilization.memory&quot;</span><span class="p">,</span> <span class="s">&quot;--format=csv,noheader,nounits&quot;</span><span class="p">],</span> <span class="n">stdout</span><span class="o">=</span><span class="n">PIPE</span><span class="p">)</span> <span class="n">stdout</span><span class="p">,</span> <span class="n">stderror</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">communicate</span><span class="p">()</span> <span class="n">output</span> <span class="o">=</span> <span class="n">stdout</span><span class="o">.</span><span class="n">decode</span><span class="p">(</span><span class="s">&#39;UTF-8&#39;</span><span class="p">)</span> <span class="c"># Split on line break</span> <span class="n">lines</span> <span class="o">=</span> <span class="n">output</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span 
class="n">linesep</span><span class="p">)</span> <span class="n">numDevices</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">lines</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span> <span class="n">gpu</span> <span class="o">=</span> <span class="p">[]</span> <span class="n">mem</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">g</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">numDevices</span><span class="p">):</span> <span class="n">line</span> <span class="o">=</span> <span class="n">lines</span><span class="p">[</span><span class="n">g</span><span class="p">]</span> <span class="n">vals</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&#39;, &#39;</span><span class="p">)</span> <span class="n">gpu</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">float</span><span class="p">(</span><span class="n">vals</span><span class="p">[</span><span class="mi">0</span><span class="p">]))</span> <span class="n">mem</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">float</span><span class="p">(</span><span class="n">vals</span><span class="p">[</span><span class="mi">1</span><span class="p">]))</span> <span class="n">all_gpu</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">gpu</span><span class="p">)</span> <span class="n">all_mem</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">mem</span><span class="p">)</span> <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="n">max_gpu</span> 
<span class="o">=</span> <span class="p">[</span><span class="nb">max</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">all_gpu</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">numDevices</span><span class="p">)]</span> <span class="n">avg_gpu</span> <span class="o">=</span> <span class="p">[</span><span class="nb">sum</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">all_gpu</span><span class="p">)</span><span class="o">/</span><span class="nb">len</span><span class="p">(</span><span class="n">all_gpu</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">numDevices</span><span class="p">)]</span> <span class="n">max_mem</span> <span class="o">=</span> <span class="p">[</span><span class="nb">max</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">all_mem</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">numDevices</span><span class="p">)]</span> <span class="n">avg_mem</span> <span class="o">=</span> <span class="p">[</span><span class="nb">sum</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span 
class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">all_mem</span><span class="p">)</span><span class="o">/</span><span class="nb">len</span><span class="p">(</span><span class="n">all_mem</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">numDevices</span><span class="p">)]</span> <span class="k">return</span> <span class="n">max_gpu</span><span class="p">,</span> <span class="n">avg_gpu</span><span class="p">,</span> <span class="n">max_mem</span><span class="p">,</span> <span class="n">avg_mem</span> </pre></div> <p>Here we computed both the average and the maximum over a 1-minute interval. This can be changed to other statistics if they are more relevant for your use case. </p> <p>To write the data to Stackdriver, we have to build up the appropriate protobufs. We will set two labels: one for the zone in which our machines are and one for the <code>instance_id</code>, which we will hack to contain both the name of the machine and the number of the GPU (this is useful in case you attach multiple GPUs to one machine). I hacked the <code>instance_id</code> because Stackdriver kept refusing any API calls with custom labels, even though the docs said it supported them.
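</p> <p>To make the <code>instance_id</code> hack concrete, here is a minimal sketch of how such a combined label value can be built and later split back into a machine name and a GPU number. The helper names <code>make_instance_label</code> and <code>split_instance_label</code> and the example machine name are hypothetical and are not part of the original script; only the <code>_gpu_</code> separator is taken from it.</p> <div class="highlight"><pre>
# Hypothetical helpers (not part of the original script) illustrating the
# "instance_id" hack: the machine name and GPU index share one label value.
SEPARATOR = "_gpu_"


def make_instance_label(instance_name, gpu_nr):
    """Combine the machine name and GPU number into a single label value."""
    return instance_name + SEPARATOR + str(gpu_nr)


def split_instance_label(label):
    """Recover (machine name, GPU number) from a combined label value."""
    # rpartition splits on the last separator, so a machine name that
    # itself contains "_gpu_" is still handled.
    name, sep, gpu_nr = label.rpartition(SEPARATOR)
    if not sep or not gpu_nr.isdigit():
        raise ValueError("not a combined instance label: " + repr(label))
    return name, int(gpu_nr)


label = make_instance_label("trainer-1", 0)  # "trainer-1" is a made-up name
print(label)
print(split_instance_label(label))
</pre></div> <p>The script itself builds the same string inline, from <code>sys.argv[1]</code> and the GPU number, when it fills in the resource labels.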
</p> <div class="highlight"><pre><span class="kn">from</span> <span class="nn"></span> <span class="kn">import</span> <span class="n">monitoring_v3</span> <span class="n">client</span> <span class="o">=</span> <span class="n">monitoring_v3</span><span class="o">.</span><span class="n">MetricServiceClient</span><span class="p">()</span> <span class="n">project</span> <span class="o">=</span> <span class="s">&#39;myGCPprojectid&#39;</span> <span class="n">project_name</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">project_path</span><span class="p">(</span><span class="n">project</span><span class="p">)</span> <span class="k">def</span> <span class="nf">write_time_series</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">gpu_nr</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span> <span class="n">series</span> <span class="o">=</span> <span class="n">monitoring_v3</span><span class="o">.</span><span class="n">types</span><span class="o">.</span><span class="n">TimeSeries</span><span class="p">()</span> <span class="n">series</span><span class="o">.</span><span class="n">metric</span><span class="o">.</span><span class="n">type</span> <span class="o">=</span> <span class="s">&#39;;</span> <span class="o">+</span> <span class="n">name</span> <span class="n">series</span><span class="o">.</span><span class="n">resource</span><span class="o">.</span><span class="n">type</span> <span class="o">=</span> <span class="s">&#39;gce_instance&#39;</span> <span class="n">series</span><span class="o">.</span><span class="n">resource</span><span class="o">.</span><span class="n">labels</span><span class="p">[</span><span class="s">&#39;instance_id&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span 
class="p">]</span> <span class="o">+</span> <span class="s">&quot;_gpu_&quot;</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">gpu_nr</span><span class="p">)</span> <span class="n">series</span><span class="o">.</span><span class="n">resource</span><span class="o">.</span><span class="n">labels</span><span class="p">[</span><span class="s">&#39;zone&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&#39;us-central1-f&#39;</span> <span class="n">point</span> <span class="o">=</span> <span class="n">series</span><span class="o">.</span><span class="n">points</span><span class="o">.</span><span class="n">add</span><span class="p">()</span> <span class="n">point</span><span class="o">.</span><span class="n">value</span><span class="o">.</span><span class="n">double_value</span> <span class="o">=</span> <span class="n">value</span> <span class="n">now</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="n">point</span><span class="o">.</span><span class="n">interval</span><span class="o">.</span><span class="n">end_time</span><span class="o">.</span><span class="n">seconds</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">now</span><span class="p">)</span> <span class="n">point</span><span class="o">.</span><span class="n">interval</span><span class="o">.</span><span class="n">end_time</span><span class="o">.</span><span class="n">nanos</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span> <span class="p">(</span><span class="n">now</span> <span class="o">-</span> <span class="n">point</span><span class="o">.</span><span class="n">interval</span><span class="o">.</span><span class="n">end_time</span><span class="o">.</span><span class="n">seconds</span><span class="p">)</span> <span class="o">*</span> <span 
class="mi">10</span><span class="o">**</span><span class="mi">9</span><span class="p">)</span> <span class="n">client</span><span class="o">.</span><span class="n">create_time_series</span><span class="p">(</span><span class="n">project_name</span><span class="p">,</span> <span class="p">[</span><span class="n">series</span><span class="p">])</span> </pre></div> <p>And now, we put everything together. The program must be called with the name of the instance as its first parameter. If you run it only on GCP, you can use the GCP APIs to get the name of the instance automatically.</p> <div class="highlight"><pre><span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">2</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s">&quot;You need to pass the instance name as first argument&quot;</span><span class="p">)</span> <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="k">try</span><span class="p">:</span> <span class="n">max_gpu</span><span class="p">,</span> <span class="n">avg_gpu</span><span class="p">,</span> <span class="n">max_mem</span><span class="p">,</span> <span class="n">avg_mem</span> <span class="o">=</span> <span class="n">compute_stats</span><span class="p">()</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">max_gpu</span><span class="p">)):</span> <span class="n">write_time_series</span><span class="p">(</span><span class="s">&#39;max_gpu_utilization&#39;</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">max_gpu</span><span
class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="n">write_time_series</span><span class="p">(</span><span class="s">&#39;max_gpu_memory&#39;</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">max_mem</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="n">write_time_series</span><span class="p">(</span><span class="s">&#39;avg_gpu_utilization&#39;</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">avg_gpu</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="n">write_time_series</span><span class="p">(</span><span class="s">&#39;avg_gpu_memory&#39;</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">avg_mem</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span> </pre></div> <p>If you save all this code to a file called <code></code> and you run this locally, on a machine with an NVidia GPU, after a minute you should start seeing the new metrics in your Stackdriver console associated with your GCP project.</p> <p><img alt="Stackdriver GPU graphs" src="" width="100%" style="max-width:600px"/></p> <p>This code can then be called with cron once a minute or it can be changed so that it runs without stopping, posting results once a minute. </p> <div class="highlight"><pre>* * * * * python /path/to/ instance_name &gt;&gt; /var/log/gpu.log 2&gt;&amp;1 </pre></div> <p>Setting up the GCP project and authentication to connect to Stackdriver is left as an exercise to the user. 
The whole code can be seen in this <a href="">gist</a>.</p> Wed, 21 Nov 2018 14:06:00 GMT,2018-11-21:/2018/11/21/monitoring-gpu-usage-with-stackdriver Epistaxis <p><q>hemorrhage from the nose, usually due to rupture of small vessels overlying the anterior part of the cartilaginous nasal septum. Minor bleeding may be caused by a blow on the nose, irritation from foreign bodies, or vigorous nose-blowing during a cold</q></p> <p style="text-align:right"> <a href="">The Free Dictionary</a> </p> <p>Since I was a little kid I would sometimes get nosebleeds, especially when I had a cold and would blow my nose a lot. I recently started tracking this, and it seems to happen every 2-3 months, but until now, the bleeding stopped in half an hour or one hour at most.</p> <p>But on Saturday, it didn't. It started around 7 AM. It stopped 2-3 times, but it resumed at the smallest effort (like getting up). I tried pinching my nose, lying down, standing up and tilting my head forward and other things I found on the internet, but the bleeding continued. </p> <p>After about four hours, I decided to go to the ER. I was a bit scared, knowing that they would probably do a nasal cauterization to close off all the capillaries that keep breaking. I was also scared because it was my first visit to the ER and my first more "serious" thing that needed treatment at a hospital. And, because of the awesome bubble I live in on Facebook, I was also a bit scared about nosocomial infections<sup id="fnref-nosocomial"><a class="footnote-ref" href="">1</a></sup>, which have made several headlines lately in Romania. Just the day before, I had heard from some friends about how long they had to wait at the ER and how rude the doctors were.</p> <p>But I have to say, I was mostly pleasantly surprised. The nurses and the doctors at the ER were very nice, calm and friendly. The wait times were quite short. After I got my blood pressure measured, I was sent upstairs to the ENT section.
After a short wait here, the doctor passed by me in the hall, barely looking at me, and told the nurse to prep me for cauterization. Here things weren't quite as nice: the doctor seemed particularly bored with doing this, but the actual cauterization didn't take more than half a minute and it wasn't that painful! Well, I had my nose stuffed with anesthetics just a minute before, but still. Then the doc stuffed my nose with nasal packing, told me to come back in 5 days to take it out and sent me on my way. </p> <p>So, I got over my first "intervention" at a hospital and hopefully I fixed the nose bleeding problem for a while! Go Spitalul Județean Oradea!</p> <div class="footnote"> <hr /> <ol> <li id="fn-nosocomial"> <p>hospital acquired infections&#160;<a class="footnote-backref" href="" title="Jump back to footnote 1 in the text">&#8617;</a></p> </li> </ol> </div> Mon, 05 Nov 2018 20:05:00 GMT,2018-11-05:/2018/11/05/epistaxis Sweating in Hong Kong <figure> <p><img alt="Landing in Hong Kong" src="" width="100%" style="max-width:600px"/> <figcaption>Landing in Hong Kong</figcaption></p> </figure> <p>At the beginning of the year, a dear brother and good friend, whom I met in New York, invited me to his wedding which was going to be in Hong Kong. What better excuse to visit "China" than going to a wedding? This also made the decision of where to go on vacation this year quite easy. So we booked a hotel and the 12 hour flights, with a layover in Zurich, because we missed it/because it was the cheapest flight.</p> <p><img alt="Thank you for flying Swiss" src="" width="100%" style="max-width:600px"/></p> <p>First impression after we got there: it's hot. Like really, really, really hot. And humid. Buckets of water drip from you within a minute of stepping outside. Second impression: at 11 PM it's not much better. The temperature drops from 32 degrees to 28 degrees, but the humidity doesn't really change.
Because of this heat, we actually didn't take too many pictures, because who wants to stop to take pictures when you are running from one place with AC to another?</p> <p><img alt="Tall buildings" src="" width="100%" style="max-width:600px"/></p> <p>A really cool thing about Hong Kong is that while it's a Chinese city (technically a Special Administrative Region, with some degree of independence), it's very easy to get by just by knowing English. Public transport has absolutely every sign in Latin letters as well, and many people speak English (except the occasional taxi driver or waiter). This way you can eat all the really yummy Chinese food without communication problems, although I have to admit, a couple of times I didn't know what I ate. But it was still really good (though my wife disagrees).</p> <figure> <p><img alt="AC's everywhere" src="" width="100%" style="max-width:600px"/> <figcaption>AC's, AC's everywhere</figcaption></p> </figure> <p>On the first night, we went to Jamie's Place, so that we would have a gentler start to our stay in Hong Kong (and also, my wife is a big fan of his). We walked around a bit in the Causeway Bay area and I noticed that the buildings are covered in a ridiculous number of AC units. </p> <p>The next day we met with my friend Lok Ka and his fiancee. They showed us the "local" food experience by taking us to a restaurant in a former factory in Kowloon. </p> <figure> <p><img alt="Queue for the Peak Tram" src="" width="100%" style="max-width:600px"/> <figcaption>The queue for the peak tram</figcaption></p> </figure> <p>In the evening we went up Victoria Peak with the Peak Tram. While the queues looked incredibly long, they moved quite fast. There, at the top of Hong Kong island, you could feel a bit of a breeze, making it a bit more comfortable.
</p> <div class="gallery clearfix" style="max-width:600px"> <img title="View from the top of our hotel " src="" style="display:block" width="100%" style="max-width:600px"/> <img title="View from Sky 100 " src="" style="display:block" width="100%" style="max-width:600px"/> <script type="application/json">[{"src": "/static/images/2018/10/hk/night/IMG_20180730_222800.jpg", "title": "View from the top of our hotel\n", "msrc": "/static/images/2018/10/hk/night/thumbs/IMG_20180730_222800.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/night/IMG_20180803_205150.jpg", "title": "View from Sky 100\n", "msrc": "/static/images/2018/10/hk/night/thumbs/IMG_20180803_205150.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/night/P1170267.jpg", "title": "View from Victoria Peak\n", "msrc": "/static/images/2018/10/hk/night/thumbs/P1170267.jpg", "w": 4272, "h": 2856}]</script> </div> <p>Wednesday was beach day. We went south, to Stanley Beach and to the nearby shopping complex, Stanley Market. Conclusion: the Pacific is really pacific (calm). We went there by bus. It was a terrifying experience. The drivers there are crazy. They go on winding roads, cutting corners and going really fast. And in the city, they regularly go between lanes for extended periods of time. We preferred the tram, it was much more chill. 
And the top level had open windows, so you could feel the wind blowing in your face.</p> <div class="gallery clearfix" style="max-width:600px"> <img title="No idea " src="" style="display:block" width="100%" style="max-width:600px"/> <img title="No idea " src="" style="display:block" width="100%" style="max-width:600px"/> <script type="application/json">[{"src": "/static/images/2018/10/hk/food/20180802_151232.jpg", "title": "No idea\n", "msrc": "/static/images/2018/10/hk/food/thumbs/20180802_151232.jpg", "w": 4128, "h": 3096}, {"src": "/static/images/2018/10/hk/food/IMG_20180731_132706.jpg", "title": "No idea\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180731_132706.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/food/IMG_20180731_194218.jpg", "title": "Dim sum with no idea what\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180731_194218.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/food/IMG_20180801_112857.jpg", "title": "Delicious salmon croissant\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180801_112857.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/food/IMG_20180801_114904.jpg", "title": "Amazing waffles\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180801_114904.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/food/IMG_20180802_151925.jpg", "title": "Many kinds of No idea\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180802_151925.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/food/IMG_20180802_152200.jpg", "title": "Really cute and delicious pork figures\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180802_152200.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/food/IMG_20180803_130443.jpg", "title": "Pork knuckles and fried octopus\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180803_130443.jpg", "w": 3036, "h": 4048}, {"src": 
"/static/images/2018/10/hk/food/IMG_20180803_214332.jpg", "title": "Chicken popcorn\n", "msrc": "/static/images/2018/10/hk/food/thumbs/IMG_20180803_214332.jpg", "w": 3036, "h": 4048}]</script> </div> <p>After that, we took Lok Ka's recommendation and went to Yum Cha, a fusion dim sum place, where they make dim sum shaped like little piggies. And they had some delicious soup. And some more stuff, though I'm not sure what it was. But it was really good. </p> <p><img alt="Temple Street Night Market" src="" width="100%" style="max-width:600px"/></p> <p>The rest of the day was spent doing touristy things, such as going on the Ferris Wheel and taking the Hong Kong ferry to Kowloon, walking around some parks there and, most importantly, shopping in the Temple Street Night Market, where we bought some overpriced but fancy-looking chopsticks (and some other stuff). </p> <figure> <p><img alt="Temple Street Night Market" src="" width="100%" style="max-width:600px"/> <figcaption>Temple Street Night Market</figcaption></p> </figure> <p>On Friday we met up with several friends, whom I had also met in New York. We had more delicious Chinese food (except the pork knuckles). Turns out fried octopus is a really good snack. We also went to the History Museum, where there was a special exhibition on the Assyrian, Babylonian and Achaemenid (Medo-Persian) empires, focusing on luxury in that time. It turns out that cheap knock-offs existed even back then. The most expensive pots and jugs were made out of various kinds of metals, which sometimes needed rivets to hold different plates together. The clay jugs were much cheaper and could be made out of one piece, without needing any rivets, but people still added something that looked like rivets to them, just so they would look more expensive.
</p> <div class="gallery clearfix" style="max-width:600px"> <img title="Goldfish " src="" style="display:block" width="100%" style="max-width:600px"/> <img title="One of those squiggly lines is &quot;King Hezekiah&quot; " src="" style="display:block" width="100%" style="max-width:600px"/> <script type="application/json">[{"src": "/static/images/2018/10/hk/museum/IMG_20180803_144421.jpg", "title": "Goldfish\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_144421.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_145804.jpg", "title": "One of those squiggly lines is \"King Hezekiah\"\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_145804.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_145806.jpg", "title": "\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_145806.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_152644.jpg", "title": "Metal pot and imitation\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_152644.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_152641.jpg", "title": "\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_152641.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_151359.jpg", "title": "First coins in the world\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_151359.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_153556.jpg", "title": "Taking Raffaelos to the King\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_153556.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_154136.jpg", "title": "", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_154136.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/museum/IMG_20180803_154800.jpg", "title": "Impressive detail on a really
old coin\n", "msrc": "/static/images/2018/10/hk/museum/thumbs/IMG_20180803_154800.jpg", "w": 3036, "h": 4048}]</script> </div> <p>To finish the night, we went up Sky 100, which is the tallest building in Hong Kong, and we enjoyed the vistas from there. Unfortunately, it's an enclosed space at the top, so you can't take good pictures :(</p> <div class="gallery clearfix" style="max-width:600px"> <img title="The beautiful bride " src="" style="display:block" width="100%" style="max-width:600px"/> <img title="Cutting the cake " src="" style="display:block" width="100%" style="max-width:600px"/> <script type="application/json">[{"src": "/static/images/2018/10/hk/wedding/IMG_20180805_192034.jpg", "title": "The beautiful bride\n", "msrc": "/static/images/2018/10/hk/wedding/thumbs/IMG_20180805_192034.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/wedding/IMG_20180804_193221.jpg", "title": "Cutting the cake\n", "msrc": "/static/images/2018/10/hk/wedding/thumbs/IMG_20180804_193221.jpg", "w": 3036, "h": 4048}, {"src": "/static/images/2018/10/hk/wedding/IMG_20180804_142730.jpg", "title": "The church service\n", "msrc": "/static/images/2018/10/hk/wedding/thumbs/IMG_20180804_142730.jpg", "w": 4048, "h": 3036}, {"src": "/static/images/2018/10/hk/wedding/IMG_20180805_191444.jpg", "title": "The menu.\n", "msrc": "/static/images/2018/10/hk/wedding/thumbs/IMG_20180805_191444.jpg", "w": 3036, "h": 4048}]</script> </div> <p>Saturday was the wedding ceremony. More good food basically. (Notice a pattern?) But in between the church service and dinner there were several hours, during which my wife wanted to eat something that is more familiar... like KFC. Hong Kong seems to be full of KFCs, so we went into one and we were horribly disappointed. No french fries! No Glenn Garlic Sauce!! No crispy strips!!! What is this???</p> <p>On Sunday, I woke up with a horribly sore throat.
This put a bit of a damper on the last two days of our time there, but at least I started enjoying the fact that it was really nice and warm outside. </p> <p>On Monday we went shopping. Among other places to... IKEA. Because it's easier to go there to IKEA than to visit the only one in Romania, which is in Bucharest. And then we began the long flight back home. </p> <div class="gallery clearfix" style="max-width:600px"> <img title="Sky 100 dominating the landscape " src="" style="display:block" width="100%" style="max-width:600px"/> <img title="A green oasis " src="" style="display:block" width="100%" style="max-width:600px"/> <script type="application/json">[{"src": "/static/images/2018/10/hk/day/IMG_20180802_174658.jpg", "title": "Sky 100 dominating the landscape\n", "msrc": "/static/images/2018/10/hk/day/thumbs/IMG_20180802_174658.jpg", "w": 3839, "h": 2552}, {"src": "/static/images/2018/10/hk/day/P1170273.jpg", "title": "A green oasis\n", "msrc": "/static/images/2018/10/hk/day/thumbs/P1170273.jpg", "w": 4272, "h": 2856}, {"src": "/static/images/2018/10/hk/day/P1170276.jpg", "title": "Skyscrapers are a buildings mirror\n", "msrc": "/static/images/2018/10/hk/day/thumbs/P1170276.jpg", "w": 2856, "h": 4272}, {"src": "/static/images/2018/10/hk/day/P1170278.jpg", "title": "Stillness\n", "msrc": "/static/images/2018/10/hk/day/thumbs/P1170278.jpg", "w": 4272, "h": 2856}, {"src": "/static/images/2018/10/hk/day/P1170279.jpg", "title": "Sticking point\n", "msrc": "/static/images/2018/10/hk/day/thumbs/P1170279.jpg", "w": 2856, "h": 4272}, {"src": "/static/images/2018/10/hk/day/P1170306.jpg", "title": "Ye olde ferry\n", "msrc": "/static/images/2018/10/hk/day/thumbs/P1170306.jpg", "w": 2856, "h": 4272}, {"src": "/static/images/2018/10/hk/day/P1170337.jpg", "title": "\n", "msrc": "/static/images/2018/10/hk/day/thumbs/P1170337.jpg", "w": 2856, "h": 4272}]</script> </div> <p>My conclusion after my first trip to East Asia: really good food, but I wouldn't go again during the hot 
season. So my other Chinese friends, please plan your weddings during more normal weather, if you want us to attend as well!</p> <figure> <p><img alt="Flamingos" src="" width="100%" style="max-width:600px"/> <figcaption>Flamingos, my wife's favorite birds, in Kowloon Park</figcaption></p> </figure> <p>P.S. I noticed that while Google Maps knows about the metro stations and the position of the exits, it doesn't really know how to use the exits and always guides you to the same one. Which happens to be the one that was the furthest away from our hotel.</p> Sun, 14 Oct 2018 20:17:00 GMT,2018-10-14:/2018/10/14/sweating-in-hong-kong Repairing Linux and Windows boot partitions <p>My Windows installation was on a 120 GB SSD and I decided it was time to upgrade that to 256 GB. Because I didn't want to reinstall Windows, I copied it with Clonezilla. Everything seemed to work fine, until I formatted the old partition. Then the Windows partition stopped booting. I could still access the files from Arch Linux. I tried to run some commands from the Windows Recovery Console, but I held it wrong and the result was that my Arch Linux stopped booting too! Yaaay. </p> <p>After lots of digging through various tutorials, I managed to put together the following to fix it. First, download a live Linux image to boot from a USB stick. I used the Antergos one, but it shouldn't matter much. </p> <p>The extra fun for me was that, for some reason, my Arch Linux hard disk uses LVM (I have no idea why I did that 3 years ago). So first, I had to load the LVM partitions:</p> <div class="highlight"><pre>vgchange -a y </pre></div> <p>Then I had to see the partitions with <code>fdisk</code>:</p> <div class="highlight"><pre>fdisk -l </pre></div> <p>Then I had to mount the needed partitions to the correct folders. In my case, for some reason, which I also don't remember, my /home folder was separate from the others, so I needed to mount it separately. The mount point doesn't have to be /mnt; it can be somewhere else, and then you just chroot into that.
</p> <div class="highlight"><pre>sudo su
mount /dev/mapper/AntergosRoot /mnt
mount /dev/sda1 /mnt/boot
mount /dev/mapper/AntergosHome /mnt/home
arch-chroot /mnt
</pre></div> <p>The last step was to regenerate the grub config and install it:</p> <div class="highlight"><pre>grub-mkconfig -o /boot/grub/grub.cfg
grub-install /dev/sda
</pre></div> <p>And after this my Linux started working again (for some definition of working, because I think the Nvidia drivers are still screwing around). Time to fix my precious Windows 10 Insider Preview installation.</p> <p>Download an install image of Windows, put it on a USB with Rufus and then get to the recovery console. You have to start diskpart and then run some commands inside it:</p> <div class="highlight"><pre>diskpart
&gt;sel disk 0
&gt;list vol
&gt;sel vol 2
&gt;assign letter=x:
&gt;exit
</pre></div> <p>In diskpart, you have to look at your volumes and figure out which one is the Windows boot partition (it's a small FAT32 partition, around 200 MB, and I think it's marked as bootable). You then select that volume (number 2 in my case), assign it a drive letter and exit diskpart. You need the drive letter so that you can access the partition later.</p> <p>You change to that drive and run the following command:</p> <div class="highlight"><pre><span class="n">X</span><span class="o">:</span>
<span class="n">bcdboot</span> <span class="n">C</span><span class="o">:\</span><span class="n">windows</span> <span class="sr">/s X: /</span><span class="n">f</span> <span class="n">UEFI</span> </pre></div> <p>Warning: the Windows partition might not be C! In my case it was E, but only in diskpart. The naming of partitions seems to be different here. </p> <p>And after this, my Windows booted up too! No data loss so far!
Yaaaaay!</p> <p>Sources: </p> <ul> <li><a href="">Antergos Docs</a></li> <li><a href="">Answers Microsoft</a></li> <li><a href="">Neosmart</a></li> </ul> Mon, 24 Sep 2018 21:01:00 GMT,2018-09-24:/2018/09/24/repairing-linux-and-windows-boot-partitions Evaluating 2018 goals <p>Another half a year has passed since my <a href="">latest goals</a>, and this seems to be one of my worst performances so far. Turns out transitioning to a full-time remote job is quite hard and time consuming. Also, not having a gym right at work makes it much harder to go to one :/</p> <dl> <dt>Weight watching</dt> <dd>Ahahaha. :( I am "glad" to report that I am the same weight as I was when I posted this. Score: 0. At least I didn't gain weight (though I definitely got fatter :( )</dd> <dt>Exercise three times a week</dt> <dd>I averaged about 1.5 times a week. Score: 0.5. </dd> <dt>Juggle for 5 minutes every day</dt> <dd>I did 5 minutes almost every day. I am very comfortable with 3 balls. But with 4 balls I can do at most 10 seconds. Score: 0.7</dd> <dt>Read 20 books this year</dt> <dd>I am at 10 books by now, which is a bit behind. Also, I couldn't finish 2 of those books, which is quite rare for me, because I'm usually very picky about the books that I start. Score: 0.65</dd> <dt>Write three blog posts per month</dt> <dd>I did about 1.5 per month. Score: 0.5</dd> <dt>Study German</dt> <dd>That went out the window fast. Turns out not having a good use case for this in the near future meant that I lost motivation after 1 month. Score: 0.2</dd> <dt>Two open source contributions per month</dt> <dd>I did one. In total. And I spent more time setting up my environment to be able to contribute than actually fixing a bug. Score: 0.1</dd> <dt>Read 3 chapters a day from the Bible</dt> <dd>Mostly did it. Score: 0.9</dd> <dt>Memorize 1 Bible verse every week</dt> <dd>So, I found a group to try to memorize verses together. But I still mostly failed. Score: 0.2</dd> </dl> <p>Average score: 0.41.
That's a big drop from last year. :( I will take a break from most of my goals for a while and come back once things are a bit more settled.</p> Sun, 09 Sep 2018 23:13:00 GMT,2018-09-10:/2018/09/10/evaluating-2018-goals Setting up SSH keys <p>Lately my computer network setup has grown more and more complicated. I have my server in the cloud, a NAS, a desktop work machine, a laptop and soon some Raspberry Pis too. In order to be able to easily connect from one to another I need to set up SSH keys between them. The default key-generation arguments are not the strongest available and many sites on the Internet don't follow the best practices, so I am writing them down here so I can find them more easily.</p> <p>We first need to generate a key on the source machine. There are several options available for algorithm choice: RSA is older, but still secure with a large enough key, while Ed25519 is newer, so it might not work if you connect to older software:</p> <div class="highlight"><pre>ssh-keygen -t rsa -b 4096 -o -a 100
ssh-keygen -t ed25519 -a 100
</pre></div> <p>Then you need to copy the public part of the key to the destination server. Luckily, there is a tool that does just this:</p> <div class="highlight"><pre>ssh-copy-id user@destination </pre></div> <p>For now, that's it. If I ever get really bored, I might set up my own SSH Certificate Authority. However, my todo list is long enough that I don't foresee getting bored anytime soon.</p> Sun, 12 Aug 2018 19:59:00 GMT,2018-08-12:/2018/08/12/setting-up-ssh-keys The Expanse <p><img alt="The Expanse poster" src="" width="100%" style="max-width:600px"/></p> <p>I'm a huge sci-fi fan. I love sci-fi TV shows, but in the last couple of years, most of them have been... lacking, to put it gently. But finally, Syfy made a great new show: The Expanse, based on the series of novels by James S. A. Corey.</p> <p>The Expanse is a space opera, set two hundred years into the future.
Humanity has colonized most of the solar system, but they haven't managed to get any further. Most asteroids have colonies on them, either for mining (Ceres) or for growing food (Ganymede). Mars has broken free of Earth's control and the two are in constant political struggle with each other, and occasionally a military one too, over who can do what in the Solar System. Of course, the Belters are the ones who always get the short end of the deal whenever something goes wrong.</p> <p>The story follows James Holden and his crew, who somehow manage to get into the middle of trouble every single time, partly because James wants to save everyone. Initially they work on an ice hauler, but after some explosions gone awry, they end up taking control of a Martian corvette and fighting lots of battles from there. </p> <p>Everything changes in the solar system when an alien substance, called the protomolecule, is discovered on Phoebe. It transforms whatever it touches into zombie-like creatures with a hive mind. Everyone tries to control it to make weapons. There's a lot of back and forth, with people betraying each other and gaining the upper hand then losing it. I don't want to get into the story too much.</p> <p><img alt="Rocinante Crew" src="" width="100%" style="max-width:600px"/></p> <p>What I really like is the "realism" of the science. The show tries its best to show that sound doesn't propagate in vacuum. Whenever ships accelerate, you have extra Gs and humans can't survive more than a couple of Gs. So there is a serum that helps against it. On ships people always use magnetic boots so that they stick to the floor. The quickest way to get from one point to the other usually means burning extra fuel, so ships travel by going from one orbit to another and use gravity assists. The biggest handwavey science is about the Epstein drive, which enables much greater fuel efficiency, so ships can accelerate for longer and travel further. 
</p> <p>While I wasn't a big fan of the protomolecule (I don't really like zombies), I have to admit that the final twist that comes at the end of season 3 is really good. It turns out that the protomolecule was not sent as a weapon to destroy the Solar System, but it has another purpose: to gather materials to build a massive ring in space. This ring, after it's activated, acts as a wormhole, through which spaceships can travel to distant solar systems. It changes the dynamic of the politics between Earth, Mars and the Belters completely, because it offers a lot of new space to explore. </p> <p><img alt="Martian marines" src="" width="100%" style="max-width:600px"/></p> <p>One of the interesting things is how the Belters speak. While Martians have kept a similar language to Earth, even though they are more advanced technologically, Belters speak with a very different accent, similar to Creole. Things that stood out for me were "sasa ke" for "ok?" and "bossmang" for "boss" :))) </p> <p>The actors also do a very good job, especially Shohreh Aghdashloo. I had never heard of this Iranian actress before, but she beautifully plays a ruthless politician who will do anything it takes to protect Earth's interests. </p> <p>For some reason, Syfy cancelled the show, but Amazon picked it up and renewed it for a 4th season. I can't wait for it to start and I highly recommend this show to all sci-fi fans!</p> Thu, 12 Jul 2018 21:20:00 GMT,2018-07-12:/2018/07/12/the-expanse Fiddler on the Roof <p><img alt="Fiddler on the Roof" src="" width="100%" style="max-width:600px"/></p> <p>The last time I went to the theater was sometime in elementary school. I think it might have been The Lion King, but I'm not sure. I definitely don't remember anything that happened on stage. I vaguely remember the chairs and some of my classmates. </p> <p>But about a month ago, my wife told me that the theater from Oradea would be staging The Fiddler on the Roof and that they still had tickets. 
She was going to have an exam that day, so what better way to relax than to go to the theater? </p> <p>I have seen the movie twice and I really liked it. It's possibly the only musical I have liked so far (this and the Scrubs musical episode). I mean, who, except for billionaires, doesn't like the song "If I were a rich man"? </p> <p>The theater group from Oradea has been performing this play for at least 5 years and they have even done tours. They are pretty well known and have a good reputation. And, at least based on my completely theater-naive appreciation, they deserve it. The cast did a really good job, especially Tevye, who has to combine several subtle personality aspects, from the strict father to the village joker, while battling a quickly changing world.</p> <p>Despite having seen the movie, I never "got it". I never understood the title of the movie. I knew that there was an actual fiddler on the roof at times in the movie, but I never understood why and what he represented. But within 5 minutes of the start of this play, the whole thing with the "Tradition" hit me and it clicked in my brain. I realized how traditions are a survival mechanism for societies, at the potential expense of the individual (happiness). It might be better to have arranged marriages, in the sense of perpetuating a society, but it will lead to many unhappy marriages. However, if people are too unhappy, they might not play along, so the whole thing unravels. So traditions are a way of balancing the long term survival of the population and the short term benefits of the individual.</p> <p>Another thing that struck me was Tevye's religious experience. He misquotes Scripture so much that it becomes the butt of many jokes. He clearly doesn't know the Scriptures very well, but he wishes to have more time to study them (as he sings he would do if he were rich). But all this doesn't stop Tevye from having a personal relationship with God. 
He seeks God "daily" and talks to Him, tells Him his hardships and he has a genuine expectation that God would answer him - and indeed, sometimes he does seem to be struck by divine inspiration. </p> <p>The relationship between the Russians and the Jews is very interesting. The first time they interact is at the local tavern, where Tevye is celebrating the fact that Lazar Wolf will marry his daughter. The Russians join in on the merrymaking and they all end up dancing together. Tevye even seems to have a good relationship with the police officer. But slowly, things degrade between the two. A pogrom, called "a small manifestation", comes to town and the same Russians who danced with Tevye now ruin his daughter's wedding. And of course, at the end of the play, when the Jews are being kicked out of the village, nobody has any qualms about taking over their homes. </p> <p>Seeing this play confirmed what I have started to realize lately, that narratives are a very powerful medium for transmitting messages, much more so than a bland, "scientific" text. You could read about the pogroms, but it won't have as big an impact on you as watching a play about them would have. </p> <p>Once again, I'm super impressed by the Oradea Theater actors and I can't wait to go see more shows with my wife, both here and in other places.</p> Sun, 24 Jun 2018 23:14:00 GMT,2018-06-25:/2018/06/25/fiddler-on-the-roof Regular Expressions for Objects <p>For work I recently needed to do something that is very similar to regexes, but with a twist: it should operate on lists of objects, not only on strings. Luckily, Python came to the rescue with <a href="">REfO</a>, a library for doing just this.</p> <p>My use case was selecting phrases from Part-of-Speech (POS) annotated text. 
The text was lemmatized and tagged using <a href="">SpaCy</a> and it resulted in lists of the following form:</p> <div class="highlight"><pre><span class="n">s</span> <span class="o">=</span> <span class="p">[[</span><span class="s">&#39;i&#39;</span><span class="p">,</span> <span class="s">&#39;PRON&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;look&#39;</span><span class="p">,</span> <span class="s">&#39;VERB&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;around&#39;</span><span class="p">,</span> <span class="s">&#39;ADP&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;me&#39;</span><span class="p">,</span> <span class="s">&#39;PRON&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;and&#39;</span><span class="p">,</span> <span class="s">&#39;CCONJ&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;see&#39;</span><span class="p">,</span> <span class="s">&#39;VERB&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;that&#39;</span><span class="p">,</span> <span class="s">&#39;ADP&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;everyone&#39;</span><span class="p">,</span> <span class="s">&#39;NOUN&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;be&#39;</span><span class="p">,</span> <span class="s">&#39;VERB&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;run&#39;</span><span class="p">,</span> <span class="s">&#39;VERB&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;around&#39;</span><span class="p">,</span> <span class="s">&#39;ADV&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;in&#39;</span><span class="p">,</span> <span class="s">&#39;ADP&#39;</span><span class="p">],</span> <span 
class="p">[</span><span class="s">&#39;a&#39;</span><span class="p">,</span> <span class="s">&#39;DET&#39;</span><span class="p">],</span> <span class="p">[</span><span class="s">&#39;hurry&#39;</span><span class="p">,</span> <span class="s">&#39;NOUN&#39;</span><span class="p">]]</span> </pre></div> <p>From these sentences we want to extract human action phrases and noun phrases, which are defined as follows, using regex-like notation:</p> <div class="highlight"><pre>human_action = (&quot;he&quot;|&quot;she&quot;|&quot;i&quot;|&quot;they&quot;|&quot;we&quot;) ([VERB] [ADP])+
noun_phrase = [DET]? ([ADJ] [NOUN])+
</pre></div> <p>Translated to English this means that human actions are defined as 1st and 3rd person, singular and plural pronouns followed by repeated groups of verbs and adpositions (in, to, during). Noun phrases are composed of an optional determiner (a, an, the) followed by repeated groups of adjectives and nouns.</p> <p>Most standard regex libraries won't help you with this, because they work only on strings. But this problem is still perfectly well described by regular grammars, so after a bit of Googling I found REfO and it's super simple to use, although you have to read the source code, because it doesn't really have documentation.</p> <p>REfO is a bit more verbose than normal regular expressions, but at least it tries to stay close to usual regex notions. Zero-or-more repetition (*) is done using the <code>refo.Star</code> operator, while one-or-more repetition (+) is <code>refo.Plus</code>. The only new operator is <code>refo.Predicate</code>, which takes a one-argument function and matches if that function returns true when called with the element at that position. 
Using this we will build the functions we need:</p> <div class="highlight"><pre><span class="k">def</span> <span class="nf">pos</span><span class="p">(</span><span class="n">pos</span><span class="p">):</span> <span class="k">return</span> <span class="n">refo</span><span class="o">.</span><span class="n">Predicate</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="n">pos</span><span class="p">)</span> <span class="k">def</span> <span class="nf">humanpron</span><span class="p">():</span> <span class="k">return</span> <span class="n">refo</span><span class="o">.</span><span class="n">Predicate</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="s">&#39;PRON&#39;</span> <span class="ow">and</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="ow">in</span> <span class="p">{</span><span class="s">&#39;i&#39;</span><span class="p">,</span> <span class="s">&#39;he&#39;</span><span class="p">,</span> <span class="s">&#39;she&#39;</span><span class="p">,</span> <span class="s">&#39;we&#39;</span><span class="p">,</span> <span class="s">&#39;they&#39;</span><span class="p">})</span> </pre></div> <p>For matching POS, we use a helper to create a function that will match the given tag. 
For matching human pronouns, we also check the words, not just the POS tag.</p> <div class="highlight"><pre><span class="n">np</span> <span class="o">=</span> <span class="n">refo</span><span class="o">.</span><span class="n">Question</span><span class="p">(</span><span class="n">pos</span><span class="p">(</span><span class="s">&#39;DET&#39;</span><span class="p">))</span> <span class="o">+</span> <span class="n">refo</span><span class="o">.</span><span class="n">Plus</span><span class="p">(</span><span class="n">refo</span><span class="o">.</span><span class="n">Question</span><span class="p">(</span><span class="n">pos</span><span class="p">(</span><span class="s">&#39;ADJ&#39;</span><span class="p">))</span> <span class="o">+</span> <span class="n">pos</span><span class="p">(</span><span class="s">&#39;NOUN&#39;</span><span class="p">))</span> <span class="n">humanAction</span> <span class="o">=</span> <span class="n">humanpron</span><span class="p">()</span> <span class="o">+</span> <span class="n">refo</span><span class="o">.</span><span class="n">Plus</span><span class="p">(</span><span class="n">pos</span><span class="p">(</span><span class="s">&#39;VERB&#39;</span><span class="p">)</span> <span class="o">+</span> <span class="n">pos</span><span class="p">(</span><span class="s">&#39;ADP&#39;</span><span class="p">))</span> </pre></div> <p>Then we just compose our functions and concatenate them and we got what we wanted. Using them is simple. You either call <code></code>, which finds the first match or <code>refo.finditer</code> which returns an iterable over all matches. 
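</p> <p>To make it clearer what the <code>humanAction</code> pattern accepts, here is a minimal sketch that does the same matching by hand, with no dependency on REfO. The names <code>HUMAN_PRONOUNS</code> and <code>find_human_actions</code> are made up for this illustration, and this is not how REfO works internally; it only hard-codes one pattern, a human pronoun followed by one or more VERB ADP pairs:</p>

```python
# Hypothetical, dependency-free sketch; this is NOT REfO's implementation.
HUMAN_PRONOUNS = {"i", "he", "she", "we", "they"}


def find_human_actions(tokens):
    """Return (start, end) spans matching: human pronoun, then one or more VERB ADP pairs."""
    spans = []
    i = 0
    while len(tokens) - i >= 3:  # a match needs a pronoun plus at least one pair
        word, tag = tokens[i]
        j = i + 1
        if tag == "PRON" and word in HUMAN_PRONOUNS:
            # Greedily consume consecutive (VERB, ADP) pairs.
            while [t[1] for t in tokens[j:j + 2]] == ["VERB", "ADP"]:
                j += 2
        if j > i + 1:  # at least one pair was consumed, so record the span
            spans.append((i, j))
            i = j
        else:
            i += 1
    return spans


sentence = [["i", "PRON"], ["look", "VERB"], ["around", "ADP"],
            ["me", "PRON"], ["and", "CCONJ"], ["see", "VERB"]]
spans = find_human_actions(sentence)
print([sentence[start:end] for start, end in spans])
```

<p>A hand-written loop like this has to be rewritten for every new pattern, which is exactly the work that composing predicates and operators saves. 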
</p> <div class="highlight"><pre><span class="k">for</span> <span class="n">match</span> <span class="ow">in</span> <span class="n">refo</span><span class="o">.</span><span class="n">finditer</span><span class="p">(</span><span class="n">humanAction</span><span class="p">,</span> <span class="n">s</span><span class="p">):</span>
    <span class="n">start</span> <span class="o">=</span> <span class="n">match</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
    <span class="n">end</span> <span class="o">=</span> <span class="n">match</span><span class="o">.</span><span class="n">end</span><span class="p">()</span>
    <span class="k">print</span><span class="p">(</span><span class="n">s</span><span class="p">[</span><span class="n">start</span><span class="p">:</span><span class="n">end</span><span class="p">])</span>
</pre></div> <div class="highlight"><pre>[[u&#39;i&#39;, u&#39;PRON&#39;], [u&#39;look&#39;, u&#39;VERB&#39;], [u&#39;around&#39;, u&#39;ADP&#39;]] </pre></div> <p>So, it's always good to Google around for a solution, because my first instinct to whip up a parser in Parsec would have led to a much more complicated solution. This is nice, elegant, short and efficient.</p> Fri, 08 Jun 2018 20:43:00 GMT,2018-06-08:/2018/06/08/regular-expressions-for-objects Real estate cohort analysis <p>In the last post we looked at how to get the data; now we are going to start analyzing it. The first question we are interested in is how quickly houses sell. We don't have access to actual contracts, so we will use a proxy to measure this: how long an advertisement for a house stays displayed. We are going to estimate that this is roughly the time it takes to sell a house. 
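</p> <p>As a rough sketch of this proxy, using only the standard library: given one record per ad per scraping day, the display time of an ad is the span between its first and last sighting. The ad IDs, the dates and the <code>days_displayed</code> helper below are invented for illustration and are not part of the real dataset:</p>

```python
from datetime import date

# Hypothetical scrape log: (ad ID, day on which the ad was seen).
observations = [
    ("ad-1", date(2018, 2, 7)),
    ("ad-1", date(2018, 2, 8)),
    ("ad-1", date(2018, 2, 9)),
    ("ad-2", date(2018, 2, 8)),
]


def days_displayed(observations):
    """Map each ad ID to the number of days from first to last sighting, inclusive."""
    first_seen, last_seen = {}, {}
    for ad_id, day in observations:
        first_seen[ad_id] = min(day, first_seen.get(ad_id, day))
        last_seen[ad_id] = max(day, last_seen.get(ad_id, day))
    return {ad_id: (last_seen[ad_id] - first_seen[ad_id]).days + 1
            for ad_id in first_seen}


durations = days_displayed(observations)
print(durations)
```

<p>Note that an ad still online on the last scraping day has only a lower bound on its display time, which is one reason to look at the share of ads still shown over time rather than at a single average. 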
</p> <p>We will do a cohort analysis, where each cohort will be composed of ads that were shown for the first time on that day and we will track what percentage of those ads is still shown as days pass.</p> <p>To do this, we will need to gather data for some time, running our scraper daily. I have been doing this for almost three months now. I forgot to run it on some days (and I have had an item on my todo list to automate this for almost three months now), so I have about 75 data points.</p> <p>Let's start by reading in the data. We are going to use Pandas for this, because it has support for the JSON lines format. </p> <div class="highlight"><pre><span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span> <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span> <span class="kn">import</span> <span class="nn">glob</span> <span class="kn">import</span> <span class="nn">seaborn</span> <span class="kn">as</span> <span class="nn">sns</span> <span class="n">sns</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="n">style</span><span class="o">=</span><span class="s">&#39;white&#39;</span><span class="p">)</span> <span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">concat</span><span class="p">([</span><span class="n">pd</span><span class="o">.</span><span class="n">read_json</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">lines</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">glob</span><span class="o">.</span><span class="n">glob</span><span class="p">(</span><span class="s">&quot;../data/houses*.jl&quot;</span><span class="p">)],</span> <span class="n">ignore_index</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">data</span> <span 
class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">drop</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s">&quot;adaugat_la&quot;</span><span class="p">,</span> <span class="s">&quot;Compartimentare&quot;</span><span class="p">,</span> <span class="s">&quot;text&quot;</span><span class="p">])</span> <span class="n">data</span><span class="p">[</span><span class="s">&quot;nr_anunt&quot;</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s">&quot;nr_anunt&quot;</span><span class="p">]</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">if</span> <span class="nb">type</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">==</span> <span class="nb">list</span> <span class="k">else</span> <span class="n">x</span><span class="p">)</span> </pre></div> <p>First we do some imports and set some graph plotting settings. 
Then we read all the data into a big Pandas dataframe, drop some of the columns we don't need for our analysis right now and fix some issues with the <code>nr_anunt</code> column (sometimes it's a list).</p> <div class="highlight"><pre><span class="n">data</span><span class="o">.</span><span class="n">set_index</span><span class="p">(</span><span class="s">&#39;nr_anunt&#39;</span><span class="p">,</span> <span class="n">inplace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="n">data</span><span class="p">[</span><span class="s">&#39;CohortGroup&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">groupby</span><span class="p">(</span><span class="n">level</span><span class="o">=</span><span class="mi">0</span><span class="p">)[</span><span class="s">&#39;date&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">min</span><span class="p">()</span>
<span class="n">data</span><span class="o">.</span><span class="n">reset_index</span><span class="p">(</span><span class="n">inplace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</pre></div> <p>We set the index to the <code>nr_anunt</code> column, which is the ID of each ad, so it identifies each ad uniquely. Then we create a new column, by grouping by the index and taking the minimum date out of each group, and finally we reset the index so that <code>nr_anunt</code> is a regular column again. When we group by the index, it means that each group will contain every scraping of one ad, taken on different days. 
By taking the minimum of the date, we get the first time we saw the ad.</p> <div class="highlight"><pre><span class="n">grouped</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">groupby</span><span class="p">([</span><span class="s">&quot;CohortGroup&quot;</span><span class="p">,</span> <span class="s">&quot;date&quot;</span><span class="p">])</span> <span class="n">cohorts</span> <span class="o">=</span> <span class="n">grouped</span><span class="o">.</span><span class="n">agg</span><span class="p">({</span><span class="s">&#39;nr_anunt&#39;</span><span class="p">:</span> <span class="n">pd</span><span class="o">.</span><span class="n">Series</span><span class="o">.</span><span class="n">nunique</span><span class="p">})</span> <span class="n">cohorts</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> <table> <thead> <tr> <th>CohortGroup</th> <th>date</th> <th>count</th> </tr> </thead> <tbody> <tr> <td>2018-02-07</td> <td>2018-02-07</td> <td>2551</td> </tr> <tr> <td></td> <td>2018-02-08</td> <td>2461</td> </tr> <tr> <td></td> <td>2018-02-09</td> <td>2390</td> </tr> <tr> <td></td> <td>2018-02-10</td> <td>2345</td> </tr> <tr> <td></td> <td>2018-02-11</td> <td>2300</td> </tr> </tbody> </table> <p>Then we group the data again, this time by the CohortGroup and by the date we saw the ad. The first grouping means we put in one group all the ads that showed up the first time on a certain day, and then we group again by every subsequent day when we saw them. 
On these groups, we aggregate by the number of unique IDs we see, which will give us the size of our cohorts on each day.</p> <div class="highlight"><pre><span class="n">exp</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">cohorts</span><span class="o">.</span><span class="n">to_records</span><span class="p">())</span> <span class="n">exp</span><span class="p">[</span><span class="s">&quot;CohortPeriod&quot;</span><span class="p">]</span> <span class="o">=</span> <span class="n">exp</span><span class="p">[</span><span class="s">&quot;date&quot;</span><span class="p">]</span> <span class="o">-</span> <span class="n">exp</span><span class="p">[</span><span class="s">&quot;CohortGroup&quot;</span><span class="p">]</span> <span class="n">exp</span><span class="o">.</span><span class="n">set_index</span><span class="p">([</span><span class="s">&quot;CohortGroup&quot;</span><span class="p">,</span> <span class="s">&quot;CohortPeriod&quot;</span><span class="p">],</span> <span class="n">inplace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="n">exp</span><span class="o">.</span><span class="n">head</span><span class="p">()</span> </pre></div> <table> <thead> <tr> <th>CohortGroup CohortPeriod</th> <th>CohortPeriod</th> <th>date</th> <th>nr_anunt</th> </tr> </thead> <tbody> <tr> <td>2018-02-07</td> <td>0 days</td> <td>2018-02-07</td> <td>2551</td> </tr> <tr> <td></td> <td>1 days</td> <td>2018-02-08</td> <td>2461</td> </tr> <tr> <td></td> <td>2 days</td> <td>2018-02-09</td> <td>2390</td> </tr> <tr> <td></td> <td>3 days</td> <td>2018-02-10</td> <td>2345</td> </tr> <tr> <td></td> <td>4 days</td> <td>2018-02-11</td> <td>2300</td> </tr> </tbody> </table> <p>Until now, we have absolute dates, which we can't compare between cohorts. 
If one cohort started in March, and the other one in April, of course they will have different behaviours in April (one being right at the beginning and the other one a month in). So we want to convert to relative dates, by subtracting the date we first saw the ad from the date of each scraping. Because my Pandas-fu is not that good, I didn't find another solution to this other than expanding the index to columns again. Then I did the subtraction and reindexed everything, this time by the CohortGroup (so first day an ad showed up) and CohortPeriod (how many days have passed since then).</p> <div class="highlight"><pre><span class="n">cohort_group_size</span> <span class="o">=</span> <span class="n">exp</span><span class="p">[</span><span class="s">&quot;nr_anunt&quot;</span><span class="p">]</span><span class="o">.</span><span class="n">groupby</span><span class="p">(</span><span class="n">level</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">first</span><span class="p">()</span>
<span class="n">retention</span> <span class="o">=</span> <span class="n">exp</span><span class="p">[</span><span class="s">&quot;nr_anunt&quot;</span><span class="p">]</span><span class="o">.</span><span class="n">unstack</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">divide</span><span class="p">(</span><span class="n">cohort_group_size</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">retention</span><span class="o">.</span><span class="n">head</span><span class="p">()</span>
</pre></div> <table> <thead> <tr> <th>Cohort Group CohortPeriod</th> <th>2018-02-07</th> <th>2018-02-08</th> <th>2018-02-09</th> <th>2018-02-10</th> <th>2018-02-11</th> </tr> </thead> <tbody> <tr> <td>0 days</td> <td>1.000000</td> <td>1.000000</td> <td>1.000000</td> 
<td>1.00000</td> <td>1.000000</td> </tr> <tr> <td>1 days</td> <td>0.964720</td> <td>0.921569</td> <td>0.941176</td> <td>1.00000</td> <td>0.955882</td> </tr> <tr> <td>2 days</td> <td>0.936887</td> <td>0.901961</td> <td>0.862745</td> <td>0.93750</td> <td>0.852941</td> </tr> <tr> <td>3 days</td> <td>0.919247</td> <td>0.901961</td> <td>0.862745</td> <td>0.90625</td> <td>0.852941</td> </tr> <tr> <td>4 days</td> <td>0.901607</td> <td>0.882353</td> <td>0.862745</td> <td>0.87500</td> <td>0.794118</td> </tr> </tbody> </table> <p>We are now ready to calculate retention. We want percentages, not absolute values, so we first look at the size of each cohort on the first day. Then we pivot the dataframe, so that on the rows we get CohortPeriod and on the columns we get CohortGroups (that's what unstack does) and divide with the size of the cohorts, column-wise.</p> <p>And then we plot what we got. First, we can do a line chart:</p> <div class="highlight"><pre><span class="n">retention</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">15</span><span class="p">,</span><span class="mi">10</span><span class="p">))</span> </pre></div> <p><img alt="Line plot of retention" src="" width="100%" style="max-width:600px"/></p> <p>And then a nice heatmap:</p> <div class="highlight"><pre><span class="n">sns</span><span class="o">.</span><span class="n">heatmap</span><span class="p">(</span><span class="n">retention</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">fmt</span><span class="o">=</span><span class="s">&#39;.0%&#39;</span><span class="p">,</span> <span class="p">)</span> </pre></div> <p><img alt="Heatmap of retention" src="" width="100%" style="max-width:600px"/></p> <p>From looking at these two charts, we see that after about one month, 60% of the ads are still available, but then, there is a big cliff and it 
drops to below 30%. However, then it climbs back up, to around 40% and then stabilizes around 30%. </p> <p>My hypothesis is that after the first month, the ads are taken offline by an automated system for inactivity, and that's why you have a big drop. Some people reactivate the ads, which leads to the rebound, and then ads organically go away, as the underlying real estate is no longer available. </p> <p>In conclusion, you don't need to hurry too much when buying houses in Oradea: after an ad shows up, you easily have at least two weeks in 80% of the cases, and up to a month in 60% of the cases.</p> Tue, 29 May 2018 20:40:00 GMT,2018-05-29:/2018/05/29/real-estate-cohort-analysis Scraping for houses <p>Having moved back to Romania, I needed a place to live, ideally one to buy. So we started looking online for various places and we went to see a lot of them. Lots of work, especially footwork. But, being the data nerd that I am, I wanted to get smart about it and analyze the market. </p> <p>For that, I needed data. For data, I turned to scraping. For scraping, I turned to Scrapy. While I did write a scraper <a href="">5 years ago</a>, I didn't want to reinvent the wheel yet again, so I turned to Scrapy because it's a well-known, much used scraping framework in Python. And I was super impressed with it. I even started scraping things more often, just because it's so easy to do in Scrapy :D</p> <p>In this post I am going to show you how to use it to scrape the OLX website for housing posts in a given city, in 30 lines of Python. Later, we are going to analyze the data too. </p> <p>First, you have to generate a new Scrapy project and a Scrapy spider. 
Run the following commands in your preferred Python environment (I currently prefer pipenv).</p> <div class="highlight"><pre>pip install scrapy scrapy startproject olx_houses scrapy genspider olx </pre></div> <p>This will generate a file for you inside <code>olx_houses/spiders</code>, with some boilerplate already written, and you just have to extend it a bit.</p> <div class="highlight"><pre><span class="kn">import</span> <span class="nn">scrapy</span> <span class="kn">import</span> <span class="nn">datetime</span> <span class="n">today</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">date</span><span class="o">.</span><span class="n">today</span><span class="p">()</span><span class="o">.</span><span class="n">strftime</span><span class="p">(</span><span class="s">&#39;%Y-%m-</span><span class="si">%d</span><span class="s">&#39;</span><span class="p">)</span> </pre></div> <p>These are just imports and I am precomputing today's date, because I want each entry to contain when it was scraped.</p> <div class="highlight"><pre><span class="k">class</span> <span class="nc">OlxHousesSpider</span><span class="p">(</span><span class="n">scrapy</span><span class="o">.</span><span class="n">Spider</span><span class="p">):</span> <span class="n">name</span> <span class="o">=</span> <span class="s">&#39;olx_houses&#39;</span> <span class="n">allowed_domains</span> <span class="o">=</span> <span class="p">[</span><span class="s">&#39;;</span><span class="p">]</span> <span class="n">start_urls</span> <span class="o">=</span> <span class="p">[</span><span class="s">&#39;;</span><span class="p">,</span> <span class="s">&#39;;</span><span class="p">]</span> </pre></div> <p>Then we define our class, with the allowed domains. If we encounter a link that is not from these domains, it is not followed. We are interested only in olx stuff, so we allow only that. 
The start URLs are the initial pages, from where we should start the scraping. In our case, these are the listing pages for houses and flats.</p> <div class="highlight"><pre>    <span class="k">def</span> <span class="nf">parse</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">response</span><span class="p">):</span>
        <span class="k">for</span> <span class="n">href</span> <span class="ow">in</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;a.detailsLink::attr(href)&#39;</span><span class="p">):</span>
            <span class="k">yield</span> <span class="n">response</span><span class="o">.</span><span class="n">follow</span><span class="p">(</span><span class="n">href</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">parse_details</span><span class="p">)</span>
        <span class="k">for</span> <span class="n">href</span> <span class="ow">in</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;a.pageNextPrev::attr(href)&#39;</span><span class="p">)[</span><span class="o">-</span><span class="mi">1</span><span class="p">:]:</span>
            <span class="k">yield</span> <span class="n">response</span><span class="o">.</span><span class="n">follow</span><span class="p">(</span><span class="n">href</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">parse</span><span class="p">)</span>
</pre></div> <p><code>parse</code> is a special function, which is called by default for every URL. So the start URLs will be parsed using this. It is called with a response object, containing the HTML received from the website. This response object contains all the HTML text, but it also has a parsed DOM, which allows direct querying with CSS and XPath selectors. 
If you return or yield a Request object from this function, Scrapy will add it to the queue of pages to be visited. A convenience method for doing this is to use the <code>follow</code> method on the response object. You pass it the URL to visit and what callback method to use for parsing (by default it's the <code>parse</code> method). </p> <p>We are looking for two things on this page:</p> <p>1) Anchor links that have a <code>detailsLink</code> CSS class. These we want to parse with the <code>parse_details</code> method.</p> <p>2) Anchor links that have a <code>pageNextPrev</code> CSS class. We look only at the last one of these links (that's what the [-1:] indexing does), because that one always points forward. We could look at all links and it wouldn't cause duplicate requests, because Scrapy keeps track of which links it has already visited and doesn't visit them again. These links we will parse with the default method. </p> <p>And now comes the fun part, getting the actual data. </p> <div class="highlight"><pre>    <span class="k">def</span> <span class="nf">parse_details</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">response</span><span class="p">):</span>
        <span class="n">attrs</span> <span class="o">=</span> <span class="p">{</span>
            <span class="s">&#39;url&#39;</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">url</span><span class="p">,</span>
            <span class="s">&#39;text&#39;</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;#textContent&gt;p::text&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">extract_first</span><span class="p">()</span><span class="o">.</span><span class="n">strip</span><span class="p">(),</span>
            <span class="s">&#39;title&#39;</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;h1::text&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">extract_first</span><span class="p">()</span><span class="o">.</span><span class="n">strip</span><span class="p">(),</span>
            <span class="s">&#39;price&#39;</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;.price-label &gt; strong::text&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">extract_first</span><span class="p">()</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot; &quot;</span><span class="p">,</span> <span class="s">&quot;&quot;</span><span class="p">),</span>
            <span class="s">&#39;date&#39;</span><span class="p">:</span> <span class="n">today</span><span class="p">,</span>
            <span class="s">&#39;nr_anunt&#39;</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;.offer-titlebox em small::text&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">re</span><span class="p">(</span><span class="s">&#39;\d+&#39;</span><span class="p">),</span>
            <span class="s">&#39;adaugat_la&#39;</span><span class="p">:</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;.offer-titlebox em::text&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">re</span><span class="p">(</span><span class="s">&#39;Adaugat (de pe telefon) +La (.*),&#39;</span><span class="p">)</span>
        <span class="p">}</span>
</pre></div> <p>We extract various attributes from the listing pages. 
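</p> <p>The two regular expressions in the spider above can be tried out on their own, since Scrapy's <code>.re()</code> applies ordinary Python regular expressions to the selected text. Below is a minimal sketch using only the standard <code>re</code> module. The sample strings are made up for illustration (they are not real olx output), and the loosened pattern is an assumption of this sketch, not part of the spider: it is there to show that the original pattern only matches when the "de pe telefon" part is present.</p>

```python
import re

# Hypothetical strings shaped like the text the spider selects (not real olx output).
FROM_PHONE = "Adaugat de pe telefon La 14:05, 7 mai 2018, Numar anunt: 123456789"
FROM_DESKTOP = "Adaugat La 09:30, 2 mai 2018, Numar anunt: 42"

# The pattern from the spider: the "de pe telefon" group is mandatory.
ORIGINAL = re.compile(r"Adaugat (de pe telefon) +La (.*),")
# A loosened variant (an assumption, not the author's code): phone part optional.
LOOSENED = re.compile(r"Adaugat (?:de pe telefon)? *La (.*),")


def listing_id(text):
    """Return the runs of digits found after the last colon (the listing number)."""
    return re.findall(r"\d+", text.rsplit(":", 1)[-1])


def added_at(pattern, text):
    """Return all matches of the pattern, in the shape re.findall reports groups."""
    return pattern.findall(text)


print(listing_id(FROM_PHONE))
print(added_at(ORIGINAL, FROM_PHONE))
print(added_at(ORIGINAL, FROM_DESKTOP))
print(added_at(LOOSENED, FROM_PHONE))
print(added_at(LOOSENED, FROM_DESKTOP))
```

<p>Whether ads posted from a desktop really lack the "de pe telefon" text depends on the actual page markup, so treat the loosened variant as something to verify against real pages before relying on it.</p> <p>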
Some things are straightforward, like the URL, or the text and title, which are obtained by taking the text of some elements chosen with CSS selectors. For price the selector is a bit more complicated and we have to prepare the text a bit (by removing spaces). For the ID of the listing and the date added field, we have to apply some regular expressions to obtain only the data that we want, without anything else. </p> <div class="highlight"><pre>        <span class="k">for</span> <span class="n">tr</span> <span class="ow">in</span> <span class="n">response</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;.details&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;tr/td//tr&#39;</span><span class="p">):</span>
            <span class="n">title</span> <span class="o">=</span> <span class="n">tr</span><span class="o">.</span><span class="n">css</span><span class="p">(</span><span class="s">&#39;th::text&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">extract_first</span><span class="p">()</span>
            <span class="n">value</span> <span class="o">=</span> <span class="s">&quot; &quot;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">tr</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;td/strong//text()&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">extract</span><span class="p">()</span> <span class="k">if</span> <span class="n">x</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span><span class="o">!=</span><span class="s">&quot;&quot;</span><span class="p">)</span>
            <span class="n">attrs</span><span class="p">[</span><span class="n">title</span><span class="p">]</span><span class="o">=</span><span class="n">value</span>
        <span class="k">yield</span> <span class="n">attrs</span>
</pre></div> <p>There is one last thing: some crucial information is displayed in a "structured" way, but it's marked up in a completely unstructured way. Things like the size of the house or the age. These values are in a table, with rows containing a table header cell with the name of the attribute, followed by table data cells containing the values. We take all the values, join them with a space, and put them in the dictionary we used above, with the key being the value we got from the table header cell. We do this for all the rows in the table. </p> <p>And that's it. Easy peasy. Now all we have to do is run the scraper with the following command:</p> <div class="highlight"><pre>scrapy crawl olx_houses -o houses.csv </pre></div> <p>We wait a little bit and then in that file we have all the listings. And if we repeat this process (almost) daily for several months, we can get trends and see how long houses stay on the market on average. But that's a topic for another post.</p> Mon, 07 May 2018 20:39:00 GMT,2018-05-07:/2018/05/07/scraping-for-houses Synology and Docker <p>After more than a year of using my NAS only to collect dust and the occasional backup, I decided this month to start self hosting various web apps on it. Synology has a visual Docker interface, so I installed that and I started trying to install my first app: <a href="">Monica</a>, which is a personal relationship manager (I love to track things).</p> <p>Unfortunately, modern technology has not yet solved this problem of deploying apps seamlessly. 
:(</p> <figure> <p><img alt="Installing from the Docker registry" src="" width="100%" style="max-width:600px"/> <figcaption>Installing Monica from the Docker registry</figcaption></p> </figure> <p>Installing the Monica Docker container was fairly simple: you search for it, click download, set the environment variables in the GUI, carefully copy-pasting their names from the documentation, and it kinda starts, but spews errors in the logs. Monica provides you with an example .env file, but you can't use it in the UI :(.</p> <figure> <p><img alt="Setting the environment variables" src="" width="100%" style="max-width:600px"/> <figcaption>Setting the environment variables</figcaption></p> </figure> <p>But then you get to the first problem: the database. Monica has a dependency on a MySQL database. It provides a Docker Compose file, which could bring it up and wire them together, but the Synology Docker UI doesn't have support for Docker Compose, so I had to manually provision the database. </p> <p>Synology has a package for MariaDB. I tried to use that, but I couldn't figure out how to connect the two. Then I tried to install a MySQL Docker container, also from the UI. I struggled a lot to connect them, until I realized that the IP address that I had to pass was the one from the separate network Docker creates.</p> <p>Once I got the database connection set up and initialized, the app started, but it was only reachable through an IP address. I wanted to add HTTPS support for it and I wanted to access it from a pretty URL, so I had to add a DNS subdomain entry at my registrar. </p> <p>I thought that I would have to open up a new port on my router to do all this, but it turns out that the Docker image for Monica serves only HTTP, so actually what you have to do is set up a reverse proxy to receive HTTPS requests from the outside world and send them on plain HTTP to your app. 
Luckily, Synology has a package built in for doing just this, inside Control Panel/Application Portal.</p> <p>The last step was adding an SSL certificate from Let's Encrypt. Synology also has support for this, so it was trivial to do from Control Panel/Security/Certificate.</p> <p>I probably spent around two days figuring this out. Afterwards, the next apps were easier to install, but some still caused issues (like Wallabag). There are rumours of even better packaged solutions, but they don't have too many apps for now (also, why are torrenting/usenet apps so popular??).</p> Thu, 26 Apr 2018 18:39:00 GMT,2018-04-26:/2018/04/26/synology-and-docker Man versus nature <p><img alt="Me hiking" src="" width="100%" style="max-width:600px"/></p> <p>Some friends in Oradea are big hikers. They had plans to go to Curcubăta Mare, the highest peak within a 500 km radius, at 1849 m, but they were always foiled by the weather. Finally they decided to go last Saturday, because the temperature had risen slightly and the forecast wasn't too cloudy. They invited me to go along, so I joined the group of 5 people.</p> <p>We went by car to Vârtop, which is a ski resort. On the way there we saw some beautiful winter landscapes. I didn't know that Romania could be so amazing in winter. You definitely don't need to go 1200 kilometers to the Alps to see such things :D</p> <p>At the ski resort we took the chairlift to the top of the ski slope and from there we started off on foot on a narrow, slippery path among firs. While the snow occasionally gave way under our feet, it was pretty ok, though the climb was a bit steep (at least for a beginning hiker like me).</p> <p>Once we got to the mountain crest, the forest ended and there was only the occasional tree or bush. And the wind. We didn't expect any wind, but at some point it got up to 40 km/h. The visibility got reduced to below 100 meters. While it was only -7 degrees, it felt like -12 at least. 
</p> <figure> <p><img alt="First checkpoint" src="" width="100%" style="max-width:600px"/> <figcaption>Proof that we got to the mountain ridge</figcaption></p> </figure> <p>Our progress was made difficult by the fact that the snow would occasionally break under us, so we would end up knee deep in snow. I found it very interesting how in some places the snow was frozen solid, to ice, while two meters further it would crumble under you. Snow dunes :D Luckily, we didn't have to climb as steeply as before. </p> <p><img alt="Snow dunes" src="" width="100%" style="max-width:600px"/></p> <p>Because of the lack of speed, we ended up going only halfway to Curcubata Mare. We didn't want to hike in the dark, so we turned back from an intermediate plateau. </p> <p>I realized on the way back that my boots weren't up to snuff. While the rest of my equipment kept me nice and warm, I managed to get snow into my boots and they got really wet :( So that's on my to buy list. </p> <p>Going back was even more difficult. I had already exhausted a large part of my energy reserves. Luckily I had my trusty Thermos, which still kept the tea warm, so it made me very popular in our group :))) But at least the weather cleared up a bit, the wind stopped and the clouds sometimes gave way to views into the valleys around us. </p> <p>All in all, it was a fun challenge for me and I would like to repeat it soon!</p> <figure> <p><img alt="Turning back" src="" width="100%" style="max-width:600px"/> <figcaption>Group picture from where we turned back</figcaption></p> </figure> <p>Pictures were taken by Csabi and Silviu!</p> Sun, 01 Apr 2018 14:36:00 GMT,2018-04-01:/2018/04/01/man-versus-nature N.T. Wright in Cluj <p><img alt="NT Wright and his translator" src="" width="100%" style="max-width:600px"/></p> <p>N.T. Wright, one of the world’s leading New Testament scholars and an expert on Pauline theology, was invited by Edictum Dei to hold a conference in Cluj last week. 
I found out about it the day before, when there were already no more tickets available. But I persisted, asked around and managed to find some friends who could lend me some, so I went to the first session out of three. </p> <p>The first topic was based on his book “The Day the Revolution Began” and in it, the professor explained the meaning and the power of the cross and how many people misunderstand it today. You often hear a platonized version of eschatology, about our souls going to heaven or hell when we die, whereas Dr. Wright says that in the Bible, the end state is a new creation, with heaven and earth together and God living among his people. He brings back in this way the central message of Jesus Christ in the Gospels: the Kingdom of God as a present reality in the lives of those who believed in the Messiah, the Kingdom which will conquer the whole cosmos. Our anthropology is often moralized, with Adam and Eve’s disobedience being just a moral failure, instead of a failure to reflect God into creation. And the third mistake we make is paganizing our soteriology. We believe that the Creator God was very angry with us because we sinned and wanted to kill us all, but then somebody just happened to step in the way. That someone happened to be his only innocent Son, so everything is alright, because the angry creator exhausted his anger on him and everyone else got away free. But, according to Dr. Wright, what’s going on is that God wants to live with his people, but the only way for him to do that is if somebody cleans them from their sins. </p> <p>Another thing that I found memorable was his explanation for part of the reason why Galatians was written. He says that one of the main issues there was Jews and Gentiles eating together. Normally, the Gentiles were ritually unclean, so if the two ate together, the Jews would become unclean as well. 
However, because Jesus died for the Gentiles as well, he cleansed them, so now it was no problem for the two to eat together. And then Dr. Wright goes back to John 12:20, when some Greeks want to talk to Jesus and ask Philip for an audience with him. But Jesus doesn’t answer with a time and place to meet them, but seemingly veers off to talk about his death. The explanation for this is the same: for now, Jesus, as a Jew, could not meet with the Gentiles, but, after he dies, he will have cleansed them, so he is waiting for that. </p> <p>At the end, I went up to N.T. Wright and I got two autographs, one in his translation of the New Testament and one in his book Simply Christian. He noticed that I had a T-Shirt which said δούλος Ἰησοῦ Χριστος (servant of Jesus Christ) and then he told me that one of his grandsons, to whom the book is dedicated, will have his catechism on the 1st of April. He was very nice and friendly :) Unfortunately, I forgot to take a selfie with him :(</p> <p><img alt="NT Wright's autograph in my copy of Simply Christian" src="" width="100%" style="max-width:600px"/></p> Mon, 26 Mar 2018 21:47:00 GMT,2018-03-26:/2018/03/26/n-t-wright-in-cluj Dell XPS 13 review <p>While I worked for Google, I had a work Macbook and that was enough for my personal usage as well when I was travelling. But now I needed a new laptop, so a month ago I splurged on a shiny new toy. Facebook proved itself useful for once: I asked for recommendations and got a lot from my friends. I wasn’t too impressed by my previous Macbook and Lenovo seemed to have ruined their laptops. But one laptop that caught my eye and quickly convinced me to buy it was the Dell XPS 13. The bigger question was which model to buy: the 2018 one, which has only USB C ports and a smaller battery, or the late 2017 one (they released two models in 2017!!!!), which has a more usual port setup and a bigger battery. I ended up going for the latter. 
</p> <p>From the beginning, I experienced some interesting issues: when I first started it up and it was doing its initial setup thing, it gave me a BSOD. Over the next week, it would randomly restart. A quick search revealed that the BIOS might be the culprit and after upgrading it to the latest version, I didn’t have this problem anymore. Still, shame on you Dell for releasing such a BIOS.</p> <p>I didn’t want the fancy 4K display, because I find it overkill on a 13” laptop, and it comes with a touchscreen, which I find to be a useless gimmick. But the only configuration that had 16GB of RAM came with that and an i7 processor. It comes at the cost of battery life, but I still manage to get it to last around 8 hours at least, so it’s pretty good. </p> <p>And the display is amazing, I have to admit it. It’s extremely sharp and incredibly bright (I normally use it at less than half the brightness). The colour range is also very good. It has very small bezels, but that comes at the cost of the camera being placed in the bottom left corner, looking up your nose in video calls. </p> <p>Unfortunately, the BIOS bug I mentioned above was not the only software issue the laptop had. For the first month, it felt like it lagged a bit. Sometimes text would show up with a delay and clicks weren’t responsive. At first I blamed it on Javascript and modern bloated web development. Then it happened in Notepad. Then it occurred to me to look at my CPU with some monitoring tools - and lo and behold, it was throttled to 1.3 GHz, even though I was on AC and had the power settings at the maximum. After some fiddling with settings, I read on the internet that maybe unplugging it and plugging it back in would help. And indeed it did. The processor clock speed jumped up to over 3 GHz, the fans spun up and everything moved snappily, like it should with an i7 processor. What are you doing Dell?</p> <p>The keyboard is pretty good. 
I didn’t really enjoy chiclet keyboards, but I guess I got used to them and now I don’t mind them. It doesn’t have dedicated media keys, so you have to use the Fn button to toggle between them and the F keys. The trackpad is good, except sometimes it misinterprets my two finger scroll for a pinch to zoom gesture. </p> <p>It has a charging port, a USB C port (which cannot be used for charging unfortunately), two USB A ports, an audio jack and an SD card reader. </p> <p>When on the road, so when not using it for CPU intensive tasks, it doesn’t heat up and is comfortable to hold in my lap. Also, the battery lasts around 8 hours, so enough to get through a working day. </p> <p>Given that Windows now has a Linux Subsystem, I haven’t even bothered yet to install Linux. I might do it in the future, but so far, it has worked pretty well for me. You can install fish, tmux and a lot of other command line tools. What else can a simple software engineer want? </p> <figure markdown="1"> <video src="/static/videos/dell.mp4" autoplay loop> </video> <figcaption>Opening my Dell XPS 13 laptop</figcaption> </figure> <p>There is one mildly glaring hardware annoyance: it's pretty hard to open the lid with one hand. As you can see in the above video, I pretty much have to open it with two hands. Oh well, I can live with that. </p> <p>The design is surprisingly pleasant to the eye. There is finally a nice looking Windows laptop! I don’t know exactly what it’s made of. On the outside there is a metallic shell and on the inside there is a very smooth, silky material, which is very nice to touch. Combined with the fact that the laptop is really lightweight, it’s an awesome choice. If the software bugs don’t happen again, I think I’ll really enjoy working on this little thing.</p> Sat, 10 Mar 2018 15:21:00 GMT,2018-03-10:/2018/03/10/dell-xps-13-review Goals for 2018 <p>Because of my move to Romania, I decided to postpone any big goal setting until I’m finally here. 
Packing, driving 3000 kilometers, unpacking, deregistering from Switzerland and registering in Romania all take unpredictable amounts of time, so I just thought about what goals I want and they will be my Martisor resolutions.</p> <p>Most of the things will stay similar. For example, my get to 82 kg goal is still there, for a third year in a row. This time it will stick, I promise! But, now that I’m no longer at Google and I no longer have the “all your intellectual property belong to us” clause in my contract, I would like to start being more active in the open source community so I’ll add a goal about that. </p> <h4>Physical</h4> <dl> <dt>Weight watching</dt> <dd>First off, I have to get back to 82 kilograms. Second, I have to stay there. It is getting old that I keep losing the weight, only to put it back on. I want to lose 1 kg per month, so that by the end of the year I will be in the right shape. </dd> <dt>Exercise three times a week</dt> <dd>The same amount as before (I found that it worked fairly well for me), but I will extend the activities that count: given that I live in a house now, any house work that gets me sweating and that I do for more than 30 minutes counts. Of course, going to the gym, running, swimming and other traditional sporting activities are still valid.</dd> <dt>Juggle for 5 minutes every day</dt> <dd>I want to learn how to juggle, so I will start practicing for 5 minutes every day. The goal is to be able to juggle 4 balls for 1 minute. </dd> </dl> <h4>Mental</h4> <dl> <dt>Read 20 books this year</dt> <dd>Self-explanatory</dd> <dt>Write three blog posts a month</dt> <dd>Self-explanatory</dd> <dt>Spend 2 hours per week studying German</dt> <dd>Given that I have learned German for three years and I got to an okay level, it would be nice to get a diploma to prove it. I’ve finished A2 courses, so I would like to get to B1 level this year and take an exam in that. 
</dd> <dt>Two open source contributions per month</dt> <dd>There are a lot of cool open source projects, so I would try to contribute to them. Ideally, the contributions would be measured when they are accepted, but because things can get complicated quickly, I’ll accept writing 100 lines of code towards fixing an issue as well. </dd> </dl> <h4>Spiritual</h4> <dl> <dt>Read 3 chapters from the Bible every day</dt> <dd>Self-explanatory</dd> <dt>Memorize 1 Bible verse every week</dt> <dd>Sigh. Here we go again. However, this time I will apply a twist: I’ll try to get a group of friends from church to do this with me. Hopefully, learning in a group setting will help me with this goal.</dd> </dl> Fri, 23 Feb 2018 10:25:00 GMT,2018-02-23:/2018/02/23/goals-for-2018 An update on rolisz <p><img alt="My Google Guest Badge" src="" width="100%" style="max-width:600px"/></p> <p>After 3 years and 5 months at Google, time has come for me to move on to my next adventure. Yesterday was my last day at Google. I will be moving back to Romania, where my wife still has one year of university left. </p> <p>On one hand, I'm super excited about the changes the future will bring. I have a super interesting job lined up. It will be nice to be able to understand again everything that is spoken around me. Having friends and family close by will also be a huge plus. </p> <p>On the other hand, I can't say I'm not a bit sad. I had a good life here in Switzerland, Google is a really good employer and I had some amazing colleagues here in Zurich, whom I will miss.</p> <p>I hope I won't suffer from too much reverse culture shock as I go back to Romania. But, like until now, I trust that God will continue to lead my steps and that He has a place and role for me in Oradea. 
All I have to do is obey faithfully.</p> Thu, 01 Feb 2018 10:22:00 GMT,2018-02-01:/2018/02/01/an-update-on-rolisz Mass Effect Andromeda <p><img alt="Mass Effect Andromeda poster" src="" width="100%" style="max-width:600px"/></p> <p>I have been a huge Mass Effect fan. I loved the first game as soon as I played it. I found the story fascinating and the science in it amazing. I played the games and read the books and got to know the universe pretty well. But then Mass Effect 3 came out and I was fairly disappointed by it. And now, after a long break, the latest game in the series came out, Andromeda.</p> <p>As you can guess from the name, this game happens in the Andromeda galaxy, so it's a complete break from everything we've seen previously. All new planets, all new stars, all new enemies. In fact, the story sets off right before Mass Effect 2 or 3, and then skips 600 years into the future, but in complete isolation. </p> <p>Because I'm a grownup with a job now, I didn't have time to play it right away. I actually waited for a while and bought it on sale two months later. And it took me more than half a year to finish the single player campaign.</p> <figure> <p><img alt="The Nexus" src="" width="100%" style="max-width:600px"/> <figcaption>The Nexus, the central hub for the Milky Way species</figcaption></p> </figure> <p>Mass Effect Andromeda had a pretty botched launch. It had a lot of bugs and people especially liked to make fun of the facial animations. Because of this, BioWare had to release several patches for it in the first two months. As a result, I actually had fairly low expectations (and also because I really didn't like <a href="">Mass Effect 3</a>).</p> <p>But my low expectations were met and even exceeded. Too bad sales sucked, so BioWare pretty much shelved Mass Effect for a while. I did find that the science part of the game went away. 
While in the original one they had several pages explaining how Element Zero works and how FTL flight happens, now they hand-wave away how they travelled between galaxies in one sentence and that's it. Also how does SAM, the resident AI, interact with the local ancient technology? Well, it's an AI, of course it can do that. Ugh. </p> <p>Once you arrive in Andromeda, you realize that all the planets that the Andromeda Initiative magically saw before leaving the Milky Way are uninhabitable and that there is this weird Scourge that fries all electronics. But luckily, on the first planet that you crash on, you find some abandoned technology, called Remnant for some reason I can't figure out, and it starts terraforming the planet. Unluckily, you also find some aliens interested in that tech, the Kett (a name everyone somehow uses without ever having agreed on it), and they shoot first, ask questions later. But luckily again, you find some other aliens, the Angara (at least they give you their name), who are more friendly. </p> <figure> <p><img alt="The Remnant Vault" src="" width="100%" style="max-width:600px"/> <figcaption>A Remnant vault, which terraforms planets</figcaption></p> </figure> <p>So you go around various planets, doing the usual Mass Effect side missions and levelling up. The difference is that now you have to unlock the vaults that have the Remnant tech and terraform the planets, transforming them from dreadful and deadly places to cozy homes. In the meantime you fight off the Kett and of course you deal with all the politics people brought from the Milky Way.</p> <p>The Kett become a bigger and bigger problem, so the solution is to get rid of them. You go for the Kett head honcho, The Archon, who is an arrogant bastard. You fall for a stupid trap, but of course you manage to escape. In the meantime, you learn that the Jardaan created the Remnant, but they were attacked by somebody with the Scourge. And poof, now they're gone. 
You defeat the Archon, you find a really cool Dyson sphere and everyone is happy. Yaaaay. </p> <p>The storyline is actually promising, but the problem is that they clearly intended to have some DLCs. There are so many hints dropped only to pique your interest, fragments of really cool information which are never fully developed, so you are left hanging. And BioWare confirmed that they won't be releasing any DLCs. Why did you do this to us? Whyyyy?</p> <figure> <p><img alt="Part of the gang" src="" width="100%" style="max-width:600px"/> <figcaption>Ryder with 4 of the team members</figcaption></p> </figure> <p>Your new squad mates are quite interesting. You have the 1000 year old krogan, who's seen it all, except a new galaxy. You've got the human biotic who trained with asari commandos. There's the rebellious and playful asari scientist, who's obsessed with the Remnant. One of the angara joins you and spreads the zen around. The turian is the smuggler/black-market dealer type. And lastly, you've got another human, who used to be some sort of special police back on Earth.</p> <p>The dialogue is pretty good, with plenty of jokes to go around. Especially while you are driving, your team mates start talking among themselves, usually annoying each other, which keeps you from getting too bored driving around. </p> <p>Combat is quite fun. A new addition is that you have a jet pack that you can use to jump (finaaaaally!) and hover while shooting. This means that you can jump on your enemies and rain destruction on them from above. There is also a weird profile system (such as soldier or biotic specialist), where you can switch between profiles at will. In practice, I found it quite cumbersome, because switching profiles triggers the cooldowns for all your powers. So I settled for the profile that gave me the siphoning ability: melee attacks would restore my shields. 
</p> <figure> <p><img alt="The Tempest" src="" width="100%" style="max-width:600px"/> <figcaption>Your ship, the Tempest</figcaption> </p> </figure> <p>Graphics/physics: I was expecting better from a modern AAA game. While the facial animation glitches didn't bother me so much, I was frustrated by seeing how many times solid objects would move through each other or would move in other impossible ways. I mean, it's 2017, how come collision detection is still so buggy? Also, your vehicle can move in... um... funny ways. </p> <p>The worlds where you play are quite big and varied. You've got everything from desert to tropical to frozen wastelands. You drive around in some all terrain vehicle, which occasionally moves in weird ways. The most annoying part is with mountains, which are tempting to try to go over in the car, but it usually ends up being slower than going around. </p> <p>The soundtrack is pretty decent. It gives you the chills sometimes, but most often it's just not memorable. </p> <p>I think I'll give the game a 7/10. It was fun to play, but it had plenty of quirks that were annoying, especially for a Mass Effect game.</p> Sat, 13 Jan 2018 19:44:00 GMT,2018-01-13:/2018/01/13/mass-effect-andromeda