elektito programming & stuff

So you want to know you're talking to a robot?

Just imagine for a second that people were calling for a law requiring that the nationality of whoever calls them be made clear. “I need to know if a Mexican is calling me,” they would say. What would you call those people? Racists, right?

Now people are doing something similar, but they call what they do “ethics”. They are outraged that we can’t be sure if it’s a human or a robot on the other end of the line. Take a look at this video, from Google I/O 2018, if you don’t know what the fuss is all about:

My question is, why would you need to know that? One common argument so far has been that scammers can make convincing robo-calls using this technology. Well, excuse me, but scam calls were invented by humans and they are still made, sometimes on pretty large scales, by human callers.

And besides, say a law was passed that robots had to introduce themselves as such over the phone. Then what? Let me let you in on a little secret. Scammers are already doing something illegal. You think they care? So what happens is legitimate calls, for which you have nothing to worry about anyway, will start with “Hi! I’m Google Assistant calling on behalf of Bob,” while scam calls will still start with “Hey this is Bob from…”, you get the idea.

So let me give you a bit of advice. When someone calls you, listen to what they are saying. If it makes sense, go ahead. If not, end the call immediately. It doesn’t make much of a difference whether it’s a robot calling you or not.

Benchmarking Python XML Parsers

I’ve written a small benchmarking tool for some of the different XML parsers available to Python programmers. It calculates each option’s throughput by sending a large amount of XML data to each parser. You need to provide it with some XML input.

$ ./pyxmlperftests.py 1.xml 2.xml 3.xml 4.xml

You can find the source code here on GitHub.

These are the results on my computer:

Results:
   xml.dom.minidom: 7.49 MBps
   lxml.etree: 89.63 MBps
   xml.etree.ElementTree.iterparse: 31.77 MBps
   xml.etree.ElementTree: 58.43 MBps
   xml.sax: 25.68 MBps

As you can see, lxml rocks. Although, to be honest, I’m still looking for something faster than that!

A word of warning: I don’t claim this is in any way a fair or scientific benchmark. I just wanted to see how these parsers compare relative to each other, and cooked up this script to get me some numbers.
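
For readers who want a feel for how such a measurement works, here is a minimal sketch. It is not the source of pyxmlperftests.py; it only covers the standard-library parsers, generates its own synthetic input instead of reading files, and counts a megabyte as 10^6 bytes, all of which are assumptions made for illustration. The helper names (make_sample_xml, throughput_mbps, PARSERS) are made up for this example.

```python
import io
import time
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax


def make_sample_xml(n_items=2000):
    """Build a synthetic XML document as bytes (a stand-in for real input files)."""
    items = "".join(
        '<item id="{0}"><name>item {0}</name><value>{1}</value></item>'.format(i, i * 2)
        for i in range(n_items)
    )
    return ("<root>" + items + "</root>").encode("utf-8")


def parse_minidom(data):
    xml.dom.minidom.parseString(data)


def parse_etree(data):
    ET.fromstring(data)


def parse_iterparse(data):
    # Clear each element once it is complete so memory use stays low.
    for _event, elem in ET.iterparse(io.BytesIO(data)):
        elem.clear()


def parse_sax(data):
    # A do-nothing handler: we only measure the parser itself.
    xml.sax.parseString(data, xml.sax.ContentHandler())


PARSERS = {
    "xml.dom.minidom": parse_minidom,
    "xml.etree.ElementTree": parse_etree,
    "xml.etree.ElementTree.iterparse": parse_iterparse,
    "xml.sax": parse_sax,
}


def throughput_mbps(parse, data, repeats=3):
    """Return megabytes (10^6 bytes, an assumption) parsed per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        parse(data)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(data) * repeats / elapsed / 1e6


if __name__ == "__main__":
    sample = make_sample_xml()
    print("Results:")
    for name, parse in PARSERS.items():
        print("   {}: {:.2f} MBps".format(name, throughput_mbps(parse, sample)))
```

The numbers this prints will differ from the ones above, since the input and the machine are different.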

Launch Virtual Machines Quickly with spinup

For a long time now, I’ve been using Vagrant to quickly launch a VM or two when I need to. Recently, I’ve been less and less satisfied with it. It’s usually slow, and I have to edit the Vagrantfile whenever I want to change the machine specs. The slowness might be partially due to using VirtualBox by default. There is a vagrant-libvirt plugin that lets you use libvirt/KVM, but the plugin seems to be a hit-and-miss affair and I’ve not been able to make it work all the time.

There is always the option of using virsh and other libvirt utilities, of course, to launch VMs, but that is not as simple as I’d like. I finally decided to write some sort of wrapper script for libvirt, and here it is: spinup, a simple utility to launch VMs as fast as possible.

You need to clone the repository, run prepare.sh and you’re set to use spinup. I’ll also assume that you’ve made a symlink to spinup.py as spinup in an appropriate place, and installed the dependencies, so that the utility is always easily available to you. There’s of course the option of installing dependencies in a virtualenv and running ./spinup.py from there. You will obviously need libvirtd available, too.

The easiest way to launch a VM is by running this:

$ spinup

This will create an Ubuntu-based VM with 1GiB of RAM and one CPU core, downloading the Ubuntu cloud image the first time you run it. To land inside the VM, simply run:

$ spinup ssh

The created VM is tied to the directory you create it in (although no files are created in that directory). So you need to be in that directory in order to have access to the VM.

In order to destroy the VM, simply run:

$ spinup destroy

You can create a VM with different specs like this:

$ spinup coreos 4G 2cpus

This will create a CoreOS-based VM with 4GiB of RAM and two CPU cores.

It’s also possible to launch multiple VMs at the same time:

$ spinup :foo ubuntu 2G -- :bar coreos 4G 2cpus

Here we have created two VMs, naming them foo and bar respectively. In order to ssh into bar simply run:

$ spinup ssh bar

Running spinup destroy will destroy both VMs.
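
To make the command-line grammar above concrete, here is a rough sketch of how arguments like these could be split into per-VM specs. This is not spinup’s actual parser; the function name parse_vm_specs, the dictionary keys and the M suffix for memory are all assumptions made for illustration. Only the defaults (Ubuntu, 1GiB, one core), the :name prefix, the G suffix, the Ncpus form and the -- separator come from the examples in this post.

```python
import re


def parse_vm_specs(args):
    """Turn a spinup-style argument list into one dict per VM (hypothetical)."""
    # Split the arguments into groups, one group per VM, on the "--" separator.
    groups = [[]]
    for arg in args:
        if arg == "--":
            groups.append([])
        else:
            groups[-1].append(arg)

    vms = []
    for group in groups:
        # Defaults match the plain "spinup" invocation described above.
        vm = {"name": None, "image": "ubuntu", "memory_mib": 1024, "cpus": 1}
        for token in group:
            mem = re.fullmatch(r"(\d+)([MG])", token)  # "M" is an assumed extra
            cpu = re.fullmatch(r"(\d+)cpus?", token)
            if token.startswith(":"):
                vm["name"] = token[1:]
            elif mem:
                amount, unit = int(mem.group(1)), mem.group(2)
                vm["memory_mib"] = amount * 1024 if unit == "G" else amount
            elif cpu:
                vm["cpus"] = int(cpu.group(1))
            else:
                # Anything else is taken to be the image name.
                vm["image"] = token
        vms.append(vm)
    return vms


if __name__ == "__main__":
    for vm in parse_vm_specs(":foo ubuntu 2G -- :bar coreos 4G 2cpus".split()):
        print(vm)
```

Treating every unrecognized token as the image name keeps the sketch short; a real tool would probably want to validate it against a list of known images.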

One area in which spinup is sorely lacking at the moment is networking. The created VMs are connected to libvirt’s default network, but there are no other options. I’m hoping to fix this in the near future. (Update: network configuration is now available, although you might need to create the appropriate libvirt networks first.)

spinup is in its very early stages of development, released in the “release early, release often” spirit. If you have any questions, you can send me an email at mostafa(at)sepent.com, or create an issue or send a pull request over at GitHub.

Building a static flanneld binary on Ubuntu

I just spent some time trying to build flannel and since there were some nuances, I decided to list the instructions here.

  1. Install build dependencies:

     sudo apt-get install linux-libc-dev golang gcc
    
  2. Make Go directories:

     mkdir -p ~/go/src
     cd ~/go/src
     export GOPATH=~/go
    
  3. Clone flannel:

     git clone https://github.com/coreos/flannel.git
    
  4. Install Go dependencies:

     cd flannel
     go install
    
  5. Since I wanted a statically linked binary, I edited the Makefile and updated the build instruction like this:

     dist/flanneld: $(shell find . -type f  -name '*.go')
         go build -o dist/flanneld \
            -ldflags '-extldflags "-static" -X github.com/coreos/flannel/version.Version=$(TAG)'
    
  6. Now build the binary:

     make dist/flanneld
    
  7. The flanneld binary should now be in the dist directory. You can strip it to make it smaller:

     strip dist/flanneld
    

One other problem I encountered was that you need at least 2GB of RAM for this. I was trying this in a VM with 1GB and I ran out of memory.

DPDK, tamed!

DPDK is a fantastic piece of software. I’ve used it both for work and in my hobby projects (yes, I crunch packets as a hobby, say what you will about me!) and it works great. My only grievance with it has always been the complicated build system it forces on you. It’s ugly and inflexible and it sometimes drives you crazy.

Yesterday, I saw that DPDK is in Ubuntu’s repositories. It took me some time to realize what that means. I was looking for the value of the RTE_SDK variable when it clicked in my head: I could just use the damn thing like any normal library, with -l, -L, -I and all that. I just needed to add a -msse4.2 flag for the compilation to work properly (I’ve been adding that flag to the TOOLCHAIN_CFLAGS make variable on some of my machines to make DPDK compile anyway).

Does it have any drawbacks? I have no idea, as I never managed to penetrate the many layers of DPDK makefiles to see what they really do. It’s quite possible that this is not as efficient as compiling DPDK yourself so it can detect and use the full capabilities of your machine, but at the very least I learned a thing or two about building DPDK by looking at the Git repository of the Ubuntu maintainers. Just do a git clone https://git.launchpad.net/~ubuntu-server/dpdk. Switch to the ubuntu-xenial branch to see the interesting bits.

This seems to be a Canonical endeavor, so thank you, kind folks at Canonical!