Saturday, May 21, 2016

And the first conference of the year was... Dutch Clojure Days

Clojure is a very interesting programming language. If you haven't checked it out yet, go ahead. Apart from being a pragmatic Lisp that offers easy Java interop, it also has a nice community. So far I haven't spent as much time on the language as I would like, but I'm planning to change that.

I was planning to write a few comments about each talk of the first (ever) Dutch Clojure Days event, but I just noticed that the whole conference is now available online.

If you don't want to watch every keynote, there's one that intrigued me: Conversational Computing: How Okasaki made McCarthy right yet again. I liked the way Michiel is trying to connect the dots.

And if you liked my recommendation and want a second one, try Clojure for Data Science: the good, the bad, and the ugly. Simon covers many nice concepts of functional programming, without being afraid to criticize the ugly parts and express his ideas on how things can be improved.

I'm very glad that I attended this conference. It was free (gratis) but thanks to the kind sponsors we had everything (coffee, muffins, and sandwiches). The venue was nice, and the location great (well, I love Amsterdam anyway). So yeah, nice work guys and until next year happy Lisping in Clojure ;)

Sunday, April 17, 2016

Course review: Git key user

Last month I followed a course about Git. The name of the course was "Git key user" and it was organized by TMC.

Obviously Git is a tool that you learn by doing, and indeed the lesson of the day was the following: "Do not be afraid to experiment".

In general Git behaves like a purely functional data structure (or a copy-on-write filesystem), in the sense that it never overwrites or directly removes data. All orphan nodes are kept for two weeks (I assume that this is configurable), and only if you leave them untouched for that period will they be cleaned up by Git's garbage collector.
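To make the "never overwrites" idea concrete, here is a toy sketch in Python. It is not Git's actual object format: `ObjectStore`, `put`, `reachable`, and `gc` are hypothetical names, and the two-week grace period is left out, so orphans are collected immediately.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store: objects are added, never modified."""

    def __init__(self):
        self.objects = {}  # hash -> (parent hash or None, payload)

    def put(self, payload, parent=None):
        # The key is derived from the content, so new content gets a new key.
        key = hashlib.sha1(f"{parent}:{payload}".encode()).hexdigest()
        self.objects[key] = (parent, payload)
        return key

    def reachable(self, tips):
        # Walk the parent links starting from every branch tip.
        seen = set()
        for key in tips:
            while key is not None and key not in seen:
                seen.add(key)
                key = self.objects[key][0]
        return seen

    def gc(self, tips):
        # Drop every object that no branch tip can reach (the orphans).
        keep = self.reachable(tips)
        self.objects = {k: v for k, v in self.objects.items() if k in keep}


store = ObjectStore()
first = store.put("add README")
second = store.put("fix typo", parent=first)
# "Amending" does not change `second`: it creates a new object next to it.
amended = store.put("fix typos", parent=first)
branch = amended  # the branch now points at the amended commit
```

After the amend, `second` is still in the store and can be recovered. It only disappears once `store.gc([branch])` runs, which is the part that Git delays.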

During the course we learnt how to use:

  • git commit --amend to make changes related to the most recent commit. Examples include writing a better commit message or removing a file that is not required. The same actions can be applied to any past commit using a combination of git rebase -i and git commit --amend.
  • git cherry-pick to apply specific changes from one or more branches to a destination branch. This is useful when, for instance, a required feature was developed on the wrong branch.
  • git checkout combined with git stash to clean up a messed up repository.
  • git rebase -i with the options s(quash) and p(ick) to group/restructure related commits and create a better/cleaner commit history.
  • git bisect to find the commit that introduced a bug. That's necessary after finding out (too late) that the current branch is broken, when you are not sure which commit broke the code.

Overall, it was nice to see some concrete use cases of the commands because Git has so many features that it's not hard to get lost...
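The idea behind git bisect is a binary search over the history. A minimal sketch in Python, assuming that the oldest commit is good, the newest is bad, and everything after the first bad commit stays bad; `first_bad`, `is_bad`, and the commit names are made up for the illustration:

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the bug was introduced at mid or earlier
        else:
            good = mid  # the bug was introduced after mid
    return bad


commits = [f"c{i}" for i in range(16)]
broken_since = 11  # hypothetical: the bug was introduced in c11
checked = []       # records which commits had to be tested


def is_bad(commit):
    checked.append(commit)
    return int(commit[1:]) >= broken_since


culprit = commits[first_bad(commits, is_bad)]
```

With 16 commits only a handful of them need to be tested instead of all of them, which is why bisecting pays off on long histories.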

Tuesday, November 24, 2015

Book review: Seven Databases in Seven Weeks

That's a really nice book. Of the seven databases that are covered I was familiar with PostgreSQL and only briefly with Neo4j. So the book gave me the chance to explore some more databases and find out about their strengths and weaknesses. In the following paragraphs I'll explain what I found nice and what not so nice about each of them. Before I start: if you are planning to buy this book, I want to warn you that some features are deprecated or even removed, because some of the database systems have evolved since the time the book was written (2012). For example, the largest part of the Neo4j chapter is useless, because it doesn't use the Cypher language.

PostgreSQL rocks. It's a very powerful RDBMS and I acknowledge that since I have used it professionally. Postgres is mature, fast, and rock-solid. For those reasons I would choose it for all problems that play nicely with relational DBs. And yes, an RDBMS is not the answer to all problems. For example distributed computations do not fit well into this model. Scaling is limited to making your single DB server/cluster more powerful by upgrading/extending its hardware. And not all problems require full ACID compliance and strict schema enforcement.

Riak is flexible. Being able to interact with a DB using a REST interface and a tool like curl should not be underestimated. What I like about Riak is that you can store whatever resource you like (be it a document, an image, etc.) on the fly and map it to your URL of preference. It just works! I see Riak as a Web filesystem that supports distributed computations through mapreduce. But Riak also supports connecting resources and traversing between those connections (link walking). On the downside, configuring Riak and understanding some of its concepts (for example conflict resolution and adding indexes) is currently a pain. And you can only find prebuilt binaries for your operating system (Windows is not supported at all) on

HBase is unusual. It takes some time to understand the way a column-oriented DB works. What I found great is that versioning is built in. If you care about data history that's a big deal. Another plus: compression and fast lookups using Bloom filters are also built in. Great features that can save a lot of development time. The negatives: no REST interface, complex configuration, and no prebuilt binaries -- you need to compile HBase on your own, so forget Windows unless you like pain.
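For readers who have never met a Bloom filter: it answers "is this key definitely absent, or possibly present?" using a small bit array, so a store can skip an expensive lookup for keys that are certainly not there. The following is a generic Python sketch of the data structure, not HBase's implementation; the class name, the sizes, and the row keys are assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Set-like structure with no false negatives and rare false positives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive several bit positions per item by salting one hash function.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely not stored"; True means "maybe stored".
        return all(self.bits[pos] for pos in self._positions(item))


rows = BloomFilter()
for key in ("row-1", "row-2", "row-3"):
    rows.add(key)
```

A key that was added always gets `True` from `might_contain`. A key that was never added almost always gets `False`, and in that case the real lookup can be skipped.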

MongoDB is all about JavaScript. Having the full support of a powerful language like JavaScript while using a DB is very valuable. Being able to save JSON documents adds a lot of flexibility since they can nest arbitrarily. But this flexibility comes at a cost: updating a document means replacing it without a warning, deleting specific elements of a document is not supported, and debugging JavaScript code is a pain. On the other hand, the mapreduce support of Mongo is nice, and it also supports indexing documents. Configuring replicas and sharding is also quite easy. And operating system support is very good.

CouchDB is cute. The Futon Web interface makes CouchDB very user-friendly. Its REST interface and the ability to use curl make it developer-friendly. Moreover, CouchDB has an interesting approach regarding replication, since all servers are treated equally (no master-slave model). The same is true for conflict resolution: one of the conflicting updates is automatically considered the winner, and this is consistent across all nodes. But that's not necessarily the "correct" update... One last thing: CouchDB is easy to install on all popular platforms.

Neo4j is the graph database. There are simply no competitors when it comes to modelling relationships (think of social networks, movies, food, drinks) using graphs. Neo4j has its own query language (Cypher) and a very nice browser that makes experimenting easy. The documentation is also extensive and interactive. Building a cluster is easy. The negatives: learning curve (new concepts and new language), the enterprise edition is not free (gratis).

Redis is generic. It's not a DB as such, but more an in-memory data structure storage toolkit. Redis is simple to use, fast, and supports transactions. Its commands have strange names though, probably the result of an effort to avoid verbosity. Because it is very generic, Redis can be used as a fast in-memory cache for applications that require high performance.

Final comments: Some people have proposed a better definition of the name NoSQL: Not only SQL. I like this definition. Similar to programming paradigms and languages, different database systems have both strengths and weaknesses. Why not use more than one to achieve our goals? That's the main idea behind the polyglot persistence concept, as suggested by the authors. Polyglot persistence means using more than one database to target different application layers. For example Redis for caching, Neo4j for modelling relationships, and PostgreSQL for persistence.

Friday, November 20, 2015

Joy Of Coding 2015 Review

Like last year, Joy of Coding 2015 was a great conference. This year the conference took place in May, once again in Rotterdam. The organisation was similar to that of last year: a few common talks, but also parallel talks and workshops.

The conference this year started with a keynote by Chris Granger (@ibdknox): "Programming as distributed cognition: Defining a super power". I missed the beginning of the keynote but AFAIU Chris wanted to stress the importance of using programming as an exploration tool. In that sense, we should create programming tools that make it easier for scientists to model problems and experiment quickly. His tools Light Table and Eve focus on those aspects.

Next, I watched the presentation "Joy of testing" by John Hughes. The quote of this presentation was "Do not write tests, generate them!". Indeed, using the Erlang version of QuickCheck, John showed a live demo of discovering and fixing bugs using generated tests. John also explained his personal experiences of using the same tools to discover and fix bugs that existed in concurrent Erlang production code (AFAIR the code was used in the automotive industry).

The next speaker was Laurent Bossavit (@Morendil). This keynote was more about psychology than technology. But it seems that there's a deep connection between the two. Laurent suffered from depression and according to him depression is a feature and not a bug. It is very important to be able to debug ourselves, and not just programs. We should stay away from things that make us sad and focus on the things that make us happy. As an example, you might be able to find a COBOL job that pays well, but does COBOL really make you happy? Maybe a job with a lower salary but more fun (think of Python, Arduino, etc.) is better for you.

The next keynote was about "Mutation testing" by Roy van Rijn. Roy believes that mutation testing, a technique for measuring the quality of unit tests, is better than code coverage. There's an actual Java tool that can be used to explore this area: Judy. A mutant is a version of a program with a modified operator, for example one where a logical AND is replaced with a logical OR. Killing a mutant means that the incorrect behaviour of the modified code is detected properly and reported by the tests, and that's basically what Judy checks. I've never tried mutation testing. Maybe one day I will...
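A hand-made illustration of the idea in Python (not Judy, which works on Java code and generates the mutants automatically). The function, its mutant, and the two test suites are all invented for the example:

```python
def can_checkout(logged_in, has_items):
    return logged_in and has_items


def mutant_can_checkout(logged_in, has_items):
    # Mutant: the logical AND is replaced with a logical OR.
    return logged_in or has_items


def weak_suite(fn):
    # Only checks the two cases where AND and OR agree.
    return fn(True, True) is True and fn(False, False) is False


def strong_suite(fn):
    # Also checks the mixed cases, where AND and OR differ.
    return (
        weak_suite(fn)
        and fn(True, False) is False
        and fn(False, True) is False
    )


# A suite "kills" the mutant when it fails on the mutated code.
results = {
    "weak kills mutant": not weak_suite(mutant_can_checkout),
    "strong kills mutant": not strong_suite(mutant_can_checkout),
}
```

Both suites pass on the original function and both execute its single line, so code coverage cannot tell them apart. Only the strong suite kills the mutant, and that difference is what mutation testing measures.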

I enjoyed the next talk by Crista Lopes (@cristalopes) a lot. Crista is the author of a really nice programming book that anyone who is involved with programming should read: Exercises in Programming Style. The book uses a simple concept: Implement the same program using the same language (Python) in 33 different styles! A style is basically a form of a programming paradigm (think of object-oriented, functional, procedural, etc.). During the talk Crista demonstrated a subset of the 33 styles of her book. The purpose of Crista's talk (and AFAIU that's also the focus point of the book) was not to compare the different styles and take sides, but to stress the importance of recognising and understanding the different styles. I can't agree more. There's no best programming style for all purposes, and we should be able to work with all of them. BTW there's a GitHub repository with the styles.

The workshop that I picked for Joy of Coding 2015 was about "Property based testing", by Marc Evers, Rob Westgeest, and Willem van den Ende. Property based testing is about the automatic generation of unit tests for a system by describing its properties. The benefit of using property based testing instead of unit testing is that it (a) takes less time since the tests are generated, (b) is more reliable than manual writing since humans tend to forget to cover all possible cases.
During the workshop we used JavaScript (Node.js and JSVerify) and went through several examples.
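The workshop code was JavaScript, but the core idea fits in a few lines of any language. Below is a minimal Python sketch that uses only the standard library: instead of hand-picking inputs, it states a property (decoding an encoded string gives back the original) and checks it against randomly generated inputs. The run-length functions and `check_round_trip` are invented for the illustration. Tools like QuickCheck also shrink a failing input to a minimal counterexample, which this sketch does not attempt.

```python
import random


def run_length_encode(text):
    """Encode "aab" as [("a", 2), ("b", 1)]."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded


def run_length_decode(pairs):
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=500, seed=2015):
    """Property: decode(encode(text)) == text for every generated text.

    Returns the first counterexample found, or None if the property held.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes repeated characters likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        if run_length_decode(run_length_encode(text)) != text:
            return text
    return None
```

Calling `check_round_trip()` runs 500 generated test cases, including empty strings and long runs of the same letter, without any of them being written by hand.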

The closing keynote couldn't have been better. Kevlin Henney (@KevlinHenney) delivered a mix of jokes and programming advice while going through nice (and not so nice) pieces of code written by various programmers in different languages. Studying code written by others is important and something we all need to do.

Yet another good year for Joy of Coding. I hope that it will continue to use the same successful recipe in the years to come... :)

Monday, November 9, 2015

Book review: Pragmatic Guide to JavaScript

This was my first JavaScript book and I consider it a good overview covering the pros and cons of the language. The author gives good advice regarding which features of pure JavaScript are fine to use and for which features a framework should be preferred to avoid browser incompatibilities.

Many popular applications are demonstrated (custom tooltips, infinite scrolling, form validation, autocompletion, lightbox, 3rd party APIs) and concepts such as client vs server programming are clearly explained. Christophe's focus on Prototype is not a problem for me. It's his favorite framework and the one that he knows well, so it makes sense that he's using it for the demos.

One practical problem: this book is not maintained any more, and as a consequence a few examples are broken, due to a domain that has expired and changes to the Twitter API. I contacted the author on GitHub and he confirmed it. But still, for a book that was published five years ago it's a nice compact guide for people who are familiar with programming and want to focus on the specifics of JavaScript.

Sunday, March 15, 2015

Playing with microcontrollers

The last training course that I followed was about programming microcontrollers. The course was given by Leon van Snippenberg, who has very good expertise in microcontrollers.

For the practical part of the course we used the Microchip dsPIC33F, a 16-bit architecture 40 MHz microcontroller (system on a chip solution). I admit that I'm not very fond of this proprietary platform, so I enjoyed the theoretical part of the course much more than the practical. I would have been more excited if we had used an open hardware solution like Arduino, Raspberry Pi, or something comparable.

A few highlights from the course:
  • A three-operand assembly instruction does not necessarily mean that three registers are used. For example ADD W0, W1, W0 uses only two registers.
  • Most microcontrollers use the Harvard instead of the Von Neumann architecture. This means that there are two distinct address buses, as well as two data buses (instead of one address and one data bus).
  • When writing code in assembly we should avoid thinking about code optimisation, since the code is usually very fast to execute (but very slow to produce).
  • A common problem when programming microcontrollers is read-modify-write. One way to solve it is using shadow registers.
  • When programming a microcontroller using a C interface and interrupts, it is very important to use the volatile keyword for variables shared with interrupt handlers, to disable optimisations that might remove code that seems to be dead but is actually used. Because of that, it is also very important to test the code at every compiler optimisation level, to ensure that it doesn't break.
  • The hardware timers of a platform do not need to follow the same architecture as the processor. For example a platform might use a 16-bit processor with 32-bit timers.
  • Buffers and interrupts are used to solve communication problems between different devices (e.g. a computer communicating with a microcontroller using the serial port).
  • When dealing with non-deterministic problems, disabling interrupts is the most favoured solution.
  • Using a real-time operating system (RTOS) simplifies programming, because we avoid the need to write complex state machines and custom schedulers (those problems are already solved in the RTOS).
  • Multicore support in RTOS is a challenge (unsolved problem?).
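
To illustrate the read-modify-write highlight, here is a small simulation in Python (the real thing would be C or assembly). It rests on one assumption, which AFAIU is the classic form of the problem: reading an output port returns the actual pin levels, and these can differ from the value that was last written. `Port`, `stuck_low`, `set_bit_rmw`, and `ShadowedPort` are hypothetical names.

```python
class Port:
    """Simulated 8-bit output port: reads return pin levels, not the latch."""

    def __init__(self):
        self.latch = 0      # the value last written to the port
        self.stuck_low = 0  # pins that currently read 0 whatever the latch says

    def read(self):
        return self.latch & ~self.stuck_low & 0xFF

    def write(self, value):
        self.latch = value & 0xFF


def set_bit_rmw(port, bit):
    # Read-modify-write directly on the port: trusts whatever read() returns.
    port.write(port.read() | (1 << bit))


class ShadowedPort:
    """Keeps the intended output in RAM and never reads the port back."""

    def __init__(self, port):
        self.port = port
        self.shadow = 0

    def set_bit(self, bit):
        self.shadow |= 1 << bit
        self.port.write(self.shadow)


naive = Port()
set_bit_rmw(naive, 0)
naive.stuck_low = 0b0000_0001  # pin 0 is heavily loaded and still reads low
set_bit_rmw(naive, 1)          # reads 0, writes only bit 1: bit 0 is lost

safe = ShadowedPort(Port())
safe.set_bit(0)
safe.port.stuck_low = 0b0000_0001
safe.set_bit(1)                # the shadow still remembers bit 0
```

In the naive version the latch ends up with only bit 1 set, while the shadowed version keeps both bits because it modifies its own copy and writes the whole value at once.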

My colleague and I challenged Leon by asking why one would prefer a much more expensive solution like the dsPIC* family of Microchip instead of Raspberry Pi or Arduino. The price of the latest Pi is unbeatable. The response was that we should use whatever fits our purpose, and that the Pi manages to achieve such a low price because its makers can estimate in advance the minimum number of units that will be sold. Those manufacturing deals are critical in forming the end price of a prototyping platform.

So far I only own an mbed LPC1768 and I'm very satisfied with it. I hope that I'll build some more advanced prototypes in the future, but you have to start somewhere. I began with flashing LEDs

Continued with adding some basic components like a button

And at some point I built my first practical prototype: a darkness-activated LED

Isn't that nice? In my future posts the plan is to spend more time on explaining the code of prototypes like the last one. For now you can check my mbed repository page.

Monday, February 2, 2015

On writing a book

After reviewing two books about Python, people from Packt asked me if I was willing to write a Python book. I'm glad to see that my first book, Mastering Python Design Patterns, is published!

As I expected, writing a book is much tougher than reviewing one. Especially if you have a full-time job, like in my case. I had to deliver roughly one chapter per week. This is very challenging, since it means that I had to spend many evenings and weekends focusing on delivering a chapter on time.

I hope that my book will be appreciated by the Python (3.x) community. I tried to focus on doing things the Python way instead of reproducing Java-ish or C++-style solutions. To be honest I would have preferred a different title: I recommended the title "Idiomatic Python Design Patterns" but my proposal was rejected, mainly for marketing reasons.

If you are also considering writing a book, I think that it is a very good idea, but take into account the following:

  • Do you have the time to do it? Unless your book is self-published, you'll need to sign a contract with a publisher and that means that there will be deadlines. Make sure that you discuss it first with your partner/family, since it is a demanding task.
  • Does it fill a gap? I don't recommend writing a book just for the money (yes, you are paid for writing the book and depending on the contract you can also get a share of the sales). I have seen many examples of poorly-written books that were created only because the author wanted to make some money. Don't do it. It might be good for your pocket, but it can harm your reputation, your career, and your psychology (think of bad reviews).

To expand a little bit more on point two: I feel that my book is indeed filling a gap. Although there are other books about Design Patterns in Python, none of them focuses on Python 3. In fact, I reviewed one of them, and apart from targeting only Python 2.x, IMHO it is not using idiomatic Python solutions in many cases.

My book is not perfect in any way. The lack of time meant that some examples had to be smaller and more trivial than expected. But this is part of the game. If you are working full-time and you are writing a book, time is your enemy! Be prepared to make compromises...