I wanted to write a little grep program that prints the lines of a given file that contain a given term.
I chose to write it in Haskell, Java, Ruby, Scala and Python and see how different the thinking process and the result would be.
The aim was to get an idea of these points:
- how differently these languages deal with IO
- how conditional flows could be managed
- how clean the code would be
As a baseline, I defined the following requirements:
- usage is grep term file (the term and the file to grep would be taken as program arguments)
- in case of incorrect usage (not enough arguments), the program prints the usage
- these program arguments should be named and not used directly
- the actual grep work should be defined in a specific function
- cases such as incorrect file paths are ignored
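To make the requirements concrete, here is a minimal Python sketch of a program that satisfies them. It is an illustration written for this list, not necessarily the code from the repository; the names `grep`, `main`, `term` and `path` are my own choices.

```python
def grep(term, lines):
    """Return the lines that contain term (plain substring match)."""
    return [line for line in lines if term in line]


def main(argv):
    # Not enough arguments: print the usage and report failure.
    if len(argv) < 3:
        print("usage: grep term file")
        return 1
    # Name the program arguments instead of using argv directly.
    term, path = argv[1], argv[2]
    # Incorrect file paths are deliberately not handled (see the last requirement).
    with open(path) as handle:
        for line in grep(term, handle):
            # Lines read from a file keep their newline, so do not add another.
            print(line, end="")
    return 0
```

Calling `main(sys.argv)` under the usual `if __name__ == "__main__":` guard turns this into a command-line script.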
So I started with Haskell.
Haskell has a very interesting way of dealing with IO. IO is part of the “dirty” world, not of the pure one.
So a computation that does IO work is kept “isolated” from the pure code, and this separation is explicit, through the IO monad.
To give a quick, simple example of what this implies: a String value obtained from an IO operation has type IO String and not just String.
Average Time (Haskell): real 0m0.047s | user 0m0.042s | sys 0m0.004s
mapM_ putStrLn . filter (isInfixOf term) . lines ?
Another way of writing this would be:
putStrLn . unlines . filter (isInfixOf term) . lines.
mapM_ is a function that takes a monadic function (here putStrLn) and a Foldable structure (here the list of filtered lines). It applies the function to each element in order, runs the resulting actions and discards their results.
Both seem to me like fine solutions so I just decided to leave it like this.
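For readers more at home in Python, the same two shapes can be sketched there. This is an analogy rather than a translation of the Haskell, and the function names `matching`, `print_each` and `print_joined` are hypothetical.

```python
def matching(term, text):
    """Split text into lines and keep those that contain term."""
    return [line for line in text.splitlines() if term in line]


def print_each(term, text):
    # Analogue of mapM_ putStrLn: one output action per matching line.
    for line in matching(term, text):
        print(line)


def print_joined(term, text):
    # Analogue of putStrLn . unlines: build one string, then print it once.
    # Like unlines, every line gets a trailing "\n". In Haskell, putStrLn
    # would add one more newline on top; end="" avoids that extra blank line.
    print("".join(line + "\n" for line in matching(term, text)), end="")
```

Written this way, both functions produce the same output; the difference is only whether the output happens line by line or in one go.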
Average Time (Java): real 0m0.239s | user 0m0.354s | sys 0m0.050s
Java 8 introduced Files.lines, which makes this work nicer than the traditional while loop around readLine(). Apart from that, not much.
Average Time (Ruby): real 0m0.060s | user 0m0.040s | sys 0m0.016s
Ok, maybe the foreach could be broken across lines, but this looks pretty clean and readable.
Average Time (Scala): real 0m0.421s | user 0m0.551s | sys 0m0.092s
Average Time (Python): real 0m0.038s | user 0m0.018s | sys 0m0.015s
I think the Ruby version ends up with a very readable core and looks quite clean, but that’s all.
The Python version doesn’t look very interesting apart from the simplified syntax (like Ruby’s), without all the curly brackets that Java and Scala require, but that is a superficial difference.
The Haskell version ends up being, in my view, the cleanest solution. If I did not specify the function types, I would end up with only 6 lines to satisfy the requirements.
The fact that I don’t need to define flows in an if-else way but, instead, with functions and pattern matching, the way it reads so nicely and the way the language deals with IO as something apart (which I find more interesting and challenging)… It’s just something else!
I used a 4.2MB plain text file as the target file. Here I was a bit surprised by Scala, not just because it took longer than the other languages but also because of how much its time varied. Sometimes it took up to 1.165s, while the other languages always stayed within a very small and constant time range.
- Python ≈ 0m0.038s
- Haskell ≈ 0m0.047s
- Ruby ≈ 0m0.060s
- Java ≈ 0m0.239s
- Scala ≈ 0m0.421s
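If you want to reproduce this kind of comparison, here is one possible way to average the wall-clock time of several runs and see how much they vary. It is a sketch, not the procedure used for the numbers above. It measures only the "real" time, not user or sys, and the helper names and the default of 10 runs are assumptions.

```python
import statistics
import subprocess
import time


def time_runs(command, runs=10):
    """Run command `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal printing is not measured.
        subprocess.run(command, stdout=subprocess.DEVNULL, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, fastest, slowest) for a list of durations."""
    return statistics.mean(durations), min(durations), max(durations)
```

For example, `summarize(time_runs(["python", "grep.py", "term", "big.txt"]))` would report the average, the fastest and the slowest run, where `grep.py` and `big.txt` are placeholder names. A large gap between the fastest and the slowest run is the kind of variation I saw with Scala.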
Find the source repository here