Posts about programming

Assist time

I sometimes hang out on #org-mode or #nikola and answer questions. I usually am not able to answer off the top of my head and I look up docs or dig through the code. Sometimes, we find ready-made documented answers, but other times we end up contributing some documentation, filing an issue, submitting a patch or hacking up something for that super-special use-case.

Until now, I looked at this time as IRC time, which loosely translates to distraction time. But, I'm usually learning about the tools I use a little bit more. Even when I'm not, I'm helping someone do something they want to do. Sometimes empowering them to fix future problems on their own. And indirectly making the user community a wee-bit happier, and possibly the software a tiny bit better.

This isn't limited just to helping someone on IRC. Helping out a co-worker do something that they are new to, or just adding a comment or editing a Stack-overflow answer may end up in the "distraction time" bin, just because you weren't doing something on your TODO list. It needn't be.

Taking a cue from assists in football, I decided to call this time assist time, and to try to start seeing it as (semi-)productive. Naming helps.

Reading-time based scheduling

I had posted a link to a poem on Medium in a Slack channel that I use with friends. A friend said she liked that the Slack preview included Medium's reading-time estimate: she could decide whether or not she wanted to read the poem, or any other article, at that moment.

This gave me the idea for a reading time extension for my browser, or my feed reader or my bookmarks – my reading list. The first version should be able to compute or extract the reading time for an article or a tab in my browser, and index them. I want to be able to specify the amount of time I will be able to spend reading, and be presented with something from my reading list. I think this would help with scheduling the reading of longer articles, and also to actually help me get through my reading list.

Reading time estimates that use heuristics based on word count may not really work, and may do more harm than good. But it still seems worth a try, to see if it helps my reading habits in any way. A quick search pointed me to this extension, which can show the reading time for any page but doesn't really do what I want.
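For a sense of how such heuristics work, here's a minimal sketch (my own illustration, not that extension's code) that estimates reading time from a word count and an assumed reading speed:

```python
import re

def estimate_reading_time(text, words_per_minute=250):
    """Estimate reading time in minutes from a word-count heuristic.

    250 wpm is a commonly assumed average reading speed; real tools
    may also weight images and code blocks differently.
    """
    words = len(re.findall(r"\S+", text))
    # Round up, so even a short snippet reports at least one minute.
    return max(1, -(-words // words_per_minute))

article = "word " * 1200
print(estimate_reading_time(article))  # 1200 words at 250 wpm -> 5 minutes
```

This is exactly the kind of crude estimate that can mislead for dense technical articles, which is why I'd treat the numbers as a scheduling hint rather than a promise.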

Stepping AFK

In the past few weeks, I noticed three instances where I was forced to take a break exactly when I was ready to jump in and write some code to fix a bug or add a small feature. I had to step out of the house and take a walk to meet someone, etc.

I ended up getting ideas during those walks that significantly changed and simplified how I would have implemented things, if not for the breaks. Even if I had eventually zeroed in on those solutions anyway, I'm pretty sure it would have taken a couple of not-so-good attempts and much longer than it did.

Context switches are usually considered expensive for programmers, but taking a break at the exact moment when I had all the required context loaded into my head seemed to help. It probably also helped that I was taking a walk, and there weren't really any other inputs or outputs competing for the space in my head.

This got me thinking about doing this more deliberately. I'm also reminded of Rich Hickey's Hammock Driven Development talk; I wonder if this is a smaller/different version of it, and it makes me want to try out the things he suggests more deliberately. If any of you have thoughts, experiences, or experiments to share around this, I'd love to hear from you!

Thinking about Data Ethics

Earlier this month, a researcher made public a dataset containing the profiles of about 70,000 OkCupid users. He didn't really see a problem with doing this, because he felt he was only presenting already publicly available data in a more usable form.

Yesterday, I came across this quote in the very first chapter of Allen Downey's book Think Stats, which I liked a lot and which reminded me of that incident.

(Image: a quote from the first chapter of Think Stats.)

I hadn't looked at the OKCupid data release and the discussion around it much, but I went back and read this article by a social media researcher who thinks a lot about these things.

She puts forth a lot of interesting ideas to think about ethics. Some things that stood out to me are:

  • Ask yourself how the person whose data you are using feels about the data.
  • Taking a 'what if' impact approach to thinking about data and ethics.

Also, you needn't call yourself a researcher to actually be doing experiments with (or analyses of) "big data", and discovering and putting out facts that have an impact – however big or small. You should really go read the article, whether or not you are a researcher working with data.

Incidentally, there is a meet-up on Data Ethics this weekend in Bangalore. I'm excited to learn and think more about this, and talk to others who care.

Level-up Tools

Thanks to a friend, I got an upgrade to our still-being-set-up kitchen: a non-stick pan, along with a few other new additions. Previously, I would use a bowl of the kind people usually use to boil milk for making whatever I made. The non-stick pan feels so great! It has made some of the things I used to make a lot simpler, because it's non-stick. And it has vastly expanded the possibilities of what I can make, by virtue of having a flat, wide base. The pan is a great addition to my kitchen paraphernalia, and it adds a new dimension to the kinds of things I can make. I'm not here to write a product review, though.

What are the tools, in the other things you do, that drastically changed the way you did something, or added a new dimension to the kinds of things you could do – tools that make you feel like you have a new super-power? Learning to write Python (after starting off with C) gave me so much power, allowing me to focus on the problem rather than fussing over low-level details. Sacha Chua mentions in this post how using a Spaced Repetition System like Anki drastically improved her efficiency, because she could focus on higher-level things rather than trying to recall, or search for, which method or function to use.

What are some such level-up tools for you? Is there a systematic approach to discovering tools?

Tedium in work-flows

I use Nikola for generating this blog. When creating a new post, it prompts for a title, and creates a file for the post.

Often I'm starting off with only a vague idea that needs to be fleshed out before it can be published (or discarded). It is quite difficult to come up with a title at this stage. I just want to start a draft and write things down!

I could use a "draft title" and change it after finishing the post, but that feels tedious: it requires three steps – changing the title, the post filename, and the post slug. The last two are optional, really, but I feel they are important, especially when the original title is very different from the new one.

Being forced to come up with a title before anything else feels tedious, and adds to the effort required to start a new post. I shouldn't really be worrying about the effort required to change the title of an unwritten post, but it happens subconsciously.

To work around this, I now have a "re-title utility" in my editor that takes care of all the tedious details. I can start with a random title, like Draft-1, and change it when I'm done with the post. I feel this is going to lead to a lot more drafts, at the very least, if not published posts.
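My actual utility is a bit of Emacs Lisp, but the idea is simple enough to sketch in a few lines of Python. The metadata format below assumes reST-style Nikola posts; the function names are my own, not Nikola's API:

```python
import os
import re

def slugify(title):
    """Turn a title into a URL-friendly slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def retitle_post(path, new_title):
    """Rewrite the '.. title:' and '.. slug:' metadata of a post,
    then rename the file to match the new slug."""
    slug = slugify(new_title)
    with open(path) as f:
        text = f.read()
    text = re.sub(r"(?m)^\.\. title: .*$", f".. title: {new_title}", text)
    text = re.sub(r"(?m)^\.\. slug: .*$", f".. slug: {slug}", text)
    with open(path, "w") as f:
        f.write(text)
    new_path = os.path.join(os.path.dirname(path), slug + ".rst")
    os.rename(path, new_path)
    return new_path
```

With something like this bound to an editor command, a post can start life as Draft-1 and be renamed in one step when it's done.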

Another work-flow related thing I came across recently was @Malabarba's issue on CIDER (an IDE for Clojure in Emacs). The REPL takes a while to start up, and this caused him to not use CIDER for running tests if there wasn't an already-open REPL.

The tedium that people feel affects how they use a tool. Not surprisingly, making tedious-feeling tasks a breeze also affects how, and how much, they use it. Subtle variations in a work-flow can make or break it. How do you discover such potential work-flow make-or-breakers? I think these things would help:

  • Use the tool yourself (dog-food)
  • Talk to (or watch!) people using your tool
  • Look at work-flows in other similar tools
  • Think explicitly about various scenarios and simplify or improve work-flows

I'd love to hear examples of this, and any ideas or thoughts you may have on identifying and fixing such things!

Error messages and new users

I was helping a friend of mine set up his blog, and we were trying to use Hexo – a static site generator. We chose a JavaScript-based tool since he's trying to learn JavaScript. I skimmed through the active JavaScript projects in this list and finally zeroed in on Hexo, based on its popularity. I promised to help my friend set it up, but he first tried to do it on his own and got back to me after an hour or so, quite frustrated and almost on the verge of giving up. I didn't expect this from a tool with so many stars, forks, plugins and so much active development.

We finally got it working, but we found the error messages horrendous – even for someone who has been using free and open-source tools for a while now. Printing errors from the compiler or interpreter directly, along with the stack trace, is almost always the worst thing a tool or utility can do (as opposed to an API or library). The stack trace is definitely useful for developers trying to build upon or improve your tool – so have a debug or development mode where developers can get all the information they need.

If you care about your users, especially new users, make sure you spend sufficient time on showing human-readable messages. If possible, list the likely causes of each error, along with tips for troubleshooting.
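As a sketch of the pattern (in Python rather than Hexo's JavaScript, and with a made-up error table), the idea is to translate low-level exceptions into messages that list likely causes, and reserve the stack trace for a debug mode:

```python
import sys
import traceback

# Hypothetical mapping from low-level exceptions to human-readable
# messages listing likely causes and troubleshooting tips.
FRIENDLY = {
    FileNotFoundError: (
        "Could not find the site's config file.\n"
        "  * Are you running this from the site's root directory?\n"
        "  * Did you run the init step first?"
    ),
}

def report(exc, debug=False):
    """Return a human-readable message for exc; append the raw stack
    trace only in debug/development mode."""
    message = FRIENDLY.get(type(exc), f"Unexpected error: {exc}")
    if debug:
        trace = traceback.format_exception(type(exc), exc, exc.__traceback__)
        message += "\n" + "".join(trace)
    return message

def run(task, debug=False):
    """Run task, printing a friendly message (not a bare traceback) on
    failure. Returns a shell-style exit code."""
    try:
        task()
    except Exception as e:
        print(report(e, debug), file=sys.stderr)
        return 1
    return 0

# A failing "build" step produces the friendly message and exit code 1.
print(run(lambda: open("no-such-config.yml")))  # -> 1
```

The point is the separation: users see causes and next steps by default, while developers can opt in to the full trace.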

How I learnt to use Emacs' profiler

I learnt to use Emacs' profiler yesterday, after many hours of yak-shaving, trying to get Memacs working. Memacs is a memory extension system for Emacs written by Karl Voit, that I have been meaning to try out for a long time now. Seeing lots of review posts at the turn of the year and watching Karl's recent Emacs Chat with Sacha Chua pushed me to try and finally set it up.

I started writing a module to create a Memacs file – an org archive file – from my browser history. It was pretty easy to write, and I had it spitting out a huge file with 22k entries after about a couple of hours of work. Then I excitedly pulled up my agenda, and turned on the option to view archived entries, only to be super-disappointed. It turned out to be extremely slow! Actually, the agenda never came up with the 22k entries file that I had. At least not in 5 or so minutes, before I got impatient. The performance was unacceptable even when I reduced it to 5k entries.

I was pretty sure it wasn't that slow for Karl in his demo and tweeted to him, asking for a workaround. Meanwhile, I looked at his dot-emacs, but wasn't able to dig out what was needed to speed up things. He confirmed that his performance was way better than what I was getting.

First, I ruled out the possibility of it being because of the SSD, since clearly my CPU usage was peaking, and the task was CPU bound and not I/O. Next, I tried using the same file on a different machine (with a different version of Emacs and org-mode), and it worked blazingly fast. So, it was either the version of Emacs or org-mode that I was using.

I should have stopped, thought clearly, and started experimenting with org version, but hindsight is 20-20. I tried Ubuntu's pre-built Emacs and agendas were fast! I suspected my Emacs build, since I recently started building Emacs from git. I built two or three other versions of Emacs, and wasted a lot of time, before realizing that I wasn't using the org-mode source bundled inside Emacs for the tests, and there were two "independent" variables.

Finally, I began bisecting org-mode's source and found that all hell broke loose with an inconspicuous change around release 8.2.6. It turns out that org-overview was broken before this, and collapsing all the trees in a newly opened org-buffer (default option) wasn't working. Once this bug was fixed, opening huge org files would slow down by a great deal, in turn causing agenda generation to be unbearably slow.

All I had to do was add a #+STARTUP: showeverything to the top of the file. This sped things up by about 50 times! It turns out, I later found, that all of this is documented on Worg. I did try a few search engine queries, but sadly none of them brought this up. Adding the following to my config sped up agenda generation by about 150-200 times!

(setq org-agenda-inhibit-startup t) ;; ~50x speedup
(setq org-agenda-use-tag-inheritance nil) ;; 3-4x speedup

In the course of all this debugging, I learnt how to use Emacs' profiler. The profile reports along with git bisect, eventually helped me figure out what the problem was.

To profile the CPU usage, all you have to do is add a call like

(profiler-start 'cpu)  ;; or M-x profiler-start

at the point where you wish to start profiling. Emacs will then start collecting information about where time is being spent, by sampling every sampling-interval nanoseconds (the default is 10^6 nanoseconds, i.e. 1 millisecond).

You can view the information collected so far, at any point, using

(profiler-report) ;; or M-x profiler-report

The report is a nice, interactive tree with the percentage of time spent in each call. You can stop profiling by calling (profiler-stop). If you have more than one report, you can compare them by hitting = in one of the report buffers. I'm definitely going to use this for other things! (like speeding up my startup?)

Now that I have Memacs working with reasonably fast agenda views, I'm looking forward to collecting as much personal information as I can! Thanks Karl for writing Memacs. I am going to be a pretty heavy user, I think! There seem to be a few rough edges, though, and I hope to help smoothen them out a little bit, over the next few weeks.

Martin Fowler on Refactoring @ RubyRogues

I stumbled on a Ruby Rogues podcast yesterday, which had Martin Fowler as the guest. I really enjoyed the discussion on Refactoring (the noun, the verb and the book!)

Martin clarified in the podcast that refactoring (the verb/process) is a sequence of very small refactorings, performed while making sure the test suite keeps passing throughout. A refactoring (noun) is a change to the structure of the code without any externally observable change in behaviour, made with the intent of making the code easier to understand and cheaper to change in future.

I also really liked the metaphor of a 'healthy code base'. Refactoring is, then, the process of keeping healthy – exercise, speaking metaphorically. You can't just stack up all the exercise you need and do it later; you'll get really unfit in the meantime. Refactoring, similarly, needs to be done regularly to keep the code base healthy. This lets you go faster in the future.

I also really liked the advice about pushing the mental context you build, while debugging or trying to understand unclear code, back into the code itself, by refactoring it to make it clearer. Code needn't be one big chunk of cryptic text. Code is writing. It should be clear and understandable. Or, at least, we should strive to make it so!

The discussion, as always on this podcast, was very lively, pleasant and enjoyable! Enjoy!

Recurse Center, 2014-07-07

  • As preparation for a one-on-one this week with one of the facilitators, I was wondering if I was really getting better as a programmer, by doing what I am doing.
  • I have heard in numerous places that reading and reflection are key to getting better. I feel like I haven't been giving these things much attention in the past couple of weeks. I wasn't catching up on all the awesome reading material shared on Zulip, and I had switched from writing this blog post first thing in the morning to any time after lunch. I don't think that worked out very well. Writing the post worked as a way to reflect on what I had done yesterday, and what I should be doing today. So, I am back to writing the blog post first thing in the morning!
  • Yesterday, I worked on indexing the Python sources in a way that the inspection code can look up, later. During this process, I found that my code to use libclang's AST wasn't generic enough, and I had to clean it up to be able to extract useful information from any file in the cpython sources.
  • We also got to attend a super-awesome talk by Steve Klabnik! He talked about his progression from being an application developer, to writing libraries, to working on languages (as a professional developer). He made a lot of interesting and inspiring points during his talk. Some that stuck with me are:
    • None of these is particularly harder than the others. Depending on each person's personality, or the way their brain works, they are good at one or another.
    • Getting good at programming is more a matter of showing up than of "genes". He repeated quite a few times that he disliked the idea of the "baby hacker", and deliberately left out the story of his childhood and college programming days! I'm totally stealing his idea of meeting a bunch of friends every Saturday at 1pm and working until 10pm or so, when they could get cheap beer and food – and he did this all through college! It's interesting how similar this idea is to Hacker School!

    It was a very enjoyable and inspiring talk on the whole.

  • The plan for today is to actually have the parsed information dumped into some persistent format, and modify the inspect code to actually use it.
  • I will also be pairing with Kyle for a few hours on working through some of http://mitpress.mit.edu/books/audio-programming-book