Recurse Center, 2014-07-01

  • I happily let myself get distracted from all the math! I had seen a tweet by Tom during the weekend, asking if there was any package that lets users inspect the C sources of builtin modules, like Ruby's pry-doc does.
  • I spent the morning playing around with pry-doc, reading its code, etc.
  • The latter part of the day was spent pairing with Tom, and figuring out what needs to be done to write a first cut version of something like this. Ugly regexes, FTW! I have something that works for most of the stuff, it seems!
  • I also attended an interesting talk about BitTorrent clients by an alum.

Recurse Center, 2014-06-30

  • During the weekend, I read up a lot about HMMs and related stuff.
  • During the day, I tried to start writing code for different parts of the pipeline, and ended up reading about Mel frequency cepstrum related stuff, and wrote a little bit of code.
  • Later, in the afternoon I paired with some of his iOS code, and helped him clean up some of it. I'm not sure how useful it was for him to pair with me, but …
  • The day ended with a talk by Jesse at eBay, titled Write an Excellent Programming Blog. I had already read his blog post when the talk was announced, so most of the content was known, but the Q&A session had some interesting suggestions on how to improve your actual writing.

Recurse Center, 2014-06-27

  • This Friday we worked on building our own URL shorteners. It was a fun exercise, and I built one that is not persistent, in the two hours. I worked on a not so important problem of having sorted query parameters, instead of the more important problem of having a persistent server. Anyway, I should be building something that works, first, before trying to solve the more interesting problems. At least on Fridays.
  • I watched this old but interesting talk by Brandon Rhodes, titled The Mighty Dictionary, and played around with his _dictinfo module afterwards.
  • I spent the rest of the evening reading up about Markov chains and Hidden Markov Models. I'm happy to finally understand what these terms mean, after seeing them being thrown around for so many years!
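
The sorted-query-parameters idea from the URL shortener exercise above can be sketched in a few lines. This is a minimal, non-persistent sketch and not the code written that Friday: the names canonical and Shortener, and the base-62 key scheme, are made up for illustration.

```python
import string
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical(url):
    """Return `url` with its query parameters sorted, so that URLs that
    differ only in parameter order map to the same short key."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit(parts._replace(query=query))


class Shortener:
    """In-memory (non-persistent) shortener; everything is lost on restart."""

    ALPHABET = string.digits + string.ascii_letters  # base-62 digits

    def __init__(self):
        self._key_by_url = {}
        self._url_by_key = {}

    def shorten(self, url):
        url = canonical(url)
        if url not in self._key_by_url:
            # Hypothetical key scheme: a running counter written in base 62.
            key = self._encode(len(self._key_by_url))
            self._key_by_url[url] = key
            self._url_by_key[key] = url
        return self._key_by_url[url]

    def expand(self, key):
        return self._url_by_key[key]

    def _encode(self, n):
        digits = ""
        while True:
            n, rem = divmod(n, len(self.ALPHABET))
            digits = self.ALPHABET[rem] + digits
            if n == 0:
                return digits
```

Making it persistent, the more important problem, would mean replacing the two dictionaries with some kind of store; the canonicalization step would stay the same.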

Recurse Center, 2014-06-26

  • I played around a little bit with my emacs configuration and got jedi working, with virtualenvs, thanks to virtualenvwrapper.el
  • Continued to work on the signal processing stuff and built a DTMF decoder using FFTs. It's able to decode Audacity-generated audio, but not recorded audio. I'm not sure what I'm doing wrong, yet.
  • As with every Thursday, there were a bunch of interesting presentations.
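
Going back to the DTMF decoder mentioned above, here is a rough sketch of the idea. It is not the decoder from that day: to stay within the standard library it evaluates the DFT only at the eight DTMF frequencies instead of running a full FFT, and the function names, the 8000 Hz sample rate and the 50 ms tone length are all assumptions. It only handles one clean, synthetic tone at a time, so it says nothing about why recorded audio fails.

```python
import cmath
import math

# Standard DTMF keypad: each key is the sum of one low and one high tone.
LOW = (697, 770, 852, 941)       # row frequencies, in Hz
HIGH = (1209, 1336, 1477, 1633)  # column frequencies, in Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def tone(key, rate=8000, duration=0.05):
    """Synthesize the two-sine signal for `key` as a list of samples."""
    for r, row in enumerate(KEYS):
        c = row.find(key)
        if c != -1:
            low, high = LOW[r], HIGH[c]
            break
    else:
        raise ValueError("not a DTMF key: %r" % key)
    n = int(rate * duration)
    return [math.sin(2 * math.pi * low * t / rate)
            + math.sin(2 * math.pi * high * t / rate)
            for t in range(n)]


def magnitude(samples, freq, rate):
    """Magnitude of the DFT of `samples`, evaluated at the single `freq`."""
    return abs(sum(s * cmath.exp(-2j * math.pi * freq * t / rate)
                   for t, s in enumerate(samples)))


def decode(samples, rate=8000):
    """Pick the strongest row and column frequency and look up the key."""
    r = max(range(4), key=lambda i: magnitude(samples, LOW[i], rate))
    c = max(range(4), key=lambda i: magnitude(samples, HIGH[i], rate))
    return KEYS[r][c]
```

For example, decode(tone("5")) should give back "5". A real recording would also need to be split into individual key presses first, which this sketch does not attempt.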

Recurse Center, 2014-06-25

  • I like how some of the people here are taking check-ins so seriously. A bunch of people posted messages on Zulip, saying that they would not be able to make it to check-ins, and seemed quite sorry about it.
  • Check-ins are indeed useful. At the end of every day, I review my work wondering what I did, and sometimes end up pushing myself harder, to have something reasonable to talk about during the check-ins!
  • Yesterday, I read up about FFTs and ended up playing around with fft in scipy, with Audacity, etc. Nothing concrete, really.
  • There was a "goals" workshop by Stacey, where she talked about how to get the most out of this batch of Hacker School, and then we split into groups of 3, and presented our goal-statements to each other, and gave our impressions of each other's goals.
  • I didn't really have a goal, but the exercise seemed to help, anyway.
  • Also, I think it is making me nervous to treat Hacker School as super-important. It is important to make the best use of every hour, every minute here, but at the same time, I shouldn't be constantly worrying about whether or not, whatever I'm doing is the best use of my time here.
  • I think I would like to form some good habits, and make a few good friends, more than finish projects and build something uber-cool. Of course, it would be great to build awesome stuff (and may be the only reasonable way to form habits, and make friends) but I shouldn't worry too much about time running out. Never Graduate!

Recurse Center, 2014-06-24

  • I worked for the whole day on implementing an algorithm to analyze a ciphertext, and guess the substitution cipher used. The algorithm was pretty straightforward, and I had it "mostly" working, in a couple of hours.
  • I then began to refactor it, found what looked like a bug, and "fixed" it mindlessly. I had a pandas data frame D and its copy D_, and I was trying to swap 2 rows and columns of D_. The code was initially using the data from D to do the swap. To fix it, I checked whether tuple unpacking did the right thing, and it looked like it did. So, I used something like D_['a'], D_['b'] = D_['b'], D_['a']; essentially, I changed D on the right-hand side to D_. I thought I had tested this on the terminal, but after hours of debugging (along with fixing another minor issue), I found out that the tuple unpacking doesn't work, and the swapped rows and columns actually become equal! I had suspected this initially, and had "tested" it manually, I thought. (I was manually reading off values in the array, and probably messed up somewhere.) These are the kinds of things that I should have tests for. It wouldn't have taken me too long to write a test, and I could have been totally sure that it works!

    FWIW, the code now reads D_['a'], D_['b'] = D_['b'].copy(), D_['a'].copy()

  • The algo seems to need about 1000 characters to get past the 90% accuracy mark. I could probably tweak it a little to perform better, but I'm going to leave it here, for now, and move on to the signal processing parts. I'm not totally sure how the signal processing would work, and whether we could actually map the keystrokes back to text enciphered with a substitution cipher.
  • If required, the tweaks could be -
    • Use trigrams instead of bigrams
    • Add a degree of dictionary look-ups: Maybe something like, look up all the deciphered words, and try not swapping the characters that appear in most of the words that are in the dictionary.
  • This paper is 20 years old, and there would surely be work by others building on top of this, or doing it totally differently.
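
The row-and-column swap described above is exactly the kind of thing a tiny test pins down. Here is a minimal sketch, assuming a square data frame whose row and column labels match; swap_labels and the test are illustrative names, not the code from that day. Whether the uncopied tuple swap misbehaves may depend on the pandas version, so the sketch simply copies explicitly (as in the FWIW line) and checks the result.

```python
import pandas as pd


def swap_labels(frame, a, b):
    """Return a copy of `frame` with rows and columns `a` and `b` swapped."""
    out = frame.copy()
    # Copy both columns before assigning, so that neither side of the swap
    # can alias data that the other assignment overwrites.
    col_a, col_b = out[a].copy(), out[b].copy()
    out[a], out[b] = col_b, col_a
    # Same treatment for the two rows.
    row_a, row_b = out.loc[a].copy(), out.loc[b].copy()
    out.loc[a], out.loc[b] = row_b, row_a
    return out


def test_swap_labels():
    labels = ["a", "b", "c"]
    frame = pd.DataFrame([[0, 1, 2], [3, 4, 5], [6, 7, 8]],
                         index=labels, columns=labels)
    swapped = swap_labels(frame, "a", "b")
    # Entry (i, j) of the result is entry (p(i), p(j)) of the input,
    # where p exchanges "a" and "b".
    assert swapped.values.tolist() == [[4, 3, 5], [1, 0, 2], [7, 6, 8]]
    # The input frame is left untouched.
    assert frame.values.tolist() == [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    # Swapping twice gets back to the original values.
    assert swap_labels(swapped, "a", "b").values.tolist() == frame.values.tolist()
```

Using distinct values in every cell matters here: if the swapped rows or columns end up equal, the first assertion fails immediately instead of after hours of debugging.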

Recurse Center, 2014-06-23

  • The Apprenticeship patterns book is very interesting! I shall continue to read it and start creating daily/weekly rituals from the actions prescribed in the book!
  • I was reading this post on git gc, reflog, etc. and wanted to see what the default values were for some of the configurable variables. But, the git config command didn't help. Developer/code defaults are also a part of the "config" system, and any tools to inspect config variables should display them, if no other values have been set explicitly!
  • "If experience is built upon failure as much as success, then you need a more or less private space where you can seek out failure." – Apprenticeship Patterns.
  • Our check-in groups changed, but 3 of us from our last check-in group are together again!
  • Most of the day was spent with Git. In the morning, I learned that setting push.default to nothing will prevent pushing any branches without specifying explicitly which branch I want to push.
  • Also, later in the day, Alan gave a presentation mapping the internals of git to the commands, trying to give everyone a better mental model of Git.
  • I spent some time trying to get jedi integration for emacs. It should've worked out of the box, but I'm not sure the ELPA repositories have proper dependency/version management. Seems to be the problem, everywhere! Dependencies of a package get updated, and the package starts acting up…
  • Chaitu sent me a link about solving mazes using Finite Automata, and I spent some time trying to write a simple solver, and compare it with Nava's breadth-first search. But, it looks like we broke her code during the refactor. I may get back to this, at some point.
  • I also spent a few minutes talking to Kyle, while he was trying to implement a simple version of the NTP protocol. The algorithm, it turns out, is pretty simple.
  • I started reading a little bit on cryptanalysis for Simple Substitution ciphers, and I'll try implementing one of the algorithms today.
  • The day ended with yummy pizza and video watching! We watched –

Recurse Center, 2014-06-21

  • I spent most of Saturday, thinking about what I should be doing for the next few weeks, and what I really want to get out of Hacker School.
  • I had a good discussion with Naren about things that he works on, and the kind of things that he sees as hard, what kind of things interest me, and what I want to do post Hacker School. I always knew I didn't really have an area of expertise or an area that I'm super interested in, where I want to work. This discussion was only a verbalization of those thoughts.
  • I am not sure I am very happy with the way I have been learning Haskell. The UPenn course has been reasonably good with the exercises provided, and I have been learning through "pseudo" projects. But, I'm wondering if I should pick up a more concrete project.
  • I played around a little bit with Elm and started something to try and write Layout using Elm.
  • I tried cleaning up the phonebook problem from Friday. It works, but can definitely be improved with some help from a more experienced Haskeller.

Recurse Center, 2014-06-22

  • I finished parts of the phonebook task today. The code looks ugly, and very un-haskelly. I asked for feedback on it, but I guess it's just too ugly for anyone to take the time out to comment on it, and help me improve it.
  • I also tried to read up about Monads, and worked through exercises from a few different places. I haven't yet finished the exercises in Chapter 12 of the UPenn course. I just looked at them today, and they seem like a complete bouncer at first glance.
  • I went on a really long walk with Naren, and food-hopped with him while he was telling me stories about the various places he worked at and the kinds of things he has been working on, and things he liked and disliked about them. It has been a good weekend to spend with Naren!
  • Also, the long walk and talking to Chaitu cleared my head a bit, and I decided to pause the Haskell tutorials and learning for the next couple of weeks. I don't really have a project that needs me to learn Haskell, right now. I should stick to project and need based learning, which I enjoy the most.
  • Right now, I feel like working on a key-logger based on the acoustic emanations of keystrokes. Just before applying to Hacker School, I came across the idea of side-channel attacks, and it sounded pretty exciting. I don't intend to snoop on anybody, but the project seems to be challenging enough, and I will have quite a bit to learn about. It will involve signal processing, and learning: both areas that have seemed interesting to me for long, but I haven't really done anything in them. I plan to work on the idea for a couple of weeks, and review my progress and chart my future course. The problem appears to have 2 parts to it.
    • Processing audio, and extracting key-strokes from it.
    • Identifying the text, based on a sequence of key-strokes. Though, in reality each press of a key may not sound the same, for a start assuming we are able to accurately get a sequence of key-strokes, the problem would reduce to solving a simple substitution cipher. I plan to start with this, tomorrow.
  • Also, I plan to hang out a lot more with Hacker Schoolers, and let myself be distracted a lot more by random discussions and activities, than I have allowed myself to be, in the past few weeks. I should be writing as much code as I can, while I'm here. But, I should also be rubbing elbows with the awesome folks, which I will only be able to do virtually, once I'm out of here.
  • Thanks to Amber, I also came across this interesting looking book called Apprenticeship patterns. I plan to read it over the next week or two.
  • I have played almost no Ultimate here. I at least need to work out a conditioning and throwing practice schedule and start working on it, early next week. Tomorrow won't work since it's 5am and I'm still awake!
  • I also probably need to get enough sleep!
  • Looking forward to the next couple of weeks!

OAuth2 demystified

Motivation

I was trying to pair on writing a simple app that uses Hacker School's OAuth2 API, and hit a roadblock on the first step of requesting an authorization from the user. Once the user authorized my app, I would see an error that said, "The authorization server does not support this response type". I was using a client library that I had used before, and the server was using what seemed like a popular implementation for Ruby on Rails. Getting weird errors is not done!

I have used OAuth2 based authentication before, but the thought of using it always makes me a little nervous, just because

  1. I don't understand it very well.
  2. Like almost everything else, there seem to be so many libraries for doing this in Python, and I'm never sure which one to use, or which one I used the last time around. Not understanding the protocol also doesn't let me debug anything that comes up.

To fix this, I set out to read and understand the OAuth2 protocol. This blog post is an attempt to record what I learned for my own future reference, and possibly to act as a reference for others.

Why OAuth

OAuth is simply a way for an end-user to allow third parties to use protected data, without sharing the user's credentials with the third-party.

For example, an end-user (Jane) can grant a printing service (Printo) access to her protected photos stored at a photo-sharing service (Picasa), without sharing her username and password with the printing service. Instead, she authenticates directly with a server trusted by the photo-sharing service, which issues the printing service delegation-specific credentials. (example from the OAuth 2.0 spec)

Protocol Flow

The flow occurs through a sequence of user actions, client requests and user-agent (browser) redirects.

  • (A) Printo asks Jane to allow it to use her Picasa data. The request can be sent directly to Jane, but is usually routed via Picasa/Google.
  • (B) Printo gets back an authorization grant, which is a credential representing Jane's authorization or approval. The type of the actual grant credential depends on the type of request that Printo used.
  • (C, D) Printo goes back to Google with the credentials it obtained in the previous step and obtains a token that it can use to talk with Picasa.
  • (E, F) Printo asks for the desired photo with the token it obtained previously, and Picasa gives back the photo to print. Jane gets her framed photo!

But before any of this happens, the client needs to register with the authorization server and obtain a client_id and client_secret, that will be used to identify the client making the requests.

Pythonized "authorization code" work-flow.

The OAuth2 spec allows the authorization request/grant to be of 4 different types. It also allows some flexibility in the token type.

In my experience, the most common work-flow seems to be using an authorization code as an authorization grant, and using a Bearer type token. This work-flow is explained in the diagram below (taken from the spec document). This diagram zooms in, onto the steps A-D in the diagram above.

This python code snippet is a simple implementation of this workflow, using the Hacker School API.
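
As a rough, library-free sketch of what the client sends at each step of the authorization code work-flow, the functions below only build the requests and parse the redirect; they make no network calls. The endpoint URLs, credentials and function names are placeholders invented for illustration, not the Hacker School API's actual values. The parameter names (response_type, client_id, redirect_uri, state, grant_type, code) are the ones defined in the OAuth 2.0 spec (RFC 6749).

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoints: substitute the ones your authorization server documents.
AUTHORIZE_URL = "https://auth.example.com/oauth/authorize"
TOKEN_URL = "https://auth.example.com/oauth/token"


def new_state():
    """Random value used to tie the redirect back to the request we made."""
    return secrets.token_urlsafe(16)


def authorization_url(client_id, redirect_uri, state):
    """(A) URL to send the user's browser to, asking for an authorization code."""
    params = {
        "response_type": "code",  # the "authorization code" grant type
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def code_from_redirect(redirect_url, expected_state):
    """(B) Pull the authorization code out of the URL the user is redirected to."""
    query = parse_qs(urlsplit(redirect_url).query)
    if query.get("state") != [expected_state]:
        raise ValueError("state mismatch")
    if "error" in query:
        raise ValueError(query["error"][0])
    return query["code"][0]


def token_request_body(code, client_id, client_secret, redirect_uri):
    """(C) Form-encoded body to POST to TOKEN_URL in exchange for a token.

    Sending the client credentials in the body is one option the spec allows;
    servers may instead expect them via HTTP Basic authentication.
    """
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    })


def bearer_header(access_token):
    """Header to attach to API requests once (D) returns an access token."""
    return {"Authorization": "Bearer " + access_token}
```

In this sketch, the response_type=code parameter in the first step is where an error like "The authorization server does not support this response type" would originate, so it is a good first thing to check when that message shows up.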

Conclusion

I think I understand the OAuth2 spec a lot better now, and hope that this will help others understand it, too. And more importantly, I won't get nervous when I have to add it to my projects.

Also, oauthlib for Python seems to be a pretty thorough implementation of the spec, and requests-oauthlib seems to wrap it for use with requests. I think I'm going to use this in my future projects.