saush

Third party user authentication with Ruby in a just few lines of code

Posted in DataMapper, Ruby, Sinatra by sausheong on April 25, 2009

Some time ago (8 years to be exact) when I was still chest-deep in Java and working in elipva, I wrote Tapestry, a centralized user management system and also an article in Javaworld describing it. That particular project is long gone (it got merged into elipva’s Zephyr product and disappeared as a standalone system) but the problems I wrote about are really quite universal.

To summarize, the main problem that Tapestry tried to solve was that software applications that serve more than 1 user at a time usually require its users to sign in. To do that, most of the time, we create a user management module. Within this user management module we need to provide these 2 major services:

As you can imagine, authentication is usually simpler than access control. Tapestry’s approach was to externalize these services into a configurable standalone system that provides both services. While I never really completed Tapestry, there are many third-party providers of these services in the market now. In this blog post I will talk about authentication and explain some ways to easily integrate your Ruby application with these third-party authentication providers.

But before that, some background. In the open era, 2 different open technologies have surfaced to solve the the issues of authentication and access control separately. OpenID is a popular digital identity used for authentication, originally developed for LiveJournal in 2005. It’s supported by many large companies including Yahoo, Google, PayPal, IBM, Microsoft and so on. As of Nov 2008, there are over 500 million OpenIDs on the Internet and about 27,000 sites have integrated it. OAuth is a newer protocol used for access control, first drafted in 2007. It allows a user to grant access to information on one site (provider) to another site (consumer) without sharing his identity. OAuth is strongly supported by Google and Yahoo, and is used also in Twitter.

At the same time, many larger organizations started to tout their proprietary own authentication and access control mechanism. Microsoft started its Passport service in 1999 and received much criticism along the way (later it changed its name to Windows Live ID). Soon after, in September 2001, the Liberty Alliance was started by Sun Microsystems, Oracle, Novell and so on (which many considered a counter to Passport) to do federated identity management. As mentioned, Google is one of the initial and strong supporter of OAuth but it also offers an older AuthSub web authentication and ClientLogin client authentication APIs while being an OpenID provider (it calls its OpenID service Federated Login). Yahoo started supporting OAuth officially in 2008, in addition to its older BBAuth APIs and is also an OpenID provider. Facebook joined the race in 2008 with Facebook Connect, which does authentication as well as share Facebook information. MySpace also came in with their platform in 2008, originally called Data Availability, which quickly became MySpaceID. In April 2009, Twitter started offering ‘Sign In with Twitter‘, which true to its name, allow users to sign in with their Twitter accounts. In addition to that Twitter uses HTTP basic authentication to protect its resources, and also recently announced support for OAuth.

As you can see there’re really lots of players in the market. RPXNow, which I blogged about previously, took advantage of these offerings and came up with a unified and simplified set of APIs to allow authentication with Yahoo, Google, Windows Live ID, Facebook, AOL, MySpace and various OpenID providers.

Looking from another perspective, there are 2 types of third party authentication APIs:

  • Web authentication for web applications
  • Client authentication for desktop or mobile client applications

The main difference between these 2 types of authentication is that web authentication can redirect the user to provider’s site for him to enter his username and password, then returns to the calling web application, usually with a success or failure token. Client authentication (usually) cannot do this, and would require the application to capture the username and password, which is sent by the client application to the provider. Obviously from a security point of view, web authentication is more secure and re-assuring to users since the user never needs to pass the username and password to the application at all.

Let’s jump into some code now. I will give 2 examples of both web authentication and client authentication each. You can find all the code here at git://github.com/sausheong/auth.gi

Let’s start with RPX because that’s the easiest. Using RPX is probably the easiest way to do third party web authentication with a large number of different providers. You can either use its provided interfaces (either embedded or pop-up Javascript) or you can call its simple API to authenticate your web application. Here’s a simple Sinatra example.

There are just 2 files. rpx_auth.html is in the public folder:

<form method="POST" action="https://chirp-saush.rpxnow.com/openid/start?token_url=http://localhost:4567/login">
	<input type="hidden" name="openid_identifier" value="yahoo.com" />
	<input type="image" src='https://a248.e.akamai.net/sec.yimg.com/i/yahoo.gif' />
</form>

This is the input form. Notice that it is entirely in HTML (it’s just a simple form submit with one post parameter). The actual work is done by auth.rb in the root folder:

%w(rubygems sinatra rest_client json).each  { |lib| require lib}

get '/login' do
  'You are logged in!' if authenticate(params[:token])
end

def authenticate(token)
  response = JSON.parse(RestClient.post 'https://rpxnow.com/api/v2/auth_info', :token => token, 'apiKey' => <rpx API KEY>, :format => 'json', :extended => 'true')
  return true if response['stat'] == 'ok'
  return false
end

You will need to register an application in RPXNow. In this example, I’m reusing Chirp-Chirp’s application in RPXNow, which is why you see the post action  is to the server chirp-saush.rpxnow.com. The token is the URL that RPX will redirect to after it authenticates the user. The pleasant surprise here is that it allows you to redirect to localhost, which eases the development effort. In this example also, I’m using Yahoo as the OpenID provider. RPXNow provides a host of other popular providers — in Chirp Chirp I actually used Yahoo, Google and Windows Live ID though it provides a lot more others. I used an image button to submit but that’s to prettify the page.

Notice that RPX redirects to the localhost with the URL ‘/login’. I created a get method for this, which in turn passes the returned token to the authenticate method. Inside this method, I used Adam Wiggin’s (of Heroku fame) really convenient rest-client library to send a post request with the token and the API key provided when you register for an application. I indicated that I want to receive the response in JSON. The response is parsed for the status and can also be parsed for data. For simplicity I didn’t return the Yahoo user information (but it’s a really small subset of data only) though I could potentially parse the JSON data for it.

And that’s it! You now have web application authentication by a third party provider, courtesy of RPX.

What if you don’t want to have a middleman like RPX to do web authentication? In this example, I was using RPX Basic, which is free because I’m using an rpxnow.com domain. If I want my own domain for authentication, I need to pay for commercial account with RPX (which is really their business model). An alternative is to use OpenID directly instead, which is very popular and doesn’t add too much complexity.

This is the input form for OpenID authentication, openid_auth.html:

<form method="get" action='/login'>
  OpenID identifier:
  <input type="text" class="openid" name="openid_identifier" size='50'/>
  <input type="submit" value="Verify" /><br />
</form>

The input requires the user to enter an OpenID identifier. If you’re not sure what an OpenID identifier is like you can check it out here. Most likely you’d already have one. Just like in the RPX case, the user is redirected to the OpenID provider, where he is requested to log in to that provider. After logging in, he will be returned to the calling application.

This is the Sinatra app that does this work, in a file called openid_auth.rb:

%w(rubygems sinatra openid openid/store/filesystem).each  { |lib| require lib}

REALM = 'http://localhost:4567'
RETURN_TO = "#{REALM}/complete"

get '/login' do
  checkid_request = openid_consumer.begin(params[:openid_identifier])
  redirect checkid_request.redirect_url(REALM, RETURN_TO) if checkid_request.send_redirect?(REALM, RETURN_TO)
  checkid_request.html_markup(REALM, RETURN_TO)
end

get '/complete' do
  response = openid_consumer.complete(params, RETURN_TO)
  return 'You are logged in!' if response.status == OpenID::Consumer::SUCCESS
  'Could not log on with your OpenID'
end

def openid_consumer
  @consumer = OpenID::Consumer.new(session, OpenID::Store::Filesystem.new('auth/store')) if @consumer.nil?
  return @consumer
end

Note that it is not that much longer than the RPX code. I used the ruby-openid gem from JanRain, which is probably the defacto OpenID Ruby library. The steps are only a bit more complicated:

  1. The input form asks the user to enter an OpenID identifier and sends it to the ‘/login’ get block (it could easily be a post block)
  2. The ‘/login’ block first gets a consumer object. To create the consumer object, you need to pass in the session where you will store the request information (this is usually provided by the web framework) as well as a store. The store is the place where we keep associations, the shared secrets between the relying party (your application) and an OpenID provider as well as nonces, cryptographic one-time alphanumeric strings used to prevent replay attacks. In the example above, I created the store in the file system.
  3. Once we have the consumer, we initiate the start of the authentication process by calling its begin method with the identifier, which returns a CheckIDRequest object. This object contains the state necessary to generate the actual OpenID request
  4. Next we ask the CheckIDRequest object if we should send the HTTP request as a redirect or if we should produce a html post form for the user
  5. Then we either redirect the user to the OpenID provider or create a HTML form that does the same thing. Here we provide a return_to path for the OpenID provider to return the user to our application with a bunch of other data
  6. Once the user authenticates himself, he will be sent back to our application, and I used another get block ‘/complete’ to process the returned data
  7. We don’t need to go through the returned parameters one by one, we can re-use the same consumer object we created earlier and use the complete method to process those parameters
  8. The complete method returns a response object, which we can check for status

The process actually looks more complicated than it really is. The ruby-openid library simplifies the processing a great deal. In the example above, I only wanted to check if the user is valid or not, if I wanted to get more data on the user, there are a bunch of other things we can do.

We’ve just covered the web authentication. Let’s move on to client authentication, which doesn’t involve a browser. Google provides ClientLogin, which is an authentication API specifically desktop or mobile clients. Using it is actually simpler than web authentication. For this example and the next, I will use Shoes, the excellent Ruby GUI toolkit by why the lucky stiff. To run the code, you need to download and install Shoes, then follow the instructions on how to execute Shoes applications. I won’t go through how to create a Shoes application here though, that’s for another day. Here’s the complete code for using Google’s ClientLogin API, in a file called goog-auth-client.rb:

Shoes.setup do
   gem 'rest-client'
   gem 'net-ssh'

end
%w(rest_client openssl.so openssl/bn openssl/cipher openssl/digest openssl/ssl openssl/x509 net/ssh).each  { |lib| require lib}

class GoogleAuthClient < Shoes
  url '/', :login
  url '/list', :list

  def login
    stack do
      title "Please login", :margin => 4
      para "Log in using your Google credentials!"
      flow :margin_left => 4 do para "Email"; @me = edit_line; end
      flow :margin_left => 4 do para "Password"; @password = edit_line :secret => true; end
      button "login" do $me = @me.text; authenticate(@me.text,@password.text) ? visit('/list') : alert("Wrong email or password"); end
    end
  end

  def list
    title "You are logged in!"
  end

  def authenticate(email,password)
    response = RestClient.post 'https://www.google.com/accounts/ClientLogin', 'accountType' => 'HOSTED_OR_GOOGLE', 'Email' => email, 'Passwd' => password, :service => 'xapi', :source => 'Goog-Auth-1.0'
    return true if response.code == 200
    return false
  end
end
Shoes.app :width => 400, :height => 200, :title => 'Google Authenticator'

You might notice that I actually included a large number of libraries. This is because there is a problem in the current Shoes build, which didn’t include the OpenSSL libraries so we’ll need to add them in manually. The method to focus on is really the authenticate method. I used the rest_client library again to send a post request to the ClientLogin URL. the parameter ‘service’ indicates which Google service you want to access. In our case, we just want to authenticate a Google user, so we can use the generic service name ‘xapi’. The ‘source’ parameter is a short string that identifies our application, this is for logging purposes only.

An alternative to ClientLogin and just as easy is Twitter’s basic authentication service. The code is also exactly the same. Here’s the code, in a file named twit-auth_client.rb.

class AuthClient < Shoes
  url '/', :login
  url '/list', :list

  def login
    stack do
      title "Please login", :margin => 4
      para "Log in using your Twitter credentials!"
      flow :margin_left => 4 do para "Username"; @me = edit_line; end
      flow :margin_left => 4 do para "Password"; @password = edit_line :secret => true; end
      button "login" do authenticate(@me.text,@password.text) ? visit('/list') : alert("Wrong user or password"); end
    end
  end

  def list
    title "You are logged in!"
  end

  def authenticate(user,password)
    open("http://twitter.com/account/verify_credentials.xml", :http_basic_authentication=>[user, password]) rescue return false
    return true
  end
end
Shoes.app :width => 400, :height => 200, :title => 'Blab!'

Let’s jump into the authenticate method again. I used OpenURI to send a get request to the verify credentials API. Twitter uses basic authentication to protect its resources, so we need to pass in the user and password information as well (notice that OpenURI allows you to use basic authentication). I could parse the returned results, but what happens here is that if the user and password information is invalid, it will throw an error anyway, so I just do a rescue and return false. If all is well, I can just return true.

You might notice that the code that allows me to use both Google and Twitter as a the third-party authentication provider is just a single line! Don’t you just love Ruby?

Clone TinyURL in 40 lines of Ruby code

Posted in DataMapper, Ruby, Sinatra by sausheong on April 13, 2009

I’m officially hooked. After writing 2 blog posts on cloning popular web applications on the Internet, I was raring to take on another one. TinyURL looked like the easiest so that’s the one I did. In fact it’s so easy there’s at least 100+ such applications in the market already. I called my TinyURL clone, Snip (http://snip.heroku.com).

I wrote Snip with Sinatra then deployed it up to Heroku so this is also a good excuse also to describe Heroku, a truly amazing service for the Ruby programming community. The total number of lines in Snip is actually 43, in a single file named snip.rb. including the view template and layout. To check it out go to git://github.com/sausheong/snip.git.

A quick word about TinyURL and its army of clones. TinyURL is the first of its kind (started in January 2002) providing a very simple but useful service of replacing a full URL with a much shorter one. Going to the shortened URL will redirect the user to the actual full URL. Its usage really exploded with the rising popularity of Twitter, which required its users to send messages in only 140 or less words, really making URL shortening a necessity. In March 2009, the service with second largest market share (13%) bit.ly, raised $2 million as funding and TechCrunch had a field day, even estimated that TinyURL with 75% of market share worth up to $46 million! Unfortunately till date, TinyURL’s actual business model and the question of it and its ilk can make money is still unanswered.

Source : TechCrunch

Source : TechCrunch

Enough on the business and money. Let’s look at the code and start with the first line. Most of the time we write require files in multiple lines, but actually we can use an array and iteratively require each of the library. I only used Sinatra and DataMapper libraries here though Sinatra itself includes HAML, which is the template markup language I used in Snip.

I will not go in depth on using Sinatra. If you’re interested, please go to the Sinatra documentation or my previous post. I will assume you roughly know how Sinatra works and jump right in (anyway there is so little code you’d probably understand it easily).

I have only 2 get blocks. The ‘/’ get block primarily just shows the main page, while the ‘/:snipped’ get block takes in the snipped code fetches the original URL, then redirects the user to it. I also have a post block. The ‘/’ post block (notice that using get will not reach the post block, which neatly demonstrates the pure simplicity of Sinatra) first makes sure that the URL is valid (by running it through the URI parse method). Then it either creates a new URL or fetches the existing one, if it was shortened beforehand.

A note on alphanumeric code used to represent a URL. I store each URL in the database as a row and the code is really just the row ID in the database table. However I cannot use the row ID directly because as the number of URLs grows, the number of characters representing the code grows quickly as well. For example, as I hit 1 million URLs, I would have 7 characters (all numbers) in the code. This is not so efficient. To reduce the number of characters used for representing the row ID, I use a base 36 numbering system (a-z, 0-9). This means 1 million records in the database would only require 4 characters (1,000,000 base 10 is ‘lfls’ base 36). And the 200 million records that TinyURL claims to store, will use only 6 characters instead of 9. I have no idea how TinyURL actually generates their code, but looking into their current number of characters in the code (6) I would say this is not a far off guess. (Reverse engineering their 6 character code this way though, results in showing that TinyURL have > 700 million records).

Doing base 36 conversion seems daunting but in reality most programming languages would have some sort of support for non-decimal numbering system conversion. Ruby’s implementation is particularly simple. A Ruby Fixnum (i.e. all whole numbers) has a method called to_s. This method is probably familiar with all Ruby programmers as it converts any object to a String representation. What is likely less well-known is that Fixnum’s implementation takes in a parameter, which is the radix for the base numbering system used. For example:

>> 1234.to_s(2)
=> "10011010010"
>> 1234.to_s(36)
=> "ya"

Conversely, String’s to_i implementation does the reverse, which is to take a String and convert it into a Fixnum representation of that String, given the radix:

>> "hello world".to_i(36)
=> 29234652

For templating, I used HAML, the delightful albeit more programmer-centric templating system. Most templating systems strike a compromise between HTML and a programming language. This is evident in the popular systems like JSP, ASP, PHP and even ERB (ERB is the templating system used by default in Rails). However HAML abandons HTML altogether and goes for a purely programming approach, very much like Seaside (Seaside is a Smalltalk web application framework). Skipping comparisons on the difference approaches in templating, one advantage HAML has is that it allows programmers to code the interface in a more natural way, which just suits me fine in this case.

I used a Sinatra trick that allows me to embed the template within the same source code itself. While normally I would need to create 2 additional template files (1 for the layout and another for the index), I added in the code for the templates in the same source file but after the __END__ keyword.

To prettify the interface, I used one of the ready-made W3C core stylesheets, which provide me with a standard and well defined set of styles without going through the headache of creating one myself.

And we’re done! A full URL shortening service in a 40 lines of code file. To start it up just do this:

$ ruby snip.rb

Then go to http://localhost:4567.

The next step is to deploy it to Heroku. I can’t say enough about this service, which is heaven-sent for the Ruby web application programming community. Just register an account here at http://heroku.com. In fact the main page more or less explains the steps you need to do to deploy the app! However, there are a couple more steps for Sinatra. Here is the complete list of steps:

1. Create a config.ru file

This is the Rack configuration file, which is actually just another Ruby script. All you need to have in this file is this:

require 'sinatra'
require 'snip'
run Sinatra.application

This tells Rack to include the Sinatra and Snip libraries, then run the Sinatra application.

2. Install the Heroku gem

$ sudo gem install heroku

Heroku provides us with a set of useful tools packaged in a gem, very much like Capistrano.

3. Initialize an empty Git repository in the snip folder

$ cd snip
snip $ git init
Initialized empty Git repository in .git/
snip $ git add .
snip $ git commit -m 'initial import'
Created initial commit 5581d23: initial import
2 files changed, 52 insertions(+), 0 deletions(-)
create mode 100644 config.ru
create mode 100644 snip.rb

This just creates and initializes an empty git repository on your computer.

4. Create the Heroku application

snip $ heroku create snip
Created http://snip.heroku.com/ | git@heroku.com:snip.git
Git remote heroku added

You will be prompted for your username and password the first time you run a heroku command. Subsequently this will be saved in ~/.heroku/credentials and you won’t be prompted. It will also upload your public key to allow you to push and pull code.

5. Push your code to Heroku

snip $ git push heroku master
Counting objects: 4, done.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 999 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
-----> Heroku receiving push
-----> Rack app detected
Compiled slug size is 004K
-----> Launching....... done
App deployed to Heroku
To git@heroku.com:snip.git
* [new branch]      master -> master

Notice that this pushes your code and loads your application into deployment.

6. Log in to the Heroku console and create the database

snip $ heroku console
Ruby console for snip.heroku.com
>> DataMapper.auto_migrate!
=> [Url]

Heroku allows you access to a console similar to irb but with the environment of your deployment loaded up, like script/console in Ruby on Rails. To create the database, I just run DataMapper.auto_migrate! and it will create the database accordingly.

This is it! Now go to your application on Heroku and you should be able to see this:

One of the main sticking points in doing Ruby web application development is finding a place to host the application. Most people either do web host sharing or slice hosting but it is either underpowered or costly (or both) and requires mucking around with servers (which many application programmers like myself do so only reluctantly). Heroku is an amazing service that heralds a new way deploying Ruby web applications that saves the day for many Ruby programmers.

A final word on the URL shortening service. For a more full-fledged service, I would probably register a much shorter domain name (something like what is.gd has) and also add more features that provide statistics to each redirected URL and so on.

Why it is difficult to accept the realities of a downturn

Posted in general by sausheong on April 7, 2009

I bought a 17″ Sony Flatron monitor for $535 in 1999, one of my prized possessions that accompanied me during the few years which I hopped from rented room to rented room. It weighed almost 20kg and was a tremendous monster to lug around, maybe 20% of the total weight of my other stuff put together. I retired it (still working) when I got my Samsung 19″ LCD monitor ($220) about 2 years ago and my wife persuaded me to sell it off. However, I baulked when I was offered $50 by the karung guni man, after all, it was still working. So I refused to sell it.

The next time I mentioned it to a karung guni man, he wouldn’t even want to take it and wanted to charge me $5 to take it off my hands. Naturally I didn’t pay.

All these brought me to the couple of conversations I had with Winston and with Simon. Conversations that somehow drifted to talking about our lifestyles in the current state of the economy. Perhaps this was on my mind during our lunchtime chatter, since we talked about completely different things to begin with. Winston talked about how he struggled through his startup days to achieve where he is today. Simon and I talked about how people upgraded their lifestyles over the few good years before the current crisis, and having done so found it extremely hard to accept that they need to lower their expectations now. Stories of people who lost their jobs and cannot accept that their new jobs now pay 20% – 25% lower than their previous one. People who just bought their 42″ LCD TVs, high-end electronic gadgets, cars and even apartments, depended entirely on their salaries to pay for the installments and lost their jobs. People with shiny promising lives ahead of them during the prosperous times but now facing bleak, burgeoning loans and unable to come to terms with the new realities of a recession.

Which led me to think — what makes it so difficult for us to accept the realities of an economic downturn?

My guess is a combination of these three hypotheses:

Endowment effect
This simply means that people place a higher value on objects they own than objects that they do not. An experiment conducted in Duke University by Ziv Carmon and Dan Ariely showed how surprisingly irrational yet humanly plausible this is.

Duke University has a small basketball stadium which doesn’t have enough seats for all its fans so it designed an elaborate lottery system to allocate tickets instead. The experiment involved fans who has won tickets to an NCAA Final Four basketball tournament and fans who had not, 93 respondents in all. Each respondent was asked, one day prior to the match, to indicate their selling or buying price for a ticket to the match. The buying price question asked for the highest price he or she would pay for the ticket and the selling price question asked for the lowest price he or she would agree to sell. In addition, the would-be buyers would be asked to think of other items or experiences that would be equivalent in value to the tickets. The test was to find out how much the buyer value the experience as compared to an equivalent experience and how much the seller value losing that experience. In other words, it was an experiment to determine the existence of the endowment effect.

When the experiment ended, the 10% trimmed mean selling price was about $2,400 while the 10% trimmed mean buying price was about $170. That’s a difference by a factor of 14! Rationally speaking, the would-be buyer and the would-be seller should think of the experience in the same way since they were randomly picked fans. The only difference was that some of the respondents won a lottery for the ticket and others didn’t. It’s pretty hard to explain the differences between the selling price and the buying price but at the same time it is strangely familiar and human.

This is why it’s particularly bitter for us in this downturn — we have already owned our new lifestyle, and it has gained particularly high value. If we have not owned that brand new Honda Civic it wouldn’t have mattered so much, but now that we have it, losing it is much more painful, even though we had lived happily in the past without it.

Loss aversion
This again should be familiar to most of us. It refers to the tendency for people to prefer avoiding losses than to acquiring gains of similar value. For example, it is more painful to lose $1000 than it is pleasurable to gain $1000. Given a choice people would prefer the chance not to lose $1000 than the chance that they might gain $1000. In a 1981 paper by Amos Tversky and Daniel Kahneman, researchers described an experiment conducted at the University of Stanford and the University of British Columbia where 2 groups of students (152 and 155 respectively) were asked to answer this brief questionnaire in a classroom setting:

Imagine that the US is preparing for the outbreak of an unusual Asian disease, which is expected to kill 60 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimate of the consequences of the programs are as follows:

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is a 1/3 probability that 600 people will be saved and 2/3 probability that no on will be saved.

Which of the two programs would you favor?

As it turns out 72% of the respondents chose Program A and 28% chose Program B. However, when the questions were phrased like this:

If Program C is adopted 400 people will die.

If Program D is adopted, there is a 1/3 probability that nobody will die and 2/3 probability that 600 people will die.

Which of the two programs would you favor?

the reverse happens i.e. 78% of the respondents chose Program D and 22% chose Program C! As you can see Program A and C are essentially the same, as with Program B and D. The difference was how the questions were worded. Whereas Program A describes people being saved (a gain) while Program C describes people dying (a loss). Intuitively it seems right, but it is still pretty startling that mere words could affect us this way.

Cognitive dissonance
Cognitive dissonance is one of the most influential and extensively studied theories in social psychology. It is the uncomfortable feeling when we have two contradictory ideas at the same time. The theory is that in such situations we would be driven to change or justify our attitude or behavior. For example, if you believe that you are a careful driver and an accident happens, you would normally either blame it on the other driver or some external factor that doesn’t contradict with your self-belief that you’re a careful driver.

In this economic downturn, the self-belief that we are worth this much (in terms of salary or other compensation) contradicts with the reality that we have been offered a job that is paying less. The cognitive dissonance then drives us to angrily reject the offer as being unreasonable, the employer is trying to take advantage of the downturn or some other justification that doesn’t take away that self-belief.

So?
So what can we learn from this? Loss aversion tells us that looking at it in a different perspective makes things less painful for us. It’s not losing a job, it’s galvanizing us to try out the other stuff we’ve always been too busy with. What we learn from the endowment effect is that losing our stuff is painful. That will not go away. However we can make things less painful not only by spending only we can afford, but also spending a few levels less than what we can afford in order to buffer for this pain. I can afford that Honda Civic if I have my job and current salary, but shouldn’t I buy a cheaper car such that I can afford it even if I lose 50% of my salary? Finally, cognitive dissonance is part of human nature but what we can’t change, we should adapt. Recognizing that our self-worth is not measured in terms of our salaries but other things our families and friends, that salaries and jobs, like most things in the world goes through cycles should give us better sanity in dealing with a rapidly changing economic crisis.

Having knowledge of what makes is hurt so much in this downturn doesn’t really help us get back our jobs or salaries. But knowing that it is human to feel this way, and that everyone is feeling the same way probably helps us face it better.

I still have that 17″ Sony Flatron monitor lying somewhere. One of these days I will need to overcome the endowment effect and dump it.

Write a Sinatra-based Twitter clone in 200 lines of Ruby code

Posted in DataMapper, general, Ruby, Sinatra, Twitter by sausheong on April 2, 2009

After trying out Sinatra as the interface to my search engine, I got hooked to it. I liked the no-frills approach to developing a web application. I liked it so much that I decided to write a fuller web app using Sinatra. Of course I had no idea what to write, so I decided to clone a random popular application. That was how I ended up writing a Twitter clone.

It was inevitable that halfway writing the Chirp-Chirp, my Sinatra-based Twitter clone, I found out that there are already quite a few Twitter clones out there, 262 to be specific, at least as of August 2008.

The design goals for Chirp-Chirp are very simple. I want to build a reasonably feature complete Twitter clone in the simplest way possible. The code should be minimal and easily understood. The user interface should be usable without additional extensive instructions. Also, the deployment should be straightforward.

To begin with, let’s look at a Twitter clone. Fundamentally it’s all about telling people what you’re doing with minimal text entry, in real-time. The people are friends who ‘follow’ you (naturally, called ‘followers’). The text that you type is  short, which is why such apps are also known as micro-blogs.

Let’s look at the various components of Chirp-Chirp:

  • User interface
  • Login and user management
  • Data modeling
  • Home page
  • Friends and followers
  • Public and direct messages

User interface
Simply put I just copied Twitter‘s user interface and slapped on a self-drawn logo.

Login and user management
I don’t really want to write a login module, nor manage users who log in to Chirp-Chirp but I certainly need one. So my solution is to ‘borrow’ someone else’s login module and kept minimal user management. The first thing I looked into is OpenID. Using OpenID is relatively simple but it still involved some code integration using an OpenID library.

Finally I settled on using RPX to handle the login, and doing minimal integration to single sign-on with Yahoo!, Google and Windows Live ID, 3 of the biggest players on the Internet scene. I figured almost everyone would have at least 1 of these 3 accounts. The advantage in using RPX is that there is really little integration needed, and if the user is already logged into either one of the 3 acounts, he doesn’t need to log in again. There is no user registration, choosing password, taking care of security and all that stuff. Just a simple click on the account the user wants to use, and he’s in.

User management is inevitable since I need to keep track on who write what messages. As the unique user ID, I use the email that is returned by the provider. OpenID (which all 3 of the providers support) returns me a unique identifier but that is comparatively difficult to type and remember. Email is a good compromise as it is something easily remembered and is unique.

Data modeling
I used DataMapper again for data modeling. The data model consists of 3 classes — User, Relationship and Chirp. Simply put — a User has Relationships with other Users, and a User has many Chirps (corny but, hey :)). A Chirp with a specific recipient User is a direct message.

Home page
All chirps sent out by the user and his friends are listed in his home page in a reverse timeline order (i.e. the latest chirp is listed at the top). Direct messages received are only shown in the direct message inbox and the direct messages sent are only shown in the sent box.

Friends and followers
The concept of friends and followers is a one way relationship. A friend is someone you follow, while a follower is someone following you. A user can be a friend or a follower or both. Following does not require the permission of the user. For example, you can follow anyone in the system without the explicit approval of that user.

Public messages (chirps) and direct messages
A chirp is a public message that is sent by a user that is viewable by the user and all his followers. Chirps belong to a user. Chirps containing the user ID (email) starting with ‘@’ will be hyperlinked to the user’s home page. 2 other features involving the message is direct messaging and following.

You can send direct message to specific users, who will receive them in their direct message inbox. To send a direct message start the chirp with ‘dm’, followed by the user ID. To follow a user, start the chirp with ‘follow’, followed by the user ID.

With this in mind let’s jump into the code. Here’s the stuff that I used:

Without looking the at view templates, the total number of lines of code is around 200 or so. I have only 2 non-view template files — chirp.rb is the web application while models.rb contains the 3 DataMapper data models. As we go along I’ll be extracting code fragments from these 2 files to explain in details.

Let’s talk about chirp.rb first. The first line after the requires tells Sinatra that I want to use sessions. This is followed by telling Sinatra to use the Rack::Flash library to use session-based flash. What you see next are a bunch of get and post blocks. Let’s look at the first one.

['/', '/home'].each do |path|
get path do
if session[:userid].nil? then
erb :login
else
redirect "/#{User.get(session[:userid]).email}"
end
end
end

You will notice that I wrap the get block with an array block, and the array contains 2 strings. These 2 strings are the routes that are associated with the subsequent get block. In the code above, we associate the ‘/’ and ‘/home’ routes to the get block. The get block here checks if the current session contains a user ID. If it doesn’t, it will redirect the user to the login view template. Otherwise it will get the user with this ID and redirect it to the route that contains the email. For example, if the user’s email is user@yahoo.com, then the route is ‘/user@yahoo.com’.

We’re going to look at the login module next. As mentioned, I used RPX (to be specific, I use RPX Basic, which is free!) to do single sign-on so to begin, go to RPXNow and register a web site. Once you do that, you will get an API key and that is really the only thing you need. For Windows Live ID, you’ll need to do a few more steps to register a Windows Live project but you can just follow the steps provided.

Now let’s look at the login template to understand more about the RPX single-sign on mechanism. You will notice that this is entirely HTML, and it contains 3 forms. The URLs for the single sign-on for Google and Yahoo! are OpenID based (which is not surprising as RPX is run by JanRain, one of the main supporters for OpenID). Each form has only 1 input, which is an image button. The user clicks on the account he wants to log in with, and following the given instructions, will be authenticated by the provider and redirected back to our web app. If the login is correct, our web app will receive some data on the user (the amount of data we get is different with different providers, and also configurable from RPX). I won’t go into the actual mechanism, the RPX documentation is pretty detailed.

Once RPX authenticates the user, it will redirect him to our web app along with a token. For Chirp-Chirp this goes back to ‘/login’. The ‘/login’ block then retrieves this token and uses it to retrieve the user information from RPX.

get '/login' do
openid_user = get_user(params[:token])
user = User.find(openid_user[:identifier])
user.update_attributes({:nickname => openid_user[:nickname], :email => openid_user[:email], :photo_url => "http://www.gravatar.com/avatar/#{Digest::MD5.hexdigest(openid_user[:email])}"}) if user.new_record?
session[:userid] = user.id # keep what is stored small
redirect "/#{user.email}"
end

For Chirp-Chirp, I store the unique identifier from RPX, the user’s nickname and his email. For his avatar picture, I use Gravatar which neatly also uses email address as a unique identifier although we need to use MD5 to hash it first. If this is the first time the user is logging in, I save the user information in the database. Finally I store the user’s serial ID (not the unique identifier) in the session and redirect the user to the user home page.

This is the code fragment that retrieves the user information from RPX.

def get_user(token)
u = URI.parse('https://rpxnow.com/api/v2/auth_info')
req = Net::HTTP::Post.new(u.path)
req.set_form_data({'token' => token, 'apiKey' => '<insert RPX API KEY HERE>', 'format' => 'json', 'extended' => 'true'})
http = Net::HTTP.new(u.host,u.port)
http.use_ssl = true if u.scheme == 'https'
json = JSON.parse(http.request(req).body)

if json['stat'] == 'ok'
identifier = json['profile']['identifier']
nickname = json['profile']['preferredUsername']
nickname = json['profile']['displayName'] if nickname.nil?
email = json['profile']['email']
{:identifier => identifier, :nickname => nickname, :email => email}
else
raise LoginFailedError, 'Cannot log in. Try another account!'
end
end

Let’s look at the home page next. I take the example of a user with the email user@yahoo.com, so the home page for him would be ‘/user@yahoo.com’. This is the code fragment with the get block for the home page:

get '/:email' do
@myself = User.get(session[:userid])
@user = @myself.email == params[:email] ? @myself : User.first(:email => params[:email])
@dm_count = dm_count
erb :home
end

The code is pretty self explanatory. The home page retrieves the currently logged in user (@myself) and if the requested user (@user) is not the currently logged in user, it will retrieve that use from the database as well. It also retrieves a count of the direct messages that this user has, and this is for display purposes.

def dm_count
Chirp.count(:recipient_id => session[:userid]) + Chirp.count(:user_id => session[:userid], :recipient_id.not => nil)
end

The number of direct messages is the number of direct messages sent and received.

Let’s look at the friending and following features.

get '/follow/:email' do
Relationship.create(:user => User.first(:email => params[:email]), :follower => User.get(session[:userid]))
redirect '/home'
end

This get block is called when you want to follow a particular user. Again, I just create a relationship between the person whom you want to follow and you. Deleting that relationship is also very simple.

delete '/follows/:user_id/:follows_id' do
Relationship.first(:follower_id => params[:user_id], :user_id => params[:follows_id]).destroy
redirect '/follows'
end

Note that this is a delete block and not get block any more. Finally to display the friends and followers, I have 2 separate routes, but I want to re-use the same code and view template:

['/follows', '/followers'].each do |path|
get path do
@myself = User.get(session[:userid])
@dm_count = dm_count
erb :follows
end
end

Next let’s look at the chirps and direct messages. To chirp, I use a post block:

post '/chirp' do
user = User.get(session[:userid])
Chirp.create(:text => params[:chirp], :user => user)
redirect "/#{user.email}"
end

However, to implement the various features like URL shortening and command level following and direct messaging, I placed the processing logic in the Chirp class itself.

before :save do
case
when starts_with?('dm ')
process_dm
when starts_with?('follow ')
process_follow
else
process
end
end

Before the chirp is save, I check if the text starts with ‘dm’ or ‘follow’ and I process the text accordingly. Let’s look at these various processing blocks of code.

def process_follow
Relationship.create(:user => User.first(:email => self.text.split[1]), :follower => self.user)
throw :halt # don't save
end

Implementing the ‘follow’ command in the chirp box is probably the easier. I just create the relationship and then throw halt to stop saving the chirp.

def process_dm
self.recipient = User.first(:email => self.text.split[1])
self.text = self.text.split[2..self.text.split.size].join(' ') # remove the first 2 words
process
end

Implementing direct message is also relatively simple. I just set the recipient of the message to be the person that is indicated in the chirp box and when saving the chirp, I remove the command. Finally I pass it on to the common processing block.

def process
# process url
urls = self.text.scan(URL_REGEXP)
urls.each { |url|
tiny_url = open("http://tinyurl.com/api-create.php?url=#{url[0]}") {|s| s.read}
self.text.sub!(url[0], "<a href='#{tiny_url}'>#{tiny_url}</a>")
}
# process @
ats = self.text.scan(AT_REGEXP)
ats.each { |at| self.text.sub!(at, "<a href='/#{at[2,at.length]}'>#{at}</a>") }
end

URL_REGEXP = Regexp.new('\b ((https?|telnet|gopher|file|wais|ftp) : [\w/#~:.?+=&%@!\-] +?) (?=[.:?\-] * (?: [^\w/#~:.?+=&%@!\-]| $ ))', Regexp::EXTENDED)
AT_REGEXP = Regexp.new('\s@[\w.@_-]+', Regexp::EXTENDED)

The common processing block scans the chirp text and does a few things:

  • I find the URLs in the code and replace it with a shortened URL from TinyURL.
  • I find all emails that has ‘@’ prefixed and I replace it with a link to that user email.

As for viewing direct messages, again I reuse the same view template, but for the sake of variety instead of having 2 routes, I used 1 route with a parameter:

get '/direct_messages/:dir' do
@myself = User.get(session[:userid])
case params[:dir]
when 'received' then @chirps = Chirp.all(:recipient_id => @myself.id)
when 'sent'     then @chirps = Chirp.all(:user_id => @myself.id, :recipient_id.not => nil)
end
@dm_count = dm_count
erb :direct_messages
end

In the case above, when I’m looking at the received direct messages (i.e. I’m looking at the inbox) I just want to find all chirps which recipient is myself. For the direct messages I sent, I find all chirps that belong to me, which recipient is not null.

That’s it! Of course there’s the view templates but that’s mainly HTML and some ERB snippets embedded in them. You can check out the entire repository at git://github.com/sausheong/chirp.git.

To run the app, get the code. Then go to the models.rb file and change the database URL accordingly. After that, go to irb, require the models file, and run this:

> DataMapper.auto_migrate!

This will create the necessary database. Then in the directory, run this

> ruby chirp.rb

Go to http://localhost:4567 and you’re up and running locally! Of course, you need to register your RPX account and then change login.erb accordingly.

To view this in a live site, go to http://chirp-chirp.heroku.com. This is deployed to Heroku, which is another amazing service, but that’s another story for another post.

Follow

Get every new post delivered to your Inbox.

Join 447 other followers