Clone TinyURL in 40 lines of Ruby code
I’m officially hooked. After writing 2 blog posts on cloning popular web applications on the Internet, I was raring to take on another one. TinyURL looked like the easiest so that’s the one I did. In fact it’s so easy that there are already more than 100 such applications on the market. I called my TinyURL clone Snip (http://snip.heroku.com).
I wrote Snip with Sinatra then deployed it to Heroku, so this is also a good excuse to describe Heroku, a truly amazing service for the Ruby programming community. The total number of lines in Snip is actually 43, in a single file named snip.rb, including the view template and layout. To check it out go to git://github.com/sausheong/snip.git.
A quick word about TinyURL and its army of clones. TinyURL was the first of its kind (started in January 2002), providing a very simple but useful service: replacing a full URL with a much shorter one. Going to the shortened URL redirects the user to the actual full URL. Its usage really exploded with the rising popularity of Twitter, which limits messages to 140 characters or fewer, making URL shortening a necessity. In March 2009, bit.ly, the service with the second largest market share (13%), raised $2 million in funding and TechCrunch had a field day, even estimating that TinyURL, with 75% of the market, was worth up to $46 million! Unfortunately, to date, the question of TinyURL’s actual business model, and of whether it and its ilk can make money, is still unanswered.
Enough on the business and money. Let’s look at the code and start with the first line. Most of the time we write require statements on multiple lines, but we can also put the library names in an array and require each one iteratively. I only used the Sinatra and DataMapper libraries here, though Sinatra itself takes care of loading HAML, which is the template markup language I used in Snip.
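As a rough illustration of the one-line require, here is a minimal sketch. The library names are standard-library stand-ins chosen so the snippet runs without any gems installed; they are an assumption for this example, and Snip’s own first line lists Sinatra and DataMapper instead.

```ruby
# Require several libraries in one line by iterating over an array of names.
# 'uri', 'set' and 'time' are stand-ins for this sketch, not Snip's actual list.
%w(uri set time).each { |lib| require lib }

# Each library is now loaded exactly as if it had its own require line.
host = URI.parse('http://example.com/some/path').host
tags = Set.new(%w(ruby sinatra ruby))
puts host       # example.com
puts tags.size  # 2
```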
I will not go in depth on using Sinatra. If you’re interested, please go to the Sinatra documentation or my previous post. I will assume you roughly know how Sinatra works and jump right in (anyway there is so little code you’d probably understand it easily).
I have only 2 get blocks. The ‘/’ get block just shows the main page, while the ‘/:snipped’ get block takes in the snipped code, fetches the original URL, then redirects the user to it. I also have a post block. The ‘/’ post block (notice that using get will not reach the post block, which neatly demonstrates the pure simplicity of Sinatra) first makes sure that the URL is valid (by running it through the URI parse method). Then it either creates a new URL or fetches the existing one, if it was shortened beforehand.
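The logic behind those blocks can be sketched in plain Ruby, without Sinatra or a database. Everything named here is hypothetical: UrlStore, snip and expand are illustrative names, the in-memory array stands in for the DataMapper table, and the extra http(s) check is my own assumption on top of the URI parse step described above.

```ruby
require 'uri'

# In-memory stand-in for the database table (an assumption for this sketch;
# Snip itself stores rows through DataMapper).
class UrlStore
  def initialize
    @urls = [] # array index + 1 plays the role of the row ID
  end

  # Roughly what the '/' post block does: validate, then find or create.
  def snip(original)
    uri = URI.parse(original) # raises URI::InvalidURIError on malformed input
    # Extra check, not described in the post: accept only http and https URLs.
    raise ArgumentError, "not an http(s) URL: #{original}" unless uri.is_a?(URI::HTTP)

    index = @urls.index(original)
    if index.nil?
      @urls << original
      index = @urls.size - 1
    end
    (index + 1).to_s(36) # the short code is the row ID written in base 36
  end

  # Roughly what the '/:snipped' get block does: turn the code back into a row.
  def expand(code)
    id = code.to_i(36)
    id > 0 ? @urls[id - 1] : nil
  end
end

store = UrlStore.new
code = store.snip('http://example.com/a/very/long/path')
puts code                                              # 1
puts store.snip('http://example.com/a/very/long/path') # 1 again, no new row
puts store.expand(code)                                # the original URL
```

In the real application the get block would issue a redirect to the value that expand returns here.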
A note on the alphanumeric code used to represent a URL. I store each URL in the database as a row and the code is really just the row ID in the database table. However I cannot use the row ID directly because as the number of URLs grows, the number of characters representing the code grows quickly as well. For example, as I hit 1 million URLs, I would have 7 characters (all numbers) in the code. This is not so efficient. To reduce the number of characters used for representing the row ID, I use a base 36 numbering system (a-z, 0-9). This means 1 million records in the database would only require 4 characters (1,000,000 base 10 is ‘lfls’ base 36). And the 200 million records that TinyURL claims to store will use only 6 characters instead of 9. I have no idea how TinyURL actually generates their code, but looking at the current number of characters in their code (6), I would say this is not a far-off guess. (Reverse engineering their 6 character code this way, though, suggests that TinyURL has more than 700 million records.)
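The arithmetic above is easy to check. This is a small sketch with a hypothetical helper, code_length, that counts how many characters a row ID needs in a given base; the record counts are the ones quoted in the paragraph.

```ruby
# Number of characters needed to write a row ID in the given base.
# (code_length is an illustrative name, not part of snip.rb.)
def code_length(id, base)
  id.to_s(base).length
end

puts 1_000_000.to_s(36)           # lfls
puts code_length(1_000_000, 10)   # 7
puts code_length(1_000_000, 36)   # 4
puts code_length(200_000_000, 10) # 9
puts code_length(200_000_000, 36) # 6

# Range of row IDs that need exactly 6 base 36 characters.
puts 36**5     # 60466176, the first ID with a 6 character code
puts 36**6 - 1 # 2176782335, the last ID with a 6 character code
```

So any table holding between roughly 60 million and 2.2 billion rows would produce 6 character codes under this scheme.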
Doing base 36 conversion seems daunting but in reality most programming languages have some sort of support for converting between non-decimal numbering systems. Ruby’s implementation is particularly simple. A Ruby Fixnum (the class for ordinary whole numbers, folded into Integer since Ruby 2.4) has a method called to_s. This method is probably familiar to all Ruby programmers as it converts any object to a String representation. What is likely less well-known is that Fixnum’s implementation takes a parameter, which is the radix of the numbering system used. For example:
>> 1234.to_s(2)
=> "10011010010"
>> 1234.to_s(36)
=> "ya"
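Conversely, String’s to_i implementation does the reverse, which is to take a String and convert it into a Fixnum representation of that String, given the radix. Parsing stops at the first character that is not a valid digit in that radix, so in the example below only "hello" is converted and the rest is ignored: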
>> "hello world".to_i(36)
=> 29234652
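Because to_i is lenient with input it cannot fully parse, a URL shortener may prefer a stricter conversion when decoding a code that arrives in a request. The sketch below is one possible approach, not what snip.rb does: encode and decode are hypothetical helper names, and Kernel#Integer is swapped in for to_i because it raises on input such as "hello world" instead of silently converting part of it.

```ruby
# Hypothetical helpers for moving between row IDs and short codes.
def encode(id)
  id.to_s(36)
end

# Strict decoding: returns nil when the code is not a base 36 number.
def decode(code)
  Integer(code, 36)
rescue ArgumentError
  nil
end

puts encode(1234)                  # ya
puts decode('ya')                  # 1234
puts decode('YA')                  # 1234, letters are case-insensitive
puts decode('hello world').inspect # nil
puts decode(encode(29_234_652))    # 29234652, the round trip is lossless
```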
For templating, I used HAML, the delightful albeit more programmer-centric templating system. Most templating systems strike a compromise between HTML and a programming language. This is evident in popular systems like JSP, ASP, PHP and even ERB (ERB is the templating system used by default in Rails). However HAML abandons HTML altogether and goes for a purely programming approach, very much like Seaside (Seaside is a Smalltalk web application framework). Skipping comparisons of the different approaches to templating, one advantage HAML has is that it allows programmers to code the interface in a more natural way, which suits me just fine in this case.
I used a Sinatra trick that allows me to embed the template within the same source code itself. While normally I would need to create 2 additional template files (1 for the layout and another for the index), I added in the code for the templates in the same source file but after the __END__ keyword.
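For readers who have not seen this trick, the sketch below shows roughly what such a single file looks like and how the named sections after __END__ can be told apart. Sinatra marks each inline template with a line of the form @@ name. Everything else here is an assumption made for illustration: the route, the HAML markup and the inline_templates helper are not the contents of snip.rb, and the helper is not Sinatra’s own parser.

```ruby
# A hypothetical single-file layout, held in a string so it can be inspected:
# application code first, then the templates after __END__.
SOURCE = <<~'FILE'
  get('/') { haml :index }
  __END__
  @@ layout
  %html
    %body= yield
  @@ index
  %h1 Snip
FILE

# Split the part after __END__ into a hash of template name => template body.
def inline_templates(source)
  _code, data = source.split(/^__END__$/, 2)
  templates = {}
  current = nil
  data.to_s.each_line do |line|
    if line =~ /^@@\s*(\S+)/
      current = Regexp.last_match(1).to_sym # a new named section starts here
      templates[current] = String.new
    elsif current
      templates[current] << line            # body line of the current section
    end
  end
  templates
end

templates = inline_templates(SOURCE)
puts templates.keys.inspect # [:layout, :index]
puts templates[:index]      # %h1 Snip
```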
To prettify the interface, I used one of the ready-made W3C core stylesheets, which provide me with a standard and well defined set of styles without going through the headache of creating one myself.
And we’re done! A full URL shortening service in a single file of about 40 lines of code. To start it up just do this:
$ ruby snip.rb
Then go to http://localhost:4567.
The next step is to deploy it to Heroku. I can’t say enough about this service, which is heaven-sent for the Ruby web application programming community. Just register an account here at http://heroku.com. In fact the main page more or less explains the steps you need to do to deploy the app! However, there are a couple more steps for Sinatra. Here is the complete list of steps:
1. Create a config.ru file
This is the Rack configuration file, which is actually just another Ruby script. All you need to have in this file is this:
require 'sinatra'
require 'snip'
run Sinatra.application
This tells Rack to include the Sinatra and Snip libraries, then run the Sinatra application.
2. Install the Heroku gem
$ sudo gem install heroku
Heroku provides us with a set of useful tools packaged in a gem, very much like Capistrano.
3. Initialize an empty Git repository in the snip folder
$ cd snip
snip $ git init
Initialized empty Git repository in .git/
snip $ git add .
snip $ git commit -m 'initial import'
Created initial commit 5581d23: initial import
 2 files changed, 52 insertions(+), 0 deletions(-)
 create mode 100644 config.ru
 create mode 100644 snip.rb
This creates a git repository in the snip folder on your computer, stages both files and records them in a first commit.
4. Create the Heroku application
snip $ heroku create snip
Created http://snip.heroku.com/ | email@example.com:snip.git
Git remote heroku added
You will be prompted for your username and password the first time you run a heroku command. Subsequently this will be saved in ~/.heroku/credentials and you won’t be prompted. It will also upload your public key to allow you to push and pull code.
5. Push your code to Heroku
snip $ git push heroku master
Counting objects: 4, done.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 999 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
-----> Heroku receiving push
-----> Rack app detected
Compiled slug size is 004K
-----> Launching....... done
App deployed to Heroku
To firstname.lastname@example.org:snip.git
* [new branch] master -> master
Notice that this pushes your code and loads your application into deployment.
6. Log in to the Heroku console and create the database
snip $ heroku console
Ruby console for snip.heroku.com
>> DataMapper.auto_migrate!
=> [Url]
Heroku allows you access to a console similar to irb but with the environment of your deployment loaded up, like script/console in Ruby on Rails. To create the database, I just run DataMapper.auto_migrate! and it will create the database accordingly.
This is it! Now go to your application on Heroku and you should be able to see this:
One of the main sticking points in doing Ruby web application development is finding a place to host the application. Most people use either shared web hosting or slice hosting, but these are either underpowered or costly (or both) and require mucking around with servers (which many application programmers like myself do only reluctantly). Heroku is an amazing service that heralds a new way of deploying Ruby web applications and saves the day for many Ruby programmers.
A final word on the URL shortening service. For a more full-fledged service, I would probably register a much shorter domain name (something like what is.gd has) and also add more features, such as statistics for each redirected URL and so on.