saush

Clone TinyURL in 40 lines of Ruby code

Posted in DataMapper, Ruby, Sinatra by sausheong on April 13, 2009


I’m officially hooked. After writing 2 blog posts on cloning popular web applications on the Internet, I was raring to take on another one. TinyURL looked like the easiest, so that’s the one I did. In fact it’s so easy that there are already more than 100 such applications on the market. I called my TinyURL clone Snip (http://snip.heroku.com).

I wrote Snip with Sinatra and then deployed it to Heroku, so this is also a good excuse to describe Heroku, a truly amazing service for the Ruby programming community. The total number of lines in Snip is actually 43, in a single file named snip.rb, including the view template and layout. To check it out, go to git://github.com/sausheong/snip.git.

A quick word about TinyURL and its army of clones. TinyURL was the first of its kind (started in January 2002), providing a very simple but useful service: replacing a full URL with a much shorter one. Going to the shortened URL redirects the user to the actual full URL. Its usage really exploded with the rising popularity of Twitter, which limits messages to 140 characters or fewer, making URL shortening a necessity. In March 2009, bit.ly, the service with the second largest market share (13%), raised $2 million in funding, and TechCrunch had a field day, even estimating that TinyURL, with 75% of the market, was worth up to $46 million! Unfortunately, to date TinyURL’s actual business model is still unclear, and the question of whether it and its ilk can make money remains unanswered.

Source : TechCrunch


Enough on the business and money. Let’s look at the code and start with the first line. Most of the time we write each require on its own line, but we can also put the library names in an array and require each one in a loop. I only used the Sinatra and DataMapper libraries here, though Sinatra itself has built-in support for HAML, which is the template markup language I used in Snip.
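The require trick can be sketched as follows. This is only a minimal illustration: it uses standard-library names (uri, set, json) so that it runs without any gems installed, whereas Snip's own list would name the Sinatra and DataMapper libraries. The constant name LIBRARIES is made up for the example.

```ruby
# Load several libraries with one loop instead of one require per line.
# The names below are standard-library stand-ins (an assumption made so the
# sketch runs anywhere); Snip would list its own libraries here.
LIBRARIES = %w(uri set json)
LIBRARIES.each { |lib| require lib }

# All three libraries are now available.
puts URI.parse('http://example.com/').host    # example.com
puts Set.new([1, 2, 2]).size                  # 2
puts JSON.generate({ 'snip' => 'ya' })        # {"snip":"ya"}
```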

I will not go in depth on using Sinatra. If you’re interested, please go to the Sinatra documentation or my previous post. I will assume you roughly know how Sinatra works and jump right in (anyway there is so little code you’d probably understand it easily).

I have only 2 get blocks. The ‘/’ get block just shows the main page, while the ‘/:snipped’ get block takes in the snipped code, fetches the original URL, then redirects the user to it. I also have a post block. The ‘/’ post block first makes sure that the URL is valid (by running it through the URI parse method). Then it either creates a new URL record or fetches the existing one, if the URL was shortened beforehand. Notice that a get request to ‘/’ never reaches the post block even though both share the same path, which neatly demonstrates the simplicity of Sinatra’s routing.
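The logic of the post block can be sketched without Sinatra or DataMapper. In this minimal sketch the in-memory STORE array and the snip helper are hypothetical stand-ins for the database table and the post block. Only the use of URI parsing for validation and the create-or-fetch behaviour come from the description above; accepting only http and https URLs is an added assumption.

```ruby
require 'uri'

# Hypothetical stand-in for the database table: the position in the array
# plays the role of the row ID (starting at 1).
STORE = []

# Returns the base 36 code for a URL, or nil when the URL is not valid.
def snip(original)
  uri = URI.parse(original)                 # raises on malformed input
  return nil unless uri.is_a?(URI::HTTP)    # assumption: accept http/https only
  url = uri.to_s
  STORE << url unless STORE.include?(url)   # create only if not stored yet
  (STORE.index(url) + 1).to_s(36)           # row ID rendered in base 36
rescue URI::InvalidURIError
  nil
end

first  = snip('http://example.com/a/long/path')
second = snip('http://example.com/a/long/path')  # same URL again
other  = snip('https://example.org/')
bad    = snip('not a url')
puts [first, second, other, bad.inspect].join(' ')
```

Shortening the same URL twice returns the same code, because the second call finds the existing entry instead of adding a new one.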

A note on the alphanumeric code used to represent a URL. I store each URL in the database as a row, and the code is really just the row ID in the database table. However, I cannot use the row ID directly because as the number of URLs grows, the number of characters in the code grows quickly as well. For example, once I hit 1 million URLs, I would have 7 characters (all digits) in the code. This is not so efficient. To reduce the number of characters used for representing the row ID, I use a base 36 numbering system (0-9, a-z). This means 1 million records in the database would only require 4 characters (1,000,000 base 10 is ‘lfls’ base 36). And the 200 million records that TinyURL claims to store would use only 6 characters instead of 9. I have no idea how TinyURL actually generates its codes, but looking at the current number of characters in their codes (6), I would say this is not a far-off guess. (Reverse engineering their 6 character code this way, though, suggests that TinyURL has more than 700 million records.)
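The arithmetic above can be checked directly in Ruby. This is a small sketch; the code_length helper is a name made up for illustration.

```ruby
# Number of characters needed to write a row ID in a given base.
def code_length(id, base)
  id.to_s(base).length
end

puts 1_000_000.to_s(36)             # lfls
puts code_length(1_000_000, 10)     # 7
puts code_length(1_000_000, 36)     # 4
puts code_length(200_000_000, 10)   # 9
puts code_length(200_000_000, 36)   # 6

# Largest row ID that still fits in 6 base 36 characters.
puts 'zzzzzz'.to_i(36)              # 2176782335
```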

Doing base 36 conversion seems daunting, but in reality most programming languages have some sort of support for converting between non-decimal numbering systems. Ruby’s implementation is particularly simple. A Ruby Fixnum (the class of ordinary whole numbers, merged into Integer as of Ruby 2.4) has a method called to_s. This method is probably familiar to all Ruby programmers, as every object has one to produce its String representation. What is likely less well known is that Fixnum’s implementation takes an optional parameter, which is the radix of the numbering system to use. For example:

>> 1234.to_s(2)
=> "10011010010"
>> 1234.to_s(36)
=> "ya"

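Conversely, String’s to_i implementation does the reverse: it takes a String and converts it into the Fixnum that the String represents in the given radix. Parsing stops at the first character that is not a valid digit in that radix, so in the example below only “hello” is converted and the rest is ignored: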

>> "hello world".to_i(36)
=> 29234652
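Put together, the two methods give a round trip between a row ID and its short code. A minimal sketch, with encode and decode as hypothetical helper names:

```ruby
# Turn a database row ID into a short code and back again.
def encode(id)
  id.to_s(36)
end

def decode(code)
  code.to_i(36)
end

puts encode(1234)              # ya
puts decode('ya')              # 1234
puts decode(encode(987_654))   # 987654

# A string with no leading base 36 digit decodes to 0, which never
# matches a real row ID when IDs start at 1.
puts decode('!!')              # 0
```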

For templating, I used HAML, the delightful albeit more programmer-centric templating system. Most templating systems strike a compromise between HTML and a programming language. This is evident in popular systems like JSP, ASP, PHP and even ERB (ERB is the templating system used by default in Rails). However, HAML abandons HTML altogether and goes for a purely programming approach, very much like Seaside (Seaside is a Smalltalk web application framework). Skipping comparisons of the different approaches to templating, one advantage HAML has is that it allows programmers to code the interface in a more natural way, which suits me just fine in this case.

I used a Sinatra trick that allows me to embed the templates within the source code itself. Normally I would need to create 2 additional template files (1 for the layout and another for the index). Instead, I put the templates in the same source file, after the __END__ keyword.
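In Sinatra, each in-file template after __END__ is introduced by a line of the form "@@ name". The sketch below only illustrates that file layout: the inline_templates parser is made up for this example and is not Sinatra's own code, and the template bodies are placeholders rather than Snip's actual templates.

```ruby
# A tiny Sinatra-style source file held in a string, with two in-file
# templates after the __END__ marker (placeholder content, not Snip's).
SOURCE = <<~'RUBY'
  get '/' do
    haml :index
  end

  __END__

  @@ layout
  %html
    %body= yield

  @@ index
  %h1 Snip
RUBY

# Illustrative parser: split off everything after __END__, then start a new
# named template at every "@@ name" line.
def inline_templates(source)
  _code, data = source.split(/^__END__$/, 2)
  templates = {}
  current = nil
  data.to_s.each_line do |line|
    if (match = line.match(/^@@\s*(\S+)\s*$/))
      current = match[1].to_sym
      templates[current] = String.new
    elsif current
      templates[current] << line
    end
  end
  templates
end

TEMPLATES = inline_templates(SOURCE)
puts TEMPLATES.keys.inspect
```

The layout and index templates end up as two named strings, which is what makes the two separate template files unnecessary.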

To prettify the interface, I used one of the ready-made W3C core stylesheets, which provide me with a standard and well defined set of styles without going through the headache of creating one myself.

And we’re done! A full URL shortening service in a single file of about 40 lines of code. To start it up, just do this:

$ ruby snip.rb

Then go to http://localhost:4567.

The next step is to deploy it to Heroku. I can’t say enough about this service, which is heaven-sent for the Ruby web application programming community. Just register an account here at http://heroku.com. In fact the main page more or less explains the steps you need to do to deploy the app! However, there are a couple more steps for Sinatra. Here is the complete list of steps:

1. Create a config.ru file

This is the Rack configuration file, which is actually just another Ruby script. All you need to have in this file is this:

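require 'sinatra'
require './snip'          # explicit path: newer Rubies no longer search the current directory
run Sinatra::Application  # very old Sinatra releases exposed this as Sinatra.application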

This tells Rack to include the Sinatra and Snip libraries, then run the Sinatra application.

2. Install the Heroku gem

$ sudo gem install heroku

Heroku provides us with a set of useful tools packaged in a gem, very much like Capistrano.

3. Initialize an empty Git repository in the snip folder

$ cd snip
snip $ git init
Initialized empty Git repository in .git/
snip $ git add .
snip $ git commit -m 'initial import'
Created initial commit 5581d23: initial import
2 files changed, 52 insertions(+), 0 deletions(-)
create mode 100644 config.ru
create mode 100644 snip.rb

This creates a new Git repository on your computer and commits config.ru and snip.rb to it.

4. Create the Heroku application

snip $ heroku create snip
Created http://snip.heroku.com/ | git@heroku.com:snip.git
Git remote heroku added

You will be prompted for your username and password the first time you run a heroku command. Subsequently this will be saved in ~/.heroku/credentials and you won’t be prompted. It will also upload your public key to allow you to push and pull code.

5. Push your code to Heroku

snip $ git push heroku master
Counting objects: 4, done.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 999 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
-----> Heroku receiving push
-----> Rack app detected
Compiled slug size is 004K
-----> Launching....... done
App deployed to Heroku
To git@heroku.com:snip.git
* [new branch]      master -> master

Notice that this pushes your code and loads your application into deployment.

6. Log in to the Heroku console and create the database

snip $ heroku console
Ruby console for snip.heroku.com
>> DataMapper.auto_migrate!
=> [Url]

Heroku allows you access to a console similar to irb but with the environment of your deployment loaded up, like script/console in Ruby on Rails. To create the database, I just run DataMapper.auto_migrate! and it will create the database accordingly.

This is it! Now go to your application on Heroku and you should see Snip’s main page.

One of the main sticking points in doing Ruby web application development is finding a place to host the application. Most people use either shared web hosting or slice hosting, but these are either underpowered or costly (or both) and require mucking around with servers (which many application programmers like myself do only reluctantly). Heroku is an amazing service that heralds a new way of deploying Ruby web applications and saves the day for many Ruby programmers.

A final word on the URL shortening service. For a more full-fledged service, I would probably register a much shorter domain name (something like what is.gd has) and also add more features, such as statistics for each redirected URL.

23 Responses


  1. Kamal said, on April 13, 2009 at 3:29 pm

    I didn’t know Fixnum’s to_s took a param!

    As a comparison, on my URL shortener, I’m using Base 62 so that I get uppercase characters too (0-9, A-Z, a-z). Conversion is via the alphadecimal gem. Another thing is that I store the hash to url mapping in a key-value store so I can look up directly instead of converting back to int.

  2. sausheong said, on April 13, 2009 at 3:43 pm

    That’s why we love Ruby so :)

    Snip is a 40-liner web app, my priority was to create the simplest full-featured URL shortening service possible. I’m pretty sure yours is much more sophisticated.

  3. Kamil said, on April 14, 2009 at 3:15 am

    Very nice. Also you can write on one line:
    @url = Url.first(:original => uri.to_s) || Url.create(:original => uri.to_s) if @url.nil?

  4. Kamil said, on April 14, 2009 at 3:16 am

    @url = Url.first(:original => uri.to_s) || Url.create(:original => uri.to_s)

  5. Dary Merckens said, on April 14, 2009 at 4:36 am

    Funny coincidence. I just wrote one too :)

    One thing I did, not sure if you did too, was the following in an after_create filter:


    def filter_token
    token.insert(1,"^").insert(3,"@") if token.match(FILTER_REGEX)
    end

    FILTER_REGEX has a list of bad words. So that changes like f*** to f^u@ck. A little more family friendly I guess :)

    Dary

    • Gimeti said, on August 22, 2009 at 11:11 am

      good point,

      as a PHP guy I have no idea what is going on in that code but making a url shortener in 40 lines got my attention.

  6. Valery said, on April 16, 2009 at 4:47 am

    Great article, thanks. Keep going in the same way. What about AdSense like services in 50 lines? ;) It should be interesting :)


  8. sausheong said, on April 21, 2009 at 8:35 am

    @dary Thks for the tip, I didn’t really look too far ahead I guess.

  9. [...] walkthrough of how to implement a TinyURL service in 40 lines of Ruby code, thanks to [...]

  10. Greg said, on April 28, 2009 at 2:00 am

    You can actually shorten that “first or create” section even more by using a DataMapper method named, appropriately, first_or_create:

    @url = Url.first_or_create(:original => uri.to_s)

  11. Nicolas Jacobeus said, on April 28, 2009 at 2:19 am

    Nice! I did the same exercise a couple of weeks ago. My version is slightly longer but uses Base62 (algorithm mine, can probably be improved) and includes the CSS: http://i5.be/s

    It’s in production here: http://i5.be

  12. Scott Woods said, on April 30, 2009 at 9:26 am

    Funny how so many similar apps were created in isolation around the same time! Here’s mine:

    http://gist.github.com/93599

    I was also going for the one-file sinatra app, but brevity was not the first priority.

    Very cool to see a slightly different approach! Thanks!

  13. sausheong said, on April 30, 2009 at 10:17 am

    Scott, really cool stuff! I’m still amazed at Sinatra and what it can do.

    I guess brevity and simplicity is my personal philosophy in programming.


  15. [...] of the Java support for AppEngine, it became a lot more interesting. A few weeks back I wrote Snip, a TinyURL clone, in about 40 lines of Ruby code, and deployed it on Heroku. It seems like a good idea to take Snip out for a spin on the Google [...]

  16. [...] Clone TinyURL in 40 lines of Ruby code. ↩ Posted by Satish [...]

  17. [...] is where Rails comes in. It’s possible to clone TinyURL using Sinatra, but I’m more interested in Rails (I think Rails probably scales better). The basic setup is [...]

  18. [...] posts on Western blogs such as “Clone TinyURL with 40 lines of Ruby” or “Clone Pastie in 15 Minutes with Sinatra & DataMapper”, I decided [...]

  19. Владимир Крылов said, on December 24, 2009 at 6:35 am

    Yep, it really is all very simple :)

  20. rubypdf said, on December 25, 2009 at 1:03 am

    really cool application

  21. Григорий Филатов said, on December 28, 2009 at 10:37 am

    Much appreciated, this is really useful information.

  22. мaкap said, on January 14, 2010 at 6:19 am

    I am sure this information is already quite enough to draw a conclusion about how things should not be done.

