HTML Forms and Go

This is an excerpt from my new book, Go Web Programming, about using Go to process HTML forms sent from the browser. This sounds pretty trivial, but as in much of web programming (and much of programming in general), it’s often the trivial things that trip us up.

Before we get into getting form data from a POST request, let’s take a closer look at HTML forms and see what they are. Most of the time, POST requests come in the form (pun intended) of an HTML form, which often looks like this:

<form action="/process" method="post">
<input type="text" name="first_name"/>
<input type="text" name="last_name"/>
<input type="submit"/>
</form>

Within the form tag, we place a number of HTML form elements like text input, text area, radio buttons, checkboxes, file uploads and so on. These elements allow users to enter data to be submitted to the server. Data is submitted to the server when the user clicks a button or somehow triggers the form submission.

We know the data is sent to the server through an HTTP POST request and is placed in the body of the request. But how is the data formatted? HTML form data is always sent as name-value pairs, but how are these name-value pairs formatted in the POST body? It is important for us to know this because when we receive the POST request from the browser, we need to be able to parse the data and extract the name-value pairs.

The format of the name-value pairs sent through a POST request is specified by the content type of the HTML form. This is defined using the enctype attribute, like this:

<form action="/process" method="post" enctype="application/x-www-form-urlencoded">
<input type="text" name="first_name"/>
<input type="text" name="last_name"/>
<input type="submit"/>
</form>

The default value for enctype is application/x-www-form-urlencoded. Browsers are required to support at least application/x-www-form-urlencoded and multipart/form-data (HTML5 also supports a text/plain value).

If we set enctype to application/x-www-form-urlencoded, the browser encodes the HTML form data as one long query string: the name-value pairs are separated by ampersands (&), and each name is separated from its value by an equals sign (=). This is essentially the same as URL query string encoding, hence the name. Note that browsers send a space as a plus sign (+). In other words, the HTTP body will look something like this:

first_name=sau+sheong&last_name=chang
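To make the format concrete, here is a minimal Go sketch that builds and decodes such a body with the standard net/url package. The helper names encodeForm and decodeForm are made up for this example. Note that url.Values.Encode sorts the pairs by key and writes a space as +, while url.ParseQuery accepts both + and %20.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeForm is a hypothetical helper that URL-encodes the two example fields.
func encodeForm(first, last string) string {
	v := url.Values{}
	v.Set("first_name", first)
	v.Set("last_name", last)
	return v.Encode() // pairs are sorted by key and joined with "&"
}

// decodeForm is a hypothetical helper that parses an encoded body back
// into name-value pairs.
func decodeForm(body string) (url.Values, error) {
	return url.ParseQuery(body)
}

func main() {
	fmt.Println(encodeForm("sau sheong", "chang"))

	v, err := decodeForm("first_name=sau%20sheong&last_name=chang")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(v.Get("first_name"), v.Get("last_name"))
}
```

We rarely need to do this by hand on the server, but it is handy for building test requests.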

If we set enctype to multipart/form-data, each name-value pair is converted into a MIME message part, each with its own content type and content disposition. For example, the same form data as above will now look something like this:

------WebKitFormBoundaryMPNjKpeO9cLiocMw
Content-Disposition: form-data; name="first_name"

sau sheong
------WebKitFormBoundaryMPNjKpeO9cLiocMw
Content-Disposition: form-data; name="last_name"

chang
------WebKitFormBoundaryMPNjKpeO9cLiocMw--
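If you want to see a body like this without a browser, the standard mime/multipart package can produce and read one. This is only a sketch: buildMultipart and readMultipart are hypothetical helper names, and Go picks its own random boundary, so the separator lines will not match the WebKit ones above.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"strings"
)

// buildMultipart is a hypothetical helper that writes the two example
// fields as MIME parts. It returns the body and the boundary string.
func buildMultipart(first, last string) (body, boundary string, err error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	if err = w.WriteField("first_name", first); err != nil {
		return "", "", err
	}
	if err = w.WriteField("last_name", last); err != nil {
		return "", "", err
	}
	// Close writes the final boundary line (the one ending in "--").
	if err = w.Close(); err != nil {
		return "", "", err
	}
	return buf.String(), w.Boundary(), nil
}

// readMultipart is a hypothetical helper that walks the parts of a body
// and collects them as name-value pairs.
func readMultipart(body, boundary string) (map[string]string, error) {
	r := multipart.NewReader(strings.NewReader(body), boundary)
	fields := map[string]string{}
	for {
		part, err := r.NextPart()
		if err == io.EOF {
			return fields, nil
		}
		if err != nil {
			return nil, err
		}
		value, err := io.ReadAll(part)
		if err != nil {
			return nil, err
		}
		fields[part.FormName()] = string(value)
	}
}

func main() {
	body, boundary, err := buildMultipart("sau sheong", "chang")
	if err != nil {
		fmt.Println("build error:", err)
		return
	}
	fmt.Print(body)

	fields, err := readMultipart(body, boundary)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println(fields["first_name"], fields["last_name"])
}
```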

When would we use one or the other? If we’re sending simple text data, the URL-encoded form is better: it is simpler and more efficient, and less processing is needed. If we’re sending large amounts of data, especially when uploading files, the multipart form is better. We can even specify base64 encoding to send binary data as text.

So far we’ve only talked about POST requests. What about GET requests in an HTML form? HTML allows the method attribute to be either POST or GET, so this is also a valid form:

<form action="/process" method="get">
<input type="text" name="first_name"/>
<input type="text" name="last_name"/>
<input type="submit"/>
</form>

In this case there is no request body, because a form submitted with GET puts all its data in the URL query string as name-value pairs.
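As a quick check of what the server sees for a GET form, here is a small sketch that simulates the submission in memory with net/http/httptest. The helper name getFormValues is an assumption for this example, and ParseForm is the net/http method we look at in more detail further on.

```go
package main

import (
	"fmt"
	"net/http/httptest"
	"net/url"
)

// getFormValues is a hypothetical helper: it simulates the browser's GET
// submission and returns what the server finds in Form and PostForm.
func getFormValues(first, last string) (form, postForm url.Values, err error) {
	q := url.Values{}
	q.Set("first_name", first)
	q.Set("last_name", last)
	// A GET form puts the encoded pairs in the query string; the body is empty.
	r := httptest.NewRequest("GET", "/process?"+q.Encode(), nil)
	if err = r.ParseForm(); err != nil {
		return nil, nil, err
	}
	return r.Form, r.PostForm, nil
}

func main() {
	form, postForm, err := getFormValues("sau sheong", "chang")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("from the URL: ", form)
	fmt.Println("from the body:", postForm)
}
```

All the pairs arrive through the URL, so the body-only map stays empty.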

Now that we know how data is sent from an HTML form to the server, let’s go back to the server and see how we use net/http to process the request.

Form

One way to extract data from the HTTP request is to read the URL and the body in their raw form, which requires us to parse the data ourselves. However, we normally do not need to, because the net/http library provides a rather comprehensive set of functions that, although not always well named, normally gives us all we need. Let’s talk about each of them in turn.

The functions in Request that allow us to extract data from the URL and/or the body revolve around the Form, PostForm and MultipartForm fields. The data is in the form of key-value pairs (which is what we normally get from a POST request anyway). The general algorithm is:

  • Call ParseForm or ParseMultipartForm to parse the request
  • Access Form, PostForm or MultipartForm accordingly

Let’s take a look at some code.

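package main

import (
	"fmt"
	"net/http"
)

// process parses the request and writes the parsed form data to the response.
func process(w http.ResponseWriter, r *http.Request) {
	r.ParseForm()
	fmt.Fprintln(w, r.Form)
}

func main() {
	server := http.Server{
		Addr: "127.0.0.1:8080",
	}
	// With no Handler set on the server, requests go to the default mux,
	// which is where HandleFunc registers the handler.
	http.HandleFunc("/process", process)
	server.ListenAndServe()
}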

The focus of this server is on these 2 lines:

r.ParseForm()
fmt.Fprintln(w, r.Form)

As mentioned earlier, we need to first parse the request using ParseForm, and then access the Form field.
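One thing the handler above skips is that ParseForm returns an error, for example when the body contains an invalid percent-escape. Here is a sketch of a variant that checks it. Replying with 400 Bad Request is my own choice here, not something net/http requires, and statusFor is a hypothetical helper that exercises the handler in memory using net/http/httptest.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// process is the handler from above with the ParseForm error checked.
func process(w http.ResponseWriter, r *http.Request) {
	if err := r.ParseForm(); err != nil {
		http.Error(w, "cannot parse form: "+err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Fprintln(w, r.Form)
}

// statusFor is a hypothetical helper: it posts a URL-encoded body to the
// handler in memory and returns the response status code.
func statusFor(body string) int {
	r := httptest.NewRequest("POST", "/process", strings.NewReader(body))
	r.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	w := httptest.NewRecorder()
	process(w, r)
	return w.Code
}

func main() {
	fmt.Println(statusFor("hello=sau+sheong&post=456")) // well-formed body
	fmt.Println(statusFor("hello=%zz"))                 // invalid percent-escape
}
```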

Let’s take a look at the client that is going to call this server. We’ll create a simple, minimal HTML form to send the request to the server. Place the code in a file named client.html.

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Go Web Programming</title>
</head>
<body>
<form action="http://127.0.0.1:8080/process?hello=world&thread=123" method="post" enctype="application/x-www-form-urlencoded">
<input type="text" name="hello" value="sau sheong"/>
<input type="text" name="post" value="456"/>
<input type="submit"/>
</form>
</body>
</html>

In this form we are:

  • Sending a request to the URL http://127.0.0.1:8080/process?hello=world&thread=123 using the POST method
  • Specifying the content type (in the enctype field) to be application/x-www-form-urlencoded
  • Sending 2 HTML form key-value pairs – hello=sau sheong and post=456 to the server

Note that we have 2 values for the key hello. One of them is world in the URL and the other is sau sheong in the HTML form.

Open the client.html file directly in your browser (you don’t need to serve it out from a web server, just running it locally on your browser is fine) and click on the submit button. What you will see on the browser is:

map[thread:[123] hello:[sau sheong world] post:[456]]

This is the Form field of the POST request, printed as a string, after the request has been parsed. Form is a map whose keys are strings and whose values are slices of strings. Notice that a Go map is unordered, so on older Go versions the keys may print in a different order (since Go 1.12, fmt prints maps sorted by key). Either way, what we get is the combination of the query values hello=world and thread=123 and the form values hello=sau sheong and post=456. As you can see, the values are URL decoded (there is a space between sau and sheong).

PostForm

Of course, if you just want the value for the key post, you can use r.Form["post"], which gives you a slice with 1 element: [456]. If the form and the URL have the same key, both values are placed in the slice, with the form value always placed before the URL value.

What if we need just the form key-value pairs and want to totally ignore the URL key-value pairs? For this we have the PostForm, which only provides key-value pairs for the form and not the URL. If we change from using r.Form to using r.PostForm in the code this is what we get:

map[post:[456] hello:[sau sheong]]
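If you’d rather compare the two fields without a browser, the request from client.html can be rebuilt in memory. This is a sketch using net/http/httptest, and parseClientRequest is a made-up helper name.

```go
package main

import (
	"fmt"
	"net/http/httptest"
	"net/url"
	"strings"
)

// parseClientRequest is a hypothetical helper that rebuilds the request
// sent by client.html in memory and returns the parsed Form and PostForm.
func parseClientRequest() (form, postForm url.Values, err error) {
	body := url.Values{"hello": {"sau sheong"}, "post": {"456"}}.Encode()
	r := httptest.NewRequest("POST", "/process?hello=world&thread=123",
		strings.NewReader(body))
	r.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	if err = r.ParseForm(); err != nil {
		return nil, nil, err
	}
	return r.Form, r.PostForm, nil
}

func main() {
	form, postForm, err := parseClientRequest()
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("Form:    ", form)     // body values first, then URL values
	fmt.Println("PostForm:", postForm) // body values only
}
```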

We used application/x-www-form-urlencoded for the content type. What happens if we use multipart/form-data? Make the change to the client HTML form, switch back to using r.Form and let’s find out:

map[hello:[world] thread:[123]]

What happened here? We only get the URL query key-value pairs this time and not the form key-value pairs, because ParseForm only parses bodies encoded as application/x-www-form-urlencoded. To get multipart key-value pairs from the body, we need to use the MultipartForm field.

MultipartForm

Instead of using ParseForm and then reading the Form field on the request, we have to use ParseMultipartForm and then read the MultipartForm field. ParseMultipartForm also calls ParseForm when necessary.

r.ParseMultipartForm(1024)
fmt.Fprintln(w, r.MultipartForm)

The argument tells ParseMultipartForm the maximum number of bytes of the form’s file parts to keep in memory; anything beyond that is stored in temporary files on disk. Now let’s see what happens:

&{map[hello:[sau sheong] post:[456]] map[]}

This time we see the form key-value pairs, but not the URL key-value pairs, because MultipartForm only contains the form key-value pairs. Notice that the returned value is no longer a map, but a pointer to a struct that contains 2 maps. The first map has keys that are strings and values that are slices of strings. The second map also has string keys, but its values are files; it’s empty because our form didn’t upload any.
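To see both maps filled in, here is a sketch that builds a multipart request in memory and adds one small file part. The field name upload, the file name note.txt and both helper names are assumptions made for this example; they do not appear in client.html.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
)

// newMultipartRequest is a hypothetical helper that builds the client.html
// request with enctype multipart/form-data, plus one small file part.
func newMultipartRequest() (*http.Request, error) {
	var buf bytes.Buffer
	mw := multipart.NewWriter(&buf)
	if err := mw.WriteField("hello", "sau sheong"); err != nil {
		return nil, err
	}
	if err := mw.WriteField("post", "456"); err != nil {
		return nil, err
	}
	fw, err := mw.CreateFormFile("upload", "note.txt") // assumed names
	if err != nil {
		return nil, err
	}
	if _, err := fw.Write([]byte("file contents")); err != nil {
		return nil, err
	}
	if err := mw.Close(); err != nil {
		return nil, err
	}
	r := httptest.NewRequest("POST", "/process?hello=world&thread=123", &buf)
	r.Header.Set("Content-Type", mw.FormDataContentType())
	return r, nil
}

// multipartSummary is a hypothetical helper that parses the request and
// reports the text value of "hello" and the name of the uploaded file.
func multipartSummary(r *http.Request) (hello, filename string, err error) {
	if err = r.ParseMultipartForm(1024); err != nil {
		return "", "", err
	}
	if vs := r.MultipartForm.Value["hello"]; len(vs) > 0 {
		hello = vs[0] // first map: text values
	}
	if fhs := r.MultipartForm.File["upload"]; len(fhs) > 0 {
		filename = fhs[0].Filename // second map: uploaded files
	}
	return hello, filename, nil
}

func main() {
	r, err := newMultipartRequest()
	if err != nil {
		fmt.Println("build error:", err)
		return
	}
	hello, filename, err := multipartSummary(r)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(hello, filename)
}
```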

There is one last set of methods that makes accessing the key-value pairs even easier than what we’ve just gone through. The FormValue method gives us access to the key-value pairs just like Form, except that it returns the value for a specific key and we don’t need to call ParseForm or ParseMultipartForm beforehand; FormValue does that for us.

From our previous example, this means if we do this in our handler function:

fmt.Fprintln(w, r.FormValue("hello"))

And we set the client’s form enctype to application/x-www-form-urlencoded, we will get this:

sau sheong

We get only sau sheong because FormValue only retrieves the first value, even though we actually have both values in the Form map. To prove this, let’s add another line below the earlier line of code, like this:

fmt.Fprintln(w, r.FormValue("hello"))
fmt.Fprintln(w, r.Form)

This time we’ll see:

sau sheong
map[post:[456] hello:[sau sheong world] thread:[123]]

The PostFormValue function does the same thing, except that it is for PostForm instead of Form. Let’s make some changes to the code to use the PostFormValue function:

fmt.Fprintln(w, r.PostFormValue("hello"))
fmt.Fprintln(w, r.PostForm)

This time we get this instead:

sau sheong
map[hello:[sau sheong] post:[456]]

As you can see we get only the form key-value pairs.
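The difference between the two is easiest to see key by key. The sketch below rebuilds the URL-encoded client.html request in memory with net/http/httptest; newClientRequest and lookup are hypothetical helper names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newClientRequest is a hypothetical helper that rebuilds the URL-encoded
// request from client.html in memory.
func newClientRequest() *http.Request {
	r := httptest.NewRequest("POST", "/process?hello=world&thread=123",
		strings.NewReader("hello=sau+sheong&post=456"))
	r.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return r
}

// lookup is a hypothetical helper that returns what FormValue and
// PostFormValue report for one key. Neither needs an explicit ParseForm.
func lookup(key string) (formValue, postFormValue string) {
	r := newClientRequest()
	return r.FormValue(key), r.PostFormValue(key)
}

func main() {
	for _, key := range []string{"hello", "post", "thread", "missing"} {
		fv, pfv := lookup(key)
		fmt.Printf("%-8s FormValue=%q PostFormValue=%q\n", key, fv, pfv)
	}
}
```

Both methods return an empty string for a key that is absent, so they cannot tell a missing key from an empty value; use the maps when that matters.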

Both FormValue and PostFormValue call ParseMultipartForm for us, so we don’t need to call it ourselves, but there’s a slightly confusing gotcha to be careful with (at least as of Go 1.4; later Go releases also copy the multipart values into Form and PostForm, so check the behavior of the version you run). If we set the client form’s enctype to multipart/form-data and try to get the value using either FormValue or PostFormValue, we won’t be able to get it even though ParseMultipartForm has been called!

To be clearer, let’s make some changes to the server’s handler function again:

fmt.Fprintln(w, "(1)", r.FormValue("hello"))
fmt.Fprintln(w, "(2)", r.PostFormValue("hello"))
fmt.Fprintln(w, "(3)", r.PostForm)
fmt.Fprintln(w, "(4)", r.MultipartForm)

This is our result from using our form with enctype set to multipart/form-data:

(1) world
(2)
(3) map[]
(4) &{map[hello:[sau sheong] post:[456]] map[]}

The first line in the results gives us the value for hello that’s found in the URL and not the form. The second and third lines tell us why: if we take just the form key-value pairs, we actually get nothing. That’s because FormValue and PostFormValue correspond to Form and PostForm, not MultipartForm. The last line proves that ParseMultipartForm was actually called, which is why we get the data when we access MultipartForm.
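One way to sidestep the gotcha is to read from MultipartForm directly, which works the same regardless of what FormValue returns. This is only a sketch: multipartValue and newMultipartRequest are hypothetical helpers, and the 1024-byte limit is carried over from the earlier example.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
)

// multipartValue is a hypothetical helper that returns the first value for
// key from the multipart body only, or "" if the key is absent.
func multipartValue(r *http.Request, key string) (string, error) {
	// Fails with an error if the request is not multipart/form-data.
	if err := r.ParseMultipartForm(1024); err != nil {
		return "", err
	}
	if vs := r.MultipartForm.Value[key]; len(vs) > 0 {
		return vs[0], nil
	}
	return "", nil
}

// newMultipartRequest is a hypothetical helper that builds the multipart
// version of the client.html request in memory.
func newMultipartRequest() (*http.Request, error) {
	var buf bytes.Buffer
	mw := multipart.NewWriter(&buf)
	if err := mw.WriteField("hello", "sau sheong"); err != nil {
		return nil, err
	}
	if err := mw.WriteField("post", "456"); err != nil {
		return nil, err
	}
	if err := mw.Close(); err != nil {
		return nil, err
	}
	r := httptest.NewRequest("POST", "/process?hello=world&thread=123", &buf)
	r.Header.Set("Content-Type", mw.FormDataContentType())
	return r, nil
}

func main() {
	r, err := newMultipartRequest()
	if err != nil {
		fmt.Println("build error:", err)
		return
	}
	hello, err := multipartValue(r, "hello")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(hello)
}
```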

We covered quite a bit in this blog post so let’s recap how these functions are different, in a nice table.

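Field or method   Call first           URL pairs   Form pairs   URL encoded   Multipart
Form              ParseForm            yes         yes          yes           no
PostForm          ParseForm            no          yes          yes           no
MultipartForm     ParseMultipartForm   no          yes          no            yes
FormValue         (none)               yes         yes          yes           no
PostFormValue     (none)               no          yes          yes           no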

Undoubtedly the naming convention leaves much to be desired!

Go Web Programming

I’ve gone and done it. I’ve started writing another book. And not just any book, but a book on good old honest-to-goodness web programming. If I’m qualified to write about any programming topic that’s probably it. I’ve been doing web application programming so long now that almost the first thing I check out in any new programming language I get to know is their http library. I’ve written web applications in almost every programming language I know, and some I’m not even sure I actually know.

So now this. And on the Go language. We’ll see.

http://www.manning.com/chang/