The Network as the Database

Data Packager’s schema editor uses a simple technique that made the editing interface much easier to implement. The editor’s task is to let the user edit the metadata associated with each column of a tabular data file. The interface shows a preview of the data file and each of its columns. Exactly one column is selected at any time, and the metadata editor form on the left-hand side shows the metadata for that column. Clicking on any other column switches the form to that column’s metadata:

[Screenshot: The schema editor]

It was a requirement that this interface be usable without JavaScript, so each click on one of those columns actually submits the form and reloads the whole page.

The challenge here is that if the user edits the metadata of one column, then goes to another column, then goes back to the first column, the changes they made to the first column should still be there. But those changes haven’t been saved yet: the user hasn’t clicked the Save button, and they could still bail out without saving their edits. When they do click Save, it should save the metadata for all the columns, not just the one that’s currently showing.

At first glance this looks like a pain to implement. We would need to keep a draft version of the metadata in the database and keep updating it whenever the user changes columns. When they finally hit Save we would need to overwrite the real version of the data with the draft. That means lots of complicated logic in the view controller, and we could end up with abandoned drafts lying around in the database.

This is where the network as the database comes in: instead of storing the draft data in the database server-side, you store it in the HTML form itself. When you’re looking at the schema editor form above, showing the form fields for the selected column, the form actually contains hidden form fields for all the unselected columns as well. Each time you change columns you submit the form. The server captures your submitted form data and reflects it back to you in hidden form fields when it re-renders the form with the new column selected.
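To make the hidden-field idea concrete, here is a minimal sketch of the rendering side. It is not Data Packager’s actual code: the `render_form` function, the `columns-<index>-<key>` field naming and the `go-to-column` and `save` button names are all made up for illustration.

```python
from html import escape


def render_form(columns, selected):
    """Render every column's metadata into a single HTML form.

    `columns` is a list of dicts (an assumed internal representation) and
    `selected` is the index of the column whose fields should be visible.
    """
    parts = ['<form method="post">']
    for index, column in enumerate(columns):
        for key, value in column.items():
            name = f"columns-{index}-{key}"
            # Visible inputs for the selected column, hidden ones for the rest.
            input_type = "text" if index == selected else "hidden"
            parts.append(
                f'<input type="{input_type}" name="{escape(name)}" '
                f'value="{escape(str(value))}">'
            )
        # Clicking a column submits the whole form and tells the server
        # which column to show next.
        parts.append(
            f'<button type="submit" name="go-to-column" value="{index}">'
            f'{escape(str(column.get("name", index)))}</button>'
        )
    parts.append('<button type="submit" name="save" value="1">Save</button>')
    parts.append("</form>")
    return "\n".join(parts)
```

Because every column’s values are in the form, each submission carries the complete draft back to the server, whichever column happens to be showing.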

Server-side, you just need three simple functions:

1. Take a convenient, internal representation of all the data file’s columns and their metadata (from the database) and render it as an HTML form.
2. When the user clicks on a column, take the HTML form submitted by the user and parse it, turning the submitted form data back into its convenient internal representation.
3. Take that internal representation and actually save it to the database.

You just keep repeating steps 1 and 2, and finally do step 3 when the user hits Save. Nothing is written to the database until step 3 happens.
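The round trip might look something like the sketch below. It makes several assumptions: the "form" is just a flat dict of field names to values (standing in for the rendered HTML and the parsed POST body), the database is a plain dict, and the field naming scheme and all function names are hypothetical.

```python
import re

# Assumed naming scheme for column fields: columns-<index>-<key>
FIELD_NAME = re.compile(r"^columns-(\d+)-(\w+)$")


def to_form_fields(columns):
    """Step 1, flattened: internal representation -> form field values."""
    return {
        f"columns-{index}-{key}": str(value)
        for index, column in enumerate(columns)
        for key, value in column.items()
    }


def parse_form(form):
    """Step 2: submitted form data -> internal representation."""
    by_index = {}
    for name, value in form.items():
        match = FIELD_NAME.match(name)
        if match:  # skip buttons and any other non-column fields
            index, key = int(match.group(1)), match.group(2)
            by_index.setdefault(index, {})[key] = value
    return [by_index[i] for i in sorted(by_index)]


def save(database, columns):
    """Step 3: the only place anything is written."""
    database["columns"] = columns


def handle_post(database, form):
    """Handle one form submission and return the fields for the next form."""
    columns = parse_form(form)
    if "save" in form:
        save(database, columns)
    # Otherwise nothing is stored: the draft travels back in the next form.
    return to_form_fields(columns)
```

A column change and a Save go through the same handler. The only difference is whether the `save` field is present in the submission.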

There are a few more details, but not much. In step 1 you need a little logic to decide which column to render as the selected column each time, and you need to render the metadata for all the other columns as hidden form fields. In 2 you have to validate the submitted form data and send any warnings or validation errors to 1 to be shown to the user.

Validation errors present a user interaction problem. Any errors are shown next to the form fields themselves. When the user hits Save, the server may find problems with the metadata fields of any column, but the interface only shows the fields for the selected column. How should errors in the other columns be presented to the user?

[Screenshot: A validation error]

The solution we went with was to validate the data whenever the user changes columns as well as when they hit Save, and not to let them change to a new column as long as the current column has any validation errors. The page re-renders still showing the same column, with the error messages in the form.
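A rough sketch of that behaviour, under assumptions: the single validation rule (a non-empty `title`) is a placeholder rather than the editor’s real rules, and `validate` and `next_view` are illustrative names.

```python
def validate(columns):
    """Return {column_index: {field: message}} for columns with problems.

    The one rule here (a non-empty title) is a placeholder.
    """
    errors = {}
    for index, column in enumerate(columns):
        if not column.get("title", "").strip():
            errors[index] = {"title": "A title is required."}
    return errors


def next_view(columns, current, requested):
    """Decide which column to show after the user clicks on `requested`.

    Returns the index to render and the error messages to show with it.
    """
    errors = validate(columns)
    if current in errors:
        # Stay put and show the messages next to the offending fields.
        return current, errors[current]
    return requested, {}
```

The user can only leave a column once it validates, so by the time they hit Save every column they have visited should already be clean.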

What the editor actually does when there are validation errors is render the form showing the first column that has any. If somehow there are errors in multiple columns this will work just fine, although the user will have to submit the form multiple times to correct each column. But because the data is validated on each column change, that should never happen anyway. The logic for deciding which column to render as selected is:

1. Choose the first column that has any validation errors, or
2. Choose the column that the user just clicked on, or
3. Choose the first column (this happens the first time the form is rendered)
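Those three rules fit in one small function. This is a sketch, assuming `errors` is a dict keyed by column index that only contains columns with problems, and that `clicked` is `None` on the first render. Both names are illustrative.

```python
def choose_selected_column(errors, clicked=None):
    """Pick the index of the column to render as selected.

    `errors` maps column index -> error messages (only columns that have
    errors appear in it). `clicked` is the index of the column the user
    just clicked on, or None the first time the form is rendered.
    """
    if errors:
        return min(errors)  # 1. the first column with validation errors
    if clicked is not None:
        return clicked      # 2. the column the user asked for
    return 0                # 3. first render: start at the first column
```

Checking `clicked is not None` rather than just `clicked` matters here, because the first column has index 0, which is falsy in Python.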

I think the network as the database technique would be a nice one to use for any multi-form editing interface, whether it involves jumping between different “pages” like the schema editor, or going through multiple sequential steps (maybe with the option to jump back to previous steps), etc. It does have the disadvantage that if the user’s browser crashes or if they accidentally close the tab, they may lose their work in progress if their browser doesn’t save it for them. Nothing is saved server-side until the final step. But it’s a simple technique that makes it easy to write pleasing interfaces without many logic bugs or edge cases.

Googling “the network as the database” brings up an interesting blog post from Armin Ronacher about using similar techniques to store sensitive things like access tokens.