Table of Contents
Okay, perhaps the title is a slight exaggeration. But because of CouchDB's ease of use, 10¾ pages (the original length of this article in PDF format, formatting JSON code increased the number) gets you a long way. You'll learn how to create a database, bulk load data into it, reconfigure the server for remote access, replicate to an online server, create a view and run the server as a daemon. You'll learn how to do this not just from the web UI but also from the command line. That's pretty compleat, especially in 10¾ pages.
But to answer the question posed by the title of this section: CouchDB is a NoSQL database, a project of the Apache software foundation. It's easy to install and comes with its own administration client called "Futon". You can interface with it from the command line through its REST API, by running the CouchDB application or through your favourite browser. If you use the binaries provided by CouchOne you can easily install CouchDB on Mac OS X, Ubuntu or Windows. This means you can quickly check it out without having to mess around with failed dependencies or compiling from source.
This article deals with CouchDB on Mac OS X. However, apart from file paths and the section on running CouchDB as a service, this article only requires an operating system that supports command line use of curl. Using Futon is a quick way to become familiar with CouchDB and using curl commands to access the REST API is a good, language-neutral way to get a feel for how you can access CouchDB from your preferred programming language.
To download and install CouchDB you just need to point your browser at the url, CouchOne. Download and unzip the file and install the application as you would any other.
Start up CouchDB and you'll see something similar to the following:
The application opens on the Overview page. From this page you can view all databases and navigate to a specific database by clicking the database name.
On the right sidebar are links to other Futon pages. We'll mostly be concerned with the Overview, Configuration and Replicator links.
Using Futon is an easy way to perform tasks such as:
Creating a database – perform this task from the Overview page by clicking the Create Database link.
Entering a record – Once you've created a database, click it and then choose the New Document option.
Viewing records – Clicking a database lists all its records. Clicking a record displays its details.
Editing the configuration – Choose Configuration from the sidebar and view the configuration options
Replicating a database – Choose Replicator from the sidebar and follow the instructions.
Creating a view – Select a database from the Overview page and then Design Documents from the View drop-down list box. You'll probably want to read on before making any changes here.
Futon is particularly good for some tasks and not quite so good for others—you may already have a sense for which. What it's very good for is giving you a feel for CouchDB. Play with it.
This section shows you how to use curl to interface with the REST API. It can serve as a quick reference for basic syntax.
One of the first things you'll want to do is check that you can
communicate with the server. You can do this by sending a curl GET to
localhost on the default port for CouchDB (the -X GET
is optional):
shell> curl -X GET http://127.0.0.1:5984 { "couchdb": "Welcome", "version": "1.0.2" }
The JSON response shown above indicates a successful connection to the server.
In this article, all curl commands are followed by the output that is returned. In some cases this output is formatted for easier reading.
Creating a database is as simple as the following PUT:
shell> curl -X PUT http://127.0.0.1:5984/test_db { "ok":true }
Inspect your newly created database in the following way:
shell> curl http://127.0.0.1:5984/test_db { "update_seq": 0, "disk_size": 79, "purge_seq": 0, "doc_count": 0, "compact_running": false, "db_name": "test_db", "doc_del_count": 0, "instance_start_time": "1298762620491062", "committed_update_seq": 0, "disk_format_version": 5 }
Once you've created a database, you'll want to add a record. There are no tables within a CouchDB database—it's not a relational database—so add a record directly to the database by issuing the following command:
shell> curl -X POST http://127.0.0.1:5984/test_db/ \ -H 'Content-Type: application/json' \ -d '{"name":"blue jay", "location":"Malton"}' { "rev": "1-3eab48829ff6a6882f8d049456fa9e21", "ok": true, "id": "e1595400fcee306e82219bb0d400068c" }
By using POST
you create a new document
with a server-generated ID. Notice the Universal Unique Identifier
(UUID) in the message returned above. The id field of a record is
similar to the autoincrement field used by other databases. However,
since there are no tables in a CouchDB database, it makes more sense
to use UUIDs rather than incremented integers.
You can use PUT
to create a document
when you do not want an autogenerated ID. Do this by appending the
desired ID to the database URI as shown below:
shell> curl -X PUT http://127.0.0.1:5984/test_db/fruit \ -H 'Content-Type: application/json' \ -d '{"name":"granny smith", "type":"sour"}' { "rev": "1-a9a02531ffa21cf0da0f9e49658ef642", "ok": true, "id": "granny_smith" }
Notice that the id is specified as part of the URI rather than being autogenerated.
You can also create an id by including it in the JSON object as the
value associated with the special _id
field.
To view all the records in a database, use the following syntax:
shell> curl -X GET http://127.0.0.1:5984/test_db/_all_docs { "rows": [ { "value": { "rev": "1-e49ecdd681345e490f1061ecd54d06dc" }, "id": "820510f01e98a2a20dcffdb8f0000052", "key": "820510f01e98a2a20dcffdb8f0000052" }, { "value": { "rev": "1-c7410567a14b274b7b931674520082de" }, "id": "fruit", "key": "fruit" } ], "total_rows": 2, "offset": 0 }
Look at the output of the id
field and
you can readily see the difference between an autogenerated id field
and one that has been specified.
Using the REST API test_db/_all_docs
as
shown above doesn't retrieve data in the manner of a SELECT
statement such as SELECT * FROM
. To make a "wildcard"
selection from a database add the tblname
;include_docs=true
query parameter:
shell> curl -X GET http://127.0.0.1:5984/test_db/_all_docs?include_docs=true { "rows": [ { "value": { "rev": "1-e49ecdd681345e490f1061ecd54d06dc" }, "id": "820510f01e98a2a20dcffdb8f0000052", "key": "820510f01e98a2a20dcffdb8f0000052", "doc": { "_rev": "1-e49ecdd681345e490f1061ecd54d06dc", "_id": "820510f01e98a2a20dcffdb8f0000052", "name": "blue jay", "location": "Malton" } }, { "value": { "rev": "1-c7410567a14b274b7b931674520082de" }, "id": "fruit", "key": "fruit", "doc": { "_rev": "1-c7410567a14b274b7b931674520082de", "_id": "fruit", "type": "sour", "name": "granny smith" } } ], "total_rows": 2, "offset": 0 }
It's easy to add single records to a database using Futon but that's not the ideal way to add records to a database especially if you have existing data that you would like to bulk load. You can't bulk load data from the web UI. You must use the REST API directly.
You can bulk load data using the bulk document API which simply
requires that you append _bulk_docs
to
the base database URI.
To test use of this method, create a file named mydata.json
in the following format:
{ "docs": [ { "url": "example.com", "password": "secret", "type": "personal", "user": "fred" }, { "url": "other.example.com", "password": "super_secret", "type": "personal", "user": "admin" } ] }
Using the same test_db
database, try
bulk loading from the command line by issuing the following curl
command:
shell> curl -X POST http://127.0.0.1:5984/test_db/_bulk_docs \ -H "Content-type: application/json" -d @mydata.json
When using curl with
the -d
option you must precede the data
file name with the ’@
‘ character.
The data file does not contain an _id
field so UUIDs will be generated for the records in the
mydata.json
file.
You may well want to export data from another database for bulk loading into CouchDB. This section describes how to export data from MySQL—a quick way to get up and running.
You can export data from a MySQL database in JSON format using a SQL statement such as the following:
SELECT CONCAT("{\"field1_name\":\"",field1,"\",\"field2_name\":\"",field2, "\", \"field3\":\"",field3,"\",\"field4\":\"",field4,"\"},") FROM tablename INTO OUTFILE '/var/tmp/myout.json';
If any of your data fields contain quotation marks they will need to be escaped.
Given a table with the column names, url, type,
username
and password
the output
will look something like this:
{"url":"http://localhost","type":"personal","username":"peter","password":"secret"}, {"url":"http://example.com","type":"work","username":"root","password":"secret}, ...
To load this data file as described in the section called
“Importing Data”, you'll need to adjust the output file and place
its contents inside {"docs": [
. As
you've already seen, the bulk load REST API call is as follows:
file_contents_here
]}
shell> curl -X POST http://127.0.0.1:5984/db_name
/_bulk_docs \
-H "Content-type: application/json" -d @myout.json
Remember that a unique ID will be created if you do not include one.
If your existing database table contains a unique ID that you would
like to use in CouchDB, export data using the field name _id
for that unique ID field.
You can nest a GROUP_CONCAT
statement
inside a CONCAT
statement to create a
complete file that doesn't need to be adjusted manually but data
will be truncated unless you change the MySQL group_concat_max_len
option from its default,
1024
. However, this technique only
makes sense for small data sets.
Easy replication is one of the attractive features of CouchDB and once you have some data in a database you'll want to try replicating that data.
By default, CouchDB is configured for local access only so let's set up CouchDB for remote access because replicating to or from a remote database is much more fun.
If you want to replicate to a server other than one running on your
local machine, you'll have to change the default CouchDB
configuration. In the default configuration the bind_address
option is set to 127.0.0.1
. This setting limits REST requests to
the local machine. If you change the bind_address
option to 0.0.0.0
, then you'll be able to access CouchDB
remotely.
You can easily modify this option through Futon. Click the
Configuration
tool on the right
sidebar and scroll down until you find the httpd
section. Double click the value opposite
bind_address
and replace it with
0.0.0.0
. Once you've made this change
you can access CouchDB using the IP address or hostname of the
machine hosting CouchDB. You can also still use 127.0.0.1
or localhost
to access it locally. Note: Restart
CouchDB after making this change.
To change the configuration from the command line open /Applications/CouchDBX.app/Contents/Resources/couchdbx-core/couchdb_1.0.2/etc/couchdb/local.ini
in your favourite text editor, locate the [httpd]
section and adjust the value of the
bind_address
option.
Once you allow remote access it may be time to end the "Admin
Party". With Futon open, click the Fix
this link on the lower right to add a username and
password. You can also do this by altering the local.ini
file.
With remote access enabled you can now replicate a database across your LAN. You will, of course, need to have CouchDB running on another machine on your network. Install CouchDB elsewhere on your network. You can use the web UI to perform the replication. The following command shows how to do this from the command line:
shell> curl -X POST http://127.0.0.1:5984/_replicate \ -d '{"source":"test_db", "target":"http://192.168.0.196:5984/test_db"}' \ -H "Content-type: application/json" { "ok": true, "source_last_seq": 103, "session_id": "c5fde0fc0905238800cc89f03f1c2fb8", "history": [ { "missing_found": 99, "end_last_seq": 103, "docs_written": 99, "start_time": "Sun, 27 Feb 2011 14:55:06 GMT", "recorded_seq": 103, "docs_read": 99, "session_id": "c5fde0fc0905238800cc89f03f1c2fb8", "start_last_seq": 0, "end_time": "Sun, 27 Feb 2011 14:55:06 GMT", "doc_write_failures": 0, "missing_checked": 0 } ] }
Add "continuous":true
to the JSON
object if you want the database to update continuously.
Continuous updates work unless the server stops. You also might want to read up about push versus pull replication (http://wiki.apache.org/couchdb/Replication).
What's even more fun than replicating a database locally is replicating to an online database. You can do this for no charge by signing up for CouchOne hosting at http://www.couchone.com/get. Do this by clicking the Hosting via CouchOne link on the right sidebar.
From the command line you can replicate a local database to a CouchOne online database in the following way:
shell> curl -X POST http://127.0.0.1:5984/_replicate \ -d '{"source":"test_db", "target":"http://admin:password@yourUI.couchone.com/test_db"}' \ -H "Content-type: application/json" { "ok": true, "source_last_seq": 1, "session_id": "21dcf771f6c780bec0223f4dd6feeb55", "history": [ { "missing_found": 1, "end_last_seq": 1, "docs_written": 1, "start_time": "Sun, 27 Feb 2011 21:14:01 GMT", "recorded_seq": 1, "docs_read": 1, "session_id": "21dcf771f6c780bec0223f4dd6feeb55", "start_last_seq": 0, "end_time": "Sun, 27 Feb 2011 21:14:01 GMT", "doc_write_failures": 0, "missing_checked": 0 } ] }
The preceding curl POST assumes that a database named
birds
already exists locally and
online. If the online database does not yet exist, you can create
it during replication by adding "create_target":true
to the JSON object. Also,
the online database requires that a username and password be
included as part of the URI. Replace admin:password@yourUI
with appropriate vales.
Use your browser to navigate to the CouchOne web interface for your online database and view the replicated database. That was easy. No master, no slave. Just democratic, peer-to-peer replication.
If CouchDB is a NoSQL database with no tables, how do you look at
your data? Views substitute for the SELECT
statements used in a Relational Database
Management system (RDBMS). You can create temporary or permanent
views.
From the command line create a temporary view in the following way:
shell> curl -X POST http://127.0.0.1:5984/test_db/_temp_view \ -d '{"map":"function(doc) { emit(null, doc);}"}' -H "Content-type: application/json" { "rows": [ { "value": { "_rev": "1-e49ecdd681345e490f1061ecd54d06dc", "_id": "820510f01e98a2a20dcffdb8f0000052", "name": "blue jay", "location": "Malton" }, "id": "820510f01e98a2a20dcffdb8f0000052", "key": null }, { "value": { "_rev": "1-3eab48829ff6a6882f8d049456fa9e21", "_id": "e1595400fcee306e82219bb0d4000b87", "type": "sweet", "name": "myapple" }, "id": "e1595400fcee306e82219bb0d4000b87", "key": null }, { "value": { "_rev": "1-c7410567a14b274b7b931674520082de", "_id": "fruit", "type": "sour", "name": "granny smith" }, "id": "fruit", "key": null }, { "value": { "_rev": "1-e49ecdd681345e490f1061ecd54d06dc", "_id": "whatever", "name": "blue jay", "location": "Malton" }, "id": "whatever", "key": null } ], "total_rows": 4, "offset": 0 }
If your syntax is incorrect, as it can quite easily be with even simple views, you might see a message such as:
{"error":"bad_content_type","reason":"Content-Type must be application/json"}
If you see such a message, check quotation marks and open and closing braces.
Whereas bulk-loading is a task for the REST API, creating views is
most easily done from the web UI. With Futon open, select a database
and choose Temporary view
from the
View
list box. You should see something
like the following:
The View Code box shows a map function
written in JavaScript that will be invoked against every record in
the database. The doc
parameter passed
in is a single document in the database and emit
is a built-in function that takes a key and a
value argument. Consequently, this default map function outputs each
document in the current database. Add the line if(doc._id=="fruit")
before emit(null, doc);
and, if you entered data as described
in the section called “The REST API
from the Command Line”, you will see the record with the id field
fruit
when this view is run.
You can turn temporary views into permanent views by clicking the
test_db
database with the design
document name id
and the view name
fruit
, you can invoke it from the
command line like so:
shell> curl -X GET http://127.0.0.1:5984/test_db/_design/id/_view/fruit?limit= 11&descending=true [1] 3534 { "rows": [ { "value": { "_rev": "1-c7410567a14b274b7b931674520082de", "_id": "fruit", "type": "sour", "name": "granny smith" }, "id": "fruit", "key": null } ], "total_rows": 1, "offset": 0 }
Once you've got things up and running and are familiar with CouchDB,
you'll want to run it in the background as a service. On Mac OS X
this means using launchctl with a plist file. There
is an existing plist file called org.apache.couchdb.plist
and it's found in the
/Applications/CouchDBX.app/Contents/Resources/couchdbx-core/couchdb_1.0.2/Library/LaunchDaemons
directory. Copy this file to your home directory so that you can
easily make changes to it. The org.apache.couchdb.plist
file is reproduced below:
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>Label</key> <string>org.apache.couchdb</string> <key>EnvironmentVariables</key> <dict> <key>HOME</key> <string>~</string> <key>DYLD_LIBRARY_PATH</key> <string>/opt/local/lib:$DYLD_LIBRARY_PATH</string> </dict> <key>ProgramArguments</key> <array> <string>/Users/jan/usr/src/couchdbx-core/dist/couchdb_1.0.2/bin/couchdb</string> </array> <key>UserName</key> <string>couchdb</string> <key>StandardOutPath</key> <string>/dev/null</string> <key>StandardErrorPath</key> <string>/dev/null</string> <key>RunAtLoad</key> <true/> <key>KeepAlive</key> <true/> </dict> </plist>
You can't use this file "as is". Make the following changes:
The EnvironmentVariables
key is
not required so remove that key and its dictionary.
Locate the couchdb start-up script on your computer. It should
be in the /Applications/CouchDBX.app/Contents/Resources/couchdbx-core/couchdb_1.0.2/bin
directory. Change the string in the ProgramArguments
key array to this value and
also change the UserName
key
string to your system username.
One final change is required because the couchdb script expects to be
run from the /Applications/CouchDBX.app/Contents/Resources/couchdbx-core
directory. Add a WorkingDirectory
key immediately followed by <string>/Applications/CouchDBX.app/Contents/Resources/couchdbx-core</string>
.
The final result should look like the following, with your username
replacing your_username
.
<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>Label</key> <string>org.apache.couchdb</string> <key>ProgramArguments</key> <array> <string>/Applications/CouchDBX.app/Contents/Resources/couchdbx-core/couchdb_1.0.2/bin/couchdb</string> </array> <key>UserName</key> <string>your_username</string> <key>StandardOutPath</key> <string>/dev/null</string> <key>StandardErrorPath</key> <string>/dev/null</string> <key>RunAtLoad</key> <true/> <key>WorkingDirectory</key> <string>/Applications/CouchDBX.app/Contents/Resources/couchdbx-core</string> <key>KeepAlive</key> <true/> </dict> </plist>
Once you've made changes to the org.apache.couchdb.plist
file, shut down the
CouchDB application if it is running, and copy the newly created
plist file to the /Library/LaunchDaemons/
directory.
shell> sudo cp ~/org.apache.couchdb.plist \ /Library/LaunchDaemons/org.apache.couchdb.plist
Start up the daemon in the following way:
shell> sudo launchctl load /Library/LaunchDaemons/org.apache.couchdb.plist
You can confirm that CouchDB is running by pointing your browser at
http://127.0.0.1:5984/_utils/index.html
.
You can do everything from a browser that you can do when running the
CouchDB application—except shut down the server. If you want to stop
the server use the command:
shell> sudo launchctl unload /Library/LaunchDaemons/org.apache.couchdb.plist
If there are errors in your plist file, there will be no notification at the command line. If CouchDB does not start up properly you can open the Console App and check the messages. If you wish to relaunch CouchDB, you must first unload it as shown above.
A surprising number of topics can be covered in a minimal number of pages but lots more could be said, especially about the following topics:
Views
CouchDB on Android
How CouchDB handles revisions
Those topics will be covered in "The More Compleat Guide to CouchDB" coming sometime soon.
http://wiki.apache.org/couchdb/ – the CouchDB project website
http://www.couchone.com/get – the CouchOne website
http://guide.couchdb.org/ – "CouchDB: The Definitive Guide", a free online book from O'Reilly
http://www.ibm.com/developerworks/opensource/library/os-couchdb/index.html – "Exploring CouchDB", an article at IBM developerWorks
Peter Lavin has been published in a number of print and online magazines. He is also the author of Object Oriented PHP, published by No Starch Press and a contributor to PHP Hacks by O'Reilly Media.
Being a full-time technical writer, Peter occasionally feels the need to depart from the restrained style typical of his profession by writing articles with code snippets that use background colours other than grey.