Cookies

HTTP cookies are key/value pairs that a client stores on behalf of a
server. They are sent back in subsequent requests to
allow clients to keep state between requests. Remember that the HTTP protocol
itself has no state; instead, the client has to resend all data in subsequent
requests that it wants the server to be aware of.

Cookies are set by the server with the Set-Cookie: header. With each
cookie the server sends a set of extra properties that need to match for the
client to send the cookie back, such as domain name and path and, perhaps most
importantly, how long the cookie should live.
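
As an illustration, a Set-Cookie: header carrying such properties could look
like the hypothetical one below. The cookie name, value, and attribute values
are invented; the sketch only splits the header into its parts with standard
shell tools to show where each property lives.

```shell
# A hypothetical Set-Cookie header; the name, value, and attributes are invented.
header='Set-Cookie: session=abc123; Domain=example.com; Path=/; Max-Age=3600'

# Drop the header name, then put each ";"-separated part on its own line.
parts=$(printf '%s\n' "${header#Set-Cookie: }" | tr ';' '\n' | sed 's/^ *//')

# The first part is the name=value pair; the rest are the matching properties.
cookie=$(printf '%s\n' "$parts" | head -n 1)
attributes=$(printf '%s\n' "$parts" | tail -n +2)

printf 'cookie:     %s\n' "$cookie"
printf '%s\n' "$attributes" | while IFS= read -r attr; do
  printf 'attribute:  %s\n' "$attr"
done
```

Here Domain and Path restrict where the cookie is sent back, and Max-Age gives
its lifetime in seconds.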

The expiry of a cookie is either set to a fixed time in the future (or to live
a number of seconds) or it gets no expiry at all. A cookie without an expiry
time is called a “session cookie” and is meant to live for the duration of the
“session” but not longer. A session in this respect is typically thought to be
the lifetime of the browser used to view a site. When you close the
browser, you end your session. Doing HTTP operations with a command-line client
that supports cookies raises the question of when a session really ends…

Since curl does only the bare minimum unless you tell it otherwise, it does
not acknowledge cookies by default. You need to switch on
“the cookie engine” to make curl keep track of cookies it receives and then
subsequently send them out on requests that match those cookies.

You enable the cookie engine by asking curl to read or write cookies. If you
tell curl to read cookies from a non-existing file, you will only switch on
the engine but start off with an empty internal cookie store:

  curl -b non-existing http://example.com

But just switching on the cookie engine, getting a single resource and then
quitting would be pointless, as curl would have no chance to actually send any
cookies it received. Assuming the site in this example sets cookies and
then redirects, we would use:

  curl -L -b non-existing http://example.com

Reading cookies from file

Starting off with a blank cookie store may not be desirable. Why not start off
with cookies you stored in a previous fetch or that you otherwise acquired?
The file format curl uses for cookies is called the Netscape cookie format
because it was once the file format used by browsers, which made it easy to
tell curl to use the browser’s cookies.
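
For orientation, each line of a Netscape-format cookie file holds seven
tab-separated fields: domain, a flag for whether subdomains match, path, a
secure-only flag, expiry as a Unix timestamp, name, and value. The sketch
below writes a hypothetical single-cookie file (the cookie contents and the
timestamp are invented) and reads it back with awk.

```shell
# Write a hypothetical cookie file; the seven fields are separated by tabs.
cookie_file=$(mktemp)
{
  printf '# Netscape HTTP Cookie File\n'
  printf 'example.com\tFALSE\t/\tFALSE\t2145916800\ttheme\tdark\n'
} > "$cookie_file"

# Print name=value and the path for every non-comment line with seven fields.
summary=$(awk -F '\t' '!/^#/ && NF == 7 { print $6 "=" $7 " (path " $3 ")" }' "$cookie_file")
printf '%s\n' "$summary"
rm -f "$cookie_file"
```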

As a convenience, curl also supports a cookie file being a set of HTTP
headers that set cookies. It’s an inferior format but may be the only thing
you have.

Tell curl which file to read the initial cookies from:

  curl -L -b cookies.txt http://example.com

Remember that this only reads from the file. If the server updates the
cookies in its response, curl updates them in its in-memory store
but throws them all away when it exits, so a subsequent invocation
with the same input file uses the original cookie contents again.

Writing cookies to file

The place where cookies are stored is sometimes referred to as the “cookie
jar”. When you enable the cookie engine in curl and it has received cookies,
you can instruct curl to write down all its known cookies to a file, the
cookie jar, before it exits. It is important to remember that curl only
updates the output cookie jar on exit and not during its lifetime, no matter
how long the handling of the given input takes.

You specify the cookie jar output file with -c:

  curl -c cookie-jar.txt http://example.com

-c is the instruction to write cookies to a file, -b is the instruction
to read cookies from a file. Oftentimes you want both.
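
A common pattern is to name the same file for both options, so that cookies
persist from one invocation to the next. A minimal sketch, where the function
name fetch_with_cookies and its arguments are hypothetical:

```shell
# Hypothetical helper: read cookies from a jar before the request and write
# the updated set back to the same file when curl exits.
fetch_with_cookies() {
  jar=$1
  url=$2
  curl -sS -L -b "$jar" -c "$jar" "$url"
}
```

Running the following twice would, on the second run, send whatever cookies
the first run saved:

  fetch_with_cookies cookies.txt http://example.com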

When curl writes cookies to this file, it will save all known cookies
including those that are session cookies (without a given lifetime). curl
itself has no notion of a session and it doesn’t know when a session ends so
it will not flush session cookies unless you tell it to.

Rather than having you tell curl when a session ends, curl features an option
that lets you decide when a new session begins. Using it flushes the session
cookies, which basically signals to the server that we are starting a new
session.

A new cookie session means that all the session cookies will be thrown
away. It is the equivalent of closing a browser and starting it up again.

Tell curl a new cookie session starts by using -j, --junk-session-cookies:

  curl -j -b cookies.txt http://example.com/
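
To see what would be discarded, it can help to list the session cookies in a
jar first. The sketch below assumes that session cookies are stored with an
expiry field of 0 in the cookie file, and it uses a hypothetical jar with
invented cookies. It only inspects the file and is not a description of how
curl itself implements -j.

```shell
# Build a hypothetical jar: one persistent cookie and one session cookie.
jar=$(mktemp)
{
  printf '# Netscape HTTP Cookie File\n'
  printf 'example.com\tFALSE\t/\tFALSE\t2145916800\ttheme\tdark\n'
  printf 'example.com\tFALSE\t/\tFALSE\t0\tsid\tabc123\n'
} > "$jar"

# Assumption: an expiry field (column 5) of 0 marks a session cookie.
session_names=$(awk -F '\t' '!/^#/ && NF == 7 && $5 == 0 { print $6 }' "$jar")
printf 'session cookies: %s\n' "$session_names"
rm -f "$jar"
```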