# Echo server
Let’s practice doing some asynchronous I/O. We will be writing an echo server.
The echo server binds a `TcpListener` and accepts inbound connections in a loop. For each inbound connection, data is read from the socket and written immediately back to the socket. The client sends data to the server and receives the exact same data back.
We will implement the echo server twice, using slightly different strategies.
## Using `io::copy()`
To start, we will implement the echo logic using the `io::copy` utility.
This is a TCP server and needs an accept loop. A new task is spawned to process each accepted socket.
```rust
use tokio::io;
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6142").await?;

    loop {
        let (mut socket, _) = listener.accept().await?;

        tokio::spawn(async move {
            // Copy data here
        });
    }
}
```
As seen earlier, this utility function takes a reader and a writer and copies data from one to the other. However, we only have a single `TcpStream`. This single value implements both `AsyncRead` and `AsyncWrite`. Because `io::copy` requires `&mut` for both the reader and the writer, the socket cannot be used for both arguments.
```rust
// This fails to compile
io::copy(&mut socket, &mut socket).await
```
### Splitting a reader + writer
To work around this problem, we must split the socket into a reader handle and a writer handle. The best way to split a reader/writer combo depends on the specific type.

Any reader + writer type can be split using the `io::split` utility. This function takes a single value and returns separate reader and writer handles. These two handles can be used independently, including from separate tasks.
For example, the echo client could handle concurrent reads and writes like this:
```rust
use tokio::io::{self, AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

#[tokio::main]
async fn main() -> io::Result<()> {
    let socket = TcpStream::connect("127.0.0.1:6142").await?;
    let (mut rd, mut wr) = io::split(socket);

    // Write data in the background
    tokio::spawn(async move {
        wr.write_all(b"hello\r\n").await?;
        wr.write_all(b"world\r\n").await?;

        // Sometimes, the rust type inferencer needs
        // a little help
        Ok::<_, io::Error>(())
    });

    let mut buf = vec![0; 128];

    loop {
        let n = rd.read(&mut buf).await?;

        if n == 0 {
            break;
        }

        println!("GOT {:?}", &buf[..n]);
    }

    Ok(())
}
```
Because `io::split` supports any value that implements `AsyncRead + AsyncWrite` and returns independent handles, internally `io::split` uses an `Arc` and a `Mutex`. This overhead can be avoided with `TcpStream`. `TcpStream` offers two specialized split functions.
`TcpStream::split` takes a reference to the stream and returns a reader and writer handle. Because a reference is used, both handles must stay on the same task that `split()` was called from. This specialized `split` is zero-cost. There is no `Arc` or `Mutex` needed. `TcpStream` also provides `into_split` which supports handles that can move across tasks at the cost of only an `Arc`.
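For example, here is a hedged sketch of `into_split` in use: the owned halves are `'static`, so each can move onto its own task, which the reference-based `split` cannot do. The address and payload are illustrative, not from the tutorial's linked code.

```rust
use tokio::io::{self, AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

#[tokio::main]
async fn main() -> io::Result<()> {
    let socket = TcpStream::connect("127.0.0.1:6142").await?;

    // `into_split` consumes the stream and returns owned halves,
    // so each half can be moved into a separate spawned task.
    let (mut rd, mut wr) = socket.into_split();

    let write_task = tokio::spawn(async move {
        wr.write_all(b"ping\r\n").await
    });

    let read_task = tokio::spawn(async move {
        let mut buf = vec![0; 128];
        let n = rd.read(&mut buf).await?;
        println!("GOT {:?}", &buf[..n]);
        Ok::<_, io::Error>(())
    });

    write_task.await??;
    read_task.await??;
    Ok(())
}
```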
Because `io::copy()` is called on the same task that owns the `TcpStream`, we can use `TcpStream::split`. The task that processes the echo logic becomes:
```rust
tokio::spawn(async move {
    let (mut rd, mut wr) = socket.split();

    if io::copy(&mut rd, &mut wr).await.is_err() {
        eprintln!("failed to copy");
    }
});
```
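Assembled from the accept loop and the echo task above, the complete `io::copy`-based server looks something like the following sketch (the version behind the link may differ in minor details):

```rust
use tokio::io;
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6142").await?;

    loop {
        let (mut socket, _) = listener.accept().await?;

        tokio::spawn(async move {
            // Split the socket so `io::copy` can borrow a reader
            // and a writer at the same time.
            let (mut rd, mut wr) = socket.split();

            if io::copy(&mut rd, &mut wr).await.is_err() {
                eprintln!("failed to copy");
            }
        });
    }
}
```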
You can find the entire code here.
## Manual copying
Now let's look at how we would write the echo server by copying the data manually. To do this, we use `AsyncReadExt::read` and `AsyncWriteExt::write_all`.
The full echo server is as follows:
```rust
use tokio::io::{self, AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6142").await?;

    loop {
        let (mut socket, _) = listener.accept().await?;

        tokio::spawn(async move {
            let mut buf = vec![0; 1024];

            loop {
                match socket.read(&mut buf).await {
                    // Return value of `Ok(0)` signifies that the remote has
                    // closed
                    Ok(0) => return,
                    Ok(n) => {
                        // Copy the data back to socket
                        if socket.write_all(&buf[..n]).await.is_err() {
                            // Unexpected socket error. There isn't much we can
                            // do here so just stop processing.
                            return;
                        }
                    }
                    Err(_) => {
                        // Unexpected socket error. There isn't much we can do
                        // here so just stop processing.
                        return;
                    }
                }
            }
        });
    }
}
```
Let’s break it down. First, since the `AsyncRead` and `AsyncWrite` utilities are used, the extension traits must be brought into scope.
```rust
use tokio::io::{self, AsyncReadExt, AsyncWriteExt};
```
### Allocating a buffer
The strategy is to read some data from the socket into a buffer then write the contents of the buffer back to the socket.
```rust
let mut buf = vec![0; 1024];
```
A stack buffer is explicitly avoided. Recall from earlier, we noted that all task data that lives across calls to `.await` must be stored by the task. In this case, `buf` is used across `.await` calls. All task data is stored in a single allocation. You can think of it as an `enum` where each variant is the data that needs to be stored for a specific call to `.await`.
If the buffer is represented by a stack array, the internal structure for tasks spawned per accepted socket might look something like:
```rust
struct Task {
    // internal task fields here
    task: enum {
        AwaitingRead {
            socket: TcpStream,
            buf: [BufferType],
        },
        AwaitingWriteAll {
            socket: TcpStream,
            buf: [BufferType],
        }
    }
}
```
If a stack array is used as the buffer type, it will be stored inline in the task structure. This will make the task structure very big. Additionally, buffer sizes are often page sized. This will, in turn, make `Task` an awkward size: `$page-size + a-few-bytes`.
The compiler optimizes the layout of async blocks further than a basic `enum`. In practice, variables are not moved around between variants as would be required with an `enum`. However, the task struct size is at least as big as the largest variable.
Because of this, it is usually more efficient to use a dedicated allocation for the buffer.
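To see the effect, here is a hedged sketch comparing the size of a future that holds a stack array across an `.await` with one that holds a heap-allocated `Vec`. Exact numbers are compiler- and version-dependent; the point is the relative gap.

```rust
use std::mem::size_of_val;

// Holds a 4096-byte stack array across an `.await`, so the whole
// array is stored inline in the future.
async fn with_stack_buf() {
    let buf = [0u8; 4096];
    tokio::task::yield_now().await;
    let _ = buf[0]; // keep `buf` live across the await
}

// Holds a `Vec` across the `.await`; only the pointer/length/capacity
// triple lives in the future, while the 4096 bytes sit on the heap.
async fn with_heap_buf() {
    let buf = vec![0u8; 4096];
    tokio::task::yield_now().await;
    let _ = buf[0]; // keep `buf` live across the await
}

fn main() {
    // Creating the futures does not run them, so no runtime is
    // needed just to inspect their sizes.
    println!("stack buffer future: {} bytes", size_of_val(&with_stack_buf()));
    println!("heap buffer future:  {} bytes", size_of_val(&with_heap_buf()));
}
```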
### Handling EOF
When the read half of the TCP stream is shut down, a call to `read()` returns `Ok(0)`. It is important to exit the read loop at this point. Forgetting to break from the read loop on EOF is a common source of bugs.
```rust
loop {
    match socket.read(&mut buf).await {
        // Return value of `Ok(0)` signifies that the remote has
        // closed
        Ok(0) => return,
        // ... other cases handled here
    }
}
```
Forgetting to break from the read loop usually results in a 100% CPU infinite loop situation. As the socket is closed, `socket.read()` returns immediately. The loop then repeats forever.
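For concreteness, here is a hedged sketch of the buggy variant, in the same fragment style as the snippet above:

```rust
// BUG: the `Ok(0)` case is missing, so EOF never ends the loop.
loop {
    match socket.read(&mut buf).await {
        Ok(n) => {
            // After EOF, `n` is 0 and the write is a no-op, so the
            // loop immediately calls `read` again, forever.
            if socket.write_all(&buf[..n]).await.is_err() {
                return;
            }
        }
        Err(_) => return,
    }
}
```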
Full code can be found here.