Reading files

In this chapter we’re going to learn how to read some files, get some data out of them, and do something useful.

Pretend you’re working with your friend to create some blog software. The idea is an author will write their posts in markdown, with some metadata at the top of the file. On startup, the web server will read a folder to create some Posts, and then a separate NewHandler function will use those Posts as a datasource for the blog’s webserver.

We’ve been asked to create the package that converts a given folder of blog post files into a collection of Posts.

Example data

hello world.md

  1. Title: Hello, TDD world!
  2. Description: First post on our wonderful blog
  3. Tags: tdd, go
  4. ---
  5. Hello world!
  6. The body of posts starts after the `---`

Expected data

  1. type Post struct {
  2. Title, Description, Body string
  3. Tags []string
  4. }

Iterative, test-driven development

We’ll take an iterative approach where we’re always taking simple, safe steps toward our goal.

This requires us to break up our work, but we should be careful not to fall into the trap of taking a “bottom up” approach.

We should not trust our over-active imaginations when we start work. We could be tempted into making some kind of abstraction that is only validated once we stick everything together, such as some kind of BlogPostFileParser.

This is not iterative and is missing out on the tight feedback loops that TDD is supposed to bring us.

Kent Beck says:

Optimism is an occupational hazard of programming. Feedback is the treatment.

Instead, our approach should strive to be as close to delivering real consumer value as quickly as possible (often called a “happy path”). Once we have delivered a small amount of consumer value end-to-end, further iteration of the rest of the requirements is usually straightforward.

Thinking about the kind of test we want to see

Let’s remind ourselves of our mindset and goals when starting:

  • Write the test we want to see. Think about how we’d like to use the code we’re going to write from a consumer’s point of view.
  • Focus on what and why, but don’t get distracted by how.

Our package needs to offer a function that can be pointed at a folder, and return us some posts.

  1. var posts []blogposts.Post
  2. posts = blogposts.NewPostsFromFS("some-folder")

To write a test around this, we’d need some kind of test folder with some example posts in it. There’s nothing terribly wrong with this, but you are making some trade-offs:

  • for each test you may need to create new files to test a particular behaviour
  • some behaviour will be challenging to test, such as failing to load files
  • the tests will run a little slower because they will need to access the file system

We’re also unnecessarily coupling ourselves to a specific implementation of the file system.

File system abstractions introduced in Go 1.16

Go 1.16 introduced an abstraction for file systems; the io/fs package.

Package fs defines basic interfaces to a file system. A file system can be provided by the host operating system but also by other packages.

This lets us loosen our coupling to a specific file system, which will then let us inject different implementations according to our needs.

On the producer side of the interface, the new embed.FS type implements fs.FS, as does zip.Reader. The new os.DirFS function provides an implementation of fs.FS backed by a tree of operating system files.

If we use this interface, users of our package have a number of options baked-in to the standard library to use. Learning to leverage interfaces defined in Go’s standard library (e.g. io.fs, io.Reader, io.Writer), is vital to writing loosely coupled packages. These packages can then be re-used in contexts different to those you imagined, with minimal fuss from your consumers.

In our case, maybe our consumer wants the posts to be embedded into the Go binary rather than files in a “real” filesystem? Either way, our code doesn’t need to care.

For our tests, the package testing/fstest offers us an implementation of io/FS to use, similar to the tools we’re familiar with in net/http/httptest.

Given this information, the following feels like a better approach,

  1. var posts []blogposts.Post
  2. posts = blogposts.NewPostsFromFS(someFS)

Write the test first

We should keep scope as small and useful as possible. If we prove that we can read all the files in a directory, that will be a good start. This will give us confidence in the software we’re writing. We can check that the count of []Post returned is the same as the number of files in our fake file system.

Create a new project to work through this chapter.

  • mkdir blogposts
  • cd blogposts
  • go mod init github.com/{your-name}/blogposts
  • touch blogposts_test.go
  1. package blogposts_test
  2. import (
  3. "testing"
  4. "testing/fstest"
  5. )
  6. func TestNewBlogPosts(t *testing.T) {
  7. fs := fstest.MapFS{
  8. "hello world.md": {Data: []byte("hi")},
  9. "hello-world2.md": {Data: []byte("hola")},
  10. }
  11. posts := blogposts.NewPostsFromFS(fs)
  12. if len(posts) != len(fs) {
  13. t.Errorf("got %d posts, wanted %d posts", len(posts), len(fs))
  14. }
  15. }

Notice that the package of our test is blogposts_test. Remember, when TDD is practiced well we take a consumer-driven approach: we don’t want to test internal details because consumers don’t care about them. By appending _test to our intended package name, we only access exported members from our package - just like a real user of our package.

We’ve imported testing/fstest which gives us access to the fstest.MapFS type. Our fake file system will pass fstest.MapFS to our package.

A MapFS is a simple in-memory file system for use in tests, represented as a map from path names (arguments to Open) to information about the files or directories they represent.

This feels simpler than maintaining a folder of test files, and it will execute quicker.

Finally, we codified the usage of our API from a consumer’s point of view, then checked if it creates the correct number of posts.

Try to run the test

  1. ./blogpost_test.go:15:12: undefined: blogposts

Write the minimal amount of code for the test to run and check the failing test output

The package doesn’t exist. Create a new file blogposts.go and put package blogposts inside it. You’ll need to then import that package into your tests. For me, the imports now look like:

  1. import (
  2. blogposts "github.com/quii/learn-go-with-tests/reading-files"
  3. "testing"
  4. "testing/fstest"
  5. )

Now the tests won’t compile because our new package does not have a NewPostsFromFS function, that returns some kind of collection.

  1. ./blogpost_test.go:16:12: undefined: blogposts.NewPostsFromFS

This forces us to make the skeleton of our function to make the test run. Remember not to overthink the code at this point; we’re only trying to get a running test, and to make sure it fails as we’d expect. If we skip this step we may skip over assumptions and, write a test which is not useful.

  1. package blogposts
  2. import "testing/fstest"
  3. type Post struct {
  4. }
  5. func NewPostsFromFS(fileSystem fstest.MapFS) []Post {
  6. return nil
  7. }

The test should now correctly fail

  1. === RUN TestNewBlogPosts
  2. blogposts_test.go:48: got 0 posts, wanted 2 posts

Write enough code to make it pass

We could “slime” this to make it pass:

  1. func NewPostsFromFS(fileSystem fstest.MapFS) []Post {
  2. return []Post{{}, {}}
  3. }

But, as Denise Yu wrote:

Sliming is useful for giving a “skeleton” to your object. Designing an interface and executing logic are two concerns, and sliming tests strategically lets you focus on one at a time.

We already have our structure. So, what do we do instead?

As we’ve cut scope, all we need to do is read the directory and create a post for each file we encounter. We don’t have to worry about opening files and parsing them just yet.

  1. func NewPostsFromFS(fileSystem fstest.MapFS) []Post {
  2. dir, _ := fs.ReadDir(fileSystem, ".")
  3. var posts []Post
  4. for range dir {
  5. posts = append(posts, Post{})
  6. }
  7. return posts
  8. }

fs.ReadDir reads a directory inside a given fs.FS returning []DirEntry.

Already our idealised view of the world has been foiled because errors can happen, but remember now our focus is making the test pass, not changing design, so we’ll ignore the error for now.

The rest of the code is straightforward: iterate over the entries, create a Post for each one and, return the slice.

Refactor

Even though our tests are passing, we can’t use our new package outside of this context, because it is coupled to a concrete implementation fstest.MapFS. But, it doesn’t have to be. Change the argument to our NewPostsFromFS function to accept the interface from the standard library.

  1. func NewPostsFromFS(fileSystem fs.FS) []Post {
  2. dir, _ := fs.ReadDir(fileSystem, ".")
  3. var posts []Post
  4. for range dir {
  5. posts = append(posts, Post{})
  6. }
  7. return posts
  8. }

Re-run the tests: everything should be working.

Error handling

We parked error handling earlier when we focused on making the happy-path work. Before continuing to iterate on the functionality, we should acknowledge that errors can happen when working with files. Beyond reading the directory, we can run into problems when we open individual files. Let’s change our API (via our tests first, naturally) so that it can return an error.

  1. func TestNewBlogPosts(t *testing.T) {
  2. fs := fstest.MapFS{
  3. "hello world.md": {Data: []byte("hi")},
  4. "hello-world2.md": {Data: []byte("hola")},
  5. }
  6. posts, err := blogposts.NewPostsFromFS(fs)
  7. if err != nil {
  8. t.Fatal(err)
  9. }
  10. if len(posts) != len(fs) {
  11. t.Errorf("got %d posts, wanted %d posts", len(posts), len(fs))
  12. }
  13. }

Run the test: it should complain about the wrong number of return values. Fixing the code is straightforward.

  1. func NewPostsFromFS(fileSystem fs.FS) ([]Post, error) {
  2. dir, err := fs.ReadDir(fileSystem, ".")
  3. if err != nil {
  4. return nil, err
  5. }
  6. var posts []Post
  7. for range dir {
  8. posts = append(posts, Post{})
  9. }
  10. return posts, nil
  11. }

This will make the test pass. The TDD practitioner in you might be annoyed we didn’t see a failing test before writing the code to propagate the error from fs.ReadDir. To do this “properly”, we’d need a new test where we inject a failing fs.FS test-double to make fs.ReadDir return an error.

  1. type StubFailingFS struct {
  2. }
  3. func (s StubFailingFS) Open(name string) (fs.File, error) {
  4. return nil, errors.New("oh no, i always fail")
  5. }
  1. // later
  2. _, err := blogposts.NewPostsFromFS(StubFailingFS{})

This should give you confidence in our approach. The interface we’re using has one method, which makes creating test-doubles to test different scenarios trivial.

In some cases, testing error handling is the pragmatic thing to do but, in our case, we’re not doing anything interesting with the error, we’re just propagating it, so it’s not worth the hassle of writing a new test.

Logically, our next iterations will be around expanding our Post type so that it has some useful data.

Write the test first

We’ll start with the first line in the proposed blog post schema, the title field.

We need to change the contents of the test files so they match what was specified, and then we can make an assertion that it is parsed correctly.

  1. func TestNewBlogPosts(t *testing.T) {
  2. fs := fstest.MapFS{
  3. "hello world.md": {Data: []byte("Title: Post 1")},
  4. "hello-world2.md": {Data: []byte("Title: Post 2")},
  5. }
  6. // rest of test code cut for brevity
  7. got := posts[0]
  8. want := blogposts.Post{Title: "Post 1"}
  9. if !reflect.DeepEqual(got, want) {
  10. t.Errorf("got %+v, want %+v", got, want)
  11. }
  12. }

Try to run the test

  1. ./blogpost_test.go:58:26: unknown field 'Title' in struct literal of type blogposts.Post

Write the minimal amount of code for the test to run and check the failing test output

Add the new field to our Post type so that the test will run

  1. type Post struct {
  2. Title string
  3. }

Re-run the test, and you should get a clear, failing test

  1. === RUN TestNewBlogPosts
  2. === RUN TestNewBlogPosts/parses_the_post
  3. blogpost_test.go:61: got {Title:}, want {Title:Post 1}

Write enough code to make it pass

We’ll need to open each file and then extract the title

  1. func NewPostsFromFS(fileSystem fs.FS) ([]Post, error) {
  2. dir, err := fs.ReadDir(fileSystem, ".")
  3. if err != nil {
  4. return nil, err
  5. }
  6. var posts []Post
  7. for _, f := range dir {
  8. post, err := getPost(fileSystem, f)
  9. if err != nil {
  10. return nil, err //todo: needs clarification, should we totally fail if one file fails? or just ignore?
  11. }
  12. posts = append(posts, post)
  13. }
  14. return posts, nil
  15. }
  16. func getPost(fileSystem fs.FS, f fs.DirEntry) (Post, error) {
  17. postFile, err := fileSystem.Open(f.Name())
  18. if err != nil {
  19. return Post{}, err
  20. }
  21. defer postFile.Close()
  22. postData, err := io.ReadAll(postFile)
  23. if err != nil {
  24. return Post{}, err
  25. }
  26. post := Post{Title: string(postData)[7:]}
  27. return post, nil
  28. }

Remember our focus at this point is not to write elegant code, it’s just to get to a point where we have working software.

Even though this feels like a small increment forward it still required us to write a fair amount of code and make some assumptions in respect to error handling. This would be a point where you should talk to your colleagues and decide the best approach.

The iterative approach has given us fast feedback that our understanding of the requirements is incomplete.

fs.FS gives us a way of opening a file within it by name with its Open method. From there we read the data from the file and, for now, we do not need any sophisticated parsing, just cutting out the Title: text by slicing the string.

Refactor

Separating the ‘opening file code’ from the ‘parsing file contents code’ will make the code simpler to understand and work with.

  1. func getPost(fileSystem fs.FS, f fs.DirEntry) (Post, error) {
  2. postFile, err := fileSystem.Open(f.Name())
  3. if err != nil {
  4. return Post{}, err
  5. }
  6. defer postFile.Close()
  7. return newPost(postFile)
  8. }
  9. func newPost(postFile fs.File) (Post, error) {
  10. postData, err := io.ReadAll(postFile)
  11. if err != nil {
  12. return Post{}, err
  13. }
  14. post := Post{Title: string(postData)[7:]}
  15. return post, nil
  16. }

When you refactor out new functions or methods, take care and think about the arguments. You’re designing here, and are free to think deeply about what is appropriate because you have passing tests. Think about coupling and cohesion. In this case you should ask yourself:

Does newPost have to be coupled to an fs.File ? Do we use all the methods and data from this type? What do we really need?

In our case we only use it as an argument to io.ReadAll which needs an io.Reader. So we should loosen the coupling in our function and ask for an io.Reader.

  1. func newPost(postFile io.Reader) (Post, error) {
  2. postData, err := io.ReadAll(postFile)
  3. if err != nil {
  4. return Post{}, err
  5. }
  6. post := Post{Title: string(postData)[7:]}
  7. return post, nil
  8. }

You can make a similar argument for our getPost function, which takes an fs.DirEntry argument but simply calls Name() to get the file name. We don’t need all that; let’s decouple from that type and pass the file name through as a string. Here’s the fully refactored code:

  1. func NewPostsFromFS(fileSystem fs.FS) ([]Post, error) {
  2. dir, err := fs.ReadDir(fileSystem, ".")
  3. if err != nil {
  4. return nil, err
  5. }
  6. var posts []Post
  7. for _, f := range dir {
  8. post, err := getPost(fileSystem, f.Name())
  9. if err != nil {
  10. return nil, err //todo: needs clarification, should we totally fail if one file fails? or just ignore?
  11. }
  12. posts = append(posts, post)
  13. }
  14. return posts, nil
  15. }
  16. func getPost(fileSystem fs.FS, fileName string) (Post, error) {
  17. postFile, err := fileSystem.Open(fileName)
  18. if err != nil {
  19. return Post{}, err
  20. }
  21. defer postFile.Close()
  22. return newPost(postFile)
  23. }
  24. func newPost(postFile io.Reader) (Post, error) {
  25. postData, err := io.ReadAll(postFile)
  26. if err != nil {
  27. return Post{}, err
  28. }
  29. post := Post{Title: string(postData)[7:]}
  30. return post, nil
  31. }

From now on, most of our efforts can be neatly contained within newPost. The concerns of opening and iterating over files are done, and now we can focus on extracting the data for our Post type. Whilst not technically necessary, files are a nice way to logically group related things together, so I moved the Post type and newPost into a new post.go file.

Test helper

We should take care of our tests too. We’re going to be making assertions on Posts a lot, so we should write some code to help with that

  1. func assertPost(t *testing.T, got blogposts.Post, want blogposts.Post) {
  2. t.Helper()
  3. if !reflect.DeepEqual(got, want) {
  4. t.Errorf("got %+v, want %+v", got, want)
  5. }
  6. }
  1. assertPost(t, posts[0], blogposts.Post{Title: "Post 1"})

Write the test first

Let’s extend our test further to extract the next line from the file, the description. Up until making it pass should now feel comfortable and familiar.

  1. func TestNewBlogPosts(t *testing.T) {
  2. const (
  3. firstBody = `Title: Post 1
  4. Description: Description 1`
  5. secondBody = `Title: Post 2
  6. Description: Description 2`
  7. )
  8. fs := fstest.MapFS{
  9. "hello world.md": {Data: []byte(firstBody)},
  10. "hello-world2.md": {Data: []byte(secondBody)},
  11. }
  12. // rest of test code cut for brevity
  13. assertPost(t, posts[0], blogposts.Post{
  14. Title: "Post 1",
  15. Description: "Description 1",
  16. })
  17. }

Try to run the test

  1. ./blogpost_test.go:47:58: unknown field 'Description' in struct literal of type blogposts.Post

Write the minimal amount of code for the test to run and check the failing test output

Add the new field to Post.

  1. type Post struct {
  2. Title string
  3. Description string
  4. }

The tests should now compile, and fail.

  1. === RUN TestNewBlogPosts
  2. blogpost_test.go:47: got {Title:Post 1
  3. Description: Description 1 Description:}, want {Title:Post 1 Description:Description 1}

Write enough code to make it pass

The standard library has a handy library for helping you scan through data, line by line; bufio.Scanner

Scanner provides a convenient interface for reading data such as a file of newline-delimited lines of text.

  1. func newPost(postFile io.Reader) (Post, error) {
  2. scanner := bufio.NewScanner(postFile)
  3. scanner.Scan()
  4. titleLine := scanner.Text()
  5. scanner.Scan()
  6. descriptionLine := scanner.Text()
  7. return Post{Title: titleLine[7:], Description: descriptionLine[13:]}, nil
  8. }

Handily, it also takes an io.Reader to read through (thank you again, loose-coupling), we don’t need to change our function arguments.

Call Scan to read a line, and then extract the data using Text.

This function could never return an error. It would be tempting at this point to remove it from the return type, but we know we’ll have to handle invalid file structures later so, we may as well leave it.

Refactor

We have repetition around scanning a line and then reading the text. We know we’re going to do this operation at least one more time, it’s a simple refactor to DRY up so let’s start with that.

  1. func newPost(postFile io.Reader) (Post, error) {
  2. scanner := bufio.NewScanner(postFile)
  3. readLine := func() string {
  4. scanner.Scan()
  5. return scanner.Text()
  6. }
  7. title := readLine()[7:]
  8. description := readLine()[13:]
  9. return Post{Title: title, Description: description}, nil
  10. }

This has barely saved any lines of code, but that’s rarely the point of refactoring. What I’m trying to do here is just separating the what from the how of reading lines to make the code a little more declarative to the reader.

Whilst the magic numbers of 7 and 13 get the job done, they’re not awfully descriptive.

  1. const (
  2. titleSeparator = "Title: "
  3. descriptionSeparator = "Description: "
  4. )
  5. func newPost(postFile io.Reader) (Post, error) {
  6. scanner := bufio.NewScanner(postFile)
  7. readLine := func() string {
  8. scanner.Scan()
  9. return scanner.Text()
  10. }
  11. title := readLine()[len(titleSeparator):]
  12. description := readLine()[len(descriptionSeparator):]
  13. return Post{Title: title, Description: description}, nil
  14. }

Now that I’m staring at the code with my creative refactoring mind, I’d like to try making our readLine function take care of removing the tag. There’s also a more readable way of trimming a prefix from a string with the function strings.TrimPrefix.

  1. func newPost(postBody io.Reader) (Post, error) {
  2. scanner := bufio.NewScanner(postBody)
  3. readMetaLine := func(tagName string) string {
  4. scanner.Scan()
  5. return strings.TrimPrefix(scanner.Text(), tagName)
  6. }
  7. return Post{
  8. Title: readMetaLine(titleSeparator),
  9. Description: readMetaLine(descriptionSeparator),
  10. }, nil
  11. }

You may or may not like this idea, but I do. The point is in the refactoring state we are free to play with the internal details, and you can keep running your tests to check things still behave correctly. We can always go back to previous states if we’re not happy. The TDD approach gives us this license to frequently experiment with ideas, so we have more shots at writing great code.

The next requirement is extracting the post’s tags. If you’re following along, I’d recommend trying to implement it yourself before reading on. You should now have a good, iterative rhythm and feel confident to extract the next line and parse out the data.

For brevity, I will not go through the TDD steps, but here’s the test with tags added.

  1. func TestNewBlogPosts(t *testing.T) {
  2. const (
  3. firstBody = `Title: Post 1
  4. Description: Description 1
  5. Tags: tdd, go`
  6. secondBody = `Title: Post 2
  7. Description: Description 2
  8. Tags: rust, borrow-checker`
  9. )
  10. // rest of test code cut for brevity
  11. assertPost(t, posts[0], blogposts.Post{
  12. Title: "Post 1",
  13. Description: "Description 1",
  14. Tags: []string{"tdd", "go"},
  15. })
  16. }

You’re only cheating yourself if you just copy and paste what I write. To make sure we’re all on the same page, here’s my code which includes extracting the tags.

  1. const (
  2. titleSeparator = "Title: "
  3. descriptionSeparator = "Description: "
  4. tagsSeparator = "Tags: "
  5. )
  6. func newPost(postBody io.Reader) (Post, error) {
  7. scanner := bufio.NewScanner(postBody)
  8. readMetaLine := func(tagName string) string {
  9. scanner.Scan()
  10. return strings.TrimPrefix(scanner.Text(), tagName)
  11. }
  12. return Post{
  13. Title: readMetaLine(titleSeparator),
  14. Description: readMetaLine(descriptionSeparator),
  15. Tags: strings.Split(readMetaLine(tagsSeparator), ", "),
  16. }, nil
  17. }

Hopefully no surprises here. We were able to re-use readMetaLine to get the next line for the tags and then split them up using strings.Split.

The last iteration on our happy path is to extract the body.

Here’s a reminder of the proposed file format.

  1. Title: Hello, TDD world!
  2. Description: First post on our wonderful blog
  3. Tags: tdd, go
  4. ---
  5. Hello world!
  6. The body of posts starts after the `---`

We’ve read the first 3 lines already. We then need to read one more line, discard it and then the remainder of the file contains the post’s body.

Write the test first

Change the test data to have the separator, and a body with a few newlines to check we grab all the content.

  1. const (
  2. firstBody = `Title: Post 1
  3. Description: Description 1
  4. Tags: tdd, go
  5. ---
  6. Hello
  7. World`
  8. secondBody = `Title: Post 2
  9. Description: Description 2
  10. Tags: rust, borrow-checker
  11. ---
  12. B
  13. L
  14. M`
  15. )

Add to our assertion like the others

  1. assertPost(t, posts[0], blogposts.Post{
  2. Title: "Post 1",
  3. Description: "Description 1",
  4. Tags: []string{"tdd", "go"},
  5. Body: `Hello
  6. World`,
  7. })

Try to run the test

  1. ./blogpost_test.go:60:3: unknown field 'Body' in struct literal of type blogposts.Post

As we’d expect.

Write the minimal amount of code for the test to run and check the failing test output

Add Body to Post and the test should fail.

  1. === RUN TestNewBlogPosts
  2. blogposts_test.go:38: got {Title:Post 1 Description:Description 1 Tags:[tdd go] Body:}, want {Title:Post 1 Description:Description 1 Tags:[tdd go] Body:Hello
  3. World}

Write enough code to make it pass

  1. Scan the next line to ignore the --- separator.
  2. Keep scanning until there’s nothing left to scan.
  1. func newPost(postBody io.Reader) (Post, error) {
  2. scanner := bufio.NewScanner(postBody)
  3. readMetaLine := func(tagName string) string {
  4. scanner.Scan()
  5. return strings.TrimPrefix(scanner.Text(), tagName)
  6. }
  7. title := readMetaLine(titleSeparator)
  8. description := readMetaLine(descriptionSeparator)
  9. tags := strings.Split(readMetaLine(tagsSeparator), ", ")
  10. scanner.Scan() // ignore a line
  11. buf := bytes.Buffer{}
  12. for scanner.Scan() {
  13. fmt.Fprintln(&buf, scanner.Text())
  14. }
  15. body := strings.TrimSuffix(buf.String(), "\n")
  16. return Post{
  17. Title: title,
  18. Description: description,
  19. Tags: tags,
  20. Body: body,
  21. }, nil
  22. }
  • scanner.Scan() returns a bool which indicates whether there’s more data to scan, so we can use that with a for loop to keep reading through the data until the end.
  • After every Scan() we write the data into the buffer using fmt.Fprintln. We use the version that adds a newline because the scanner removes the newlines from each line, but we need to maintain them.
  • Because of the above, we need to trim the final newline, so we don’t have a trailing one.

Refactor

Encapsulating the idea of getting the rest of the data into a function will help future readers quickly understand what is happening in newPost, without having to concern themselves with implementation specifics.

  1. func newPost(postBody io.Reader) (Post, error) {
  2. scanner := bufio.NewScanner(postBody)
  3. readMetaLine := func(tagName string) string {
  4. scanner.Scan()
  5. return strings.TrimPrefix(scanner.Text(), tagName)
  6. }
  7. return Post{
  8. Title: readMetaLine(titleSeparator),
  9. Description: readMetaLine(descriptionSeparator),
  10. Tags: strings.Split(readMetaLine(tagsSeparator), ", "),
  11. Body: readBody(scanner),
  12. }, nil
  13. }
  14. func readBody(scanner *bufio.Scanner) string {
  15. scanner.Scan() // ignore a line
  16. buf := bytes.Buffer{}
  17. for scanner.Scan() {
  18. fmt.Fprintln(&buf, scanner.Text())
  19. }
  20. return strings.TrimSuffix(buf.String(), "\n")
  21. }

Iterating further

We’ve made our “steel thread” of functionality, taking the shortest route to get to our happy path, but clearly there’s some distance to go before it is production ready.

We haven’t handled:

  • when the file’s format is not correct
  • the file is not a .md
  • what if the order of the metadata fields is different? Should that be allowed? Should we be able to handle it?

Crucially though, we have working software, and we have defined our interface. The above are just further iterations, more tests to write and drive our behaviour. To support any of the above we shouldn’t have to change our design, just implementation details.

Keeping focused on the goal means we made the important decisions, and validated them against the desired behaviour, rather than getting bogged down on matters that won’t affect the overall design.

Wrapping up

fs.FS, and the other changes in Go 1.16 give us some elegant ways of reading data from file systems and testing them simply.

If you wish to try out the code “for real”:

  • Create a cmd folder within the project, add a main.go file
  • Add the following code
  1. import (
  2. blogposts "github.com/quii/fstest-spike"
  3. "log"
  4. "os"
  5. )
  6. func main() {
  7. posts, err := blogposts.NewPostsFromFS(os.DirFS("posts"))
  8. if err != nil {
  9. log.Fatal(err)
  10. }
  11. log.Println(posts)
  12. }
  • Add some markdown files into a posts folder and run the program!

Notice the symmetry between the production code

  1. posts, err := blogposts.NewPostsFromFS(os.DirFS("posts"))

And the tests

  1. posts, err := blogposts.NewPostsFromFS(fs)

This is when consumer-driven, top-down TDD feels correct.

A user of our package can look at our tests and quickly get up to speed with what it’s supposed to do and how to use it. As maintainers, we can be confident our tests are useful because they’re from a consumer’s point of view. We’re not testing implementation details or other incidental details, so we can be reasonably confident that our tests will help us, rather than hinder us when refactoring.

By relying on good software engineering practices like dependency injection our code is simple to test and re-use.

When you’re creating packages, even if they’re only internal to your project, prefer a top-down consumer driven approach. This will stop you over-imagining designs and making abstractions you may not even need and will help ensure the tests you write are useful.

The iterative approach kept every step small, and the continuous feedback helped us uncover unclear requirements possibly sooner than with other, more ad-hoc approaches.

Writing?

It’s important to note that these new features only have operations for reading files. If your work needs to do writing, you’ll need to look elsewhere. Remember to keep thinking about what the standard library offers currently, if you’re writing data you should probably look into leveraging existing interfaces such as io.Writer to keep your code loosely-coupled and re-usable.

Further reading