The Go Cookbook

A community-built and community-contributed collection of practical recipes for real-world Golang development.

Processing a String One Word or Character at a Time

Given a string, how do I break it into words or characters and process each one in turn?

Each Character

Because Go has built-in support for Unicode code points (“runes”), processing a string one character at a time is straightforward. Iterate over the string with range, which yields each rune together with the byte offset at which it starts:

test_each_char.go
package main

import "fmt"

func main() {
	for i, c := range "abc" {
		fmt.Println(i, " => ", string(c))
	}
}
$ go run test_each_char.go
0  =>  a
1  =>  b
2  =>  c
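
One caveat is worth a quick illustration: the index that range yields is a byte offset, not a running character count. The two coincide for ASCII input such as "abc", but they diverge as soon as a character needs more than one byte in UTF-8. The following is a minimal sketch; the helper name runeOffsets is made up for this example.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper that returns the byte offset
// at which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "h\u00e9llo" // "héllo"; the é takes two bytes in UTF-8

	fmt.Println(runeOffsets(s)) // [0 1 3 4 5]

	// len counts bytes; converting to []rune counts characters.
	fmt.Println(len(s), len([]rune(s))) // 6 5
}
```

The offsets skip from 1 to 3 because é occupies two bytes. If you need a running character count instead, converting the string to []rune before ranging over it is one option.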

Each Word

Processing a string one word at a time is a bit more involved, and depends on your specific needs. If you’re fine with the unsophisticated approach of cutting the string into words wherever whitespace occurs, then you’re in luck: strings.Fields was built just for you.

test_words.go
package main

import (
	"fmt"
	"strings"
)

func main() {
	words := strings.Fields("This, that, and the other.")
	for i, word := range words {
		fmt.Println(i, " => ", word)
	}
}
$ go run test_words.go
0  =>  This,
1  =>  that,
2  =>  and
3  =>  the
4  =>  other.
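
If the text arrives as a stream rather than as a string already in memory, the same whitespace-based splitting can be done incrementally. The sketch below uses bufio.Scanner with the bufio.ScanWords split function; scanWords is a name invented for this example, and a strings.Reader stands in for whatever io.Reader you actually have.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scanWords is a hypothetical helper that collects the
// whitespace-separated words of s one token at a time.
func scanWords(s string) ([]string, error) {
	scanner := bufio.NewScanner(strings.NewReader(s))
	scanner.Split(bufio.ScanWords) // tokenize on whitespace instead of lines

	var words []string
	for scanner.Scan() {
		words = append(words, scanner.Text())
	}
	// Err reports any error hit while scanning, such as an oversized token.
	return words, scanner.Err()
}

func main() {
	words, err := scanWords("This, that, and the other.")
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	for i, word := range words {
		fmt.Println(i, " => ", word)
	}
}
```

For this input the words are the same as the ones strings.Fields returns, punctuation included.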

Without Punctuation

However, most applications will need a more tolerant approach that takes punctuation into account. Here we have two options. The first is to strip the punctuation before splitting, using a strings.Replacer, which we create with the strings.NewReplacer function:

test_without_punctuation.go
package main

import (
	"fmt"
	"strings"
)

func main() {
	s := "This, that, and the other."
	replacer := strings.NewReplacer(",", "", ".", "", ";", "")
	s = replacer.Replace(s)
	words := strings.Fields(s)
	for i, word := range words {
		fmt.Println(i, " => ", word)
	}
}
$ go run test_without_punctuation.go
0  =>  This
1  =>  that
2  =>  and
3  =>  the
4  =>  other

Or we can achieve a bit more clarity by making use of strings.Map, which drops every character for which the mapping function returns a negative value:

test_without_punctuation_using_map.go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Returning a negative value tells strings.Map to drop the rune.
	removePunctuation := func(r rune) rune {
		if strings.ContainsRune(".,:;", r) {
			return -1
		}
		return r
	}

	s := "This, that, and the other."
	s = strings.Map(removePunctuation, s)
	words := strings.Fields(s)
	for i, word := range words {
		fmt.Println(i, " => ", word)
	}
}
$ go run test_without_punctuation_using_map.go
0  =>  This
1  =>  that
2  =>  and
3  =>  the
4  =>  other
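
A third possibility, not covered by the two recipes above, is to do the stripping and the splitting in one step with strings.FieldsFunc, which splits wherever a function you supply returns true. The sketch below treats every rune that is neither a letter nor a digit as a separator; splitWords is a name made up for this example, and that choice of separator is an assumption you may need to adjust.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitWords is a hypothetical helper that splits s at every run of
// runes that are neither letters nor digits.
func splitWords(s string) []string {
	isSeparator := func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	}
	return strings.FieldsFunc(s, isSeparator)
}

func main() {
	for i, word := range splitWords("This, that, and the other.") {
		fmt.Println(i, " => ", word)
	}
}
```

Be aware that this rule also splits on apostrophes and hyphens, so "don't" comes out as "don" and "t". Whether that is acceptable depends on your text.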

Special Separators

There are other situations where you’d want to split a string based on a separator other than whitespace. The UNIX /etc/passwd file, for example, contains lines of tokens separated by colons. Splitting each line into the relevant pieces is easy in Go with the strings.Split function. Unlike strings.Fields, which splits on runs of whitespace, strings.Split cuts at every occurrence of the separator you give it, so empty fields are preserved:

test_separator.go
package main

import (
	"fmt"
	"strings"
)

func main() {
	s := "root:*:0:0:System Administrator:/root:/bin/sh"
	words := strings.Split(s, ":")
	for i, word := range words {
		fmt.Println(i, " => ", word)
	}
}
$ go run test_separator.go
0  =>  root
1  =>  *
2  =>  0
3  =>  0
4  =>  System Administrator
5  =>  /root
6  =>  /bin/sh