Getting our backend Go-ing

Datetime: 2016-08-23 05:15:50          Topic: Test Engineer, Golang

Since the latter half of last year, my teammates at Wiredcraft have been using tools written in Golang (also known simply as Go) and applying the language to projects such as the voter registration software for the Myanmar elections last year, the country's first in a quarter century. Check out the tech we used and why we chose Golang for that project. As a fan of strongly-typed programming languages, I think it's certainly time to check out this relatively young one.

Go, like other programming languages, offers many standard libraries to help users handle files by buffer, position, or line. However, unlike JavaScript or Node.js, Go doesn't have any standard asynchronous input-output (I/O) libs. To get around this, you can easily use Go's concurrency to implement non-blocking I/O. As a beginner with this language, I'll try to list some methods for finding and extracting the information you want from files using Go.

Examples

You can explore the demo code in one of my Github repos, query_file_demo.

Ok, let's start with the simplest way: using the ioutil lib.

func SimpleReader(path string) string {
	// Load the whole file into memory at once.
	f, err := ioutil.ReadFile(path)
	CheckError(err)
	lines := strings.Split(string(f), "\n")
	re := regexp.MustCompile(`\bslowpoke\b`)
	var result string
	for _, line := range lines {
		if re.MatchString(line) {
			result = line
		}
	}
	return result
}

Here, strings.Split() converts the loaded content into a string slice, and regexp.MatchString() matches the regular expression against each element. Be careful: you should not read a file with the ioutil.ReadFile() method if the file is too large to load into memory at once. You can see why in the Go lib's source code:

func ReadFile(filename string) ([]byte, error) {
	...
	return readAll(f, n+bytes.MinRead)
}

Here, n is the size of the file and you’ll get bytes.ErrTooLarge if the file overflows the buffer.

Next, you can try to use the bufio lib:

func Scanner(path string) string {
	f, err := os.Open(path)
	CheckError(err)
	defer f.Close()
	var result string
	// Scan the file line by line instead of loading it all at once.
	scanner := bufio.NewScanner(f)
	re := regexp.MustCompile(`\bslowpoke\b`)
	for scanner.Scan() {
		s := scanner.Text()
		if re.MatchString(s) {
			result = s
		}
	}
	return result
}

Again, you can read the source code of the NewScanner function:

const (
	// MaxScanTokenSize is the maximum size used to buffer a token
	// unless the user provides an explicit buffer with Scan.Buffer.
	// The actual maximum token size may be smaller as the buffer
	// may need to include, for instance, a newline.
	MaxScanTokenSize = 64 * 1024

	startBufSize = 4096 // Size of initial allocation for buffer.
)

func NewScanner(r io.Reader) *Scanner {
	return &Scanner{
		r:            r,
		split:        ScanLines,
		maxTokenSize: MaxScanTokenSize,
	}
}

In this snippet, MaxScanTokenSize is the maximum size of a single line (token), but since the scanner only reads one line at a time, you can read a file as big as you want as long as no single line exceeds MaxScanTokenSize. If you don't want the file to be read line by line, you can switch to File.Read() or io.Copy(); these APIs let you define your own buffer size for reading a file.

Still, if you want to use Go's concurrency to make the job run faster, you can use channels and goroutines.

func ChannelReader(path string) string {
	workers := 10
	f, err := os.Open(path)
	CheckError(err)
	defer f.Close()

	jobs := make(chan string)
	// Buffer the results channel so a worker that finds a match
	// never blocks before it can signal completion.
	results := make(chan string, workers)
	complete := make(chan bool)

	// One goroutine feeds lines into the jobs channel.
	go func() {
		scanner := bufio.NewScanner(f)
		for scanner.Scan() {
			jobs <- scanner.Text()
		}
		close(jobs)
	}()

	// Fan the work out to a pool of workers.
	for i := 0; i < workers; i++ {
		go grepLine(jobs, results, complete)
	}
	// Wait until every worker has drained its share of the jobs.
	for i := 0; i < workers; i++ {
		<-complete
	}
	// Closing results lets the receive below return the zero value
	// ("") if no line matched at all.
	close(results)
	return <-results
}

func grepLine(jobs <-chan string, results chan<- string, complete chan<- bool) {
	re := regexp.MustCompile(`\bslowpoke\b`)
	var match string
	for j := range jobs {
		if re.MatchString(j) {
			match = j
		}
	}
	// Send at most one result per worker so the buffered channel
	// cannot fill up and deadlock the pool.
	if match != "" {
		results <- match
	}
	complete <- true
}

This code creates 10 worker goroutines for the grep job. When each goroutine finishes, it sends true on the complete channel; once the blocking <-complete receives have collected a value from every worker, the function knows all lines have been processed and can read the result.
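The complete channel here is a hand-rolled completion counter; the standard library's sync.WaitGroup expresses the same wait more idiomatically. A minimal sketch of the same fan-out pattern (the grepConcurrent helper and its simplified strings.Contains matching are mine, not the repo's code):

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// grepConcurrent fans lines out to a pool of goroutines and returns
// the first matching line received (or "" if nothing matched).
func grepConcurrent(lines []string, workers int) string {
	jobs := make(chan string)
	// Buffered so workers never block when sending a match.
	results := make(chan string, len(lines))

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done() // replaces sending true on a complete channel
			for line := range jobs {
				if strings.Contains(line, "slowpoke") {
					results <- line
				}
			}
		}()
	}

	for _, l := range lines {
		jobs <- l
	}
	close(jobs)

	wg.Wait() // replaces the `for i := 0; i < workers; i++ { <-complete }` loop
	close(results)
	return <-results
}

func main() {
	lines := []string{"fast line", "a slowpoke line", "another line"}
	fmt.Println(grepConcurrent(lines, 3)) // prints "a slowpoke line"
}
```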

Benchmark

The Go testing package ( testing ) not only supports automated testing of Go packages, it also includes benchmarking tools. All you need to do is write your benchmark, run the command below, and turn on the -benchmem flag to add memory consumption to the results.

go test -bench=. -benchmem

When I ran the test, I used a small file, so the results and the bench directory in my repo merely show how to benchmark a function. Here are the results I collected on my machine:

testing: warning: no tests to run
PASS
BenchmarkChannelReader-4 grep line: 79,slowpoke,79,12,360,63,98,1
    1000    1712159 ns/op   490096 B/op     1149 allocs/op
BenchmarkScanner-4       79,slowpoke,79,12,360,63,98,1
    1000    1766001 ns/op    77130 B/op      846 allocs/op
BenchmarkSimpleReader-4  grep line: 79,slowpoke,79,12,360,63,98,1
    1000    1791945 ns/op   112651 B/op       37 allocs/op
ok   github.com/chopperlee2011/query_file_demo/bench 5.816s

Conclusion

I am glad to know that Go offers such convenient and fast support for concurrency, which can help developers break through bottlenecks that are hard to tackle when building apps in other languages. On top of that, the performance of Go's tools and frameworks, such as Gin and NSQ, really impresses me.

If you want to dig deeper into Go, I think Gopher Academy is a great place to start, and you can chat with and meet more Go gophers in their Slack channel. The annual conference, GopherCon, took place last month, so look out for updates from that and for announcements about next year's.

I hope you enjoy exploring this cool language as much as I do. It really helped us a lot with the Myanmar project. If you have any thoughts to share about Golang or its related tools, send us an email (info@wiredcraft.com) or ping us on Twitter (@wiredcraft).




