Monday, March 10, 2014

Channels and pipes: close and errors

This post describes changes made to Go channels and tools built upon those to provide system and network-wide services for a new OS under construction. Further posts will follow and describe other related set of tools, but the post is already long enough. Yes, belts are gone, we use our modified channels directly now.

A channel is an artifact that can be used to send (typed) data through it. The Go language operations on channels include sending, receiving, and selection among a set of send and receive operations. Go permits also to close a channel after the last item has been sent.
On most systems, processes and applications talk through pipes, network connections, FIFOS, and related artifacts. In short, they are just file descriptors once open, permit the application to write data (for sending) and/or read data (for receiving). Some of these are duplex, but they can be considered to be a pair of devices (one for each direction). In what follows we will refer to all these artifacts as pipes (e.g., a net- work connection may be considered as a pair of simplex pipes).
There is a mismatch between channels and pipes and this paper shows what we did to try to bridge the gap between both abstractions for a new system. The aim is to let applications leverage the CSP style of programming while, at the same time, let them work across the network.
We assume that the reader is familiar with channels in the Go language, and describes only our modi- fications and additions.


Close and errors

When using pipes, each end of the pipe may close and the pipe implementation takes care of propagating the error to the other end. That is not the case with standard Go channels. Furthermore, upon errors, it is desirable for one end of the pipe to learn about the error that did happen at the other end.
We have modified the standard Go implementation to:
  1. 1  Accept an optional error argument to close.
  2. 2  Make the send operation return false when used on a closed channel (instead of panicing; the receive operation already behaves nicely the case of a closed channel).
  3. 3  Provide a primitive, cerror, that returns the error given when the channel was closed.
  4. 4  Make a close of an already closed channel a no-operation (instead of a panic).
With this modified tool in hand, it is feasible to write the following code:

var inc, outc chan[]byte
...
for data := range inc {
ndata := modify(data)
if ok := outc <-ndata; !ok {
close(inc, cerror(outc))
break
}
}
close(outc, cerror(inc))


Here, a process consumes data from inc and produces new data through outc for another one. The image to have in mind is


where the code shown corresponds to the middle process. Perhaps the first process terminates normally (or abnormally), in which case it would close inc. In this case, our code closes outc as expected. But this time, the error given by the first process is known to the second process, and it can even forward such error to the third one.
A more interesting case is when the third process decides to cease consuming data from outc and calls close. Now, our middle process will notice that ok becomes false when it tries to send more data, and can break its loop cleanly, closing also the input channel to singal to the first process that there is no point in producing further data. In this second example, the last call to close is a no-operation because the output channel was already closed, and we don’t need to add unnecessary code to prevent the call.
The important point is that termination of the data stream is easy to handle for the program without resorting to exceptions (or panics), and we know which one is the error, so we can take whatever measures are convenient in that case.


Channels and pipes
There are three big differences between channels and pipes (we are using pipe to refer to any ‘‘file descrip- tor’’ used to convey data, as stated before). One is that pipes may have errors when sending or receiving, but channels do not. Another one is that pipes carry ony streams of bytes and not separate messages. Yet another is that channels convey a data type but pipes convey just bytes.
The first difference is mostly dealt with the changes made to channels as described in the previous section. That is, channels may have errors while sending and or receiving, considered the changes made. Therefore, the code using a channel must consider errors in very much the same way it would do if using a pipe.
To address the third difference we are going to consider channels of byte arrays by now.
The second difference can be dealt with by ensuring that applications using channels to speak through a pipe preserve message boundaries within the pipe. With this in mind, a new nchan package pro- vides new channel tools to bridge the gap between the channel and the pipe domains.
The following function writes each message received from c into w as it arrives. If w preserves mes- sage boundaries, that is enough. The second function is its counterpart.

func WriteBytesTo(w io.Writer, c <-chan []byte) (int64, error)
func ReadBytesFrom(r io.Reader, c chan<- []byte) (int64, error)

 However, is most cases, the transport does not preserve message boundaries. Thus, the next function writes all messages received from c into w, but precedes each such write with a header indicating the message length. The second function can rely on this to read one message at a time and forward it to the given chan- nel.
  func WriteMsgsTo(w io.Writer, c <-chan []byte) (int64, error)
func ReadMsgsFrom(r io.Reader, c chan<- []byte) (int64, error)


One interesting feature of WriteMsgsTo and ReadMsgsFrom is that when the channel is closed, its error sta- tus is checked out and forwarded through the pipe. The other end notices that the message is an error indi- cation and closes the channel with said error.
Thus, code like the excerpt shown for our middle process in the stream of processes would work correctly even if the input channel comes from a pipe and not from a another process within the same program.

The the post in the series will be about channel connections and channel multiplexors.


No comments:

Post a Comment