// Copyright (c) 2012-2020 Ugorji Nwoke. All rights reserved.
// Use of this source code is governed by a MIT license found in the LICENSE file.

/*
Package codec provides a
High Performance, Feature-Rich Idiomatic Go 1.4+ codec/encoding library
for binc, msgpack, cbor, json.

Supported Serialization formats are:

  - msgpack: https://github.com/msgpack/msgpack
  - binc:    http://github.com/ugorji/binc
  - cbor:    http://cbor.io http://tools.ietf.org/html/rfc7049
  - json:    http://json.org http://tools.ietf.org/html/rfc7159
  - simple:  (unpublished)

This package will carefully use 'package unsafe' for performance reasons in specific places.
You can build without unsafe use by passing the safe or appengine tag
i.e. 'go install -tags=codec.safe ...'.

This library works with both the standard `gc` and the `gccgo` compilers.

For detailed usage information, read the primer at http://ugorji.net/blog/go-codec-primer .

The idiomatic Go support is as seen in other encoding packages in
the standard library (e.g. json, xml, gob).

Rich Feature Set includes:

  - Simple but extremely powerful and feature-rich API
  - Support for go 1.4 and above, while selectively using newer APIs for later releases
  - Excellent code coverage ( > 90% )
  - Very High Performance.
    Our extensive benchmarks show us outperforming Gob, Json, Bson, etc by 2-4X.
  - Carefully selected use of 'unsafe' for targeted performance gains.
  - 100% safe mode supported, where 'unsafe' is not used at all.
  - Lock-free (sans mutex) concurrency for scaling to 100s of cores
  - In-place updates during decode, with option to zero value in maps and slices prior to decode
  - Coerce types where appropriate
    e.g. decode an int in the stream into a float, decode numbers from formatted strings, etc
  - Corner Cases:
    Overflows, nil maps/slices, nil values in streams are handled correctly
  - Standard field renaming via tags
  - Support for omitting empty fields during an encoding
  - Encoding from any value and decoding into pointer to any value
    (struct, slice, map, primitives, pointers, interface{}, etc)
  - Extensions to support efficient encoding/decoding of any named types
  - Support encoding.(Binary|Text)(M|Unm)arshaler interfaces
  - Support using existence of `IsZero() bool` to determine if a value is a zero value.
    Analogous to time.Time.IsZero() bool.
  - Decoding without a schema (into an interface{}).
    Includes Options to configure what specific map or slice type to use
    when decoding an encoded list or map into a nil interface{}
  - Mapping a non-interface type to an interface, so we can decode appropriately
    into any interface type with a correctly configured non-interface value.
  - Encode a struct as an array, and decode struct from an array in the data stream
  - Option to encode struct keys as numbers (instead of strings)
    (to support structured streams with fields encoded as numeric codes)
  - Comprehensive support for anonymous fields
  - Fast (no-reflection) encoding/decoding of common maps and slices
  - Code-generation for faster performance, supported in go 1.6+
  - Support binary (e.g. messagepack, cbor) and text (e.g. json) formats
  - Support indefinite-length formats to enable true streaming
    (for formats which support it e.g. json, cbor)
  - Support canonical encoding, where a value is ALWAYS encoded as the same sequence of bytes.
    This mostly applies to maps, where iteration order is non-deterministic.
  - NIL in data stream decoded as zero value
  - Never silently skip data when decoding.
    User decides whether to return an error or silently skip data when keys or indexes
    in the data stream do not map to fields in the struct.
  - Detect and error when encoding a cyclic reference (instead of stack overflow shutdown)
  - Encode/Decode from/to chan types (for iterative streaming support)
  - Drop-in replacement for encoding/json. `json:` key in struct tag supported.
  - Provides an RPC Server and Client Codec for the net/rpc communication protocol.
  - Handle unique idiosyncrasies of codecs, e.g.
    for messagepack, configure how ambiguities in handling raw bytes are resolved, and
    provide an rpc server/client codec to support the
    msgpack-rpc protocol defined at:
    https://github.com/msgpack-rpc/msgpack-rpc/blob/master/spec.md
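
To illustrate the canonical-encoding feature listed above: Go randomizes map
iteration order, so a canonical encoder has to impose its own order on the keys.
The following is a minimal stdlib-only sketch of that idea, not this package's
implementation. The function canonicalPairs and its "k=v;" output format are
hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonicalPairs is a hypothetical helper: it writes a map as "k=v;" pairs
// with the keys in sorted order, so equal maps always yield equal output.
func canonicalPairs(m map[string]int) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // impose a deterministic order on the keys

	var sb strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&sb, "%s=%d;", k, m[k])
	}
	return sb.String()
}

func main() {
	m := map[string]int{"b": 2, "a": 1, "c": 3}
	fmt.Println(canonicalPairs(m)) // a=1;b=2;c=3;
}
```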

# Supported build tags

We gain performance by code-generating fast-paths for slices and maps of built-in types,
and monomorphizing generic code explicitly so we gain inlining and de-virtualization benefits.

The results are 20-40% performance improvements.

Building and running is configured using build tags as below.

At runtime:

- codec.safe: run in safe mode (not using unsafe optimizations)
- codec.notmono: use generics code (bypassing performance-boosting monomorphized code)
- codec.notfastpath: skip fast path code for slices and maps of built-in types (number, bool, string, bytes)

Each of these "runtime" tags has a convenience synonym, i.e. safe, notmono, notfastpath.
Please use the synonyms mostly during development, and use codec.XXX in your go files.

Build only:

- codec.build: used to generate fastpath and monomorphization code

Test only:

- codec.notmammoth: skip the mammoth generated tests

# Extension Support

Users can register a function to handle the encoding or decoding of
their custom types.

There are no restrictions on what the custom type can be. Some examples:

	type BitSet   []int
	type BitSet64 uint64
	type UUID     string
	type MyStructWithUnexportedFields struct { a int; b bool; c []int; }
	type GifImage struct { ... }

As an illustration, MyStructWithUnexportedFields would normally be
encoded as an empty map because it has no exported fields, while UUID
would be encoded as a string. However, with extension support, you can
encode any of these however you like.

There is also seamless support provided for registering an extension (with a tag)
but letting the encoding mechanism default to the standard way.
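
As a sketch of what such a handler pair might look like for the BitSet64 example
above, the following stdlib-only program converts a BitSet64 to and from 8
big-endian bytes. The method names WriteExt and ReadExt are modeled on what we
understand the package's bytes-based extension interface to be. Treat that
correspondence, the type name bitSet64Ext, and the choice of big-endian as
assumptions, and check the SetExt documentation before relying on them.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

type BitSet64 uint64

// bitSet64Ext is a hypothetical extension handler for BitSet64.
type bitSet64Ext struct{}

// WriteExt converts a BitSet64 (or *BitSet64) into its 8-byte big-endian form.
func (bitSet64Ext) WriteExt(v interface{}) []byte {
	var x BitSet64
	switch t := v.(type) {
	case BitSet64:
		x = t
	case *BitSet64:
		x = *t
	default:
		panic(fmt.Sprintf("bitSet64Ext: unsupported type %T", v))
	}
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(x))
	return b
}

// ReadExt updates dst (which must be a *BitSet64) from the 8 bytes in src.
func (bitSet64Ext) ReadExt(dst interface{}, src []byte) {
	p := dst.(*BitSet64)
	*p = BitSet64(binary.BigEndian.Uint64(src))
}

func main() {
	var ext bitSet64Ext
	in := BitSet64(0x8000000000000001)
	raw := ext.WriteExt(in)

	var out BitSet64
	ext.ReadExt(&out, raw)
	fmt.Println(len(raw), in == out) // 8 true
}
```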

# Custom Encoding and Decoding

This package maintains symmetry in the encoding and decoding halves.
We determine how to encode or decode by walking this decision tree:

  - is there an extension registered for the type?
  - is type a codec.Selfer?
  - is format binary, and is type an encoding.BinaryMarshaler and BinaryUnmarshaler?
  - is format specifically json, and is type an encoding/json.Marshaler and Unmarshaler?
  - is format text-based, and is type an encoding.TextMarshaler and TextUnmarshaler?
  - else we use a pair of functions based on the "kind" of the type e.g. map, slice, int64, etc

This symmetry is important to reduce chances of issues happening because the
encoding and decoding sides are out of sync e.g. decoded via very specific
encoding.TextUnmarshaler but encoded via kind-specific generalized mode.

Consequently, if a type only defines one-half of the symmetry
(e.g. it implements UnmarshalJSON() but not MarshalJSON() ),
then that type doesn't satisfy the check and we will continue walking down the
decision tree.

# RPC

RPC Client and Server Codecs are implemented, so the codecs can be used
with the standard net/rpc package.

# Usage

The Handle is SAFE for concurrent READ, but NOT SAFE for concurrent modification.

The Encoder and Decoder are NOT safe for concurrent use.

Consequently, the usage model is basically:

  - Create and initialize the Handle before any use.
    Once created, DO NOT modify it.
  - Multiple Encoders or Decoders can now use the Handle concurrently.
    They only read information off the Handle (never write).
  - However, each Encoder or Decoder MUST NOT be used concurrently.
  - To re-use an Encoder/Decoder, call Reset(...) on it first.
    This allows you to re-use state maintained on the Encoder/Decoder.

Sample usage model:

	// create and configure Handle
	var (
	  bh codec.BincHandle
	  mh codec.MsgpackHandle
	  ch codec.CborHandle
	)

	mh.MapType = reflect.TypeOf(map[string]interface{}(nil))

	// configure extensions
	// e.g. for msgpack, define functions and enable Time support for tag 1
	// mh.SetExt(reflect.TypeOf(time.Time{}), 1, myExt)

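	// create and use decoder/encoder
	var (
	  r io.Reader
	  w io.Writer
	  b []byte
	  v interface{} // value to encode, or destination to decode into
	  h = &bh // or mh to use msgpack
	)

	// decode from an io.Reader, or directly from a []byte
	dec := codec.NewDecoder(r, h)
	dec = codec.NewDecoderBytes(b, h)
	err := dec.Decode(&v)

	// encode to an io.Writer, or into a []byte
	enc := codec.NewEncoder(w, h)
	enc = codec.NewEncoderBytes(&b, h)
	err = enc.Encode(v)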

	//RPC Server
	go func() {
	    for {
	        conn, err := listener.Accept()
	        if err != nil {
	            return // stop serving if the listener fails or is closed
	        }
	        rpcCodec := codec.GoRpc.ServerCodec(conn, h)
	        //OR rpcCodec := codec.MsgpackSpecRpc.ServerCodec(conn, h)
	        rpc.ServeCodec(rpcCodec)
	    }
	}()

	//RPC Communication (client side)
	conn, err := net.Dial("tcp", "localhost:5555")
	if err != nil {
	    panic(err) // replace with real error handling
	}
	rpcCodec := codec.GoRpc.ClientCodec(conn, h)
	//OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
	client := rpc.NewClientWithCodec(rpcCodec)

# Running Tests

To run tests, use the following:

	go test

To run the full suite of tests, use the following:

	go test -tags alltests -run Suite

You can use the tag 'codec.safe' to run tests or build in safe mode, e.g.

	go test -tags codec.safe -run Json
	go test -tags "alltests codec.safe" -run Suite

You can use the tag 'codec.notmono' to build bypassing the monomorphized code, e.g.

	go test -tags codec.notmono -run Json

# Running Benchmarks

	cd bench
	go test -bench . -benchmem -benchtime 1s

Please see http://github.com/ugorji/go-codec-bench .

# Caveats

Struct fields matching any of the following are ignored during encoding and decoding:
  - struct tag value set to -
  - func, complex numbers, unsafe pointers
  - unexported and not embedded
  - unexported and embedded and not struct kind
  - unexported and embedded pointers (from go1.10)

Every other field in a struct will be encoded/decoded.

Embedded fields are encoded as if they exist in the top-level struct,
with some caveats. See Encode documentation.
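
The rules above can be approximated with the reflect package. The sketch below is
a simplified stdlib-only illustration, not this package's field-selection code:
the helper names ignored and keptFields are hypothetical, and it assumes the
struct tag key is `codec` and only recognizes a tag value of exactly "-".

```go
package main

import (
	"fmt"
	"reflect"
)

// ignored reports whether a struct field would be skipped under the
// rules listed above (simplified).
func ignored(f reflect.StructField) bool {
	if f.Tag.Get("codec") == "-" {
		return true // struct tag value set to -
	}
	switch f.Type.Kind() {
	case reflect.Func, reflect.Complex64, reflect.Complex128, reflect.UnsafePointer:
		return true // func, complex numbers, unsafe pointers
	}
	if f.PkgPath != "" { // a non-empty PkgPath means the field is unexported
		if !f.Anonymous {
			return true // unexported and not embedded
		}
		if f.Type.Kind() != reflect.Struct {
			return true // unexported and embedded, but a pointer or other non-struct kind
		}
	}
	return false
}

// keptFields returns the names of the fields of struct value v that are not ignored.
func keptFields(v interface{}) []string {
	t := reflect.TypeOf(v)
	var names []string
	for i := 0; i < t.NumField(); i++ {
		if f := t.Field(i); !ignored(f) {
			names = append(names, f.Name)
		}
	}
	return names
}

type inner struct{ X int }

type sample struct {
	Name    string
	Skipped string `codec:"-"`
	Fn      func()
	hidden  int
	inner   // unexported, embedded, struct kind: kept
}

func main() {
	fmt.Println(keptFields(sample{})) // [Name inner]
}
```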
*/
package codec

/*
Generics

Generics are used across the board to reduce boilerplate, and hopefully
improve performance by
- reducing need for interface calls (de-virtualization)
- resultant inlining of those calls

encoder/decoder --> Driver (json/cbor/...) --> input/output (bytes or io abstraction)

There are 2 * 5 * 2 (20) combinations of monomorphized values.

Key rules
- do not use top-level generic functions.
  Due to type inference, monomorphizing them proves challenging
- only use generic methods.
  Monomorphizing is done at the type once, and method names need not change
- do not give methods a parameter of type encWriter or decReader.
  All those calls are handled directly by the driver.
- Include a helper type for each parameterized thing, and add all generic functions to them e.g.
  helperEncWriter[T encWriter]
  helperEncReader[T decReader]
  helperEncDriver[T encDriver]
  helperDecDriver[T decDriver]
- Always use T as the generic type name (when needed)
- No inline types
- No closures taking parameters of generic types
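
A minimal sketch of the shape these rules lead to, using hypothetical stand-in
names (this encWriter interface, bytesEncWriter and textDriver are illustrative
and are not the package's real definitions): the logic lives in methods of a
generic type, T is the only type parameter name, and the writer is a field of
the driver rather than a method parameter.

```go
package main

import "fmt"

// encWriter is a hypothetical stand-in for the package's writer abstraction.
type encWriter interface {
	writeb(p []byte)
}

// bytesEncWriter is one concrete writer; it appends to a byte slice.
type bytesEncWriter struct{ buf []byte }

func (w *bytesEncWriter) writeb(p []byte) { w.buf = append(w.buf, p...) }

// textDriver is parameterized on the writer type. Its behavior is defined
// by methods on the generic type, not by top-level generic functions.
type textDriver[T encWriter] struct {
	w T
}

// encodeString writes s in double quotes. The writer is reached through
// the driver's field, so no method takes an encWriter parameter.
func (d *textDriver[T]) encodeString(s string) {
	d.w.writeb([]byte(`"` + s + `"`))
}

func main() {
	w := &bytesEncWriter{}
	d := textDriver[*bytesEncWriter]{w: w}
	d.encodeString("hi")
	fmt.Println(string(w.buf)) // "hi"
}
```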

*/
/*
Naming convention:

Currently, as generic and non-generic types/functions/vars are put in the same files,
we suffer because:
- build takes longer as non-generic code is built when a build tag wants only monomorphized code
- files have many lines which are not used at runtime (due to type parameters)
- code coverage is inaccurate on a single run

To resolve this, we are streamlining our file naming strategy.

Basically, we will have the following nomenclature for filenames:
- fastpath (tag:notfastpath):        *.notfastpath.*.go vs *.fastpath.*.go
- typed parameters (tag:notmono):    *.notmono.*.go vs *.mono.*.go
- safe (tag:safe):                   *.safe.*.go vs *.unsafe.go
- generated files:                   *.generated.go
- all others (tags:N/A):             *.go without safe/mono/fastpath/generated in the name

The following files will be affected and split/renamed accordingly

Base files:
- binc.go
- cbor.go
- json.go
- msgpack.go
- simple.go
- decode.go
- encode.go

For each base file, split into __file__.go (containing type parameters) and __file__.base.go.
__file__.go will only build with notmono.

Other files:
- fastpath.generated.go -> base.fastpath.generated.go and base.fastpath.notmono.generated.go
- fastpath.not.go       -> base.notfastpath.go
- init.go               -> init.notmono.go

Appropriate build tags will be included in the files, and the right ones only used for
monomorphization.
*/
/*
Caching Handle options for fast runtime use

If using cached values from Handle options, then
- re-cache them at each reset() call
- reset is always called at the start of each (Must)(En|De)code
  - which calls (en|de)coder.reset([]byte|io.Reader|String)
  - which calls (en|de)cDriver.reset()
- at reset, (en|de)c(oder|Driver) can re-cache Handle options before each run

Some examples:
- json: e.rawext,di,d,ks,is / d.rawext
- decode: (decoderBase) d.jsms,mtr,str,
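
The re-cache-at-reset pattern described above can be sketched as follows. The
types handle and encoder and the indent option are hypothetical stand-ins, not
the package's real Handle or Encoder. The point shown is only that the cached
copy is refreshed in reset(), which runs at the start of every encode.

```go
package main

import (
	"fmt"
	"strings"
)

// handle stands in for a Handle; indent is a hypothetical option.
type handle struct{ indent int }

// encoder caches the option it needs, so the hot path does not reach
// through the handle on every call.
type encoder struct {
	h      *handle
	indent int // cached copy of h.indent
}

// reset re-caches the handle options.
func (e *encoder) reset() { e.indent = e.h.indent }

// encode calls reset first, then uses only the cached value.
func (e *encoder) encode(s string) string {
	e.reset()
	return strings.Repeat(" ", e.indent) + s
}

func main() {
	h := &handle{indent: 2}
	e := &encoder{h: h}
	fmt.Printf("%q\n", e.encode("x")) // "  x"

	h.indent = 0 // changed between runs, with no encode in flight
	fmt.Printf("%q\n", e.encode("x")) // "x"
}
```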
*/