This repository has been archived by the owner on May 13, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 22
/
Copy pathgoogletrans_bench_test.go
446 lines (442 loc) · 23.4 KB
/
googletrans_bench_test.go
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
package googletrans
import (
"testing"
)
func BenchmarkTranslate(b *testing.B) {
for i := 0; i < b.N; i++ {
params := TranslateParams{
Src: "auto",
Dest: "zh-CN",
Text: "Go is an open source programming language that makes it easy to build simple, reliable, and efficient software. ",
}
Translate(params)
}
}
func BenchmarkParseRawTranslated(b *testing.B) {
data := []byte(rawTraslatedStr)
var t *Translator
for i := 0; i < b.N; i++ {
t.parseRawTranslated(data)
}
}
var rawTraslatedStr = `
[[["Go中的字符串,字节,符文和字符\n\n","Strings, bytes, runes and characters in Go\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["罗伯·派克\n","Rob Pike\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["2013年10月23日\n","23 October 2013\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["介绍\n\n","Introduction\n\n",null,null,1]
,["上一篇博客文章使用许多示例说明了切片在Go中的工作原理,以说明其实现的机制。","The previous blog post explained how slices work in Go, using a number of examples to illustrate the mechanism behind their implementation.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["在此背景下,本文讨论了Go中的字符串。","Building on that background, this post discusses strings in Go.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["最初,字符串对于博客文章而言似乎太简单了,但是要想很好地使用它们,不仅需要了解它们的工作原理,还需要了解字节,字符和符文之间的区别,以及Unicode和UTF- ","At first, strings might seem too simple a topic for a blog post, but to use them well requires understanding not only how they work, but also the difference between a byte, a character, and a rune, the difference between Unicode and UTF-",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["8,字符串和字符串文字之间的区别,以及其他更细微的区别。\n\n","8, the difference between a string and a string literal, and other even more subtle distinctions.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["处理该主题的一种方法是将其视为对以下常见问题的回答:“当我在位置n处索引Go字符串时,为什么不得到第n个字符?”","One way to approach this topic is to think of it as an answer to the frequently asked question, \"When I index a Go string at position n, why don't I get the nth character?\"",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["如您所见,这个问题使我们获得了许多有关文本在现代世界中如何工作的细节。\n\n","As you'll see, this question leads us to many details about how text works in the modern world.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["Joel Spolsky的著名博客文章,绝对绝对是每个软件开发人员绝对肯定地了解Unicode和字符集(无借口!),是其中一个独立于Go的极好的介绍。","An excellent introduction to some of these issues, independent of Go, is Joel Spolsky's famous blog post, The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["他提出的许多观点将在这里得到回应。\n","Many of the points he raises will be echoed here.\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["什么是琴弦?\n\n","What is a string?\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["让我们从一些基础知识开始。\n\n","Let's start with some basics.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["在Go中,字符串实际上是只读的字节片。","In Go, a string is in effect a read-only slice of bytes.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["如果您不确定某个字节是什么字节或如何工作,请阅读上一篇博客文章;","If you're at all uncertain about what a slice of bytes is or how it works, please read the previous blog post;",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["我们在这里假设您有。\n\n","we'll assume here that you have.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["重要的是要预先声明字符串包含任意字节。","It's important to state right up front that a string holds arbitrary bytes.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["不需要保留Unicode文本,UTF-8文本或任何其他预定义格式。","It is not required to hold Unicode text, UTF-8 text, or any other predefined format.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["就字符串的内容而言,它完全相当于一个字节片。\n\n","As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["这是一个字符串文字(稍后将详细介绍),该文字使用\\ xNN表示法定义一个包含一些特殊字节值的字符串常量。 ","Here is a string literal (more about those soon) that uses the \\xNN notation to define a string constant holding some peculiar byte values.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["(当然,字节的范围是十六进制值00到FF,包括两端)。\n\n ","(Of course, bytes range from hexadecimal values 00 through FF, inclusive.)\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["const sample \u003d“ \\ xbd \\ xb2 \\ x3d \\ xbc \\ x20 \\ xe2 \\ x8c \\ x98”\n\n","const sample \u003d \"\\xbd\\xb2\\x3d\\xbc\\x20\\xe2\\x8c\\x98\"\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["打印字符串\n\n","Printing strings\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["由于示例字符串中的某些字节不是有效的ASCII,甚至不是有效的UTF-8,因此直接打印字符串将产生难看的输出。","Because some of the bytes in our sample string are not valid ASCII, not even valid UTF-8, printing the string directly will produce ugly output.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["简单的打印声明\n\n ","The simple print statement\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["fmt.Println(样本)\n\n","fmt.Println(sample)\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["产生这种混乱(其确切外观随环境而变化):\n\n ","produces this mess (whose exact appearance varies with the environment):\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["��\u003d�⌘\n\n","��\u003d� ⌘\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["要找出该字符串的真正含义,我们需要将其拆开并检查各个部分。","To find out what that string really holds, we need to take it apart and examine the pieces.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["有几种方法可以做到这一点。","There are several ways to do this.",null,null,1]
,["最明显的是遍历其内容并逐个拉出字节,如以下for循环所示:\n\n ","The most obvious is to loop over its contents and pull out the bytes individually, as in this for loop:\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["对于我:\u003d 0; ","for i :\u003d 0;",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["i \u003clen(样本);","i \u003c len(sample);",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["我++ {\n ","i++ {\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["fmt.Printf(“%x”,sample [i])\n ","fmt.Printf(\"%x \", sample[i])\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["}\n\n","}\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["如前所述,对字符串进行索引访问的是单个字节,而不是字符。","As implied up front, indexing a string accesses individual bytes, not characters.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["我们将在下面详细返回该主题。","We'll return to that topic in detail below.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["现在,让我们仅保留字节。","For now, let's stick with just the bytes.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["这是逐字节循环的输出:\n\n","This is the output from the byte-by-byte loop:\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["bd b2 3d bc 20 e2 8c 98\n\n","bd b2 3d bc 20 e2 8c 98\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["注意各个字节如何与定义字符串的十六进制转义符匹配。\n\n","Notice how the individual bytes match the hexadecimal escapes that defined the string.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["为混乱的字符串生成可显示的输出的一种较短方法是使用fmt.Printf的%x(十六进制)格式动词。","A shorter way to generate presentable output for a messy string is to use the %x (hexadecimal) format verb of fmt.Printf.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["它只是将字符串的顺序字节转储为十六进制数字,每个字节两个。\n\n ","It just dumps out the sequential bytes of the string as hexadecimal digits, two per byte.\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["fmt.Printf(“%x \\ n”,示例)\n\n","fmt.Printf(\"%x\\n\", sample)\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["将其输出与上面的输出进行比较:\n\n","Compare its output to that above:\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["bdb23dbc20e28c98\n\n","bdb23dbc20e28c98\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["一个不错的技巧是使用该格式的“空格”标志,在%和x之间放置一个空格。","A nice trick is to use the \"space\" flag in that format, putting a space between the % and the x.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["将此处使用的格式字符串与上面的格式字符串进行比较,\n\n ","Compare the format string used here to the one above,\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["fmt.Printf(“%x \\ n”,示例)\n\n","fmt.Printf(\"% x\\n\", sample)\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["并注意字节之间如何留出空格,从而使结果不那么强悍:\n\n","and notice how the bytes come out with spaces between, making the result a little less imposing:\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["bd b2 3d bc 20 e2 8c 98\n\n","bd b2 3d bc 20 e2 8c 98\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["还有更多。 ","There's more.",null,null,1]
,["%q(带引号)动词将转义字符串中所有不可打印的字节序列,因此输出无歧义。\n\n ","The %q (quoted) verb will escape any non-printable byte sequences in a string so the output is unambiguous.\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["fmt.Printf(“%q \\ n”,示例)\n\n","fmt.Printf(\"%q\\n\", sample)\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["当大部分字符串可理解为文本但有一些特殊的含义可以根除时,此技术很方便。","This technique is handy when much of the string is intelligible as text but there are peculiarities to root out;",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["它产生:\n\n","it produces:\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["“ \\ xbd \\ xb2 \u003d \\ xbc⌘”\n\n","\"\\xbd\\xb2\u003d\\xbc ⌘\"\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["如果我们斜视一下,我们可以看到噪声中隐藏的是一个ASCII等号以及规则的空格,最后出现了著名的瑞典“景点”符号。","If we squint at that, we can see that buried in the noise is one ASCII equals sign, along with a regular space, and at the end appears the well-known Swedish \"Place of Interest\" symbol.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["该符号的Unicode值为U + 2318,由空格后的字节(十六进制值20)编码为UTF-8:e2 8c 98。\n\n","That symbol has Unicode value U+2318, encoded as UTF-8 by the bytes after the space (hex value 20): e2 8c 98.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["如果我们对字符串中的陌生值不熟悉或感到困惑,可以对%q动词使用“加号”标志。","If we are unfamiliar or confused by strange values in the string, we can use the \"plus\" flag to the %q verb.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["此标志使输出在解释UTF-8时不仅转义不可打印的序列,而且转义所有非ASCII字节。","This flag causes the output to escape not only non-printable sequences, but also any non-ASCII bytes, all while interpreting UTF-8.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["结果是它公开了格式正确的UTF-8的Unicode值,该值表示字符串中的非ASCII数据:\n\n ","The result is that it exposes the Unicode values of properly formatted UTF-8 that represents non-ASCII data in the string:\n\n ",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["fmt.Printf(“%+ q \\ n”,示例)\n\n","fmt.Printf(\"%+q\\n\", sample)\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["使用这种格式,瑞典符号的Unicode值显示为\\ u转义:\n\n","With that format, the Unicode value of the Swedish symbol shows up as a \\u escape:\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["“ \\ xbd \\ xb2 \u003d \\ xbc \\ u2318”\n\n","\"\\xbd\\xb2\u003d\\xbc \\u2318\"\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["在调试字符串的内容时,这些打印技术很不错,并且在下面的讨论中会很方便。","These printing techiques are good to know when debugging the contents of strings, and will be handy in the discussion that follows.",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["值得指出的是,所有这些方法对于字节片的行为与对字符串的行为完全相同。\n\n","It's worth pointing out as well that all these methods behave exactly the same for byte slices as they do for strings.\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["这是我们列出的全套打印选项,以完整的程序形式显示,您可以在浏览器中直接运行(和编辑):\n\n","Here's the full set of printing options we've listed, presented as a complete program you can run (and edit) right in the browser:\n\n",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,["包米","package m",null,null,3,null,null,[[]
]
,[[["b3ad15e7a0073e77814019b341d18493","en_zh_2019q3.md"]
]
]
]
,[null,null,"Go zhōng de zìfú chuàn, zì jié, fú wén hé zìfú\n\nluō bó·pàikè\n2013 nián 10 yuè 23 rì\njièshào\n\nshàng yī piān bókè wénzhāng shǐyòng xǔduō shìlì shuōmíngliǎo qiēpiàn zài Go zhōng de gōngzuò yuánlǐ, yǐ shuōmíng qí shíxiàn de jīzhì. Zài cǐ bèijǐng xià, běnwén tǎolùnle Go zhōng de zìfú chuàn. Zuìchū, zìfú chuàn duìyú bókè wénzhāng ér yán sìhū tài jiǎndānle, dànshì yào xiǎng hěn hǎo dì shǐyòng tāmen, bùjǐn xūyào liǎojiě tāmen de gōngzuò yuánlǐ, hái xūyào liǎojiě zì jié, zìfú hé fúwénzhī jiān de qūbié, yǐjí Unicode hé UTF- 8, zìfú chuàn hé zìfú chuàn wénzì zhī jiān de qūbié, yǐjí qítā gèng xìwéi de qūbié.\n\nChǔlǐ gāi zhǔtí de yī zhǒng fāngfǎ shì jiāng qí shì wéi duì yǐxià chángjiàn wèntí de huídá:“Dāng wǒ zài wèizhì n chù suǒyǐn Go zìfú chuàn shí, wèishéme bù dédào dì n gè zìfú?” Rú nín suǒ jiàn, zhège wèntí shǐ wǒmen huòdéle xǔduō yǒuguān wénběn zài xiàndài shìjiè zhōng rúhé gōngzuò de xìjié.\n\nJoel Spolsky de zhùmíng bókè wénzhāng, juéduì juéduì shì měi gè ruǎnjiàn kāifā rényuán juéduì kěndìng dì liǎojiě Unicode hé zìfú jí (wú jièkǒu!), Shì qízhōng yīgè dúlì yú Go de jí hǎo de jièshào. Tā tíchū de xǔduō guāndiǎn jiàng zài zhèlǐ dédào huíyīng.\nShénme shì qín xián?\n\nRàng wǒmen cóng yīxiē jīchǔ zhīshì kāishǐ.\n\nZài Go zhōng, zìfú chuàn shíjì shang shì zhǐ dú de zì jié piàn. Rúguǒ nín bù quèdìng mǒu gè zì jié shì shénme zì jié huò rúhé gōngzuò, qǐng yuèdú shàng yī piān bókè wénzhāng; wǒmen zài zhèlǐ jiǎshè nín yǒu.\n\nZhòngyào de shì yào yùxiān shēngmíng zìfú chuàn bāohán rènyì zì jié. Bù xūyào bǎoliú Unicode wénběn,UTF-8 wénběn huò rènhé qítā yù dìngyì géshì. Jiù zìfú chuàn de nèiróng ér yán, tā wánquán xiāngdāng yú yīgè zì jié piàn.\n\nZhè shì yīgè zìfú chuàn wénzì (shāo hòu jiāng xiángxì jièshào), gāi wénzì shǐyòng\\ xNN biǎoshì fǎ dìngyì yīgè bāohán yīxiē tèshū zì jié zhí de zìfú chuàn chángliàng. (Dāngrán, zì jié de fànwéi shì shíliù jìn zhì zhí 00 dào FF, bāokuò liǎng duān).\n\n Const sample \u003d“ \\ xbd\\ xb2\\ x3d\\ xbc\\ x20\\ xe2\\ x8c\\ x98”\n\ndǎyìn zìfú chuàn\n\nyóuyú shìlì zìfú chuàn zhōng de mǒu xiē zì jié bùshì yǒuxiào de ASCII, shènzhì bùshì yǒuxiào de UTF-8, yīncǐ zhíjiē dǎyìn zìfú chuàn jiāng chǎnshēng nánkàn de shūchū. Jiǎndān de dǎyìn shēngmíng\n\n fmt.Println(yàngběn)\n\nchǎnshēng zhè zhǒng hǔnluàn (qí quèqiè wàiguān suí huánjìng ér biànhuà):\n\n ��\u003d�⌘\n\nYào zhǎo chū gāi zìfú chuàn de zhēnzhèng hányì, wǒmen xūyào jiāng qí chāi kāi bìng jiǎnchá gège bùfèn. Yǒu jǐ zhǒng fāngfǎ kěyǐ zuò dào zhè yīdiǎn. Zuì míngxiǎn de shì biànlì qí nèiróng bìng zhúgè lā chū zì jié, rú yǐxià for xúnhuán suǒ shì:\n\n Duìyú wǒ:\u003d 0; I \u003clen(yàngběn); wǒ ++ {\n fmt.Printf(“%x”,sample [i])\n }\n\nrú qián suǒ shù, duì zìfú chuàn jìnxíng suǒyǐn fǎngwèn de shì dāngè zì jié, ér bùshì zìfú. Wǒmen jiàng zài xiàmiàn xiángxì fǎnhuí gāi zhǔtí. Xiànzài, ràng wǒmen jǐn bǎoliú zì jié. Zhè shì zhú zì jié xúnhuán de shūchū:\n\nBd b2 3d bc 20 e2 8c 98\n\nzhùyì gège zì jié rúhé yǔ dìngyì zìfú chuàn de shíliù jìn zhì zhuǎn yì fú pǐpèi.\n\nWèi hǔnluàn de zìfú chuàn shēngchéng kě xiǎnshì de shūchū de yī zhǒng jiào duǎn fāngfǎ shì shǐyòng fmt.Printf de%x(shíliù jìn zhì) géshì dòngcí. Tā zhǐshì jiāng zìfú chuàn de shùnxù zì jié zhuǎn chǔ wèi shíliù jìn zhì shùzì, měi gè zì jié liǎng gè.\n\n Fmt.Printf(“%x\\ n”, shìlì)\n\njiāng qí shūchū yǔ shàngmiàn de shūchū jìnxíng bǐjiào:\n\nBdb23dbc20e28c98\n\nyīgè bùcuò de jìqiǎo shì shǐyòng gāi géshì de “kònggé” biāozhì, zài%hé x zhī jiān fàngzhì yīgè kònggé. Jiāng cǐ chù shǐyòng de géshì zìfú chuàn yǔ shàngmiàn de géshì zìfú chuàn jìnxíng bǐjiào,\n\n fmt.Printf(“%x\\ n”, shìlì)\n\nbìng zhùyì zì jié zhī jiān rúhé liú chū kònggé, cóng'ér shǐ jiéguǒ bù nàme qiánghàn:\n\nBd b2 3d bc 20 e2 8c 98\n\nhái yǒu gèng duō. %Q(dài yǐnhào) dòngcí jiāng zhuǎn yì zìfú chuàn zhōng suǒyǒu bùkě dǎyìn de zì jié xùliè, yīncǐ shūchū wú qíyì.\n\n Fmt.Printf(“%q\\ n”, shìlì)\n\ndāng dà bùfèn zìfú chuàn kě lǐjiě wèi wénběn dàn yǒu yīxiē tèshū de hányì kěyǐ gēnchú shí, cǐ jìshù hěn fāngbiàn. Tā chǎnshēng:\n\n“ \\ Xbd\\ xb2 \u003d\\ xbc⌘”\n\nrúguǒ wǒmen xiéshì yīxià, wǒmen kěyǐ kàn dào zàoshēng zhōng yǐncáng de shì yīgè ASCII děng hào yǐjí guīzé de kònggé, zuìhòu chūxiànle zhùmíng de ruìdiǎn “jǐngdiǎn” fúhào. Gāi fúhào de Unicode zhí wèi U + 2318, yóu kònggé hòu de zì jié (shíliù jìn zhì zhí 20) biānmǎ wèi UTF-8:E2 8c 98.\n\nRúguǒ wǒmen duì zìfú chuàn zhōng de mòshēng zhí bù shúxī huò gǎndào kùnhuò, kěyǐ duì%q dòngcí shǐyòng “jiā hào” biāozhì. Cǐ biāozhì shǐ shūchū zài jiěshì UTF-8 shí bùjǐn zhuǎn yì bùkě dǎyìn de xùliè, érqiě zhuǎn yì suǒyǒu fēi ASCII zì jié. Jiéguǒ shì tā gōngkāile géshì zhèngquè de UTF-8 de Unicode zhí, gāi zhí biǎoshì zìfú chuàn zhōng de fēi ASCII shùjù:\n\n Fmt.Printf(“%+ q\\ n”, shìlì)\n\nshǐyòng zhè zhǒng géshì, ruìdiǎn fúhào de Unicode zhí xiǎnshì wèi\\ u zhuǎn yì:\n\n“ \\ Xbd\\ xb2 \u003d\\ xbc\\ u2318”\n\nzài tiáoshì zìfú chuàn de nèiróng shí, zhèxiē dǎyìn jìshù hěn bùcuò, bìngqiě zài xiàmiàn de tǎolùn zhōng huì hěn fāngbiàn. Zhídé zhǐchū de shì, suǒyǒu zhèxiē fāngfǎ duìyú zì jié piàn de xíngwéi yǔ duì zìfú chuàn de xíngwéi wánquán xiāngtóng.\n\nZhè shì wǒmen liè chū de quántào dǎyìn xuǎnxiàng, yǐ wánzhěng de chéngxù xíngshì xiǎnshì, nín kěyǐ zài liúlǎn qì zhōng zhíjiē yùnxíng (hé biānjí):\n\nBāo mǐ"]
]
,null,"en",null,null,null,1.0,[]
,[["en"]
,null,[1.0]
,["en"]
]
]
`