-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.html
138 lines (122 loc) · 4.65 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Text LZ</title>
<script src="https://code.jquery.com/jquery-3.1.1.slim.min.js" type="application/javascript"></script>
<script src="textlzss.js" type="application/javascript"></script>
<link rel="stylesheet" type="text/css" href="textlzss.css">
</head>
<body>
<a href="https://github.com/nerai/TextLZSS">
<img style="position: absolute; top: 0; left: 0; border: 0;"
src="https://camo.githubusercontent.com/121cd7cbdc3e4855075ea8b558508b91ac463ac2/68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f6769746875622f726962626f6e732f666f726b6d655f6c6566745f677265656e5f3030373230302e706e67"
alt="Fork me on GitHub"
data-canonical-src="https://s3.amazonaws.com/github/ribbons/forkme_left_green_007200.png">
</a>
<div class="content">
<h1>Text LZSS</h1>
<div class="left">
<h4>Reference options</h4>
<div>
Reference length:
<div id="btnLengthFixed" class="toggler toggled">Static (fixed number of bits)</div>
<div id="btnLengthUnary" class="toggler">Dynamic (variable length with unary encoded log prefix)</div>
</div>
<div id="fixedRefLengthOptions">
<table>
<tr>
<td>
<label for="offsetBits">Offset bits</label>
</td>
<td>
<input type="number" id="offsetBits" name="offsetBits"
style="width: 4em; font-size: 1em; text-align: center"
value="7" min="0" max="16" ;>
</td>
<td>
Defines how many bits are used to to store the match offset.
This effectively controls the <span style="font-style: italic">dictionary size</span>.
Larger values allow references from further away.
</td>
</tr>
<tr>
<td>
<label for="lengthBits">Length bits</label>
</td>
<td>
<input type="number" id="lengthBits" name="lengthBits"
style="width: 4em; font-size: 1em; text-align: center"
value="4" min="0" max="16" ;>
</td>
<td>
Controls how many bits are used to store the length of a match.
Larger values allow for referencing larger chunks.
</td>
</tr>
<tr>
<td>
<label for="minLength">Min length</label>
</td>
<td>
<input type="number" id="minLength" name="minLength"
style="width: 4em; font-size: 1em; text-align: center"
value="3" min="0" max="10" ;>
</td>
<td>
The minimum match length ensures that only sufficiently large matches will be used.
Referencing a too small match would take more space than including it as a literal would.
</td>
</tr>
</table>
</div>
<div id="sizeInfo"></div>
<div>
<h4>Encoding options</h4>
<label for="showBits">Display required bits</label>
<input type="checkbox" id="showBits" name="showBits">
<br>
<label for="encodeUtf8">Encode literals as UTF-8 (instead of UTF-16)</label>
<input type="checkbox" id="encodeUtf8" name="encodeUtf8" checked>
</div>
<div>
<h4>Examples</h4>
<div class="button" id="setAaca">Aaca</div>
<div class="button" id="setDova">Dovahkiin</div>
<div class="button" id="setAlice">Alice in Wonderland</div>
<div class="button" id="setPi">Digits of Pi</div>
<div class="button" id="setRandom">Random</div>
<div class="button" id="setUnicodeHi">Unicode and surrogates</div>
</div>
</div>
<textarea id="input">Type here. Your text will be compressed using the LZSS compression algorithm.</textarea>
<br>
<div id="output" class="lzout"></div>
<br>
<br>
<div>
<h3>What is this?</h3>
<div style="text-align: justify; line-height: 150%; margin: auto; width: 80%">
<p>
The text is compressed using a (crude) variant of LZSS. This basically means that any long, repeated
character sequence gets replaced with a short reference to its previous location, thus saving space.
</p>
<p>
Each element in LZSS output is either a literal or a reference.
(In comparison, LZ77 exclusively uses references and literals combined in pairs.)
Red boxes represent literals (uncompressed parts).
Blue boxes contain references.
A reference consists of an offset denoting how far to look back, and a length showing how many literals
to copy.
The minimum length of a reference should be chosen as not to be wasteful. This depends on the encoding, with dynamic encoding more flexibility.
Note that references may contain themselves. For instance, 'aaaaa' can be stored as 'literal a' followed by 'reference offset 1, length 4'.
</p>
<p>
This variant of LZSS does not make use of Huffman trees, which would encode emitted symbols to represent
them in just a few bits instead of whole bytes.
</p>
</div>
</div>
</div>
</body>
</html>