Merge pull request #11 from sisong/dev
update test
sisong authored Aug 16, 2022
2 parents 6e4bfa7 + d27cb86 commit c8f0252
Showing 9 changed files with 267 additions and 94 deletions.
83 changes: 74 additions & 9 deletions README.md
@@ -1,5 +1,5 @@
# [tinyuz](https://github.com/sisong/tinyuz)
[![release](https://img.shields.io/badge/release-v0.9.1-blue.svg)](https://github.com/sisong/tinyuz/releases)
[![release](https://img.shields.io/badge/release-v0.9.2-blue.svg)](https://github.com/sisong/tinyuz/releases)
[![license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/sisong/tinyuz/blob/master/LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-blue.svg)](https://github.com/sisong/tinyuz/pulls)
[![+issue Welcome](https://img.shields.io/github/issues-raw/sisong/tinyuz?color=green&label=%2Bissue%20welcome)](https://github.com/sisong/tinyuz/issues)
@@ -8,20 +8,55 @@

english | [中文版](README_cn.md)

**tinyuz** is a lossless compression algorithm characterized by very small decompression code (disk or Flash occupancy, compiled from source code); compiled with Mbed Studio, the code is 856 bytes.
The decompression memory (RAM occupancy) can also be very small: RAM size = dictionary size (1Byte--1GB) specified when compressing + input cache size (>=2Byte) when decompressing. Tip: the smaller the dictionary, the worse the compression ratio, while a smaller input cache only affects decompression speed.
Large data is supported, and both compression and decompression are streaming.
Compression and decompression speed depend on the characteristics of the input data and on the parameter settings. On modern CPUs, compression is slow, about 0.4MB/s--2MB/s, and requires about dictionary size*18 bytes of memory; decompression is fast, about 180MB/s--300MB/s.
(under development and evaluation ...)
**tinyuz** is a lossless compression algorithm designed for tiny devices (MCU, NB-IoT, etc.) while keeping a decent compression ratio.
It is characterized by very small decompression code (ROM occupancy):
the stream decompressor compiled with Mbed Studio is 856 bytes (758 bytes after adjusting the compile-time defines),
and the memory decompressor compiled with Mbed Studio is 424 bytes (298 bytes after adjusting the compile-time defines).
The decompression memory (RAM occupancy) can also be very small:
RAM size = dictionary size (1Byte--1GB) specified when compressing + input cache size (>=2Byte) when decompressing.
Tip: the smaller the dictionary, the worse the compression ratio, while a smaller input cache only affects decompression speed.

Large data is supported; both compression and decompression support streaming.
Compression and decompression speed depend on the characteristics of the input data and on the parameter settings.
On modern CPUs, compression is slow, about 0.4MB/s--2MB/s, and decompression is fast, about 180MB/s--300MB/s.
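
The RAM formula above lends itself to a quick budget check before picking a dictionary size for a device. The following is a minimal sketch, not part of tinyuz: `decompress_ram_bytes` and `fits_ram_budget` are hypothetical helper names, and the sketch assumes the README formula (dictionary size + input cache size) is the whole RAM cost.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of tinyuz. */

/* README formula: decompress RAM = dictionary size + input cache size. */
static size_t decompress_ram_bytes(size_t dict_size, size_t cache_size) {
    return dict_size + cache_size;
}

/* Returns 1 if the parameters are in the documented ranges
   (dict >= 1, cache >= 2) and the total fits the budget, else 0. */
static int fits_ram_budget(size_t dict_size, size_t cache_size, size_t budget) {
    if (dict_size < 1 || cache_size < 2) return 0;
    return decompress_ram_bytes(dict_size, cache_size) <= budget;
}
```

For instance, a 4096-byte dictionary with a 64-byte input cache needs 4160 bytes by this formula, so it would fit an 8 KB budget, while a 32 KB dictionary would not.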

---
## Releases/Binaries
[Download from latest release](https://github.com/sisong/tinyuz/releases) : Command line app for Windows, Linux, MacOS.
(Release files are built from the projects in `tinyuz/builds`.)

## Build it yourself
Building requires the [HDiffPatch](https://github.com/sisong/HDiffPatch) library.
### Linux or MacOS X ###
```
$ cd <dir>/tinyuz
$ git clone https://github.com/sisong/HDiffPatch.git ../HDiffPatch
$ cd <dir>
$ git clone https://github.com/sisong/tinyuz.git tinyuz
$ git clone https://github.com/sisong/HDiffPatch.git HDiffPatch
$ cd tinyuz
$ make
```
### Windows ###
```
$ cd <dir>
$ git clone https://github.com/sisong/tinyuz.git tinyuz
$ git clone https://github.com/sisong/HDiffPatch.git HDiffPatch
```
Build `tinyuz/builds/vc/tinyuz.sln` with [`Visual Studio`](https://visualstudio.microsoft.com).

---
## command line usage:
```
compress : tinyuz -c[-dictSize] inputFile outputFile
decompress: tinyuz -d[-cacheSize] inputFile outputFile
options:
-c[-dictSize]
set compress dictSize;
dictSize>=1, DEFAULT -c-16m, recommended: 127, 4k, 1m, 512m, etc...
requires O(dictSize*18) bytes of memory;
-d[-cacheSize]
set decompress cacheSize;
cacheSize>=2, DEFAULT -d-256k, recommended: 64, 1k, 32k, 4m, etc...
requires (dictSize+cacheSize) bytes of memory;
```
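
The size arguments and memory figures in the usage text can be turned into numbers with a few lines of C. This is a sketch only: `parse_size` and `compress_mem_estimate` are hypothetical helpers, and the assumption that `k` and `m` mean 1024 and 1024*1024 is mine, not confirmed by the usage text; the tinyuz source is the authority on the real parsing rule.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser for size arguments such as "127", "4k", "1m".
   Assumption: k and m are binary multiples (1024 and 1024*1024).
   Returns 0 on malformed input; overflow is not handled in this sketch. */
static uint64_t parse_size(const char* text) {
    char* end = NULL;
    unsigned long long value;
    if (text == NULL || text[0] < '0' || text[0] > '9') return 0;
    value = strtoull(text, &end, 10);
    if (*end == 'k' || *end == 'K') { value <<= 10; ++end; }
    else if (*end == 'm' || *end == 'M') { value <<= 20; ++end; }
    if (*end != '\0') return 0;  /* trailing characters are rejected */
    return (uint64_t)value;
}

/* Compressor memory estimate from the usage text: about dictSize*18 bytes. */
static uint64_t compress_mem_estimate(uint64_t dict_size) {
    return dict_size * 18u;
}
```

Under these assumptions the default `-c-16m` corresponds to a 16777216-byte dictionary and roughly 301989888 bytes (about 288 MB) of compressor memory.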

---
## library API usage:
@@ -42,6 +77,36 @@ can also decompress at once in memory:
tuz_TResult tuz_decompress_mem(const tuz_byte* in_code,tuz_size_t code_size,tuz_byte* out_data,tuz_size_t* data_size);
```

---
## test compression ratio:
ratio: compressedSize/uncompressedSize
zlib v1.2.11: test with compress level 9, windowBits -15
tinyuz v0.9.2: test with multiple different dictSizes 32MB,1MB,32KB,5KB,1KB,255B,79B,24B
('tuz -32m' means: tinyuz -c-32m)
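
The ratio definition above can be reproduced for your own files with a one-line computation. A minimal sketch; `ratio_percent` is a hypothetical helper and the sizes in the usage note are made up for illustration, not taken from the table.

```c
#include <stdint.h>

/* ratio in percent = compressedSize / uncompressedSize * 100; lower is better.
   Returns 0.0 for an empty input to avoid dividing by zero. */
static double ratio_percent(uint64_t compressed_size, uint64_t uncompressed_size) {
    if (uncompressed_size == 0) return 0.0;
    return 100.0 * (double)compressed_size / (double)uncompressed_size;
}
```

A file that shrinks from 100 bytes to 50 bytes has a ratio of 50%; a value close to 100% means the data was essentially incompressible with the chosen dictionary size.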

"aMCU.bin" is a firmware file of an MCU device;
"aMCU.bin.diff" is an uncompressed differential file between two versions of the firmware (created by [HPatchLite](https://github.com/sisong/HPatchLite));
"A10.jpg"--"world95.txt" were downloaded from http://www.maximumcompression.com/data/files/index.html
"enwik8" was downloaded from https://data.deepai.org/enwik8.zip
"silesia.tar" was downloaded from https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia

||zlib -9|tuz -32m|tuz -1m|tuz -32k|tuz -5k|tuz -1k|tuz -255|tuz -79|tuz -24|
|:----|----:|----:|----:|----:|----:|----:|----:|----:|----:|
|aMCU.bin|46.54%|45.80%|45.80%|45.98%|49.16%|54.29%|60.61%|68.03%|77.95%|
|aMCU.bin.diff|5.29%|5.75%|5.75%|5.75%|5.95%|6.35%|6.89%|7.85%|9.54%|
|A10.jpg|99.88%|99.99%|99.99%|99.99%|99.99%|99.99%|99.99%|99.99%|99.99%|
|AcroRd32.exe|44.88%|42.01%|42.12%|43.80%|46.53%|51.48%|58.29%|67.57%|78.81%|
|english.dic|25.83%|28.62%|28.65%|29.20%|29.98%|31.25%|33.49%|36.53%|39.93%|
|FlashMX.pdf|84.76%|86.08%|85.34%|85.81%|87.34%|88.31%|89.90%|92.05%|96.83%|
|FP.LOG|6.46%|4.95%|5.26%|7.36%|9.99%|12.67%|19.27%|99.25%|100.00%|
|MSO97.DLL|57.94%|53.54%|54.12%|56.96%|59.80%|64.38%|70.62%|78.36%|87.73%|
|ohs.doc|24.05%|20.65%|21.03%|24.50%|26.85%|31.08%|37.50%|69.31%|82.85%|
|rafale.bmp|30.23%|30.30%|30.40%|32.66%|35.51%|40.81%|43.52%|47.70%|54.42%|
|vcfiu.hlp|20.41%|17.71%|17.79%|20.39%|24.24%|27.46%|32.39%|49.01%|69.64%|
|world95.txt|28.87%|22.88%|23.44%|30.79%|47.15%|54.96%|65.23%|78.53%|97.20%|
|enwik8|36.45%|30.09%|33.22%|38.36%|43.96%|51.53%|63.38%|79.63%|96.78%|
|silesia.tar|31.98%|28.41%|29.66%|33.27%|38.21%|44.45%|52.58%|63.62%|78.49%|

---
## Contact
[email protected]
79 changes: 70 additions & 9 deletions README_cn.md
@@ -1,5 +1,5 @@
# [tinyuz](https://github.com/sisong/tinyuz)
[![release](https://img.shields.io/badge/release-v0.9.1-blue.svg)](https://github.com/sisong/tinyuz/releases)
[![release](https://img.shields.io/badge/release-v0.9.2-blue.svg)](https://github.com/sisong/tinyuz/releases)
[![license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/sisong/tinyuz/blob/master/LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-blue.svg)](https://github.com/sisong/tinyuz/pulls)
[![+issue Welcome](https://img.shields.io/github/issues-raw/sisong/tinyuz?color=green&label=%2Bissue%20welcome)](https://github.com/sisong/tinyuz/issues)
@@ -8,20 +8,52 @@

中文版 | [english](README.md)

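**tinyuz** is a lossless compression algorithm characterized by very small compiled decompression code (disk or Flash occupancy); compiled with Mbed Studio it is 856 bytes.
The decompression memory (RAM occupancy) can also be very small: its size is the dictionary size (1Byte--1GB) specified when compressing + the input cache size (>=2Byte) when decompressing. Tip: the smaller the dictionary, the worse the compression ratio, while a smaller input cache only affects decompression speed.
Huge data is supported; both compression and decompression are streaming.
Compression and decompression speed depend on the data characteristics and the parameter settings. On modern CPUs, compression is slow, about 0.4MB/s--2MB/s, and uses about dictionary size*18 bytes of memory; decompression is fast, about 180MB/s--300MB/s.
(under development and evaluation ...)
**tinyuz** is a lossless compression algorithm designed for tiny devices (MCU, NB-IoT, etc.) while keeping a decent compression ratio.
It is characterized by very small compiled decompression code (ROM occupancy):
the stream-mode decompressor compiled with Mbed Studio is 856 bytes (758 bytes after adjusting the macro definitions),
and the memory-mode decompressor compiled with Mbed Studio is 424 bytes (298 bytes after adjusting the macro definitions).
The decompression memory (RAM occupancy) can also be very small: its size is the dictionary size (1Byte--1GB) specified when compressing + the input cache size (>=2Byte) when decompressing.
Tip: the smaller the dictionary, the worse the compression ratio, while a smaller input cache only affects decompression speed.
Huge data is supported; both compression and decompression support streaming.
Compression and decompression speed depend on the data characteristics and the parameter settings. On modern CPUs, compression is slow, about 0.4MB/s--2MB/s, and decompression is fast, about 180MB/s--300MB/s.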

---
## Releases/Binaries
[Download from release](https://github.com/sisong/tinyuz/releases) : command line apps that run on Windows, Linux and MacOS.
(The projects that build these release files are in `tinyuz/builds`.)

## Build it yourself
Building requires [HDiffPatch](https://github.com/sisong/HDiffPatch)
### Linux or MacOS X ###
```
$ cd <dir>/tinyuz
$ git clone https://github.com/sisong/HDiffPatch.git ../HDiffPatch
$ cd <dir>
$ git clone https://github.com/sisong/tinyuz.git tinyuz
$ git clone https://github.com/sisong/HDiffPatch.git HDiffPatch
$ cd tinyuz
$ make
```
### Windows ###
```
$ cd <dir>
$ git clone https://github.com/sisong/tinyuz.git tinyuz
$ git clone https://github.com/sisong/HDiffPatch.git HDiffPatch
```
Open `tinyuz/builds/vc/tinyuz.sln` with [`Visual Studio`](https://visualstudio.microsoft.com) and build.

---
## command line usage:
```
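compress  : tinyuz -c[-dictSize] inputFile outputFile
decompress: tinyuz -d[-cacheSize] inputFile outputFile
options:
-c[-dictSize]
set the dictionary size used when compressing;
dictSize>=1, DEFAULT -c-16m, recommended: 127, 4k, 1m, 512m, etc...
requires about (dictSize*18) bytes of memory at run time;
-d[-cacheSize]
set the cache size used when decompressing;
cacheSize>=2, DEFAULT -d-256k, recommended: 64, 1k, 32k, 4m, etc...
requires (dictSize+cacheSize) bytes of memory at run time;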
```

---
## library API usage:
@@ -42,6 +74,35 @@ tuz_TResult tuz_TStream_decompress_partial(tuz_TStream* self,tuz_byte* out_data,
tuz_TResult tuz_decompress_mem(const tuz_byte* in_code,tuz_size_t code_size,tuz_byte* out_data,tuz_size_t* data_size);
```

---
## test compression ratio:
ratio: compressedSize/uncompressedSize
zlib v1.2.11: tested with compression level 9, windowBits -15
tinyuz v0.9.2: tested with several dictionary sizes: 32MB,1MB,32KB,5KB,1KB,255B,79B,24B
(in the table, 'tuz -32m' means: tinyuz -c-32m)

"aMCU.bin" is a firmware file of an MCU device;
"aMCU.bin.diff" is an uncompressed patch file created from two versions of the firmware (created by [HPatchLite](https://github.com/sisong/HPatchLite));
"A10.jpg"--"world95.txt" were downloaded from http://www.maximumcompression.com/data/files/index.html
"enwik8" was downloaded from https://data.deepai.org/enwik8.zip
"silesia.tar" was downloaded from https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia

||zlib -9|tuz -32m|tuz -1m|tuz -32k|tuz -5k|tuz -1k|tuz -255|tuz -79|tuz -24|
|:----|----:|----:|----:|----:|----:|----:|----:|----:|----:|
|aMCU.bin|46.54%|45.80%|45.80%|45.98%|49.16%|54.29%|60.61%|68.03%|77.95%|
|aMCU.bin.diff|5.29%|5.75%|5.75%|5.75%|5.95%|6.35%|6.89%|7.85%|9.54%|
|A10.jpg|99.88%|99.99%|99.99%|99.99%|99.99%|99.99%|99.99%|99.99%|99.99%|
|AcroRd32.exe|44.88%|42.01%|42.12%|43.80%|46.53%|51.48%|58.29%|67.57%|78.81%|
|english.dic|25.83%|28.62%|28.65%|29.20%|29.98%|31.25%|33.49%|36.53%|39.93%|
|FlashMX.pdf|84.76%|86.08%|85.34%|85.81%|87.34%|88.31%|89.90%|92.05%|96.83%|
|FP.LOG|6.46%|4.95%|5.26%|7.36%|9.99%|12.67%|19.27%|99.25%|100.00%|
|MSO97.DLL|57.94%|53.54%|54.12%|56.96%|59.80%|64.38%|70.62%|78.36%|87.73%|
|ohs.doc|24.05%|20.65%|21.03%|24.50%|26.85%|31.08%|37.50%|69.31%|82.85%|
|rafale.bmp|30.23%|30.30%|30.40%|32.66%|35.51%|40.81%|43.52%|47.70%|54.42%|
|vcfiu.hlp|20.41%|17.71%|17.79%|20.39%|24.24%|27.46%|32.39%|49.01%|69.64%|
|world95.txt|28.87%|22.88%|23.44%|30.79%|47.15%|54.96%|65.23%|78.53%|97.20%|
|enwik8|36.45%|30.09%|33.22%|38.36%|43.96%|51.53%|63.38%|79.63%|96.78%|
|silesia.tar|31.98%|28.41%|29.66%|33.27%|38.21%|44.45%|52.58%|63.62%|78.49%|

---
## Contact
4 changes: 2 additions & 2 deletions builds/vc/tinyuz.vcxproj
@@ -91,7 +91,7 @@
<RuntimeLibrary>MultiThreaded</RuntimeLibrary>
</ClCompile>
<Link>
<GenerateDebugInformation>true</GenerateDebugInformation>
<GenerateDebugInformation>false</GenerateDebugInformation>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
</Link>
@@ -105,7 +105,7 @@
<RuntimeLibrary>MultiThreaded</RuntimeLibrary>
</ClCompile>
<Link>
<GenerateDebugInformation>true</GenerateDebugInformation>
<GenerateDebugInformation>false</GenerateDebugInformation>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
</Link>
10 changes: 6 additions & 4 deletions builds/vc2019/tinyuz.vcxproj
@@ -96,9 +96,10 @@
<RuntimeLibrary>MultiThreaded</RuntimeLibrary>
</ClCompile>
<Link>
<GenerateDebugInformation>true</GenerateDebugInformation>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<GenerateDebugInformation>false</GenerateDebugInformation>
<OptimizeReferences>true</OptimizeReferences>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<LinkTimeCodeGeneration>UseLinkTimeCodeGeneration</LinkTimeCodeGeneration>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
@@ -110,9 +111,10 @@
<RuntimeLibrary>MultiThreaded</RuntimeLibrary>
</ClCompile>
<Link>
<GenerateDebugInformation>true</GenerateDebugInformation>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<GenerateDebugInformation>false</GenerateDebugInformation>
<OptimizeReferences>true</OptimizeReferences>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<LinkTimeCodeGeneration>UseLinkTimeCodeGeneration</LinkTimeCodeGeneration>
</Link>
</ItemDefinitionGroup>
<ItemGroup>
2 changes: 1 addition & 1 deletion compress/tuz_enc.cpp
@@ -27,7 +27,7 @@ hpatch_StreamPos_t tuz_compress(const hpatch_TStreamOutput* out_code,const hpatc
checkv((props->dictSize>=1)&(props->dictSize<=tuz_kMaxOfDictSize));
checkv(props->dictSize==(tuz_size_t)props->dictSize);
checkv(props->maxSaveLength==(tuz_length_t)props->maxSaveLength);
checkv((props->maxSaveLength>=tuz_kMinOfMaxSaveLength)&(props->maxSaveLength<=tuz_kMaxOfMaxSaveLength));
checkv((props->maxSaveLength>=tuz_kMinOfMaxSaveLength)&&(props->maxSaveLength<=tuz_kMaxOfMaxSaveLength));
}

tuz_TCompressProps selfProps=(props)?*props:tuz_kDefaultCompressProps;
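
The hunk above swaps a bitwise `&` for a logical `&&` between two comparisons. Since comparison results are always 0 or 1, both forms yield the same truth value there; what differs is evaluation, because `&&` short-circuits. The sketch below illustrates that general C behavior; `probe`, `count_bitwise` and `count_logical` are hypothetical names and none of this code comes from tinyuz.

```c
static int evaluated = 0;

/* Records that it ran, then returns its argument unchanged. */
static int probe(int value) { ++evaluated; return value; }

/* Operands evaluated by `probe(a) & probe(b)`: always both. */
static int count_bitwise(int a, int b) {
    evaluated = 0;
    (void)(probe(a) & probe(b));
    return evaluated;
}

/* Operands evaluated by `probe(a) && probe(b)`: the right-hand side
   is skipped when the left-hand side is already false. */
static int count_logical(int a, int b) {
    evaluated = 0;
    (void)(probe(a) && probe(b));
    return evaluated;
}
```

For operands other than 0 and 1 the results can also differ: `2 & 1` is 0 while `2 && 1` is 1, which is one reason `&&` is the safer habit for combining conditions.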
6 changes: 4 additions & 2 deletions decompress/tuz_dec.h
@@ -28,8 +28,9 @@ typedef enum tuz_TResult{


//-----------------------------------------------------------------------------------------------------------------
// decompress step by step: compiled by Mbed Studio is 856 bytes
// if set tuz_isNeedLiteralLine=0 & _IS_RUN_MEM_SAFE_CHECK=0, compiled by Mbed Studio is 758 bytes

// decompress by tuz_TStream: compiled by Mbed Studio is 856 bytes
typedef struct tuz_TStream{
_tuz_TInputCache _code_cache;
_tuz_TDict _dict;
@@ -55,8 +56,9 @@ tuz_TResult tuz_TStream_decompress_partial(tuz_TStream* self,tuz_byte* out_data,

//-----------------------------------------------------------------------------------------------------------------

//decompress all to out_data
//decompress all to out_data memory
// compiled by Mbed Studio is 424 bytes; faster than decompress by tuz_TStream;
// if set tuz_isNeedLiteralLine=0 & _IS_RUN_MEM_SAFE_CHECK=0, compiled by Mbed Studio is 298 bytes
// data_size: input out_data buf's size, output decompressed data size;
// if success return tuz_STREAM_END;
tuz_TResult tuz_decompress_mem(const tuz_byte* in_code,tuz_size_t code_size,tuz_byte* out_data,tuz_size_t* data_size);
14 changes: 8 additions & 6 deletions decompress/tuz_types.h
@@ -10,6 +10,11 @@
# define _IS_USED_SHARE_hpatch_lite_types 0
#endif

#ifndef tuz_isNeedLiteralLine // optimize incompressible data for improve compression ratio
//if tuz_isNeedLiteralLine==0 when decompress, must also be set to 0 when compress, can reduce 80 bytes
# define tuz_isNeedLiteralLine 1
#endif

#if (_IS_USED_SHARE_hpatch_lite_types)
# include "hpatch_lite_types.h" //in "HDiffPatch/libHDiffPatch/HPatchLite/"
# include "hpatch_lite_input_cache.h"
@@ -34,7 +39,7 @@ extern "C" {

#define TINYUZ_VERSION_MAJOR 0
#define TINYUZ_VERSION_MINOR 9
#define TINYUZ_VERSION_RELEASE 1
#define TINYUZ_VERSION_RELEASE 2

#define _TINYUZ_VERSION TINYUZ_VERSION_MAJOR.TINYUZ_VERSION_MINOR.TINYUZ_VERSION_RELEASE
#define _TINYUZ_QUOTE(str) #str
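
The version macros above rely on the preprocessor's `#` stringizing operator. A common way to turn such a dotted macro into a string is a second macro level, so the argument is expanded before it is quoted. The sketch below uses hypothetical `DEMO_*` names to show that pattern; it is not a copy of the tinyuz header, whose remaining lines are not visible in this hunk.

```c
#define DEMO_VERSION_MAJOR   0
#define DEMO_VERSION_MINOR   9
#define DEMO_VERSION_RELEASE 2

#define DEMO_VERSION DEMO_VERSION_MAJOR.DEMO_VERSION_MINOR.DEMO_VERSION_RELEASE
#define DEMO_QUOTE(str)  #str
/* Extra level: the argument is macro-expanded before DEMO_QUOTE sees it. */
#define DEMO_QUOTE2(str) DEMO_QUOTE(str)

/* Expands the version first, then quotes it. */
static const char* demo_version_string(void) { return DEMO_QUOTE2(DEMO_VERSION); }

/* Quotes the macro name itself, because `#` suppresses expansion. */
static const char* demo_unexpanded(void) { return DEMO_QUOTE(DEMO_VERSION); }
```

With these definitions `demo_version_string()` returns `"0.9.2"`, while `demo_unexpanded()` returns `"DEMO_VERSION"`, which is why the one-level form alone is not enough.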
@@ -107,16 +112,13 @@ extern "C" {
# define _IS_USED_C_MEMCPY 1
#endif

#ifndef tuz_isNeedLiteralLine // optimize for can not compress data
# define tuz_isNeedLiteralLine 1
#endif

#ifndef tuz_kMaxOfDictSize
# define tuz_kMaxOfDictSize __tuz_kMaxOfDictSize_MAX
//# define tuz_kMaxOfDictSize ((1<<24)-1) //3 bytes
//# define tuz_kMaxOfDictSize ((1<<16)-1) //2 bytes
#endif

//save dictSize at the beginning of the compressed code stream, little-endian order, tuz_kDictSizeSavedBytes bytes
#define __tuz_kMaxOfDictSize_MAX (1<<30) //now limit for uint32
#if (tuz_kMaxOfDictSize>__tuz_kMaxOfDictSize_MAX)
# error tuz_kMaxOfDictSize error
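
The new comment in this hunk says the dictionary size is stored at the start of the compressed stream in little-endian order. As a rough illustration of what reading such a field involves, here is a sketch with a hypothetical `read_le` helper; the byte count is a parameter because the actual width (`tuz_kDictSizeSavedBytes`) depends on the build configuration, and this is not the library's own reader.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rebuild an unsigned value from `count`
   little-endian bytes (least significant byte first, at most 4). */
static uint32_t read_le(const unsigned char* bytes, size_t count) {
    uint32_t value = 0;
    size_t i;
    for (i = 0; i < count && i < 4; ++i)
        value |= (uint32_t)bytes[i] << (8 * i);
    return value;
}
```

For example, the two bytes `00 10` decode to 4096 and the four bytes `00 00 00 01` decode to 16777216.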
@@ -136,7 +138,7 @@ extern "C" {
typedef void* tuz_TInputStreamHandle;
#endif
#ifndef tuz_TInputStream_read
//read (*data_size) data to out_data from sequence stream; if input stream end,set *data_size readed size; if read error return tuz_FALSE;
//read (*data_size) bytes data from inputStream to out_data; if input stream end,set *data_size readed size; if read error return tuz_FALSE;
typedef tuz_BOOL (*tuz_TInputStream_read)(tuz_TInputStreamHandle inputStream,tuz_byte* out_data,tuz_size_t* data_size);
#endif
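
The callback contract in the comment above (fill `out_data` with up to `*data_size` bytes, shrink `*data_size` at end of stream, return `tuz_FALSE` on error) can be satisfied by a simple in-memory reader. The following is a sketch under stated assumptions: the typedefs are stand-ins so the block compiles on its own (a real build would take them from `tuz_types.h`, where the widths are configurable), and `MemSource` and `mem_read` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in typedefs; assumptions made only so this sketch is self-contained. */
typedef unsigned char tuz_byte;
typedef size_t        tuz_size_t;
typedef int           tuz_BOOL;
#define tuz_FALSE 0
#define tuz_TRUE  1
typedef void*         tuz_TInputStreamHandle;

/* Hypothetical in-memory source with a read position. */
typedef struct { const tuz_byte* data; size_t size; size_t pos; } MemSource;

/* Matches the tuz_TInputStream_read signature shown above. */
static tuz_BOOL mem_read(tuz_TInputStreamHandle inputStream,
                         tuz_byte* out_data, tuz_size_t* data_size) {
    MemSource* src = (MemSource*)inputStream;
    size_t left;
    if (src == NULL || out_data == NULL || data_size == NULL) return tuz_FALSE;
    left = src->size - src->pos;
    if (*data_size > left) *data_size = (tuz_size_t)left; /* stream end: report the short count */
    memcpy(out_data, src->data + src->pos, *data_size);
    src->pos += *data_size;
    return tuz_TRUE;
}
```

Reading 4 bytes at a time from a 6-byte source would return 4 bytes on the first call and 2 on the second, with `*data_size` updated to match each time.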
