Version 0.91-dev
various bug fixes
input1d now accepts int1 and uint1 types as image index
input2d now accepts int2 and uint2 types as image index
ps::get_global_id for pixel shaders
il function support ( func example )
'code << "end\n"' is no longer needed at the end of create_kernel
improvements to cal::Kernel class
uav support for forcing cached reads
removing use of flat thread index ( much slower on 5xxx )
compilation fix for SDK 2.4
improved ::cal::Kernel class ( using pinned memory for const transfer, removing unnecessary calls )
abs, min, max math functions
support for program binary image
class for file caching binary images ( cal/cal_program_cache.hpp )
redistributing CAL related header files from AMD SDK ( removing dependency on AMD SDK - only drivers required )
bitselect function
bitextract function ( amd_bfe )
thread safety ( enabled by flag __CAL_THREADSAFE )
automatic use of fma instead of mad ( with flag __CAL_USE_AUTOFMA )
Version 0.90
support for offset in sample load
new matrix multiplication example
uav support
new example - uav write
cast_bits ( and as_typeN opencl like functions )
changes to double swizzle handling
fixes in lds handling
emmit_comment function for adding comments to generated IL
type traits
nbody example using double variables for computations
frexp
more accurate log(double)
changes to relational operators for double types ( return type matches input vectors length )
changes to select ( handling of double types )
new reciprocal and native_reciprocal functions ( calculating 1/x )
sqrt and native_sqrt functions
native_exp and native_log functions
native_rsqrt and improved precision for rsqrt(double)
mixed IL and C types operators (+,-,...)
refactoring of math library ( to improve compilation times only required files can be included )
changes in files layout ( cal/il/cal_il.hpp moved to cal/cal_il.hpp, cal/il/cal_il_math.hpp moved to cal/cal_il_math.hpp )
improved mul, div, mod operators with C types ( automatic use of shl, shr, ldexp for power of 2 values ) - to enable, compile with __CAL_USE_IMPROVED_MULDIV defined - use with caution ( double ldexp may trigger a CAL driver bug causing invalid results )
regression directory with non-working kernels ( due to bugs in ATI SDK )
integer division
device target name
uav atomic functions
lds atomic functions
uavatomics example
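The power-of-two optimization mentioned in the 0.90 entry on improved mul, div, mod operators can be illustrated outside the library. The sketch below is plain C++ rather than CAL++ code: every function name in it is hypothetical, and the use of a bit mask for mod is an assumption, since the entry itself names only shl, shr and ldexp.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are CAL++ functions.

// True when v has exactly one bit set, i.e. v is a power of two.
constexpr bool is_pow2(std::uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Position of the single set bit of a power of two.
constexpr unsigned log2_pow2(std::uint32_t v) {
    unsigned n = 0;
    while (v > 1) { v >>= 1; ++n; }
    return n;
}

// Unsigned multiply by a constant: a left shift when c is a power of two.
constexpr std::uint32_t mul_const(std::uint32_t x, std::uint32_t c) {
    return is_pow2(c) ? (x << log2_pow2(c)) : (x * c);
}

// Unsigned divide by a non-zero constant: a right shift when c is a power of two.
constexpr std::uint32_t div_const(std::uint32_t x, std::uint32_t c) {
    return is_pow2(c) ? (x >> log2_pow2(c)) : (x / c);
}

// Unsigned remainder by a non-zero constant: a bit mask when c is a power of two
// (assumed counterpart; the ChangeLog entry does not say how mod is handled).
constexpr std::uint32_t mod_const(std::uint32_t x, std::uint32_t c) {
    return is_pow2(c) ? (x & (c - 1)) : (x % c);
}

// Floating-point scaling by 2^n using ldexp instead of a multiply.
inline double scale_pow2(double x, int n) { return std::ldexp(x, n); }
```

For example, mul_const(5, 8) takes the shift path and mul_const(5, 3) falls back to an ordinary multiply; both return the same value as x * c.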
Version 0.87.1
compilation fix for SDK 2.3
Version 0.87
IL kernel link bug fix
support for pinned memory
new example - optimal n-body brute-force algorithm implementation
various bug fixes
Version 0.86.3
rotate bug fix
Non-blocking wait disabled by default. To enable it, __CAL_USE_NON_BLOCKING_WAIT must be defined
Version 0.86.2
small bug fix to rotate
Version 0.86.1
small improvement to rotate
Version 0.86
Improvements to memory handling. A Memory object allocates a CAL resource for each device in the Context.
The proper resource for a device is later selected by the CommandQueue object.
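The per-device allocation scheme described for 0.86 could look roughly like the following. This is a minimal sketch under stated assumptions: the types only borrow the names used in the entry, and their members (resource_for, select, the label string) are hypothetical and do not reflect the actual CAL++ class interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// All types below are hypothetical stand-ins, not the real CAL++ classes.
struct Device { int id; };

struct Context { std::vector<Device> devices; };

struct Resource { int device_id; std::string label; };

class Memory {
public:
    // Allocate one resource for every device in the context.
    explicit Memory(const Context& ctx) {
        for (const Device& d : ctx.devices)
            resources_.emplace(d.id, Resource{d.id, "res@dev" + std::to_string(d.id)});
    }

    // Look up the resource that belongs to the given device.
    const Resource& resource_for(const Device& d) const { return resources_.at(d.id); }

    std::size_t resource_count() const { return resources_.size(); }

private:
    std::map<int, Resource> resources_;  // keyed by device id
};

class CommandQueue {
public:
    explicit CommandQueue(Device d) : device_(d) {}

    // The queue is bound to one device, so it picks that device's resource.
    const Resource& select(const Memory& m) const { return m.resource_for(device_); }

private:
    Device device_;
};
```

The point of the design is that the Memory object does not need to know which device will use it; the choice is deferred until a queue bound to a specific device touches the memory.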
Version 0.85
Various bug fixes
ability to use target device information ( wavefront size, device type, etc ) for code generation
rotate function
non-blocking waiting for event completion
Version 0.8b
Compilation fixes for Visual C++
Improved cmake files
Version 0.8a
Compilation fixes for gcc 4.4
Version 0.8
Initial distribution
ChangeLog starts here.