diff --git a/DOCUMENTATION.md b/DOCUMENTATION.md
index 0c70bb06..4b241d84 100644
--- a/DOCUMENTATION.md
+++ b/DOCUMENTATION.md
@@ -78,6 +78,7 @@ git clone https://github.com/ProjectPhysX/FluidX3D.git
| 6 | raytraced free surface |
| 7 | particles |
| T | toggle slice visualization mode |
+| Z | toggle field visualization mode |
| Q E | move slice in slice visualization mode |
diff --git a/LICENSE.md b/LICENSE.md
index 97058561..675d18d2 100644
--- a/LICENSE.md
+++ b/LICENSE.md
@@ -2,7 +2,7 @@ Copyright (c) 2022-2024 Dr. Moritz Lehmann
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files, to use this software for public research, education or personal use, and to alter it and redistribute it freely, subject to the following restrictions:
-1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
+1. The [origin of this software](https://github.com/ProjectPhysX/FluidX3D) must not be misrepresented; you must not claim that you wrote the original software. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
2. Commercial use is not allowed. You may not sell this software, altered source versions, any part thereof or any of the rights granted to you under the license. You may not provide to third parties, for a fee or other consideration (including without limitation fees for hosting or consulting/support services related to the software), a product or service whose value derives from the functionality of this software, altered source versions or any part thereof, unless explicit permission is granted to you by the copyright owner.
3. Military use is not allowed. You may not use this software, altered source versions or any part thereof for military research or any military or defense industry purposes, or within a military institution.
4. You may not train AI models on the source code of this software, altered source versions or any part thereof.
diff --git a/README.md b/README.md
index 10025594..38491e26 100644
--- a/README.md
+++ b/README.md
@@ -8,40 +8,40 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
Update History
-- v1.0 (04.08.2022)
- - initial release
-- v1.1 (29.09.2022)
+- [v1.0](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.0) (04.08.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/commit/768073501af725e392a4b85885009e2fa6400e48) (public release)
+ - public release
+- [v1.1](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.1) (29.09.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.0...v1.1) (GPU voxelization)
- added solid voxelization on GPU (slow algorithm)
- - added tool to print current camera position (key H)
+ - added tool to print current camera position (key G)
- minor bug fix (workaround for Intel iGPU driver bug with triangle rendering)
-- v1.2 (24.10.2022)
+- [v1.2](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.2) (24.10.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.1...v1.2) (force/torque compuatation)
- added functions to compute force/torque on objects
- added function to translate Mesh
- added Stokes drag validation setup
-- v1.3 (10.11.2022)
+- [v1.3](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.3) (10.11.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.2...v1.3) (minor bug fixes)
- added unit conversion functions for torque
- `FORCE_FIELD` and `VOLUME_FORCE` can now be used independently
- minor bug fix (workaround for AMD legacy driver bug with binary number literals)
-- v1.4 (14.12.2022)
+- [v1.4](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.4) (14.12.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.3...v1.4) (Linux graphics)
- complete rewrite of C++ graphics library to minimize API dependencies
- added interactive graphics mode on Linux with X11
- fixed streamline visualization bug in 2D
-- v2.0 (09.01.2023)
+- [v2.0](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.0) (09.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.4...v2.0) (multi-GPU upgrade)
- added (cross-vendor) multi-GPU support on a single node (PC/laptop/server)
-- v2.1 (15.01.2023)
+- [v2.1](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.1) (15.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.0...v2.1) (fast voxelization)
- made solid voxelization on GPU lightning fast (new algorithm, from minutes to milliseconds)
-- v2.2 (20.01.2023)
+- [v2.2](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.0) (20.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.1...v2.2) (velocity voxelization)
- added option to voxelize moving/rotating geometry on GPU, with automatic velocity initialization for each grid point based on center of rotation, linear velocity and rotational velocity
- cells that are converted from solid->fluid during re-voxelization now have their DDFs properly initialized
- added option to not auto-scale mesh during `read_stl(...)`, with negative `size` parameter
- added kernel for solid boundary rendering with marching-cubes
-- v2.3 (30.01.2023)
+- [v2.3](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.3) (30.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.2...v2.3) (particles)
- added particles with immersed-boundary method (either passive or 2-way-coupled, only supported with single-GPU)
- minor optimization to GPU voxelization algorithm (workgroup threads outside mesh bounding-box return after ray-mesh intersections have been found)
- displayed GPU memory allocation size is now fully accurate
- fixed bug in `write_line()` function in `src/utilities.hpp`
- removed `.exe` file extension for Linux/macOS
-- v2.4 (11.03.2023)
+- [v2.4](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.4) (11.03.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.3...v2.4) (UI improvements)
- added a help menu with key H that shows keyboard/mouse controls, visualization settings and simulation stats
- improvements to keyboard/mouse control (+/- for zoom, mouseclick frees/locks cursor)
- added suggestion of largest possible grid resolution if resolution is set larger than memory allows
@@ -51,14 +51,14 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- fixed bug in make.sh where multi-GPU device IDs would not get forwarded to the executable
- minor bug fixes in graphics engine (free cursor not centered during rotation, labels in VR mode)
- fixed bug in `LBM::voxelize_stl()` size parameter standard initialization
-- v2.5 (11.04.2023)
+- [v2.5](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.5) (11.04.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.4...v2.5) (raytracing overhaul)
- implemented light absorption in fluid for raytracing graphics (no performance impact)
- improved raytracing framerate when camera is inside fluid
- fixed skybox pole flickering artifacts
- fixed bug where moving objects during re-voxelization would leave an erroneous trail of solid grid cells behind
-- v2.6 (16.04.2023)
+- [v2.6](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.6) (16.04.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.5...v2.6) (Intel Arc patch)
- patched OpenCL issues of Intel Arc GPUs: now VRAM allocations >4GB are possible and correct VRAM capacity is reported
-- v2.7 (29.05.2023)
+- [v2.7](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.7) (29.05.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.6...v2.7) (visualization upgrade)
- added slice visualization (key 2 / key 3 modes, then switch through slice modes with key T, move slice with keys Q/E)
- made flag wireframe / solid surface visualization kernels toggleable with key 1
- added surface pressure visualization (key 1 when `FORCE_FIELD` is enabled and `lbm.calculate_force_on_boundaries();` is called)
@@ -69,7 +69,7 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- reverted back to separate `cl::Context` for each OpenCL device, as the shared Context otherwise would allocate extra VRAM on all other unused Nvidia GPUs
- removed Debug and x86 configurations from Visual Studio solution file (one less complication for compiling)
- fixed bug that particles could get too close to walls and get stuck, or leave the fluid phase (added boundary force)
-- v2.8 (24.06.2023)
+- [v2.8](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.8) (24.06.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.7...v2.8) (documentation + polish)
- finally added more [documentation](DOCUMENTATION.md)
- cleaned up all sample setups in `setup.cpp` for more beginner-friendliness, and added required extensions in `defines.hpp` as comments to all setups
- improved loading of composite `.stl` geometries, by adding an option to omit automatic mesh repositioning, added more functionality to `Mesh` struct in `utilities.hpp`
@@ -87,7 +87,7 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- fixed bug in Q-criterion rendering of halo data in multi-GPU mode, reduced gap width between domains
- removed shared memory optimization from mesh voxelization kernel, as it crashes on Nvidia GPUs with new GPU drivers and is incompatible with old OpenCL 1.0 GPUs
- fixed raytracing attenuation color when no surface is at the simulation box walls with periodic boundaries
-- v2.9 (31.07.2023)
+- [v2.9](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.9) (31.07.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.8...v2.9) (multithreading)
- added cross-platform `parallel_for` implementation in `utilities.hpp` using `std::threads`
- significantly (>4x) faster simulation startup with multithreaded geometry initialization and sanity checks
- faster `calculate_force_on_object()` and `calculate_torque_on_object()` functions with multithreading
@@ -95,7 +95,7 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- fixed bug in voxelization ray direction for re-voxelizing rotating objects
- fixed bug in `Mesh::get_bounding_box_size()`
- fixed bug in `print_message()` function in `utilities.hpp`
-- v2.10 (05.11.2023)
+- [v2.10](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.10) (05.11.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.9...v2.10) (frustrum culling)
- improved rasterization performance via frustrum culling when only part of the simulation box is visible
- improved switching between centered/free camera mode
- refactored OpenCL rendering library
@@ -106,17 +106,17 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- replaced slow (in multithreading) `std::rand()` function with standard C99 LCG
- more robust correction of wrong VRAM capacity reporting on Intel Arc GPUs
- fixed some minor compiler warnings
-- v2.11 (07.12.2023)
+- [v2.11](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.11) (07.12.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.10...v2.11) (improved Linux graphics)
- interactive graphics on Linux are now in fullscreen mode too, fully matching Windows
- made CPU/GPU buffer initialization significantly faster with `std::fill` and `enqueueFillBuffer` (overall ~8% faster simulation startup)
- added operating system info to OpenCL device driver version printout
- fixed flickering with frustrum culling at very small field of view
- fixed bug where rendered/exported frame was not updated when `visualization_modes` changed
-- v2.12 (18.01.2024)
+- [v2.12](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.12) (18.01.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.11...v2.12) (faster startup)
- ~3x faster source code compiling on Linux using multiple CPU cores if [`make`](https://www.gnu.org/software/make/) is installed
- significantly faster simulation initialization (~40% single-GPU, ~15% multi-GPU)
- minor bug fix in `Memory_Container::reset()` function
-- v2.13 (11.02.2024)
+- [v2.13](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.13) (11.02.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.12...v2.13) (improved .vtk export)
- data in exported `.vtk` files is now automatically converted to SI units
- ~2x faster `.vtk` export with multithreading
- added unit conversion functions for `TEMPERATURE` extension
@@ -127,6 +127,16 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- fixed slow numeric drift issues caused by `-cl-fast-relaxed-math`
- fixed wrong Maximum Allocation Size reporting in `LBM::write_status()`
- fixed missing scaling of coordinates to SI units in `LBM::write_mesh_to_vtk()`
+- [v2.14](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.14) (03.03.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.13...v2.14) (visualization upgrade)
+ - coloring can now be switched between velocity/density/temperature with key Z
+ - uniform improved color palettes for velocity/density/temperature visualization
+ - color scale with automatic unit conversion can now be shown with key H
+ - slice mode for field visualization now draws fully filled-in slices instead of only lines for velocity vectors
+ - shading in `VIS_FLAG_SURFACE` and `VIS_PHI_RASTERIZE` modes is smoother now
+ - `make.sh` now automatically detects operating system and X11 support on Linux and only runs FluidX3D if last compilation was successful
+ - fixed compiler warnings on Android
+ - fixed `make.sh` failing on some systems due to nonstandard interpreter path
+ - fixed that `make` would not compile with multiple cores on some systems
diff --git a/src/defines.hpp b/src/defines.hpp
index 962d782f..fa4cfc45 100644
--- a/src/defines.hpp
+++ b/src/defines.hpp
@@ -31,15 +31,18 @@
#define GRAPHICS_FRAME_WIDTH 1920 // set frame width if only GRAPHICS is enabled
#define GRAPHICS_FRAME_HEIGHT 1080 // set frame height if only GRAPHICS is enabled
#define GRAPHICS_BACKGROUND_COLOR 0x000000 // set background color; black background (default) = 0x000000, white background = 0xFFFFFF
-//#define GRAPHICS_TRANSPARENCY 0.7f // optional: comment/uncomment this line to disable/enable semi-transparent rendering (looks better but reduces framerate), number represents transparency (equal to 1-opacity) (default: 0.7f)
#define GRAPHICS_U_MAX 0.25f // maximum velocity for velocity coloring in units of LBM lattice speed of sound (c=1/sqrt(3)) (default: 0.25f)
+#define GRAPHICS_RHO_DELTA 0.01f // coloring range for density rho will be [1.0f-GRAPHICS_RHO_DELTA, 1.0f+GRAPHICS_RHO_DELTA] (default: 0.01f)
+#define GRAPHICS_T_DELTA 1.0f // coloring range for temperature T will be [1.0f-GRAPHICS_T_DELTA, 1.0f+GRAPHICS_T_DELTA] (default: 1.0f)
+#define GRAPHICS_F_MAX 0.001f // maximum force in LBM units for visualization of forces on solid boundaries if VOLUME_FORCE is enabled and lbm.calculate_force_on_boundaries(); is called (default: 0.001f)
#define GRAPHICS_Q_CRITERION 0.0001f // Q-criterion value for Q-criterion isosurface visualization (default: 0.0001f)
-#define GRAPHICS_F_MAX 0.002f // maximum force in LBM units for visualization of forces on solid boundaries if VOLUME_FORCE is enabled and lbm.calculate_force_on_boundaries(); is called (default: 0.002f)
#define GRAPHICS_STREAMLINE_SPARSE 4 // set how many streamlines there are every x lattice points
#define GRAPHICS_STREAMLINE_LENGTH 128 // set maximum length of streamlines
#define GRAPHICS_RAYTRACING_TRANSMITTANCE 0.25f // transmitted light fraction in raytracing graphics ("0.25f" = 1/4 of light is transmitted and 3/4 is absorbed along longest box side length, "1.0f" = no absorption)
#define GRAPHICS_RAYTRACING_COLOR 0x005F7F // absorption color of fluid in raytracing graphics
+//#define GRAPHICS_TRANSPARENCY 0.7f // optional: comment/uncomment this line to disable/enable semi-transparent rendering (looks better but reduces framerate), number represents transparency (equal to 1-opacity) (default: 0.7f)
+
// #############################################################################################################
diff --git a/src/graphics.cpp b/src/graphics.cpp
index 199a2276..7c5bc453 100644
--- a/src/graphics.cpp
+++ b/src/graphics.cpp
@@ -67,24 +67,15 @@ void set_light(const uint i, const float3& position) {
}
}
int shading(const int color, const float3& p, const float3& normal, const bool translucent=false) {
- const float snb = sq(normal.x)+sq(normal.y)+sq(normal.z); // only one sqrt instead of two
+ const float nl2 = sq(normal.x)+sq(normal.y)+sq(normal.z); // only one sqrt instead of two
float br = 0.0f;
for(uint i=0u; i0) {
+ draw_line(x0-(int)camera.width/2, y0, x1-(int)camera.width/2, y1, color);
+ }
+ if((x0+x1)/2+(int)camera.width/2<(int)camera.width) {
+ draw_line(x0+(int)camera.width/2, y0, x1+(int)camera.width/2, y1, color);
+ }
+ }
+}
void draw_bitmap(const int* bitmap) {
std::copy(bitmap, bitmap+(int)camera.width*(int)camera.height, camera.bitmap);
}
diff --git a/src/graphics.hpp b/src/graphics.hpp
index 0a5811d3..11bf88ce 100644
--- a/src/graphics.hpp
+++ b/src/graphics.hpp
@@ -79,6 +79,11 @@ class Camera {
R.xx = cosrx; R.xy = sinrx; R.xz = 0.0f;
R.yx = sinrx*sinry; R.yy = -cosrx*sinry; R.yz = cosry;
R.zx = -sinrx*cosry; R.zy = cosrx*cosry; R.zz = sinry;
+ if(!free) {
+ pos.x = R.zx*dis/zoom;
+ pos.y = R.zy*dis/zoom;
+ pos.z = R.zz*dis/zoom;
+ }
}
void set_key_state(const int key, const bool state) {
key_state[clamp(256+key, 0, 511)] = state;
@@ -329,6 +334,7 @@ void set_light(const uint i, const float3& p);
void draw_bitmap(const int* bitmap);
void draw_label(const int x, const int y, const string& s, const int color);
+void draw_line_label(const int x0, const int y0, const int x1, const int y1, const int color);
void draw_pixel(const int x, const int y, const int color); // 2D drawing functions
void draw_circle(const int x, const int y, const int r, const int color);
diff --git a/src/info.cpp b/src/info.cpp
index bb84042f..e11c7402 100644
--- a/src/info.cpp
+++ b/src/info.cpp
@@ -18,7 +18,7 @@ void Info::initialize(LBM* lbm) {
collision += " (FP32/FP32)";
#endif // FP32
cpu_mem_required = (uint)(lbm->get_N()*(ulong)bytes_per_cell_host()/1048576ull); // reset to get valid values for consecutive simulations
- gpu_mem_required = lbm->lbm[0]->get_device().info.memory_used;
+ gpu_mem_required = lbm->lbm_domain[0]->get_device().info.memory_used;
print_info("Allocating memory. This may take a few seconds.");
}
void Info::append(const ulong steps, const ulong t) {
@@ -40,13 +40,13 @@ double Info::time() const { // returns either elapsed time or remaining time
void Info::print_logo() const {
const int a=color_light_blue, b=color_orange, c=color_pink;
print(".-----------------------------------------------------------------------------.\n");
- print("| "); print( " ______________ ", a); print(" ______________ ", b); print(" |\n");
- print("| "); print( "\\ ________ | ", a); print("| ________ /", b); print(" |\n");
- print("| "); print("\\ \\ | | ", a); print("| | / /", b); print(" |\n");
- print("| "); print("\\ \\ | | ", a); print("| | / /", b); print(" |\n");
- print("| "); print("\\ \\ | | ", a); print("| | / /", b); print(" |\n");
- print("| "); print("\\ \\_.-\" | ", a); print("| \"-._/ /", b); print(" |\n");
- print("| "); print("\\ _.-\" ", a); print("_ ", c); print("\"-._ /", b); print(" |\n");
+ print("| "); print( " ______________ ", a); print(" ______________ ", b); print(" |\n");
+ print("| "); print( "\\ ________ | ", a); print("| ________ /", b); print(" |\n");
+ print("| "); print("\\ \\ | | ", a); print("| | / /", b); print(" |\n");
+ print("| "); print("\\ \\ | | ", a); print("| | / /", b); print(" |\n");
+ print("| "); print("\\ \\ | | ", a); print("| | / /", b); print(" |\n");
+ print("| "); print("\\ \\_.-\" | ", a); print("| \"-._/ /", b); print(" |\n");
+ print("| "); print("\\ _.-\" ", a); print("_ ", c); print("\"-._ /", b); print(" |\n");
print("| "); print("\\.-\" ", a); print("_.-\" \"-._ ", c); print("\"-./", b); print(" |\n");
print("| "); print(" .-\" .-\"-. \"-. ", c); print(" |\n");
print("| "); print("\\ v\" \"v /", c); print(" |\n");
@@ -55,7 +55,7 @@ void Info::print_logo() const {
print("| "); print("\\ \\ / /", c); print(" |\n");
print("| "); print("\\ ' /", c); print(" |\n");
print("| "); print("\\ /", c); print(" |\n");
- print("| "); print("\\ /", c); print(" FluidX3D Version 2.13 |\n");
+ print("| "); print("\\ /", c); print(" FluidX3D Version 2.14 |\n");
print("| "); print( "'", c); print(" Copyright (c) Dr. Moritz Lehmann |\n");
print("|-----------------------------------------------------------------------------|\n");
}
diff --git a/src/kernel.cpp b/src/kernel.cpp
index 4baa166b..67794452 100644
--- a/src/kernel.cpp
+++ b/src/kernel.cpp
@@ -23,6 +23,11 @@ string opencl_c_container() { return R( // ########################## begin of O
)+R(float fast_acos(const float x) { // slightly fastwer approximation
return fma(fma(-0.5702f, sq(sq(sq(x))), -1.0f), x, 1.5712963f); // 0.5707964f = (pi-2)/2
}
+)+R(void swap(float* x, float* y) {
+ const float t = *x;
+ *x = *y;
+ *y = t;
+}
)+R(void lu_solve(float* M, float* x, float* b, const int N, const int Nsol) { // solves system of N linear equations M*x=b within dimensionality Nsol<=N
for(int i=0; i int color)
- x = clamp(360.0f-x*360.0f/255.0f, 0.0f, 360.0f);
- float r=255.0f, g=0.0f, b=0.0f;
- if(x<60.0f) { // white - yellow
- g = 255.0f;
- b = 255.0f-255.0f*x/60.0f;
- } else if(x<180.0f) { // yellow - red
- g = 255.0f-255.0f*(x-60.0f)/120.0f;
- } else if(x<270.0f) { // red - violet
- r = 255.0f-255.0f*(x-180.0f)/180.0f;
- b = 255.0f*(x-180.0f)/90.0f;
- } else { // violet - black
- r = 255.0f-255.0f*(x-180.0f)/180.0f;
- b = 255.0f-255.0f*(x-270.0f)/90.0f;
- }
- return (((int)r)<<16)|(((int)g)<<8)|((int)b);
-}
-)+R(int rainbow_color(float x) { // coloring scheme (float 0-255 -> int color)
- x = clamp(360.0f-x*360.0f/255.0f, 0.0f, 360.0f);
- float r=0.0f, g=0.0f, b=0.0f; // black
- if(x<60.0f) { // red - yellow
- r = 255.0f;
- g = 255.0f*x/60.0f;
- } else if(x>=60.0f&&x<120.0f) { // yellow - green
- r = 255.0f-255.0f*(x-60.0f)/60.0f;
- g = 255.0f;
- } else if(x>=120.0f&&x<180.0f) { // green - cyan
- g = 255.0f;
- b = 255.0f*(x-120.0f)/60.0f;
- } else if(x>=180.0f&&x<240.0f) { // cyan - blue
- g = 255.0f-255.0f*(x-180.0f)/60.0f;
- b = 255.0f;
- } else if(x>=240.0f&&x<300.0f) { // blue - violet
- r = (255.0f*(x-240.0f)/60.0f)/2.0f;
- b = 255.0f;
- } else { // violet - black
- r = (255.0f-255.0f*(x-300.0f)/60.0f)/2.0f;
- b = 255.0f-255.0f*(x-300.0f)/60.0f;
- }
- return (((int)r)<<16)|(((int)g)<<8)|((int)b);
+)+R(int color_from_floats(const float red, const float green, const float blue) {
+ return clamp((int)fma(255.0f, red, 0.5f), 0, 255)<<16|clamp((int)fma(255.0f, green, 0.5f), 0, 255)<<8|clamp((int)fma(255.0f, blue, 0.5f), 0, 255);
}
-)+R(int color_dim(const int c, const float x) {
- const int r = clamp((int)fma((float)((c>>16)&255), x, 0.5f), 0, 255);
- const int g = clamp((int)fma((float)((c>> 8)&255), x, 0.5f), 0, 255);
- const int b = clamp((int)fma((float)( c &255), x, 0.5f), 0, 255);
- return (r&255)<<16|(g&255)<<8|(b&255);
+)+R(int color_mul(const int c, const float x) { // c*x
+ const int r = min((int)fma((float)((c>>16)&255), x, 0.5f), 255);
+ const int g = min((int)fma((float)((c>> 8)&255), x, 0.5f), 255);
+ const int b = min((int)fma((float)( c &255), x, 0.5f), 255);
+ return r<<16|g<<8|b; // values are already clamped
}
)+R(int color_average(const int c1, const int c2) { // (c1+c2)/s
const uchar4 cc1=as_uchar4(c1), cc2=as_uchar4(c2);
@@ -167,7 +133,51 @@ string opencl_c_container() { return R( // ########################## begin of O
else if(h<240.0f) { g = x; b = c; }
else if(h<300.0f) { r = x; b = c; }
else if(h<360.0f) { r = c; b = x; }
- return (int)((r+m)*255.0f)<<16|(int)((g+m)*255.0f)<<8|(int)((b+m)*255.0f);
+ return color_from_floats(r+m, g+m, b+m);
+}
+)+R(int colorscale_rainbow(float x) { // coloring scheme (float [0, 1] -> int color)
+ x = clamp(6.0f*(1.0f-x), 0.0f, 6.0f);
+ float r=0.0f, g=0.0f, b=0.0f; // black
+ if(x<1.2f) { // red - yellow
+ r = 1.0f;
+ g = x*0.83333333f;
+ } else if(x>=1.2f&&x<2.0f) { // yellow - green
+ r = 2.5f-x*1.25f;
+ g = 1.0f;
+ } else if(x>=2.0f&&x<3.0f) { // green - cyan
+ g = 1.0f;
+ b = x-2.0f;
+ } else if(x>=3.0f&&x<4.0f) { // cyan - blue
+ g = 4.0f-x;
+ b = 1.0f;
+ } else if(x>=4.0f&&x<5.0f) { // blue - violet
+ r = x*0.4f-1.6f;
+ b = 3.0f-x*0.5f;
+ } else { // violet - black
+ r = 2.4f-x*0.4f;
+ b = 3.0f-x*0.5f;
+ }
+ return color_from_floats(r, g, b);
+}
+)+R(int colorscale_iron(float x) { // coloring scheme (float [0, 1] -> int color)
+ x = clamp(4.0f*(1.0f-x), 0.0f, 4.0f);
+ float r=1.0f, g=0.0f, b=0.0f;
+ if(x<0.66666667f) { // white - yellow
+ g = 1.0f;
+ b = 1.0f-x*1.5f;
+ } else if(x<2.0f) { // yellow - red
+ g = 1.5f-x*0.75f;
+ } else if(x<3.0f) { // red - violet
+ r = 2.0f-x*0.5f;
+ b = x-2.0f;
+ } else { // violet - black
+ r = 2.0f-x*0.5f;
+ b = 4.0f-x;
+ }
+ return color_from_floats(r, g, b);
+}
+)+R(int colorscale_twocolor(float x) { // coloring scheme (float [0, 1] -> int color)
+ return x>0.5f ? color_mix(0xFFAA00, 0x181818, clamp(2.0f*x-1.0f, 0.0f, 1.0f)) : color_mix(0x181818, 0x0080FF, clamp(2.0f*x, 0.0f, 1.0f)); // red - gray - blue
}
)+R(int shading(const int c, const float3 p, const float3 normal, const float* camera_cache) { // calculate flat shading color of triangle
)+"#ifndef GRAPHICS_TRANSPARENCY"+R(
@@ -178,9 +188,7 @@ string opencl_c_container() { return R( // ########################## begin of O
const float3 d = p-Rz*(dis/zoom)-pos; // distance vector between p and camera position
const float nl2 = sq(normal.x)+sq(normal.y)+sq(normal.z); // only one native_sqrt instead of two
const float dl2 = sq(d.x)+sq(d.y)+sq(d.z);
- const float br = max(1.5f*fabs(dot(normal, d))*rsqrt(nl2*dl2), 0.3f);
- const float cr=(float)((c>>16)&255), cg=(float)((c>>8)&255), cb=(float)(c&255);
- return min((int)(br*cr), 255)<<16|min((int)(br*cg), 255)<<8|min((int)(br*cb), 255);
+ return color_mul(c, max(1.5f*fabs(dot(normal, d))*rsqrt(nl2*dl2), 0.3f));
)+"#else"+R( // GRAPHICS_TRANSPARENCY
return c; // disable flat shading, just return input color
)+"#endif"+R( // GRAPHICS_TRANSPARENCY
@@ -640,7 +648,7 @@ string opencl_c_container() { return R( // ########################## begin of O
}
}
)+R(int skybox_color_bw(const float x, const float y) {
- return color_dim(0xFFFFFF, 1.0f-y);
+ return color_mul(0xFFFFFF, 1.0f-y);
}
)+R(int skybox_color_hsv(const float x, const float y) {
const float h = fmod(x*360.0f+120.0f, 360.0f);
@@ -2180,6 +2188,19 @@ string opencl_c_container() { return R( // ########################## begin of O
insert_gi(a, index_insert_p(a, direction), 2u*direction+0u, t, transfer_buffer_p, gi);
insert_gi(a, index_insert_m(a, direction), 2u*direction+1u, t, transfer_buffer_m, gi);
}
+
+)+R(kernel void transfer_extract_T(const uint direction, const ulong t, global float* transfer_buffer_p, global float* transfer_buffer_m, const global float* T) {
+ const uint a=get_global_id(0), A=get_area(direction); // a = domain area index for each side, A = area of the domain boundary
+ if(a>=A) return; // area might not be a multiple of def_workgroup_size, so return here to avoid writing in unallocated memory space
+ transfer_buffer_p[a] = T[index_extract_p(a, direction)];
+ transfer_buffer_m[a] = T[index_extract_m(a, direction)];
+}
+)+R(kernel void transfer__insert_T(const uint direction, const ulong t, const global float* transfer_buffer_p, const global float* transfer_buffer_m, global float* T) {
+ const uint a=get_global_id(0), A=get_area(direction); // a = domain area index for each side, A = area of the domain boundary
+ if(a>=A) return; // area might not be a multiple of def_workgroup_size, so return here to avoid writing in unallocated memory space
+ T[index_insert_p(a, direction)] = transfer_buffer_p[a];
+ T[index_insert_m(a, direction)] = transfer_buffer_m[a];
+}
)+"#endif"+R( // TEMPERATURE
@@ -2283,9 +2304,9 @@ string opencl_c_container() { return R( // ########################## begin of O
)+"#ifdef GRAPHICS"+R(
)+"#ifndef FORCE_FIELD"+R( // render flags as grid
-)+R(kernel void graphics_flags(const global uchar* flags, const global float* camera, global int* bitmap, global int* zbuffer) {
+)+R(kernel void graphics_flags(const global float* camera, global int* bitmap, global int* zbuffer, const global uchar* flags) {
)+"#else"+R( // FORCE_FIELD
-)+R(kernel void graphics_flags(const global uchar* flags, const global float* camera, global int* bitmap, global int* zbuffer, const global float* F) {
+)+R(kernel void graphics_flags(const global float* camera, global int* bitmap, global int* zbuffer, const global uchar* flags, const global float* F) {
)+"#endif"+R( // FORCE_FIELD
const uint n = get_global_id(0);
if(n>=(uint)def_N||is_halo(n)) return; // don't execute graphics_flags() on halo
@@ -2345,7 +2366,7 @@ string opencl_c_container() { return R( // ########################## begin of O
const float3 Fn = def_scale_F*(float3)(F[n], F[def_N+(ulong)n], F[2ul*def_N+(ulong)n]);
const float Fnl = length(Fn);
if(Fnl>0.0f) {
- const int c = iron_color(255.0f*Fnl); // color boundaries depending on the force on them
+ const int c = colorscale_iron(Fnl); // color boundaries depending on the force on them
draw_line(p, p+Fn, c, camera_cache, bitmap, zbuffer); // draw colored force vectors
}
}
@@ -2353,9 +2374,9 @@ string opencl_c_container() { return R( // ########################## begin of O
}
)+"#ifndef FORCE_FIELD"+R( // render solid boundaries with marching-cubes
-)+R(kernel void graphics_flags_mc(const global uchar* flags, const global float* camera, global int* bitmap, global int* zbuffer) {
+)+R(kernel void graphics_flags_mc(const global float* camera, global int* bitmap, global int* zbuffer, const global uchar* flags) {
)+"#else"+R( // FORCE_FIELD
-)+R(kernel void graphics_flags_mc(const global uchar* flags, const global float* camera, global int* bitmap, global int* zbuffer, const global float* F) {
+)+R(kernel void graphics_flags_mc(const global float* camera, global int* bitmap, global int* zbuffer, const global uchar* flags, const global float* F) {
)+"#endif"+R( // FORCE_FIELD
const uint n = get_global_id(0);
if(n>=(uint)def_N||is_halo(n)) return; // don't execute graphics_flags() on halo
@@ -2399,30 +2420,34 @@ string opencl_c_container() { return R( // ########################## begin of O
int c0, c1, c2; {
const float x1=p0.x, y1=p0.y, z1=p0.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
const float3 Fi = (x0*y0*z0)*Fj[0]+(x1*y0*z0)*Fj[1]+(x1*y0*z1)*Fj[2]+(x0*y0*z1)*Fj[3]+(x0*y1*z0)*Fj[4]+(x1*y1*z0)*Fj[5]+(x1*y1*z1)*Fj[6]+(x0*y1*z1)*Fj[7]; // perform trilinear interpolation
- c0 = shading(rainbow_color(191.0f+255.0f*def_scale_F*dot(Fi, normal)), p+p0, normal, camera_cache); // rainbow_color(191.0f+255.0f*def_scale_F*dot(Fi, normal));
+ c0 = shading(colorscale_twocolor(0.5f+def_scale_F*dot(Fi, normal)), p+p0, normal, camera_cache); // colorscale_twocolor(0.5f+def_scale_F*dot(Fi, normal));
} {
const float x1=p1.x, y1=p1.y, z1=p1.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
const float3 Fi = (x0*y0*z0)*Fj[0]+(x1*y0*z0)*Fj[1]+(x1*y0*z1)*Fj[2]+(x0*y0*z1)*Fj[3]+(x0*y1*z0)*Fj[4]+(x1*y1*z0)*Fj[5]+(x1*y1*z1)*Fj[6]+(x0*y1*z1)*Fj[7]; // perform trilinear interpolation
- c1 = shading(rainbow_color(191.0f+255.0f*def_scale_F*dot(Fi, normal)), p+p1, normal, camera_cache); // rainbow_color(191.0f+255.0f*def_scale_F*dot(Fi, normal));
+ c1 = shading(colorscale_twocolor(0.5f+def_scale_F*dot(Fi, normal)), p+p1, normal, camera_cache); // colorscale_twocolor(0.5f+def_scale_F*dot(Fi, normal));
} {
const float x1=p2.x, y1=p2.y, z1=p2.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
const float3 Fi = (x0*y0*z0)*Fj[0]+(x1*y0*z0)*Fj[1]+(x1*y0*z1)*Fj[2]+(x0*y0*z1)*Fj[3]+(x0*y1*z0)*Fj[4]+(x1*y1*z0)*Fj[5]+(x1*y1*z1)*Fj[6]+(x0*y1*z1)*Fj[7]; // perform trilinear interpolation
- c2 = shading(rainbow_color(191.0f+255.0f*def_scale_F*dot(Fi, normal)), p+p2, normal, camera_cache); // rainbow_color(191.0f+255.0f*def_scale_F*dot(Fi, normal));
+ c2 = shading(colorscale_twocolor(0.5f+def_scale_F*dot(Fi, normal)), p+p2, normal, camera_cache); // colorscale_twocolor(0.5f+def_scale_F*dot(Fi, normal));
}
draw_triangle_interpolated(p+p0, p+p1, p+p2, c0, c1, c2, camera_cache, bitmap, zbuffer); // draw triangle with interpolated colors
)+"#else"+R( // FORCE_FIELD
- const int c = shading(0xDFDFDF, p+(p0+p1+p2)/3.0f, normal, camera_cache); // 0xDFDFDF;
- draw_triangle(p+p0, p+p1, p+p2, c, camera_cache, bitmap, zbuffer);
+ const int c0 = shading(0xDFDFDF, p+p0, normal, camera_cache);
+ const int c1 = shading(0xDFDFDF, p+p1, normal, camera_cache);
+ const int c2 = shading(0xDFDFDF, p+p2, normal, camera_cache);
+ draw_triangle_interpolated(p+p0, p+p1, p+p2, c0, c1, c2, camera_cache, bitmap, zbuffer);
)+"#endif"+R( // FORCE_FIELD
}
}
-)+R(kernel void graphics_field(const global uchar* flags, const global float* u, const global float* camera, global int* bitmap, global int* zbuffer, const int slice_mode, const int slice_x, const int slice_y, const int slice_z) {
+)+"#ifndef TEMPERATURE"+R(
+)+R(kernel void graphics_field(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const global float* rho, const global float* u, const global uchar* flags) {
+)+"#else"+R( // TEMPERATURE
+)+R(kernel void graphics_field(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const global float* rho, const global float* u, const global uchar* flags, const global float* T) {
+)+"#endif"+R( // TEMPERATURE
const uint n = get_global_id(0);
if(n>=(uint)def_N||is_halo(n)) return; // don't execute graphics_field() on halo
const uint3 xyz = coordinates(n);
- const bool rx=(int)xyz.x!=slice_x, ry=(int)xyz.y!=slice_y, rz=(int)xyz.z!=slice_z;
- if((slice_mode==1&&rx)||(slice_mode==2&&ry)||(slice_mode==3&&rz)||(slice_mode==4&&rx&&rz)||(slice_mode==5&&rx&&ry&&rz)||(slice_mode==6&&ry&&rz)||(slice_mode==7&&rx&&ry)) return;
float camera_cache[15]; // cache camera parameters in case the kernel draws more than one shape
for(uint i=0u; i<15u; i++) camera_cache[i] = camera[i];
const float3 p = position(xyz);
@@ -2432,20 +2457,89 @@ string opencl_c_container() { return R( // ########################## begin of O
)+"#else"+R( // EQUILIBRIUM_BOUNDARIES
if(flags[n]&(TYPE_I|TYPE_G)) return;
)+"#endif"+R( // EQUILIBRIUM_BOUNDARIES
- float3 un = load_u(n, u); // cache velocity
+ const float3 un = load_u(n, u); // cache velocity
const float ul = length(un);
if(def_scale_u*ul<0.1f) return; // don't draw lattice points where the velocity is lower than this threshold
- const int c = iron_color(255.0f*def_scale_u*ul); // coloring by velocity
+ int c = 0; // coloring
+ switch(field_mode) {
+ case 0: c = colorscale_rainbow(def_scale_u*ul); break; // coloring by velocity
+ case 1: c = colorscale_twocolor(0.5f+def_scale_rho*(rho[n]-1.0f)); break; // coloring by density
+)+"#ifdef TEMPERATURE"+R(
+ case 2: c = colorscale_iron(0.5f+def_scale_T*(T[n]-def_T_avg)); break; // coloring by temperature
+)+"#endif"+R( // TEMPERATURE
+ }
draw_line(p-(0.5f/ul)*un, p+(0.5f/ul)*un, c, camera_cache, bitmap, zbuffer);
}
-)+"#ifndef GRAPHICS_TEMPERATURE"+R(
-)+R(kernel void graphics_streamline(const global uchar* flags, const global float* u, const global float* camera, global int* bitmap, global int* zbuffer, const int slice_mode, const int slice_x, const int slice_y, const int slice_z) {
-)+"#else"+R( // GRAPHICS_TEMPERATURE
-)+R(kernel void graphics_streamline(const global uchar* flags, const global float* u, const global float* camera, global int* bitmap, global int* zbuffer, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const global float* T) {
-)+"#endif"+R( // GRAPHICS_TEMPERATURE
+)+"#ifndef TEMPERATURE"+R(
+)+R(kernel void graphics_field_slice(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const global float* rho, const global float* u, const global uchar* flags) {
+)+"#else"+R( // TEMPERATURE
+)+R(kernel void graphics_field_slice(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const global float* rho, const global float* u, const global uchar* flags, const global float* T) {
+)+"#endif"+R( // TEMPERATURE
+ const uint a = get_global_id(0);
+ const uint direction = (uint)clamp(slice_mode-1, 0, 2);
+ if(a>=get_area(direction)||slice_mode<1||slice_mode>3||(slice_mode==1&&(slice_x<0||slice_x>=(int)def_Nx))||(slice_mode==2&&(slice_y<0||slice_y>=(int)def_Ny))||(slice_mode==3&&(slice_z<0||slice_z>=(int)def_Nz))) return;
+ uint3 xyz00, xyz01, xyz10, xyz11;
+ float3 normal;
+ switch(direction) {
+ case 0u: xyz00 = (uint3)((uint)slice_x, a%def_Ny, a/def_Ny); if(xyz00.y>=def_Ny-1u||xyz00.z>=def_Nz-1u) return; xyz01 = xyz00+(uint3)(0u, 0u, 1u); xyz10 = xyz00+(uint3)(0u, 1u, 0u); xyz11 = xyz00+(uint3)(0u, 1u, 1u); normal = (float3)(1.0f, 0.0f, 0.0f); break;
+ case 1u: xyz00 = (uint3)(a/def_Nz, (uint)slice_y, a%def_Nz); if(xyz00.x>=def_Nx-1u||xyz00.z>=def_Nz-1u) return; xyz01 = xyz00+(uint3)(0u, 0u, 1u); xyz10 = xyz00+(uint3)(1u, 0u, 0u); xyz11 = xyz00+(uint3)(1u, 0u, 1u); normal = (float3)(0.0f, 1.0f, 0.0f); break;
+ case 2u: xyz00 = (uint3)(a%def_Nx, a/def_Nx, (uint)slice_z); if(xyz00.x>=def_Nx-1u||xyz00.y>=def_Ny-1u) return; xyz01 = xyz00+(uint3)(0u, 1u, 0u); xyz10 = xyz00+(uint3)(1u, 0u, 0u); xyz11 = xyz00+(uint3)(1u, 1u, 0u); normal = (float3)(0.0f, 0.0f, 1.0f); break;
+ }
+ const float3 p00=position(xyz00), p01=position(xyz01), p10=position(xyz10), p11=position(xyz11), p=0.25f*(p00+p01+p10+p11);
+ float camera_cache[15]; // cache camera parameters in case the kernel draws more than one shape
+ for(uint i=0u; i<15u; i++) camera_cache[i] = camera[i];
+ if(!is_in_camera_frustrum(p, camera_cache)) return; // skip loading LBM data if grid cell is not visible
+ const ulong n00=index(xyz00), n01=index(xyz01), n10=index(xyz10), n11=index(xyz11);
+ bool d00=true, d01=true, d10=true, d11=true;
+)+"#ifdef SURFACE"+R(
+ d00 = flags[n00]&(TYPE_F|TYPE_I); // only draw fluid or interface cells
+ d01 = flags[n01]&(TYPE_F|TYPE_I);
+ d10 = flags[n10]&(TYPE_F|TYPE_I);
+ d11 = flags[n11]&(TYPE_F|TYPE_I);
+ if((int)d00+(int)d01+(int)d10+(int)d11<3) return;
+)+"#endif"+R( // SURFACE
+ int c00=0, c01=0, c10=0, c11=0;
+ switch(field_mode) {
+ case 0: // coloring by velocity
+ c00 = colorscale_rainbow(def_scale_u*length(load_u(n00, u)));
+ c01 = colorscale_rainbow(def_scale_u*length(load_u(n01, u)));
+ c10 = colorscale_rainbow(def_scale_u*length(load_u(n10, u)));
+ c11 = colorscale_rainbow(def_scale_u*length(load_u(n11, u)));
+ break;
+ case 1: // coloring by density
+ c00 = colorscale_twocolor(0.5f+def_scale_rho*(rho[n00]-1.0f));
+ c01 = colorscale_twocolor(0.5f+def_scale_rho*(rho[n01]-1.0f));
+ c10 = colorscale_twocolor(0.5f+def_scale_rho*(rho[n10]-1.0f));
+ c11 = colorscale_twocolor(0.5f+def_scale_rho*(rho[n11]-1.0f));
+ break;
+)+"#ifdef TEMPERATURE"+R(
+ case 2: // coloring by temperature
+ c00 = colorscale_iron(0.5f+def_scale_T*(T[n00]-def_T_avg));
+ c01 = colorscale_iron(0.5f+def_scale_T*(T[n01]-def_T_avg));
+ c10 = colorscale_iron(0.5f+def_scale_T*(T[n10]-def_T_avg));
+ c11 = colorscale_iron(0.5f+def_scale_T*(T[n11]-def_T_avg));
+ break;
+)+"#endif"+R( // TEMPERATURE
+ }
+ c00 = shading(c00, p00, normal, camera_cache);
+ c01 = shading(c01, p01, normal, camera_cache);
+ c10 = shading(c10, p10, normal, camera_cache);
+ c11 = shading(c11, p11, normal, camera_cache);
+ const int c = color_average(color_average(c00, c11), color_average(c01, c10));
+ if(d00&&d01) draw_triangle_interpolated(p00, p01, p, c00, c01, c, camera_cache, bitmap, zbuffer);
+ if(d01&&d11) draw_triangle_interpolated(p01, p11, p, c01, c11, c, camera_cache, bitmap, zbuffer);
+ if(d11&&d10) draw_triangle_interpolated(p11, p10, p, c11, c10, c, camera_cache, bitmap, zbuffer);
+ if(d10&&d00) draw_triangle_interpolated(p10, p00, p, c10, c00, c, camera_cache, bitmap, zbuffer);
+}
+
+)+"#ifndef TEMPERATURE"+R(
+)+R(kernel void graphics_streamline(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const global float* rho, const global float* u, const global uchar* flags) {
+)+"#else"+R( // TEMPERATURE
+)+R(kernel void graphics_streamline(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const global float* rho, const global float* u, const global uchar* flags, const global float* T) {
+)+"#endif"+R( // TEMPERATURE
const uint n = get_global_id(0);
- const float3 slice = position((uint3)(slice_x, slice_y, slice_z));
+ const float3 ps = (float3)((float)slice_x+0.5f-0.5f*(float)def_Nx, (float)slice_y+0.5f-0.5f*(float)def_Ny, (float)slice_z+0.5f-0.5f*(float)def_Nz);
)+"#ifndef D2Q9"+R(
if(n>=(def_Nx/def_streamline_sparse)*(def_Ny/def_streamline_sparse)*(def_Nz/def_streamline_sparse)) return;
const uint z = n/((def_Nx/def_streamline_sparse)*(def_Ny/def_streamline_sparse)); // disassemble 1D index to 3D coordinates
@@ -2453,18 +2547,18 @@ string opencl_c_container() { return R( // ########################## begin of O
const uint y = t/(def_Nx/def_streamline_sparse);
const uint x = t%(def_Nx/def_streamline_sparse);
float3 p = (float)def_streamline_sparse*((float3)((float)x+0.5f, (float)y+0.5f, (float)z+0.5f))-0.5f*((float3)((float)def_Nx, (float)def_Ny, (float)def_Nz));
- const bool rx=fabs(p.x-slice.x)>0.5f*(float)def_streamline_sparse, ry=fabs(p.y-slice.y)>0.5f*(float)def_streamline_sparse, rz=fabs(p.z-slice.z)>0.5f*(float)def_streamline_sparse;
+ const bool rx=fabs(p.x-ps.x)>0.5f*(float)def_streamline_sparse, ry=fabs(p.y-ps.y)>0.5f*(float)def_streamline_sparse, rz=fabs(p.z-ps.z)>0.5f*(float)def_streamline_sparse;
)+"#else"+R( // D2Q9
if(n>=(def_Nx/def_streamline_sparse)*(def_Ny/def_streamline_sparse)) return;
const uint y = n/(def_Nx/def_streamline_sparse); // disassemble 1D index to 3D coordinates
const uint x = n%(def_Nx/def_streamline_sparse);
float3 p = ((float3)((float)def_streamline_sparse*((float)x+0.5f), (float)def_streamline_sparse*((float)y+0.5f), 0.5f))-0.5f*((float3)((float)def_Nx, (float)def_Ny, (float)def_Nz));
- const bool rx=fabs(p.x-slice.x)>0.5f*(float)def_streamline_sparse, ry=fabs(p.y-slice.y)>0.5f*(float)def_streamline_sparse, rz=true;
+ const bool rx=fabs(p.x-ps.x)>0.5f*(float)def_streamline_sparse, ry=fabs(p.y-ps.y)>0.5f*(float)def_streamline_sparse, rz=true;
)+"#endif"+R( // D2Q9
if((slice_mode==1&&rx)||(slice_mode==2&&ry)||(slice_mode==3&&rz)||(slice_mode==4&&rx&&rz)||(slice_mode==5&&rx&&ry&&rz)||(slice_mode==6&&ry&&rz)||(slice_mode==7&&rx&&ry)) return;
- if((slice_mode==1||slice_mode==5||slice_mode==4||slice_mode==7)&!rx) p.x = slice.x; // snap streamline position to slice position
- if((slice_mode==2||slice_mode==5||slice_mode==6||slice_mode==7)&!ry) p.y = slice.y;
- if((slice_mode==3||slice_mode==5||slice_mode==4||slice_mode==6)&!rz) p.z = slice.z;
+ if((slice_mode==1||slice_mode==5||slice_mode==4||slice_mode==7)&!rx) p.x = ps.x; // snap streamline position to slice position
+ if((slice_mode==2||slice_mode==5||slice_mode==6||slice_mode==7)&!ry) p.y = ps.y;
+ if((slice_mode==3||slice_mode==5||slice_mode==4||slice_mode==6)&!rz) p.z = ps.z;
float camera_cache[15]; // cache camera parameters in case the kernel draws more than one shape
for(uint i=0u; i<15u; i++) camera_cache[i] = camera[i];
const float hLx=0.5f*(float)(def_Nx-2u*(def_Dx>1u)), hLy=0.5f*(float)(def_Ny-2u*(def_Dy>1u)), hLz=0.5f*(float)(def_Nz-2u*(def_Dz>1u));
@@ -2482,17 +2576,24 @@ string opencl_c_container() { return R( // ########################## begin of O
p0 = p1;
p1 += (dt/ul)*un; // integrate forward in time
if(def_scale_u*ul<0.1f||p1.x<-hLx||p1.x>hLx||p1.y<-hLy||p1.y>hLy||p1.z<-hLz||p1.z>hLz) break;
-)+"#ifndef GRAPHICS_TEMPERATURE"+R(
- const int c = iron_color(255.0f*def_scale_u*ul);
-)+"#else"+R( // GRAPHICS_TEMPERATURE
- const int c = iron_color(167.0f+255.0f*(T[n]-def_T_avg));
-)+"#endif"+R( // GRAPHICS_TEMPERATURE
+ int c = 0; // coloring
+ switch(field_mode) {
+ case 0: c = colorscale_rainbow(def_scale_u*ul); break; // coloring by velocity
+ case 1: c = colorscale_twocolor(0.5f+def_scale_rho*(rho[n]-1.0f)); break; // coloring by density
+)+"#ifdef TEMPERATURE"+R(
+ case 2: c = colorscale_iron(0.5f+def_scale_T*(T[n]-def_T_avg)); break; // coloring by temperature
+)+"#endif"+R( // TEMPERATURE
+ }
draw_line(p0, p1, c, camera_cache, bitmap, zbuffer);
}
}
}
-)+R(kernel void graphics_q_field(const global uchar* flags, const global float* u, const global float* camera, global int* bitmap, global int* zbuffer) {
+)+"#ifndef TEMPERATURE"+R(
+)+R(kernel void graphics_q_field(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const global float* rho, const global float* u, const global uchar* flags) {
+)+"#else"+R( // TEMPERATURE
+)+R(kernel void graphics_q_field(const global float* camera, global int* bitmap, global int* zbuffer, const int field_mode, const global float* rho, const global float* u, const global uchar* flags, const global float* T) {
+)+"#endif"+R( // TEMPERATURE
const uint n = get_global_id(0);
if(n>=(uint)def_N||is_halo(n)) return; // don't execute graphics_q_field() on halo
if(flags[n]&(TYPE_S|TYPE_E|TYPE_I|TYPE_G)) return;
@@ -2504,11 +2605,22 @@ string opencl_c_container() { return R( // ########################## begin of O
const float ul = length(un);
const float Q = calculate_Q(n, u);
if(Q=def_Nx-1u||xyz.y>=def_Ny-1u||xyz.z>=def_Nz-1u||is_halo_q(xyz)) return; // don't execute graphics_q_field() on marching-cubes halo
@@ -2583,25 +2695,64 @@ string opencl_c_container() { return R( // ########################## begin of O
const float3 p1 = triangles[3u*i+1u];
const float3 p2 = triangles[3u*i+2u];
const float3 normal = normalize(cross(p1-p0, p2-p0));
- int c0, c1, c2; {
- const float x1=p0.x, y1=p0.y, z1=p0.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
- const float3 ui = (x0*y0*z0)*uj[0]+(x1*y0*z0)*uj[1]+(x1*y0*z1)*uj[2]+(x0*y0*z1)*uj[3]+(x0*y1*z0)*uj[4]+(x1*y1*z0)*uj[5]+(x1*y1*z1)*uj[6]+(x0*y1*z1)*uj[7]; // perform trilinear interpolation
- c0 = shading(rainbow_color(255.0f*def_scale_u*length(ui)), p+p0, normal, camera_cache); // rainbow_color(255.0f*def_scale_u*length(ui));
- } {
- const float x1=p1.x, y1=p1.y, z1=p1.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
- const float3 ui = (x0*y0*z0)*uj[0]+(x1*y0*z0)*uj[1]+(x1*y0*z1)*uj[2]+(x0*y0*z1)*uj[3]+(x0*y1*z0)*uj[4]+(x1*y1*z0)*uj[5]+(x1*y1*z1)*uj[6]+(x0*y1*z1)*uj[7]; // perform trilinear interpolation
- c1 = shading(rainbow_color(255.0f*def_scale_u*length(ui)), p+p1, normal, camera_cache); // rainbow_color(255.0f*def_scale_u*length(ui));
- } {
- const float x1=p2.x, y1=p2.y, z1=p2.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
- const float3 ui = (x0*y0*z0)*uj[0]+(x1*y0*z0)*uj[1]+(x1*y0*z1)*uj[2]+(x0*y0*z1)*uj[3]+(x0*y1*z0)*uj[4]+(x1*y1*z0)*uj[5]+(x1*y1*z1)*uj[6]+(x0*y1*z1)*uj[7]; // perform trilinear interpolation
- c2 = shading(rainbow_color(255.0f*def_scale_u*length(ui)), p+p2, normal, camera_cache); // rainbow_color(255.0f*def_scale_u*length(ui));
+ int c0=0, c1=0, c2=0;
+ switch(field_mode) {
+ case 0: // coloring by velocity
+ {
+ const float x1=p0.x, y1=p0.y, z1=p0.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float3 ui = (x0*y0*z0)*uj[0]+(x1*y0*z0)*uj[1]+(x1*y0*z1)*uj[2]+(x0*y0*z1)*uj[3]+(x0*y1*z0)*uj[4]+(x1*y1*z0)*uj[5]+(x1*y1*z1)*uj[6]+(x0*y1*z1)*uj[7]; // perform trilinear interpolation
+ c0 = shading(colorscale_rainbow(def_scale_u*length(ui)), p+p0, normal, camera_cache); // colorscale_rainbow(def_scale_u*length(ui));
+ } {
+ const float x1=p1.x, y1=p1.y, z1=p1.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float3 ui = (x0*y0*z0)*uj[0]+(x1*y0*z0)*uj[1]+(x1*y0*z1)*uj[2]+(x0*y0*z1)*uj[3]+(x0*y1*z0)*uj[4]+(x1*y1*z0)*uj[5]+(x1*y1*z1)*uj[6]+(x0*y1*z1)*uj[7]; // perform trilinear interpolation
+ c1 = shading(colorscale_rainbow(def_scale_u*length(ui)), p+p1, normal, camera_cache); // colorscale_rainbow(def_scale_u*length(ui));
+ } {
+ const float x1=p2.x, y1=p2.y, z1=p2.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float3 ui = (x0*y0*z0)*uj[0]+(x1*y0*z0)*uj[1]+(x1*y0*z1)*uj[2]+(x0*y0*z1)*uj[3]+(x0*y1*z0)*uj[4]+(x1*y1*z0)*uj[5]+(x1*y1*z1)*uj[6]+(x0*y1*z1)*uj[7]; // perform trilinear interpolation
+ c2 = shading(colorscale_rainbow(def_scale_u*length(ui)), p+p2, normal, camera_cache); // colorscale_rainbow(def_scale_u*length(ui));
+ }
+ break;
+ case 1: // coloring by density
+ for(uint i=0u; i<8u; i++) v[i] = rho[j[i]];
+ {
+ const float x1=p0.x, y1=p0.y, z1=p0.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float vi = (x0*y0*z0)*v[0]+(x1*y0*z0)*v[1]+(x1*y0*z1)*v[2]+(x0*y0*z1)*v[3]+(x0*y1*z0)*v[4]+(x1*y1*z0)*v[5]+(x1*y1*z1)*v[6]+(x0*y1*z1)*v[7]; // perform trilinear interpolation
+ c0 = shading(colorscale_twocolor(0.5f+def_scale_rho*(vi-1.0f)), p+p0, normal, camera_cache); // colorscale_twocolor(0.5f+def_scale_rho*(vi-1.0f));
+ } {
+ const float x1=p1.x, y1=p1.y, z1=p1.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float vi = (x0*y0*z0)*v[0]+(x1*y0*z0)*v[1]+(x1*y0*z1)*v[2]+(x0*y0*z1)*v[3]+(x0*y1*z0)*v[4]+(x1*y1*z0)*v[5]+(x1*y1*z1)*v[6]+(x0*y1*z1)*v[7]; // perform trilinear interpolation
+ c1 = shading(colorscale_twocolor(0.5f+def_scale_rho*(vi-1.0f)), p+p1, normal, camera_cache); // colorscale_twocolor(0.5f+def_scale_rho*(vi-1.0f));
+ } {
+ const float x1=p2.x, y1=p2.y, z1=p2.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float vi = (x0*y0*z0)*v[0]+(x1*y0*z0)*v[1]+(x1*y0*z1)*v[2]+(x0*y0*z1)*v[3]+(x0*y1*z0)*v[4]+(x1*y1*z0)*v[5]+(x1*y1*z1)*v[6]+(x0*y1*z1)*v[7]; // perform trilinear interpolation
+ c2 = shading(colorscale_twocolor(0.5f+def_scale_rho*(vi-1.0f)), p+p2, normal, camera_cache); // colorscale_twocolor(0.5f+def_scale_rho*(vi-1.0f));
+ }
+ break;
+)+"#ifdef TEMPERATURE"+R(
+ case 2: // coloring by temperature
+ for(uint i=0u; i<8u; i++) v[i] = T[j[i]];
+ {
+ const float x1=p0.x, y1=p0.y, z1=p0.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float vi = (x0*y0*z0)*v[0]+(x1*y0*z0)*v[1]+(x1*y0*z1)*v[2]+(x0*y0*z1)*v[3]+(x0*y1*z0)*v[4]+(x1*y1*z0)*v[5]+(x1*y1*z1)*v[6]+(x0*y1*z1)*v[7]; // perform trilinear interpolation
+ c0 = shading(colorscale_iron(0.5f+def_scale_T*(vi-def_T_avg)), p+p0, normal, camera_cache); // colorscale_iron(0.5f+def_scale_T*(T[n]-def_T_avg));
+ } {
+ const float x1=p1.x, y1=p1.y, z1=p1.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float vi = (x0*y0*z0)*v[0]+(x1*y0*z0)*v[1]+(x1*y0*z1)*v[2]+(x0*y0*z1)*v[3]+(x0*y1*z0)*v[4]+(x1*y1*z0)*v[5]+(x1*y1*z1)*v[6]+(x0*y1*z1)*v[7]; // perform trilinear interpolation
+ c1 = shading(colorscale_iron(0.5f+def_scale_T*(vi-def_T_avg)), p+p1, normal, camera_cache); // colorscale_iron(0.5f+def_scale_T*(T[n]-def_T_avg));
+ } {
+ const float x1=p2.x, y1=p2.y, z1=p2.z, x0=1.0f-x1, y0=1.0f-y1, z0=1.0f-z1; // calculate interpolation factors
+ const float vi = (x0*y0*z0)*v[0]+(x1*y0*z0)*v[1]+(x1*y0*z1)*v[2]+(x0*y0*z1)*v[3]+(x0*y1*z0)*v[4]+(x1*y1*z0)*v[5]+(x1*y1*z1)*v[6]+(x0*y1*z1)*v[7]; // perform trilinear interpolation
+ c2 = shading(colorscale_iron(0.5f+def_scale_T*(vi-def_T_avg)), p+p2, normal, camera_cache); // colorscale_iron(0.5f+def_scale_T*(T[n]-def_T_avg));
+ }
+ break;
+)+"#endif"+R( // TEMPERATURE
}
draw_triangle_interpolated(p+p0, p+p1, p+p2, c0, c1, c2, camera_cache, bitmap, zbuffer); // draw triangle with interpolated colors
}
}
)+"#ifdef SURFACE"+R(
-)+R(kernel void graphics_rasterize_phi(const global float* phi, const global float* camera, global int* bitmap, global int* zbuffer) { // marching cubes
+)+R(kernel void graphics_rasterize_phi(const global float* camera, global int* bitmap, global int* zbuffer, const global float* phi) { // marching cubes
const uint n = get_global_id(0);
const uint3 xyz = coordinates(n);
if(xyz.x>=def_Nx-1u||xyz.y>=def_Ny-1u||xyz.z>=def_Nz-1u) return;
@@ -2630,15 +2781,18 @@ string opencl_c_container() { return R( // ########################## begin of O
const uint tn = marching_cubes(v, 0.5f, triangles); // run marching cubes algorithm
if(tn==0u) return;
for(uint i=0u; i=(uint)def_particles_N) return;
float camera_cache[15]; // cache parameters in case the kernel draws more than one shape
for(uint i=0u; i<15u; i++) camera_cache[i] = camera[i];
const int c = COLOR_P; // coloring scheme
const float3 p = (float3)(particles[n], particles[def_particles_N+(ulong)n], particles[2ul*def_particles_N+(ulong)n]);
- //draw_circle(p, 0.5f, c, camera_cache, bitmap, zbuffer);
draw_point(p, c, camera_cache, bitmap, zbuffer);
+ //draw_circle(p, 0.5f, c, camera_cache, bitmap, zbuffer);
}
)+"#endif"+R( // PARTICLES
)+"#endif"+R( // GRAPHICS
diff --git a/src/lbm.cpp b/src/lbm.cpp
index 17b1acfc..84c346e0 100644
--- a/src/lbm.cpp
+++ b/src/lbm.cpp
@@ -422,15 +422,16 @@ void LBM_Domain::Graphics::allocate(Device& device) {
camera_parameters = Memory(device, 15u);
kernel_clear = Kernel(device, bitmap.length(), "graphics_clear", bitmap, zbuffer);
- kernel_graphics_flags = Kernel(device, lbm->get_N(), "graphics_flags", lbm->flags, camera_parameters, bitmap, zbuffer);
- kernel_graphics_flags_mc = Kernel(device, lbm->get_N(), "graphics_flags_mc", lbm->flags, camera_parameters, bitmap, zbuffer);
- kernel_graphics_field = Kernel(device, lbm->get_N(), "graphics_field", lbm->flags, lbm->u, camera_parameters, bitmap, zbuffer, 0, 0, 0, 0);
+ kernel_graphics_flags = Kernel(device, lbm->get_N(), "graphics_flags", camera_parameters, bitmap, zbuffer, lbm->flags);
+ kernel_graphics_flags_mc = Kernel(device, lbm->get_N(), "graphics_flags_mc", camera_parameters, bitmap, zbuffer, lbm->flags);
+ kernel_graphics_field = Kernel(device, lbm->get_N(), "graphics_field", camera_parameters, bitmap, zbuffer, 0, lbm->rho, lbm->u, lbm->flags);
+ kernel_graphics_field_slice = Kernel(device, lbm->get_N(), "graphics_field_slice", camera_parameters, bitmap, zbuffer, 0, 0, 0, 0, 0, lbm->rho, lbm->u, lbm->flags);
#ifndef D2Q9
- kernel_graphics_streamline = Kernel(device, (lbm->get_Nx()/GRAPHICS_STREAMLINE_SPARSE)*(lbm->get_Ny()/GRAPHICS_STREAMLINE_SPARSE)*(lbm->get_Nz()/GRAPHICS_STREAMLINE_SPARSE), "graphics_streamline", lbm->flags, lbm->u, camera_parameters, bitmap, zbuffer, 0, 0, 0, 0); // 3D
+ kernel_graphics_streamline = Kernel(device, (lbm->get_Nx()/GRAPHICS_STREAMLINE_SPARSE)*(lbm->get_Ny()/GRAPHICS_STREAMLINE_SPARSE)*(lbm->get_Nz()/GRAPHICS_STREAMLINE_SPARSE), "graphics_streamline", camera_parameters, bitmap, zbuffer, 0, 0, 0, 0, 0, lbm->rho, lbm->u, lbm->flags); // 3D
#else // D2Q9
- kernel_graphics_streamline = Kernel(device, (lbm->get_Nx()/GRAPHICS_STREAMLINE_SPARSE)*(lbm->get_Ny()/GRAPHICS_STREAMLINE_SPARSE), "graphics_streamline", lbm->flags, lbm->u, camera_parameters, bitmap, zbuffer, 0, 0, 0, 0); // 2D
+ kernel_graphics_streamline = Kernel(device, (lbm->get_Nx()/GRAPHICS_STREAMLINE_SPARSE)*(lbm->get_Ny()/GRAPHICS_STREAMLINE_SPARSE), "graphics_streamline", camera_parameters, bitmap, zbuffer, 0, 0, 0, 0, 0, lbm->rho, lbm->u, lbm->flags); // 2D
#endif // D2Q9
- kernel_graphics_q = Kernel(device, lbm->get_N(), "graphics_q", lbm->flags, lbm->u, camera_parameters, bitmap, zbuffer);
+ kernel_graphics_q = Kernel(device, lbm->get_N(), "graphics_q", camera_parameters, bitmap, zbuffer, 0, lbm->rho, lbm->u, lbm->flags);
#ifdef FORCE_FIELD
kernel_graphics_flags.add_parameters(lbm->F);
@@ -439,16 +440,19 @@ void LBM_Domain::Graphics::allocate(Device& device) {
#ifdef SURFACE
skybox = Memory(device, skybox_image->width()*skybox_image->height(), 1u, skybox_image->data());
- kernel_graphics_rasterize_phi = Kernel(device, lbm->get_N(), "graphics_rasterize_phi", lbm->phi, camera_parameters, bitmap, zbuffer);
- kernel_graphics_raytrace_phi = Kernel(device, bitmap.length(), "graphics_raytrace_phi", lbm->phi, lbm->flags, skybox, camera_parameters, bitmap);
+ kernel_graphics_rasterize_phi = Kernel(device, lbm->get_N(), "graphics_rasterize_phi", camera_parameters, bitmap, zbuffer, lbm->phi);
+ kernel_graphics_raytrace_phi = Kernel(device, bitmap.length(), "graphics_raytrace_phi", camera_parameters, bitmap, skybox, lbm->phi, lbm->flags);
#endif // SURFACE
#ifdef TEMPERATURE
+ kernel_graphics_field.add_parameters(lbm->T);
+ kernel_graphics_field_slice.add_parameters(lbm->T);
kernel_graphics_streamline.add_parameters(lbm->T);
+ kernel_graphics_q.add_parameters(lbm->T);
#endif // TEMPERATURE
#ifdef PARTICLES
- kernel_graphics_particles = Kernel(device, lbm->particles.length(), "graphics_particles", lbm->particles, camera_parameters, bitmap, zbuffer);
+ kernel_graphics_particles = Kernel(device, lbm->particles.length(), "graphics_particles", camera_parameters, bitmap, zbuffer, lbm->particles);
#endif // PARTICLES
}
@@ -462,7 +466,7 @@ bool LBM_Domain::Graphics::update_camera() {
}
return change; // return false if camera parameters remain unchanged
}
-bool LBM_Domain::Graphics::enqueue_draw_frame(const int visualization_modes, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const bool visualization_change) {
+bool LBM_Domain::Graphics::enqueue_draw_frame(const int visualization_modes, const int field_mode, const int slice_mode, const int slice_x, const int slice_y, const int slice_z, const bool visualization_change) {
const bool camera_update = update_camera();
#if defined(INTERACTIVE_GRAPHICS)||defined(INTERACTIVE_GRAPHICS_ASCII)
if(!visualization_change&&!camera_update&&!camera.key_update&&lbm->get_t()==t_last_rendered_frame) return false; // don't render a new frame if the scene hasn't changed since last frame
@@ -471,17 +475,44 @@ bool LBM_Domain::Graphics::enqueue_draw_frame(const int visualization_modes, con
camera.key_update = false;
if(camera_update) camera_parameters.enqueue_write_to_device(); // camera_parameters PCIe transfer and kernel_clear execution can happen simulataneously
kernel_clear.enqueue_run();
+ const int sx=slice_x-lbm->Ox, sy=slice_y-lbm->Oy, sz=slice_z-lbm->Oz; // subtract domain offsets
#ifdef SURFACE
if((visualization_modes&VIS_PHI_RAYTRACE)&&lbm->get_D()==1u) kernel_graphics_raytrace_phi.enqueue_run(); // disable raytracing for multi-GPU (domain decomposition rendering doesn't work for raytracing)
if(visualization_modes&VIS_PHI_RASTERIZE) kernel_graphics_rasterize_phi.enqueue_run();
#endif // SURFACE
- if(visualization_modes&VIS_FLAG_LATTICE ) kernel_graphics_flags.enqueue_run();
- if(visualization_modes&VIS_FLAG_SURFACE ) kernel_graphics_flags_mc.enqueue_run();
- if(visualization_modes&VIS_FIELD ) kernel_graphics_field.set_parameters(5u, slice_mode, slice_x-lbm->Ox, slice_y-lbm->Oy, slice_z-lbm->Oz).enqueue_run();
- if(visualization_modes&VIS_STREAMLINES ) kernel_graphics_streamline.set_parameters(5u, slice_mode, slice_x-lbm->Ox, slice_y-lbm->Oy, slice_z-lbm->Oz).enqueue_run();
- if(visualization_modes&VIS_Q_CRITERION ) kernel_graphics_q.enqueue_run();
+ if(visualization_modes&VIS_FLAG_LATTICE) kernel_graphics_flags.enqueue_run();
+ if(visualization_modes&VIS_FLAG_SURFACE) kernel_graphics_flags_mc.enqueue_run();
+ if(visualization_modes&VIS_FIELD) {
+ switch(slice_mode) { // 0 (no slice), 1 (x), 2 (y), 3 (z), 4 (xz), 5 (xyz), 6 (yz), 7 (xy)
+ case 0: // no slice
+ kernel_graphics_field.set_parameters(3u, field_mode).enqueue_run();
+ break;
+ case 1: case 2: case 3: // x/y/z
+ kernel_graphics_field_slice.set_ranges(lbm->get_area((uint)clamp(slice_mode-1, 0, 2))).set_parameters(3u, field_mode, slice_mode, sx, sy, sz).enqueue_run();
+ break;
+ case 4: // xz
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(0u)).set_parameters(3u, field_mode, 0u+1u, sx, sy, sz).enqueue_run();
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(2u)).set_parameters(3u, field_mode, 2u+1u, sx, sy, sz).enqueue_run();
+ break;
+ case 5: // xyz
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(0u)).set_parameters(3u, field_mode, 0u+1u, sx, sy, sz).enqueue_run();
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(1u)).set_parameters(3u, field_mode, 1u+1u, sx, sy, sz).enqueue_run();
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(2u)).set_parameters(3u, field_mode, 2u+1u, sx, sy, sz).enqueue_run();
+ break;
+ case 6: // yz
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(1u)).set_parameters(3u, field_mode, 1u+1u, sx, sy, sz).enqueue_run();
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(2u)).set_parameters(3u, field_mode, 2u+1u, sx, sy, sz).enqueue_run();
+ break;
+ case 7: // xy
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(0u)).set_parameters(3u, field_mode, 0u+1u, sx, sy, sz).enqueue_run();
+ kernel_graphics_field_slice.set_ranges(lbm->get_area(1u)).set_parameters(3u, field_mode, 1u+1u, sx, sy, sz).enqueue_run();
+ break;
+ }
+ }
+ if(visualization_modes&VIS_STREAMLINES) kernel_graphics_streamline.set_parameters(3u, field_mode, slice_mode, sx, sy, sz).enqueue_run();
+ if(visualization_modes&VIS_Q_CRITERION) kernel_graphics_q.set_parameters(3u, field_mode).enqueue_run();
#ifdef PARTICLES
- if(visualization_modes&VIS_PARTICLES ) kernel_graphics_particles.enqueue_run();
+ if(visualization_modes&VIS_PARTICLES) kernel_graphics_particles.enqueue_run();
#endif // PARTICLES
bitmap.enqueue_read_from_device();
if(lbm->get_D()>1u) zbuffer.enqueue_read_from_device();
@@ -500,8 +531,10 @@ string LBM_Domain::Graphics::device_defines() const { return
"\n #define def_screen_width " +to_string(camera.width)+"u"
"\n #define def_screen_height " +to_string(camera.height)+"u"
"\n #define def_scale_u " +to_string(1.0f/(0.57735027f*(GRAPHICS_U_MAX)))+"f"
+ "\n #define def_scale_rho " +to_string(0.5f/(GRAPHICS_RHO_DELTA))+"f"
+ "\n #define def_scale_T " +to_string(0.5f/(GRAPHICS_T_DELTA))+"f"
+ "\n #define def_scale_F " +to_string(0.5f/(GRAPHICS_F_MAX))+"f"
"\n #define def_scale_Q_min " +to_string(GRAPHICS_Q_CRITERION)+"f"
- "\n #define def_scale_F " +to_string(1.0f/(GRAPHICS_F_MAX))+"f"
"\n #define def_streamline_sparse "+to_string(GRAPHICS_STREAMLINE_SPARSE)+"u"
"\n #define def_streamline_length "+to_string(GRAPHICS_STREAMLINE_LENGTH)+"u"
"\n #define def_n " +to_string(1.333f)+"f" // refractive index of water for raytracing graphics
@@ -530,10 +563,6 @@ string LBM_Domain::Graphics::device_defines() const { return
"\n #define def_skybox_width " +to_string(skybox_image->width() )+"u"
"\n #define def_skybox_height "+to_string(skybox_image->height())+"u"
#endif // SURFACE
-
-#ifdef TEMPERATURE
- "\n #define GRAPHICS_TEMPERATURE"
-#endif // TEMPERATURE
;}
#endif // GRAPHICS
@@ -614,44 +643,44 @@ LBM::LBM(const uint Nx, const uint Ny, const uint Nz, const uint Dx, const uint
const uint Hx=Dx>1u, Hy=Dy>1u, Hz=Dz>1u; // halo offsets
const vector& device_infos = smart_device_selection(D);
sanity_checks_constructor(device_infos, this->Nx, this->Ny, this->Nz, Dx, Dy, Dz, nu, fx, fy, fz, sigma, alpha, beta, particles_N, particles_rho);
- lbm = new LBM_Domain*[D];
+ lbm_domain = new LBM_Domain*[D];
for(uint d=0u; dNx/Dx+2u*Hx, this->Ny/Dy+2u*Hy, this->Nz/Dz+2u*Hz, Dx, Dy, Dz, (int)(x*this->Nx/Dx)-(int)Hx, (int)(y*this->Ny/Dy)-(int)Hy, (int)(z*this->Nz/Dz)-(int)Hz, nu, fx, fy, fz, sigma, alpha, beta, particles_N, particles_rho);
+ lbm_domain[d] = new LBM_Domain(device_infos[d], this->Nx/Dx+2u*Hx, this->Ny/Dy+2u*Hy, this->Nz/Dz+2u*Hz, Dx, Dy, Dz, (int)(x*this->Nx/Dx)-(int)Hx, (int)(y*this->Ny/Dy)-(int)Hy, (int)(z*this->Nz/Dz)-(int)Hz, nu, fx, fy, fz, sigma, alpha, beta, particles_N, particles_rho);
} // });
{
Memory** buffers_rho = new Memory*[D];
- for(uint d=0u; drho);
+ for(uint d=0u; drho);
rho = Memory_Container(this, buffers_rho, "rho");
} {
Memory** buffers_u = new Memory*[D];
- for(uint d=0u; du);
+ for(uint d=0u; du);
u = Memory_Container(this, buffers_u, "u");
} {
Memory** buffers_flags = new Memory*[D];
- for(uint d=0u; dflags);
+ for(uint d=0u; dflags);
flags = Memory_Container(this, buffers_flags, "flags");
} {
#ifdef FORCE_FIELD
Memory** buffers_F = new Memory*[D];
- for(uint d=0u; dF);
+ for(uint d=0u; dF);
F = Memory_Container(this, buffers_F, "F");
#endif // FORCE_FIELD
} {
#ifdef SURFACE
Memory** buffers_phi = new Memory*[D];
- for(uint d=0u; dphi);
+ for(uint d=0u; dphi);
phi = Memory_Container(this, buffers_phi, "phi");
#endif // SURFACE
} {
#ifdef TEMPERATURE
Memory** buffers_T = new Memory*[D];
- for(uint d=0u; dT);
+ for(uint d=0u; dT);
T = Memory_Container(this, buffers_T, "T");
#endif // TEMPERATURE
} {
#ifdef PARTICLES
- particles = &(lbm[0]->particles);
+ particles = &(lbm_domain[0]->particles);
#endif // PARTICLES
}
#ifdef GRAPHICS
@@ -661,8 +690,8 @@ LBM::LBM(const uint Nx, const uint Ny, const uint Nz, const uint Dx, const uint
}
LBM::~LBM() {
info.print_finalize();
- for(uint d=0u; d& device_infos, const uint Nx, const uint Ny, const uint Nz, const uint Dx, const uint Dy, const uint Dz, const float nu, const float fx, const float fy, const float fz, const float sigma, const float alpha, const float beta, const uint particles_N, const float particles_rho) { // sanity checks on grid resolution and extension support
@@ -770,66 +799,70 @@ void LBM::initialize() { // write all data fields to device and call kernel_init
sanity_checks_initialization();
#endif // BENCHMARK
- for(uint d=0u; drho.enqueue_write_to_device();
- for(uint d=0u; du.enqueue_write_to_device();
- for(uint d=0u; dflags.enqueue_write_to_device();
+ for(uint d=0u; drho.enqueue_write_to_device();
+ for(uint d=0u; du.enqueue_write_to_device();
+ for(uint d=0u; dflags.enqueue_write_to_device();
#ifdef FORCE_FIELD
- for(uint d=0u; dF.enqueue_write_to_device();
+ for(uint d=0u; dF.enqueue_write_to_device();
#endif // FORCE_FIELD
#ifdef SURFACE
- for(uint d=0u; dphi.enqueue_write_to_device();
+ for(uint d=0u; dphi.enqueue_write_to_device();
#endif // SURFACE
#ifdef TEMPERATURE
- for(uint d=0u; dT.enqueue_write_to_device();
+ for(uint d=0u; dT.enqueue_write_to_device();
#endif // TEMPERATURE
#ifdef PARTICLES
- for(uint d=0u; dparticles.enqueue_write_to_device();
+ for(uint d=0u; dparticles.enqueue_write_to_device();
#endif // PARTICLES
- for(uint d=0u; dincrement_time_step(); // the communicate calls at initialization need an odd time step
+ for(uint d=0u; dincrement_time_step(); // the communicate calls at initialization need an odd time step
communicate_rho_u_flags();
#ifdef SURFACE
communicate_phi_massex_flags();
#endif // SURFACE
- for(uint d=0u; denqueue_initialize(); // odd time step is baked-in the kernel
+ for(uint d=0u; denqueue_initialize(); // odd time step is baked-in the kernel
communicate_rho_u_flags();
#ifdef SURFACE
communicate_phi_massex_flags();
#endif // SURFACE
communicate_fi(); // time step must be odd here
#ifdef TEMPERATURE
+ communicate_T(); // T halo data is required for field_slice rendering
communicate_gi(); // time step must be odd here
#endif // TEMPERATURE
- for(uint d=0u; dfinish_queue();
- for(uint d=0u; dreset_time_step(); // set time step to 0 again
+ for(uint d=0u; dfinish_queue();
+ for(uint d=0u; dreset_time_step(); // set time step to 0 again
initialized = true;
}
void LBM::do_time_step() { // call kernel_stream_collide to perform one LBM time step
#ifdef SURFACE
- for(uint d=0u; denqueue_surface_0();
+ for(uint d=0u; denqueue_surface_0();
#endif // SURFACE
- for(uint d=0u; denqueue_stream_collide(); // run LBM stream_collide kernel after domain communication
+ for(uint d=0u; denqueue_stream_collide(); // run LBM stream_collide kernel after domain communication
#if defined(SURFACE) || defined(GRAPHICS)
communicate_rho_u_flags(); // rho/u/flags halo data is required for SURFACE extension, and u halo data is required for Q-criterion rendering
#endif // SURFACE || GRAPHICS
#ifdef SURFACE
- for(uint d=0u; denqueue_surface_1();
+ for(uint d=0u; denqueue_surface_1();
communicate_flags();
- for(uint d=0u; denqueue_surface_2();
+ for(uint d=0u; denqueue_surface_2();
communicate_flags();
- for(uint d=0u; denqueue_surface_3();
+ for(uint d=0u; denqueue_surface_3();
communicate_phi_massex_flags();
#endif // SURFACE
communicate_fi();
#ifdef TEMPERATURE
+#ifdef GRAPHICS
+ communicate_T(); // T halo data is required for field_slice rendering
+#endif // GRAPHICS
communicate_gi();
#endif // TEMPERATURE
#ifdef PARTICLES
- for(uint d=0u; denqueue_integrate_particles(); // intgegrate particles forward in time and couple particles to fluid
+ for(uint d=0u; denqueue_integrate_particles(); // intgegrate particles forward in time and couple particles to fluid
#endif // PARTICLES
- if(get_D()==1u) for(uint d=0u; dfinish_queue(); // this additional domain synchronization barrier is only required in single-GPU, as communication calls already provide all necessary synchronization barriers in multi-GPU
- for(uint d=0u; dincrement_time_step();
+ if(get_D()==1u) for(uint d=0u; dfinish_queue(); // this additional domain synchronization barrier is only required in single-GPU, as communication calls already provide all necessary synchronization barriers in multi-GPU
+ for(uint d=0u; dincrement_time_step();
}
void LBM::run(const ulong steps) { // initializes the LBM simulation (copies data to device and runs initialize kernel), then runs LBM
@@ -848,12 +881,12 @@ void LBM::run(const ulong steps) { // initializes the LBM simulation (copies dat
do_time_step();
info.update(clock.stop());
}
- if(get_D()>1u) for(uint d=0u; dfinish_queue(); // wait for everything to finish (multi-GPU only)
+ if(get_D()>1u) for(uint d=0u; dfinish_queue(); // wait for everything to finish (multi-GPU only)
}
void LBM::update_fields() { // update fields (rho, u, T) manually
- for(uint d=0u; denqueue_update_fields();
- for(uint d=0u; dfinish_queue();
+ for(uint d=0u; denqueue_update_fields();
+ for(uint d=0u; dfinish_queue();
}
void LBM::reset() { // reset simulation (takes effect in following run() call)
@@ -862,8 +895,8 @@ void LBM::reset() { // reset simulation (takes effect in following run() call)
#ifdef FORCE_FIELD
void LBM::calculate_force_on_boundaries() { // calculate forces from fluid on TYPE_S cells
- for(uint d=0u; denqueue_calculate_force_on_boundaries();
- for(uint d=0u; dfinish_queue();
+ for(uint d=0u; denqueue_calculate_force_on_boundaries();
+ for(uint d=0u; dfinish_queue();
}
float3 LBM::calculate_object_center_of_mass(const uchar flag_marker) { // calculate center of mass of all cells flagged with flag_marker
double3 com(0.0, 0.0, 0.0);
@@ -920,9 +953,9 @@ float3 LBM::calculate_torque_on_object(const float3& rotation_center, const ucha
#ifdef MOVING_BOUNDARIES
void LBM::update_moving_boundaries() { // mark/unmark cells next to TYPE_S cells with velocity!=0 with TYPE_MS
- for(uint d=0u; denqueue_update_moving_boundaries();
+ for(uint d=0u; denqueue_update_moving_boundaries();
communicate_rho_u_flags();
- for(uint d=0u; dfinish_queue();
+ for(uint d=0u; dfinish_queue();
}
#endif // MOVING_BOUNDARIES
@@ -936,9 +969,9 @@ void LBM::integrate_particles(const ulong steps, const uint time_step_multiplica
if(!running) break;
#endif // INTERACTIVE_GRAPHICS_ASCII || INTERACTIVE_GRAPHICS
clock.start();
- for(uint d=0u; denqueue_integrate_particles(time_step_multiplicator);
- for(uint d=0u; dfinish_queue();
- for(uint d=0u; dincrement_time_step(time_step_multiplicator);
+ for(uint d=0u; denqueue_integrate_particles(time_step_multiplicator);
+ for(uint d=0u; dfinish_queue();
+ for(uint d=0u; dincrement_time_step(time_step_multiplicator);
info.update(clock.stop());
}
}
@@ -971,10 +1004,10 @@ void LBM::write_status(const string& path) { // write LBM status report to a .tx
void LBM::voxelize_mesh_on_device(const Mesh* mesh, const uchar flag, const float3& rotation_center, const float3& linear_velocity, const float3& rotational_velocity) { // voxelize triangle mesh
if(get_D()==1u) {
- lbm[0]->voxelize_mesh_on_device(mesh, flag, rotation_center, linear_velocity, rotational_velocity); // if this crashes on Windows, create a TdrDelay 32-bit DWORD with decimal value 300 in Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers
+ lbm_domain[0]->voxelize_mesh_on_device(mesh, flag, rotation_center, linear_velocity, rotational_velocity); // if this crashes on Windows, create a TdrDelay 32-bit DWORD with decimal value 300 in Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers
} else {
parallel_for((ulong)get_D(), get_D(), [&](ulong d) {
- lbm[d]->voxelize_mesh_on_device(mesh, flag, rotation_center, linear_velocity, rotational_velocity);
+ lbm_domain[d]->voxelize_mesh_on_device(mesh, flag, rotation_center, linear_velocity, rotational_velocity);
});
}
#ifdef MOVING_BOUNDARIES
@@ -986,8 +1019,8 @@ void LBM::voxelize_mesh_on_device(const Mesh* mesh, const uchar flag, const floa
}
}
void LBM::unvoxelize_mesh_on_device(const Mesh* mesh, const uchar flag) { // remove voxelized triangle mesh from LBM grid by removing all flags in mesh bounding box (only required when bounding box size changes during re-voxelization)
- for(uint d=0u; denqueue_unvoxelize_mesh_on_device(mesh, flag);
- for(uint d=0u; dfinish_queue();
+ for(uint d=0u; denqueue_unvoxelize_mesh_on_device(mesh, flag);
+ for(uint d=0u; dfinish_queue();
}
void LBM::write_mesh_to_vtk(const Mesh* mesh, const string& path, const bool convert_to_si_units) const { // write mesh to binary .vtk file
const string header_1 = "# vtk DataFile Version 3.0\nData\nBINARY\nDATASET POLYDATA\nPOINTS "+to_string(3u*mesh->triangle_number)+" float\n";
@@ -1046,7 +1079,7 @@ void LBM::voxelize_stl(const string& path, const float size, const uchar flag) {
int* LBM::Graphics::draw_frame() {
#ifndef UPDATE_FIELDS
if(visualization_modes&(VIS_FIELD|VIS_STREAMLINES|VIS_Q_CRITERION)) {
- for(uint d=0u; dget_D(); d++) lbm->lbm[d]->enqueue_update_fields(); // only call update_fields() if the time step has changed since the last rendered frame
+ for(uint d=0u; dget_D(); d++) lbm->lbm_domain[d]->enqueue_update_fields(); // only call update_fields() if the time step has changed since the last rendered frame
}
#endif // UPDATE_FIELDS
if(key_1) { visualization_modes = (visualization_modes&~0b11)|(((visualization_modes&0b11)+1)%4); key_1 = false; }
@@ -1059,6 +1092,13 @@ int* LBM::Graphics::draw_frame() {
if(key_T) {
slice_mode = (slice_mode+1)%8; key_T = false;
}
+ if(key_Z) {
+#ifndef TEMPERATURE
+ field_mode = (field_mode+1)%2; key_Z = false; // field_mode = { 0 (u), 1 (rho) }
+#else // TEMPERATURE
+ field_mode = (field_mode+1)%3; key_Z = false; // field_mode = { 0 (u), 1 (rho), 2 (T) }
+#endif // TEMPERATURE
+ }
if(slice_mode==1u) {
if(key_Q) { slice_x = clamp(slice_x-1, 0, (int)lbm->get_Nx()-1); key_Q = false; }
if(key_E) { slice_x = clamp(slice_x+1, 0, (int)lbm->get_Nx()-1); key_E = false; }
@@ -1071,20 +1111,21 @@ int* LBM::Graphics::draw_frame() {
if(key_Q) { slice_z = clamp(slice_z-1, 0, (int)lbm->get_Nz()-1); key_Q = false; }
if(key_E) { slice_z = clamp(slice_z+1, 0, (int)lbm->get_Nz()-1); key_E = false; }
}
- const bool visualization_change = last_visualization_modes!=visualization_modes||last_slice_mode!=slice_mode||last_slice_x!=slice_x||last_slice_y!=slice_y||last_slice_z!=slice_z;
+ const bool visualization_change = last_visualization_modes!=visualization_modes||last_field_mode!=field_mode||last_slice_mode!=slice_mode||last_slice_x!=slice_x||last_slice_y!=slice_y||last_slice_z!=slice_z;
last_visualization_modes = visualization_modes;
+ last_field_mode = field_mode;
last_slice_mode = slice_mode;
last_slice_x = slice_x;
last_slice_y = slice_y;
last_slice_z = slice_z;
bool new_frame = true;
- for(uint d=0u; dget_D(); d++) new_frame = new_frame && lbm->lbm[d]->graphics.enqueue_draw_frame(visualization_modes, slice_mode, slice_x, slice_y, slice_z, visualization_change);
- for(uint d=0u; dget_D(); d++) lbm->lbm[d]->finish_queue();
- int* bitmap = lbm->lbm[0]->graphics.get_bitmap();
- int* zbuffer = lbm->lbm[0]->graphics.get_zbuffer();
+ for(uint d=0u; dget_D(); d++) new_frame = new_frame && lbm->lbm_domain[d]->graphics.enqueue_draw_frame(visualization_modes, field_mode, slice_mode, slice_x, slice_y, slice_z, visualization_change);
+ for(uint d=0u; dget_D(); d++) lbm->lbm_domain[d]->finish_queue();
+ int* bitmap = lbm->lbm_domain[0]->graphics.get_bitmap();
+ int* zbuffer = lbm->lbm_domain[0]->graphics.get_zbuffer();
for(uint d=1u; dget_D()&&new_frame; d++) {
- const int* bitmap_d = lbm->lbm[d]->graphics.get_bitmap(); // each domain renders its own frame
- const int* zbuffer_d = lbm->lbm[d]->graphics.get_zbuffer();
+ const int* bitmap_d = lbm->lbm_domain[d]->graphics.get_bitmap(); // each domain renders its own frame
+ const int* zbuffer_d = lbm->lbm_domain[d]->graphics.get_zbuffer();
for(uint i=0u; i1u) { // communicate in x-direction
- for(uint d=0u; denqueue_transfer_extract_field(lbm[d]->kernel_transfer[field][0], 0u, bytes_per_cell); // selective in-VRAM copy (x) + PCIe copy
- for(uint d=0u; dfinish_queue(); // domain synchronization barrier
+ for(uint d=0u; denqueue_transfer_extract_field(lbm_domain[d]->kernel_transfer[field][0], 0u, bytes_per_cell); // selective in-VRAM copy (x) + PCIe copy
+ for(uint d=0u; dfinish_queue(); // domain synchronization barrier
for(uint d=0u; dtransfer_buffer_p.exchange_host_buffer(lbm[dxp]->transfer_buffer_m.exchange_host_buffer(lbm[d]->transfer_buffer_p.data())); // CPU pointer swaps
+ lbm_domain[d]->transfer_buffer_p.exchange_host_buffer(lbm_domain[dxp]->transfer_buffer_m.exchange_host_buffer(lbm_domain[d]->transfer_buffer_p.data())); // CPU pointer swaps
}
- for(uint d=0u; d enqueue_transfer_insert_field(lbm[d]->kernel_transfer[field][1], 0u, bytes_per_cell); // PCIe copy + selective in-VRAM copy (x)
+ for(uint d=0u; d enqueue_transfer_insert_field(lbm_domain[d]->kernel_transfer[field][1], 0u, bytes_per_cell); // PCIe copy + selective in-VRAM copy (x)
}
if(Dy>1u) { // communicate in y-direction
- for(uint d=0u; denqueue_transfer_extract_field(lbm[d]->kernel_transfer[field][0], 1u, bytes_per_cell); // selective in-VRAM copy (y) + PCIe copy
- for(uint d=0u; dfinish_queue(); // domain synchronization barrier
+ for(uint d=0u; denqueue_transfer_extract_field(lbm_domain[d]->kernel_transfer[field][0], 1u, bytes_per_cell); // selective in-VRAM copy (y) + PCIe copy
+ for(uint d=0u; dfinish_queue(); // domain synchronization barrier
for(uint d=0u; dtransfer_buffer_p.exchange_host_buffer(lbm[dyp]->transfer_buffer_m.exchange_host_buffer(lbm[d]->transfer_buffer_p.data())); // CPU pointer swaps
+ lbm_domain[d]->transfer_buffer_p.exchange_host_buffer(lbm_domain[dyp]->transfer_buffer_m.exchange_host_buffer(lbm_domain[d]->transfer_buffer_p.data())); // CPU pointer swaps
}
- for(uint d=0u; d enqueue_transfer_insert_field(lbm[d]->kernel_transfer[field][1], 1u, bytes_per_cell); // PCIe copy + selective in-VRAM copy (y)
+ for(uint d=0u; d enqueue_transfer_insert_field(lbm_domain[d]->kernel_transfer[field][1], 1u, bytes_per_cell); // PCIe copy + selective in-VRAM copy (y)
}
if(Dz>1u) { // communicate in z-direction
- for(uint d=0u; denqueue_transfer_extract_field(lbm[d]->kernel_transfer[field][0], 2u, bytes_per_cell); // selective in-VRAM copy (z) + PCIe copy
- for(uint d=0u; dfinish_queue(); // domain synchronization barrier
+ for(uint d=0u; denqueue_transfer_extract_field(lbm_domain[d]->kernel_transfer[field][0], 2u, bytes_per_cell); // selective in-VRAM copy (z) + PCIe copy
+ for(uint d=0u; dfinish_queue(); // domain synchronization barrier
for(uint d=0u; dtransfer_buffer_p.exchange_host_buffer(lbm[dzp]->transfer_buffer_m.exchange_host_buffer(lbm[d]->transfer_buffer_p.data())); // CPU pointer swaps
+ lbm_domain[d]->transfer_buffer_p.exchange_host_buffer(lbm_domain[dzp]->transfer_buffer_m.exchange_host_buffer(lbm_domain[d]->transfer_buffer_p.data())); // CPU pointer swaps
}
- for(uint d=0u; d enqueue_transfer_insert_field(lbm[d]->kernel_transfer[field][1], 2u, bytes_per_cell); // PCIe copy + selective in-VRAM copy (z)
+ for(uint d=0u; d enqueue_transfer_insert_field(lbm_domain[d]->kernel_transfer[field][1], 2u, bytes_per_cell); // PCIe copy + selective in-VRAM copy (z)
}
}
@@ -1276,4 +1319,7 @@ void LBM::communicate_phi_massex_flags() {
void LBM::communicate_gi() {
communicate_field(enum_transfer_field::gi, sizeof(fpxx));
}
+void LBM::communicate_T() {
+ communicate_field(enum_transfer_field::T, 4u);
+}
#endif // TEMPERATURE
\ No newline at end of file
diff --git a/src/lbm.hpp b/src/lbm.hpp
index fc117601..42980e93 100644
--- a/src/lbm.hpp
+++ b/src/lbm.hpp
@@ -15,7 +15,7 @@ string default_filename(const string& path, const string& name, const string& ex
string default_filename(const string& name, const string& extension, const ulong t); // generate a default filename with timestamp at exe_path/export/
#pragma warning(disable:26812)
-enum enum_transfer_field { fi, rho_u_flags, flags, phi_massex_flags, gi, enum_transfer_field_length };
+enum enum_transfer_field { fi, rho_u_flags, flags, phi_massex_flags, gi, T, enum_transfer_field_length };
class LBM_Domain {
private:
@@ -150,6 +150,7 @@ class LBM_Domain {
Kernel kernel_graphics_flags; // render flag lattice with wireframe
Kernel kernel_graphics_flags_mc; // render flag lattice with marching-cubes
Kernel kernel_graphics_field; // render a colored velocity vector for each cell
+ Kernel kernel_graphics_field_slice; // render one slice of velocity field according to slics settings
Kernel kernel_graphics_streamline; // render streamlines
Kernel kernel_graphics_q; // render vorticity (Q-criterion)
@@ -185,7 +186,7 @@ class LBM_Domain {
return *this;
}
void allocate(Device& device); // allocate memory for bitmap and zbuffer
- bool enqueue_draw_frame(const int visualization_modes, const int slice_mode=0, const int slice_x=0, const int slice_y=0, const int slice_z=0, const bool visualization_change=true); // main rendering function, calls rendering kernels, returns true if new frame is rendered, false if old frame is returned when camera has not moved
+ bool enqueue_draw_frame(const int visualization_modes, const int field_mode=0, const int slice_mode=0, const int slice_x=0, const int slice_y=0, const int slice_z=0, const bool visualization_change=true); // main rendering function, calls rendering kernels, returns true if new frame is rendered, false if old frame is returned when camera has not moved
int* get_bitmap(); // returns pointer to bitmap
int* get_zbuffer(); // returns pointer to zbuffer
string device_defines() const; // returns preprocessor constants for embedding in OpenCL C code
@@ -217,6 +218,7 @@ class LBM {
#endif // SURFACE
#ifdef TEMPERATURE
void communicate_gi();
+ void communicate_T();
#endif // TEMPERATURE
public:
@@ -365,7 +367,7 @@ class LBM {
inline const T operator()(const ulong i, const uint dimension) const { return reference(i, dimension); } // array of structures
inline void read_from_device() {
#ifndef UPDATE_FIELDS
- for(uint domain=0u; domainlbm[domain]->enqueue_update_fields(); // make sure data in device memory is up-to-date
+ for(uint domain=0u; domainlbm_domain[domain]->enqueue_update_fields(); // make sure data in device memory is up-to-date
#endif // UPDATE_FIELDS
for(uint domain=0u; domainenqueue_read_from_device();
for(uint domain=0u; domainfinish_queue();
@@ -383,7 +385,7 @@ class LBM {
}
};
- LBM_Domain** lbm; // one LBM object per domain
+ LBM_Domain** lbm_domain; // one LBM domain per GPU
Memory_Container rho; // density of every cell
Memory_Container u; // velocity of every cell
@@ -435,20 +437,20 @@ class LBM {
uint get_Dy() const { return Dy; } // get lattice domains in y-direction
uint get_Dz() const { return Dz; } // get lattice domains in z-direction
uint get_D() const { return Dx*Dy*Dz; } // get number of lattice domains
- float get_nu() const { return lbm[0]->get_nu(); } // get kinematic shear viscosity
+ float get_nu() const { return lbm_domain[0]->get_nu(); } // get kinematic shear viscosity
float get_tau() const { return 3.0f*get_nu()+0.5f; } // get LBM relaxation time
float get_Re_max() const { return 0.57735027f*(float)min(min(Nx, Ny), Nz)/get_nu(); } // Re < c*L/nu
- float get_fx() const { return lbm[0]->get_fx(); } // get global froce per volume
- float get_fy() const { return lbm[0]->get_fy(); } // get global froce per volume
- float get_fz() const { return lbm[0]->get_fz(); } // get global froce per volume
- float get_sigma() const { return lbm[0]->get_sigma(); } // get surface tension coefficient
- float get_alpha() const { return lbm[0]->get_alpha(); } // get thermal diffusion coefficient
- float get_beta() const { return lbm[0]->get_beta(); } // get thermal expansion coefficient
- ulong get_t() const { return lbm[0]->get_t(); } // get discrete time step in LBM units
- uint get_velocity_set() const { return lbm[0]->get_velocity_set(); }
- void set_fx(const float fx) { for(uint d=0u; dset_fx(fx); } // set global froce per volume
- void set_fy(const float fy) { for(uint d=0u; dset_fy(fy); } // set global froce per volume
- void set_fz(const float fz) { for(uint d=0u; dset_fz(fz); } // set global froce per volume
+ float get_fx() const { return lbm_domain[0]->get_fx(); } // get global froce per volume
+ float get_fy() const { return lbm_domain[0]->get_fy(); } // get global froce per volume
+ float get_fz() const { return lbm_domain[0]->get_fz(); } // get global froce per volume
+ float get_sigma() const { return lbm_domain[0]->get_sigma(); } // get surface tension coefficient
+ float get_alpha() const { return lbm_domain[0]->get_alpha(); } // get thermal diffusion coefficient
+ float get_beta() const { return lbm_domain[0]->get_beta(); } // get thermal expansion coefficient
+ ulong get_t() const { return lbm_domain[0]->get_t(); } // get discrete time step in LBM units
+ uint get_velocity_set() const { return lbm_domain[0]->get_velocity_set(); }
+ void set_fx(const float fx) { for(uint d=0u; dset_fx(fx); } // set global froce per volume
+ void set_fy(const float fy) { for(uint d=0u; dset_fy(fy); } // set global froce per volume
+ void set_fz(const float fz) { for(uint d=0u; dset_fz(fz); } // set global froce per volume
void set_f(const float fx, const float fy, const float fz) { set_fx(fx); set_fy(fy); set_fz(fz); } // set global froce per volume
void coordinates(const ulong n, uint& x, uint& y, uint& z) const { // disassemble 1D linear index to 3D coordinates (n -> x,y,z)
@@ -522,7 +524,7 @@ class LBM {
LBM* lbm = nullptr;
std::atomic_int running_encoders = 0;
uint last_exported_frame = 0u; // for next_frame(...) function
- int last_visualization_modes=0, last_slice_mode=0, last_slice_x=0, last_slice_y=0, last_slice_z=0; // don't render a new frame if the scene hasn't changed since last frame
+ int last_visualization_modes=0, last_field_mode=0, last_slice_mode=0, last_slice_x=0, last_slice_y=0, last_slice_z=0; // don't render a new frame if the scene hasn't changed since last frame
void default_settings() {
visualization_modes |= VIS_FLAG_LATTICE;
#ifdef PARTICLES
@@ -531,7 +533,7 @@ class LBM {
}
public:
- int visualization_modes=0, slice_mode=0, slice_x=0, slice_y=0, slice_z=0; // slice visualization: mode = { 0 (no slice), 1 (x), 2 (y), 3 (z), 4 (xz), 5 (xyz), 6 (yz), 7 (xy) }, slice_{xyz} = position of slices
+ int visualization_modes=0, field_mode=0, slice_mode=0, slice_x=0, slice_y=0, slice_z=0; // field_mode = { 0 (u), 1 (rho), 2 (T) }, slice_mode = { 0 (no slice), 1 (x), 2 (y), 3 (z), 4 (xz), 5 (xyz), 6 (yz), 7 (xy) }, slice_{xyz} = position of slices
Graphics() {} // default constructor
Graphics(LBM* lbm) {
@@ -556,6 +558,7 @@ class LBM {
Graphics& operator=(const Graphics& graphics) { // copy assignment
lbm = graphics.lbm;
visualization_modes = graphics.visualization_modes;
+ field_mode = graphics.field_mode;
slice_mode = graphics.slice_mode;
slice_x = graphics.slice_x;
slice_y = graphics.slice_y;
diff --git a/src/main.cpp b/src/main.cpp
index e3ab633f..4c696f82 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -3,10 +3,62 @@
#include "setup.hpp"
#ifdef GRAPHICS
+void draw_scale(const int field_mode, const int color) {
+ float scale_min=0.0f, scale_max=1.0f;
+ string title = "";
+ const int label_count = 10;
+ switch(field_mode) {
+ case 0: // coloring by velocity
+ scale_min = 0.0f;
+ scale_max = units.si_u(1.0f)==1.0f ? (GRAPHICS_U_MAX) : units.si_u(0.57735027f*(GRAPHICS_U_MAX));
+ title = "velocity u / "+string(units.si_u(1.0f)==1.0f ? "c" : "[m/s]");
+ break;
+ case 1: // coloring by density
+ scale_min = units.si_rho(1.0f)-units.si_rho(GRAPHICS_RHO_DELTA);
+ scale_max = units.si_rho(1.0f)+units.si_rho(GRAPHICS_RHO_DELTA);
+ title = "density rho / "+string(units.si_u(1.0f)==1.0f ? "1" : "[kg/m^3]");
+ break;
+ case 2: // coloring by temperature
+ scale_min = units.si_T(1.0f)-units.si_T(GRAPHICS_T_DELTA);
+ scale_max = units.si_T(1.0f)+units.si_T(GRAPHICS_T_DELTA);
+ title = "temperature T / "+string(units.si_u(1.0f)==1.0f ? "1" : "[K]");
+ break;
+ }
+ const int margin_x=2*(FONT_WIDTH), margin_y=1*(FONT_HEIGHT); // margins in x and y
+ const int ox=camera.width-16*(FONT_WIDTH)-margin_x-1, oy=(FONT_HEIGHT)*3/2+margin_y+8; // plot area offset x/y
+ const int label_length_max = scale_max>=1000.0f ? (int)length(to_string(to_int(scale_max))) : 5;
+ const int w = camera.width-(ox+margin_x+label_length_max*(FONT_WIDTH)+8); // width of color scale
+ const int h = camera.height-(FONT_HEIGHT)*3/2-12*(FONT_HEIGHT)-2*margin_y-4; // height of color scale
+ const int N = min(((h/(FONT_HEIGHT))/2)*2, label_count); // number of labels on the y-axis
+ for(int i=0; i=1000.0f ? to_string(to_int(v)) : scale_max>=100.0f ? to_string(v, 1u) : scale_max>=10.0f ? to_string(v, 2u) : to_string(v, 3u);
+ const int y = h-to_int(h*f);
+ draw_line_label(ox, oy+y, ox+4, oy+y, color); // y-axis tickmarks
+ draw_line_label(ox+w-3, oy+y, ox+w+4, oy+y, color); // y-axis mirror tickmarks
+ draw_label(ox+w+7, oy+y-(FONT_HEIGHT)/2, ly, color); // y-axis labels
+ }
+ draw_label(ox+min(0, w+7+label_length_max*(FONT_WIDTH)-(int)length(title)*(FONT_WIDTH)), oy-(FONT_HEIGHT)*3/2-6, title, color); // colorbar title
+}
void main_label(const double frametime) {
if(info.allow_rendering) {
info.print_update();
- const int c = color(255-red(GRAPHICS_BACKGROUND_COLOR), 255-green(GRAPHICS_BACKGROUND_COLOR), 255-blue(GRAPHICS_BACKGROUND_COLOR));
+ const int c = invert(GRAPHICS_BACKGROUND_COLOR);
{
const int ox=camera.width-37*(FONT_WIDTH)-1, oy=camera.height-11*(FONT_HEIGHT)-1;
int i = 0;
@@ -25,9 +77,11 @@ void main_label(const double frametime) {
draw_label(ox, oy+i, "Steps " +alignr(31u, /************************************/ alignr(10u, info.lbm->get_t())+" ("+alignr(5, to_uint(1.0/info.runtime_lbm_timestep_smooth))+" Steps/s)"), c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "FPS " +alignr(33u, /************************************************************/ alignr(4u, to_uint(1.0/frametime))+" ("+alignr(5u, camera.fps_limit)+" fps max)"), c);
}
+ draw_label(2, camera.height-1*(FONT_HEIGHT)-1, "FluidX3D v2.14 Copyright (c) Dr. Moritz Lehmann", c);
if(!key_H) {
draw_label(camera.width-16*(FONT_WIDTH)-1, 2, "Press H for Help", c);
} else {
+ if(info.lbm->graphics.visualization_modes&(VIS_FIELD|VIS_STREAMLINES|VIS_Q_CRITERION)) draw_scale(info.lbm->graphics.field_mode, c);
#ifdef SURFACE
const bool surface = true;
#else // SURFACE
@@ -50,11 +104,10 @@ void main_label(const double frametime) {
string mode_6 = surface&&info.lbm->get_D()==1u ? (mode&VIS_PHI_RAYTRACE ? " active " : "inactive") : "disabled";
string mode_7 = particles ? (mode&VIS_PARTICLES ? " active " : "inactive") : "disabled";
- const int sl = info.lbm->graphics.slice_mode;
- const string sx = "x="+alignr(4u, info.lbm->graphics.slice_x);
- const string sy = "y="+alignr(4u, info.lbm->graphics.slice_y);
- const string sz = "z="+alignr(4u, info.lbm->graphics.slice_z);
+ const int sl=info.lbm->graphics.slice_mode, fl=info.lbm->graphics.field_mode;
+ const string sx="x="+alignr(4u, info.lbm->graphics.slice_x), sy="y="+alignr(4u, info.lbm->graphics.slice_y), sz="z="+alignr(4u, info.lbm->graphics.slice_z);
string slice = sl==0 ? " disabled " : sl==1 ? sx+"| | " : sl==2 ? " |"+sy+"| " : sl==3 ? " | |"+sz : sl==4 ? sx+"| |"+sz : sl==5 ? sx+"|"+sy+"|"+sz : sl==6 ? " |"+sy+"|"+sz : sx+"|"+sy+"| ";
+ string field = fl==0 ? " velocity u " : fl==1 ? " density rho " : " temperature T ";
draw_label(ox, oy+i, "Keyboard/Mouse Controls: ", c); i+=2*FONT_HEIGHT;
draw_label(ox, oy+i, "P ("+string(key_P?"running ":" paused ")+"): start/pause simulation", c); i+=FONT_HEIGHT;
@@ -67,15 +120,16 @@ void main_label(const double frametime) {
draw_label(ox, oy+i, "6 ("+mode_6+"): raytraced free surface", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "7 ("+mode_7+"): particles", c); i+=2*FONT_HEIGHT;
draw_label(ox, oy+i, "T: ("+slice+"): toggle slice visualization mode", c); i+=FONT_HEIGHT;
+ draw_label(ox, oy+i, "Z: ("+field+"): toggle field visualization mode", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "Q/E: move slice in slice visualization mode", c); i+=2*FONT_HEIGHT;
draw_label(ox, oy+i, "Mouse or I/J/K/L (rx="+alignr(4u, to_int(fmod(degrees(camera.rx)+90.0+360.0, 360.0)-180.0))+", ry="+alignr(3u, to_int(180.0-degrees(camera.ry)))+"): rotate camera", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "Scrollwheel or +/- ("+to_string(camera.free ? (float)camera.free_camera_velocity : camera.zoom*(float)fmax(fmax(info.lbm->get_Nx(), info.lbm->get_Ny()), info.lbm->get_Nz())/(float)min(camera.width, camera.height), 3u)+"): zoom (centered camera mode) or camera movement speed (free camera mode)", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "Mouseclick or U: toggle rotation with Mouse and angle snap rotation with I/J/K/L", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "Y/X ("+alignr(3u, to_int(camera.fov))+"): adjust camera field of view", c); i+=FONT_HEIGHT;
+ draw_label(ox, oy+i, "F ("+string(camera.free?" free ":"centered")+"): toggle centered/free camera mode", c); i+=FONT_HEIGHT;
+ draw_label(ox, oy+i, "W/A/S/D/Space/C ("+to_string(camera.pos.x/(float)info.lbm->get_Nx(), 2u)+"*Nx, "+to_string(camera.pos.y/(float)info.lbm->get_Ny(), 2u)+"*Ny, "+to_string(camera.pos.z/(float)info.lbm->get_Nz(), 2u)+"*Nz): move free camera", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "G: print current camera position/rotation in console as copy/paste command", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "R ("+string(camera.autorotation?" active ":"inactive")+"): toggle camera autorotation", c); i+=2*FONT_HEIGHT;
- draw_label(ox, oy+i, "F ("+string(camera.free?" free ":"centered")+"): toggle centered/free camera mode", c); i+=FONT_HEIGHT;
- draw_label(ox, oy+i, "W/A/S/D/Space/C ("+to_string(camera.pos.x/(float)info.lbm->get_Nx(), 2u)+"*Nx, "+to_string(camera.pos.y/(float)info.lbm->get_Ny(), 2u)+"*Ny, "+to_string(camera.pos.z/(float)info.lbm->get_Nz(), 2u)+"*Nz): move free camera", c); i+=2*FONT_HEIGHT;
draw_label(ox, oy+i, "V ("+string(camera.vr?"VR mode":"regular")+"): toggle stereoscopic rendering for VR", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "B ("+string(camera.tv?"TV mode":"goggles")+"): toggle VR-goggles/3D-TV mode for stereoscopic rendering", c); i+=FONT_HEIGHT;
draw_label(ox, oy+i, "N/M ("+to_string(camera.eye_distance, 1u)+"): adjust eye distance for stereoscopic rendering", c); i+=2*FONT_HEIGHT;
diff --git a/src/setup.cpp b/src/setup.cpp
index dcdb0b5c..7c3aac50 100644
--- a/src/setup.cpp
+++ b/src/setup.cpp
@@ -74,7 +74,8 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
lbm.u.y[n] = -A*sinf(2.0f*pif*fx/a)*cosf(2.0f*pif*fy/b);
lbm.rho[n] = 1.0f-sq(A)*3.0f/4.0f*(cosf(4.0f*pif*fx/a)+cosf(4.0f*pif*fy/b));
}); // ####################################################################### run simulation, export images and data ##########################################################################
- lbm.graphics.visualization_modes = VIS_STREAMLINES;
+ lbm.graphics.visualization_modes = VIS_FIELD;
+ lbm.graphics.slice_mode = 3;
lbm.run();
} /**/
@@ -251,6 +252,25 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
+/*void main_setup() { // 2D Karman vortex street; required extensions in defines.hpp: D2Q9, FP16S, EQUILIBRIUM_BOUNDARIES, INTERACTIVE_GRAPHICS
+ // ################################################################## define simulation box size, viscosity and volume force ###################################################################
+ const uint R = 16u;
+ const float Re = 250.0f;
+ const float u = 0.10f;
+ LBM lbm(16u*R, 32u*R, 1u, units.nu_from_Re(Re, 2.0f*(float)R, u));
+ // ###################################################################################### define geometry ######################################################################################
+ const uint Nx=lbm.get_Nx(), Ny=lbm.get_Ny(), Nz=lbm.get_Nz(); parallel_for(lbm.get_N(), [&](ulong n) { uint x=0u, y=0u, z=0u; lbm.coordinates(n, x, y, z);
+ if(cylinder(x, y, z, float3(Nx/2u, Ny/4u, Nz/2u), float3(0u, 0u, Nz), (float)R)) lbm.flags[n] = TYPE_S;
+ else lbm.u.y[n] = u;
+ if(x==0u||x==Nx-1u||y==0u||y==Ny-1u) lbm.flags[n] = TYPE_E; // all non periodic
+ }); // ####################################################################### run simulation, export images and data ##########################################################################
+ lbm.graphics.visualization_modes = VIS_FLAG_LATTICE|VIS_FIELD;
+ lbm.graphics.slice_mode = 3;
+ lbm.run();
+} /**/
+
+
+
/*void main_setup() { // particle test; required extensions in defines.hpp: VOLUME_FORCE, FORCE_FIELD, MOVING_BOUNDARIES, PARTICLES, INTERACTIVE_GRAPHICS
// ################################################################## define simulation box size, viscosity and volume force ###################################################################
const uint L = 128u;
@@ -1255,7 +1275,7 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
/*void main_setup() { // Rayleigh-Benard convection; required extensions in defines.hpp: FP16S, VOLUME_FORCE, TEMPERATURE, INTERACTIVE_GRAPHICS
// ################################################################## define simulation box size, viscosity and volume force ###################################################################
- LBM lbm(256u, 256u, 64u, 0.02f, 0.0f, 0.0f, -0.001f, 0.0f, 1.0f, 1.0f);
+ LBM lbm(256u, 256u, 64u, 1u, 1u, 1u, 0.02f, 0.0f, 0.0f, -0.0005f, 0.0f, 1.0f, 1.0f);
// ###################################################################################### define geometry ######################################################################################
const uint threads = (uint)thread::hardware_concurrency();
vector seed(threads);
@@ -1264,7 +1284,7 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
lbm.u.x[n] = random_symmetric(seed[t], 0.015f); // initialize velocity with random noise
lbm.u.y[n] = random_symmetric(seed[t], 0.015f);
lbm.u.z[n] = random_symmetric(seed[t], 0.015f);
- lbm.rho[n] = units.rho_hydrostatic(0.001f, (float)z, (float)Nz-2.0f); // initialize density with hydrostatic pressure
+ lbm.rho[n] = units.rho_hydrostatic(0.0005f, (float)z, 0.5f*(float)Nz); // initialize density with hydrostatic pressure
if(z==1u) {
lbm.T[n] = 1.75f;
lbm.flags[n] = TYPE_T;
@@ -1282,7 +1302,7 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
/*void main_setup() { // thermal convection; required extensions in defines.hpp: FP16S, VOLUME_FORCE, TEMPERATURE, INTERACTIVE_GRAPHICS
// ################################################################## define simulation box size, viscosity and volume force ###################################################################
- LBM lbm(32u, 196u, 60u, 1u, 1u, 1u, 0.02f, 0.0f, 0.0f, -0.001f, 0.0f, 1.0f, 1.0f);
+ LBM lbm(32u, 196u, 60u, 1u, 1u, 1u, 0.02f, 0.0f, 0.0f, -0.0005f, 0.0f, 1.0f, 1.0f);
// ###################################################################################### define geometry ######################################################################################
const uint Nx=lbm.get_Nx(), Ny=lbm.get_Ny(), Nz=lbm.get_Nz(); parallel_for(lbm.get_N(), [&](ulong n) { uint x=0u, y=0u, z=0u; lbm.coordinates(n, x, y, z);
if(y==1) {
@@ -1292,7 +1312,7 @@ void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK,
lbm.T[n] = 0.3f;
lbm.flags[n] = TYPE_T;
}
- lbm.rho[n] = units.rho_hydrostatic(0.001f, (float)z, (float)Nz-2.0f); // initialize density with hydrostatic pressure
+ lbm.rho[n] = units.rho_hydrostatic(0.0005f, (float)z, 0.5f*(float)Nz); // initialize density with hydrostatic pressure
if(x==0u||x==Nx-1u||y==0u||y==Ny-1u||z==0u||z==Nz-1u) lbm.flags[n] = TYPE_S; // all non periodic
}); // ####################################################################### run simulation, export images and data ##########################################################################
lbm.graphics.visualization_modes = VIS_FLAG_LATTICE|VIS_STREAMLINES;
diff --git a/src/utilities.hpp b/src/utilities.hpp
index 663dd298..a116eb75 100644
--- a/src/utilities.hpp
+++ b/src/utilities.hpp
@@ -2973,6 +2973,15 @@ inline int color(const int red, const int green, const int blue) {
inline int color(const int red, const int green, const int blue, const int alpha) {
return (alpha&255)<<24|(red&255)<<16|(green&255)<<8|(blue&255);
}
+inline int color(const float red, const float green, const float blue) {
+ return clamp((int)(255.0f*red+0.5f), 0, 255)<<16|clamp((int)(255.0f*green+0.5f), 0, 255)<<8|clamp((int)(255.0f*blue+0.5f), 0, 255);
+}
+inline int color(const float red, const float green, const float blue, const float alpha) {
+ return clamp((int)(255.0f*alpha+0.5f), 0, 255)<<24|clamp((int)(255.0f*red+0.5f), 0, 255)<<16|clamp((int)(255.0f*green+0.5f), 0, 255)<<8|clamp((int)(255.0f*blue+0.5f), 0, 255);
+}
+inline int color(const float3 rgb) {
+ return color(rgb.x, rgb.y, rgb.z);
+}
inline int red(const int color) {
return (color>>16)&255;
}
@@ -2990,30 +2999,40 @@ inline int brightness(const int color) {
}
inline int grayscale(const int color) {
const int b = brightness(color);
- return ::color(b, b, b);
+ return b<<16|b<<8|b;
}
inline int invert(const int color) { // invert color
- return ::color(255-red(color), 255-green(color), 255-blue(color));
+ return (255-red(color))<<16|(255-green(color))<<8|(255-blue(color));
}
inline int invert_brightness(const int color) { // invert brightness, but retain color
const int r = red(color), g=green(color), b=blue(color);
return ::color(255-(g+b)/2, 255-(r+b)/2, 255-(r+g)/2);
}
-inline int color_average(const int c1, const int c2) {
+inline int color_mul(const int c, const float x) { // c*x
+ const int r = min((int)fma((float)red (c), x, 0.5f), 255);
+ const int g = min((int)fma((float)green(c), x, 0.5f), 255);
+ const int b = min((int)fma((float)blue (c), x, 0.5f), 255);
+ return r<<16|g<<8|b; // values are already clamped
+}
+inline int color_add(const int c1, const int c2) {
const int r1=red(c1), g1=green(c1), b1=blue(c1);
const int r2=red(c2), g2=green(c2), b2=blue(c2);
- return color((r1+r2)/2, (g1+g2)/2, (b1+b2)/2);
+ return min(r1+r2, 255)<<16|min(g1+g2, 255)<<8|min(b1+b2, 255); // values are already clamped
}
-inline int color_add(const int c1, const int c2) {
+inline int color_average(const int c1, const int c2) {
const int r1=red(c1), g1=green(c1), b1=blue(c1);
const int r2=red(c2), g2=green(c2), b2=blue(c2);
- return color(min(r1+r2, 255), min(g1+g2, 255), min(b1+b2, 255));
+ return ((r1+r2)/2)<<16|((g1+g2)/2)<<8|((b1+b2)/2);
}
-inline int color_dim(const int c, const float x) {
- const int r = clamp((int)fma((float)red (c), x, 0.5f), 0, 255);
- const int g = clamp((int)fma((float)green(c), x, 0.5f), 0, 255);
- const int b = clamp((int)fma((float)blue (c), x, 0.5f), 0, 255);
- return color(r, g, b);
+inline int color_mix(const int c1, const int c2, const float w) { // w*c1+(1-w)*c2
+ const float3 fc1=float3((float)red(c1), (float)green(c1), (float)blue(c1)), fc2=float3((float)red(c2), (float)green(c2), (float)blue(c2));
+ const float3 fcm = w*fc1+(1.0f-w)*fc2+0.5f;
+ return color((int)fcm.x, (int)fcm.y, (int)fcm.z);
+}
+inline int color_mix_3(const int c0, const int c1, const int c2, const float w0, const float w1, const float w2) { // w1*c1+w2*c2+w3*c3, w0+w1+w2 = 1
+ const float3 fc0=float3((float)red(c0), (float)green(c0), (float)blue(c0)), fc1=float3((float)red(c1), (float)green(c1), (float)blue(c1)), fc2=float3((float)red(c2), (float)green(c2), (float)blue(c2));
+ const float3 fcm = w0*fc0+w1*fc1+w2*fc2+0.5f;
+ return color((int)fcm.x, (int)fcm.y, (int)fcm.z);
}
inline float3 rgb_to_hsv(const int red, const int green, const int blue) {
const int cmax = max(max(red, green), blue);
@@ -3028,7 +3047,7 @@ inline float3 rgb_to_hsv(const int red, const int green, const int blue) {
inline float3 rgb_to_hsv(const int color) {
return rgb_to_hsv(red(color), green(color), blue(color));
}
-inline uint hsv_to_rgb(const float h, const float s, const float v) {
+inline int hsv_to_rgb(const float h, const float s, const float v) {
const float c = v*s;
const float x = c*(1.0f-fabs(fmod(h/60.0f, 2.0f)-1.0f));
const float m = v-c;
@@ -3039,11 +3058,55 @@ inline uint hsv_to_rgb(const float h, const float s, const float v) {
else if(h<240.0f) { g = x; b = c; }
else if(h<300.0f) { r = x; b = c; }
else if(h<360.0f) { r = c; b = x; }
- return color((uchar)((r+m)*255.0f), (uchar)((g+m)*255.0f), (uchar)((b+m)*255.0f));
+ return color(r+m, g+m, b+m);
}
-inline uint hsv_to_rgb(const float3& hsv) {
+inline int hsv_to_rgb(const float3& hsv) {
return hsv_to_rgb(hsv.x, hsv.y, hsv.z);
}
+inline int colorscale_rainbow(float x) { // coloring scheme (float [0, 1] -> int color)
+ x = clamp(6.0f*(1.0f-x), 0.0f, 6.0f);
+ float r=0.0f, g=0.0f, b=0.0f; // black
+ if(x<1.2f) { // red - yellow
+ r = 1.0f;
+ g = x*0.83333333f;
+ } else if(x>=1.2f&&x<2.0f) { // yellow - green
+ r = 2.5f-x*1.25f;
+ g = 1.0f;
+ } else if(x>=2.0f&&x<3.0f) { // green - cyan
+ g = 1.0f;
+ b = x-2.0f;
+ } else if(x>=3.0f&&x<4.0f) { // cyan - blue
+ g = 4.0f-x;
+ b = 1.0f;
+ } else if(x>=4.0f&&x<5.0f) { // blue - violet
+ r = x*0.4f-1.6f;
+ b = 3.0f-x*0.5f;
+ } else { // violet - black
+ r = 2.4f-x*0.4f;
+ b = 3.0f-x*0.5f;
+ }
+ return color(r, g, b);
+}
+inline int colorscale_iron(float x) { // coloring scheme (float [0, 1] -> int color)
+ x = clamp(4.0f*(1.0f-x), 0.0f, 4.0f);
+ float r=1.0f, g=0.0f, b=0.0f;
+ if(x<0.66666667f) { // white - yellow
+ g = 1.0f;
+ b = 1.0f-x*1.5f;
+ } else if(x<2.0f) { // yellow - red
+ g = 1.5f-x*0.75f;
+ } else if(x<3.0f) { // red - violet
+ r = 2.0f-x*0.5f;
+ b = x-2.0f;
+ } else { // violet - black
+ r = 2.0f-x*0.5f;
+ b = 4.0f-x;
+ }
+ return color(r, g, b);
+}
+inline int colorscale_twocolor(float x) { // coloring scheme (float [0, 1] -> int color)
+ return x>0.5f ? color_mix(0xFFAA00, 0x181818, clamp(2.0f*x-1.0f, 0.0f, 1.0f)) : color_mix(0x181818, 0x0080FF, clamp(2.0f*x, 0.0f, 1.0f)); // red - gray - blue
+}
#define color_black 0
#define color_dark_blue 1