Set MSVC to use UTF-8 on source files #2346

drasticactions · 2024-08-12T08:10:40Z

Although the files in whisper.cpp are encoded as UTF-8, MSVC seems to interpret them using the system's default text encoding when compiling.

https://github.com/ggerganov/whisper.cpp/blob/master/src/whisper.cpp#L4982C1-L4987C3

So when MSVC reads this using the system encoding (e.g. Shift-JIS), compilation fails. We could change this to use Unicode escape sequences, but if we set MSVC to treat all source files as UTF-8, it should compile fine on all systems, regardless of the default encoding and whatever text is used here.
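
For illustration, the escape-based alternative mentioned above could look roughly like the sketch below. It is not taken from whisper.cpp: `example_symbols`, `count_code_points`, and `check_example_symbols` are hypothetical names, and the three symbols are only assumed examples of the kind of non-ASCII text involved. Each string is spelled as explicit UTF-8 byte escapes, so the source file stays pure ASCII and should compile the same way whatever source encoding the compiler assumes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical example data (not copied from whisper.cpp): a few non-ASCII
// symbols written as explicit UTF-8 byte escapes. The file itself contains
// only ASCII, so the compiler's assumed source encoding does not matter.
static std::vector<std::string> example_symbols() {
    return {
        "\xe3\x80\x8c", // U+300C LEFT CORNER BRACKET
        "\xe3\x80\x8d", // U+300D RIGHT CORNER BRACKET
        "\xe2\x99\xaa", // U+266A EIGHTH NOTE
    };
}

// Counts code points in a string assumed to be valid UTF-8 by skipping
// continuation bytes (bit pattern 10xxxxxx).
static std::size_t count_code_points(const std::string & s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Sanity check: every symbol is three bytes long and is one code point.
static void check_example_symbols() {
    for (const std::string & sym : example_symbols()) {
        assert(sym.size() == 3);
        assert(count_code_points(sym) == 1);
    }
}
```

The trade-off is readability: byte escapes are much harder to review than the literal characters, which is one reason to prefer a compiler-wide setting such as MSVC's `/utf-8` option instead.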

As far as I know, this should only affect compiling, and the build output should be the same.

chinshou · 2024-08-17T20:32:10Z

I can confirm that I have the same compile problem on Japanese Windows.


* ggerganov/master: (40 commits)
  revert : cmake : set MSVC to use UTF-8 on source files (ggerganov#2346)
  sync : ggml
  ggml: fix ggml_graph_cpy undefined behavior (ggml/943)
  cann : fix doxy (ggml/0)
  vulkan : fix build (llama/0)
  cuda : mark BF16 CONT as unsupported
  ggml : fix cont with transposed tensors when one dimension is 1 (ggml/934)
  cmake : set MSVC to use UTF-8 on source files (ggerganov#2346)
  readme : remove invalid flag from Python example (ggerganov#2396)
  readme : fix link (ggerganov#2394)
  go : add beamsize/entropythold/maxcontext to context interface (ggerganov#2350)
  talk-llama : sync llama.cpp
  whisper : update FA call
  sync : ggml
  sync : vulkan (skip) (llama/0)
  ggml : do not crash when quantizing q4_x_x with an imatrix (llama/9192)
  metal : separate scale and mask from QKT in FA kernel (llama/9189)
  ggml : add SSM Metal kernels (llama/8546)
  metal : gemma2 flash attention support (llama/9159)
  CPU/CUDA: Gemma 2 FlashAttention support (llama/8542)
  ...


drasticactions added 2 commits August 12, 2024 17:04

Set MSVC to use UTF-8 on source files

540e29c

Merge branch 'master' into dev/da/set-utf-8-msvc

ea469af

Merge branch 'ggerganov:master' into dev/da/set-utf-8-msvc

c6cb40c

ggerganov merged commit c96906d into ggerganov:master Aug 30, 2024
44 of 46 checks passed

ggerganov added a commit that referenced this pull request Sep 2, 2024

revert : cmake : set MSVC to use UTF-8 on source files (#2346)

7819674

This reverts commit c96906d.

ggerganov added a commit that referenced this pull request Sep 2, 2024

revert : cmake : set MSVC to use UTF-8 on source files (#2346)

5236f02

This reverts commit c96906d.

iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024

cmake : set MSVC to use UTF-8 on source files (ggerganov#2346)

20802ec

iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024

revert : cmake : set MSVC to use UTF-8 on source files (ggerganov#2346)

f3c2bfe

This reverts commit c96906d.
