Commit

Merge remote-tracking branch 'origin/master' into fcitx
Fcitx Bot committed Sep 28, 2024
2 parents a47ae1c + c768d7f commit 921b931
Showing 21 changed files with 238 additions and 68 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/android.yaml
@@ -13,8 +13,8 @@ concurrency:

jobs:
build:
# https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md
runs-on: ubuntu-22.04
# https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2404-Readme.md
runs-on: ubuntu-24.04
timeout-minutes: 60

steps:
14 changes: 4 additions & 10 deletions .github/workflows/linux.yaml
@@ -13,8 +13,8 @@ concurrency:

jobs:
build:
# https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md
runs-on: ubuntu-22.04
# https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2404-Readme.md
runs-on: ubuntu-24.04
timeout-minutes: 60

steps:
@@ -30,9 +30,6 @@ jobs:
#
# Unset the Android NDK setting to skip the unnecessary configuration.
echo "ANDROID_NDK_HOME=" >> $GITHUB_ENV
#
# Work around https://bugreports.qt.io/browse/QTBUG-86080 for Ubuntu 22.04
echo "PKG_CONFIG_PATH=${PWD}/docker/ubuntu22.04/qt6-core-pkgconfig:${PKG_CONFIG_PATH}" >> $GITHUB_ENV
- name: bazel build
working-directory: ./src
@@ -47,8 +44,8 @@ jobs:
if-no-files-found: warn

test:
# https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md
runs-on: ubuntu-22.04
# https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2404-Readme.md
runs-on: ubuntu-24.04
timeout-minutes: 60

steps:
@@ -64,9 +61,6 @@ jobs:
#
# Unset the Android NDK setting to skip the unnecessary configuration.
echo "ANDROID_NDK_HOME=" >> $GITHUB_ENV
#
# Work around https://bugreports.qt.io/browse/QTBUG-86080 for Ubuntu 22.04
echo "PKG_CONFIG_PATH=${PWD}/docker/ubuntu22.04/qt6-core-pkgconfig:${PKG_CONFIG_PATH}" >> $GITHUB_ENV
- name: bazel test
working-directory: ./src
22 changes: 11 additions & 11 deletions docs/build_mozc_in_docker.md
@@ -9,9 +9,9 @@ If you are not sure what the following commands do, please check the description
and make sure the operations before running them.

```
curl -O https://raw.githubusercontent.com/google/mozc/master/docker/ubuntu22.04/Dockerfile
docker build --rm --tag mozc_ubuntu22.04 .
docker create --interactive --tty --name mozc_build mozc_ubuntu22.04
curl -O https://raw.githubusercontent.com/google/mozc/master/docker/ubuntu24.04/Dockerfile
docker build --rm --tag mozc_ubuntu24.04 .
docker create --interactive --tty --name mozc_build mozc_ubuntu24.04
docker start mozc_build
docker exec mozc_build bazel build package --config oss_linux --config release_build
@@ -22,24 +22,24 @@ docker cp mozc_build:/home/mozc_builder/work/mozc/src/bazel-bin/unix/mozc.zip .
Docker containers are available to build Mozc binaries for Android JNI library and Linux desktop.

## System Requirements
Currently, only Ubuntu 22.04 is tested to host the Docker container to build Mozc.
Currently, only Ubuntu 24.04 is tested to host the Docker container to build Mozc.

* [Dockerfile](https://github.com/google/mozc/blob/master/docker/ubuntu22.04/Dockerfile) for Ubuntu 22.04
* [Dockerfile](https://github.com/google/mozc/blob/master/docker/ubuntu24.04/Dockerfile) for Ubuntu 24.04

## Build in Docker

### Set up Ubuntu 22.04 Docker container
### Set up Ubuntu 24.04 Docker container

```
curl -O https://raw.githubusercontent.com/google/mozc/master/docker/ubuntu22.04/Dockerfile
docker build --rm --tag mozc_ubuntu22.04 .
docker create --interactive --tty --name mozc_build mozc_ubuntu22.04
curl -O https://raw.githubusercontent.com/google/mozc/master/docker/ubuntu24.04/Dockerfile
docker build --rm --tag mozc_ubuntu24.04 .
docker create --interactive --tty --name mozc_build mozc_ubuntu24.04
```

You may need to execute `docker` with `sudo` (e.g. `sudo docker build ...`).

Notes
* `mozc_ubuntu22.04` is a Docker image name (customizable).
* `mozc_ubuntu24.04` is a Docker image name (customizable).
* `mozc_build` is a Docker container name (customizable).
* Don't forget to rebuild Docker container when Dockerfile is updated.

@@ -148,7 +148,7 @@ Note: This section is not about our officially supported build process.

You may also need other libraries.
See the configurations of
[Dockerfile](https://github.com/google/mozc/blob/master/docker/ubuntu22.04/Dockerfile)
[Dockerfile](https://github.com/google/mozc/blob/master/docker/ubuntu24.04/Dockerfile)
and
[GitHub Actions](https://github.com/google/mozc/blob/master/.github/workflows/linux.yaml).

2 changes: 1 addition & 1 deletion docs/build_mozc_in_osx.md
@@ -87,7 +87,7 @@ python build_tools/update_deps.py
In this step, additional build dependencies will be downloaded.

* [Ninja 1.11.0](https://github.com/ninja-build/ninja/releases/download/v1.11.0/ninja-mac.zip)
* [Qt 6.7.2](https://download.qt.io/archive/qt/6.7/6.7.2/submodules/qtbase-everywhere-src-6.7.2.tar.xz)
* [Qt 6.7.3](https://download.qt.io/archive/qt/6.7/6.7.3/submodules/qtbase-everywhere-src-6.7.3.tar.xz)
* [git submodules](../.gitmodules)

You can specify `--noqt` option if you would like to use your own Qt binaries.
2 changes: 1 addition & 1 deletion docs/build_mozc_in_windows.md
@@ -65,7 +65,7 @@ python build_tools/update_deps.py
In this step, additional build dependencies will be downloaded.

* [Ninja 1.11.0](https://github.com/ninja-build/ninja/releases/download/v1.11.0/ninja-win.zip)
* [Qt 6.7.2](https://download.qt.io/archive/qt/6.7/6.7.2/submodules/qtbase-everywhere-src-6.7.2.tar.xz)
* [Qt 6.7.3](https://download.qt.io/archive/qt/6.7/6.7.3/submodules/qtbase-everywhere-src-6.7.3.tar.xz)
* [.NET tools](../dotnet-tools.json)
* [git submodules](../.gitmodules)

93 changes: 93 additions & 0 deletions src/base/util.cc
@@ -329,6 +329,23 @@ size_t Util::CharsLen(absl::string_view str) {
return length;
}

std::u32string Util::Utf8ToUtf32(absl::string_view str) {
std::u32string codepoints;
char32_t codepoint;
while (Util::SplitFirstChar32(str, &codepoint, &str)) {
codepoints.push_back(codepoint);
}
return codepoints;
}

std::string Util::Utf32ToUtf8(const std::u32string_view str) {
std::string output;
for (const char32_t codepoint : str) {
CodepointToUtf8Append(codepoint, &output);
}
return output;
}

char32_t Util::Utf8ToCodepoint(const char *begin, const char *end,
size_t *mblen) {
absl::string_view s(begin, end - begin);
@@ -470,6 +487,82 @@ bool Util::SplitLastChar32(absl::string_view s, absl::string_view *rest,
return true;
}

bool Util::IsValidUtf8(absl::string_view s) {
char32_t first;
absl::string_view rest;
while (!s.empty()) {
if (!SplitFirstChar32(s, &first, &rest)) {
return false;
}
s = rest;
}
return true;
}

std::string Util::CodepointToUtf8(char32_t c) {
std::string output;
CodepointToUtf8Append(c, &output);
return output;
}

void Util::CodepointToUtf8Append(char32_t c, std::string *output) {
char buf[7];
output->append(buf, CodepointToUtf8(c, buf));
}

size_t Util::CodepointToUtf8(char32_t c, char *output) {
if (c == 0) {
// Do nothing if |c| is `\0`. Previous implementation of
// CodepointToUtf8Append worked like this.
output[0] = '\0';
return 0;
}
if (c < 0x00080) {
output[0] = static_cast<char>(c & 0xFF);
output[1] = '\0';
return 1;
}
if (c < 0x00800) {
output[0] = static_cast<char>(0xC0 + ((c >> 6) & 0x1F));
output[1] = static_cast<char>(0x80 + (c & 0x3F));
output[2] = '\0';
return 2;
}
if (c < 0x10000) {
output[0] = static_cast<char>(0xE0 + ((c >> 12) & 0x0F));
output[1] = static_cast<char>(0x80 + ((c >> 6) & 0x3F));
output[2] = static_cast<char>(0x80 + (c & 0x3F));
output[3] = '\0';
return 3;
}
if (c < 0x200000) {
output[0] = static_cast<char>(0xF0 + ((c >> 18) & 0x07));
output[1] = static_cast<char>(0x80 + ((c >> 12) & 0x3F));
output[2] = static_cast<char>(0x80 + ((c >> 6) & 0x3F));
output[3] = static_cast<char>(0x80 + (c & 0x3F));
output[4] = '\0';
return 4;
}
// below is not in UCS4 but in 32bit int.
if (c < 0x8000000) {
output[0] = static_cast<char>(0xF8 + ((c >> 24) & 0x03));
output[1] = static_cast<char>(0x80 + ((c >> 18) & 0x3F));
output[2] = static_cast<char>(0x80 + ((c >> 12) & 0x3F));
output[3] = static_cast<char>(0x80 + ((c >> 6) & 0x3F));
output[4] = static_cast<char>(0x80 + (c & 0x3F));
output[5] = '\0';
return 5;
}
output[0] = static_cast<char>(0xFC + ((c >> 30) & 0x01));
output[1] = static_cast<char>(0x80 + ((c >> 24) & 0x3F));
output[2] = static_cast<char>(0x80 + ((c >> 18) & 0x3F));
output[3] = static_cast<char>(0x80 + ((c >> 12) & 0x3F));
output[4] = static_cast<char>(0x80 + ((c >> 6) & 0x3F));
output[5] = static_cast<char>(0x80 + (c & 0x3F));
output[6] = '\0';
return 6;
}

absl::string_view Util::Utf8SubString(absl::string_view src, size_t start) {
const char *begin = src.data();
const char *end = begin + src.size();
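
The branch structure of the restored `Util::CodepointToUtf8(char32_t, char *)` can be sketched as a standalone encoder. This is only an illustration under assumptions: the names `EncodeUtf8ToBuffer` and `EncodeUtf8ToString` are hypothetical and not part of Mozc, and the sketch stops at the 4-byte form, so it needs a 5-byte buffer rather than the 7 bytes used in the diff. It mirrors the diff in returning an empty result for NUL and in encoding values up to 0x1FFFFF as 4 bytes without checking the Unicode maximum of 0x10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Mozc. Writes the UTF-8 form of `c` to
// `out`, appends a terminating '\0', and returns the byte count excluding
// the terminator. `out` needs room for 5 bytes (4 data bytes + '\0').
inline std::size_t EncodeUtf8ToBuffer(char32_t c, char *out) {
  std::size_t n = 0;
  if (c == 0) {
    // Like the code in the diff, NUL produces an empty result.
  } else if (c < 0x80) {
    out[n++] = static_cast<char>(c);
  } else if (c < 0x800) {
    out[n++] = static_cast<char>(0xC0 | (c >> 6));
    out[n++] = static_cast<char>(0x80 | (c & 0x3F));
  } else if (c < 0x10000) {
    out[n++] = static_cast<char>(0xE0 | (c >> 12));
    out[n++] = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out[n++] = static_cast<char>(0x80 | (c & 0x3F));
  } else if (c < 0x200000) {
    out[n++] = static_cast<char>(0xF0 | (c >> 18));
    out[n++] = static_cast<char>(0x80 | ((c >> 12) & 0x3F));
    out[n++] = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out[n++] = static_cast<char>(0x80 | (c & 0x3F));
  }
  // Values >= 0x200000 write nothing in this sketch. The code in the diff
  // instead emits the legacy 5- and 6-byte forms for them.
  out[n] = '\0';
  return n;
}

// Convenience wrapper that returns the encoded bytes as a std::string.
inline std::string EncodeUtf8ToString(char32_t c) {
  char buf[5];
  const std::size_t n = EncodeUtf8ToBuffer(c, buf);
  return std::string(buf, n);
}
```

For example, `EncodeUtf8ToString(0x800)` yields the three bytes `E0 A0 80`, matching the expectation in `util_test.cc` for the same code point.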
37 changes: 11 additions & 26 deletions src/base/util.h
@@ -37,10 +37,7 @@
#include <string_view>
#include <vector>

#include "absl/base/attributes.h"
#include "absl/base/macros.h"
#include "absl/strings/string_view.h"
#include "base/strings/unicode.h"

namespace mozc {

@@ -83,19 +80,12 @@ class Util {
static bool IsCapitalizedAscii(absl::string_view s);

// Returns the lengths of [src, src+size] encoded in UTF8.
ABSL_DEPRECATED("Use strings::CharsLen or AtLeastCharsLen.")
static size_t CharsLen(absl::string_view str);

// Converts a UTF-8 string to UTF-32.
ABSL_DEPRECATE_AND_INLINE()
static std::u32string Utf8ToUtf32(absl::string_view str) {
return strings::Utf8ToUtf32(str);
}
static std::u32string Utf8ToUtf32(absl::string_view str);
// Converts a UTF-32 string to UTF-8.
ABSL_DEPRECATE_AND_INLINE()
static std::string Utf32ToUtf8(std::u32string_view str) {
return strings::Utf32ToUtf8(str);
}
static std::string Utf32ToUtf8(std::u32string_view str);

// Converts the first character of UTF8 string starting at |begin| to UCS4.
// The read byte length is stored to |mblen|.
@@ -107,16 +97,16 @@ class Util {
}

// Converts a UCS4 code point to UTF8 string.
ABSL_DEPRECATE_AND_INLINE() static std::string CodepointToUtf8(char32_t c) {
return strings::Char32ToUtf8(c);
}
static std::string CodepointToUtf8(char32_t c);

// Converts a UCS4 code point to UTF8 string and appends it to |output|, i.e.,
// |output| is not cleared.
ABSL_DEPRECATE_AND_INLINE()
static void CodepointToUtf8Append(char32_t c, std::string *output) {
return strings::StrAppendChar32(output, c);
}
static void CodepointToUtf8Append(char32_t c, std::string *output);

// Converts a UCS4 code point to UTF8 and stores it to char array. The result
// is terminated by '\0'. Returns the byte length of converted UTF8 string.
// REQUIRES: The output buffer must be at least 7 bytes long.
static size_t CodepointToUtf8(char32_t c, char *output);

// Returns true if |s| is split into |first_char32| + |rest|.
// You can pass nullptr to |first_char32| and/or |rest| to ignore the matched
@@ -135,23 +125,18 @@ class Util {
char32_t *last_char32);

// Returns true if |s| is a valid UTF8.
ABSL_DEPRECATE_AND_INLINE() static bool IsValidUtf8(absl::string_view s) {
return strings::IsValidUtf8(s);
}
static bool IsValidUtf8(absl::string_view s);

// Extracts a substring range, where both start and length are in terms of
// UTF8 size. Note that the returned string view refers to the same memory
// block as the input.
ABSL_DEPRECATED("Use strings::Utf8AsChars or strings::Utf8Substring instead.")
static absl::string_view Utf8SubString(absl::string_view src, size_t start,
size_t length);
// This version extracts the substring to the end.
ABSL_DEPRECATED("Use strings::Utf8AsChars or strings::Utf8Substring instead.")
static absl::string_view Utf8SubString(absl::string_view src, size_t start);

// Extracts a substring of length |length| starting at |start|.
// Note: |start| is the start position in UTF8, not byte position.
ABSL_DEPRECATED("Use strings::Utf8AsChars or strings::Utf8Substring instead.")
static void Utf8SubString(absl::string_view src, size_t start, size_t length,
std::string *result);

@@ -274,7 +259,7 @@ class Util {
// char32_t c = iter.Get();
// ...
// }
class ABSL_DEPRECATED("Use strings::Utf8AsChars instead.") ConstChar32Iterator {
class ConstChar32Iterator {
public:
explicit ConstChar32Iterator(absl::string_view utf8_string);
ConstChar32Iterator(const ConstChar32Iterator &) = delete;
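
Both `Util::Utf8ToUtf32` and `Util::IsValidUtf8` in this commit are loops that repeatedly peel the first character off the input with `Util::SplitFirstChar32`. The sketch below shows that pattern in isolation. `SplitFirstSketch` is a hypothetical stand-in for `SplitFirstChar32`, whose body is not shown in this diff; it is an assumption that only checks lead and continuation bytes, and it does not reject overlong forms or surrogates, so the real function may be stricter.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for Util::SplitFirstChar32 (an assumption, not the
// Mozc implementation). Decodes one 1- to 4-byte sequence from the front of
// `s`; on success stores the code point and the remaining bytes.
inline bool SplitFirstSketch(std::string_view s, char32_t *first,
                             std::string_view *rest) {
  if (s.empty()) return false;
  const unsigned char lead = static_cast<unsigned char>(s[0]);
  std::size_t len = 0;
  char32_t cp = 0;
  if (lead < 0x80) {
    len = 1;
    cp = lead;
  } else if ((lead & 0xE0) == 0xC0) {
    len = 2;
    cp = lead & 0x1F;
  } else if ((lead & 0xF0) == 0xE0) {
    len = 3;
    cp = lead & 0x0F;
  } else if ((lead & 0xF8) == 0xF0) {
    len = 4;
    cp = lead & 0x07;
  } else {
    return false;  // Stray continuation byte or unsupported lead byte.
  }
  if (s.size() < len) return false;  // Truncated sequence.
  for (std::size_t i = 1; i < len; ++i) {
    const unsigned char b = static_cast<unsigned char>(s[i]);
    if ((b & 0xC0) != 0x80) return false;  // Not a continuation byte.
    cp = (cp << 6) | (b & 0x3F);
  }
  *first = cp;
  *rest = s.substr(len);
  return true;
}

// Valid if every character can be peeled off until the input is empty.
inline bool IsValidUtf8Sketch(std::string_view s) {
  char32_t c;
  std::string_view rest;
  while (!s.empty()) {
    if (!SplitFirstSketch(s, &c, &rest)) return false;
    s = rest;
  }
  return true;
}

// Collects code points until the input is empty or a decode fails.
inline std::u32string Utf8ToUtf32Sketch(std::string_view s) {
  std::u32string out;
  char32_t c;
  while (SplitFirstSketch(s, &c, &s)) out.push_back(c);
  return out;
}
```

One consequence of this loop shape, visible in the diff as well, is that the conversion stops silently at the first undecodable byte, while the validity check reports it.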
33 changes: 31 additions & 2 deletions src/base/util_test.cc
@@ -326,8 +326,10 @@ TEST(UtilTest, Utf8ToCodepoint) {
}

TEST(UtilTest, CodepointToUtf8) {
// Do nothing if |c| is NUL. Previous implementation of CodepointToUtf8 worked
// like this even though the reason is unclear.
std::string output = Util::CodepointToUtf8(0);
EXPECT_EQ(output, absl::string_view("\0", 1));
EXPECT_TRUE(output.empty());

output = Util::CodepointToUtf8(0x7F);
EXPECT_EQ(output, "\x7F");
@@ -342,7 +344,34 @@ TEST(UtilTest, CodepointToUtf8) {
output = Util::CodepointToUtf8(0x10000);
EXPECT_EQ(output, "\xF0\x90\x80\x80");
output = Util::CodepointToUtf8(0x1FFFFF);
EXPECT_EQ(output, "\uFFFD");
EXPECT_EQ(output, "\xF7\xBF\xBF\xBF");

// Buffer version.
char buf[7];

EXPECT_EQ(Util::CodepointToUtf8(0, buf), 0);
EXPECT_EQ(strcmp(buf, ""), 0);

EXPECT_EQ(Util::CodepointToUtf8(0x7F, buf), 1);
EXPECT_EQ(strcmp("\x7F", buf), 0);

EXPECT_EQ(Util::CodepointToUtf8(0x80, buf), 2);
EXPECT_EQ(strcmp("\xC2\x80", buf), 0);

EXPECT_EQ(Util::CodepointToUtf8(0x7FF, buf), 2);
EXPECT_EQ(strcmp("\xDF\xBF", buf), 0);

EXPECT_EQ(Util::CodepointToUtf8(0x800, buf), 3);
EXPECT_EQ(strcmp("\xE0\xA0\x80", buf), 0);

EXPECT_EQ(Util::CodepointToUtf8(0xFFFF, buf), 3);
EXPECT_EQ(strcmp("\xEF\xBF\xBF", buf), 0);

EXPECT_EQ(Util::CodepointToUtf8(0x10000, buf), 4);
EXPECT_EQ(strcmp("\xF0\x90\x80\x80", buf), 0);

EXPECT_EQ(Util::CodepointToUtf8(0x1FFFFF, buf), 4);
EXPECT_EQ(strcmp("\xF7\xBF\xBF\xBF", buf), 0);
}

TEST(UtilTest, CharsLen) {
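
The changed expectations above show a behavioral difference: for 0x1FFFFF the previous test expected U+FFFD (`"\uFFFD"`), while the restored implementation emits the raw bytes `F7 BF BF BF`. A caller that still wants replacement-character behavior could sanitize the code point before encoding. The helper below is a hypothetical sketch, not part of Mozc. Treating surrogates (0xD800 to 0xDFFF) as invalid follows the Unicode definition of a scalar value and is an assumption about what such a caller would want; the diff does not show how the previous implementation handled them.

```cpp
// Hypothetical helper, not part of Mozc. Maps anything that is not a
// Unicode scalar value to U+FFFD (REPLACEMENT CHARACTER).
inline char32_t SanitizeCodepointSketch(char32_t c) {
  constexpr char32_t kReplacement = 0xFFFD;
  if (c > 0x10FFFF) return kReplacement;                 // Beyond Unicode.
  if (c >= 0xD800 && c <= 0xDFFF) return kReplacement;   // Surrogate range.
  return c;
}
```

With this in place, passing `SanitizeCodepointSketch(0x1FFFFF)` to an encoder would produce the encoding of U+FFFD instead of a 4-byte sequence outside the Unicode range.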
10 changes: 10 additions & 0 deletions src/build_tools/BUILD.bazel
@@ -179,3 +179,13 @@ bzl_library(
"//bazel:run_build_tool_bzl",
],
)

exports_files([
"mozc_win32_resource_template.rc",
])

mozc_py_binary(
name = "gen_win32_resource_header",
srcs = ["gen_win32_resource_header.py"],
deps = [":mozc_version_lib"],
)