Caching translations implementation #202

Closed
wants to merge 175 commits into from
Changes from 172 commits
Commits
bfdc6d2
Adding a List + Hashmap LRU Cache with thread-safety
Jun 29, 2021
b4bef8a
simplifying some mess
Jun 29, 2021
1e78cb7
Temporary placement of cache for integration, pending hash(marian::Wo…
Jun 29, 2021
c1d8338
Missing comma
Jun 29, 2021
bed7f11
Getting compilation back to work
Jun 29, 2021
869a315
equal_to is defined for containers provided elements have it defined
Jun 29, 2021
d2cd678
Cache population code added
Jun 29, 2021
363fc65
All filled from cache corner case
Jun 29, 2021
d5df177
Adding a few cache-stats to enable checks
Jun 30, 2021
59a2139
brt: + tests for cache
Jun 30, 2021
ebe2bfa
cache stats and test app
Jun 30, 2021
c3a9cc6
brt: fix error ambiguous redirect
Jun 30, 2021
d1c7eb2
Initializing cache stats properly
Jun 30, 2021
3628b6d
Naming to Key, Value; Stats are now within cache
Jun 30, 2021
441704a
Bulk insert for cache; Flushing cache-refresh at request completion
Jun 30, 2021
db78fe0
move that guard back up
Jun 30, 2021
98e3a29
Adding L4
Jul 8, 2021
f0aca14
Submodule L4: Make boost optional
Jul 8, 2021
de5115f
Adding L4 onto bergamot-build
Jul 8, 2021
bc79cfb
sync with callback instead of future: service, brt
Jul 8, 2021
a83a254
Empty commit; brt push to github to resolve commit
Jul 8, 2021
421efbc
Fixing some merge artifacts
Jul 8, 2021
e1fe73d
L4: exp on windows on CI; Removing guard for MSVC sln only
Jul 8, 2021
79186db
submodule l4: cmake cmp0077
Jul 8, 2021
fed21de
L4: More boost strip
Jul 8, 2021
88aaaab
L4: <functional> for reference_wrapper
Jul 8, 2021
dedcc24
Intermediate; Make History eager instead of marian's lazy
Jul 14, 2021
da19790
L4 integration complete; Cache is not working - diagnosing at runtime…
Jul 15, 2021
8471254
Moving to cpp + h files
Jul 16, 2021
10e939b
cache.cpp file inclusion
Jul 16, 2021
c96fd47
[L4]: numeric_limits instead of integer-traits.
Jul 16, 2021
40512ab
[L4]: boost::any -> std::any
Jul 16, 2021
ba14228
[L4] This should fix a bunch of lock includes
Jul 16, 2021
d481502
Empty commit to relaunch with L4 clone
Jul 16, 2021
fd67da3
L4: dirty-fix std::vector instead of some boost interprocess vector
Jul 16, 2021
bef7de3
[L4]: fix header include to avoid compile fail
Jul 16, 2021
59e4e95
[L4]: boost::hash_combine import source code-segment
Jul 16, 2021
2ad52cb
[L4]: attempting include of strings.h
Jul 16, 2021
b2ddd36
Empty commit to relaunch with L4 clone
Jul 16, 2021
c9c08d8
[L4]: Removing boost::lexical_cast
Jul 16, 2021
25df00e
[L4]: std::memcpy upgrades
Jul 16, 2021
8fa7c2b
[L4]: i/ostream include fix
Jul 16, 2021
fc04be2
[L4]: memcmp -> std::memcmp
Jul 16, 2021
376732f
memcpy -> std::memcpy on binary IO for ProcessedRequestSentence
Jul 16, 2021
2097679
[L4]: Interprocess -> std::vector
Jul 16, 2021
7cee19f
[L4]: Interprocess - cherry picking AtomicOffsetPtr requirements
Jul 16, 2021
1e158c0
[L4]: boost format for serialization errstr gone
Jul 16, 2021
ca551da
[L4]: boost format for serialization fixing includes
Jul 16, 2021
50f2674
[L4]: boost format for serialization std::string(...) for err strings
Jul 16, 2021
2ec98ea
Empty commit to relaunch with L4 clone
Jul 16, 2021
e7f8e68
pair<Key, Value> -> struct Record{Key, Value}
Jul 17, 2021
bd66c15
Some more shuffling in LRUCache
Jul 17, 2021
0ae67d3
Reorg on cache header file; definitions are in cpp now; Some comments
Jul 17, 2021
e0f961e
Adding some comments
Jul 17, 2021
c427b97
It works; I have no idea why
Jul 25, 2021
c434f2d
Removing debug statements
Jul 26, 2021
0b4a192
L4: Some diff prettifying
Jul 26, 2021
bf97c34
Removing modelIdentifier string for now; will revisit for multiple mo…
Jul 26, 2021
e769749
LRUCache: Old naive-implementation is now nuked
Jul 26, 2021
c618165
Leftover comment from LRUCache removed
Jul 26, 2021
58d9b31
Dropping in another crude unordered_map based cache for WASM
Jul 26, 2021
2cd0e25
LRUCache without locks for WASM
Jul 26, 2021
6cff453
LRU Cache without thread capability for WASM
Jul 26, 2021
4f82ad8
Moving processed_request_sentence into a separate {h,cpp} file and co…
Jul 26, 2021
0eaca2f
More improvements to source / comments situation
Jul 26, 2021
c58d07d
TranslatorLRUCache -> TranslationCache
Jul 26, 2021
2199616
Documentation and naming improvements
Jul 27, 2021
0faa8ba
Adding empty() to ProcessedRequestSentence and getting rid of unique_ptr
Jul 27, 2021
c3374f3
Merge branch 'main' into translation-cache
Jul 27, 2021
8214d2c
opts available at config/cmdline, parsing code and adjustments added
Jul 27, 2021
70cb988
doc update to reflect explicit -> Options change
Jul 27, 2021
3999fec
Moving defaults to configparser creation
Jul 27, 2021
c6fc93f
Create a no-cache path using a nullptr
Jul 27, 2021
d1fd2d6
Merge branch 'main' into translation-cache
jerinphilip Jul 29, 2021
833a3d9
Merge branch 'main' into translation-cache
Jul 29, 2021
15ebb85
[L4]: std::optional stronger adhering to spec for MSVC
Jul 29, 2021
1be7265
Merge: sync with remote GitHub activity
Jul 29, 2021
ae9ad8a
[L4]: Require C++17 since we're using std::optional
Jul 29, 2021
b4c96a8
[L4]: <optional> include attempt to fix MSVC
Jul 29, 2021
4bc2828
Replicating tests for WASM_COMPATIBLE_SOURCE path
Aug 2, 2021
a35ee9c
Okay, BRT double path build exercise
Aug 2, 2021
f51d101
[empty-commit] Rerun CI after submodule push
Aug 2, 2021
80d3166
Debug messages are triggered by BERGAMOT_L4_CACHE_DEBUG env variable
Aug 2, 2021
293cf08
Fix ThreadUnsafeLRUCache doc
Aug 2, 2021
a624fdf
ResponseBuilder: mark input reference args as const
Aug 2, 2021
8d6bfbd
Improve ProcessedRequestSentence documentation
Aug 2, 2021
66a08c7
Bugfix: reserve -> resize in container before using raw memcpy
Aug 2, 2021
db61f1d
empty() is const
Aug 2, 2021
3704891
Cleaning up response_builder removing obsolete comments
Aug 2, 2021
59e5213
Stronger test-case; output with/without cache should match on similar…
Aug 2, 2021
d039e5f
Only print translated text
Aug 2, 2021
97759da
Use a simpler condition, explain in detail to fire empty/cache-comple…
Aug 2, 2021
af612c4
Corner cases in Request construction: Keep it simple and stupid
Aug 2, 2021
ff46ef3
(ThreadSafe)Batcher should rely on segments post cache prefill
Aug 2, 2021
eec1752
More cache op related comments in source for request
Aug 3, 2021
b7a654d
Using marian::util::hash_combine instead of intermediate string
Aug 20, 2021
0c082c5
[brt]: perf benchmarks
Aug 26, 2021
28c64ee
Relaxations recommended by Nick: Cache may return false-positives wit…
Sep 3, 2021
2b47c04
kpu review comments on FlatVector incorporated
Sep 3, 2021
28d87b0
We forgot the read into structured, fixed now, cache-tests pass
Sep 3, 2021
2f73835
Removing *stream headers
Sep 3, 2021
da669f1
Free only if initialized, some more comments
Sep 3, 2021
83d78a7
Comments on L4::IWritable::{Key, Value}
Sep 3, 2021
955479a
Remove unused RequestSentence::isCachePrefilled() method
Sep 3, 2021
ba1727b
Making initialized inline; Improving comments
Sep 6, 2021
c191065
Merge branch 'main' into translation-cache
Sep 6, 2021
9d5a164
Relaxations by Nick for ThreadUnsafeCache
Sep 6, 2021
b62ef7f
Fix ThreadUnsafeCache: Clone from cache and move into arg
Sep 6, 2021
45972f3
Expand stats to cover L4. CacheStats common to both cache, debug comm…
Sep 6, 2021
08cdfc8
Decrement activeRecords fix
Sep 6, 2021
bdfc1e8
Place ConstRangeView behind storage_
Sep 7, 2021
471e776
Storage for value in WASM Cache, string_view exchanges
Sep 7, 2021
7c7b9fe
Separate templated I/O on binary-blob storage from data
Sep 7, 2021
9435108
Improve comments: ConstRangeView, explain QE, Alignments being commented
Sep 7, 2021
84ddf66
Remove keySize, valueSize: We're only bothered about totalSize of Cache
Sep 7, 2021
103097f
Removing debug statements, we will now explicitly use them outside wi…
Sep 7, 2021
0705b29
Merge branch 'main' into translation-cache
Sep 7, 2021
f8dd9c9
Change malloc/free to new[], delete[], void* -> char* for typed memory
Sep 17, 2021
b20a188
Removing obsolete reinterpret_cast<const char*>
Sep 17, 2021
b5f6e49
Merge branch 'main' into translation-cache
Sep 21, 2021
0920525
Removing merge artifact: threadsafe_batcher.cpp
Sep 21, 2021
9bb58e3
Benchmark code kept separate, now being merged in.
Sep 21, 2021
cd7bec4
Clang-format fixes after vscode merge
Sep 21, 2021
b5eb45d
Fix cacheConfig init from marian to CLI11 style, add cache after merge
Sep 23, 2021
1ec29f3
Compute a unique Id for a model from AlignedMemory and config-string
Sep 24, 2021
7d6f238
Make Request aware of TranslationModel for use in Cache
Sep 24, 2021
3f131bd
Accept TranslationModel into cache fetch/insert now; Usage pending
Sep 24, 2021
7c7a318
Review comment: Change auto usage to uint64_t in cache
Sep 24, 2021
45163ae
Lifting hash(Words) -> hash(model, words); Got circular dependencies …
Sep 24, 2021
8a06ed2
Removing some template to specialize for size_t; All our functions ar…
Sep 24, 2021
8fffdf7
Removing remnant uint64_t in favour of size_t in TranslationModel
Sep 24, 2021
9b8383d
Update submodule to point to browsermt/L4 instead of microsoft/L4
Sep 24, 2021
c37d347
[L4]: Indicate boost copy in comments
Sep 27, 2021
1557f7a
Use static std::atomic<size_t> to assign modelIds for each Translatio…
Sep 27, 2021
2c6c6d5
Remove obsolete member TranslationModel::computeUniqueId definition
Sep 27, 2021
0204f07
Add a NoCache to fix compile at WASM on the ifndef else
Sep 27, 2021
62564ff
Bridging WASM/native or AsyncService/BlockingService at tests for wid…
Sep 27, 2021
f2d0c2c
Remove apps.cpp and references
Sep 27, 2021
8b00fc6
[BRT] Work with executable rename with bergamot-test-native
Sep 27, 2021
68475f8
Revert "Add a NoCache to fix compile at WASM on the ifndef else"
Sep 27, 2021
d8ee2d0
Fix lonely namespace cache_util::
Sep 27, 2021
31275c8
Empty commit: Submodule re-pull
Sep 27, 2021
1188d12
Make defaults CLI11 correct and explicit, make no-hits on cache-tests…
Sep 27, 2021
87030a9
Adding missing test templating files
Sep 27, 2021
5cda3d5
Let emscripten know L4 cache exists
Sep 27, 2021
40e8be0
[L4]: Fix WASM complaints, uint64_t - size_t
Sep 27, 2021
2df84fa
[BRT]: Partition cache test native disjoint wasm
Sep 27, 2021
2d2e96d
Merge branch 'main' into translation-cache
Sep 28, 2021
75b6867
app_ -> app
Sep 28, 2021
fe8c3f1
Manually create loggers in command line apps to get logging back
Sep 28, 2021
e0e45e7
[BRT]: Some syncing of scripts after config updates etc
Sep 28, 2021
a362317
Fixing lost merge content: numToBeFreshlyTranslated to avoid deadlocks
Sep 29, 2021
4e39252
[BRT]: Few more commits and sync
Sep 29, 2021
0ccce37
[BRT]: Add hparam bucket search script
Sep 29, 2021
19fdc8e
Revert "Manually create loggers in command line apps to get logging b…
Sep 29, 2021
1a4678d
Logging as RAII member on Service with proper cleanup
Sep 29, 2021
f287b73
Destroy loggers alternate route with checks
Sep 29, 2021
9872693
[L4]: Point to browsermt fork's master after merge of boost-dependenc…
Sep 29, 2021
b8efe0e
Make some editable <>Service::Config const
Oct 1, 2021
12eb0e1
Convert HashCacheKey struct to function
Oct 1, 2021
9326868
Reorganize config/common structs and functions into interface
Oct 1, 2021
3ddffd7
Massive config rework, things are hierarchical, cmd parsing is next t…
Oct 1, 2021
f3487c5
Make decoder benchmark consistent with rest of the test/benchmark suite
Oct 1, 2021
688e068
Empty commit to trigger workflows again
Oct 1, 2021
e8d1057
Syntax fixes
Oct 1, 2021
fd46682
Make configs + addOptions consistent
Oct 2, 2021
12d731e
Moving doc comment in ThreadsafeL4Cache
Oct 2, 2021
13666ae
Remove cli.h, make bergamot app native only
Oct 3, 2021
bfb203d
--model-config-path being required causes trouble for --build-info or…
Oct 3, 2021
80ffa15
BRT: app/bergamot --mode decoder -> bergamot-test-native --mode decoder
Oct 3, 2021
47be5ea
Remove confusing protected access, add some more documentation
Oct 4, 2021
b7de985
Removing duplication; If it's part of testSuite it can run at both th…
Oct 4, 2021
cd87254
hashTableIndex_ is const, still grabbed from L4
Oct 4, 2021
74e890c
context is RAII. Read, Write both are locking
Oct 4, 2021
90173c1
StorageIO constructor explicit
Oct 4, 2021
3 changes: 3 additions & 0 deletions .gitmodules
@@ -7,3 +7,6 @@
[submodule "bergamot-translator-tests"]
path = bergamot-translator-tests
url = https://github.com/browsermt/bergamot-translator-tests
[submodule "3rd_party/L4"]
path = 3rd_party/L4
url = https://github.com/browsermt/L4
6 changes: 6 additions & 0 deletions 3rd_party/CMakeLists.txt
@@ -9,6 +9,9 @@ endif(COMPILE_WASM)

add_subdirectory(ssplit-cpp)

set(L4_COMPILE_UNIT_TESTS OFF)
add_subdirectory(L4)

# Add include directories for 3rd party targets to be able to use it anywhere in the
# project without explicitly specifying their include directories. Once they
# fixe this problem, it can be removed.
@@ -18,7 +21,10 @@ target_include_directories(marian PUBLIC ${INCDIRS})
get_property(INCLUDE_DIRECTORIES DIRECTORY ssplit-cpp/src PROPERTY INCLUDE_DIRECTORIES)
target_include_directories(ssplit PUBLIC ${INCLUDE_DIRECTORIES})

get_property(L4_INCLUDE_DIRECTORIES DIRECTORY L4 PROPERTY INCLUDE_DIRECTORIES)
target_include_directories(L4 PUBLIC ${L4_INCLUDE_DIRECTORIES})
# Compilation flags

get_directory_property(CMAKE_C_FLAGS DIRECTORY marian-dev DEFINITION CMAKE_C_FLAGS)
get_directory_property(CMAKE_CXX_FLAGS DIRECTORY marian-dev DEFINITION CMAKE_CXX_FLAGS)
set(CMAKE_C_FLAGS ${CMAKE_C_FLAGS} PARENT_SCOPE)
1 change: 1 addition & 0 deletions 3rd_party/L4
Submodule L4 added at 8103d0
38 changes: 22 additions & 16 deletions app/bergamot.cpp
@@ -1,22 +1,28 @@
#include "cli.h"
#include "translator/byte_array_util.h"
#include "translator/parser.h"
#include "translator/response.h"
#include "translator/response_options.h"
#include "translator/service.h"
#include "translator/utils.h"

int main(int argc, char *argv[]) {
marian::bergamot::ConfigParser configParser;
using namespace marian::bergamot;
ConfigParser<AsyncService> configParser;
configParser.parseArgs(argc, argv);
auto &config = configParser.getConfig();
using namespace marian::bergamot;
switch (config.opMode) {
case OpMode::APP_WASM:
app::wasm(config);
break;
case OpMode::APP_NATIVE:
app::native(config);
break;
case OpMode::APP_DECODER:
app::decoder(config);
break;
default:
break;
}

AsyncService service(config.serviceConfig);

// Construct a model.
auto options = parseOptionsFromFilePath(config.modelConfigPaths.front());

MemoryBundle memoryBundle;
std::shared_ptr<TranslationModel> model = service.createCompatibleModel(options, std::move(memoryBundle));

ResponseOptions responseOptions;
std::string input = readFromStdin();
Response response = TranslateForResponse<AsyncService>()(service, model, std::move(input), responseOptions);

std::cout << response.target.text;
return 0;
}
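
The new `main` above obtains a `Response` synchronously from an `AsyncService` through `TranslateForResponse<AsyncService>`, and the commit list mentions "sync with callback instead of future". The diff does not show how that helper is implemented. One plausible way to block on a callback-based service is sketched below, under loud assumptions: `SketchAsyncService` and `translateBlocking` are hypothetical stand-ins written for this example and are not part of the bergamot-translator API.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical callback-based service; stands in for an async translator.
class SketchAsyncService {
 public:
  ~SketchAsyncService() {
    // Join workers so no thread outlives the service.
    for (auto &worker : workers_) worker.join();
  }

  // Runs the "translation" on a worker thread and reports via callback.
  void translate(std::string input,
                 std::function<void(std::string)> callback) {
    workers_.emplace_back(
        [input = std::move(input), callback = std::move(callback)]() {
          callback("translated: " + input);
        });
  }

 private:
  std::vector<std::thread> workers_;
};

// Hypothetical bridge: turns the callback API into a blocking call.
inline std::string translateBlocking(SketchAsyncService &service,
                                     std::string input) {
  // The promise is shared with the callback so it stays alive until the
  // worker has finished calling set_value.
  auto promise = std::make_shared<std::promise<std::string>>();
  std::future<std::string> future = promise->get_future();
  service.translate(std::move(input), [promise](std::string result) {
    promise->set_value(std::move(result));
  });
  return future.get();  // blocks until the callback has fired
}
```

Calling `translateBlocking(service, "hello")` on this stand-in returns once the worker thread has invoked the callback, which is the same shape of call as the one-line `TranslateForResponse` usage in `main`.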
166 changes: 0 additions & 166 deletions app/cli.h

This file was deleted.

18 changes: 14 additions & 4 deletions src/tests/CMakeLists.txt
@@ -14,15 +14,25 @@ endif (COMPILE_UNIT_TESTS)
if(NOT MSVC)
# Testing apps
set(APP_TESTS)
add_executable("bergamot-test" "cli.cpp" "apps.cpp")
add_executable("bergamot-test-native" "native-cli.cpp")

if(CUDA_FOUND)
target_link_libraries("bergamot-test" bergamot-translator)
target_link_libraries("bergamot-test-native" bergamot-translator)
else(CUDA_FOUND)
target_link_libraries("bergamot-test" bergamot-translator)
target_link_libraries("bergamot-test-native" bergamot-translator)
endif(CUDA_FOUND)

set_target_properties("bergamot-test" PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}")
set_target_properties("bergamot-test-native" PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}")

add_executable("bergamot-test-wasm" "wasm-cli.cpp")

if(CUDA_FOUND)
target_link_libraries("bergamot-test-wasm" bergamot-translator)
else(CUDA_FOUND)
target_link_libraries("bergamot-test-wasm" bergamot-translator)
endif(CUDA_FOUND)

set_target_properties("bergamot-test-wasm" PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}")

# Adding an intgemm_resolve cmdline
add_executable(intgemm-resolve intgemm_resolve.cpp)
113 changes: 0 additions & 113 deletions src/tests/apps.cpp

This file was deleted.
