Added reproduction log entry for BM25 MS MARCO Passage Ranking #2600

Closed
wants to merge 1,817 commits into from
ea5f1ec
Update docs for SPLADE++ ED regressions on BEIR (#2279)
lintool Nov 28, 2023
88dc3a2
Add to onboarding reproduction logs (#2280)
tudou0002 Nov 29, 2023
e569408
Add to onboarding reproduction logs (#2284)
kdricci Dec 3, 2023
ed13b26
Add to onboarding reproduction logs (#2285)
AreelKhan Dec 3, 2023
8691c51
Add to onboarding reproduction logs (#2283)
sueszli Dec 3, 2023
aaec4e2
Add to onboarding reproduction logs (#2286)
ljk423 Dec 6, 2023
e05acb1
Upgrade to Jacoco 0.8.11 (#2291)
ChrisHegarty Dec 6, 2023
4354166
Add to onboarding reproduction logs (#2298)
Minhajul99 Dec 9, 2023
c39fef7
Add to onboarding reproduction logs (#2299)
Panizghi Dec 11, 2023
44c9abe
Extract common code paths in indexing pipeline (#2275)
lintool Dec 13, 2023
a4f944d
Add to onboarding reproduction logs (#2303)
saharsamr Dec 14, 2023
09beb94
Refactor search pipeline for HNSW and inverted dense (#2300)
lintool Dec 17, 2023
da8a8bb
Add download-from-remote feature for pre-built indexes (#2301)
ArthurChen189 Dec 19, 2023
00a8428
Upgrade to Lucene 9.9.1 (#2302)
lintool Dec 19, 2023
b83b83d
Update Lucene badge to 9.9.1; record recent regressions (#2312)
lintool Dec 20, 2023
cc570f5
Minor tweak to regression order to de-conflict out runfiles (#2313)
lintool Dec 21, 2023
6f681df
Major refactoring of search code paths (#2310)
lintool Dec 24, 2023
9445ba5
Improve logging for HNSW indexing w/ -optimize (#2319)
lintool Dec 28, 2023
3002cd7
Dial down -maxThreadMemoryBeforeFlush for indexing OpenAI Ada2 embedd…
lintool Dec 28, 2023
3398eb5
[maven-release-plugin] prepare release anserini-0.24.0
lintool Dec 28, 2023
8218183
[maven-release-plugin] prepare for next development iteration
lintool Dec 28, 2023
63f6446
Release notes for v0.24.0 (#2320)
lintool Dec 28, 2023
3e231b1
Add reproduction script for "End-to-End Retrieval with Learned Dense …
ArthurChen189 Dec 29, 2023
9446bed
Integrate jtreceval (#2324)
jasper-xian Jan 9, 2024
e3fb7d5
Update regressions to use trec_eval.jar (#2326)
jasper-xian Jan 9, 2024
3d19c7e
Update tools/ submodule to include trec_eval 9.0.8 (#2327)
jasper-xian Jan 9, 2024
eaa23e7
Add to onboarding reproduction logs (#2321)
wu-ming233 Jan 9, 2024
e104f21
Advance tools/ submodule (#2328)
lintool Jan 9, 2024
f36f676
Add to onboarding reproduction logs (#2329)
Yuan-Hou Jan 9, 2024
b85384f
Add to onboarding reproduction logs (#2325)
himasheth Jan 10, 2024
0c79fca
Fix trec_eval.jar assembly issue: edit yamls to use appassembler trec…
jasper-xian Jan 10, 2024
a3d0ab1
Update linux + WSL trec_eval binaries (#2331)
jasper-xian Jan 10, 2024
637d149
Update experiment docs to use trec_eval 9.0.8 (#2332)
jasper-xian Jan 11, 2024
5a2b88d
Rollback trec_eval to 9.0.4 (#2334)
jasper-xian Jan 12, 2024
efaecf7
Add MS MARCO passage regressions for BGE-base-en-v1.5 (#2335)
lintool Jan 13, 2024
895abe0
update splade-pp-ed beir topics (#2337)
justram Jan 13, 2024
3ac55ba
Updated regression log and batches to include BGE (#2338)
lintool Jan 14, 2024
31e8989
Add to onboarding reproduction logs (#2336)
Tanngent Jan 15, 2024
94e4e29
Add to onboarding reproduction logs (#2339)
BeginningGradeMaker Jan 18, 2024
066103f
Update start-here.md: fixed broken URL (#2340)
lintool Jan 18, 2024
8c4fc46
Add to onboarding reproduction logs (#2342)
ia03 Jan 18, 2024
eaf1e84
Remove splade-distil-cocodenser-medium regressions to reduce confusio…
lintool Jan 19, 2024
5a7c0f7
Add to onboarding reproduction logs (#2345)
AlexStan0 Jan 19, 2024
6122ab1
Update reproduction instructions for BEIR (with tarball downloads) (#…
lintool Jan 19, 2024
d3dc168
Add bge-base-en-v1.5 ONNX encoder (#2341)
ArthurChen189 Jan 21, 2024
2027047
Add to onboarding reproduction logs (#2349)
charlie-liuu Jan 22, 2024
a95f300
Add MS MARCO v1 passage regressions for BGE w/ ONNX (#2350)
lintool Jan 23, 2024
6dfff0c
Add MS MARCO v1 passage regressions for BGE w/ ONNX to documentation …
lintool Jan 24, 2024
f4d5681
Add BEIR regressions for BGE (#2353)
lintool Jan 25, 2024
0bef9f2
Update prebuilt dense indexes + add ability to specify topic enum (#2…
lintool Jan 27, 2024
b0f8407
[maven-release-plugin] prepare release anserini-0.24.1
lintool Jan 27, 2024
eb18235
[maven-release-plugin] prepare for next development iteration
lintool Jan 27, 2024
8c62095
Release notes for v0.24.1 (#2358)
lintool Jan 27, 2024
d077e3b
Various updates and cleanup (#2359)
lintool Jan 28, 2024
1ecd5a9
Add to onboarding reproduction logs (#2360)
dannychn11 Jan 29, 2024
ad106fa
Refactor regressions (#2361)
lintool Feb 5, 2024
4ef99e4
Tweak script for summarizing BEIR results (#2363)
lintool Feb 8, 2024
bb04130
Add to onboarding reproduction logs (#2362)
chloeqxq Feb 8, 2024
7e5d369
upgrade onnx version (#2365)
ArthurChen189 Feb 9, 2024
196874d
Cohere MS MARCO v1 passage 2CR (#2357)
jasper-xian Feb 9, 2024
84dab92
Refactor MS MARCO passage dev with Cohere V3 embeddings (#2366)
lintool Feb 11, 2024
b3b5eca
Add SPLADE++ ED w/ ONNX on BEIR (#2354)
lintool Feb 12, 2024
6466eff
Fix splade-pp-ed arguana regression issues (#2368)
ArthurChen189 Feb 14, 2024
e0bacf8
Add docs for SPLADE++ ED w/ ONNX (#2369)
lintool Feb 14, 2024
f1da4ff
Add BGE regressions for BEIR with quantized HNSW indexes (#2372)
lintool Feb 14, 2024
b0ebdb7
Update regression logs (#2373)
lintool Feb 14, 2024
dbf4e82
Regressions and bindings for CIRAL (#2377)
Mofetoluwa Feb 16, 2024
78b84dc
Update tools submodule (#2370)
Mofetoluwa Feb 18, 2024
1e7f806
Add languages to Simplesearcher (#2381)
Mofetoluwa Feb 18, 2024
c15c4a8
Update regression docs wrt CIRAL (#2382)
lintool Feb 19, 2024
2a0a623
Add to onboarding reproduction logs (#2384)
ru5h16h Feb 19, 2024
9459499
Cohere embed-english-v3 DL19/20 2CR (#2385)
jasper-xian Feb 20, 2024
92705b9
Add to onboarding reproduction logs (#2379)
16BitNarwhal Feb 21, 2024
910b6c9
Tweak scores for Cohere regressions (#2386)
lintool Feb 21, 2024
f7703fb
Fix BGE maxlen issue and add maxlen test cases for splade-pp-ed and b…
ArthurChen189 Feb 21, 2024
bf882fc
Add BGE regressions with ONNX on BEIR (#2375)
lintool Feb 22, 2024
3165c60
Update BEIR scores using BGE w/ ONNX (#2388)
lintool Feb 22, 2024
4626edd
Add to onboarding reproduction logs (#2389)
ASChampOmega Feb 23, 2024
dc7fa81
Add to onboarding reproduction logs (#2392)
17Melissa Feb 23, 2024
3ad51e8
Install pre-built BGE indexes for BEIR (#2390)
lintool Feb 23, 2024
a530342
Add support for readable lowercase topics (#2393)
lintool Feb 26, 2024
29e57da
Add pre-built BEIR indexes for flat, multifield, SPLADE++ ED; renamed…
lintool Feb 26, 2024
68dedf1
[maven-release-plugin] prepare release anserini-0.24.2
lintool Feb 27, 2024
a16638b
[maven-release-plugin] prepare for next development iteration
lintool Feb 27, 2024
c04ae04
Release notes for v0.24.2 (#2398)
lintool Feb 27, 2024
f9c4dea
Update README to add "try it" section: reproduce directly from fatjar…
lintool Feb 27, 2024
d651180
Add to onboarding reproduction logs (#2397)
haeriamin Feb 28, 2024
1296a40
Tweak tolerance for regression tests (again) (#2400)
lintool Mar 1, 2024
7fb96f5
Install Cohere embed-english-v3.0 pre-built indexes, add relevant bin…
lintool Mar 1, 2024
6ac5b46
Rename dl20-passage to dl20 to be consistent with other topics (#2404)
lintool Mar 2, 2024
1836942
Add DL23 passage and DL22/23 doc bindings + BM25 regressions (#2403)
lintool Mar 4, 2024
bd329a5
Add to onboarding reproduction logs (#2405)
devesh-002 Mar 8, 2024
1c392dc
Add to onboarding reproduction logs (#2408)
JodyZ0203 Mar 15, 2024
d65a817
Add to onboarding reproduction logs (#2409)
kxwtan Mar 16, 2024
4a65470
Add to onboarding reproduction logs (#2407)
xpbowler Mar 16, 2024
991b159
Add 2cr for Anserini (#2395)
ArthurChen189 Mar 18, 2024
f813d48
Fill in missing DL23 passage and DL22/23 doc conditions for doc2query…
lintool Mar 20, 2024
6720f1c
Post-pend md5 to prebuilt indexes (#2412)
16BitNarwhal Mar 21, 2024
c34be0b
Fix duplicate prebuilt indexes with and without postpended md5 (#2413)
16BitNarwhal Mar 22, 2024
56d6ad1
Tweak 'name' in regression yaml files to make more unique (#2415)
lintool Mar 24, 2024
0369155
Add missing conditions for MS MARCO v2 passage/doc (#2414)
lintool Mar 24, 2024
fad61d9
Tweak 2CRs with pre-built indexes (#2416)
lintool Mar 25, 2024
b3dda8a
Add to onboarding reproduction logs (#2422)
khufia Mar 27, 2024
2ace70d
[maven-release-plugin] prepare release anserini-0.25.0
lintool Mar 28, 2024
65bf134
[maven-release-plugin] prepare for next development iteration
lintool Mar 28, 2024
e55b673
Release notes for v0.25.0 (#2426)
lintool Mar 28, 2024
70d852d
Add to onboarding reproduction logs (#2420)
Lindaaa8 Mar 28, 2024
4619df3
Use 'encoders' cache directory instead of 'test' cache directory: eli…
16BitNarwhal Mar 28, 2024
df974ba
Fix TREC-COVID regressions and fine-tuning experiments (#2431)
lintool Mar 29, 2024
5544ebb
Add to onboarding reproduction logs (#2432)
SyedHuq28 Mar 29, 2024
5a7d6d0
Upgrade to JDK 21 (#2410)
lintool Apr 4, 2024
2493812
[maven-release-plugin] prepare release anserini-0.35.0
lintool Apr 4, 2024
e544e7b
[maven-release-plugin] prepare for next development iteration
lintool Apr 4, 2024
698c008
Release notes for v0.35.0 (#2436)
lintool Apr 4, 2024
610267a
Add to onboarding reproduction logs (#2435)
thiendan Apr 4, 2024
b43cd02
Fix deprecation warnings for JDK 21; update docs (#2437)
lintool Apr 5, 2024
4b8ec12
Add Spring Boot backend and next.js frontend (#2425)
16BitNarwhal Apr 6, 2024
8a3b2b0
Merge pyserini/anserini cache directory; rename BEIR indexes to new s…
lintool Apr 6, 2024
8c6cdd9
Bump word-wrap from 1.2.3 to 1.2.5 in /src/main/frontend (#2440)
dependabot[bot] Apr 7, 2024
621f93a
Bump semver from 6.3.0 to 6.3.1 in /src/main/frontend (#2441)
dependabot[bot] Apr 7, 2024
a22204a
Bump json5 from 1.0.1 to 1.0.2 in /src/main/frontend (#2442)
dependabot[bot] Apr 7, 2024
94271ed
Remove appassembler-maven-plugin (#2444)
lintool Apr 8, 2024
bd85e36
Refactor regressions with prebuilt indexes following removal of appas…
lintool Apr 8, 2024
0323aa1
Fix bug in test case PrebuiltIndexTest: correct URL parameter usage (…
Gelardinio Apr 9, 2024
15a6187
Fix dependabot issues (#2450)
16BitNarwhal Apr 9, 2024
f401873
Refactor OpenAI ada2 regressions (#2448)
lintool Apr 10, 2024
6c4046d
Upgrade postcss and zod to address security vulnerabilities (#2452)
16BitNarwhal Apr 11, 2024
a57dcd5
Rename/refactor MS MARCO v1 regressions: adopts new, consistent schem…
lintool Apr 11, 2024
2120d73
Add to onboarding reproduction logs (#2453)
a68lin Apr 12, 2024
1423361
Rename/refactor MS MARCO v2 regressions: adopts new, consistent schem…
lintool Apr 13, 2024
2891edd
Fix TREC-COVID regressions and fine-tuning experiments (#2455)
lintool Apr 14, 2024
667e61c
update readme (#2458)
ToluClassics Apr 17, 2024
b2410e9
Refactor regressions with prebuilt indexes (#2457)
lintool Apr 17, 2024
4510050
Add bindings for MS MARCO V2.1 prebuilt indexes + qrels (#2459)
lintool Apr 20, 2024
fec33c8
Add to onboarding reproduction logs (#2462)
DanielKohn1208 Apr 23, 2024
cbf7882
Add `-outputRerankerRequests` option to create input for RankLLM (#2…
ronakice Apr 24, 2024
87402ce
[maven-release-plugin] prepare release anserini-0.35.1
lintool Apr 24, 2024
654fbc8
[maven-release-plugin] prepare for next development iteration
lintool Apr 24, 2024
d9d9c5c
Release notes for v0.35.1 (#2464)
lintool Apr 24, 2024
a3889cb
Add bindings for RAGgy topics for TREC 2024 RAG (#2465)
lintool Apr 25, 2024
840b1b1
Add regressions for MS MARCO V2.1 corpora: document + segmented docum…
lintool Apr 27, 2024
f376394
Refactor and clean-up POM (#2468)
lintool Apr 28, 2024
68c430b
Adding support for trec_eval to take symbols representing standard qr…
DanielKohn1208 Apr 28, 2024
db8d7f2
Fix bug for symbol expansion in trec_eval (#2472)
DanielKohn1208 Apr 28, 2024
f3fd7aa
Add fatjar regression doc for v0.35.2-SNAPSHOT (#2471)
lintool Apr 28, 2024
99be23a
Add ability to read msmarco passage yaml from fatjar resources (#2469)
16BitNarwhal Apr 28, 2024
b111670
Update regressions log to document recently added regressions (#2474)
lintool Apr 28, 2024
0c5df22
[maven-release-plugin] prepare release anserini-0.36.0
lintool Apr 28, 2024
a7b3018
[maven-release-plugin] prepare for next development iteration
lintool Apr 28, 2024
fd266c6
Release notes for v0.36.0 (#2475)
lintool Apr 29, 2024
edfca5c
Locate and run fatjar in RunMSMarco (#2476)
16BitNarwhal Apr 29, 2024
3a16ae8
Add RunMsMarco for V2.1 and RunBeir (#2477)
wu-ming233 Apr 30, 2024
b7236cc
Add v0.36.1-SNAPSHOT docs; fixed typos in existing docs (#2479)
lintool Apr 30, 2024
da1a90b
Add to onboarding reproduction logs (#2478)
emadahmed19 Apr 30, 2024
d1a8de7
Fix MS MARCO V2.1 repo experiments on segmented doc collection (#2483)
lintool May 2, 2024
dc4a283
Fixed bge-base-en-v1.5 Dp yaml typo (#2484)
wu-ming233 May 2, 2024
2fa76ea
Align doc output format with repro scripts (#2485)
wu-ming233 May 5, 2024
1d9061b
Add to onboarding reproduction logs (#2487)
CheranMahalingam May 7, 2024
669f369
Add to onboarding reproduction logs (#2491)
billycz8 May 8, 2024
669d2e2
Add to onboarding reproduction logs (#2492)
KenWuqianghao May 10, 2024
5fa89cd
Bump next from 13.5.6 to 14.1.1 in /src/main/frontend (#2493)
dependabot[bot] May 11, 2024
1437a03
Reorganize documentation: align docs with 2CRs (#2490)
lintool May 12, 2024
fff4e3e
Add to onboarding reproduction logs (#2494)
hrouzegar May 12, 2024
32b3242
Web app and Rest API: Support for multiple indexes (#2486)
16BitNarwhal May 13, 2024
6917e97
Add to onboarding reproduction logs (#2496)
baixabhi May 14, 2024
2851e14
Add to onboarding reproduction logs (#2497)
Yuv-sue1005 May 14, 2024
42f3f2c
Fix default params in webapp (#2498)
16BitNarwhal May 15, 2024
33d2c83
Update API retrieval results format (#2499)
16BitNarwhal May 16, 2024
ceee2ba
Add to onboarding reproduction logs (#2500)
RohanNankani May 18, 2024
8d0e4a2
[maven-release-plugin] prepare release anserini-0.36.1
lintool May 23, 2024
90b9f6b
[maven-release-plugin] prepare for next development iteration
lintool May 23, 2024
8f6147b
Release notes for v0.36.1 (#2501)
lintool May 24, 2024
c08761d
rename collection to index in route (#2503)
16BitNarwhal May 25, 2024
2aa655c
Augment IndexInfo with corpus/model information (#2504)
lintool May 25, 2024
ed65af9
Add to onboarding reproduction logs (#2507)
IR3KT4FUNZ May 28, 2024
ba24b5a
Add bindings researchy dev topics (#2511)
16BitNarwhal May 30, 2024
20cce50
Initial implementation of flat vector search (#2510)
lintool May 30, 2024
8f49498
Add to onboarding reproduction logs (#2513)
bilet-13 Jun 1, 2024
c55b3f3
Improvements to flat vector search (#2512)
lintool Jun 3, 2024
b6143fc
Update regressions log to add brute-force search on dense vectors for…
lintool Jun 3, 2024
72d765a
Renaming BEIR regressions into consistent schema (#2518)
lintool Jun 5, 2024
1102fbe
Add to onboarding reproduction logs (#2520)
SeanSong25 Jun 5, 2024
5cc87fd
Rename MS MARCO regressions into consistent schema (#2519)
lintool Jun 8, 2024
1650285
Add to onboarding reproduction logs (#2522)
alireza-taban Jun 13, 2024
b5d8d05
Add new flat regressions for MS MARCO v1 passage (#2521)
lintool Jun 13, 2024
ae88b46
Add to regression log: flat indexes for MS MARCO v1 passage (#2523)
lintool Jun 13, 2024
4adbeab
Refactor regression documentation to fix consistency issues (#2524)
lintool Jun 15, 2024
28e80dc
Add to onboarding reproduction logs (#2526)
Feng-12138 Jun 17, 2024
8bff6f9
Bump ai.djl:api from 0.21.0 to 0.28.0 (#2529)
dependabot[bot] Jun 18, 2024
592cd34
Add to onboarding reproduction logs (#2530)
hosnahoseini Jun 18, 2024
f7998d2
Simplify options for HNSW indexes (#2531)
lintool Jun 18, 2024
4b40ec9
Bump braces from 3.0.2 to 3.0.3 in /src/main/frontend (#2532)
dependabot[bot] Jun 18, 2024
66d6567
Webapp update (#2525)
16BitNarwhal Jun 19, 2024
4d35c2a
Fix tokenizer length issue with DJL upgrade (#2536)
lintool Jun 21, 2024
fd9c3c2
Simplify options for HNSW indexes (#2533)
lintool Jun 24, 2024
f9c2d56
Refactor tolerance settings for BEIR dense vector regressions (#2538)
lintool Jun 26, 2024
c7e658d
Webapp get doc by ID (#2539)
16BitNarwhal Jun 26, 2024
68bd526
Fix rest-api.md (#2540)
16BitNarwhal Jun 26, 2024
8fa6c90
Add to onboarding reproduction logs (#2537)
FaizanFaisal25 Jun 29, 2024
675748c
Refactor tolerance settings for MS MARCO dense vector regressions (#2…
lintool Jul 8, 2024
06497b3
Add to onboarding reproduction logs (#2546)
XKTZ Jul 14, 2024
4b463fc
Tweak score tolerance for regressions (#2545)
lintool Jul 17, 2024
428a40a
Add to onboarding reproduction logs (#2544)
MehrnazSadeghieh Jul 19, 2024
6227adc
Add to onboarding reproduction logs (#2551)
alireza-nasirian Jul 19, 2024
6a5df23
Add to onboarding reproduction logs (#2552)
MariaPonomarenko38 Jul 22, 2024
bf8cb52
Tweak score tolerance for regressions (#2553)
lintool Jul 23, 2024
c833cc8
Webapp UI update (#2554)
16BitNarwhal Jul 23, 2024
e9bc602
Tweak score tolerance for regressions (#2555)
lintool Jul 30, 2024
14bdc2c
Tweak score tolerance for regressions (#2557)
lintool Aug 1, 2024
cf8c2f6
Update documentation for SPLADE++ ED huggingface -> onnx (#2556)
AndreSlavescu Aug 2, 2024
40cc1c1
Add to onboarding reproduction logs (#2558)
daisyyedda Aug 3, 2024
fe40b37
Add to onboarding reproduction logs (#2559)
valamuri2020 Aug 5, 2024
855845a
Update onnx docs (#2561)
AndreSlavescu Aug 5, 2024
b148fb8
Add to onboarding reproduction logs (#2562)
natek-1 Aug 6, 2024
80f4d39
Add Test Set Bindings for TREC 2024 RAG (#2566)
ronakice Aug 7, 2024
974a11b
Fixed command for indexing/search flat vectors (#2567)
lintool Aug 8, 2024
2a30147
Add BEIR BGE prebuilt flat indexes and repro bindings (#2564)
lintool Aug 8, 2024
4d38206
Add to onboarding reproduction logs (#2569)
emily-emily Aug 17, 2024
243f4ac
Add to ONNX reproduction logs (#2565)
valamuri2020 Aug 17, 2024
7a2b50f
Add test case for tweets / delinting code (#2534)
lintool Aug 18, 2024
907150b
Add to onboarding reproduction logs (#2572)
npjd Aug 18, 2024
350de1d
Tweak score tolerance for regressions (#2571)
lintool Aug 18, 2024
e9f65cf
Add model quantization instructions to ONNX documentation (#2570)
AndreSlavescu Aug 22, 2024
9904a86
Minor doc tweaks (#2576)
lintool Aug 22, 2024
d54900d
[maven-release-plugin] prepare release anserini-0.37.0
lintool Aug 22, 2024
bd00627
[maven-release-plugin] prepare for next development iteration
lintool Aug 22, 2024
4187878
Release notes for v0.37.0 (#2577)
lintool Aug 23, 2024
8c79066
Add to onboarding reproduction logs (#2574)
nicoella Aug 23, 2024
5a39d8c
Update fatjar doc to add TREC 2024 RAG test topics (#2578)
lintool Aug 23, 2024
c805521
Add to onboarding reproduction logs + instructions for Windows (#2583)
setarehbabajani Aug 30, 2024
8c0952d
Tweak parameters in regression yaml (#2581)
lintool Aug 31, 2024
f2737a8
Add to onboarding reproduction logs (#2586)
antea-ab Sep 3, 2024
68a2811
Refactor to enable cleaner Pyserini bindings (#2584)
lintool Sep 6, 2024
72efbbf
[maven-release-plugin] prepare release anserini-0.38.0
lintool Sep 6, 2024
c5f2d4f
[maven-release-plugin] prepare for next development iteration
lintool Sep 6, 2024
7978b4a
Release notes for v0.38.0 (#2589)
lintool Sep 6, 2024
4c3b840
Bump micromatch from 4.0.5 to 4.0.8 in /src/main/frontend (#2588)
dependabot[bot] Sep 7, 2024
e623527
Added a missing else to enable setting BM25Accurate properly (#2587)
Axiomatic314 Sep 7, 2024
0c2b578
Add to onboarding reproduction logs (#2591)
anshulsc Sep 7, 2024
a4daf32
Add to onboarding reproduction logs (#2593)
r-aya Sep 8, 2024
5732232
Add rank fusion - initial implementation (#2590)
Stefan824 Sep 8, 2024
3714825
Improve search UI landing page + search bar when displaying results (…
nicoella Sep 8, 2024
fed6527
Add Parquet dense vector support (#2582)
valamuri2020 Sep 10, 2024
b3ec459
Remove junit5 dependencies causing test cases to fail to run (#2597)
Stefan824 Sep 11, 2024
6ddade7
Added reproduction log entry for BM25 MS MARCO Passage Ranking
RMaarefdoust Sep 13, 2024
a60a33b
the changes
RMaarefdoust Sep 16, 2024
d2d4245
Add large file with Git LFS
RMaarefdoust Sep 16, 2024
bde0148
Ignore jar files
RMaarefdoust Sep 16, 2024
66a64c5
Update .gitattributes to untrack .jar files
RMaarefdoust Sep 16, 2024
9f7b76b
Add .jar files to .gitignore
RMaarefdoust Sep 16, 2024
The diff you're trying to view is too large. We only load the first 3000 changed files.
27 changes: 27 additions & 0 deletions .github/workflows/maven.yml
@@ -0,0 +1,27 @@
name: Java CI with Maven

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - name: Set up JDK 21
      uses: actions/setup-java@v2
      with:
        java-version: '21'
        distribution: 'adopt'
        cache: maven
    - name: Build with Maven
      run: mvn -B package --file pom.xml
    - name: Upload coverage to Codecov
      uses: codecov/codecov-action@v3
      with:
        token: ${{ secrets.CODECOV_TOKEN }}
88 changes: 77 additions & 11 deletions .gitignore
@@ -1,19 +1,85 @@
.DS_Store
.classpath
.project
.idea
target/
eval/trec_eval.9.0/
*~
bin/
eval/trec_eval.9.0.4/
eval/ndeval/ndeval
src/test/test-index*
*~
*.iml
.idea
*.pyc
src/test/test-index*
*node_modules/
src/main/js/SpeechDemo-darwin-x64/
config.cfg
log*
out.*
*.swp
tmp-files
*.out
lucene-index*
run.*
log.*
*.log
out.*
runs.regression/
runs.jdiq2018/
src/main/resources/solr/anserini-twitter/conf/lang/
src/main/resources/solr/anserini/conf/lang/
src/main/resources/cacm/collection/

# automatically generated by ECIR2019_axiomatic scripts
tools/topics-and-qrels/qrels.cw09.all.txt
tools/topics-and-qrels/qrels.cw12.all.txt
tools/topics-and-qrels/qrels.disk12.all.txt
tools/topics-and-qrels/qrels.gov2.all.txt
tools/topics-and-qrels/qrels.mb11.all.txt
tools/topics-and-qrels/qrels.mb13.all.txt

# vscode related files
.settings
.factorypath
.vscode/

# elastirini and solrini installation
solrini/
elastirini/

# default directory where runs go
runs/

# default directory where logs go
logs/

# default directory where collections go
collections/

# default directory where indexes go
indexes/

# default output location of "Neural Hype" experiments: https://github.com/castorini/anserini/blob/master/docs/experiments-forum2018.md
fine_tuning_results/

# directory where we keep throw-away Java classes that we don't want checked into the repo.
src/main/java/io/anserini/scratch/

# these are just concatenation of TREC-COVID round 1 and round2 qrels, so no need to check in.
tools/topics-and-qrels/qrels.covid-round12.txt
tools/topics-and-qrels/qrels.covid-round2-cumulative.txt

# frontend related files
/src/main/frontend/node_modules
/src/main/frontend/.pnp
/src/main/frontend/.pnp.js
/src/main/frontend/coverage
/src/main/frontend/.next/
/src/main/frontend/out/
/src/main/frontend/build
/src/main/frontend/.DS_Store
/src/main/frontend/*.pem
/src/main/frontend/npm-debug.log*
/src/main/frontend/yarn-debug.log*
/src/main/frontend/yarn-error.log*
/src/main/frontend.pnpm-debug.log*
/src/main/frontend/.env*.local
/src/main/frontend/.vercel
/src/main/frontend/*.tsbuildinfo
/src/main/frontend/next-env.d.ts
/src/main/python/onnx/models/*.onnx
/src/main/python/parquet/venv*
*.jar
*.jar
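
To illustrate how ignore rules like the ones above behave, here is a rough Python sketch that approximates gitignore matching for file paths. It is not git's implementation: the helper name `is_ignored` is hypothetical, and negation (`!`), `**`, and escape handling are deliberately left out. It only assumes the commonly documented rules: a pattern without an inner slash matches at any depth, a pattern containing a slash is anchored to the repository root, and a trailing slash restricts the match to directories.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Approximate gitignore matching for a repo-relative FILE path.

    Simplified on purpose: no negation ("!"), no "**", no escapes.
    """
    parts = PurePosixPath(path).parts
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # blank lines and comments are skipped
        dir_only = pattern.endswith("/")
        pattern = pattern.rstrip("/")
        if "/" in pattern:
            # A slash at the start or in the middle anchors the pattern
            # to the repository root; compare component by component.
            pat_parts = PurePosixPath(pattern.lstrip("/")).parts
            head = parts[: len(pat_parts)]
            matched = len(head) == len(pat_parts) and all(
                fnmatchcase(name, pat) for name, pat in zip(head, pat_parts)
            )
            # A directory-only pattern needs something below the match.
            if matched and (not dir_only or len(parts) > len(pat_parts)):
                return True
        else:
            # No inner slash: the pattern may match any path component.
            # For directory-only patterns the final (file) name is excluded.
            names = parts[:-1] if dir_only else parts
            if any(fnmatchcase(name, pattern) for name in names):
                return True
    return False


# A few patterns taken from the .gitignore shown above.
PATTERNS = [
    "*.jar",
    "target/",
    "# frontend related files",
    "/src/main/frontend/node_modules",
    "src/test/test-index*",
]

for candidate in [
    "lib/anserini-fatjar.jar",
    "target/classes/Foo.class",
    "src/main/frontend/node_modules/react/index.js",
    "src/main/java/io/anserini/Foo.java",
]:
    print(candidate, "->", is_ignored(candidate, PATTERNS))
```

For an authoritative answer on a real checkout, `git check-ignore -v <path>` reports which rule, if any, matches a path.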
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "tools"]
	path = tools
	url = https://github.com/castorini/anserini-tools.git
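
As a small illustration of what this file declares, the sketch below reads the same three lines with Python's standard `configparser`. This is an approximation: git's config syntax is not strictly INI, so the approach is only assumed to work for simple files like this one, and `parse_gitmodules` is a hypothetical helper name, not part of any tool in the repository.

```python
import configparser

# The .gitmodules content from the diff above, with tab-indented keys.
GITMODULES_TEXT = (
    '[submodule "tools"]\n'
    "\tpath = tools\n"
    "\turl = https://github.com/castorini/anserini-tools.git\n"
)


def parse_gitmodules(text: str) -> dict[str, dict[str, str]]:
    """Map each submodule name to its settings (path, url, ...)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    submodules = {}
    for section in parser.sections():
        # Section headers look like: submodule "tools"
        name = section.split('"')[1] if '"' in section else section
        submodules[name] = dict(parser[section])
    return submodules


print(parse_gitmodules(GITMODULES_TEXT))
```

In a real checkout, `git config -f .gitmodules --list` gives the same information using git's own parser, which is the safer choice for anything beyond a quick inspection.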
191 changes: 191 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,191 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

Copyright 2019-2021 Anserini authors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.