Commit

Adding sub-sampling option, fixing inconsistencies in temporary file names, plus a few optimizations.

Summary: Adding sub-sampling option, fixing inconsistencies in temporary file names, plus a few optimizations.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
mnaumovfb committed Nov 7, 2019
1 parent 864bf0b commit 1c439ea
Showing 6 changed files with 234 additions and 264 deletions.
2 changes: 2 additions & 0 deletions cython/cython_criteo.py
@@ -35,6 +35,7 @@
)
# model related parameters
parser.add_argument("--max-ind-range", type=int, default=-1)
parser.add_argument("--data-sub-sample-rate", type=float, default=0.0) # in [0, 1]
parser.add_argument("--data-randomize", type=str, default="total") # or day or none
parser.add_argument("--memory-map", action="store_true", default=False)
parser.add_argument("--data-set", type=str, default="kaggle") # or terabyte
@@ -45,6 +46,7 @@
duc.loadDataset(
args.data_set,
args.max_ind_range,
args.data_sub_sample_rate,
args.data_randomize,
"train",
args.raw_data_file,
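
The hunks above only show the new `--data-sub-sample-rate` flag being parsed and passed to `duc.loadDataset`; they do not show how the rate is applied. The following is a minimal sketch of one plausible interpretation, under the assumption that each negative sample (label 0) is dropped with probability equal to the rate while positive samples are always kept. The helpers `rate_in_unit_interval` and `sub_sample_mask` are hypothetical names introduced for illustration and are not part of this commit.

```python
import argparse
import random


def rate_in_unit_interval(text):
    """Hypothetical argparse type: parse a float and reject values outside [0, 1]."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"expected a rate in [0, 1], got {value}")
    return value


def sub_sample_mask(labels, rate, seed=0):
    """Return a keep-mask over the samples.

    Assumption (not confirmed by the diff): a sample with label 0 is dropped
    with probability `rate`; samples with any other label are always kept.
    """
    rng = random.Random(seed)
    return [not (label == 0 and rng.random() < rate) for label in labels]


parser = argparse.ArgumentParser()
parser.add_argument(
    "--data-sub-sample-rate", type=rate_in_unit_interval, default=0.0
)  # in [0, 1]

# An explicit argument list keeps the sketch independent of sys.argv.
args = parser.parse_args(["--data-sub-sample-rate", "0.5"])

labels = [0, 1, 0, 0, 1, 0, 0, 0]
keep = sub_sample_mask(labels, args.data_sub_sample_rate)
kept_labels = [label for label, k in zip(labels, keep) if k]
```

Under this assumption the default of `0.0` keeps every sample, so existing invocations would behave as before, and a rate of `1.0` removes every negative sample. Validating the range at parse time is a design choice of the sketch; the commit itself declares the flag with `type=float` and only documents the range in a comment.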