Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/559
cleanly_stop is a manually set variable that needs to be placed on the correct device. Otherwise we will see errors like the one in f301990179.
Also, DDP is not needed in single CPU/GPU training.
Reviewed By: alexnikulkov
Differential Revision: D31530342
fbshipit-source-id: 98879fc130616aaccc454f939cd7cf2a704eb0eb
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/555
The previous implementation was buggy when the env reused the same variable for possible_actions_mask and modified it in place. This fixes the bug by copying the possible_actions_mask values instead of assigning the variable directly.
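The aliasing problem described above can be illustrated with a minimal sketch. `ToyEnv` and the two `record_*` helpers are hypothetical and are not ReAgent code; they only show why storing a reference to a mask that the env mutates in place corrupts earlier records, while storing a copy does not.

```python
import copy


class ToyEnv:
    """Hypothetical env that reuses one mask list and mutates it in place."""

    def __init__(self, num_actions):
        self.possible_actions_mask = [1] * num_actions

    def step(self, disabled_action):
        # In-place update: the same list object is returned on every step.
        self.possible_actions_mask[disabled_action] = 0
        return self.possible_actions_mask


def record_by_reference(env, steps):
    # Buggy: every stored entry points to the same list object.
    return [env.step(a) for a in steps]


def record_by_copy(env, steps):
    # Fixed: each stored entry is an independent snapshot of the mask.
    return [copy.copy(env.step(a)) for a in steps]


buggy = record_by_reference(ToyEnv(3), [0, 1])
fixed = record_by_copy(ToyEnv(3), [0, 1])
```

With tensors instead of lists, the analogous snapshot would typically be taken with `clone()`.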
Reviewed By: czxttkl
Differential Revision: D31487641
fbshipit-source-id: ebc70164e42dc097291a7aeecba60d2ef30117b3
Summary: Replace NumPy with PyTorch. This is a step towards using the standard ReAgent interface for MABs.
Reviewed By: czxttkl
Differential Revision: D31423841
fbshipit-source-id: 04ccf92fba7b0f44ab6c19bdef3d098bf62394cf
Summary:
Adding basic UCB MAB classes to ReAgent.
3 variants of UCB are added (including the one currently used for Ads Creative Exploration - MetricUCB)
Supported functionality:
1. Batch training (feed in counts of samples and total reward from each arm). We'll use this mode for Ads Creative Exploration.
2. Online training (query the bandit for next action one step at a time).
3. Dumping the state of the bandit and loading it from a JSON string
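The three modes listed above can be sketched with a plain UCB1 bandit. This is an illustration only: `ToyUCB1` and all of its method names are hypothetical, it uses the standard UCB1 score rather than any of the variants added in this diff, and it uses plain Python lists rather than the ReAgent classes.

```python
import json
import math


class ToyUCB1:
    """Hypothetical UCB1 bandit; not the ReAgent implementation."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.total_rewards = [0.0] * n_arms

    def add_batch_observations(self, counts, rewards):
        # Batch mode: per-arm sample counts and summed rewards.
        for arm, (n, r) in enumerate(zip(counts, rewards)):
            self.counts[arm] += n
            self.total_rewards[arm] += r

    def add_single_observation(self, arm, reward):
        # Online mode: one pull at a time.
        self.counts[arm] += 1
        self.total_rewards[arm] += reward

    def get_action(self):
        # Pull every arm once before applying the UCB formula.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            r / n + math.sqrt(2.0 * math.log(total) / n)
            for n, r in zip(self.counts, self.total_rewards)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def to_json(self):
        return json.dumps(
            {"counts": self.counts, "total_rewards": self.total_rewards}
        )

    @classmethod
    def from_json(cls, s):
        state = json.loads(s)
        bandit = cls(len(state["counts"]))
        bandit.counts = state["counts"]
        bandit.total_rewards = state["total_rewards"]
        return bandit


bandit = ToyUCB1(3)
bandit.add_batch_observations([10, 10, 10], [2.0, 8.0, 5.0])
restored = ToyUCB1.from_json(bandit.to_json())
```

Because all arms have the same count here, the exploration bonus is identical and the arm with the highest mean reward is selected; the restored bandit makes the same choice.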
Reviewed By: czxttkl
Differential Revision: D31355506
fbshipit-source-id: 978ec16cba289dc08af599a2c05bb49fcae2843a
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/554
as titled.
This is one step towards a config/script-based rl orchestrator which can start necessary workflows automatically.
Reviewed By: j-jiafei
Differential Revision: D31334081
fbshipit-source-id: 0355b46396d922cf82f041734ffb8d20ceeab8e5
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/553
Using [0.01, 0.99] may cause some performance loss in boosting with entropy metrics.
Reviewed By: czxttkl
Differential Revision: D31346456
fbshipit-source-id: dae1ef0f6e36e67a182ced5793555e0d78dbf51e
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/552
By relaxing the threshold...
Also set seeds
Reviewed By: bankawas
Differential Revision: D31334025
fbshipit-source-id: d5d666b2b5f5e5e4f06dea2a1353e85456f39a60
Summary: I found some of the documentation confusing; this is an attempt to clarify what the code does.
Reviewed By: czxttkl
Differential Revision: D31071280
fbshipit-source-id: 62e7e299d40e7a431ed29dea0c6582646a855fd9
Summary: Adds a unit test to test_processing.py for the columnvector function from transform.py.
Reviewed By: igfox
Differential Revision: D31247953
fbshipit-source-id: 8e6eee0fecf3dfb0bff8fb3d168e15f002c0acf3
Summary: Add a unit test for OneHotActions.
Reviewed By: igfox
Differential Revision: D31248082
fbshipit-source-id: 74d55ab5d3a23c75f5d0020b53616c87023afcf0
Summary:
1. Super net sampling (with ReAgent APIs).
2. Other utils to support item 1.
2.1. `replace_named_tuple_by_path`: update a `SuperNNConfig` attribute by a path string so that samples from the ReAgent ng.p.Dict can be easily mapped to masks within `SuperNNConfig`.
3. Test samples such that the counts of masks are close to the configured probabilities.
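The path-based update in item 2.1 can be sketched as follows. This is not the actual `replace_named_tuple_by_path` implementation, whose signature is not shown here; `replace_by_path`, `LayerConfig`, and `ToySuperNNConfig` are hypothetical names, and a dot-separated path is an assumption.

```python
from typing import Any, NamedTuple


class LayerConfig(NamedTuple):
    mask: tuple
    width: int


class ToySuperNNConfig(NamedTuple):
    encoder: LayerConfig
    decoder: LayerConfig


def replace_by_path(config: Any, path: str, value: Any) -> Any:
    """Return a copy of a nested NamedTuple with the field at `path` replaced."""
    head, _, rest = path.partition(".")
    if rest:
        # Recurse into the child, then rebuild this level around the new child.
        value = replace_by_path(getattr(config, head), rest, value)
    return config._replace(**{head: value})


config = ToySuperNNConfig(
    encoder=LayerConfig(mask=(1, 1), width=8),
    decoder=LayerConfig(mask=(1, 1), width=4),
)
updated = replace_by_path(config, "encoder.mask", (1, 0))
```

Since NamedTuples are immutable, the sketch returns a new config and leaves the original untouched, which keeps sampled configurations independent of each other.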
Reviewed By: dehuacheng
Differential Revision: D31126805
fbshipit-source-id: 95e48728773c2afd7e6856f8a7a831b00214bbda
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/546
Write a unit test for the SlateView class to check that it functions as expected and raises errors when it should.
Reviewed By: igfox
Differential Revision: D31151826
fbshipit-source-id: e5750eff2a256c04ab5740d94917cee321c0265e
Summary:
### New commit log messages
e0f2e041b Share the training step output data via `ClosureResult` (#9349)
Reviewed By: kandluis
Differential Revision: D31058705
fbshipit-source-id: 1b7b59087129406c0164b30b49a40383c65e6250
Summary:
### New commit log messages
15d943089 Enforce that the optimizer closure is executed when `optimizer_step` is overridden (#9360)
Reviewed By: kandluis
Differential Revision: D30817624
fbshipit-source-id: 653debef741fb59736b07b960bc6505d466f1105
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/539
1. Fix anneal rate and temperature.
2. Stop maintaining self.step.
3. Make every optimizer call sample_internal() and update_params() in _optimize_step(). Users with tailored needs can call sample_internal() and update_params() manually to perform an optimization step.
Reviewed By: dehuacheng
Differential Revision: D30947741
fbshipit-source-id: e45ab20baefb2422e40931785f4578f98bf58ec4
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/540
Create a unit test to cover the MapIDListFeatures function.
Reviewed By: igfox
Differential Revision: D31007991
fbshipit-source-id: 9f9299f7494f7822f6d43032501104795efa1d95
Summary: add test for mask by presence
Reviewed By: igfox
Differential Revision: D30993349
fbshipit-source-id: a870fa8fe3773ca4dfac91d781b80701a5e6719c
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/537
We need to have a unique identity for each epoch and dataset type (train/val/test).
We must use a CPU-based batch preprocessor.
Some other small fixes.
Reviewed By: j-jiafei
Differential Revision: D30861672
fbshipit-source-id: e89a1a03bc345123a164987c3f4c7876fc783b93
Summary: Add a function to convert idx to raw choices. Add more tests with probability assertions.
Reviewed By: czxttkl
Differential Revision: D30824852
fbshipit-source-id: 502c814f8cf629603fa7ee9576706d1833ca182e
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/505
When we set `reader_options.min_nodes` > 1, we turn on distributed training. The koski reader in each trainer process should read only `1/min_nodes` of the data.
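A minimal sketch of the `1/min_nodes` split is shown below. How the koski reader actually partitions the data is not stated here; the round-robin assignment, the `shard_for_trainer` helper, and the `rank` argument are all assumptions made for illustration.

```python
def shard_for_trainer(items, rank, min_nodes):
    """Return the slice of `items` assigned to trainer `rank` out of `min_nodes`."""
    if not 0 <= rank < min_nodes:
        raise ValueError("rank must be in [0, min_nodes)")
    # Strided split: trainer r reads items r, r + min_nodes, r + 2 * min_nodes, ...
    return items[rank::min_nodes]


partitions = list(range(10))
shards = [shard_for_trainer(partitions, r, 4) for r in range(4)]
```

Every item lands in exactly one shard, so the trainers together cover the dataset once without overlap.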
Reviewed By: j-jiafei
Differential Revision: D28779856
fbshipit-source-id: 9665c6b65b6d02066ae38d2f37be8d268c624797
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/531
A lite API for solving combinatorial problems. Currently supports only discrete input spaces.
Reviewed By: kittipatv
Differential Revision: D30453019
fbshipit-source-id: 47d0cdb12ef4e2b7b26d1a00a90f70016ba67af0
Summary: Exposes the upper bound clip limit for action weights in CRR as a max_weight parameter
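The effect of an upper clip on action weights can be sketched as follows, assuming the exponential-advantage form of CRR weights. The function name, the `beta` temperature parameter, and both default values are assumptions for illustration and are not taken from the ReAgent trainer.

```python
import math


def crr_action_weight(advantage, beta=1.0, max_weight=20.0):
    """Exponential advantage weight, clipped from above at `max_weight`."""
    # Defaults are illustrative assumptions, not ReAgent's values.
    return min(math.exp(advantage / beta), max_weight)
```

Exposing `max_weight` lets a user bound how strongly a single high-advantage action can dominate the policy loss.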
Reviewed By: DavidV17
Differential Revision: D30739945
fbshipit-source-id: 3a8273d32f0566e4801ae30c90703e880a4f6691
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/534
Catching PickleError no longer works because the error is now raised as RuntimeError. Since RuntimeError is quite generic, I don't think it's a good idea to catch it. Therefore, let's just disable parallel evaluation.
Reviewed By: igfox
Differential Revision: D30730645
fbshipit-source-id: 4f9be1dd5fd9e559d76c6cda0aaa183da410d2ed
Summary:
Gym will be installed by tox before running unit tests. No need to install Gym outside of the virtual env.
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/533
Reviewed By: czxttkl
Differential Revision: D30731643
fbshipit-source-id: 19ad746de6712bebb89770366b3d04a65294eeb9
Summary: Some choices of feature type overrides were not respected.
Reviewed By: DavidV17
Differential Revision: D30658323
fbshipit-source-id: 5d6d2f54a7904ef47b5c1e89fdca858cb0af5c61
Summary: A lighter weight way to experiment with sparse features
Reviewed By: czxttkl
Differential Revision: D30560575
fbshipit-source-id: 21ea8b560c0578e81f3ddf127b017db16630da3c
Summary:
Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/532
Adding unit tests to cover some functions in transform.py.
I'm leaving some methods uncovered in this diff to try out bootcamping unit test creation.
Reviewed By: czxttkl
Differential Revision: D30607144
fbshipit-source-id: 08a993ab8afadd49cc30c6b691989b8f867a151a