ReAgent

mirror of https://github.com/facebookresearch/ReAgent.git synced 2026-05-17 12:40:39 +00:00

Author	SHA1	Message	Date
Alex Nikulkov	57b58a8b3a	add assertion for non-empty possible action mask (#557 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/557 See title Reviewed By: czxttkl Differential Revision: D31524614 fbshipit-source-id: e7aa7996de570f4ff990b402fbd23688a4ed12f4	2021-10-13 18:53:55 -07:00
Pavlos Athanasios Apostolopoulos	4ce275bc7e	Adding Bayesian Optimization Optimizer with ensemble of feedforward networks, independent Thompson sampling, and mutation. (#561 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/561 Bayesian Optimization Optimizer with ensemble of feedforward networks, ITS, and mutation based optimization. Reviewed By: czxttkl Differential Revision: D31424065 fbshipit-source-id: 8ffc1e7fd5de303cd572ea5bcd880429af67d173	2021-10-13 12:57:29 -07:00
Pavlos Athanasios Apostolopoulos	1e2b2656f7	Adding Bayesian Optimization Optimizer (#560 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/560 Bayesian Optimization Optimizer mutation-based optimization and acquisition function. Reviewed By: czxttkl Differential Revision: D31424105 fbshipit-source-id: 97872516e1c633071f983ebe6b254cbabee7b037	2021-10-13 12:57:01 -07:00
Pyre Bot Jr	2b65e91182	suppress errors in `reagent` Differential Revision: D31605682 fbshipit-source-id: 6c2d89926ecab45cdbbcdd48058ef3697f94f92b	2021-10-13 03:18:29 -07:00
Zhengxing Chen	4f8fe6592b	Fix ReAgentLightningModule (#559 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/559 cleanly_stop is a manually set variable which needs to be placed on the correct device. Otherwise we will see errors like in f301990179. Also, ddp is not needed in single cpu/gpu training. Reviewed By: alexnikulkov Differential Revision: D31530342 fbshipit-source-id: 98879fc130616aaccc454f939cd7cf2a704eb0eb	2021-10-12 14:55:13 -07:00
Alex Nikulkov	dba2fd9735	Convert possible_actions_mask to a Tensor (#556 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/556 Convert possible_actions_mask to a Tensor Reviewed By: czxttkl Differential Revision: D31497491 fbshipit-source-id: c0b8eb479b6be517a9c74c1d61ad68e4120d388a	2021-10-11 17:57:32 -07:00
Zhengxing Chen	b70c43e4ba	Improve REINFORCE trainer (#558 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/558 add some input check and simplify code Reviewed By: gji1 Differential Revision: D31529090 fbshipit-source-id: 0c38d9b927d0149256fa78d373687bc9048a0c85	2021-10-10 19:46:05 -07:00
Alex Nikulkov	4808562479	copy possible_action_mask from the env at each step instead of re-using the same variable (#555 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/555 The current implementation was buggy if the env was reusing the same variable for possible_actions_mask and modifying it in place. I fix the bug by copying the possible_action_mask values instead of assigning the variable directly. Reviewed By: czxttkl Differential Revision: D31487641 fbshipit-source-id: ebc70164e42dc097291a7aeecba60d2ef30117b3	2021-10-09 10:30:00 -07:00
Pyre Bot Jr	34fe167add	suppress errors in `reagent` Differential Revision: D31496257 fbshipit-source-id: 0f6b56075e4d24bdfd9d54bcecee90c5d86efbaf	2021-10-07 21:38:50 -07:00
Alex Nikulkov	bb357dc599	Move ReAgent MAB from numpy to PyTorch Summary: Replace numpy with PyTorch. This is a step towards using the standard ReAgent interface for MABs Reviewed By: czxttkl Differential Revision: D31423841 fbshipit-source-id: 04ccf92fba7b0f44ab6c19bdef3d098bf62394cf	2021-10-06 19:06:26 -07:00
Alex Nikulkov	46de5c36fb	add basic MAB classes to reagent Summary: Adding basic UCB MAB classes to ReAgent. 3 variants of UCB are added (including the one currently used for Ads Creative Exploration - MetricUCB) Supported functionality: 1. Batch training (feed in counts of samples and total reward from each arm). We'll use this mode for Ads Creative Exploration. 2. Online training (query the bandit for next action one step at a time). 3. Dumping the state of the bandit and loading it from a JSON string Reviewed By: czxttkl Differential Revision: D31355506 fbshipit-source-id: 978ec16cba289dc08af599a2c05bb49fcae2843a	2021-10-06 19:06:05 -07:00
Zhengxing Chen	d219a0c044	Change fb core types from namedtuple to dataclass (#554 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/554 as titled. This is one step towards a config/script-based rl orchestrator which can start necessary workflows automatically. Reviewed By: j-jiafei Differential Revision: D31334081 fbshipit-source-id: 0355b46396d922cf82f041734ffb8d20ceeab8e5	2021-10-06 18:45:51 -07:00
Fei Jia	f8bb0bf658	Change clamping of probability feature preprocessing. (#553 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/553 Using [0.01, 0.99] may cause some performance loss in boosting with entropy metrics. Reviewed By: czxttkl Differential Revision: D31346456 fbshipit-source-id: dae1ef0f6e36e67a182ced5793555e0d78dbf51e	2021-10-01 14:59:05 -07:00
Zhengxing Chen	9b7281d9b2	Fix last two circle ci tests (#552 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/552 By relaxing the threshold... Also set seeds Reviewed By: bankawas Differential Revision: D31334025 fbshipit-source-id: d5d666b2b5f5e5e4f06dea2a1353e85456f39a60	2021-10-01 13:50:14 -07:00
Danielle Pintz	48a5a286a7	Deprecate TrainerProperties Mixin and move property definitions directly into `trainer.py` (#9495 ) Summary: ### New commit log messages 290398f81 Deprecate TrainerProperties Mixin and move property definitions directly into `trainer.py` (#9495) Reviewed By: ananthsub Differential Revision: D31317981 fbshipit-source-id: 9a6270f326cebb59ef5fb53b8db9d0797f62be77	2021-09-30 22:08:26 -07:00
Zhengxing Chen	603387e052	Fix gym_cpu_unittest (#551 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/551 as titled Reviewed By: igfox Differential Revision: D31296738 fbshipit-source-id: 3672485ccd230f9b1a029f90759bdf598f5990e4	2021-09-30 12:01:18 -07:00
Danielle Pintz	2e716827a3	Remove `ABC` from `LightningModule` (#9517 ) Summary: ### New commit log messages 3aba9d16a Remove `ABC` from `LightningModule` (#9517) Reviewed By: ananthsub Differential Revision: D31296721 fbshipit-source-id: a9992486c61a6f86fb251f2733bbc9311d93f293	2021-09-30 09:47:34 -07:00
Zhengxing Chen	05179022df	Add test_gym_replay_buffer (#549 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/549 Tests for replay buffer's behavior Reviewed By: alexnikulkov Differential Revision: D30978005 fbshipit-source-id: aa034db5699071654d607fe7795bc8be232157c2	2021-09-29 21:46:45 -07:00
Zhengxing Chen	c41b961df9	Fix rasp tests (#550 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/550 update miniconda and update T101565175 Reviewed By: gji1 Differential Revision: D31290939 fbshipit-source-id: cbecdb63048fb3fb79a7b7eb87406408309026c1	2021-09-29 21:21:25 -07:00
Zhengxing Chen	b5afcc01e6	Allow obj_func be optional (#548 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/548 as titled Reviewed By: gji1 Differential Revision: D31217654 fbshipit-source-id: 514ab8ae7561b8a5a7ff5094642314f83c6b5be1	2021-09-29 21:21:03 -07:00
Ian Fox	c703915d80	Update docstring for transforms.py Summary: I found some of the documentation confusing, this is an attempt to clarify the functionality of the code. Reviewed By: czxttkl Differential Revision: D31071280 fbshipit-source-id: 62e7e299d40e7a431ed29dea0c6582646a855fd9	2021-09-29 17:55:58 -07:00
Yunus Emre	57f27dbb36	Adds unit test for columnvector function Summary: Adds unit test to the test_processing.py for columnvector function from transform.py Reviewed By: igfox Differential Revision: D31247953 fbshipit-source-id: 8e6eee0fecf3dfb0bff8fb3d168e15f002c0acf3	2021-09-29 10:12:04 -07:00
Bo Gong	5f0b21ee15	Add a unit test for OneHotActions Summary: Add a unit test for OneHotActions. Reviewed By: igfox Differential Revision: D31248082 fbshipit-source-id: 74d55ab5d3a23c75f5d0020b53616c87023afcf0	2021-09-29 09:59:38 -07:00
Wei Wen	e6b2e6ed2b	Super net config sampling Summary: 1. super net sampling (with Reagent APIs) 2. Other utils to support 1 2.1. update `SuperNNConfig` attribute by a path str so that samples from Reagent ng.p.Dict can be easily mapped to masks within `SuperNNConfig`: `replace_named_tuple_by_path` 3. test samples such that counts of masks are close to configured probabilities Reviewed By: dehuacheng Differential Revision: D31126805 fbshipit-source-id: 95e48728773c2afd7e6856f8a7a831b00214bbda	2021-09-27 09:19:51 -07:00
Avery Faller	67c0a559e3	Unit Test for SlateView (#546 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/546 Write a unit test for SlateView class to test that it functions as expected and to ensure it raises errors when it should Reviewed By: igfox Differential Revision: D31151826 fbshipit-source-id: e5750eff2a256c04ab5740d94917cee321c0265e	2021-09-24 09:38:56 -07:00
Pierre Gleize	99e3c0d180	Add unit test for FixedLengthSequenceDenseNormalization. (#545 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/545 Reviewed By: igfox Differential Revision: D31136906 fbshipit-source-id: 63e7b2555bff4a6cda8487f85218473ed736a4c9	2021-09-23 18:14:32 -07:00
Siyu Wang	312cf971bb	Share the training step output data via `ClosureResult` (#9349 ) Summary: ### New commit log messages e0f2e041b Share the training step output data via `ClosureResult` (#9349) Reviewed By: kandluis Differential Revision: D31058705 fbshipit-source-id: 1b7b59087129406c0164b30b49a40383c65e6250	2021-09-23 08:54:29 -07:00
Eric Spellman	042820ac4f	Adding transform.StackDenseFixedSizeArray unit test (#544 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/544 Adding unit test for transforms.StackDenseFixedSizeArray Reviewed By: igfox Differential Revision: D31114407 fbshipit-source-id: acd1a15c524ca2a990b879e31bea2832c8549be2	2021-09-22 16:58:44 -07:00
Pavlos Athanasios Apostolopoulos	fd11fe3669	Create a Unit Test for FixedLengthSequences (#543 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/543 Creating a unit test to cover FixedLengthSequences function. Reviewed By: igfox Differential Revision: D31084450 fbshipit-source-id: 747caa5669ea6f353009236311f66c2ba2bd20a2	2021-09-21 16:54:35 -07:00
Ian Fox	7dc90f320f	Fix preprocessor error (#541 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/541 fix https://app.circleci.com/pipelines/github/facebookresearch/ReAgent/1963/workflows/5b311365-d50c-4e91-8bd7-21db74c2ef7c/jobs/15000 Data preprocessing will happen on cpu. Then preprocessed data will be moved to gpu by pytorch lightning. Reviewed By: gji1 Differential Revision: D31057900 fbshipit-source-id: ae6bb1ad62cec40a3deb91f8f00120cdd1281435	2021-09-21 16:33:06 -07:00
Ananth Subramaniam	5918384fb0	Update Lightning version (#542 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/542 Update Lightning version for ReAgent Reviewed By: igfox Differential Revision: D31092583 fbshipit-source-id: 0d7d7d37caa01e5b95d3ce233e3d6e62fff6139b	2021-09-21 13:44:55 -07:00
Ananth Subramaniam	8ae9850277	Enforce that the optimizer closure is executed when `optimizer_step` is overridden (#9360 ) Summary: ### New commit log messages 15d943089 Enforce that the optimizer closure is executed when `optimizer_step` is overridden (#9360) Reviewed By: kandluis Differential Revision: D30817624 fbshipit-source-id: 653debef741fb59736b07b960bc6505d466f1105	2021-09-21 09:27:48 -07:00
Zhengxing Chen	a94e01ec14	Refactor reagent lite (#539 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/539 1. fix anneal rate and temperature 2. remove maintaining self.step 3. make every optimizer calls sample_internal() and update_params() in _optimize_step(). Users with tailored needs will call sample_internal() and update_params() manually for performing an optimization step. Reviewed By: dehuacheng Differential Revision: D30947741 fbshipit-source-id: e45ab20baefb2422e40931785f4578f98bf58ec4	2021-09-20 10:32:45 -07:00
Wonjae Lee	0d2f8c7a4d	Create a Unit Test for MapIDListFeatures (#540 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/540 create a unit test to cover the MapIDListFeatures function. Reviewed By: igfox Differential Revision: D31007991 fbshipit-source-id: 9f9299f7494f7822f6d43032501104795efa1d95	2021-09-17 16:26:58 -07:00
Leo Huang	60f23d0358	write test for test_MaskByPresence Summary: add test for mask by presence Reviewed By: igfox Differential Revision: D30993349 fbshipit-source-id: a870fa8fe3773ca4dfac91d781b80701a5e6719c	2021-09-17 14:30:52 -07:00
Zhengxing Chen	8b9b2427fa	Add constructor method for nevergrad (#538 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/538 Support specifying estimated_budgets and optimizer_name. Reviewed By: teytaud Differential Revision: D30912782 fbshipit-source-id: e4dd8804face839bb6175afd22944dd7893fe5c7	2021-09-15 07:26:09 -07:00
Zhengxing Chen	5d2f27de6d	Type fix for lite optimizer Summary: as titled Reviewed By: wenwei202 Differential Revision: D30909621 fbshipit-source-id: a76f5298566dfc05360f83be565f91714eac4084	2021-09-14 13:36:03 -07:00
Zhengxing Chen	7f5dfe7262	Fix data loader identity (#537 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/537 We need to have a unique identity for each epoch and dataset type (train/val/test). We must use cpu-based batch preprocessor Some other small fixes. Reviewed By: j-jiafei Differential Revision: D30861672 fbshipit-source-id: e89a1a03bc345123a164987c3f4c7876fc783b93	2021-09-12 15:24:57 -07:00
Wei Wen	345be18595	Add a function to convert idx to raw choices. More tests with probability assert. Summary: add function to convert idx to raw choices. More tests with probability assert. Reviewed By: czxttkl Differential Revision: D30824852 fbshipit-source-id: 502c814f8cf629603fa7ee9576706d1833ca182e	2021-09-09 08:53:03 -07:00
Zhengxing Chen	a8c9b70ca7	Fix type hint in Optimizers (#536 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/536 np.array -> np.ndarray Reviewed By: wenwei202 Differential Revision: D30812091 fbshipit-source-id: 52e6fea3be48983981e28b49b5e709593951763f	2021-09-08 17:11:18 -07:00
Pyre Bot Jr	e66d29a462	suppress errors in `reagent` Differential Revision: D30797764 fbshipit-source-id: c7c9fa99d5de21acb6917e7d70ade5049e20bab3	2021-09-07 20:52:37 -07:00
Zhengxing Chen	fd32017df6	Read partitioned data by Koski when distributed training is turned on (#505 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/505 When we set `reader_options.min_nodes` > 1, we turn on distributed training. The koski reader in each trainer process should only read `1/min_nodes` data. Reviewed By: j-jiafei Differential Revision: D28779856 fbshipit-source-id: 9665c6b65b6d02066ae38d2f37be8d268c624797	2021-09-04 19:31:00 -07:00
Zhengxing Chen	ab1ebc3d19	ReAgent Lite API (#531 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/531 A lite API for solving combinatorial problems. Currently only support discrete input spaces. Reviewed By: kittipatv Differential Revision: D30453019 fbshipit-source-id: 47d0cdb12ef4e2b7b26d1a00a90f70016ba67af0	2021-09-03 16:02:03 -07:00
Ian Fox	7b4374dd41	Add max_weight parameter to CRR Summary: Exposes the upper bound clip limit for action weights in CRR as a max_weight parameter Reviewed By: DavidV17 Differential Revision: D30739945 fbshipit-source-id: 3a8273d32f0566e4801ae30c90703e880a4f6691	2021-09-02 18:23:04 -07:00
Kittipat Virochsiri	d52b64e3e7	Disable parallel policy evaluation by default (#534 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/534 Catching PickleError stops working as it's now RuntimeError. Since RuntimeError is quite generic, I don't think it's a good idea to catch it. Therefore, let's just disable parallel evaluation. Reviewed By: igfox Differential Revision: D30730645 fbshipit-source-id: 4f9be1dd5fd9e559d76c6cda0aaa183da410d2ed	2021-09-02 13:53:05 -07:00
Kittipat Virochsiri	e690184db4	update CircleCI config (#533 ) Summary: Gym will be installed by tox before running unittests. No need to install Gym outside of virtual env. Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/533 Reviewed By: czxttkl Differential Revision: D30731643 fbshipit-source-id: 19ad746de6712bebb89770366b3d04a65294eeb9	2021-09-02 12:32:21 -07:00
Kittipat Virochsiri	ca2dc4ce0b	Ensure feature type override works as expected Summary: Some choices of feature type overrides were not respected. Reviewed By: DavidV17 Differential Revision: D30658323 fbshipit-source-id: 5d6d2f54a7904ef47b5c1e89fdca858cb0af5c61	2021-08-31 09:35:31 -07:00
Kittipat Virochsiri	cc6a4a3dbc	Adding modulo ID-list mapping Summary: A lighter weight way to experiment with sparse features Reviewed By: czxttkl Differential Revision: D30560575 fbshipit-source-id: 21ea8b560c0578e81f3ddf127b017db16630da3c	2021-08-30 17:52:21 -07:00
Ian Fox	cf72bf1b43	Adding transform unit tests (#532 ) Summary: Pull Request resolved: https://github.com/facebookresearch/ReAgent/pull/532 Adding unit tests to cover some functions in transform.py I'm leaving some methods uncovered in this diff to try out bootcamping unit test creation Reviewed By: czxttkl Differential Revision: D30607144 fbshipit-source-id: 08a993ab8afadd49cc30c6b691989b8f867a151a	2021-08-30 12:41:35 -07:00
Kittipat Virochsiri	a6d5394031	Minor typing fixes Summary: make pyre complain less Reviewed By: czxttkl Differential Revision: D30560574 fbshipit-source-id: ec419dd2ec0fae0285f916d61d6f262e1732eb00	2021-08-30 11:51:35 -07:00

1339 Commits