Questions tagged [ray-tune]
72 questions
12 votes
3 answers
Change Logdir of Ray RLlib Training instead of ~/ray_results
I'm using Ray & RLlib to train RL agents on an Ubuntu system. Tensorboard is used to monitor the training progress by pointing it to ~/ray_results where all the log files for all runs are stored. Ray Tune is not being used.
For example, on starting…

Nyxynyx
- 61,411
- 155
- 482
- 830
9 votes
3 answers
Raytune is throwing error: "module 'pickle' has no attribute 'PickleBuffer'" when attempting hyperparameter search
I am more or less following this example to integrate the ray tune hyperparameter library with the huggingface transformers library using my own dataset.
Here is my script:
import ray
from ray import tune
from ray.tune import CLIReporter
from…

Luca Guarro
- 1,085
- 1
- 11
- 25
7 votes
1 answer
Checkpoint best model for a trial in ray tune
So I just ran a tune experiment and got the following output:
+--------------------+------------+-------+-------------+----------------+--------+------------+
| Trial name | status | loc | lr | weight_decay | loss | …

Kiran Sanjeevan
- 149
- 1
- 7
4 votes
1 answer
Ray[tune] for pytorch TypeError: ray.cloudpickle.dumps
I am having trouble getting started with tune from Ray. I have a PyTorch model to be trained and I am trying to fine-tune using this library. I am very new to Raytune so please bear with me and help me understand where the error stems from.
my…

CtrlMj
- 119
- 7
4 votes
1 answer
Out of memory at every second trial using Ray Tune
I am tuning hyperparameters with Ray Tune. The model is built with TensorFlow and occupies a large part of the available GPU memory. I noticed that every second call reports an out-of-memory error. It looks like the memory is being…

Emil Nowosielski
- 43
- 3
3 votes
1 answer
How do I checkpoint only the best model from a ray tune run?
NOTE: To some extent, this was already asked here but my question tackles a different aspect of getting the best checkpoint.
In the referenced question, the author only desired to retrieve the best checkpoint from a set of checkpoints after the ray…

c0mr4t
- 311
- 2
- 17
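For the question above about keeping only the best checkpoint, the underlying idea can be sketched without any Ray Tune API at all: overwrite a single file whenever the monitored score improves. This is a minimal illustration, not Ray Tune's mechanism; `BestCheckpointKeeper`, `report`, and the `best.json` file name are hypothetical names chosen for this sketch, and the "model state" is a plain dictionary.

```python
import json
import os
import tempfile


class BestCheckpointKeeper:
    """Hypothetical helper: keeps exactly one checkpoint file, the best so far."""

    def __init__(self, directory, mode="min"):
        self.path = os.path.join(directory, "best.json")  # assumed file name
        self.mode = mode  # "min" for losses, "max" for accuracies
        self.best = None

    def _improved(self, score):
        if self.best is None:
            return True
        return score < self.best if self.mode == "min" else score > self.best

    def report(self, score, state):
        """Write the checkpoint only if the score improved; return True if written."""
        if not self._improved(score):
            return False
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"score": score, "state": state}, f)
        os.replace(tmp_path, self.path)  # swap in the new best checkpoint
        self.best = score
        return True


with tempfile.TemporaryDirectory() as directory:
    keeper = BestCheckpointKeeper(directory, mode="min")
    for step, loss in enumerate([0.9, 0.5, 0.7], start=1):
        keeper.report(loss, {"step": step})
    print(keeper.best)
```

Writing to a temporary file and then calling `os.replace` means a crash mid-write leaves the previous best checkpoint intact.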
3 votes
1 answer
How to define SearchAlgorithm-agnostic, high-dimensional search space in Ray Tune?
I have two questions concerning Ray Tune. First, how can I define a hyperparameter search space independently of the particular SearchAlgorithm used? For instance, HyperOpt uses something like 'height': hp.uniform('height', -100, 100) whereas…

Rylan Schaeffer
- 1,945
- 2
- 28
- 50
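One way to think about the algorithm-agnostic search space asked about above is to keep a neutral, library-free description of each parameter and translate it per backend. The sketch below is an assumption-laden illustration in plain Python: the `SPACE` format and the `sample` function are hypothetical, and the only "backend" shown is random sampling with the standard library.

```python
import random

# Hypothetical neutral spec: parameter name -> (kind, *arguments).
SPACE = {
    "height": ("uniform", -100.0, 100.0),
    "activation": ("choice", ["relu", "tanh"]),
}


def sample(space, rng):
    """Translate the neutral spec into one concrete configuration."""
    config = {}
    for name, (kind, *args) in space.items():
        if kind == "uniform":
            low, high = args
            config[name] = rng.uniform(low, high)
        elif kind == "choice":
            (options,) = args
            config[name] = rng.choice(options)
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return config


print(sample(SPACE, random.Random(0)))
```

A second translator over the same spec could emit library-specific objects instead, such as the `hp.uniform('height', -100, 100)` form quoted in the question.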
2 votes
0 answers
Raytune tune.choice TypeError: int() argument must be a string, a bytes-like object or a number, not 'Categorical'
I am trying hyperparameter tuning with Ray Tune.
My current tune_config is shown in the code below:
self.tune_config = {
"batch_size": tune.choice([128, 256, 512]),
"epoch": tune.choice([50, 100, 200]),
"sequence_length": tune.choice([128,…

hjsg1010
- 165
- 3
- 13
2 votes
0 answers
ValueError: The actor ImplicitFunc is too large (106 MiB > FUNCTION_SIZE_ERROR_THRESHOLD=95 MiB)
While using the Ray Tune toolbox to find the optimal hyperparameters, I encountered the following error:
ValueError: The actor ImplicitFunc is too large (106 MiB > FUNCTION_SIZE_ERROR_THRESHOLD=95 MiB). Check that its definition is not implicitly…

Echolst 1
- 21
- 1
2 votes
0 answers
The actor died unexpectedly before finishing this task ( Ray1.7.0 , Sagemaker )
I am running Ray RLlib on SageMaker with an 8-core CPU using the sagemaker_rl library, with num_workers set to 7.
After a long execution, I get the error: The actor died unexpectedly before finishing this task
class MyLauncher(SageMakerRayLauncher):
def…

Amir Reza SH
- 21
- 4
2 votes
0 answers
Nested hyperparameters in Ray Tune?
I am using Ray Tune and I am disappointed by the lack of options for conditional / nested hyperparameters. It seems I will have to hack something together, but since I can't be the first one who had this problem I'm wondering how other people solved…

Florian Dietz
- 877
- 9
- 20
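The conditional / nested structure described in the question above can be illustrated in plain Python: sample the parent choice first, then sample only the children that apply to it. This is a generic sketch rather than a Ray Tune feature; the parameter names (`optimizer`, `momentum`, `beta1`) and their ranges are assumptions made up for the example.

```python
import random


def sample_config(rng):
    """Sample a parent hyperparameter, then only the children that depend on it."""
    optimizer = rng.choice(["sgd", "adam"])
    config = {
        "optimizer": optimizer,
        "lr": 10 ** rng.uniform(-4, -1),  # log-uniform learning rate
    }
    if optimizer == "sgd":
        config["momentum"] = rng.uniform(0.0, 0.99)
    else:
        config["beta1"] = rng.uniform(0.8, 0.99)
    return config


rng = random.Random(0)
for _ in range(3):
    print(sample_config(rng))
```

Because each configuration only contains the keys relevant to its branch, the training function can read them without guarding against unused parameters.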
2 votes
0 answers
How to get the optimal number of iterations in Ray Tune
If I'm using Ray Tune without a scheduler, how can I determine the number of iterations after which the network starts to overfit? That is, I need the iteration at which the model achieved the best score on a validation set.

Cat-with-a-pipe
- 135
- 9
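Independent of any Ray Tune API, the quantity asked about above is the position of the best value in the per-iteration validation history. A minimal sketch, assuming the scores have already been collected into a list (the `best_iteration` helper and the sample history are hypothetical):

```python
def best_iteration(val_scores, mode="max"):
    """Return (iteration, score) for the best validation score.

    val_scores holds one score per training iteration; iterations are 1-based.
    On ties, the earliest iteration is returned.
    """
    if not val_scores:
        raise ValueError("val_scores must not be empty")
    pick = max if mode == "max" else min
    index = pick(range(len(val_scores)), key=val_scores.__getitem__)
    return index + 1, val_scores[index]


# Made-up validation accuracies that peak and then decline (overfitting).
history = [0.61, 0.70, 0.74, 0.73, 0.71]
print(best_iteration(history))  # (3, 0.74)
```

For a loss-like metric, pass `mode="min"` so the lowest value is selected instead.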
2 votes
1 answer
When using ray tune, value defined in config returns a non-float value
I'm new to Ray Tune.
I defined my ray config as below:
ray_config = {
"estimator/dropout_rate": tune.uniform(0.0, 0.3),
"estimator/d_model": tune.choice([64]),
"estimator/num_encoder_layers": tune.choice([3]),
…

Ashikandi
- 47
- 5
2 votes
1 answer
What does 'output_dir' mean in transformers.TrainingArguments?
The Hugging Face documentation says 'The output directory where the model predictions and checkpoints will be written'. I don't quite understand what it means. Do I have to create any file for that?

abhishekkuber
- 45
- 6
2 votes
0 answers
Insufficient cluster resources to launch trial - has only 0 GPUs
I am following this tutorial (which is basically this) in order to use Ray Tune for hyperparameter optimization. My model is training fine on the GPU without the optimization but now I want to optimize.
I applied the tutorial to my code but when I…

m02ph3u5
- 3,022
- 7
- 38
- 51