Questions tagged [higher]
5 questions
3 votes, 0 answers
Adafactor from Hugging Face transformers only works with Transformers - does it not work with ResNets and MAML with higher?
To reproduce
I am running the MAML (with higher) meta-learning algorithm with a ResNet. This causes issues in my script (error message pasted below).
Is Adafactor not supposed to work with ResNets or other models?
Steps to reproduce the…

Charlie Parker
- 5,884
- 57
- 198
- 323
1 vote, 2 answers
What is the official implementation of first-order MAML using the higher PyTorch library?
After noticing that my custom implementation of first-order MAML might be wrong, I decided to look up the official way to do first-order MAML. I found a useful GitHub issue that suggests stopping the tracking of higher-order gradients. Which makes…

Charlie Parker
1 vote, 1 answer
When should one call .eval() and .train() when doing MAML with the PyTorch higher library?
I was going through the Omniglot MAML example and saw that it calls net.train() at the top of its testing code. This seems like a mistake, since it means the stats from each task at meta-testing are shared:
def test(db, net, device, epoch, log):
…

Charlie Parker
0 votes, 0 answers
How does one use the mean and std from training in Batch Norm?
I wanted to use the means and stds from training rather than batch statistics, since my model seems to diverge when I use batch statistics (as outlined in "When should one call .eval() and .train() when doing MAML with the PyTorch higher library?"). How does…

Charlie Parker
0 votes, 2 answers
How to have batch norm not forget the batch statistics it just used in PyTorch?
I am in an unusual setting where I should not use running statistics (as that would be considered cheating, e.g. in meta-learning). However, I often run a forward pass on a set of points (5 in fact) and then I want to evaluate only on 1 point using the…

Charlie Parker