`{mem: 4, cores: 1}`.
Change the status from `Processed` to Approved.
In `vgcn-infrastructure`, run `./add-training.sh` with the following parameters:
Usage: `./add-training.sh <training-identifier> <vm-size (e.g. c1.c28m225d50)> <vm-count> <start in YYYY-mm-dd> <end in YYYY-mm-dd> [-- donotautocommitpush]`
The script updates `resources.yaml` with the information for the training.
If `donotautocommitpush` was not used, the script will commit and push.
Once `todays_date >= start date`, vgcn-infrastructure will automatically try to launch the VM.

Question | Answer |
---|---|
What if there are multiple trainings? | It has not happened yet. If it does, it will most likely be with long-running trainings, and we usually give those fewer or smaller machines. |
What is the recommended machine? | `c1.c28m225d50`, which is generally used for training, so no changes to the main queue are needed. |
What is the format for the date? | YYYY-mm-dd |
How many machines can be assigned? | There is a maximum of 7 machines. |
How do I estimate the resources accurately? | It is hard to estimate the number correctly without more information, and that information is not easy to collect. Usually we know nothing about their time limitations: they may expect a step to run in 1 minute, but we do not know that. We also do not know whether they will run a dataset collection. |
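
The constraints from the usage line and the table above (dates in `YYYY-mm-dd`, at most 7 machines) can be checked before calling the script. The following is a minimal sketch, not part of vgcn-infrastructure: the helper names `is_date` and `check_training` and the identifier `demo-training` are hypothetical, and the check that the end date is not before the start date is an assumption rather than something the source states. It only prints the command line it would run.

```sh
#!/bin/sh
# Hypothetical pre-flight check; not part of vgcn-infrastructure.
MAX_MACHINES=7   # limit stated in the table above

is_date() {
    # Accept only the YYYY-mm-dd shape (calendar validity is not checked).
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

check_training() {
    # Arguments: identifier, vm-size, vm-count, start, end
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "error: missing identifier or vm-size"; return 1
    fi
    case $3 in
        ''|*[!0-9]*|0*) echo "error: vm-count must be a positive integer"; return 1 ;;
    esac
    if [ "$3" -gt "$MAX_MACHINES" ]; then
        echo "error: at most $MAX_MACHINES machines"; return 1
    fi
    if ! is_date "$4" || ! is_date "$5"; then
        echo "error: dates must be YYYY-mm-dd"; return 1
    fi
    # With the hyphens removed, YYYYmmdd values compare correctly as integers.
    if [ "$(echo "$5" | tr -d '-')" -lt "$(echo "$4" | tr -d '-')" ]; then
        echo "error: end is before start"; return 1
    fi
    # Assumed behaviour: only print the command; running it is left to the admin.
    echo "./add-training.sh $1 $2 $3 $4 $5"
}
```

For example, `check_training demo-training c1.c28m225d50 2 2030-01-13 2030-01-17` prints the corresponding `./add-training.sh` command line, while a `vm-count` of 8 is rejected with an error message.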