Created April 10, 2024 21:42
Run docker pull us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_dev
docker pull us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_dev
docker run -v $HOME/data/:/data/ -v $HOME/experiment_runs/:/experiment_runs -v $HOME/experiment_runs/logs:/logs --gpus all --ipc=host us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_dev -d imagenet -f pytorch -s reference_algorithms/paper_baselines/adamw/pytorch/submission.py -w imagenet_resnet -t reference_algorithms/paper_baselines/adamw/tuning_search_space.json -e tests/regression_tests/adamw -m 10 -c False -o True -r false
shell: /usr/bin/bash -e {0}
Using default tag: latest
latest: Pulling from training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_dev
Digest: sha256:b63f6d6b1eb7610299ff239af48be392e40bcf3a8922a777e29c79043475d245
Status: Image is up to date for us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_dev:latest
us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_dev:latest
torchrun --redirects 1:0,2:0,3:0,4:0,5:0,6:0,7:0 --standalone --nnodes=1 --nproc_per_node=8 submission_runner.py --framework=pytorch --workload=imagenet_resnet --submission_path=reference_algorithms/paper_baselines/adamw/pytorch/submission.py --data_dir=/data/imagenet/pytorch --num_tuning_trials=1 --experiment_dir=/experiment_runs --experiment_name=tests/regression_tests/adamw --overwrite=True --save_checkpoints=False --max_global_steps=10 --imagenet_v2_data_dir=/data/imagenet/pytorch --torch_compile=true --tuning_ruleset=external --tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json 2>&1 | tee -a /logs/imagenet_resnet_pytorch_03-31-2024-08-09-56.log
[2024-03-31 08:09:58,279] torch.distributed.run: [WARNING] master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
[2024-03-31 08:09:58,279] torch.distributed.run: [WARNING]
[2024-03-31 08:09:58,279] torch.distributed.run: [WARNING] *****************************************
[2024-03-31 08:09:58,279] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2024-03-31 08:09:58,279] torch.distributed.run: [WARNING] *****************************************
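The torchrun warning above says OMP_NUM_THREADS is set to 1 per worker by default. torchrun only applies that default when the variable is unset, so one way to tune it is to set it in the launcher's environment before the workers spawn. A minimal sketch; the value 8 is purely an example, not a recommendation for this workload:

```python
import os

# torchrun forces OMP_NUM_THREADS=1 per worker unless it is already set,
# so exporting it in the launching shell/process overrides the default.
# A common heuristic is (cpu cores) / (nproc_per_node); 8 is illustrative.
os.environ["OMP_NUM_THREADS"] = "8"
print(os.environ["OMP_NUM_THREADS"])
```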
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:
TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
warnings.warn(
I0331 08:10:12.752967 140111137965888 logger_utils.py:61] Removing existing experiment directory /experiment_runs/tests/regression_tests/adamw/imagenet_resnet_pytorch because --overwrite was set.
I0331 08:10:12.755207 140111137965888 logger_utils.py:76] Creating experiment directory at /experiment_runs/tests/regression_tests/adamw/imagenet_resnet_pytorch.
I0331 08:10:12.790524 140111137965888 submission_runner.py:561] Using RNG seed 1054528477
I0331 08:10:12.791703 140111137965888 submission_runner.py:570] --- Tuning run 1/1 ---
I0331 08:10:12.791840 140111137965888 submission_runner.py:575] Creating tuning directory at /experiment_runs/tests/regression_tests/adamw/imagenet_resnet_pytorch/trial_1.
I0331 08:10:12.792509 140111137965888 logger_utils.py:92] Saving hparams to /experiment_runs/tests/regression_tests/adamw/imagenet_resnet_pytorch/trial_1/hparams.json.
I0331 08:10:13.060805 140111137965888 submission_runner.py:215] Initializing dataset.
I0331 08:10:25.238856 140111137965888 submission_runner.py:226] Initializing model.
I0331 08:10:34.396007 140111137965888 submission_runner.py:264] Performing `torch.compile`.
I0331 08:10:37.181028 140111137965888 submission_runner.py:268] Initializing optimizer.
I0331 08:10:37.183613 140111137965888 submission_runner.py:275] Initializing metrics bundle.
I0331 08:10:37.183767 140111137965888 submission_runner.py:293] Initializing checkpoint and logger.
I0331 08:10:37.184618 140111137965888 submission_runner.py:313] Saving meta data to /experiment_runs/tests/regression_tests/adamw/imagenet_resnet_pytorch/trial_1/meta_data_0.json.
I0331 08:10:37.725978 140111137965888 submission_runner.py:317] Saving flags to /experiment_runs/tests/regression_tests/adamw/imagenet_resnet_pytorch/trial_1/flags_0.json.
I0331 08:10:37.773440 140111137965888 submission_runner.py:327] Starting training loop.
[2024-03-31 08:10:40,154] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,183] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,352] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,388] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,404] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,406] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,411] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:10:40,442] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py:251: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [512, 2048, 1, 1], strides() = [2048, 1, 2048, 2048]
bucket_view.sizes() = [512, 2048, 1, 1], strides() = [2048, 1, 1, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:320.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
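The DDP warning above reports a gradient with strides [2048, 1, 2048, 2048] against a bucket view with strides [2048, 1, 1, 1], both for shape [512, 2048, 1, 1]. A small stdlib-only sketch shows that the bucket view's strides are exactly the row-major (NCHW-contiguous) strides for that shape, while the gradient's strides correspond to a channels_last-style layout of the same shape, which is why the reducer flags a layout mismatch:

```python
def contiguous_strides(shape):
    """Row-major strides in elements, the convention PyTorch's .stride() uses."""
    strides = []
    running = 1
    for dim in reversed(shape):
        strides.append(running)
        running *= dim
    return list(reversed(strides))

shape = [512, 2048, 1, 1]  # (N, C, H, W) from the warning
print(contiguous_strides(shape))  # [2048, 1, 1, 1] -- matches bucket_view.strides()

# The grad's reported strides [2048, 1, 2048, 2048] are what a channels_last
# (NHWC-in-memory) tensor of this shape would report: stride_N = H*W*C = 2048,
# stride_C = 1, stride_H = W*C = 2048, stride_W = C = 2048.
```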
I0331 08:12:35.179400 140080983574272 logging_writer.py:48] [0] global_step=0, grad_norm=0.587501, loss=6.925025
I0331 08:12:35.195179 140111137965888 submission.py:120] 0) loss = 6.925, grad_norm = 0.588
I0331 08:12:35.826992 140111137965888 spec.py:321] Evaluating on the training split.
[2024-03-31 08:12:45,933] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:45,964] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:46,030] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:46,126] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:46,176] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:46,181] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:46,288] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:12:46,511] [0/1] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
I0331 08:13:58.720955 140111137965888 spec.py:333] Evaluating on the validation split.
[2024-03-31 08:14:51,491] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:51,719] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:52,093] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:52,553] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:53,071] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:54,379] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:54,424] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:14:56,719] [0/2] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
/usr/local/lib/python3.8/dist-packages/torch/overrides.py:110: UserWarning: 'has_cuda' is deprecated, please use 'torch.backends.cuda.is_built()'
torch.has_cuda,
/usr/local/lib/python3.8/dist-packages/torch/overrides.py:111: UserWarning: 'has_cudnn' is deprecated, please use 'torch.backends.cudnn.is_available()'
torch.has_cudnn,
/usr/local/lib/python3.8/dist-packages/torch/overrides.py:117: UserWarning: 'has_mps' is deprecated, please use 'torch.backends.mps.is_built()'
torch.has_mps,
/usr/local/lib/python3.8/dist-packages/torch/overrides.py:118: UserWarning: 'has_mkldnn' is deprecated, please use 'torch.backends.mkldnn.is_available()'
torch.has_mkldnn,
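The deprecation warnings above spell out their own replacements. As a quick reference, here is that mapping transcribed verbatim from the warnings into a lookup table (illustrative only; the names come straight from the log):

```python
# Old torch.has_* flags -> replacement backend queries, per the UserWarnings
# emitted from torch/overrides.py in the log above.
DEPRECATED_TORCH_FLAGS = {
    "torch.has_cuda": "torch.backends.cuda.is_built()",
    "torch.has_cudnn": "torch.backends.cudnn.is_available()",
    "torch.has_mps": "torch.backends.mps.is_built()",
    "torch.has_mkldnn": "torch.backends.mkldnn.is_available()",
}

for old, new in DEPRECATED_TORCH_FLAGS.items():
    print(f"{old} -> use {new}")
```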
I0331 08:15:32.784438 140111137965888 spec.py:349] Evaluating on the test split.
I0331 08:15:32.801020 140111137965888 dataset_info.py:578] Load dataset info from /data/imagenet/pytorch/imagenet_v2/matched-frequency/3.0.0
I0331 08:15:32.807415 140111137965888 dataset_builder.py:528] Reusing dataset imagenet_v2 (/data/imagenet/pytorch/imagenet_v2/matched-frequency/3.0.0)
I0331 08:15:32.875621 140111137965888 logging_logger.py:49] Constructing tf.data.Dataset imagenet_v2 for split test, from /data/imagenet/pytorch/imagenet_v2/matched-frequency/3.0.0
[2024-03-31 08:15:34,725] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:34,765] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:34,790] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:34,941] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:34,952] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:35,052] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:35,511] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-03-31 08:15:35,777] [0/3] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
I0331 08:16:14.840835 140111137965888 submission_runner.py:426] Time since start: 337.07s, Step: 1, {'train/accuracy': 0.0017737563775510204, 'train/loss': 6.928203972018495, 'validation/accuracy': 0.00134, 'validation/loss': 6.928408125, 'validation/num_examples': 50000, 'test/accuracy': 0.0007, 'test/loss': 6.93043828125, 'test/num_examples': 10000, 'score': 117.4308168888092, 'total_duration': 337.0682120323181, 'accumulated_submission_time': 117.4308168888092, 'accumulated_eval_time': 219.0140655040741, 'accumulated_logging_time': 0}
I0331 08:16:14.855430 140065913165568 logging_writer.py:48] [1] accumulated_eval_time=219.014066, accumulated_logging_time=0, accumulated_submission_time=117.430817, global_step=1, preemption_count=0, score=117.430817, test/accuracy=0.000700, test/loss=6.930438, test/num_examples=10000, total_duration=337.068212, train/accuracy=0.001774, train/loss=6.928204, validation/accuracy=0.001340, validation/loss=6.928408, validation/num_examples=50000
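The eval lines above embed the metrics as a Python-literal dict after `Step: N,`, which makes them easy to scrape when comparing regression runs. A minimal sketch, assuming that `Step: N, {...}` shape holds (the sample line here is trimmed to a few keys from the step-1 eval above):

```python
import ast
import re

LOG_LINE = ("Time since start: 337.07s, Step: 1, "
            "{'train/loss': 6.928203972018495, 'validation/accuracy': 0.00134, "
            "'score': 117.4308168888092}")

def parse_eval_line(line):
    """Extract (step, metrics dict) from a submission_runner eval log line.

    The metrics are a Python dict literal, so ast.literal_eval parses them
    without executing arbitrary code."""
    match = re.search(r"Step: (\d+), (\{.*\})", line)
    if match is None:
        raise ValueError("not an eval line")
    return int(match.group(1)), ast.literal_eval(match.group(2))

step, metrics = parse_eval_line(LOG_LINE)
print(step, metrics["score"])  # 1 117.4308168888092
```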
I0331 08:16:15.974320 140065904772864 logging_writer.py:48] [1] global_step=1, grad_norm=0.612899, loss=6.924363
I0331 08:16:15.978590 140111137965888 submission.py:120] 1) loss = 6.924, grad_norm = 0.613
I0331 08:16:16.346361 140065913165568 logging_writer.py:48] [2] global_step=2, grad_norm=0.605277, loss=6.926661
I0331 08:16:16.350334 140111137965888 submission.py:120] 2) loss = 6.927, grad_norm = 0.605
I0331 08:16:16.719233 140065904772864 logging_writer.py:48] [3] global_step=3, grad_norm=0.605843, loss=6.928249
I0331 08:16:16.723332 140111137965888 submission.py:120] 3) loss = 6.928, grad_norm = 0.606
I0331 08:16:17.091580 140065913165568 logging_writer.py:48] [4] global_step=4, grad_norm=0.598531, loss=6.928620
I0331 08:16:17.099368 140111137965888 submission.py:120] 4) loss = 6.929, grad_norm = 0.599
I0331 08:16:17.469580 140065904772864 logging_writer.py:48] [5] global_step=5, grad_norm=0.594602, loss=6.929913
I0331 08:16:17.475684 140111137965888 submission.py:120] 5) loss = 6.930, grad_norm = 0.595
I0331 08:16:17.847174 140065913165568 logging_writer.py:48] [6] global_step=6, grad_norm=0.609444, loss=6.932057
I0331 08:16:17.852178 140111137965888 submission.py:120] 6) loss = 6.932, grad_norm = 0.609
I0331 08:16:18.223795 140065904772864 logging_writer.py:48] [7] global_step=7, grad_norm=0.611987, loss=6.926456
I0331 08:16:18.228719 140111137965888 submission.py:120] 7) loss = 6.926, grad_norm = 0.612
I0331 08:16:18.601529 140065913165568 logging_writer.py:48] [8] global_step=8, grad_norm=0.606430, loss=6.921168
I0331 08:16:18.606384 140111137965888 submission.py:120] 8) loss = 6.921, grad_norm = 0.606
I0331 08:16:18.991785 140065904772864 logging_writer.py:48] [9] global_step=9, grad_norm=0.619730, loss=6.922999
I0331 08:16:18.995997 140111137965888 submission.py:120] 9) loss = 6.923, grad_norm = 0.620
I0331 08:16:19.791687 140111137965888 spec.py:321] Evaluating on the training split.
I0331 08:17:08.523149 140111137965888 spec.py:333] Evaluating on the validation split.
I0331 08:17:54.704064 140111137965888 spec.py:349] Evaluating on the test split.
I0331 08:17:55.841307 140111137965888 submission_runner.py:426] Time since start: 438.07s, Step: 10, {'train/accuracy': 0.0008569834183673469, 'train/loss': 6.916042405731824, 'validation/accuracy': 0.0008, 'validation/loss': 6.915869375, 'validation/num_examples': 50000, 'test/accuracy': 0.0008, 'test/loss': 6.91645234375, 'test/num_examples': 10000, 'score': 120.95486831665039, 'total_duration': 438.06874418258667, 'accumulated_submission_time': 120.95486831665039, 'accumulated_eval_time': 315.0638761520386, 'accumulated_logging_time': 0.026315689086914062}
I0331 08:17:55.855891 140065921558272 logging_writer.py:48] [10] accumulated_eval_time=315.063876, accumulated_logging_time=0.026316, accumulated_submission_time=120.954868, global_step=10, preemption_count=0, score=120.954868, test/accuracy=0.000800, test/loss=6.916452, test/num_examples=10000, total_duration=438.068744, train/accuracy=0.000857, train/loss=6.916042, validation/accuracy=0.000800, validation/loss=6.915869, validation/num_examples=50000
I0331 08:17:56.505723 140065929950976 logging_writer.py:48] [10] global_step=10, preemption_count=0, score=120.954868
I0331 08:17:56.975934 140111137965888 submission_runner.py:600] Tuning trial 1/1
I0331 08:17:56.976132 140111137965888 submission_runner.py:601] Hyperparameters: Hyperparameters(learning_rate=0.0019814680146414726, one_minus_beta1=0.22838767981804783, beta2=0.999, warmup_factor=0.05, weight_decay=0.010340635370188849, label_smoothing=0.1, dropout_rate=0.0)
I0331 08:17:56.976719 140111137965888 submission_runner.py:602] Metrics: {'eval_results': [(1, {'train/accuracy': 0.0017737563775510204, 'train/loss': 6.928203972018495, 'validation/accuracy': 0.00134, 'validation/loss': 6.928408125, 'validation/num_examples': 50000, 'test/accuracy': 0.0007, 'test/loss': 6.93043828125, 'test/num_examples': 10000, 'score': 117.4308168888092, 'total_duration': 337.0682120323181, 'accumulated_submission_time': 117.4308168888092, 'accumulated_eval_time': 219.0140655040741, 'accumulated_logging_time': 0, 'global_step': 1, 'preemption_count': 0}), (10, {'train/accuracy': 0.0008569834183673469, 'train/loss': 6.916042405731824, 'validation/accuracy': 0.0008, 'validation/loss': 6.915869375, 'validation/num_examples': 50000, 'test/accuracy': 0.0008, 'test/loss': 6.91645234375, 'test/num_examples': 10000, 'score': 120.95486831665039, 'total_duration': 438.06874418258667, 'accumulated_submission_time': 120.95486831665039, 'accumulated_eval_time': 315.0638761520386, 'accumulated_logging_time': 0.026315689086914062, 'global_step': 10, 'preemption_count': 0})], 'global_step': 10}
I0331 08:17:56.976811 140111137965888 submission_runner.py:603] Timing: 120.95486831665039
I0331 08:17:56.976862 140111137965888 submission_runner.py:605] Total number of evals: 2
I0331 08:17:56.976909 140111137965888 submission_runner.py:606] ====================
I0331 08:17:56.977010 140111137965888 submission_runner.py:696] Final imagenet_resnet score: 0
Exiting with 0