resnet momentum 9-2023 (ImageNet ResNet target-setting run with the JAX momentum reference algorithm)
python3 submission_runner.py \
  --framework=jax \
  --workload=imagenet_resnet \
  --submission_path=reference_algorithms/target_setting_algorithms/jax_momentum.py \
  --tuning_search_space=reference_algorithms/target_setting_algorithms/imagenet_resnet/tuning_search_space.json \
  --data_dir=/data/imagenet/jax \
  --num_tuning_trials=1 \
  --experiment_dir=/experiment_runs \
  --experiment_name=targets_check_jax/momentum_run_0 \
  --overwrite=true \
  --save_checkpoints=false \
  --max_global_steps=140000 \
  --imagenet_v2_data_dir=/data/imagenet/jax \
  2>&1 | tee -a /logs/imagenet_resnet_jax_09-14-2023-07-13-53.log
2023-09-14 07:13:58.404017: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT | |
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning: | |
TensorFlow Addons (TFA) has ended development and introduction of new features. | |
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024. | |
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). | |
For more information see: https://github.com/tensorflow/addons/issues/2807 | |
warnings.warn( | |
I0914 07:14:16.944514 139785753851712 logger_utils.py:76] Creating experiment directory at /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax. | |
I0914 07:14:17.916458 139785753851712 xla_bridge.py:455] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Host Interpreter CUDA | |
I0914 07:14:17.917233 139785753851712 xla_bridge.py:455] Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client' | |
I0914 07:14:17.917379 139785753851712 xla_bridge.py:455] Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this. | |
I0914 07:14:17.923749 139785753851712 submission_runner.py:500] Using RNG seed 2760784846 | |
I0914 07:14:23.812150 139785753851712 submission_runner.py:509] --- Tuning run 1/1 --- | |
I0914 07:14:23.812360 139785753851712 submission_runner.py:514] Creating tuning directory at /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1. | |
I0914 07:14:23.812535 139785753851712 logger_utils.py:92] Saving hparams to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/hparams.json. | |
I0914 07:14:23.996968 139785753851712 submission_runner.py:185] Initializing dataset. | |
I0914 07:14:24.013192 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet2012/5.1.0 | |
I0914 07:14:24.023574 139785753851712 dataset_info.py:669] Fields info.[splits, supervised_keys] from disk and from code do not match. Keeping the one from code. | |
I0914 07:14:24.404851 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet2012 for split train, from /data/imagenet/jax/imagenet2012/5.1.0 | |
I0914 07:14:25.602084 139785753851712 submission_runner.py:192] Initializing model. | |
I0914 07:14:36.559219 139785753851712 submission_runner.py:226] Initializing optimizer. | |
I0914 07:14:38.135653 139785753851712 submission_runner.py:233] Initializing metrics bundle. | |
I0914 07:14:38.135895 139785753851712 submission_runner.py:251] Initializing checkpoint and logger. | |
I0914 07:14:38.137257 139785753851712 checkpoints.py:915] Found no checkpoint files in /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1 with prefix checkpoint_ | |
I0914 07:14:39.022184 139785753851712 submission_runner.py:272] Saving meta data to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/meta_data_0.json. | |
I0914 07:14:39.023321 139785753851712 submission_runner.py:275] Saving flags to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/flags_0.json. | |
I0914 07:14:39.033189 139785753851712 submission_runner.py:285] Starting training loop. | |
2023-09-14 07:15:37.344488: E external/xla/xla/service/rendezvous.cc:31] This thread has been waiting for 10 seconds and may be stuck: | |
2023-09-14 07:15:39.805411: E external/xla/xla/service/rendezvous.cc:36] Thread is unstuck! Warning above was a false-positive. Perhaps the timeout is too short. | |
I0914 07:15:41.341467 139620620691200 logging_writer.py:48] [0] global_step=0, grad_norm=0.5389119982719421, loss=6.927049160003662 | |
I0914 07:15:41.356961 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 07:15:42.325510 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet2012/5.1.0 | |
I0914 07:15:42.334606 139785753851712 dataset_info.py:669] Fields info.[splits, supervised_keys] from disk and from code do not match. Keeping the one from code. | |
I0914 07:15:42.417476 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet2012 for split train, from /data/imagenet/jax/imagenet2012/5.1.0 | |
I0914 07:15:55.259609 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 07:15:56.705544 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet2012/5.1.0 | |
I0914 07:15:56.731734 139785753851712 dataset_info.py:669] Fields info.[splits, supervised_keys] from disk and from code do not match. Keeping the one from code. | |
I0914 07:15:56.805313 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet2012 for split validation, from /data/imagenet/jax/imagenet2012/5.1.0 | |
I0914 07:16:16.525549 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 07:16:17.323033 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet_v2/matched-frequency/3.0.0 | |
I0914 07:16:17.328617 139785753851712 dataset_builder.py:528] Reusing dataset imagenet_v2 (/data/imagenet/jax/imagenet_v2/matched-frequency/3.0.0) | |
I0914 07:16:17.367002 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet_v2 for split test, from /data/imagenet/jax/imagenet_v2/matched-frequency/3.0.0 | |
I0914 07:16:28.126622 139785753851712 submission_runner.py:376] Time since start: 109.09s, Step: 1, {'train/accuracy': 0.0009367027669213712, 'train/loss': 6.9118571281433105, 'validation/accuracy': 0.0010400000028312206, 'validation/loss': 6.911978721618652, 'validation/num_examples': 50000, 'test/accuracy': 0.0014000000664964318, 'test/loss': 6.91181755065918, 'test/num_examples': 10000, 'score': 62.32368874549866, 'total_duration': 109.0933792591095, 'accumulated_submission_time': 62.32368874549866, 'accumulated_eval_time': 46.76959991455078, 'accumulated_logging_time': 0} | |
I0914 07:16:28.146100 139590321039104 logging_writer.py:48] [1] accumulated_eval_time=46.769600, accumulated_logging_time=0, accumulated_submission_time=62.323689, global_step=1, preemption_count=0, score=62.323689, test/accuracy=0.001400, test/loss=6.911818, test/num_examples=10000, total_duration=109.093379, train/accuracy=0.000937, train/loss=6.911857, validation/accuracy=0.001040, validation/loss=6.911979, validation/num_examples=50000 | |
I0914 07:16:28.500410 139590329431808 logging_writer.py:48] [1] global_step=1, grad_norm=0.5436305403709412, loss=6.927357196807861 | |
I0914 07:16:28.843997 139590321039104 logging_writer.py:48] [2] global_step=2, grad_norm=0.5420133471488953, loss=6.9325995445251465 | |
I0914 07:16:29.183627 139590329431808 logging_writer.py:48] [3] global_step=3, grad_norm=0.5460174679756165, loss=6.930103302001953 | |
I0914 07:16:29.517827 139590321039104 logging_writer.py:48] [4] global_step=4, grad_norm=0.5241100788116455, loss=6.91835880279541 | |
I0914 07:16:29.859511 139590329431808 logging_writer.py:48] [5] global_step=5, grad_norm=0.5328718423843384, loss=6.935100078582764 | |
I0914 07:16:30.198420 139590321039104 logging_writer.py:48] [6] global_step=6, grad_norm=0.5385733842849731, loss=6.926104545593262 | |
I0914 07:16:30.534452 139590329431808 logging_writer.py:48] [7] global_step=7, grad_norm=0.5387418866157532, loss=6.923493385314941 | |
I0914 07:16:30.869961 139590321039104 logging_writer.py:48] [8] global_step=8, grad_norm=0.5350285172462463, loss=6.930649280548096 | |
I0914 07:16:31.209498 139590329431808 logging_writer.py:48] [9] global_step=9, grad_norm=0.5604594945907593, loss=6.93665075302124 | |
I0914 07:16:31.547532 139590321039104 logging_writer.py:48] [10] global_step=10, grad_norm=0.5472524762153625, loss=6.926746368408203 | |
I0914 07:16:31.883261 139590329431808 logging_writer.py:48] [11] global_step=11, grad_norm=0.5602211356163025, loss=6.931008338928223 | |
I0914 07:16:32.220176 139590321039104 logging_writer.py:48] [12] global_step=12, grad_norm=0.5302987694740295, loss=6.921492576599121 | |
I0914 07:16:32.559637 139590329431808 logging_writer.py:48] [13] global_step=13, grad_norm=0.5309913754463196, loss=6.919932842254639 | |
I0914 07:16:32.898500 139590321039104 logging_writer.py:48] [14] global_step=14, grad_norm=0.5365314483642578, loss=6.923084259033203 | |
I0914 07:16:33.235160 139590329431808 logging_writer.py:48] [15] global_step=15, grad_norm=0.5396639108657837, loss=6.920051574707031 | |
I0914 07:16:33.573624 139590321039104 logging_writer.py:48] [16] global_step=16, grad_norm=0.5340028405189514, loss=6.925442218780518 | |
I0914 07:16:33.910352 139590329431808 logging_writer.py:48] [17] global_step=17, grad_norm=0.5570560097694397, loss=6.911653518676758 | |
I0914 07:16:34.247454 139590321039104 logging_writer.py:48] [18] global_step=18, grad_norm=0.545469343662262, loss=6.915667533874512 | |
I0914 07:16:34.590326 139590329431808 logging_writer.py:48] [19] global_step=19, grad_norm=0.5328952670097351, loss=6.908725261688232 | |
I0914 07:16:34.927754 139590321039104 logging_writer.py:48] [20] global_step=20, grad_norm=0.5461469888687134, loss=6.909825325012207 | |
I0914 07:16:35.265347 139590329431808 logging_writer.py:48] [21] global_step=21, grad_norm=0.5239453911781311, loss=6.9054274559021 | |
I0914 07:16:35.601920 139590321039104 logging_writer.py:48] [22] global_step=22, grad_norm=0.536472499370575, loss=6.912668228149414 | |
I0914 07:16:35.942337 139590329431808 logging_writer.py:48] [23] global_step=23, grad_norm=0.5243082046508789, loss=6.909153461456299 | |
I0914 07:16:36.279525 139590321039104 logging_writer.py:48] [24] global_step=24, grad_norm=0.5406576991081238, loss=6.90311336517334 | |
I0914 07:16:36.630361 139590329431808 logging_writer.py:48] [25] global_step=25, grad_norm=0.5318547487258911, loss=6.890480995178223 | |
I0914 07:16:36.971984 139590321039104 logging_writer.py:48] [26] global_step=26, grad_norm=0.5387895107269287, loss=6.902065277099609 | |
I0914 07:16:37.316698 139590329431808 logging_writer.py:48] [27] global_step=27, grad_norm=0.5189248323440552, loss=6.895449638366699 | |
I0914 07:16:37.653784 139590321039104 logging_writer.py:48] [28] global_step=28, grad_norm=0.5199077725410461, loss=6.890660285949707 | |
I0914 07:16:38.001337 139590329431808 logging_writer.py:48] [29] global_step=29, grad_norm=0.5134077072143555, loss=6.892641067504883 | |
I0914 07:16:38.346123 139590321039104 logging_writer.py:48] [30] global_step=30, grad_norm=0.526112973690033, loss=6.894507884979248 | |
I0914 07:16:38.683314 139590329431808 logging_writer.py:48] [31] global_step=31, grad_norm=0.5195169448852539, loss=6.889142036437988 | |
I0914 07:16:39.021874 139590321039104 logging_writer.py:48] [32] global_step=32, grad_norm=0.5332788825035095, loss=6.886667251586914 | |
I0914 07:16:39.359480 139590329431808 logging_writer.py:48] [33] global_step=33, grad_norm=0.5372386574745178, loss=6.903325080871582 | |
I0914 07:16:39.696024 139590321039104 logging_writer.py:48] [34] global_step=34, grad_norm=0.5352444648742676, loss=6.8947858810424805 | |
I0914 07:16:40.031192 139590329431808 logging_writer.py:48] [35] global_step=35, grad_norm=0.5384101271629333, loss=6.880593776702881 | |
I0914 07:16:40.370426 139590321039104 logging_writer.py:48] [36] global_step=36, grad_norm=0.5275368690490723, loss=6.8852620124816895 | |
I0914 07:16:40.709239 139590329431808 logging_writer.py:48] [37] global_step=37, grad_norm=0.5337218642234802, loss=6.8845086097717285 | |
I0914 07:16:41.057002 139590321039104 logging_writer.py:48] [38] global_step=38, grad_norm=0.5435961484909058, loss=6.879532814025879 | |
I0914 07:16:41.394190 139590329431808 logging_writer.py:48] [39] global_step=39, grad_norm=0.551781177520752, loss=6.884259223937988 | |
I0914 07:16:41.729692 139590321039104 logging_writer.py:48] [40] global_step=40, grad_norm=0.548706591129303, loss=6.873312950134277 | |
I0914 07:16:42.069289 139590329431808 logging_writer.py:48] [41] global_step=41, grad_norm=0.5305836796760559, loss=6.873071670532227 | |
I0914 07:16:42.407241 139590321039104 logging_writer.py:48] [42] global_step=42, grad_norm=0.5488153100013733, loss=6.874739646911621 | |
I0914 07:16:42.750670 139590329431808 logging_writer.py:48] [43] global_step=43, grad_norm=0.5339630246162415, loss=6.8718953132629395 | |
I0914 07:16:43.091645 139590321039104 logging_writer.py:48] [44] global_step=44, grad_norm=0.5474773049354553, loss=6.872103691101074 | |
I0914 07:16:43.431213 139590329431808 logging_writer.py:48] [45] global_step=45, grad_norm=0.5713949203491211, loss=6.85746431350708 | |
I0914 07:16:43.771043 139590321039104 logging_writer.py:48] [46] global_step=46, grad_norm=0.5614795684814453, loss=6.87053918838501 | |
I0914 07:16:44.118433 139590329431808 logging_writer.py:48] [47] global_step=47, grad_norm=0.5435739755630493, loss=6.856149673461914 | |
I0914 07:16:44.460348 139590321039104 logging_writer.py:48] [48] global_step=48, grad_norm=0.5413722991943359, loss=6.855347633361816 | |
I0914 07:16:44.802573 139590329431808 logging_writer.py:48] [49] global_step=49, grad_norm=0.5439017415046692, loss=6.848852157592773 | |
I0914 07:16:45.140948 139590321039104 logging_writer.py:48] [50] global_step=50, grad_norm=0.557495653629303, loss=6.874087810516357 | |
I0914 07:16:45.481902 139590329431808 logging_writer.py:48] [51] global_step=51, grad_norm=0.5573470592498779, loss=6.860464096069336 | |
I0914 07:16:45.824782 139590321039104 logging_writer.py:48] [52] global_step=52, grad_norm=0.5395156145095825, loss=6.848325729370117 | |
I0914 07:16:46.168753 139590329431808 logging_writer.py:48] [53] global_step=53, grad_norm=0.5514255166053772, loss=6.842846393585205 | |
I0914 07:16:46.508444 139590321039104 logging_writer.py:48] [54] global_step=54, grad_norm=0.5766259431838989, loss=6.854337215423584 | |
I0914 07:16:46.847536 139590329431808 logging_writer.py:48] [55] global_step=55, grad_norm=0.548021137714386, loss=6.82659912109375 | |
I0914 07:16:47.191233 139590321039104 logging_writer.py:48] [56] global_step=56, grad_norm=0.5588611364364624, loss=6.844107151031494 | |
I0914 07:16:47.528039 139590329431808 logging_writer.py:48] [57] global_step=57, grad_norm=0.5480129718780518, loss=6.835235118865967 | |
I0914 07:16:47.864345 139590321039104 logging_writer.py:48] [58] global_step=58, grad_norm=0.5657825469970703, loss=6.8115458488464355 | |
I0914 07:16:48.199600 139590329431808 logging_writer.py:48] [59] global_step=59, grad_norm=0.5682162046432495, loss=6.814191818237305 | |
I0914 07:16:48.537695 139590321039104 logging_writer.py:48] [60] global_step=60, grad_norm=0.5582947731018066, loss=6.820403099060059 | |
I0914 07:16:48.878333 139590329431808 logging_writer.py:48] [61] global_step=61, grad_norm=0.561733067035675, loss=6.821713924407959 | |
I0914 07:16:49.216136 139590321039104 logging_writer.py:48] [62] global_step=62, grad_norm=0.5706186294555664, loss=6.80854606628418 | |
I0914 07:16:49.550317 139590329431808 logging_writer.py:48] [63] global_step=63, grad_norm=0.5717720985412598, loss=6.817758083343506 | |
I0914 07:16:49.898351 139590321039104 logging_writer.py:48] [64] global_step=64, grad_norm=0.5713201761245728, loss=6.8105292320251465 | |
I0914 07:16:50.235677 139590329431808 logging_writer.py:48] [65] global_step=65, grad_norm=0.5701669454574585, loss=6.792567253112793 | |
I0914 07:16:50.571818 139590321039104 logging_writer.py:48] [66] global_step=66, grad_norm=0.5822696089744568, loss=6.782873630523682 | |
I0914 07:16:50.909978 139590329431808 logging_writer.py:48] [67] global_step=67, grad_norm=0.584923505783081, loss=6.796526908874512 | |
I0914 07:16:51.247188 139590321039104 logging_writer.py:48] [68] global_step=68, grad_norm=0.5622556209564209, loss=6.78099250793457 | |
I0914 07:16:51.587746 139590329431808 logging_writer.py:48] [69] global_step=69, grad_norm=0.6046913266181946, loss=6.79979944229126 | |
I0914 07:16:51.922210 139590321039104 logging_writer.py:48] [70] global_step=70, grad_norm=0.587131917476654, loss=6.79705810546875 | |
I0914 07:16:52.263475 139590329431808 logging_writer.py:48] [71] global_step=71, grad_norm=0.5949347615242004, loss=6.78912353515625 | |
I0914 07:16:52.607199 139590321039104 logging_writer.py:48] [72] global_step=72, grad_norm=0.5920963883399963, loss=6.785521507263184 | |
I0914 07:16:52.947867 139590329431808 logging_writer.py:48] [73] global_step=73, grad_norm=0.5777257084846497, loss=6.789895534515381 | |
I0914 07:16:53.290904 139590321039104 logging_writer.py:48] [74] global_step=74, grad_norm=0.5883252024650574, loss=6.7708892822265625 | |
I0914 07:16:53.629848 139590329431808 logging_writer.py:48] [75] global_step=75, grad_norm=0.6013261079788208, loss=6.777152061462402 | |
I0914 07:16:53.973304 139590321039104 logging_writer.py:48] [76] global_step=76, grad_norm=0.5913688540458679, loss=6.762724876403809 | |
I0914 07:16:54.312757 139590329431808 logging_writer.py:48] [77] global_step=77, grad_norm=0.6064963936805725, loss=6.779829025268555 | |
I0914 07:16:54.654629 139590321039104 logging_writer.py:48] [78] global_step=78, grad_norm=0.6053351759910583, loss=6.746735572814941 | |
I0914 07:16:54.994006 139590329431808 logging_writer.py:48] [79] global_step=79, grad_norm=0.5931686758995056, loss=6.754239082336426 | |
I0914 07:16:55.331990 139590321039104 logging_writer.py:48] [80] global_step=80, grad_norm=0.5849683880805969, loss=6.748963356018066 | |
I0914 07:16:55.670944 139590329431808 logging_writer.py:48] [81] global_step=81, grad_norm=0.5978469252586365, loss=6.737304210662842 | |
I0914 07:16:56.019518 139590321039104 logging_writer.py:48] [82] global_step=82, grad_norm=0.5988901853561401, loss=6.765451908111572 | |
I0914 07:16:56.365621 139590329431808 logging_writer.py:48] [83] global_step=83, grad_norm=0.6001420617103577, loss=6.743647575378418 | |
I0914 07:16:56.708692 139590321039104 logging_writer.py:48] [84] global_step=84, grad_norm=0.602545976638794, loss=6.7492852210998535 | |
I0914 07:16:57.049725 139590329431808 logging_writer.py:48] [85] global_step=85, grad_norm=0.6386739611625671, loss=6.7281060218811035 | |
I0914 07:16:57.395249 139590321039104 logging_writer.py:48] [86] global_step=86, grad_norm=0.6063979268074036, loss=6.74575138092041 | |
I0914 07:16:57.735456 139590329431808 logging_writer.py:48] [87] global_step=87, grad_norm=0.6207473874092102, loss=6.7130446434021 | |
I0914 07:16:58.072184 139590321039104 logging_writer.py:48] [88] global_step=88, grad_norm=0.6147616505622864, loss=6.738728046417236 | |
I0914 07:16:58.412707 139590329431808 logging_writer.py:48] [89] global_step=89, grad_norm=0.6013393402099609, loss=6.706392288208008 | |
I0914 07:16:58.749669 139590321039104 logging_writer.py:48] [90] global_step=90, grad_norm=0.6069715619087219, loss=6.70119571685791 | |
I0914 07:16:59.086777 139590329431808 logging_writer.py:48] [91] global_step=91, grad_norm=0.6262779235839844, loss=6.752510070800781 | |
I0914 07:16:59.431769 139590321039104 logging_writer.py:48] [92] global_step=92, grad_norm=0.6138813495635986, loss=6.705592155456543 | |
I0914 07:16:59.780551 139590329431808 logging_writer.py:48] [93] global_step=93, grad_norm=0.6135011315345764, loss=6.708225250244141 | |
I0914 07:17:00.120775 139590321039104 logging_writer.py:48] [94] global_step=94, grad_norm=0.6277234554290771, loss=6.7320170402526855 | |
I0914 07:17:00.455492 139590329431808 logging_writer.py:48] [95] global_step=95, grad_norm=0.6197319030761719, loss=6.67308235168457 | |
I0914 07:17:00.791080 139590321039104 logging_writer.py:48] [96] global_step=96, grad_norm=0.6064175367355347, loss=6.688239097595215 | |
I0914 07:17:01.137901 139590329431808 logging_writer.py:48] [97] global_step=97, grad_norm=0.6338247060775757, loss=6.662160873413086 | |
I0914 07:17:01.489339 139590321039104 logging_writer.py:48] [98] global_step=98, grad_norm=0.6168617606163025, loss=6.703256607055664 | |
I0914 07:17:01.829795 139590329431808 logging_writer.py:48] [99] global_step=99, grad_norm=0.6174088716506958, loss=6.699215888977051 | |
I0914 07:17:02.169407 139590321039104 logging_writer.py:48] [100] global_step=100, grad_norm=0.6163209080696106, loss=6.67160701751709 | |
I0914 07:19:17.025958 139590329431808 logging_writer.py:48] [500] global_step=500, grad_norm=0.5776574611663818, loss=6.170225143432617 | |
I0914 07:22:05.670546 139590321039104 logging_writer.py:48] [1000] global_step=1000, grad_norm=0.4634348154067993, loss=5.531285762786865 | |
I0914 07:24:54.105559 139590329431808 logging_writer.py:48] [1500] global_step=1500, grad_norm=0.4536585807800293, loss=5.192131996154785 | |
I0914 07:24:58.233243 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 07:25:05.431389 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 07:25:13.491719 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 07:25:15.792362 139785753851712 submission_runner.py:376] Time since start: 636.76s, Step: 1514, {'train/accuracy': 0.18895487487316132, 'train/loss': 4.249844551086426, 'validation/accuracy': 0.1698399931192398, 'validation/loss': 4.387373924255371, 'validation/num_examples': 50000, 'test/accuracy': 0.12890000641345978, 'test/loss': 4.780310153961182, 'test/num_examples': 10000, 'score': 572.3801600933075, 'total_duration': 636.7591044902802, 'accumulated_submission_time': 572.3801600933075, 'accumulated_eval_time': 64.32867097854614, 'accumulated_logging_time': 0.02817511558532715} | |
I0914 07:25:15.810337 139590530758400 logging_writer.py:48] [1514] accumulated_eval_time=64.328671, accumulated_logging_time=0.028175, accumulated_submission_time=572.380160, global_step=1514, preemption_count=0, score=572.380160, test/accuracy=0.128900, test/loss=4.780310, test/num_examples=10000, total_duration=636.759104, train/accuracy=0.188955, train/loss=4.249845, validation/accuracy=0.169840, validation/loss=4.387374, validation/num_examples=50000 | |
I0914 07:27:59.783074 139590614619904 logging_writer.py:48] [2000] global_step=2000, grad_norm=0.39719879627227783, loss=4.812987327575684 | |
I0914 07:30:48.130979 139590530758400 logging_writer.py:48] [2500] global_step=2500, grad_norm=0.34383267164230347, loss=4.628384590148926 | |
I0914 07:33:36.385151 139590614619904 logging_writer.py:48] [3000] global_step=3000, grad_norm=0.34005293250083923, loss=4.377949237823486 | |
I0914 07:33:45.896412 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 07:33:53.137686 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 07:34:01.295050 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 07:34:03.611705 139785753851712 submission_runner.py:376] Time since start: 1164.58s, Step: 3030, {'train/accuracy': 0.35439252853393555, 'train/loss': 3.16475772857666, 'validation/accuracy': 0.32311999797821045, 'validation/loss': 3.354123592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.24500000476837158, 'test/loss': 3.896735429763794, 'test/num_examples': 10000, 'score': 1082.4337828159332, 'total_duration': 1164.5784351825714, 'accumulated_submission_time': 1082.4337828159332, 'accumulated_eval_time': 82.04390573501587, 'accumulated_logging_time': 0.05537152290344238} | |
I0914 07:34:03.630025 139620444509952 logging_writer.py:48] [3030] accumulated_eval_time=82.043906, accumulated_logging_time=0.055372, accumulated_submission_time=1082.433783, global_step=3030, preemption_count=0, score=1082.433783, test/accuracy=0.245000, test/loss=3.896735, test/num_examples=10000, total_duration=1164.578435, train/accuracy=0.354393, train/loss=3.164758, validation/accuracy=0.323120, validation/loss=3.354124, validation/num_examples=50000 | |
I0914 07:36:42.183385 139621082027776 logging_writer.py:48] [3500] global_step=3500, grad_norm=0.3057088851928711, loss=4.292041301727295 | |
I0914 07:39:30.385200 139620444509952 logging_writer.py:48] [4000] global_step=4000, grad_norm=0.29683586955070496, loss=4.165511608123779 | |
I0914 07:42:18.588014 139621082027776 logging_writer.py:48] [4500] global_step=4500, grad_norm=0.2872718274593353, loss=4.058863162994385 | |
I0914 07:42:33.815204 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 07:42:41.004353 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 07:42:49.069682 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 07:42:51.320148 139785753851712 submission_runner.py:376] Time since start: 1692.29s, Step: 4547, {'train/accuracy': 0.4696468412876129, 'train/loss': 2.565059185028076, 'validation/accuracy': 0.4343400001525879, 'validation/loss': 2.7238423824310303, 'validation/num_examples': 50000, 'test/accuracy': 0.3296000063419342, 'test/loss': 3.3760478496551514, 'test/num_examples': 10000, 'score': 1592.5839030742645, 'total_duration': 1692.2868733406067, 'accumulated_submission_time': 1592.5839030742645, 'accumulated_eval_time': 99.54879140853882, 'accumulated_logging_time': 0.08617043495178223} | |
I0914 07:42:51.339463 139620444509952 logging_writer.py:48] [4547] accumulated_eval_time=99.548791, accumulated_logging_time=0.086170, accumulated_submission_time=1592.583903, global_step=4547, preemption_count=0, score=1592.583903, test/accuracy=0.329600, test/loss=3.376048, test/num_examples=10000, total_duration=1692.286873, train/accuracy=0.469647, train/loss=2.565059, validation/accuracy=0.434340, validation/loss=2.723842, validation/num_examples=50000 | |
I0914 07:45:24.082047 139620452902656 logging_writer.py:48] [5000] global_step=5000, grad_norm=0.2850759029388428, loss=4.08101749420166 | |
I0914 07:48:12.281002 139620444509952 logging_writer.py:48] [5500] global_step=5500, grad_norm=0.2590577006340027, loss=3.891889810562134 | |
I0914 07:51:00.544494 139620452902656 logging_writer.py:48] [6000] global_step=6000, grad_norm=0.2445574253797531, loss=3.913510799407959 | |
I0914 07:51:21.496845 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 07:51:28.630486 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 07:51:36.807730 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 07:51:39.056757 139785753851712 submission_runner.py:376] Time since start: 2220.02s, Step: 6064, {'train/accuracy': 0.485072523355484, 'train/loss': 2.4722769260406494, 'validation/accuracy': 0.45179998874664307, 'validation/loss': 2.632256269454956, 'validation/num_examples': 50000, 'test/accuracy': 0.3456000089645386, 'test/loss': 3.28208327293396, 'test/num_examples': 10000, 'score': 2102.707985162735, 'total_duration': 2220.0234982967377, 'accumulated_submission_time': 2102.707985162735, 'accumulated_eval_time': 117.10866022109985, 'accumulated_logging_time': 0.11611032485961914} | |
I0914 07:51:39.075066 139620436117248 logging_writer.py:48] [6064] accumulated_eval_time=117.108660, accumulated_logging_time=0.116110, accumulated_submission_time=2102.707985, global_step=6064, preemption_count=0, score=2102.707985, test/accuracy=0.345600, test/loss=3.282083, test/num_examples=10000, total_duration=2220.023498, train/accuracy=0.485073, train/loss=2.472277, validation/accuracy=0.451800, validation/loss=2.632256, validation/num_examples=50000 | |
I0914 07:54:06.019992 139620444509952 logging_writer.py:48] [6500] global_step=6500, grad_norm=0.2569459080696106, loss=3.9474620819091797 | |
I0914 07:56:54.227489 139620436117248 logging_writer.py:48] [7000] global_step=7000, grad_norm=0.2399711310863495, loss=3.8695645332336426 | |
I0914 07:59:42.452682 139620444509952 logging_writer.py:48] [7500] global_step=7500, grad_norm=0.23975849151611328, loss=3.802103281021118 | |
I0914 08:00:09.119609 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:00:16.465856 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:00:24.594934 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:00:26.870099 139785753851712 submission_runner.py:376] Time since start: 2747.84s, Step: 7581, {'train/accuracy': 0.5335220098495483, 'train/loss': 2.235621452331543, 'validation/accuracy': 0.5026400089263916, 'validation/loss': 2.388488531112671, 'validation/num_examples': 50000, 'test/accuracy': 0.392300009727478, 'test/loss': 3.0264182090759277, 'test/num_examples': 10000, 'score': 2612.7207324504852, 'total_duration': 2747.8368368148804, 'accumulated_submission_time': 2612.7207324504852, 'accumulated_eval_time': 134.8591091632843, 'accumulated_logging_time': 0.14357876777648926} | |
I0914 08:00:26.887415 139620469688064 logging_writer.py:48] [7581] accumulated_eval_time=134.859109, accumulated_logging_time=0.143579, accumulated_submission_time=2612.720732, global_step=7581, preemption_count=0, score=2612.720732, test/accuracy=0.392300, test/loss=3.026418, test/num_examples=10000, total_duration=2747.836837, train/accuracy=0.533522, train/loss=2.235621, validation/accuracy=0.502640, validation/loss=2.388489, validation/num_examples=50000 | |
I0914 08:02:48.232521 139621090420480 logging_writer.py:48] [8000] global_step=8000, grad_norm=0.24518267810344696, loss=3.8370201587677 | |
I0914 08:05:36.454721 139620469688064 logging_writer.py:48] [8500] global_step=8500, grad_norm=0.2367110252380371, loss=3.8449392318725586 | |
I0914 08:08:24.702806 139621090420480 logging_writer.py:48] [9000] global_step=9000, grad_norm=0.23814517259597778, loss=3.7492542266845703 | |
I0914 08:08:57.097401 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:09:04.500794 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:09:12.674903 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:09:14.951228 139785753851712 submission_runner.py:376] Time since start: 3275.92s, Step: 9098, {'train/accuracy': 0.5826291441917419, 'train/loss': 2.0229527950286865, 'validation/accuracy': 0.5064799785614014, 'validation/loss': 2.3824005126953125, 'validation/num_examples': 50000, 'test/accuracy': 0.3856000304222107, 'test/loss': 3.0475265979766846, 'test/num_examples': 10000, 'score': 3122.8992822170258, 'total_duration': 3275.9179759025574, 'accumulated_submission_time': 3122.8992822170258, 'accumulated_eval_time': 152.71290373802185, 'accumulated_logging_time': 0.17004680633544922} | |
I0914 08:09:14.969036 139620461295360 logging_writer.py:48] [9098] accumulated_eval_time=152.712904, accumulated_logging_time=0.170047, accumulated_submission_time=3122.899282, global_step=9098, preemption_count=0, score=3122.899282, test/accuracy=0.385600, test/loss=3.047527, test/num_examples=10000, total_duration=3275.917976, train/accuracy=0.582629, train/loss=2.022953, validation/accuracy=0.506480, validation/loss=2.382401, validation/num_examples=50000 | |
I0914 08:11:30.622020 139621082027776 logging_writer.py:48] [9500] global_step=9500, grad_norm=0.2386445701122284, loss=3.8122379779815674 | |
I0914 08:14:18.848359 139620461295360 logging_writer.py:48] [10000] global_step=10000, grad_norm=0.23931212723255157, loss=3.726754903793335 | |
I0914 08:17:07.096012 139621082027776 logging_writer.py:48] [10500] global_step=10500, grad_norm=0.23180843889713287, loss=3.6859421730041504 | |
I0914 08:17:45.199921 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:17:52.574122 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:18:00.787055 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:18:03.038870 139785753851712 submission_runner.py:376] Time since start: 3804.01s, Step: 10615, {'train/accuracy': 0.5702128410339355, 'train/loss': 2.0194029808044434, 'validation/accuracy': 0.5161600112915039, 'validation/loss': 2.2897284030914307, 'validation/num_examples': 50000, 'test/accuracy': 0.3993000090122223, 'test/loss': 2.9481253623962402, 'test/num_examples': 10000, 'score': 3633.0976436138153, 'total_duration': 3804.0056059360504, 'accumulated_submission_time': 3633.0976436138153, 'accumulated_eval_time': 170.5518193244934, 'accumulated_logging_time': 0.1971442699432373} | |
I0914 08:18:03.065747 139620452902656 logging_writer.py:48] [10615] accumulated_eval_time=170.551819, accumulated_logging_time=0.197144, accumulated_submission_time=3633.097644, global_step=10615, preemption_count=0, score=3633.097644, test/accuracy=0.399300, test/loss=2.948125, test/num_examples=10000, total_duration=3804.005606, train/accuracy=0.570213, train/loss=2.019403, validation/accuracy=0.516160, validation/loss=2.289728, validation/num_examples=50000 | |
I0914 08:20:12.967016 139620461295360 logging_writer.py:48] [11000] global_step=11000, grad_norm=0.24939975142478943, loss=3.7546169757843018 | |
I0914 08:23:01.209996 139620452902656 logging_writer.py:48] [11500] global_step=11500, grad_norm=0.23770606517791748, loss=3.735459566116333 | |
I0914 08:25:49.416116 139620461295360 logging_writer.py:48] [12000] global_step=12000, grad_norm=0.2510932385921478, loss=3.805049180984497 | |
I0914 08:26:33.239436 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:26:40.575763 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:26:48.802873 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:26:51.083399 139785753851712 submission_runner.py:376] Time since start: 4332.05s, Step: 12132, {'train/accuracy': 0.5753945708274841, 'train/loss': 2.0027692317962646, 'validation/accuracy': 0.5300599932670593, 'validation/loss': 2.2071869373321533, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.8690314292907715, 'test/num_examples': 10000, 'score': 4143.2383596897125, 'total_duration': 4332.050138235092, 'accumulated_submission_time': 4143.2383596897125, 'accumulated_eval_time': 188.3957643508911, 'accumulated_logging_time': 0.23388051986694336} | |
I0914 08:26:51.102065 139620444509952 logging_writer.py:48] [12132] accumulated_eval_time=188.395764, accumulated_logging_time=0.233881, accumulated_submission_time=4143.238360, global_step=12132, preemption_count=0, score=4143.238360, test/accuracy=0.414900, test/loss=2.869031, test/num_examples=10000, total_duration=4332.050138, train/accuracy=0.575395, train/loss=2.002769, validation/accuracy=0.530060, validation/loss=2.207187, validation/num_examples=50000 | |
I0914 08:28:55.254952 139620452902656 logging_writer.py:48] [12500] global_step=12500, grad_norm=0.24407465755939484, loss=3.693490743637085 | |
I0914 08:31:43.490170 139620444509952 logging_writer.py:48] [13000] global_step=13000, grad_norm=0.24153253436088562, loss=3.671910524368286 | |
I0914 08:34:31.727334 139620452902656 logging_writer.py:48] [13500] global_step=13500, grad_norm=0.2499263882637024, loss=3.6926016807556152 | |
I0914 08:35:21.274003 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:35:29.321512 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:35:37.648797 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:35:39.919688 139785753851712 submission_runner.py:376] Time since start: 4860.89s, Step: 13649, {'train/accuracy': 0.5742785334587097, 'train/loss': 2.003106117248535, 'validation/accuracy': 0.5323799848556519, 'validation/loss': 2.2081964015960693, 'validation/num_examples': 50000, 'test/accuracy': 0.4196000099182129, 'test/loss': 2.892209529876709, 'test/num_examples': 10000, 'score': 4653.37796998024, 'total_duration': 4860.886433124542, 'accumulated_submission_time': 4653.37796998024, 'accumulated_eval_time': 207.04141402244568, 'accumulated_logging_time': 0.26221203804016113} | |
I0914 08:35:39.956246 139620444509952 logging_writer.py:48] [13649] accumulated_eval_time=207.041414, accumulated_logging_time=0.262212, accumulated_submission_time=4653.377970, global_step=13649, preemption_count=0, score=4653.377970, test/accuracy=0.419600, test/loss=2.892210, test/num_examples=10000, total_duration=4860.886433, train/accuracy=0.574279, train/loss=2.003106, validation/accuracy=0.532380, validation/loss=2.208196, validation/num_examples=50000 | |
I0914 08:37:38.445249 139620452902656 logging_writer.py:48] [14000] global_step=14000, grad_norm=0.24695931375026703, loss=3.6911535263061523 | |
I0914 08:40:26.670311 139620444509952 logging_writer.py:48] [14500] global_step=14500, grad_norm=0.25017544627189636, loss=3.5895378589630127 | |
I0914 08:43:14.935185 139620452902656 logging_writer.py:48] [15000] global_step=15000, grad_norm=0.2352597415447235, loss=3.5701823234558105 | |
I0914 08:44:10.209634 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:44:17.744957 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:44:26.187425 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:44:28.424991 139785753851712 submission_runner.py:376] Time since start: 5389.39s, Step: 15166, {'train/accuracy': 0.5779455900192261, 'train/loss': 2.057650327682495, 'validation/accuracy': 0.5369799733161926, 'validation/loss': 2.2513246536254883, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.942534923553467, 'test/num_examples': 10000, 'score': 5163.59117102623, 'total_duration': 5389.39173579216, 'accumulated_submission_time': 5163.59117102623, 'accumulated_eval_time': 225.25672912597656, 'accumulated_logging_time': 0.31658077239990234} | |
I0914 08:44:28.446834 139620436117248 logging_writer.py:48] [15166] accumulated_eval_time=225.256729, accumulated_logging_time=0.316581, accumulated_submission_time=5163.591171, global_step=15166, preemption_count=0, score=5163.591171, test/accuracy=0.414900, test/loss=2.942535, test/num_examples=10000, total_duration=5389.391736, train/accuracy=0.577946, train/loss=2.057650, validation/accuracy=0.536980, validation/loss=2.251325, validation/num_examples=50000 | |
I0914 08:46:21.164688 139620444509952 logging_writer.py:48] [15500] global_step=15500, grad_norm=0.2539088726043701, loss=3.7177305221557617 | |
I0914 08:49:09.408946 139620436117248 logging_writer.py:48] [16000] global_step=16000, grad_norm=0.2508637309074402, loss=3.5680971145629883 | |
I0914 08:51:57.652329 139620444509952 logging_writer.py:48] [16500] global_step=16500, grad_norm=0.26029929518699646, loss=3.609468698501587 | |
I0914 08:52:58.664029 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 08:53:06.896972 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 08:53:15.780883 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 08:53:18.137967 139785753851712 submission_runner.py:376] Time since start: 5919.10s, Step: 16683, {'train/accuracy': 0.5338209271430969, 'train/loss': 2.335331678390503, 'validation/accuracy': 0.5003600120544434, 'validation/loss': 2.478173017501831, 'validation/num_examples': 50000, 'test/accuracy': 0.38210001587867737, 'test/loss': 3.1922895908355713, 'test/num_examples': 10000, 'score': 5673.775738954544, 'total_duration': 5919.104665517807, 'accumulated_submission_time': 5673.775738954544, 'accumulated_eval_time': 244.7305908203125, 'accumulated_logging_time': 0.3483104705810547} | |
I0914 08:53:18.170782 139620444509952 logging_writer.py:48] [16683] accumulated_eval_time=244.730591, accumulated_logging_time=0.348310, accumulated_submission_time=5673.775739, global_step=16683, preemption_count=0, score=5673.775739, test/accuracy=0.382100, test/loss=3.192290, test/num_examples=10000, total_duration=5919.104666, train/accuracy=0.533821, train/loss=2.335332, validation/accuracy=0.500360, validation/loss=2.478173, validation/num_examples=50000 | |
I0914 08:55:04.959357 139620452902656 logging_writer.py:48] [17000] global_step=17000, grad_norm=0.25013065338134766, loss=3.6450700759887695 | |
I0914 08:57:53.102471 139620444509952 logging_writer.py:48] [17500] global_step=17500, grad_norm=0.2470160871744156, loss=3.6803393363952637 | |
I0914 09:00:41.390616 139620452902656 logging_writer.py:48] [18000] global_step=18000, grad_norm=0.24692237377166748, loss=3.684927225112915 | |
I0914 09:01:48.390591 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:01:56.225740 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:02:05.743061 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:02:08.019340 139785753851712 submission_runner.py:376] Time since start: 6448.99s, Step: 18201, {'train/accuracy': 0.6382134556770325, 'train/loss': 1.74356210231781, 'validation/accuracy': 0.5595200061798096, 'validation/loss': 2.1038706302642822, 'validation/num_examples': 50000, 'test/accuracy': 0.43620002269744873, 'test/loss': 2.7823822498321533, 'test/num_examples': 10000, 'score': 6183.962655305862, 'total_duration': 6448.986067771912, 'accumulated_submission_time': 6183.962655305862, 'accumulated_eval_time': 264.35929918289185, 'accumulated_logging_time': 0.39099645614624023} | |
I0914 09:02:08.039974 139621082027776 logging_writer.py:48] [18201] accumulated_eval_time=264.359299, accumulated_logging_time=0.390996, accumulated_submission_time=6183.962655, global_step=18201, preemption_count=0, score=6183.962655, test/accuracy=0.436200, test/loss=2.782382, test/num_examples=10000, total_duration=6448.986068, train/accuracy=0.638213, train/loss=1.743562, validation/accuracy=0.559520, validation/loss=2.103871, validation/num_examples=50000 | |
I0914 09:03:48.995682 139621090420480 logging_writer.py:48] [18500] global_step=18500, grad_norm=0.24664853513240814, loss=3.5926640033721924 | |
I0914 09:06:37.140828 139621082027776 logging_writer.py:48] [19000] global_step=19000, grad_norm=0.2476327270269394, loss=3.5965888500213623 | |
I0914 09:09:25.389448 139621090420480 logging_writer.py:48] [19500] global_step=19500, grad_norm=0.24864919483661652, loss=3.5664284229278564 | |
I0914 09:10:38.143328 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:10:46.363858 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:10:55.978470 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:10:58.233433 139785753851712 submission_runner.py:376] Time since start: 6979.20s, Step: 19718, {'train/accuracy': 0.5982142686843872, 'train/loss': 1.9227712154388428, 'validation/accuracy': 0.5423600077629089, 'validation/loss': 2.185263156890869, 'validation/num_examples': 50000, 'test/accuracy': 0.4272000193595886, 'test/loss': 2.8164870738983154, 'test/num_examples': 10000, 'score': 6694.032505512238, 'total_duration': 6979.200145483017, 'accumulated_submission_time': 6694.032505512238, 'accumulated_eval_time': 284.4493384361267, 'accumulated_logging_time': 0.4227602481842041} | |
I0914 09:10:58.268679 139620436117248 logging_writer.py:48] [19718] accumulated_eval_time=284.449338, accumulated_logging_time=0.422760, accumulated_submission_time=6694.032506, global_step=19718, preemption_count=0, score=6694.032506, test/accuracy=0.427200, test/loss=2.816487, test/num_examples=10000, total_duration=6979.200145, train/accuracy=0.598214, train/loss=1.922771, validation/accuracy=0.542360, validation/loss=2.185263, validation/num_examples=50000 | |
I0914 09:12:33.433119 139620444509952 logging_writer.py:48] [20000] global_step=20000, grad_norm=0.2531515657901764, loss=3.5633723735809326 | |
I0914 09:15:21.484676 139620436117248 logging_writer.py:48] [20500] global_step=20500, grad_norm=0.2647220194339752, loss=3.608556032180786 | |
I0914 09:18:09.739126 139620444509952 logging_writer.py:48] [21000] global_step=21000, grad_norm=0.2493944615125656, loss=3.5959181785583496 | |
I0914 09:19:28.235945 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:19:35.889616 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:19:45.067148 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:19:47.308690 139785753851712 submission_runner.py:376] Time since start: 7508.28s, Step: 21235, {'train/accuracy': 0.6011040806770325, 'train/loss': 1.9267860651016235, 'validation/accuracy': 0.5546199679374695, 'validation/loss': 2.1422059535980225, 'validation/num_examples': 50000, 'test/accuracy': 0.4358000159263611, 'test/loss': 2.809361457824707, 'test/num_examples': 10000, 'score': 7203.964419841766, 'total_duration': 7508.275420188904, 'accumulated_submission_time': 7203.964419841766, 'accumulated_eval_time': 303.52203822135925, 'accumulated_logging_time': 0.47086191177368164} | |
I0914 09:19:47.331918 139621082027776 logging_writer.py:48] [21235] accumulated_eval_time=303.522038, accumulated_logging_time=0.470862, accumulated_submission_time=7203.964420, global_step=21235, preemption_count=0, score=7203.964420, test/accuracy=0.435800, test/loss=2.809361, test/num_examples=10000, total_duration=7508.275420, train/accuracy=0.601104, train/loss=1.926786, validation/accuracy=0.554620, validation/loss=2.142206, validation/num_examples=50000 | |
I0914 09:21:16.821161 139621090420480 logging_writer.py:48] [21500] global_step=21500, grad_norm=0.2546178698539734, loss=3.6239407062530518 | |
I0914 09:24:04.748127 139621082027776 logging_writer.py:48] [22000] global_step=22000, grad_norm=0.2460772544145584, loss=3.537613868713379 | |
I0914 09:26:52.895702 139621090420480 logging_writer.py:48] [22500] global_step=22500, grad_norm=0.2539433240890503, loss=3.622434139251709 | |
I0914 09:28:17.428775 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:28:24.868973 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:28:34.428437 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:28:36.700894 139785753851712 submission_runner.py:376] Time since start: 8037.67s, Step: 22753, {'train/accuracy': 0.5950454473495483, 'train/loss': 1.9752215147018433, 'validation/accuracy': 0.5528799891471863, 'validation/loss': 2.1724321842193604, 'validation/num_examples': 50000, 'test/accuracy': 0.4336000084877014, 'test/loss': 2.810899257659912, 'test/num_examples': 10000, 'score': 7714.026890993118, 'total_duration': 8037.667640447617, 'accumulated_submission_time': 7714.026890993118, 'accumulated_eval_time': 322.7941265106201, 'accumulated_logging_time': 0.5051746368408203} | |
I0914 09:28:36.720781 139620452902656 logging_writer.py:48] [22753] accumulated_eval_time=322.794127, accumulated_logging_time=0.505175, accumulated_submission_time=7714.026891, global_step=22753, preemption_count=0, score=7714.026891, test/accuracy=0.433600, test/loss=2.810899, test/num_examples=10000, total_duration=8037.667640, train/accuracy=0.595045, train/loss=1.975222, validation/accuracy=0.552880, validation/loss=2.172432, validation/num_examples=50000 | |
I0914 09:30:00.012202 139620461295360 logging_writer.py:48] [23000] global_step=23000, grad_norm=0.2528955638408661, loss=3.527564287185669 | |
I0914 09:32:48.240035 139620452902656 logging_writer.py:48] [23500] global_step=23500, grad_norm=0.257273405790329, loss=3.5579819679260254 | |
I0914 09:35:36.496314 139620461295360 logging_writer.py:48] [24000] global_step=24000, grad_norm=0.2570257782936096, loss=3.555431365966797 | |
I0914 09:37:06.786156 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:37:14.321404 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:37:24.197975 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:37:26.461626 139785753851712 submission_runner.py:376] Time since start: 8567.43s, Step: 24270, {'train/accuracy': 0.5795798897743225, 'train/loss': 2.023024320602417, 'validation/accuracy': 0.5407800078392029, 'validation/loss': 2.230668306350708, 'validation/num_examples': 50000, 'test/accuracy': 0.41540002822875977, 'test/loss': 2.895124673843384, 'test/num_examples': 10000, 'score': 8224.06004691124, 'total_duration': 8567.42837190628, 'accumulated_submission_time': 8224.06004691124, 'accumulated_eval_time': 342.4695653915405, 'accumulated_logging_time': 0.5343174934387207} | |
I0914 09:37:26.481554 139620444509952 logging_writer.py:48] [24270] accumulated_eval_time=342.469565, accumulated_logging_time=0.534317, accumulated_submission_time=8224.060047, global_step=24270, preemption_count=0, score=8224.060047, test/accuracy=0.415400, test/loss=2.895125, test/num_examples=10000, total_duration=8567.428372, train/accuracy=0.579580, train/loss=2.023024, validation/accuracy=0.540780, validation/loss=2.230668, validation/num_examples=50000 | |
I0914 09:38:43.997312 139620452902656 logging_writer.py:48] [24500] global_step=24500, grad_norm=0.25301215052604675, loss=3.5550875663757324 | |
I0914 09:41:32.170509 139620444509952 logging_writer.py:48] [25000] global_step=25000, grad_norm=0.2632233500480652, loss=3.5506157875061035 | |
I0914 09:44:20.310847 139620452902656 logging_writer.py:48] [25500] global_step=25500, grad_norm=0.2634458839893341, loss=3.6524815559387207 | |
I0914 09:45:56.501331 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:46:04.409524 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:46:13.919648 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:46:16.160629 139785753851712 submission_runner.py:376] Time since start: 9097.13s, Step: 25788, {'train/accuracy': 0.6070232391357422, 'train/loss': 1.906052589416504, 'validation/accuracy': 0.5681399703025818, 'validation/loss': 2.081261396408081, 'validation/num_examples': 50000, 'test/accuracy': 0.4448000192642212, 'test/loss': 2.7579729557037354, 'test/num_examples': 10000, 'score': 8734.046847581863, 'total_duration': 9097.127324581146, 'accumulated_submission_time': 8734.046847581863, 'accumulated_eval_time': 362.12878465652466, 'accumulated_logging_time': 0.5639915466308594} | |
I0914 09:46:16.183245 139621090420480 logging_writer.py:48] [25788] accumulated_eval_time=362.128785, accumulated_logging_time=0.563992, accumulated_submission_time=8734.046848, global_step=25788, preemption_count=0, score=8734.046848, test/accuracy=0.444800, test/loss=2.757973, test/num_examples=10000, total_duration=9097.127325, train/accuracy=0.607023, train/loss=1.906053, validation/accuracy=0.568140, validation/loss=2.081261, validation/num_examples=50000 | |
I0914 09:47:27.752431 139621098813184 logging_writer.py:48] [26000] global_step=26000, grad_norm=0.25486627221107483, loss=3.5052614212036133 | |
I0914 09:50:15.872663 139621090420480 logging_writer.py:48] [26500] global_step=26500, grad_norm=0.24980434775352478, loss=3.4703752994537354 | |
I0914 09:53:04.141244 139621098813184 logging_writer.py:48] [27000] global_step=27000, grad_norm=0.25720298290252686, loss=3.4930596351623535 | |
I0914 09:54:46.202930 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 09:54:54.460194 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 09:55:04.621035 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 09:55:06.895931 139785753851712 submission_runner.py:376] Time since start: 9627.86s, Step: 27305, {'train/accuracy': 0.6470025181770325, 'train/loss': 1.7079155445098877, 'validation/accuracy': 0.5748400092124939, 'validation/loss': 2.0490245819091797, 'validation/num_examples': 50000, 'test/accuracy': 0.4561000168323517, 'test/loss': 2.6967175006866455, 'test/num_examples': 10000, 'score': 9244.031145811081, 'total_duration': 9627.8626434803, 'accumulated_submission_time': 9244.031145811081, 'accumulated_eval_time': 382.8217294216156, 'accumulated_logging_time': 0.5991528034210205} | |
I0914 09:55:06.920699 139620452902656 logging_writer.py:48] [27305] accumulated_eval_time=382.821729, accumulated_logging_time=0.599153, accumulated_submission_time=9244.031146, global_step=27305, preemption_count=0, score=9244.031146, test/accuracy=0.456100, test/loss=2.696718, test/num_examples=10000, total_duration=9627.862643, train/accuracy=0.647003, train/loss=1.707916, validation/accuracy=0.574840, validation/loss=2.049025, validation/num_examples=50000 | |
I0914 09:56:12.884320 139620461295360 logging_writer.py:48] [27500] global_step=27500, grad_norm=0.24228453636169434, loss=3.4076318740844727
I0914 09:59:01.164959 139620452902656 logging_writer.py:48] [28000] global_step=28000, grad_norm=0.25307339429855347, loss=3.5211498737335205
I0914 10:01:49.381948 139620461295360 logging_writer.py:48] [28500] global_step=28500, grad_norm=0.2569205164909363, loss=3.4712512493133545
I0914 10:03:37.166977 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:03:45.304134 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:03:55.034219 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:03:57.295296 139785753851712 submission_runner.py:376] Time since start: 10158.26s, Step: 28822, {'train/accuracy': 0.6150151491165161, 'train/loss': 1.8384064435958862, 'validation/accuracy': 0.5671799778938293, 'validation/loss': 2.073777675628662, 'validation/num_examples': 50000, 'test/accuracy': 0.4439000189304352, 'test/loss': 2.7257843017578125, 'test/num_examples': 10000, 'score': 9754.244331598282, 'total_duration': 10158.26204609871, 'accumulated_submission_time': 9754.244331598282, 'accumulated_eval_time': 402.9500343799591, 'accumulated_logging_time': 0.6338088512420654}
I0914 10:03:57.321540 139620444509952 logging_writer.py:48] [28822] accumulated_eval_time=402.950034, accumulated_logging_time=0.633809, accumulated_submission_time=9754.244332, global_step=28822, preemption_count=0, score=9754.244332, test/accuracy=0.443900, test/loss=2.725784, test/num_examples=10000, total_duration=10158.262046, train/accuracy=0.615015, train/loss=1.838406, validation/accuracy=0.567180, validation/loss=2.073778, validation/num_examples=50000
I0914 10:04:57.400686 139621090420480 logging_writer.py:48] [29000] global_step=29000, grad_norm=0.2559642493724823, loss=3.615403890609741
I0914 10:07:45.404992 139620444509952 logging_writer.py:48] [29500] global_step=29500, grad_norm=0.25525912642478943, loss=3.485414743423462
I0914 10:10:33.635075 139621090420480 logging_writer.py:48] [30000] global_step=30000, grad_norm=0.2619873881340027, loss=3.531353235244751
I0914 10:12:27.489350 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:12:35.666204 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:12:46.634149 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:12:48.918695 139785753851712 submission_runner.py:376] Time since start: 10689.89s, Step: 30340, {'train/accuracy': 0.6219905614852905, 'train/loss': 1.8566009998321533, 'validation/accuracy': 0.5748000144958496, 'validation/loss': 2.0822231769561768, 'validation/num_examples': 50000, 'test/accuracy': 0.4506000280380249, 'test/loss': 2.730398178100586, 'test/num_examples': 10000, 'score': 10264.379125356674, 'total_duration': 10689.885441303253, 'accumulated_submission_time': 10264.379125356674, 'accumulated_eval_time': 424.3793590068817, 'accumulated_logging_time': 0.6697630882263184}
I0914 10:12:48.940200 139621090420480 logging_writer.py:48] [30340] accumulated_eval_time=424.379359, accumulated_logging_time=0.669763, accumulated_submission_time=10264.379125, global_step=30340, preemption_count=0, score=10264.379125, test/accuracy=0.450600, test/loss=2.730398, test/num_examples=10000, total_duration=10689.885441, train/accuracy=0.621991, train/loss=1.856601, validation/accuracy=0.574800, validation/loss=2.082223, validation/num_examples=50000
I0914 10:13:42.962412 139621098813184 logging_writer.py:48] [30500] global_step=30500, grad_norm=0.24631181359291077, loss=3.4548144340515137
I0914 10:16:31.118767 139621090420480 logging_writer.py:48] [31000] global_step=31000, grad_norm=0.2661013603210449, loss=3.5372538566589355
I0914 10:19:19.433270 139621098813184 logging_writer.py:48] [31500] global_step=31500, grad_norm=0.26415982842445374, loss=3.5574021339416504
I0914 10:21:18.972899 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:21:27.011381 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:21:38.255725 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:21:40.473356 139785753851712 submission_runner.py:376] Time since start: 11221.44s, Step: 31857, {'train/accuracy': 0.6304607391357422, 'train/loss': 1.7625677585601807, 'validation/accuracy': 0.5828399658203125, 'validation/loss': 1.97676682472229, 'validation/num_examples': 50000, 'test/accuracy': 0.460500031709671, 'test/loss': 2.6475884914398193, 'test/num_examples': 10000, 'score': 10774.378732919693, 'total_duration': 11221.44010066986, 'accumulated_submission_time': 10774.378732919693, 'accumulated_eval_time': 445.87978982925415, 'accumulated_logging_time': 0.7013595104217529}
I0914 10:21:40.499169 139618842306304 logging_writer.py:48] [31857] accumulated_eval_time=445.879790, accumulated_logging_time=0.701360, accumulated_submission_time=10774.378733, global_step=31857, preemption_count=0, score=10774.378733, test/accuracy=0.460500, test/loss=2.647588, test/num_examples=10000, total_duration=11221.440101, train/accuracy=0.630461, train/loss=1.762568, validation/accuracy=0.582840, validation/loss=1.976767, validation/num_examples=50000
I0914 10:22:28.939013 139618850699008 logging_writer.py:48] [32000] global_step=32000, grad_norm=0.27015623450279236, loss=3.525996446609497
I0914 10:25:17.226516 139618842306304 logging_writer.py:48] [32500] global_step=32500, grad_norm=0.2672438621520996, loss=3.547415256500244
I0914 10:28:05.477488 139618850699008 logging_writer.py:48] [33000] global_step=33000, grad_norm=0.2667967975139618, loss=3.43361759185791
I0914 10:30:10.750766 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:30:18.859109 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:30:29.977935 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:30:32.261843 139785753851712 submission_runner.py:376] Time since start: 11753.23s, Step: 33374, {'train/accuracy': 0.6193598508834839, 'train/loss': 1.826348066329956, 'validation/accuracy': 0.5787799954414368, 'validation/loss': 2.038647413253784, 'validation/num_examples': 50000, 'test/accuracy': 0.46250003576278687, 'test/loss': 2.679227352142334, 'test/num_examples': 10000, 'score': 11284.596626758575, 'total_duration': 11753.228493452072, 'accumulated_submission_time': 11284.596626758575, 'accumulated_eval_time': 467.390745639801, 'accumulated_logging_time': 0.738187313079834}
I0914 10:30:32.285148 139618850699008 logging_writer.py:48] [33374] accumulated_eval_time=467.390746, accumulated_logging_time=0.738187, accumulated_submission_time=11284.596627, global_step=33374, preemption_count=0, score=11284.596627, test/accuracy=0.462500, test/loss=2.679227, test/num_examples=10000, total_duration=11753.228493, train/accuracy=0.619360, train/loss=1.826348, validation/accuracy=0.578780, validation/loss=2.038647, validation/num_examples=50000
I0914 10:31:15.049534 139621090420480 logging_writer.py:48] [33500] global_step=33500, grad_norm=0.27199679613113403, loss=3.4771883487701416
I0914 10:34:03.261353 139618850699008 logging_writer.py:48] [34000] global_step=34000, grad_norm=0.267585426568985, loss=3.463315010070801
I0914 10:36:51.504971 139621090420480 logging_writer.py:48] [34500] global_step=34500, grad_norm=0.2632179260253906, loss=3.4990382194519043
I0914 10:39:02.502562 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:39:10.827893 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:39:21.889250 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:39:24.149815 139785753851712 submission_runner.py:376] Time since start: 12285.12s, Step: 34891, {'train/accuracy': 0.6135203838348389, 'train/loss': 1.8042339086532593, 'validation/accuracy': 0.5674799680709839, 'validation/loss': 2.0255982875823975, 'validation/num_examples': 50000, 'test/accuracy': 0.4407000243663788, 'test/loss': 2.6929032802581787, 'test/num_examples': 10000, 'score': 11794.77813744545, 'total_duration': 12285.116560459137, 'accumulated_submission_time': 11794.77813744545, 'accumulated_eval_time': 489.0379819869995, 'accumulated_logging_time': 0.7739980220794678}
I0914 10:39:24.173407 139618833913600 logging_writer.py:48] [34891] accumulated_eval_time=489.037982, accumulated_logging_time=0.773998, accumulated_submission_time=11794.778137, global_step=34891, preemption_count=0, score=11794.778137, test/accuracy=0.440700, test/loss=2.692903, test/num_examples=10000, total_duration=12285.116560, train/accuracy=0.613520, train/loss=1.804234, validation/accuracy=0.567480, validation/loss=2.025598, validation/num_examples=50000
I0914 10:40:01.195227 139618842306304 logging_writer.py:48] [35000] global_step=35000, grad_norm=0.2558711767196655, loss=3.463052988052368
I0914 10:42:49.472806 139618833913600 logging_writer.py:48] [35500] global_step=35500, grad_norm=0.2669128179550171, loss=3.420599937438965
I0914 10:45:37.677465 139618842306304 logging_writer.py:48] [36000] global_step=36000, grad_norm=0.2667349874973297, loss=3.52689528465271
I0914 10:47:54.355814 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:48:02.683128 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:48:14.021110 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:48:16.292880 139785753851712 submission_runner.py:376] Time since start: 12817.26s, Step: 36408, {'train/accuracy': 0.6544762253761292, 'train/loss': 1.7204208374023438, 'validation/accuracy': 0.5828199982643127, 'validation/loss': 2.025986433029175, 'validation/num_examples': 50000, 'test/accuracy': 0.45920002460479736, 'test/loss': 2.709810733795166, 'test/num_examples': 10000, 'score': 12304.926926612854, 'total_duration': 12817.259620189667, 'accumulated_submission_time': 12304.926926612854, 'accumulated_eval_time': 510.97503638267517, 'accumulated_logging_time': 0.8083920478820801}
I0914 10:48:16.314396 139621082027776 logging_writer.py:48] [36408] accumulated_eval_time=510.975036, accumulated_logging_time=0.808392, accumulated_submission_time=12304.926927, global_step=36408, preemption_count=0, score=12304.926927, test/accuracy=0.459200, test/loss=2.709811, test/num_examples=10000, total_duration=12817.259620, train/accuracy=0.654476, train/loss=1.720421, validation/accuracy=0.582820, validation/loss=2.025986, validation/num_examples=50000
I0914 10:48:47.581541 139621090420480 logging_writer.py:48] [36500] global_step=36500, grad_norm=0.26472190022468567, loss=3.4355721473693848
I0914 10:51:35.596939 139621082027776 logging_writer.py:48] [37000] global_step=37000, grad_norm=0.26586592197418213, loss=3.414236545562744
I0914 10:54:23.842362 139621090420480 logging_writer.py:48] [37500] global_step=37500, grad_norm=0.2715505361557007, loss=3.4005072116851807
I0914 10:56:46.598515 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:56:54.445778 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:57:05.205040 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:57:07.497152 139785753851712 submission_runner.py:376] Time since start: 13348.46s, Step: 37926, {'train/accuracy': 0.6563695669174194, 'train/loss': 1.7108644247055054, 'validation/accuracy': 0.5983799695968628, 'validation/loss': 1.9706319570541382, 'validation/num_examples': 50000, 'test/accuracy': 0.48260003328323364, 'test/loss': 2.6029040813446045, 'test/num_examples': 10000, 'score': 12815.176861763, 'total_duration': 13348.463897228241, 'accumulated_submission_time': 12815.176861763, 'accumulated_eval_time': 531.8736464977264, 'accumulated_logging_time': 0.8413448333740234}
I0914 10:57:07.519643 139618833913600 logging_writer.py:48] [37926] accumulated_eval_time=531.873646, accumulated_logging_time=0.841345, accumulated_submission_time=12815.176862, global_step=37926, preemption_count=0, score=12815.176862, test/accuracy=0.482600, test/loss=2.602904, test/num_examples=10000, total_duration=13348.463897, train/accuracy=0.656370, train/loss=1.710864, validation/accuracy=0.598380, validation/loss=1.970632, validation/num_examples=50000
I0914 10:57:32.709061 139618842306304 logging_writer.py:48] [38000] global_step=38000, grad_norm=0.2711631655693054, loss=3.4340717792510986
I0914 11:00:20.877791 139618833913600 logging_writer.py:48] [38500] global_step=38500, grad_norm=0.2710915207862854, loss=3.424994945526123
I0914 11:03:09.109183 139618842306304 logging_writer.py:48] [39000] global_step=39000, grad_norm=0.2728542387485504, loss=3.4836416244506836
I0914 11:05:37.603592 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:05:45.487278 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:05:56.162706 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:05:58.400947 139785753851712 submission_runner.py:376] Time since start: 13879.37s, Step: 39443, {'train/accuracy': 0.6421595811843872, 'train/loss': 1.6656968593597412, 'validation/accuracy': 0.5947999954223633, 'validation/loss': 1.8805724382400513, 'validation/num_examples': 50000, 'test/accuracy': 0.4724000096321106, 'test/loss': 2.5521347522735596, 'test/num_examples': 10000, 'score': 13325.226942777634, 'total_duration': 13879.367692947388, 'accumulated_submission_time': 13325.226942777634, 'accumulated_eval_time': 552.67098736763, 'accumulated_logging_time': 0.8749892711639404}
I0914 11:05:58.423598 139621090420480 logging_writer.py:48] [39443] accumulated_eval_time=552.670987, accumulated_logging_time=0.874989, accumulated_submission_time=13325.226943, global_step=39443, preemption_count=0, score=13325.226943, test/accuracy=0.472400, test/loss=2.552135, test/num_examples=10000, total_duration=13879.367693, train/accuracy=0.642160, train/loss=1.665697, validation/accuracy=0.594800, validation/loss=1.880572, validation/num_examples=50000
I0914 11:06:17.864188 139621098813184 logging_writer.py:48] [39500] global_step=39500, grad_norm=0.2749280035495758, loss=3.5009050369262695
I0914 11:09:05.970704 139621090420480 logging_writer.py:48] [40000] global_step=40000, grad_norm=0.2732629179954529, loss=3.5085206031799316
I0914 11:11:54.222726 139621098813184 logging_writer.py:48] [40500] global_step=40500, grad_norm=0.27376991510391235, loss=3.424867868423462
I0914 11:14:28.716043 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:14:36.618146 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:14:47.344748 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:14:49.609976 139785753851712 submission_runner.py:376] Time since start: 14410.58s, Step: 40961, {'train/accuracy': 0.6224888563156128, 'train/loss': 1.8308100700378418, 'validation/accuracy': 0.5813800096511841, 'validation/loss': 2.03011155128479, 'validation/num_examples': 50000, 'test/accuracy': 0.46330001950263977, 'test/loss': 2.654615640640259, 'test/num_examples': 10000, 'score': 13835.485067367554, 'total_duration': 14410.576689004898, 'accumulated_submission_time': 13835.485067367554, 'accumulated_eval_time': 573.5648620128632, 'accumulated_logging_time': 0.9091906547546387}
I0914 11:14:49.632932 139618850699008 logging_writer.py:48] [40961] accumulated_eval_time=573.564862, accumulated_logging_time=0.909191, accumulated_submission_time=13835.485067, global_step=40961, preemption_count=0, score=13835.485067, test/accuracy=0.463300, test/loss=2.654616, test/num_examples=10000, total_duration=14410.576689, train/accuracy=0.622489, train/loss=1.830810, validation/accuracy=0.581380, validation/loss=2.030112, validation/num_examples=50000
I0914 11:15:03.108616 139620410959616 logging_writer.py:48] [41000] global_step=41000, grad_norm=0.2798670530319214, loss=3.474769115447998
I0914 11:17:51.292984 139618850699008 logging_writer.py:48] [41500] global_step=41500, grad_norm=0.2704616189002991, loss=3.3771965503692627
I0914 11:20:39.542866 139620410959616 logging_writer.py:48] [42000] global_step=42000, grad_norm=0.2555727958679199, loss=3.3959810733795166
I0914 11:23:19.824655 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:23:27.647799 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:23:38.533474 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:23:40.750660 139785753851712 submission_runner.py:376] Time since start: 14941.72s, Step: 42478, {'train/accuracy': 0.6535195708274841, 'train/loss': 1.6985844373703003, 'validation/accuracy': 0.6078799962997437, 'validation/loss': 1.8977073431015015, 'validation/num_examples': 50000, 'test/accuracy': 0.4879000186920166, 'test/loss': 2.5333468914031982, 'test/num_examples': 10000, 'score': 14345.642942905426, 'total_duration': 14941.717395067215, 'accumulated_submission_time': 14345.642942905426, 'accumulated_eval_time': 594.4908349514008, 'accumulated_logging_time': 0.9433751106262207}
I0914 11:23:40.773953 139618045392640 logging_writer.py:48] [42478] accumulated_eval_time=594.490835, accumulated_logging_time=0.943375, accumulated_submission_time=14345.642943, global_step=42478, preemption_count=0, score=14345.642943, test/accuracy=0.487900, test/loss=2.533347, test/num_examples=10000, total_duration=14941.717395, train/accuracy=0.653520, train/loss=1.698584, validation/accuracy=0.607880, validation/loss=1.897707, validation/num_examples=50000
I0914 11:23:48.501946 139621098813184 logging_writer.py:48] [42500] global_step=42500, grad_norm=0.26943811774253845, loss=3.3446433544158936
I0914 11:26:36.638270 139618045392640 logging_writer.py:48] [43000] global_step=43000, grad_norm=0.26483917236328125, loss=3.457627058029175
I0914 11:29:24.871920 139621098813184 logging_writer.py:48] [43500] global_step=43500, grad_norm=0.2760085463523865, loss=3.4224777221679688
I0914 11:32:10.842974 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:32:18.730613 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:32:29.689754 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:32:31.948379 139785753851712 submission_runner.py:376] Time since start: 15472.92s, Step: 43995, {'train/accuracy': 0.6800063848495483, 'train/loss': 1.590664267539978, 'validation/accuracy': 0.5967199802398682, 'validation/loss': 1.9601261615753174, 'validation/num_examples': 50000, 'test/accuracy': 0.47510001063346863, 'test/loss': 2.590411424636841, 'test/num_examples': 10000, 'score': 14855.678381443024, 'total_duration': 15472.915107250214, 'accumulated_submission_time': 14855.678381443024, 'accumulated_eval_time': 615.5962023735046, 'accumulated_logging_time': 0.9775500297546387}
I0914 11:32:31.970802 139620410959616 logging_writer.py:48] [43995] accumulated_eval_time=615.596202, accumulated_logging_time=0.977550, accumulated_submission_time=14855.678381, global_step=43995, preemption_count=0, score=14855.678381, test/accuracy=0.475100, test/loss=2.590411, test/num_examples=10000, total_duration=15472.915107, train/accuracy=0.680006, train/loss=1.590664, validation/accuracy=0.596720, validation/loss=1.960126, validation/num_examples=50000
I0914 11:32:33.989422 139620419352320 logging_writer.py:48] [44000] global_step=44000, grad_norm=0.2681794762611389, loss=3.3710594177246094
I0914 11:35:22.091144 139620410959616 logging_writer.py:48] [44500] global_step=44500, grad_norm=0.2750702202320099, loss=3.4190914630889893
I0914 11:38:10.329913 139620419352320 logging_writer.py:48] [45000] global_step=45000, grad_norm=0.2686624228954315, loss=3.4064462184906006
I0914 11:40:58.527068 139620410959616 logging_writer.py:48] [45500] global_step=45500, grad_norm=0.2688489556312561, loss=3.4332542419433594
I0914 11:41:01.977522 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:41:10.070076 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:41:20.937899 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:41:23.177372 139785753851712 submission_runner.py:376] Time since start: 16004.14s, Step: 45512, {'train/accuracy': 0.6493741869926453, 'train/loss': 1.7071624994277954, 'validation/accuracy': 0.5888199806213379, 'validation/loss': 2.0002124309539795, 'validation/num_examples': 50000, 'test/accuracy': 0.45580002665519714, 'test/loss': 2.681110382080078, 'test/num_examples': 10000, 'score': 15365.652275562286, 'total_duration': 16004.144119977951, 'accumulated_submission_time': 15365.652275562286, 'accumulated_eval_time': 636.7960221767426, 'accumulated_logging_time': 1.010751485824585}
I0914 11:41:23.199499 139618045392640 logging_writer.py:48] [45512] accumulated_eval_time=636.796022, accumulated_logging_time=1.010751, accumulated_submission_time=15365.652276, global_step=45512, preemption_count=0, score=15365.652276, test/accuracy=0.455800, test/loss=2.681110, test/num_examples=10000, total_duration=16004.144120, train/accuracy=0.649374, train/loss=1.707162, validation/accuracy=0.588820, validation/loss=2.000212, validation/num_examples=50000
I0914 11:44:07.496607 139621098813184 logging_writer.py:48] [46000] global_step=46000, grad_norm=0.2720436155796051, loss=3.4086406230926514
I0914 11:46:55.758412 139618045392640 logging_writer.py:48] [46500] global_step=46500, grad_norm=0.28809547424316406, loss=3.4885668754577637
I0914 11:49:44.016989 139621098813184 logging_writer.py:48] [47000] global_step=47000, grad_norm=0.2793656885623932, loss=3.47119140625
I0914 11:49:53.191527 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:50:01.044825 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:50:12.127666 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:50:14.442233 139785753851712 submission_runner.py:376] Time since start: 16535.41s, Step: 47029, {'train/accuracy': 0.6650390625, 'train/loss': 1.6022390127182007, 'validation/accuracy': 0.6084200143814087, 'validation/loss': 1.8593943119049072, 'validation/num_examples': 50000, 'test/accuracy': 0.4799000322818756, 'test/loss': 2.522620677947998, 'test/num_examples': 10000, 'score': 15875.610366106033, 'total_duration': 16535.40897345543, 'accumulated_submission_time': 15875.610366106033, 'accumulated_eval_time': 658.0466804504395, 'accumulated_logging_time': 1.043591022491455}
I0914 11:50:14.464774 139618045392640 logging_writer.py:48] [47029] accumulated_eval_time=658.046680, accumulated_logging_time=1.043591, accumulated_submission_time=15875.610366, global_step=47029, preemption_count=0, score=15875.610366, test/accuracy=0.479900, test/loss=2.522621, test/num_examples=10000, total_duration=16535.408973, train/accuracy=0.665039, train/loss=1.602239, validation/accuracy=0.608420, validation/loss=1.859394, validation/num_examples=50000
I0914 11:52:53.123323 139620410959616 logging_writer.py:48] [47500] global_step=47500, grad_norm=0.2651343047618866, loss=3.3844122886657715
I0914 11:55:41.347779 139618045392640 logging_writer.py:48] [48000] global_step=48000, grad_norm=0.28748777508735657, loss=3.477534770965576
I0914 11:58:29.551903 139620410959616 logging_writer.py:48] [48500] global_step=48500, grad_norm=0.29311808943748474, loss=3.514925241470337
I0914 11:58:44.444758 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:58:52.251591 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:59:03.331104 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:59:05.625081 139785753851712 submission_runner.py:376] Time since start: 17066.59s, Step: 48546, {'train/accuracy': 0.6487563848495483, 'train/loss': 1.6832637786865234, 'validation/accuracy': 0.5993599891662598, 'validation/loss': 1.918584942817688, 'validation/num_examples': 50000, 'test/accuracy': 0.47190001606941223, 'test/loss': 2.5920767784118652, 'test/num_examples': 10000, 'score': 16385.556366205215, 'total_duration': 17066.591827869415, 'accumulated_submission_time': 16385.556366205215, 'accumulated_eval_time': 679.2269690036774, 'accumulated_logging_time': 1.0766348838806152}
I0914 11:59:05.646985 139618028607232 logging_writer.py:48] [48546] accumulated_eval_time=679.226969, accumulated_logging_time=1.076635, accumulated_submission_time=16385.556366, global_step=48546, preemption_count=0, score=16385.556366, test/accuracy=0.471900, test/loss=2.592077, test/num_examples=10000, total_duration=17066.591828, train/accuracy=0.648756, train/loss=1.683264, validation/accuracy=0.599360, validation/loss=1.918585, validation/num_examples=50000
I0914 12:01:38.705673 139618036999936 logging_writer.py:48] [49000] global_step=49000, grad_norm=0.28258246183395386, loss=3.4261724948883057
I0914 12:04:26.898727 139618028607232 logging_writer.py:48] [49500] global_step=49500, grad_norm=0.28819113969802856, loss=3.4131767749786377
I0914 12:07:15.160165 139618036999936 logging_writer.py:48] [50000] global_step=50000, grad_norm=0.27262210845947266, loss=3.3209586143493652
I0914 12:07:35.770280 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:07:43.563357 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:07:54.683665 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:07:56.960196 139785753851712 submission_runner.py:376] Time since start: 17597.93s, Step: 50063, {'train/accuracy': 0.6590401530265808, 'train/loss': 1.6971187591552734, 'validation/accuracy': 0.6092999577522278, 'validation/loss': 1.9145151376724243, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5815019607543945, 'test/num_examples': 10000, 'score': 16895.64519429207, 'total_duration': 17597.92692875862, 'accumulated_submission_time': 16895.64519429207, 'accumulated_eval_time': 700.4168322086334, 'accumulated_logging_time': 1.1097350120544434}
I0914 12:07:56.982750 139620419352320 logging_writer.py:48] [50063] accumulated_eval_time=700.416832, accumulated_logging_time=1.109735, accumulated_submission_time=16895.645194, global_step=50063, preemption_count=0, score=16895.645194, test/accuracy=0.480600, test/loss=2.581502, test/num_examples=10000, total_duration=17597.926929, train/accuracy=0.659040, train/loss=1.697119, validation/accuracy=0.609300, validation/loss=1.914515, validation/num_examples=50000
I0914 12:10:24.317590 139621082027776 logging_writer.py:48] [50500] global_step=50500, grad_norm=0.2909661531448364, loss=3.479194164276123
I0914 12:13:12.551684 139620419352320 logging_writer.py:48] [51000] global_step=51000, grad_norm=0.27277132868766785, loss=3.3441390991210938
I0914 12:16:00.776785 139621082027776 logging_writer.py:48] [51500] global_step=51500, grad_norm=0.2970084249973297, loss=3.453324794769287
I0914 12:16:27.110894 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:16:34.974406 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:16:46.097953 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:16:48.275398 139785753851712 submission_runner.py:376] Time since start: 18129.24s, Step: 51580, {'train/accuracy': 0.6513472199440002, 'train/loss': 1.657335877418518, 'validation/accuracy': 0.6073200106620789, 'validation/loss': 1.8589073419570923, 'validation/num_examples': 50000, 'test/accuracy': 0.4838000237941742, 'test/loss': 2.5184545516967773, 'test/num_examples': 10000, 'score': 17405.740804433823, 'total_duration': 18129.24203300476, 'accumulated_submission_time': 17405.740804433823, 'accumulated_eval_time': 721.5812134742737, 'accumulated_logging_time': 1.142387866973877}
I0914 12:16:48.297658 139618045392640 logging_writer.py:48] [51580] accumulated_eval_time=721.581213, accumulated_logging_time=1.142388, accumulated_submission_time=17405.740804, global_step=51580, preemption_count=0, score=17405.740804, test/accuracy=0.483800, test/loss=2.518455, test/num_examples=10000, total_duration=18129.242033, train/accuracy=0.651347, train/loss=1.657336, validation/accuracy=0.607320, validation/loss=1.858907, validation/num_examples=50000
I0914 12:19:09.672078 139620410959616 logging_writer.py:48] [52000] global_step=52000, grad_norm=0.29126450419425964, loss=3.4079971313476562
I0914 12:21:57.867535 139618045392640 logging_writer.py:48] [52500] global_step=52500, grad_norm=0.28497904539108276, loss=3.4135935306549072
I0914 12:24:46.141434 139620410959616 logging_writer.py:48] [53000] global_step=53000, grad_norm=0.2890980541706085, loss=3.3926331996917725
I0914 12:25:18.535628 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:25:26.364237 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:25:37.400487 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:25:39.667940 139785753851712 submission_runner.py:376] Time since start: 18660.63s, Step: 53098, {'train/accuracy': 0.6906289458274841, 'train/loss': 1.4674862623214722, 'validation/accuracy': 0.5989800095558167, 'validation/loss': 1.8677852153778076, 'validation/num_examples': 50000, 'test/accuracy': 0.48270002007484436, 'test/loss': 2.5035319328308105, 'test/num_examples': 10000, 'score': 17915.94573879242, 'total_duration': 18660.634664297104, 'accumulated_submission_time': 17915.94573879242, 'accumulated_eval_time': 742.713464975357, 'accumulated_logging_time': 1.1745285987854004}
I0914 12:25:39.696571 139621098813184 logging_writer.py:48] [53098] accumulated_eval_time=742.713465, accumulated_logging_time=1.174529, accumulated_submission_time=17915.945739, global_step=53098, preemption_count=0, score=17915.945739, test/accuracy=0.482700, test/loss=2.503532, test/num_examples=10000, total_duration=18660.634664, train/accuracy=0.690629, train/loss=1.467486, validation/accuracy=0.598980, validation/loss=1.867785, validation/num_examples=50000
I0914 12:27:55.156162 139621107205888 logging_writer.py:48] [53500] global_step=53500, grad_norm=0.2903880178928375, loss=3.4212396144866943
I0914 12:30:43.397672 139621098813184 logging_writer.py:48] [54000] global_step=54000, grad_norm=0.2859453558921814, loss=3.338300943374634
I0914 12:33:31.304433 139621107205888 logging_writer.py:48] [54500] global_step=54500, grad_norm=0.28716540336608887, loss=3.3848648071289062
I0914 12:34:09.739545 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:34:17.555717 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:34:28.689805 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:34:30.940042 139785753851712 submission_runner.py:376] Time since start: 19191.91s, Step: 54616, {'train/accuracy': 0.6725525856018066, 'train/loss': 1.5843250751495361, 'validation/accuracy': 0.6118199825286865, 'validation/loss': 1.859339952468872, 'validation/num_examples': 50000, 'test/accuracy': 0.47780001163482666, 'test/loss': 2.5555808544158936, 'test/num_examples': 10000, 'score': 18425.954810380936, 'total_duration': 19191.906791448593, 'accumulated_submission_time': 18425.954810380936, 'accumulated_eval_time': 763.9139442443848, 'accumulated_logging_time': 1.2138841152191162}
I0914 12:34:30.967104 139620410959616 logging_writer.py:48] [54616] accumulated_eval_time=763.913944, accumulated_logging_time=1.213884, accumulated_submission_time=18425.954810, global_step=54616, preemption_count=0, score=18425.954810, test/accuracy=0.477800, test/loss=2.555581, test/num_examples=10000, total_duration=19191.906791, train/accuracy=0.672553, train/loss=1.584325, validation/accuracy=0.611820, validation/loss=1.859340, validation/num_examples=50000
I0914 12:36:40.470300 139620419352320 logging_writer.py:48] [55000] global_step=55000, grad_norm=0.2885455787181854, loss=3.4199295043945312
I0914 12:39:28.701473 139620410959616 logging_writer.py:48] [55500] global_step=55500, grad_norm=0.2843439280986786, loss=3.3483569622039795
I0914 12:42:16.934671 139620419352320 logging_writer.py:48] [56000] global_step=56000, grad_norm=0.2804867625236511, loss=3.3117964267730713
I0914 12:43:01.096180 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:43:08.891999 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:43:19.882472 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:43:22.166954 139785753851712 submission_runner.py:376] Time since start: 19723.13s, Step: 56133, {'train/accuracy': 0.6829958558082581, 'train/loss': 1.4818432331085205, 'validation/accuracy': 0.6251199841499329, 'validation/loss': 1.7404627799987793, 'validation/num_examples': 50000, 'test/accuracy': 0.5063000321388245, 'test/loss': 2.392498254776001, 'test/num_examples': 10000, 'score': 18936.050762176514, 'total_duration': 19723.13370156288, 'accumulated_submission_time': 18936.050762176514, 'accumulated_eval_time': 784.9846830368042, 'accumulated_logging_time': 1.2514803409576416}
I0914 12:43:22.194325 139618036999936 logging_writer.py:48] [56133] accumulated_eval_time=784.984683, accumulated_logging_time=1.251480, accumulated_submission_time=18936.050762, global_step=56133, preemption_count=0, score=18936.050762, test/accuracy=0.506300, test/loss=2.392498, test/num_examples=10000, total_duration=19723.133702, train/accuracy=0.682996, train/loss=1.481843, validation/accuracy=0.625120, validation/loss=1.740463, validation/num_examples=50000
I0914 12:45:25.954758 139618045392640 logging_writer.py:48] [56500] global_step=56500, grad_norm=0.28177163004875183, loss=3.2978594303131104
I0914 12:48:13.941949 139618036999936 logging_writer.py:48] [57000] global_step=57000, grad_norm=0.2938775420188904, loss=3.336643695831299
I0914 12:51:02.182384 139618045392640 logging_writer.py:48] [57500] global_step=57500, grad_norm=0.28899526596069336, loss=3.2880070209503174
I0914 12:51:52.388687 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:52:00.224894 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:52:11.284370 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:52:13.527475 139785753851712 submission_runner.py:376] Time since start: 20254.49s, Step: 57651, {'train/accuracy': 0.6757014989852905, 'train/loss': 1.5783743858337402, 'validation/accuracy': 0.619879961013794, 'validation/loss': 1.8317604064941406, 'validation/num_examples': 50000, 'test/accuracy': 0.49640002846717834, 'test/loss': 2.4753739833831787, 'test/num_examples': 10000, 'score': 19446.212296009064, 'total_duration': 20254.494203805923, 'accumulated_submission_time': 19446.212296009064, 'accumulated_eval_time': 806.1234366893768, 'accumulated_logging_time': 1.2888352870941162}
I0914 12:52:13.550995 139618036999936 logging_writer.py:48] [57651] accumulated_eval_time=806.123437, accumulated_logging_time=1.288835, accumulated_submission_time=19446.212296, global_step=57651, preemption_count=0, score=19446.212296, test/accuracy=0.496400, test/loss=2.475374, test/num_examples=10000, total_duration=20254.494204, train/accuracy=0.675701, train/loss=1.578374, validation/accuracy=0.619880, validation/loss=1.831760, validation/num_examples=50000
I0914 12:54:11.198793 139618045392640 logging_writer.py:48] [58000] global_step=58000, grad_norm=0.2972732484340668, loss=3.3535263538360596
I0914 12:56:59.446379 139618036999936 logging_writer.py:48] [58500] global_step=58500, grad_norm=0.2926310896873474, loss=3.381610631942749
I0914 12:59:47.653911 139618045392640 logging_writer.py:48] [59000] global_step=59000, grad_norm=0.2968601882457733, loss=3.4223320484161377
I0914 13:00:43.613631 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:00:51.421866 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:01:02.407345 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:01:04.662746 139785753851712 submission_runner.py:376] Time since start: 20785.63s, Step: 59168, {'train/accuracy': 0.6541573405265808, 'train/loss': 1.6419929265975952, 'validation/accuracy': 0.6013599634170532, 'validation/loss': 1.8765325546264648, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5387802124023438, 'test/num_examples': 10000, 'score': 19956.23943376541, 'total_duration': 20785.629487276077, 'accumulated_submission_time': 19956.23943376541, 'accumulated_eval_time': 827.1725206375122, 'accumulated_logging_time': 1.3252980709075928}
I0914 13:01:04.686016 139620410959616 logging_writer.py:48] [59168] accumulated_eval_time=827.172521, accumulated_logging_time=1.325298, accumulated_submission_time=19956.239434, global_step=59168, preemption_count=0, score=19956.239434, test/accuracy=0.480600, test/loss=2.538780, test/num_examples=10000, total_duration=20785.629487, train/accuracy=0.654157, train/loss=1.641993, validation/accuracy=0.601360, validation/loss=1.876533, validation/num_examples=50000
I0914 13:02:56.666364 139620419352320 logging_writer.py:48] [59500] global_step=59500, grad_norm=0.2902616560459137, loss=3.3645551204681396
I0914 13:05:44.877468 139620410959616 logging_writer.py:48] [60000] global_step=60000, grad_norm=0.2877161502838135, loss=3.365004777908325
I0914 13:08:33.128105 139620419352320 logging_writer.py:48] [60500] global_step=60500, grad_norm=0.2932965159416199, loss=3.345557928085327
I0914 13:09:34.795152 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:09:42.586314 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:09:53.636936 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:09:55.923867 139785753851712 submission_runner.py:376] Time since start: 21316.89s, Step: 60685, {'train/accuracy': 0.6724131107330322, 'train/loss': 1.6111648082733154, 'validation/accuracy': 0.6247599720954895, 'validation/loss': 1.8165513277053833, 'validation/num_examples': 50000, 'test/accuracy': 0.49490001797676086, 'test/loss': 2.476473093032837, 'test/num_examples': 10000, 'score': 20466.31075167656, 'total_duration': 21316.890612602234, 'accumulated_submission_time': 20466.31075167656, 'accumulated_eval_time': 848.3012022972107, 'accumulated_logging_time': 1.362497329711914}
I0914 13:09:55.951478 139618036999936 logging_writer.py:48] [60685] accumulated_eval_time=848.301202, accumulated_logging_time=1.362497, accumulated_submission_time=20466.310752, global_step=60685, preemption_count=0, score=20466.310752, test/accuracy=0.494900, test/loss=2.476473, test/num_examples=10000, total_duration=21316.890613, train/accuracy=0.672413, train/loss=1.611165, validation/accuracy=0.624760, validation/loss=1.816551, validation/num_examples=50000
I0914 13:11:42.044412 139618045392640 logging_writer.py:48] [61000] global_step=61000, grad_norm=0.2919887602329254, loss=3.329354763031006
I0914 13:14:30.159597 139618036999936 logging_writer.py:48] [61500] global_step=61500, grad_norm=0.29778042435646057, loss=3.3058791160583496
I0914 13:17:18.386161 139618045392640 logging_writer.py:48] [62000] global_step=62000, grad_norm=0.2837899625301361, loss=3.382122755050659
I0914 13:18:26.086736 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:18:33.855817 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:18:45.015492 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:18:47.231468 139785753851712 submission_runner.py:376] Time since start: 21848.20s, Step: 62203, {'train/accuracy': 0.6910076141357422, 'train/loss': 1.4848798513412476, 'validation/accuracy': 0.611739993095398, 'validation/loss': 1.8428031206130981, 'validation/num_examples': 50000, 'test/accuracy': 0.4788000285625458, 'test/loss': 2.5234971046447754, 'test/num_examples': 10000, 'score': 20976.412611722946, 'total_duration': 21848.198214292526, 'accumulated_submission_time': 20976.412611722946, 'accumulated_eval_time': 869.4459004402161, 'accumulated_logging_time': 1.4001359939575195}
I0914 13:18:47.254843 139618036999936 logging_writer.py:48] [62203] accumulated_eval_time=869.445900, accumulated_logging_time=1.400136, accumulated_submission_time=20976.412612, global_step=62203, preemption_count=0, score=20976.412612, test/accuracy=0.478800, test/loss=2.523497, test/num_examples=10000, total_duration=21848.198214, train/accuracy=0.691008, train/loss=1.484880, validation/accuracy=0.611740, validation/loss=1.842803, validation/num_examples=50000
I0914 13:20:27.308857 139620419352320 logging_writer.py:48] [62500] global_step=62500, grad_norm=0.30506154894828796, loss=3.297473669052124
I0914 13:23:15.523633 139618036999936 logging_writer.py:48] [63000] global_step=63000, grad_norm=0.3115837275981903, loss=3.338627815246582
I0914 13:26:03.682383 139620419352320 logging_writer.py:48] [63500] global_step=63500, grad_norm=0.31050509214401245, loss=3.2706823348999023
I0914 13:27:17.456019 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:27:25.201742 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:27:36.333347 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:27:38.580904 139785753851712 submission_runner.py:376] Time since start: 22379.55s, Step: 63721, {'train/accuracy': 0.6920240521430969, 'train/loss': 1.4649821519851685, 'validation/accuracy': 0.6265400052070618, 'validation/loss': 1.7699750661849976, 'validation/num_examples': 50000, 'test/accuracy': 0.5057000517845154, 'test/loss': 2.418595314025879, 'test/num_examples': 10000, 'score': 21486.580602407455, 'total_duration': 22379.547651052475, 'accumulated_submission_time': 21486.580602407455, 'accumulated_eval_time': 890.5707561969757, 'accumulated_logging_time': 1.433117389678955}
I0914 13:27:38.603337 139621090420480 logging_writer.py:48] [63721] accumulated_eval_time=890.570756, accumulated_logging_time=1.433117, accumulated_submission_time=21486.580602, global_step=63721, preemption_count=0, score=21486.580602, test/accuracy=0.505700, test/loss=2.418595, test/num_examples=10000, total_duration=22379.547651, train/accuracy=0.692024, train/loss=1.464982, validation/accuracy=0.626540, validation/loss=1.769975, validation/num_examples=50000
I0914 13:29:12.660554 139621098813184 logging_writer.py:48] [64000] global_step=64000, grad_norm=0.29441016912460327, loss=3.2818336486816406
I0914 13:32:00.863299 139621090420480 logging_writer.py:48] [64500] global_step=64500, grad_norm=0.2864547669887543, loss=3.258450508117676
I0914 13:34:49.087769 139621098813184 logging_writer.py:48] [65000] global_step=65000, grad_norm=0.29472991824150085, loss=3.2548041343688965
I0914 13:36:08.911741 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:36:16.670967 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:36:27.876635 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:36:30.129736 139785753851712 submission_runner.py:376] Time since start: 22911.10s, Step: 65239, {'train/accuracy': 0.6842514276504517, 'train/loss': 1.510816216468811, 'validation/accuracy': 0.6255399584770203, 'validation/loss': 1.7839473485946655, 'validation/num_examples': 50000, 'test/accuracy': 0.4951000213623047, 'test/loss': 2.4557323455810547, 'test/num_examples': 10000, 'score': 21996.856678962708, 'total_duration': 22911.09648013115, 'accumulated_submission_time': 21996.856678962708, 'accumulated_eval_time': 911.7887194156647, 'accumulated_logging_time': 1.464731216430664}
I0914 13:36:30.153545 139620419352320 logging_writer.py:48] [65239] accumulated_eval_time=911.788719, accumulated_logging_time=1.464731, accumulated_submission_time=21996.856679, global_step=65239, preemption_count=0, score=21996.856679, test/accuracy=0.495100, test/loss=2.455732, test/num_examples=10000, total_duration=22911.096480, train/accuracy=0.684251, train/loss=1.510816, validation/accuracy=0.625540, validation/loss=1.783947, validation/num_examples=50000
I0914 13:37:58.123054 139621082027776 logging_writer.py:48] [65500] global_step=65500, grad_norm=0.30739957094192505, loss=3.3116114139556885
I0914 13:40:46.180480 139620419352320 logging_writer.py:48] [66000] global_step=66000, grad_norm=0.3004262447357178, loss=3.3701276779174805
I0914 13:43:34.421802 139621082027776 logging_writer.py:48] [66500] global_step=66500, grad_norm=0.2919093072414398, loss=3.2609100341796875
I0914 13:45:00.284182 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:45:08.062767 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:45:19.342478 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:45:21.922167 139785753851712 submission_runner.py:376] Time since start: 23442.89s, Step: 66757, {'train/accuracy': 0.6906688213348389, 'train/loss': 1.5202724933624268, 'validation/accuracy': 0.6343199610710144, 'validation/loss': 1.7723238468170166, 'validation/num_examples': 50000, 'test/accuracy': 0.5078000426292419, 'test/loss': 2.4403321743011475, 'test/num_examples': 10000, 'score': 22506.95446920395, 'total_duration': 23442.88888812065, 'accumulated_submission_time': 22506.95446920395, 'accumulated_eval_time': 933.426650762558, 'accumulated_logging_time': 1.4981484413146973}
I0914 13:45:21.947076 139618036999936 logging_writer.py:48] [66757] accumulated_eval_time=933.426651, accumulated_logging_time=1.498148, accumulated_submission_time=22506.954469, global_step=66757, preemption_count=0, score=22506.954469, test/accuracy=0.507800, test/loss=2.440332, test/num_examples=10000, total_duration=23442.888888, train/accuracy=0.690669, train/loss=1.520272, validation/accuracy=0.634320, validation/loss=1.772324, validation/num_examples=50000
I0914 13:46:43.979573 139618045392640 logging_writer.py:48] [67000] global_step=67000, grad_norm=0.2981511354446411, loss=3.263331413269043
I0914 13:49:32.145080 139618036999936 logging_writer.py:48] [67500] global_step=67500, grad_norm=0.3048495054244995, loss=3.302417278289795
I0914 13:52:20.352500 139618045392640 logging_writer.py:48] [68000] global_step=68000, grad_norm=0.3056058883666992, loss=3.3564062118530273
I0914 13:53:52.119890 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:53:59.863528 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:54:11.162717 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:54:13.443134 139785753851712 submission_runner.py:376] Time since start: 23974.41s, Step: 68275, {'train/accuracy': 0.6909080147743225, 'train/loss': 1.470110535621643, 'validation/accuracy': 0.6406999826431274, 'validation/loss': 1.7015717029571533, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3714752197265625, 'test/num_examples': 10000, 'score': 23017.093948602676, 'total_duration': 23974.409868717194, 'accumulated_submission_time': 23017.093948602676, 'accumulated_eval_time': 954.7498555183411, 'accumulated_logging_time': 1.5329856872558594}
I0914 13:54:13.467223 139618045392640 logging_writer.py:48] [68275] accumulated_eval_time=954.749856, accumulated_logging_time=1.532986, accumulated_submission_time=23017.093949, global_step=68275, preemption_count=0, score=23017.093949, test/accuracy=0.515400, test/loss=2.371475, test/num_examples=10000, total_duration=23974.409869, train/accuracy=0.690908, train/loss=1.470111, validation/accuracy=0.640700, validation/loss=1.701572, validation/num_examples=50000
I0914 13:55:29.413370 139620419352320 logging_writer.py:48] [68500] global_step=68500, grad_norm=0.30467408895492554, loss=3.2854578495025635
I0914 13:58:17.648288 139618045392640 logging_writer.py:48] [69000] global_step=69000, grad_norm=0.3096533715724945, loss=3.2790355682373047
I0914 14:01:05.889391 139620419352320 logging_writer.py:48] [69500] global_step=69500, grad_norm=0.3090498149394989, loss=3.3411953449249268
I0914 14:02:43.574101 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:02:51.481779 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:03:02.490966 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:03:04.750303 139785753851712 submission_runner.py:376] Time since start: 24505.72s, Step: 69792, {'train/accuracy': 0.6947743892669678, 'train/loss': 1.4690238237380981, 'validation/accuracy': 0.6391599774360657, 'validation/loss': 1.700404167175293, 'validation/num_examples': 50000, 'test/accuracy': 0.5162000060081482, 'test/loss': 2.340641736984253, 'test/num_examples': 10000, 'score': 23527.16760492325, 'total_duration': 24505.717046260834, 'accumulated_submission_time': 23527.16760492325, 'accumulated_eval_time': 975.9260275363922, 'accumulated_logging_time': 1.5669758319854736}
I0914 14:03:04.773891 139620410959616 logging_writer.py:48] [69792] accumulated_eval_time=975.926028, accumulated_logging_time=1.566976, accumulated_submission_time=23527.167605, global_step=69792, preemption_count=0, score=23527.167605, test/accuracy=0.516200, test/loss=2.340642, test/num_examples=10000, total_duration=24505.717046, train/accuracy=0.694774, train/loss=1.469024, validation/accuracy=0.639160, validation/loss=1.700404, validation/num_examples=50000
I0914 14:04:15.037125 139621090420480 logging_writer.py:48] [70000] global_step=70000, grad_norm=0.3104066550731659, loss=3.3210484981536865
I0914 14:07:02.985107 139620410959616 logging_writer.py:48] [70500] global_step=70500, grad_norm=0.31107574701309204, loss=3.2496469020843506
I0914 14:09:51.206982 139621090420480 logging_writer.py:48] [71000] global_step=71000, grad_norm=0.3089660704135895, loss=3.335127353668213
I0914 14:11:34.908772 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:11:42.613535 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:11:53.646508 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:11:55.885930 139785753851712 submission_runner.py:376] Time since start: 25036.85s, Step: 71310, {'train/accuracy': 0.6975247263908386, 'train/loss': 1.482591152191162, 'validation/accuracy': 0.6232199668884277, 'validation/loss': 1.8254543542861938, 'validation/num_examples': 50000, 'test/accuracy': 0.4918000102043152, 'test/loss': 2.5100009441375732, 'test/num_examples': 10000, 'score': 24037.270278930664, 'total_duration': 25036.85267686844, 'accumulated_submission_time': 24037.270278930664, 'accumulated_eval_time': 996.903163433075, 'accumulated_logging_time': 1.599836826324463}
I0914 14:11:55.910848 139618028607232 logging_writer.py:48] [71310] accumulated_eval_time=996.903163, accumulated_logging_time=1.599837, accumulated_submission_time=24037.270279, global_step=71310, preemption_count=0, score=24037.270279, test/accuracy=0.491800, test/loss=2.510001, test/num_examples=10000, total_duration=25036.852677, train/accuracy=0.697525, train/loss=1.482591, validation/accuracy=0.623220, validation/loss=1.825454, validation/num_examples=50000
I0914 14:13:00.034183 139618036999936 logging_writer.py:48] [71500] global_step=71500, grad_norm=0.2995145320892334, loss=3.251258373260498
I0914 14:15:47.818787 139618028607232 logging_writer.py:48] [72000] global_step=72000, grad_norm=0.31369438767433167, loss=3.2654480934143066
I0914 14:18:35.882528 139618036999936 logging_writer.py:48] [72500] global_step=72500, grad_norm=0.30754441022872925, loss=3.236921787261963
I0914 14:20:26.003981 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:20:33.833399 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:20:44.757493 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:20:47.036332 139785753851712 submission_runner.py:376] Time since start: 25568.00s, Step: 72829, {'train/accuracy': 0.7215601205825806, 'train/loss': 1.327860713005066, 'validation/accuracy': 0.6525799632072449, 'validation/loss': 1.6442086696624756, 'validation/num_examples': 50000, 'test/accuracy': 0.5234000086784363, 'test/loss': 2.286708116531372, 'test/num_examples': 10000, 'score': 24547.331042289734, 'total_duration': 25568.003078222275, 'accumulated_submission_time': 24547.331042289734, 'accumulated_eval_time': 1017.9354872703552, 'accumulated_logging_time': 1.6340680122375488}
I0914 14:20:47.064872 139618028607232 logging_writer.py:48] [72829] accumulated_eval_time=1017.935487, accumulated_logging_time=1.634068, accumulated_submission_time=24547.331042, global_step=72829, preemption_count=0, score=24547.331042, test/accuracy=0.523400, test/loss=2.286708, test/num_examples=10000, total_duration=25568.003078, train/accuracy=0.721560, train/loss=1.327861, validation/accuracy=0.652580, validation/loss=1.644209, validation/num_examples=50000
I0914 14:21:44.860488 139621082027776 logging_writer.py:48] [73000] global_step=73000, grad_norm=0.3169178366661072, loss=3.2606658935546875
I0914 14:24:32.812964 139618028607232 logging_writer.py:48] [73500] global_step=73500, grad_norm=0.3276415467262268, loss=3.2667770385742188
I0914 14:27:21.031353 139621082027776 logging_writer.py:48] [74000] global_step=74000, grad_norm=0.3188159465789795, loss=3.289651393890381
I0914 14:29:17.197638 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:29:24.919732 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:29:36.035775 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:29:38.325842 139785753851712 submission_runner.py:376] Time since start: 26099.29s, Step: 74347, {'train/accuracy': 0.6916055083274841, 'train/loss': 1.4878246784210205, 'validation/accuracy': 0.6294199824333191, 'validation/loss': 1.7609314918518066, 'validation/num_examples': 50000, 'test/accuracy': 0.5049999952316284, 'test/loss': 2.4299111366271973, 'test/num_examples': 10000, 'score': 25057.428204774857, 'total_duration': 26099.292588233948, 'accumulated_submission_time': 25057.428204774857, 'accumulated_eval_time': 1039.0636780261993, 'accumulated_logging_time': 1.675804615020752}
I0914 14:29:38.350225 139618028607232 logging_writer.py:48] [74347] accumulated_eval_time=1039.063678, accumulated_logging_time=1.675805, accumulated_submission_time=25057.428205, global_step=74347, preemption_count=0, score=25057.428205, test/accuracy=0.505000, test/loss=2.429911, test/num_examples=10000, total_duration=26099.292588, train/accuracy=0.691606, train/loss=1.487825, validation/accuracy=0.629420, validation/loss=1.760931, validation/num_examples=50000
I0914 14:30:30.036252 139618036999936 logging_writer.py:48] [74500] global_step=74500, grad_norm=0.29929229617118835, loss=3.231595039367676
I0914 14:33:18.128115 139618028607232 logging_writer.py:48] [75000] global_step=75000, grad_norm=0.30234041810035706, loss=3.156550407409668
I0914 14:36:06.326229 139618036999936 logging_writer.py:48] [75500] global_step=75500, grad_norm=0.32938486337661743, loss=3.363729953765869
I0914 14:38:08.577289 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:38:16.598191 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:38:27.694827 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:38:29.926154 139785753851712 submission_runner.py:376] Time since start: 26630.89s, Step: 75865, {'train/accuracy': 0.7079480290412903, 'train/loss': 1.436248540878296, 'validation/accuracy': 0.6432799696922302, 'validation/loss': 1.7123216390609741, 'validation/num_examples': 50000, 'test/accuracy': 0.5208000540733337, 'test/loss': 2.3495261669158936, 'test/num_examples': 10000, 'score': 25567.620626449585, 'total_duration': 26630.89289879799, 'accumulated_submission_time': 25567.620626449585, 'accumulated_eval_time': 1060.4125316143036, 'accumulated_logging_time': 1.7115974426269531}
I0914 14:38:29.953331 139620419352320 logging_writer.py:48] [75865] accumulated_eval_time=1060.412532, accumulated_logging_time=1.711597, accumulated_submission_time=25567.620626, global_step=75865, preemption_count=0, score=25567.620626, test/accuracy=0.520800, test/loss=2.349526, test/num_examples=10000, total_duration=26630.892899, train/accuracy=0.707948, train/loss=1.436249, validation/accuracy=0.643280, validation/loss=1.712322, validation/num_examples=50000
I0914 14:39:15.698670 139621082027776 logging_writer.py:48] [76000] global_step=76000, grad_norm=0.312216192483902, loss=3.186450719833374
I0914 14:42:03.896099 139620419352320 logging_writer.py:48] [76500] global_step=76500, grad_norm=0.3170742392539978, loss=3.1856775283813477
I0914 14:44:51.996706 139621082027776 logging_writer.py:48] [77000] global_step=77000, grad_norm=0.31434234976768494, loss=3.184199810028076
I0914 14:46:59.946507 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:47:07.696892 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:47:18.790356 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:47:21.105739 139785753851712 submission_runner.py:376] Time since start: 27162.07s, Step: 77382, {'train/accuracy': 0.6989995241165161, 'train/loss': 1.422307014465332, 'validation/accuracy': 0.642579972743988, 'validation/loss': 1.6762522459030151, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3255693912506104, 'test/num_examples': 10000, 'score': 26077.580560445786, 'total_duration': 27162.072484493256, 'accumulated_submission_time': 26077.580560445786, 'accumulated_eval_time': 1081.571738243103, 'accumulated_logging_time': 1.748826503753662}
I0914 14:47:21.126562 139618028607232 logging_writer.py:48] [77382] accumulated_eval_time=1081.571738, accumulated_logging_time=1.748827, accumulated_submission_time=26077.580560, global_step=77382, preemption_count=0, score=26077.580560, test/accuracy=0.515400, test/loss=2.325569, test/num_examples=10000, total_duration=27162.072484, train/accuracy=0.699000, train/loss=1.422307, validation/accuracy=0.642580, validation/loss=1.676252, validation/num_examples=50000
I0914 14:48:01.154166 139618036999936 logging_writer.py:48] [77500] global_step=77500, grad_norm=0.31988903880119324, loss=3.2403721809387207
I0914 14:50:49.111113 139618028607232 logging_writer.py:48] [78000] global_step=78000, grad_norm=0.31698429584503174, loss=3.1237356662750244
I0914 14:53:37.223134 139618036999936 logging_writer.py:48] [78500] global_step=78500, grad_norm=0.3259633481502533, loss=3.25683856010437
I0914 14:55:51.240862 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:55:58.930301 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:56:10.042633 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:56:12.309500 139785753851712 submission_runner.py:376] Time since start: 27693.28s, Step: 78900, {'train/accuracy': 0.7405332922935486, 'train/loss': 1.2673977613449097, 'validation/accuracy': 0.6460199952125549, 'validation/loss': 1.6637077331542969, 'validation/num_examples': 50000, 'test/accuracy': 0.5200999975204468, 'test/loss': 2.316652536392212, 'test/num_examples': 10000, 'score': 26587.66230893135, 'total_duration': 27693.276193618774, 'accumulated_submission_time': 26587.66230893135, 'accumulated_eval_time': 1102.640297651291, 'accumulated_logging_time': 1.7789013385772705}
I0914 14:56:12.334219 139621090420480 logging_writer.py:48] [78900] accumulated_eval_time=1102.640298, accumulated_logging_time=1.778901, accumulated_submission_time=26587.662309, global_step=78900, preemption_count=0, score=26587.662309, test/accuracy=0.520100, test/loss=2.316653, test/num_examples=10000, total_duration=27693.276194, train/accuracy=0.740533, train/loss=1.267398, validation/accuracy=0.646020, validation/loss=1.663708, validation/num_examples=50000
I0914 14:56:46.298706 139621098813184 logging_writer.py:48] [79000] global_step=79000, grad_norm=0.3292911946773529, loss=3.169358015060425
I0914 14:59:34.547732 139621090420480 logging_writer.py:48] [79500] global_step=79500, grad_norm=0.329366534948349, loss=3.227858304977417
I0914 15:02:22.801796 139621098813184 logging_writer.py:48] [80000] global_step=80000, grad_norm=0.3188716173171997, loss=3.21577787399292
I0914 15:04:42.494943 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:04:50.231371 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:05:01.274785 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:05:03.605396 139785753851712 submission_runner.py:376] Time since start: 28224.57s, Step: 80417, {'train/accuracy': 0.7322424650192261, 'train/loss': 1.2937663793563843, 'validation/accuracy': 0.6529799699783325, 'validation/loss': 1.6502137184143066, 'validation/num_examples': 50000, 'test/accuracy': 0.527999997138977, 'test/loss': 2.2828421592712402, 'test/num_examples': 10000, 'score': 27097.790013074875, 'total_duration': 28224.572131872177, 'accumulated_submission_time': 27097.790013074875, 'accumulated_eval_time': 1123.7507123947144, 'accumulated_logging_time': 1.813563585281372}
I0914 15:05:03.629084 139618028607232 logging_writer.py:48] [80417] accumulated_eval_time=1123.750712, accumulated_logging_time=1.813564, accumulated_submission_time=27097.790013, global_step=80417, preemption_count=0, score=27097.790013, test/accuracy=0.528000, test/loss=2.282842, test/num_examples=10000, total_duration=28224.572132, train/accuracy=0.732242, train/loss=1.293766, validation/accuracy=0.652980, validation/loss=1.650214, validation/num_examples=50000
I0914 15:05:31.847938 139618045392640 logging_writer.py:48] [80500] global_step=80500, grad_norm=0.34364253282546997, loss=3.3060929775238037
I0914 15:08:19.974839 139618028607232 logging_writer.py:48] [81000] global_step=81000, grad_norm=0.3189429044723511, loss=3.134965419769287
I0914 15:11:07.884455 139618045392640 logging_writer.py:48] [81500] global_step=81500, grad_norm=0.3358222246170044, loss=3.146552085876465
I0914 15:13:33.672224 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:13:41.402096 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:13:52.418636 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:13:54.673453 139785753851712 submission_runner.py:376] Time since start: 28755.64s, Step: 81935, {'train/accuracy': 0.7216796875, 'train/loss': 1.3296536207199097, 'validation/accuracy': 0.6556800007820129, 'validation/loss': 1.635520577430725, 'validation/num_examples': 50000, 'test/accuracy': 0.5260000228881836, 'test/loss': 2.306215286254883, 'test/num_examples': 10000, 'score': 27607.79568052292, 'total_duration': 28755.640197753906, 'accumulated_submission_time': 27607.79568052292, 'accumulated_eval_time': 1144.751916885376, 'accumulated_logging_time': 1.8511121273040771}
I0914 15:13:54.701997 139621090420480 logging_writer.py:48] [81935] accumulated_eval_time=1144.751917, accumulated_logging_time=1.851112, accumulated_submission_time=27607.795681, global_step=81935, preemption_count=0, score=27607.795681, test/accuracy=0.526000, test/loss=2.306215, test/num_examples=10000, total_duration=28755.640198, train/accuracy=0.721680, train/loss=1.329654, validation/accuracy=0.655680, validation/loss=1.635521, validation/num_examples=50000
I0914 15:14:16.905740 139621098813184 logging_writer.py:48] [82000] global_step=82000, grad_norm=0.32502761483192444, loss=3.2584471702575684
I0914 15:17:05.131891 139621090420480 logging_writer.py:48] [82500] global_step=82500, grad_norm=0.33060410618782043, loss=3.2520453929901123
I0914 15:19:53.375931 139621098813184 logging_writer.py:48] [83000] global_step=83000, grad_norm=0.325259804725647, loss=3.173687219619751 | |
I0914 15:22:24.842192 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 15:22:32.534203 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 15:22:43.610139 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 15:22:45.856654 139785753851712 submission_runner.py:376] Time since start: 29286.82s, Step: 83452, {'train/accuracy': 0.7293726205825806, 'train/loss': 1.2778840065002441, 'validation/accuracy': 0.664139986038208, 'validation/loss': 1.5573835372924805, 'validation/num_examples': 50000, 'test/accuracy': 0.534500002861023, 'test/loss': 2.211136817932129, 'test/num_examples': 10000, 'score': 28117.902702093124, 'total_duration': 29286.823399305344, 'accumulated_submission_time': 28117.902702093124, 'accumulated_eval_time': 1165.7663543224335, 'accumulated_logging_time': 1.8896336555480957} | |
I0914 15:22:45.884523 139618036999936 logging_writer.py:48] [83452] accumulated_eval_time=1165.766354, accumulated_logging_time=1.889634, accumulated_submission_time=28117.902702, global_step=83452, preemption_count=0, score=28117.902702, test/accuracy=0.534500, test/loss=2.211137, test/num_examples=10000, total_duration=29286.823399, train/accuracy=0.729373, train/loss=1.277884, validation/accuracy=0.664140, validation/loss=1.557384, validation/num_examples=50000 | |
I0914 15:23:02.359415 139618045392640 logging_writer.py:48] [83500] global_step=83500, grad_norm=0.3362581431865692, loss=3.143125057220459 | |
I0914 15:25:50.559870 139618036999936 logging_writer.py:48] [84000] global_step=84000, grad_norm=0.3290119767189026, loss=3.218841552734375 | |
I0914 15:28:38.785134 139618045392640 logging_writer.py:48] [84500] global_step=84500, grad_norm=0.3364306390285492, loss=3.2182228565216064 | |
I0914 15:31:15.978195 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 15:31:23.674424 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 15:31:34.592998 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 15:31:36.879657 139785753851712 submission_runner.py:376] Time since start: 29817.85s, Step: 84969, {'train/accuracy': 0.7102997303009033, 'train/loss': 1.3722602128982544, 'validation/accuracy': 0.6526600122451782, 'validation/loss': 1.63387930393219, 'validation/num_examples': 50000, 'test/accuracy': 0.5289000272750854, 'test/loss': 2.2834620475769043, 'test/num_examples': 10000, 'score': 28627.961901664734, 'total_duration': 29817.84640312195, 'accumulated_submission_time': 28627.961901664734, 'accumulated_eval_time': 1186.6677963733673, 'accumulated_logging_time': 1.9287693500518799} | |
I0914 15:31:36.904275 139618045392640 logging_writer.py:48] [84969] accumulated_eval_time=1186.667796, accumulated_logging_time=1.928769, accumulated_submission_time=28627.961902, global_step=84969, preemption_count=0, score=28627.961902, test/accuracy=0.528900, test/loss=2.283462, test/num_examples=10000, total_duration=29817.846403, train/accuracy=0.710300, train/loss=1.372260, validation/accuracy=0.652660, validation/loss=1.633879, validation/num_examples=50000 | |
I0914 15:31:47.661670 139621090420480 logging_writer.py:48] [85000] global_step=85000, grad_norm=0.33524322509765625, loss=3.2073211669921875 | |
I0914 15:34:35.556128 139618045392640 logging_writer.py:48] [85500] global_step=85500, grad_norm=0.344351589679718, loss=3.215855360031128 | |
I0914 15:37:23.726473 139621090420480 logging_writer.py:48] [86000] global_step=86000, grad_norm=0.34118911623954773, loss=3.184333324432373 | |
I0914 15:40:06.981885 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 15:40:14.750491 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 15:40:25.937859 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 15:40:28.169976 139785753851712 submission_runner.py:376] Time since start: 30349.14s, Step: 86487, {'train/accuracy': 0.7192083597183228, 'train/loss': 1.318070411682129, 'validation/accuracy': 0.6582799553871155, 'validation/loss': 1.5826328992843628, 'validation/num_examples': 50000, 'test/accuracy': 0.5327000021934509, 'test/loss': 2.2320618629455566, 'test/num_examples': 10000, 'score': 29138.005053281784, 'total_duration': 30349.136724233627, 'accumulated_submission_time': 29138.005053281784, 'accumulated_eval_time': 1207.8558654785156, 'accumulated_logging_time': 1.964245080947876} | |
I0914 15:40:28.195220 139620419352320 logging_writer.py:48] [86487] accumulated_eval_time=1207.855865, accumulated_logging_time=1.964245, accumulated_submission_time=29138.005053, global_step=86487, preemption_count=0, score=29138.005053, test/accuracy=0.532700, test/loss=2.232062, test/num_examples=10000, total_duration=30349.136724, train/accuracy=0.719208, train/loss=1.318070, validation/accuracy=0.658280, validation/loss=1.582633, validation/num_examples=50000 | |
I0914 15:40:32.916276 139621082027776 logging_writer.py:48] [86500] global_step=86500, grad_norm=0.33634471893310547, loss=3.165295124053955 | |
I0914 15:43:21.105581 139620419352320 logging_writer.py:48] [87000] global_step=87000, grad_norm=0.3431008756160736, loss=3.158895969390869 | |
I0914 15:46:09.041963 139621082027776 logging_writer.py:48] [87500] global_step=87500, grad_norm=0.33978071808815, loss=3.1531012058258057 | |
I0914 15:48:57.184348 139620419352320 logging_writer.py:48] [88000] global_step=88000, grad_norm=0.33860036730766296, loss=3.157572031021118 | |
I0914 15:48:58.283360 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 15:49:05.941977 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 15:49:17.018113 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 15:49:19.268614 139785753851712 submission_runner.py:376] Time since start: 30880.24s, Step: 88005, {'train/accuracy': 0.7746930718421936, 'train/loss': 1.0886952877044678, 'validation/accuracy': 0.6663599610328674, 'validation/loss': 1.5537863969802856, 'validation/num_examples': 50000, 'test/accuracy': 0.5420000553131104, 'test/loss': 2.211760997772217, 'test/num_examples': 10000, 'score': 29648.06053853035, 'total_duration': 30880.23536133766, 'accumulated_submission_time': 29648.06053853035, 'accumulated_eval_time': 1228.8410770893097, 'accumulated_logging_time': 1.9997587203979492} | |
I0914 15:49:19.292289 139618036999936 logging_writer.py:48] [88005] accumulated_eval_time=1228.841077, accumulated_logging_time=1.999759, accumulated_submission_time=29648.060539, global_step=88005, preemption_count=0, score=29648.060539, test/accuracy=0.542000, test/loss=2.211761, test/num_examples=10000, total_duration=30880.235361, train/accuracy=0.774693, train/loss=1.088695, validation/accuracy=0.666360, validation/loss=1.553786, validation/num_examples=50000 | |
I0914 15:52:06.000102 139618045392640 logging_writer.py:48] [88500] global_step=88500, grad_norm=0.3258196711540222, loss=3.0706851482391357 | |
I0914 15:54:54.075734 139618036999936 logging_writer.py:48] [89000] global_step=89000, grad_norm=0.3466311991214752, loss=3.194755792617798 | |
I0914 15:57:42.013890 139618045392640 logging_writer.py:48] [89500] global_step=89500, grad_norm=0.3438502848148346, loss=3.107487678527832 | |
I0914 15:57:49.511776 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 15:57:57.318909 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 15:58:08.388863 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 15:58:10.616803 139785753851712 submission_runner.py:376] Time since start: 31411.58s, Step: 89524, {'train/accuracy': 0.7538663744926453, 'train/loss': 1.2191698551177979, 'validation/accuracy': 0.6728999614715576, 'validation/loss': 1.5614254474639893, 'validation/num_examples': 50000, 'test/accuracy': 0.5437000393867493, 'test/loss': 2.203444719314575, 'test/num_examples': 10000, 'score': 30158.248270750046, 'total_duration': 31411.583546876907, 'accumulated_submission_time': 30158.248270750046, 'accumulated_eval_time': 1249.946064710617, 'accumulated_logging_time': 2.0324220657348633} | |
I0914 15:58:10.640623 139621090420480 logging_writer.py:48] [89524] accumulated_eval_time=1249.946065, accumulated_logging_time=2.032422, accumulated_submission_time=30158.248271, global_step=89524, preemption_count=0, score=30158.248271, test/accuracy=0.543700, test/loss=2.203445, test/num_examples=10000, total_duration=31411.583547, train/accuracy=0.753866, train/loss=1.219170, validation/accuracy=0.672900, validation/loss=1.561425, validation/num_examples=50000 | |
I0914 16:00:50.736585 139621098813184 logging_writer.py:48] [90000] global_step=90000, grad_norm=0.3367817997932434, loss=3.0911600589752197 | |
I0914 16:03:38.763216 139621090420480 logging_writer.py:48] [90500] global_step=90500, grad_norm=0.3441648781299591, loss=3.13027024269104 | |
I0914 16:06:26.761715 139621098813184 logging_writer.py:48] [91000] global_step=91000, grad_norm=0.3405533730983734, loss=3.1159348487854004 | |
I0914 16:06:40.637633 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:06:48.270154 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 16:06:59.503044 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 16:07:01.769087 139785753851712 submission_runner.py:376] Time since start: 31942.74s, Step: 91043, {'train/accuracy': 0.7286351919174194, 'train/loss': 1.2893345355987549, 'validation/accuracy': 0.6548799872398376, 'validation/loss': 1.6241097450256348, 'validation/num_examples': 50000, 'test/accuracy': 0.5337000489234924, 'test/loss': 2.274324893951416, 'test/num_examples': 10000, 'score': 30668.21325492859, 'total_duration': 31942.735835552216, 'accumulated_submission_time': 30668.21325492859, 'accumulated_eval_time': 1271.077484369278, 'accumulated_logging_time': 2.065810203552246} | |
I0914 16:07:01.792829 139618045392640 logging_writer.py:48] [91043] accumulated_eval_time=1271.077484, accumulated_logging_time=2.065810, accumulated_submission_time=30668.213255, global_step=91043, preemption_count=0, score=30668.213255, test/accuracy=0.533700, test/loss=2.274325, test/num_examples=10000, total_duration=31942.735836, train/accuracy=0.728635, train/loss=1.289335, validation/accuracy=0.654880, validation/loss=1.624110, validation/num_examples=50000 | |
I0914 16:09:35.603240 139620410959616 logging_writer.py:48] [91500] global_step=91500, grad_norm=0.33739516139030457, loss=3.0308287143707275 | |
I0914 16:12:23.677797 139618045392640 logging_writer.py:48] [92000] global_step=92000, grad_norm=0.3435581922531128, loss=3.043058395385742 | |
I0914 16:15:11.851698 139620410959616 logging_writer.py:48] [92500] global_step=92500, grad_norm=0.354044109582901, loss=3.1414432525634766 | |
I0914 16:15:31.795955 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:15:39.450120 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 16:15:50.477826 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 16:15:52.755001 139785753851712 submission_runner.py:376] Time since start: 32473.72s, Step: 92561, {'train/accuracy': 0.7453164458274841, 'train/loss': 1.2298004627227783, 'validation/accuracy': 0.6710399985313416, 'validation/loss': 1.5516157150268555, 'validation/num_examples': 50000, 'test/accuracy': 0.5501000285148621, 'test/loss': 2.1891353130340576, 'test/num_examples': 10000, 'score': 31178.181513547897, 'total_duration': 32473.721732854843, 'accumulated_submission_time': 31178.181513547897, 'accumulated_eval_time': 1292.0364754199982, 'accumulated_logging_time': 2.102283000946045} | |
I0914 16:15:52.780585 139620410959616 logging_writer.py:48] [92561] accumulated_eval_time=1292.036475, accumulated_logging_time=2.102283, accumulated_submission_time=31178.181514, global_step=92561, preemption_count=0, score=31178.181514, test/accuracy=0.550100, test/loss=2.189135, test/num_examples=10000, total_duration=32473.721733, train/accuracy=0.745316, train/loss=1.229800, validation/accuracy=0.671040, validation/loss=1.551616, validation/num_examples=50000 | |
I0914 16:18:20.689055 139621090420480 logging_writer.py:48] [93000] global_step=93000, grad_norm=0.3550674021244049, loss=3.0757088661193848 | |
I0914 16:21:08.756993 139620410959616 logging_writer.py:48] [93500] global_step=93500, grad_norm=0.36305615305900574, loss=3.072169065475464 | |
I0914 16:23:56.953319 139621090420480 logging_writer.py:48] [94000] global_step=94000, grad_norm=0.37776169180870056, loss=3.1176300048828125 | |
I0914 16:24:22.947271 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:24:30.598288 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 16:24:41.735978 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 16:24:44.008029 139785753851712 submission_runner.py:376] Time since start: 33004.97s, Step: 94079, {'train/accuracy': 0.7446189522743225, 'train/loss': 1.263240098953247, 'validation/accuracy': 0.677619993686676, 'validation/loss': 1.5534390211105347, 'validation/num_examples': 50000, 'test/accuracy': 0.5550000071525574, 'test/loss': 2.191549062728882, 'test/num_examples': 10000, 'score': 31688.31489801407, 'total_duration': 33004.9747774601, 'accumulated_submission_time': 31688.31489801407, 'accumulated_eval_time': 1313.0971965789795, 'accumulated_logging_time': 2.1376092433929443} | |
I0914 16:24:44.032844 139618045392640 logging_writer.py:48] [94079] accumulated_eval_time=1313.097197, accumulated_logging_time=2.137609, accumulated_submission_time=31688.314898, global_step=94079, preemption_count=0, score=31688.314898, test/accuracy=0.555000, test/loss=2.191549, test/num_examples=10000, total_duration=33004.974777, train/accuracy=0.744619, train/loss=1.263240, validation/accuracy=0.677620, validation/loss=1.553439, validation/num_examples=50000 | |
I0914 16:27:05.939068 139620410959616 logging_writer.py:48] [94500] global_step=94500, grad_norm=0.3512846827507019, loss=3.0405852794647217 | |
I0914 16:29:54.129690 139618045392640 logging_writer.py:48] [95000] global_step=95000, grad_norm=0.3504914939403534, loss=3.1171083450317383 | |
I0914 16:32:42.109276 139620410959616 logging_writer.py:48] [95500] global_step=95500, grad_norm=0.3645898997783661, loss=3.053126811981201 | |
I0914 16:33:14.164235 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:33:21.834095 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 16:33:32.892427 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 16:33:35.152878 139785753851712 submission_runner.py:376] Time since start: 33536.12s, Step: 95597, {'train/accuracy': 0.7449776530265808, 'train/loss': 1.2335659265518188, 'validation/accuracy': 0.6818400025367737, 'validation/loss': 1.5183688402175903, 'validation/num_examples': 50000, 'test/accuracy': 0.5538000464439392, 'test/loss': 2.1452736854553223, 'test/num_examples': 10000, 'score': 32198.41109275818, 'total_duration': 33536.11962342262, 'accumulated_submission_time': 32198.41109275818, 'accumulated_eval_time': 1334.085800409317, 'accumulated_logging_time': 2.1747827529907227} | |
I0914 16:33:35.181872 139621090420480 logging_writer.py:48] [95597] accumulated_eval_time=1334.085800, accumulated_logging_time=2.174783, accumulated_submission_time=32198.411093, global_step=95597, preemption_count=0, score=32198.411093, test/accuracy=0.553800, test/loss=2.145274, test/num_examples=10000, total_duration=33536.119623, train/accuracy=0.744978, train/loss=1.233566, validation/accuracy=0.681840, validation/loss=1.518369, validation/num_examples=50000 | |
I0914 16:35:50.819981 139621098813184 logging_writer.py:48] [96000] global_step=96000, grad_norm=0.36467593908309937, loss=3.0915420055389404 | |
I0914 16:38:38.981382 139621090420480 logging_writer.py:48] [96500] global_step=96500, grad_norm=0.36556708812713623, loss=3.0985519886016846 | |
I0914 16:41:27.161284 139621098813184 logging_writer.py:48] [97000] global_step=97000, grad_norm=0.36795929074287415, loss=2.9657955169677734 | |
I0914 16:42:05.221490 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:42:12.871140 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 16:42:23.909234 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 16:42:26.155958 139785753851712 submission_runner.py:376] Time since start: 34067.12s, Step: 97115, {'train/accuracy': 0.7691525816917419, 'train/loss': 1.1535043716430664, 'validation/accuracy': 0.667199969291687, 'validation/loss': 1.590874195098877, 'validation/num_examples': 50000, 'test/accuracy': 0.5401000380516052, 'test/loss': 2.221604347229004, 'test/num_examples': 10000, 'score': 32708.41788005829, 'total_duration': 34067.122673511505, 'accumulated_submission_time': 32708.41788005829, 'accumulated_eval_time': 1355.0201969146729, 'accumulated_logging_time': 2.2134897708892822} | |
I0914 16:42:26.179425 139618045392640 logging_writer.py:48] [97115] accumulated_eval_time=1355.020197, accumulated_logging_time=2.213490, accumulated_submission_time=32708.417880, global_step=97115, preemption_count=0, score=32708.417880, test/accuracy=0.540100, test/loss=2.221604, test/num_examples=10000, total_duration=34067.122674, train/accuracy=0.769153, train/loss=1.153504, validation/accuracy=0.667200, validation/loss=1.590874, validation/num_examples=50000 | |
I0914 16:44:36.011977 139620410959616 logging_writer.py:48] [97500] global_step=97500, grad_norm=0.36638134717941284, loss=3.09440541267395 | |
I0914 16:47:24.182037 139618045392640 logging_writer.py:48] [98000] global_step=98000, grad_norm=0.382851243019104, loss=3.08982515335083 | |
I0914 16:50:12.426895 139620410959616 logging_writer.py:48] [98500] global_step=98500, grad_norm=0.3698352873325348, loss=3.0431742668151855 | |
I0914 16:50:56.266190 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:51:04.192825 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 16:51:15.348309 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 16:51:17.568398 139785753851712 submission_runner.py:376] Time since start: 34598.54s, Step: 98632, {'train/accuracy': 0.7757692933082581, 'train/loss': 1.1324656009674072, 'validation/accuracy': 0.690559983253479, 'validation/loss': 1.5002992153167725, 'validation/num_examples': 50000, 'test/accuracy': 0.5630000233650208, 'test/loss': 2.134718656539917, 'test/num_examples': 10000, 'score': 33218.47289562225, 'total_duration': 34598.53514504433, 'accumulated_submission_time': 33218.47289562225, 'accumulated_eval_time': 1376.3223690986633, 'accumulated_logging_time': 2.246415138244629} | |
I0914 16:51:17.593133 139618036999936 logging_writer.py:48] [98632] accumulated_eval_time=1376.322369, accumulated_logging_time=2.246415, accumulated_submission_time=33218.472896, global_step=98632, preemption_count=0, score=33218.472896, test/accuracy=0.563000, test/loss=2.134719, test/num_examples=10000, total_duration=34598.535145, train/accuracy=0.775769, train/loss=1.132466, validation/accuracy=0.690560, validation/loss=1.500299, validation/num_examples=50000 | |
I0914 16:53:21.508238 139621090420480 logging_writer.py:48] [99000] global_step=99000, grad_norm=0.3786008358001709, loss=3.025963068008423 | |
I0914 16:56:09.690086 139618036999936 logging_writer.py:48] [99500] global_step=99500, grad_norm=0.3736327886581421, loss=3.033701181411743 | |
I0914 16:58:57.918751 139621090420480 logging_writer.py:48] [100000] global_step=100000, grad_norm=0.38231179118156433, loss=3.0271847248077393 | |
I0914 16:59:47.846658 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 16:59:55.566396 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:00:06.535897 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:00:08.794648 139785753851712 submission_runner.py:376] Time since start: 35129.76s, Step: 100150, {'train/accuracy': 0.7731584906578064, 'train/loss': 1.1149073839187622, 'validation/accuracy': 0.6941999793052673, 'validation/loss': 1.4566799402236938, 'validation/num_examples': 50000, 'test/accuracy': 0.5652000308036804, 'test/loss': 2.10357928276062, 'test/num_examples': 10000, 'score': 33728.69375014305, 'total_duration': 35129.76139855385, 'accumulated_submission_time': 33728.69375014305, 'accumulated_eval_time': 1397.2703416347504, 'accumulated_logging_time': 2.2805235385894775} | |
I0914 17:00:08.818891 139621082027776 logging_writer.py:48] [100150] accumulated_eval_time=1397.270342, accumulated_logging_time=2.280524, accumulated_submission_time=33728.693750, global_step=100150, preemption_count=0, score=33728.693750, test/accuracy=0.565200, test/loss=2.103579, test/num_examples=10000, total_duration=35129.761399, train/accuracy=0.773158, train/loss=1.114907, validation/accuracy=0.694200, validation/loss=1.456680, validation/num_examples=50000 | |
I0914 17:02:06.840601 139621107205888 logging_writer.py:48] [100500] global_step=100500, grad_norm=0.3828810453414917, loss=3.052368402481079 | |
I0914 17:04:55.038065 139621082027776 logging_writer.py:48] [101000] global_step=101000, grad_norm=0.387489914894104, loss=3.0429494380950928 | |
I0914 17:07:43.136696 139621107205888 logging_writer.py:48] [101500] global_step=101500, grad_norm=0.37475812435150146, loss=2.9192471504211426 | |
I0914 17:08:38.965880 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 17:08:46.603290 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:08:57.649946 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:08:59.932699 139785753851712 submission_runner.py:376] Time since start: 35660.90s, Step: 101668, {'train/accuracy': 0.7729790806770325, 'train/loss': 1.1216973066329956, 'validation/accuracy': 0.6942600011825562, 'validation/loss': 1.4576067924499512, 'validation/num_examples': 50000, 'test/accuracy': 0.5682000517845154, 'test/loss': 2.1024677753448486, 'test/num_examples': 10000, 'score': 34238.80842423439, 'total_duration': 35660.899446964264, 'accumulated_submission_time': 34238.80842423439, 'accumulated_eval_time': 1418.23712515831, 'accumulated_logging_time': 2.314119815826416} | |
I0914 17:08:59.957205 139620410959616 logging_writer.py:48] [101668] accumulated_eval_time=1418.237125, accumulated_logging_time=2.314120, accumulated_submission_time=34238.808424, global_step=101668, preemption_count=0, score=34238.808424, test/accuracy=0.568200, test/loss=2.102468, test/num_examples=10000, total_duration=35660.899447, train/accuracy=0.772979, train/loss=1.121697, validation/accuracy=0.694260, validation/loss=1.457607, validation/num_examples=50000 | |
I0914 17:10:51.731181 139620419352320 logging_writer.py:48] [102000] global_step=102000, grad_norm=0.37464380264282227, loss=2.9934191703796387 | |
I0914 17:13:39.813981 139620410959616 logging_writer.py:48] [102500] global_step=102500, grad_norm=0.3883311152458191, loss=3.0084176063537598 | |
I0914 17:16:28.027521 139620419352320 logging_writer.py:48] [103000] global_step=103000, grad_norm=0.38096585869789124, loss=2.983563184738159 | |
I0914 17:17:30.245515 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 17:17:37.844074 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:17:49.018025 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:17:51.227629 139785753851712 submission_runner.py:376] Time since start: 36192.19s, Step: 103187, {'train/accuracy': 0.7712850570678711, 'train/loss': 1.1430009603500366, 'validation/accuracy': 0.6977199912071228, 'validation/loss': 1.4706509113311768, 'validation/num_examples': 50000, 'test/accuracy': 0.5703999996185303, 'test/loss': 2.109886646270752, 'test/num_examples': 10000, 'score': 34749.06494665146, 'total_duration': 36192.19437289238, 'accumulated_submission_time': 34749.06494665146, 'accumulated_eval_time': 1439.2192113399506, 'accumulated_logging_time': 2.3478918075561523} | |
I0914 17:17:51.252707 139621082027776 logging_writer.py:48] [103187] accumulated_eval_time=1439.219211, accumulated_logging_time=2.347892, accumulated_submission_time=34749.064947, global_step=103187, preemption_count=0, score=34749.064947, test/accuracy=0.570400, test/loss=2.109887, test/num_examples=10000, total_duration=36192.194373, train/accuracy=0.771285, train/loss=1.143001, validation/accuracy=0.697720, validation/loss=1.470651, validation/num_examples=50000 | |
I0914 17:19:36.790855 139621098813184 logging_writer.py:48] [103500] global_step=103500, grad_norm=0.38000091910362244, loss=2.9304559230804443 | |
I0914 17:22:24.911260 139621082027776 logging_writer.py:48] [104000] global_step=104000, grad_norm=0.39263468980789185, loss=2.971359968185425 | |
I0914 17:25:13.136347 139621098813184 logging_writer.py:48] [104500] global_step=104500, grad_norm=0.40036970376968384, loss=3.0242748260498047 | |
I0914 17:26:21.542417 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 17:26:29.194283 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:26:40.280743 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:26:42.518318 139785753851712 submission_runner.py:376] Time since start: 36723.48s, Step: 104705, {'train/accuracy': 0.7724210619926453, 'train/loss': 1.1543452739715576, 'validation/accuracy': 0.6925199627876282, 'validation/loss': 1.485355019569397, 'validation/num_examples': 50000, 'test/accuracy': 0.5617000460624695, 'test/loss': 2.1401867866516113, 'test/num_examples': 10000, 'score': 35259.32285261154, 'total_duration': 36723.48499917984, 'accumulated_submission_time': 35259.32285261154, 'accumulated_eval_time': 1460.195018529892, 'accumulated_logging_time': 2.381948232650757} | |
I0914 17:26:42.543345 139618045392640 logging_writer.py:48] [104705] accumulated_eval_time=1460.195019, accumulated_logging_time=2.381948, accumulated_submission_time=35259.322853, global_step=104705, preemption_count=0, score=35259.322853, test/accuracy=0.561700, test/loss=2.140187, test/num_examples=10000, total_duration=36723.484999, train/accuracy=0.772421, train/loss=1.154345, validation/accuracy=0.692520, validation/loss=1.485355, validation/num_examples=50000 | |
I0914 17:28:21.862946 139620410959616 logging_writer.py:48] [105000] global_step=105000, grad_norm=0.4067330062389374, loss=3.011190891265869 | |
I0914 17:31:09.761775 139618045392640 logging_writer.py:48] [105500] global_step=105500, grad_norm=0.39534327387809753, loss=2.948467493057251 | |
I0914 17:33:57.917924 139620410959616 logging_writer.py:48] [106000] global_step=106000, grad_norm=0.40077218413352966, loss=3.005322217941284 | |
I0914 17:35:12.584997 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 17:35:20.279434 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:35:31.368846 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:35:33.610364 139785753851712 submission_runner.py:376] Time since start: 37254.58s, Step: 106224, {'train/accuracy': 0.8110052347183228, 'train/loss': 0.9716266393661499, 'validation/accuracy': 0.7090199589729309, 'validation/loss': 1.3921597003936768, 'validation/num_examples': 50000, 'test/accuracy': 0.5842000246047974, 'test/loss': 2.0155460834503174, 'test/num_examples': 10000, 'score': 35769.33108854294, 'total_duration': 37254.577083826065, 'accumulated_submission_time': 35769.33108854294, 'accumulated_eval_time': 1481.2203319072723, 'accumulated_logging_time': 2.4178240299224854} | |
I0914 17:35:33.645757 139621090420480 logging_writer.py:48] [106224] accumulated_eval_time=1481.220332, accumulated_logging_time=2.417824, accumulated_submission_time=35769.331089, global_step=106224, preemption_count=0, score=35769.331089, test/accuracy=0.584200, test/loss=2.015546, test/num_examples=10000, total_duration=37254.577084, train/accuracy=0.811005, train/loss=0.971627, validation/accuracy=0.709020, validation/loss=1.392160, validation/num_examples=50000 | |
I0914 17:37:06.574394 139621098813184 logging_writer.py:48] [106500] global_step=106500, grad_norm=0.4171896278858185, loss=2.970425605773926 | |
I0914 17:39:54.659567 139621090420480 logging_writer.py:48] [107000] global_step=107000, grad_norm=0.38486701250076294, loss=2.8754992485046387 | |
I0914 17:42:42.683755 139621098813184 logging_writer.py:48] [107500] global_step=107500, grad_norm=0.41185760498046875, loss=2.9692084789276123 | |
I0914 17:44:03.787653 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 17:44:11.570801 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:44:22.618466 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:44:24.886515 139785753851712 submission_runner.py:376] Time since start: 37785.85s, Step: 107743, {'train/accuracy': 0.7952407598495483, 'train/loss': 1.0415843725204468, 'validation/accuracy': 0.7023400068283081, 'validation/loss': 1.4332274198532104, 'validation/num_examples': 50000, 'test/accuracy': 0.5763000249862671, 'test/loss': 2.0579850673675537, 'test/num_examples': 10000, 'score': 36279.43704533577, 'total_duration': 37785.853261470795, 'accumulated_submission_time': 36279.43704533577, 'accumulated_eval_time': 1502.319188117981, 'accumulated_logging_time': 2.4656083583831787} | |
I0914 17:44:24.914275 139618028607232 logging_writer.py:48] [107743] accumulated_eval_time=1502.319188, accumulated_logging_time=2.465608, accumulated_submission_time=36279.437045, global_step=107743, preemption_count=0, score=36279.437045, test/accuracy=0.576300, test/loss=2.057985, test/num_examples=10000, total_duration=37785.853261, train/accuracy=0.795241, train/loss=1.041584, validation/accuracy=0.702340, validation/loss=1.433227, validation/num_examples=50000 | |
I0914 17:45:51.602435 139618036999936 logging_writer.py:48] [108000] global_step=108000, grad_norm=0.4147533178329468, loss=3.016648769378662 | |
I0914 17:48:39.803782 139618028607232 logging_writer.py:48] [108500] global_step=108500, grad_norm=0.40793949365615845, loss=2.919987440109253 | |
I0914 17:51:28.004422 139618036999936 logging_writer.py:48] [109000] global_step=109000, grad_norm=0.43241071701049805, loss=3.0008387565612793 | |
I0914 17:52:54.913345 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 17:53:02.549693 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 17:53:13.555589 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 17:53:15.787854 139785753851712 submission_runner.py:376] Time since start: 38316.75s, Step: 109260, {'train/accuracy': 0.80961012840271, 'train/loss': 0.9733060002326965, 'validation/accuracy': 0.7177000045776367, 'validation/loss': 1.3576146364212036, 'validation/num_examples': 50000, 'test/accuracy': 0.5911000370979309, 'test/loss': 1.9605129957199097, 'test/num_examples': 10000, 'score': 36789.401881456375, 'total_duration': 38316.75459957123, 'accumulated_submission_time': 36789.401881456375, 'accumulated_eval_time': 1523.1936659812927, 'accumulated_logging_time': 2.504333019256592} | |
I0914 17:53:15.813561 139621098813184 logging_writer.py:48] [109260] accumulated_eval_time=1523.193666, accumulated_logging_time=2.504333, accumulated_submission_time=36789.401881, global_step=109260, preemption_count=0, score=36789.401881, test/accuracy=0.591100, test/loss=1.960513, test/num_examples=10000, total_duration=38316.754600, train/accuracy=0.809610, train/loss=0.973306, validation/accuracy=0.717700, validation/loss=1.357615, validation/num_examples=50000 | |
I0914 17:54:36.872528 139621107205888 logging_writer.py:48] [109500] global_step=109500, grad_norm=0.4119095206260681, loss=2.924466609954834 | |
I0914 17:57:25.075344 139621098813184 logging_writer.py:48] [110000] global_step=110000, grad_norm=0.42898768186569214, loss=2.928290605545044 | |
I0914 18:00:13.132177 139621107205888 logging_writer.py:48] [110500] global_step=110500, grad_norm=0.42565202713012695, loss=2.8965280055999756 | |
I0914 18:01:45.962886 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:01:53.526907 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:02:04.644013 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:02:06.896817 139785753851712 submission_runner.py:376] Time since start: 38847.86s, Step: 110778, {'train/accuracy': 0.7960578799247742, 'train/loss': 1.0393996238708496, 'validation/accuracy': 0.7094399929046631, 'validation/loss': 1.413783073425293, 'validation/num_examples': 50000, 'test/accuracy': 0.5824000239372253, 'test/loss': 2.058758020401001, 'test/num_examples': 10000, 'score': 37299.51941990852, 'total_duration': 38847.8635661602, 'accumulated_submission_time': 37299.51941990852, 'accumulated_eval_time': 1544.1275751590729, 'accumulated_logging_time': 2.539213180541992} | |
I0914 18:02:06.922950 139618036999936 logging_writer.py:48] [110778] accumulated_eval_time=1544.127575, accumulated_logging_time=2.539213, accumulated_submission_time=37299.519420, global_step=110778, preemption_count=0, score=37299.519420, test/accuracy=0.582400, test/loss=2.058758, test/num_examples=10000, total_duration=38847.863566, train/accuracy=0.796058, train/loss=1.039400, validation/accuracy=0.709440, validation/loss=1.413783, validation/num_examples=50000 | |
I0914 18:03:21.900630 139618045392640 logging_writer.py:48] [111000] global_step=111000, grad_norm=0.43163755536079407, loss=2.9281558990478516 | |
I0914 18:06:10.041956 139618036999936 logging_writer.py:48] [111500] global_step=111500, grad_norm=0.4414684474468231, loss=2.9374234676361084 | |
I0914 18:08:58.126828 139618045392640 logging_writer.py:48] [112000] global_step=112000, grad_norm=0.4413857161998749, loss=2.953256845474243 | |
I0914 18:10:37.223568 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:10:44.865906 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:10:55.847971 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:10:58.111961 139785753851712 submission_runner.py:376] Time since start: 39379.08s, Step: 112297, {'train/accuracy': 0.8116230964660645, 'train/loss': 0.9464977383613586, 'validation/accuracy': 0.721019983291626, 'validation/loss': 1.3188493251800537, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9455469846725464, 'test/num_examples': 10000, 'score': 37809.78449511528, 'total_duration': 39379.07870936394, 'accumulated_submission_time': 37809.78449511528, 'accumulated_eval_time': 1565.0159449577332, 'accumulated_logging_time': 2.5778493881225586} | |
I0914 18:10:58.137747 139621082027776 logging_writer.py:48] [112297] accumulated_eval_time=1565.015945, accumulated_logging_time=2.577849, accumulated_submission_time=37809.784495, global_step=112297, preemption_count=0, score=37809.784495, test/accuracy=0.595100, test/loss=1.945547, test/num_examples=10000, total_duration=39379.078709, train/accuracy=0.811623, train/loss=0.946498, validation/accuracy=0.721020, validation/loss=1.318849, validation/num_examples=50000 | |
I0914 18:12:06.665533 139621090420480 logging_writer.py:48] [112500] global_step=112500, grad_norm=0.4168124198913574, loss=2.871910333633423 | |
I0914 18:14:54.676797 139621082027776 logging_writer.py:48] [113000] global_step=113000, grad_norm=0.43743082880973816, loss=2.907787799835205 | |
I0914 18:17:42.738482 139621090420480 logging_writer.py:48] [113500] global_step=113500, grad_norm=0.4323841631412506, loss=2.890209674835205 | |
I0914 18:19:28.127635 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:19:35.779218 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:19:46.886007 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:19:49.129853 139785753851712 submission_runner.py:376] Time since start: 39910.10s, Step: 113815, {'train/accuracy': 0.8452048897743225, 'train/loss': 0.8711603879928589, 'validation/accuracy': 0.7268399596214294, 'validation/loss': 1.3569468259811401, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9684876203536987, 'test/num_examples': 10000, 'score': 38319.7420566082, 'total_duration': 39910.09659743309, 'accumulated_submission_time': 38319.7420566082, 'accumulated_eval_time': 1586.0181345939636, 'accumulated_logging_time': 2.6128127574920654} | |
I0914 18:19:49.157007 139618045392640 logging_writer.py:48] [113815] accumulated_eval_time=1586.018135, accumulated_logging_time=2.612813, accumulated_submission_time=38319.742057, global_step=113815, preemption_count=0, score=38319.742057, test/accuracy=0.595100, test/loss=1.968488, test/num_examples=10000, total_duration=39910.096597, train/accuracy=0.845205, train/loss=0.871160, validation/accuracy=0.726840, validation/loss=1.356947, validation/num_examples=50000 | |
I0914 18:20:51.536039 139620410959616 logging_writer.py:48] [114000] global_step=114000, grad_norm=0.46374964714050293, loss=2.9240376949310303 | |
I0914 18:23:39.450235 139618045392640 logging_writer.py:48] [114500] global_step=114500, grad_norm=0.44095996022224426, loss=2.8425261974334717 | |
I0914 18:26:27.675474 139620410959616 logging_writer.py:48] [115000] global_step=115000, grad_norm=0.4671684503555298, loss=2.88558292388916 | |
I0914 18:28:19.439936 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:28:27.056215 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:28:37.988294 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:28:40.252659 139785753851712 submission_runner.py:376] Time since start: 40441.22s, Step: 115334, {'train/accuracy': 0.8356186151504517, 'train/loss': 0.846179187297821, 'validation/accuracy': 0.7235400080680847, 'validation/loss': 1.3148432970046997, 'validation/num_examples': 50000, 'test/accuracy': 0.5975000262260437, 'test/loss': 1.953201413154602, 'test/num_examples': 10000, 'score': 38829.9924018383, 'total_duration': 40441.21931767464, 'accumulated_submission_time': 38829.9924018383, 'accumulated_eval_time': 1606.8307423591614, 'accumulated_logging_time': 2.6497488021850586} | |
I0914 18:28:40.279415 139618036999936 logging_writer.py:48] [115334] accumulated_eval_time=1606.830742, accumulated_logging_time=2.649749, accumulated_submission_time=38829.992402, global_step=115334, preemption_count=0, score=38829.992402, test/accuracy=0.597500, test/loss=1.953201, test/num_examples=10000, total_duration=40441.219318, train/accuracy=0.835619, train/loss=0.846179, validation/accuracy=0.723540, validation/loss=1.314843, validation/num_examples=50000 | |
I0914 18:29:36.330388 139618045392640 logging_writer.py:48] [115500] global_step=115500, grad_norm=0.4787191152572632, loss=2.8335397243499756 | |
I0914 18:32:24.424423 139618036999936 logging_writer.py:48] [116000] global_step=116000, grad_norm=0.44966191053390503, loss=2.8393120765686035 | |
I0914 18:35:12.558569 139618045392640 logging_writer.py:48] [116500] global_step=116500, grad_norm=0.453999787569046, loss=2.8157007694244385 | |
I0914 18:37:10.536283 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:37:18.147148 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:37:29.201172 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:37:31.462862 139785753851712 submission_runner.py:376] Time since start: 40972.43s, Step: 116853, {'train/accuracy': 0.8401426672935486, 'train/loss': 0.8460429906845093, 'validation/accuracy': 0.7312600016593933, 'validation/loss': 1.2911760807037354, 'validation/num_examples': 50000, 'test/accuracy': 0.6062000393867493, 'test/loss': 1.8956819772720337, 'test/num_examples': 10000, 'score': 39340.216069698334, 'total_duration': 40972.4295835495, 'accumulated_submission_time': 39340.216069698334, 'accumulated_eval_time': 1627.757281780243, 'accumulated_logging_time': 2.6862545013427734} | |
I0914 18:37:31.492592 139618045392640 logging_writer.py:48] [116853] accumulated_eval_time=1627.757282, accumulated_logging_time=2.686255, accumulated_submission_time=39340.216070, global_step=116853, preemption_count=0, score=39340.216070, test/accuracy=0.606200, test/loss=1.895682, test/num_examples=10000, total_duration=40972.429584, train/accuracy=0.840143, train/loss=0.846043, validation/accuracy=0.731260, validation/loss=1.291176, validation/num_examples=50000 | |
I0914 18:38:21.166066 139620410959616 logging_writer.py:48] [117000] global_step=117000, grad_norm=0.47764918208122253, loss=2.888979434967041 | |
I0914 18:41:09.129042 139618045392640 logging_writer.py:48] [117500] global_step=117500, grad_norm=0.4835554361343384, loss=2.7958009243011475 | |
I0914 18:43:57.120205 139620410959616 logging_writer.py:48] [118000] global_step=118000, grad_norm=0.4646367132663727, loss=2.7950828075408936 | |
I0914 18:46:01.562391 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:46:09.242262 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:46:20.153388 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:46:22.558940 139785753851712 submission_runner.py:376] Time since start: 41503.53s, Step: 118372, {'train/accuracy': 0.8466597199440002, 'train/loss': 0.8141149878501892, 'validation/accuracy': 0.7381399869918823, 'validation/loss': 1.2623021602630615, 'validation/num_examples': 50000, 'test/accuracy': 0.6214000582695007, 'test/loss': 1.8717379570007324, 'test/num_examples': 10000, 'score': 39850.24843668938, 'total_duration': 41503.52567815781, 'accumulated_submission_time': 39850.24843668938, 'accumulated_eval_time': 1648.753799200058, 'accumulated_logging_time': 2.7309017181396484} | |
I0914 18:46:22.587269 139621098813184 logging_writer.py:48] [118372] accumulated_eval_time=1648.753799, accumulated_logging_time=2.730902, accumulated_submission_time=39850.248437, global_step=118372, preemption_count=0, score=39850.248437, test/accuracy=0.621400, test/loss=1.871738, test/num_examples=10000, total_duration=41503.525678, train/accuracy=0.846660, train/loss=0.814115, validation/accuracy=0.738140, validation/loss=1.262302, validation/num_examples=50000 | |
I0914 18:47:05.924679 139621107205888 logging_writer.py:48] [118500] global_step=118500, grad_norm=0.49744462966918945, loss=2.8599724769592285 | |
I0914 18:49:54.139391 139621098813184 logging_writer.py:48] [119000] global_step=119000, grad_norm=0.47480571269989014, loss=2.7920033931732178 | |
I0914 18:52:42.335139 139621107205888 logging_writer.py:48] [119500] global_step=119500, grad_norm=0.4592367112636566, loss=2.7162587642669678 | |
I0914 18:54:52.652643 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 18:55:00.214142 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 18:55:11.169922 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 18:55:13.409800 139785753851712 submission_runner.py:376] Time since start: 42034.38s, Step: 119889, {'train/accuracy': 0.8564253449440002, 'train/loss': 0.7718811631202698, 'validation/accuracy': 0.7484999895095825, 'validation/loss': 1.2095959186553955, 'validation/num_examples': 50000, 'test/accuracy': 0.6221000552177429, 'test/loss': 1.825244426727295, 'test/num_examples': 10000, 'score': 40360.281074762344, 'total_duration': 42034.37654709816, 'accumulated_submission_time': 40360.281074762344, 'accumulated_eval_time': 1669.5109317302704, 'accumulated_logging_time': 2.7693583965301514} | |
I0914 18:55:13.440362 139620410959616 logging_writer.py:48] [119889] accumulated_eval_time=1669.510932, accumulated_logging_time=2.769358, accumulated_submission_time=40360.281075, global_step=119889, preemption_count=0, score=40360.281075, test/accuracy=0.622100, test/loss=1.825244, test/num_examples=10000, total_duration=42034.376547, train/accuracy=0.856425, train/loss=0.771881, validation/accuracy=0.748500, validation/loss=1.209596, validation/num_examples=50000 | |
I0914 18:55:51.055367 139620419352320 logging_writer.py:48] [120000] global_step=120000, grad_norm=0.46088361740112305, loss=2.748404026031494 | |
I0914 18:58:38.933666 139620410959616 logging_writer.py:48] [120500] global_step=120500, grad_norm=0.47682926058769226, loss=2.788738965988159 | |
I0914 19:01:27.151314 139620419352320 logging_writer.py:48] [121000] global_step=121000, grad_norm=0.48673200607299805, loss=2.728959560394287 | |
I0914 19:03:43.670358 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:03:51.275867 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:04:02.285306 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:04:04.623245 139785753851712 submission_runner.py:376] Time since start: 42565.59s, Step: 121408, {'train/accuracy': 0.8634805083274841, 'train/loss': 0.7633724212646484, 'validation/accuracy': 0.750220000743866, 'validation/loss': 1.2169595956802368, 'validation/num_examples': 50000, 'test/accuracy': 0.626800000667572, 'test/loss': 1.814558506011963, 'test/num_examples': 10000, 'score': 40870.47562837601, 'total_duration': 42565.58989524841, 'accumulated_submission_time': 40870.47562837601, 'accumulated_eval_time': 1690.4636988639832, 'accumulated_logging_time': 2.8127453327178955} | |
I0914 19:04:04.650436 139617894389504 logging_writer.py:48] [121408] accumulated_eval_time=1690.463699, accumulated_logging_time=2.812745, accumulated_submission_time=40870.475628, global_step=121408, preemption_count=0, score=40870.475628, test/accuracy=0.626800, test/loss=1.814559, test/num_examples=10000, total_duration=42565.589895, train/accuracy=0.863481, train/loss=0.763372, validation/accuracy=0.750220, validation/loss=1.216960, validation/num_examples=50000 | |
I0914 19:04:35.909023 139617902782208 logging_writer.py:48] [121500] global_step=121500, grad_norm=0.49807053804397583, loss=2.7201364040374756 | |
I0914 19:07:24.007127 139617894389504 logging_writer.py:48] [122000] global_step=122000, grad_norm=0.479455828666687, loss=2.7059950828552246 | |
I0914 19:10:12.186707 139617902782208 logging_writer.py:48] [122500] global_step=122500, grad_norm=0.4708760380744934, loss=2.661046266555786 | |
I0914 19:12:34.906721 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:12:42.492753 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:12:53.594218 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:12:55.807027 139785753851712 submission_runner.py:376] Time since start: 43096.77s, Step: 122926, {'train/accuracy': 0.896882951259613, 'train/loss': 0.6356571316719055, 'validation/accuracy': 0.7584199905395508, 'validation/loss': 1.1778696775436401, 'validation/num_examples': 50000, 'test/accuracy': 0.6371000409126282, 'test/loss': 1.7690379619598389, 'test/num_examples': 10000, 'score': 41380.697590112686, 'total_duration': 43096.77365708351, 'accumulated_submission_time': 41380.697590112686, 'accumulated_eval_time': 1711.363877773285, 'accumulated_logging_time': 2.851658821105957} | |
I0914 19:12:55.839606 139620410959616 logging_writer.py:48] [122926] accumulated_eval_time=1711.363878, accumulated_logging_time=2.851659, accumulated_submission_time=41380.697590, global_step=122926, preemption_count=0, score=41380.697590, test/accuracy=0.637100, test/loss=1.769038, test/num_examples=10000, total_duration=43096.773657, train/accuracy=0.896883, train/loss=0.635657, validation/accuracy=0.758420, validation/loss=1.177870, validation/num_examples=50000 | |
I0914 19:13:21.037121 139620419352320 logging_writer.py:48] [123000] global_step=123000, grad_norm=0.48896467685699463, loss=2.7046561241149902 | |
I0914 19:16:09.015657 139620410959616 logging_writer.py:48] [123500] global_step=123500, grad_norm=0.4708644151687622, loss=2.6549618244171143 | |
I0914 19:18:56.923338 139620419352320 logging_writer.py:48] [124000] global_step=124000, grad_norm=0.5001617670059204, loss=2.6695520877838135 | |
I0914 19:21:26.042989 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:21:33.706474 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:21:44.803596 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:21:47.160427 139785753851712 submission_runner.py:376] Time since start: 43628.13s, Step: 124445, {'train/accuracy': 0.8963049650192261, 'train/loss': 0.6321005821228027, 'validation/accuracy': 0.7647599577903748, 'validation/loss': 1.1482919454574585, 'validation/num_examples': 50000, 'test/accuracy': 0.6454000473022461, 'test/loss': 1.7265831232070923, 'test/num_examples': 10000, 'score': 41890.86552166939, 'total_duration': 43628.127166986465, 'accumulated_submission_time': 41890.86552166939, 'accumulated_eval_time': 1732.4812920093536, 'accumulated_logging_time': 2.8973982334136963} | |
I0914 19:21:47.187817 139618045392640 logging_writer.py:48] [124445] accumulated_eval_time=1732.481292, accumulated_logging_time=2.897398, accumulated_submission_time=41890.865522, global_step=124445, preemption_count=0, score=41890.865522, test/accuracy=0.645400, test/loss=1.726583, test/num_examples=10000, total_duration=43628.127167, train/accuracy=0.896305, train/loss=0.632101, validation/accuracy=0.764760, validation/loss=1.148292, validation/num_examples=50000 | |
I0914 19:22:06.001334 139620410959616 logging_writer.py:48] [124500] global_step=124500, grad_norm=0.5023619532585144, loss=2.6471095085144043 | |
I0914 19:24:53.974798 139618045392640 logging_writer.py:48] [125000] global_step=125000, grad_norm=0.5099729299545288, loss=2.6568284034729004 | |
I0914 19:27:41.930681 139620410959616 logging_writer.py:48] [125500] global_step=125500, grad_norm=0.4939233958721161, loss=2.586808204650879 | |
I0914 19:30:17.467931 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:30:24.997955 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:30:36.002361 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:30:38.243386 139785753851712 submission_runner.py:376] Time since start: 44159.21s, Step: 125965, {'train/accuracy': 0.9003308415412903, 'train/loss': 0.6020267009735107, 'validation/accuracy': 0.7695800065994263, 'validation/loss': 1.123653769493103, 'validation/num_examples': 50000, 'test/accuracy': 0.6490000486373901, 'test/loss': 1.7115205526351929, 'test/num_examples': 10000, 'score': 42401.11342215538, 'total_duration': 44159.210112810135, 'accumulated_submission_time': 42401.11342215538, 'accumulated_eval_time': 1753.2567028999329, 'accumulated_logging_time': 2.934680700302124} | |
I0914 19:30:38.274112 139621098813184 logging_writer.py:48] [125965] accumulated_eval_time=1753.256703, accumulated_logging_time=2.934681, accumulated_submission_time=42401.113422, global_step=125965, preemption_count=0, score=42401.113422, test/accuracy=0.649000, test/loss=1.711521, test/num_examples=10000, total_duration=44159.210113, train/accuracy=0.900331, train/loss=0.602027, validation/accuracy=0.769580, validation/loss=1.123654, validation/num_examples=50000 | |
I0914 19:30:50.376403 139621107205888 logging_writer.py:48] [126000] global_step=126000, grad_norm=0.4852544665336609, loss=2.5603702068328857 | |
I0914 19:33:38.312931 139621098813184 logging_writer.py:48] [126500] global_step=126500, grad_norm=0.518208384513855, loss=2.619767665863037 | |
I0914 19:36:26.558245 139621107205888 logging_writer.py:48] [127000] global_step=127000, grad_norm=0.5091156363487244, loss=2.5884017944335938 | |
I0914 19:39:08.269389 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:39:15.808310 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:39:26.886655 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:39:29.079470 139785753851712 submission_runner.py:376] Time since start: 44690.05s, Step: 127483, {'train/accuracy': 0.9035195708274841, 'train/loss': 0.5998629331588745, 'validation/accuracy': 0.7721199989318848, 'validation/loss': 1.1213932037353516, 'validation/num_examples': 50000, 'test/accuracy': 0.6554000377655029, 'test/loss': 1.6994982957839966, 'test/num_examples': 10000, 'score': 42911.07231712341, 'total_duration': 44690.04618239403, 'accumulated_submission_time': 42911.07231712341, 'accumulated_eval_time': 1774.066725730896, 'accumulated_logging_time': 2.978921890258789} | |
I0914 19:39:29.122376 139620410959616 logging_writer.py:48] [127483] accumulated_eval_time=1774.066726, accumulated_logging_time=2.978922, accumulated_submission_time=42911.072317, global_step=127483, preemption_count=0, score=42911.072317, test/accuracy=0.655400, test/loss=1.699498, test/num_examples=10000, total_duration=44690.046182, train/accuracy=0.903520, train/loss=0.599863, validation/accuracy=0.772120, validation/loss=1.121393, validation/num_examples=50000 | |
I0914 19:39:35.166222 139620419352320 logging_writer.py:48] [127500] global_step=127500, grad_norm=0.5102914571762085, loss=2.6478493213653564 | |
I0914 19:42:23.218772 139620410959616 logging_writer.py:48] [128000] global_step=128000, grad_norm=0.5169419050216675, loss=2.622182846069336 | |
I0914 19:45:11.451178 139620419352320 logging_writer.py:48] [128500] global_step=128500, grad_norm=0.5053176283836365, loss=2.5826773643493652 | |
I0914 19:47:59.658819 139620410959616 logging_writer.py:48] [129000] global_step=129000, grad_norm=0.4903814196586609, loss=2.593526601791382 | |
I0914 19:47:59.666086 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:48:07.148158 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:48:18.186425 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:48:20.480490 139785753851712 submission_runner.py:376] Time since start: 45221.45s, Step: 129001, {'train/accuracy': 0.9015664458274841, 'train/loss': 0.6057281494140625, 'validation/accuracy': 0.7714999914169312, 'validation/loss': 1.1205312013626099, 'validation/num_examples': 50000, 'test/accuracy': 0.6547000408172607, 'test/loss': 1.7005892992019653, 'test/num_examples': 10000, 'score': 43421.583512067795, 'total_duration': 45221.447182655334, 'accumulated_submission_time': 43421.583512067795, 'accumulated_eval_time': 1794.8810048103333, 'accumulated_logging_time': 3.0320894718170166} | |
I0914 19:48:20.522889 139621082027776 logging_writer.py:48] [129001] accumulated_eval_time=1794.881005, accumulated_logging_time=3.032089, accumulated_submission_time=43421.583512, global_step=129001, preemption_count=0, score=43421.583512, test/accuracy=0.654700, test/loss=1.700589, test/num_examples=10000, total_duration=45221.447183, train/accuracy=0.901566, train/loss=0.605728, validation/accuracy=0.771500, validation/loss=1.120531, validation/num_examples=50000 | |
I0914 19:51:08.502212 139621090420480 logging_writer.py:48] [129500] global_step=129500, grad_norm=0.5298535823822021, loss=2.646083354949951 | |
I0914 19:53:56.488409 139621082027776 logging_writer.py:48] [130000] global_step=130000, grad_norm=0.5100540518760681, loss=2.5769078731536865 | |
I0914 19:56:44.719903 139621090420480 logging_writer.py:48] [130500] global_step=130500, grad_norm=0.5042638778686523, loss=2.5659964084625244 | |
I0914 19:56:50.530486 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 19:56:58.011437 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 19:57:09.085397 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 19:57:11.381052 139785753851712 submission_runner.py:376] Time since start: 45752.35s, Step: 130519, {'train/accuracy': 0.9035793542861938, 'train/loss': 0.5980676412582397, 'validation/accuracy': 0.7721799612045288, 'validation/loss': 1.119748592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.6549000144004822, 'test/loss': 1.6995123624801636, 'test/num_examples': 10000, 'score': 43931.55882525444, 'total_duration': 45752.34780359268, 'accumulated_submission_time': 43931.55882525444, 'accumulated_eval_time': 1815.7315328121185, 'accumulated_logging_time': 3.084369659423828} | |
I0914 19:57:11.410619 139618045392640 logging_writer.py:48] [130519] accumulated_eval_time=1815.731533, accumulated_logging_time=3.084370, accumulated_submission_time=43931.558825, global_step=130519, preemption_count=0, score=43931.558825, test/accuracy=0.654900, test/loss=1.699512, test/num_examples=10000, total_duration=45752.347804, train/accuracy=0.903579, train/loss=0.598068, validation/accuracy=0.772180, validation/loss=1.119749, validation/num_examples=50000 | |
I0914 19:59:53.540511 139620410959616 logging_writer.py:48] [131000] global_step=131000, grad_norm=0.4992199242115021, loss=2.5815839767456055 | |
I0914 20:02:41.781963 139618045392640 logging_writer.py:48] [131500] global_step=131500, grad_norm=0.5254312753677368, loss=2.7122814655303955 | |
I0914 20:05:30.022560 139620410959616 logging_writer.py:48] [132000] global_step=132000, grad_norm=0.5144177079200745, loss=2.636721611022949 | |
I0914 20:05:41.557423 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:05:49.096254 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:06:00.296557 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:06:02.548254 139785753851712 submission_runner.py:376] Time since start: 46283.52s, Step: 132036, {'train/accuracy': 0.9046555757522583, 'train/loss': 0.5945121645927429, 'validation/accuracy': 0.7727400064468384, 'validation/loss': 1.1170986890792847, 'validation/num_examples': 50000, 'test/accuracy': 0.6539000272750854, 'test/loss': 1.6980239152908325, 'test/num_examples': 10000, 'score': 44441.67366838455, 'total_duration': 46283.51500082016, 'accumulated_submission_time': 44441.67366838455, 'accumulated_eval_time': 1836.7223308086395, 'accumulated_logging_time': 3.1229963302612305} | |
I0914 20:06:02.574327 139621090420480 logging_writer.py:48] [132036] accumulated_eval_time=1836.722331, accumulated_logging_time=3.122996, accumulated_submission_time=44441.673668, global_step=132036, preemption_count=0, score=44441.673668, test/accuracy=0.653900, test/loss=1.698024, test/num_examples=10000, total_duration=46283.515001, train/accuracy=0.904656, train/loss=0.594512, validation/accuracy=0.772740, validation/loss=1.117099, validation/num_examples=50000 | |
I0914 20:08:38.948357 139621098813184 logging_writer.py:48] [132500] global_step=132500, grad_norm=0.506624698638916, loss=2.62170147895813 | |
I0914 20:11:27.108828 139621090420480 logging_writer.py:48] [133000] global_step=133000, grad_norm=0.4876885414123535, loss=2.5655786991119385 | |
I0914 20:14:15.160693 139621098813184 logging_writer.py:48] [133500] global_step=133500, grad_norm=0.511705219745636, loss=2.5628979206085205 | |
I0914 20:14:32.740298 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:14:40.237702 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:14:51.334104 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:14:53.563938 139785753851712 submission_runner.py:376] Time since start: 46814.53s, Step: 133554, {'train/accuracy': 0.9058513641357422, 'train/loss': 0.5935448408126831, 'validation/accuracy': 0.772599995136261, 'validation/loss': 1.117397427558899, 'validation/num_examples': 50000, 'test/accuracy': 0.6536000370979309, 'test/loss': 1.6978504657745361, 'test/num_examples': 10000, 'score': 44951.80677986145, 'total_duration': 46814.53054857254, 'accumulated_submission_time': 44951.80677986145, 'accumulated_eval_time': 1857.5457971096039, 'accumulated_logging_time': 3.1587300300598145} | |
I0914 20:14:53.591204 139618036999936 logging_writer.py:48] [133554] accumulated_eval_time=1857.545797, accumulated_logging_time=3.158730, accumulated_submission_time=44951.806780, global_step=133554, preemption_count=0, score=44951.806780, test/accuracy=0.653600, test/loss=1.697850, test/num_examples=10000, total_duration=46814.530549, train/accuracy=0.905851, train/loss=0.593545, validation/accuracy=0.772600, validation/loss=1.117397, validation/num_examples=50000 | |
I0914 20:17:23.739792 139618045392640 logging_writer.py:48] [134000] global_step=134000, grad_norm=0.5241892337799072, loss=2.5950281620025635 | |
I0914 20:20:11.909889 139618036999936 logging_writer.py:48] [134500] global_step=134500, grad_norm=0.5013359189033508, loss=2.549062967300415 | |
I0914 20:23:00.125262 139618045392640 logging_writer.py:48] [135000] global_step=135000, grad_norm=0.49756449460983276, loss=2.585008382797241 | |
I0914 20:23:23.762339 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:23:31.236675 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:23:42.431392 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:23:44.701690 139785753851712 submission_runner.py:376] Time since start: 47345.67s, Step: 135072, {'train/accuracy': 0.9068678021430969, 'train/loss': 0.5902974605560303, 'validation/accuracy': 0.7732200026512146, 'validation/loss': 1.1165798902511597, 'validation/num_examples': 50000, 'test/accuracy': 0.6538000106811523, 'test/loss': 1.698063611984253, 'test/num_examples': 10000, 'score': 45461.94407606125, 'total_duration': 47345.66832566261, 'accumulated_submission_time': 45461.94407606125, 'accumulated_eval_time': 1878.4849972724915, 'accumulated_logging_time': 3.197124719619751} | |
I0914 20:23:44.728093 139618036999936 logging_writer.py:48] [135072] accumulated_eval_time=1878.484997, accumulated_logging_time=3.197125, accumulated_submission_time=45461.944076, global_step=135072, preemption_count=0, score=45461.944076, test/accuracy=0.653800, test/loss=1.698064, test/num_examples=10000, total_duration=47345.668326, train/accuracy=0.906868, train/loss=0.590297, validation/accuracy=0.773220, validation/loss=1.116580, validation/num_examples=50000 | |
I0914 20:26:08.701267 139621090420480 logging_writer.py:48] [135500] global_step=135500, grad_norm=0.5120027661323547, loss=2.583296298980713 | |
I0914 20:28:56.874789 139618036999936 logging_writer.py:48] [136000] global_step=136000, grad_norm=0.5158140063285828, loss=2.5482170581817627 | |
I0914 20:31:45.001054 139621090420480 logging_writer.py:48] [136500] global_step=136500, grad_norm=0.5265464186668396, loss=2.591872453689575 | |
I0914 20:32:14.701688 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:32:22.171722 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:32:33.272115 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:32:35.523978 139785753851712 submission_runner.py:376] Time since start: 47876.49s, Step: 136590, {'train/accuracy': 0.9063496589660645, 'train/loss': 0.5868960022926331, 'validation/accuracy': 0.7733599543571472, 'validation/loss': 1.11497962474823, 'validation/num_examples': 50000, 'test/accuracy': 0.6541000604629517, 'test/loss': 1.6960680484771729, 'test/num_examples': 10000, 'score': 45971.88573241234, 'total_duration': 47876.49062347412, 'accumulated_submission_time': 45971.88573241234, 'accumulated_eval_time': 1899.3071541786194, 'accumulated_logging_time': 3.2327637672424316} | |
I0914 20:32:35.551081 139618036999936 logging_writer.py:48] [136590] accumulated_eval_time=1899.307154, accumulated_logging_time=3.232764, accumulated_submission_time=45971.885732, global_step=136590, preemption_count=0, score=45971.885732, test/accuracy=0.654100, test/loss=1.696068, test/num_examples=10000, total_duration=47876.490623, train/accuracy=0.906350, train/loss=0.586896, validation/accuracy=0.773360, validation/loss=1.114980, validation/num_examples=50000 | |
I0914 20:34:53.602206 139618045392640 logging_writer.py:48] [137000] global_step=137000, grad_norm=0.5001352429389954, loss=2.6086196899414062 | |
I0914 20:37:41.751103 139618036999936 logging_writer.py:48] [137500] global_step=137500, grad_norm=0.5193489193916321, loss=2.615438461303711 | |
I0914 20:40:29.898882 139618045392640 logging_writer.py:48] [138000] global_step=138000, grad_norm=0.5259115099906921, loss=2.6131694316864014 | |
I0914 20:41:05.654629 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:41:13.167435 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:41:24.345569 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:41:26.596942 139785753851712 submission_runner.py:376] Time since start: 48407.56s, Step: 138108, {'train/accuracy': 0.907645046710968, 'train/loss': 0.5866842865943909, 'validation/accuracy': 0.7729199528694153, 'validation/loss': 1.1145741939544678, 'validation/num_examples': 50000, 'test/accuracy': 0.6550000309944153, 'test/loss': 1.6953842639923096, 'test/num_examples': 10000, 'score': 46481.956923007965, 'total_duration': 48407.56366467476, 'accumulated_submission_time': 46481.956923007965, 'accumulated_eval_time': 1920.2494142055511, 'accumulated_logging_time': 3.2690396308898926} | |
I0914 20:41:26.624465 139621090420480 logging_writer.py:48] [138108] accumulated_eval_time=1920.249414, accumulated_logging_time=3.269040, accumulated_submission_time=46481.956923, global_step=138108, preemption_count=0, score=46481.956923, test/accuracy=0.655000, test/loss=1.695384, test/num_examples=10000, total_duration=48407.563665, train/accuracy=0.907645, train/loss=0.586684, validation/accuracy=0.772920, validation/loss=1.114574, validation/num_examples=50000 | |
I0914 20:43:38.498144 139621098813184 logging_writer.py:48] [138500] global_step=138500, grad_norm=0.4836212694644928, loss=2.522245407104492 | |
I0914 20:46:26.472573 139621090420480 logging_writer.py:48] [139000] global_step=139000, grad_norm=0.48201984167099, loss=2.5357580184936523 | |
I0914 20:49:14.508329 139621098813184 logging_writer.py:48] [139500] global_step=139500, grad_norm=0.5080897212028503, loss=2.6238484382629395 | |
I0914 20:49:56.659739 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:50:04.166987 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:50:15.245995 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:50:17.443364 139785753851712 submission_runner.py:376] Time since start: 48938.41s, Step: 139627, {'train/accuracy': 0.9070870280265808, 'train/loss': 0.5904383659362793, 'validation/accuracy': 0.7734000086784363, 'validation/loss': 1.1148360967636108, 'validation/num_examples': 50000, 'test/accuracy': 0.6565000414848328, 'test/loss': 1.696603775024414, 'test/num_examples': 10000, 'score': 46991.960359573364, 'total_duration': 48938.4100048542, 'accumulated_submission_time': 46991.960359573364, 'accumulated_eval_time': 1941.032898902893, 'accumulated_logging_time': 3.3056366443634033} | |
I0914 20:50:17.474641 139620410959616 logging_writer.py:48] [139627] accumulated_eval_time=1941.032899, accumulated_logging_time=3.305637, accumulated_submission_time=46991.960360, global_step=139627, preemption_count=0, score=46991.960360, test/accuracy=0.656500, test/loss=1.696604, test/num_examples=10000, total_duration=48938.410005, train/accuracy=0.907087, train/loss=0.590438, validation/accuracy=0.773400, validation/loss=1.114836, validation/num_examples=50000 | |
I0914 20:52:22.570719 139785753851712 spec.py:320] Evaluating on the training split. | |
I0914 20:52:29.954459 139785753851712 spec.py:332] Evaluating on the validation split. | |
I0914 20:52:40.947678 139785753851712 spec.py:348] Evaluating on the test split. | |
I0914 20:52:43.221620 139785753851712 submission_runner.py:376] Time since start: 49084.19s, Step: 140000, {'train/accuracy': 0.9084422588348389, 'train/loss': 0.578050971031189, 'validation/accuracy': 0.7722199559211731, 'validation/loss': 1.1122545003890991, 'validation/num_examples': 50000, 'test/accuracy': 0.655500054359436, 'test/loss': 1.6934614181518555, 'test/num_examples': 10000, 'score': 47117.04019618034, 'total_duration': 49084.18835735321, 'accumulated_submission_time': 47117.04019618034, 'accumulated_eval_time': 1961.6837706565857, 'accumulated_logging_time': 3.3469996452331543} | |
I0914 20:52:43.254832 139618036999936 logging_writer.py:48] [140000] accumulated_eval_time=1961.683771, accumulated_logging_time=3.347000, accumulated_submission_time=47117.040196, global_step=140000, preemption_count=0, score=47117.040196, test/accuracy=0.655500, test/loss=1.693461, test/num_examples=10000, total_duration=49084.188357, train/accuracy=0.908442, train/loss=0.578051, validation/accuracy=0.772220, validation/loss=1.112255, validation/num_examples=50000 | |
I0914 20:52:43.278700 139621082027776 logging_writer.py:48] [140000] global_step=140000, preemption_count=0, score=47117.040196 | |
I0914 20:52:43.516733 139785753851712 checkpoints.py:490] Saving checkpoint at step: 140000 | |
I0914 20:52:44.370062 139785753851712 checkpoints.py:422] Saved checkpoint at /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/checkpoint_140000 | |
I0914 20:52:44.389781 139785753851712 checkpoint_utils.py:240] Saved checkpoint to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/checkpoint_140000. | |
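The run above writes its final Flax checkpoint for step 140000 under trial_1. As a minimal sketch (not part of the reference code), that checkpoint can be restored for offline inspection with the flax.training.checkpoints API implied by the paths in the log; the structure of the restored pytree depends on what submission_runner.py actually saved and is treated as an opaque dict here.

    # Sketch: restore the step-140000 checkpoint written above for offline inspection.
    # Assumes the default Flax "checkpoint_<step>" prefix visible in the log paths;
    # the keys of the restored dict depend on what the runner saved and are not
    # documented by this log.
    from flax.training import checkpoints

    ckpt_dir = "/experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1"
    restored = checkpoints.restore_checkpoint(ckpt_dir, target=None, step=140000)
    print(sorted(restored.keys()))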
I0914 20:52:45.253473 139785753851712 submission_runner.py:540] Tuning trial 1/1 | |
I0914 20:52:45.253736 139785753851712 submission_runner.py:541] Hyperparameters: Hyperparameters(learning_rate=4.131896390902391, beta1=0.9274758113254791, beta2=0.9978504782314613, warmup_steps=6999, decay_steps_factor=0.9007765761611038, end_factor=0.001, weight_decay=5.6687777311501786e-06, label_smoothing=0.2) | |
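For reference, a minimal optax sketch of an optimizer consistent with the hyperparameters logged above. This is an illustration, not the jax_momentum.py reference implementation: it assumes heavy-ball momentum with beta1 as the momentum coefficient (beta2 comes from the shared tuning search space and is unused by a plain momentum update), a linear warmup followed by cosine decay to end_factor * learning_rate, and decay_steps taken as decay_steps_factor times the 140,000 steps at which this run stops; weight_decay and label_smoothing are assumed to be applied in the loss/update and are omitted below.

    # Illustrative sketch only; the decay shape and step accounting are assumptions,
    # not read from this log.
    import optax

    learning_rate = 4.131896390902391   # peak LR from the logged hyperparameters
    beta1 = 0.9274758113254791          # used here as the momentum coefficient
    warmup_steps = 6999
    total_steps = 140_000               # step count at which this run checkpoints and stops
    cosine_steps = int(0.9007765761611038 * total_steps)
    end_factor = 0.001

    schedule = optax.warmup_cosine_decay_schedule(
        init_value=0.0,
        peak_value=learning_rate,
        warmup_steps=warmup_steps,
        decay_steps=warmup_steps + cosine_steps,
        end_value=end_factor * learning_rate,
    )
    optimizer = optax.sgd(learning_rate=schedule, momentum=beta1, nesterov=False)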
I0914 20:52:45.258056 139785753851712 submission_runner.py:542] Metrics: {'eval_results': [(1, {'train/accuracy': 0.0009367027669213712, 'train/loss': 6.9118571281433105, 'validation/accuracy': 0.0010400000028312206, 'validation/loss': 6.911978721618652, 'validation/num_examples': 50000, 'test/accuracy': 0.0014000000664964318, 'test/loss': 6.91181755065918, 'test/num_examples': 10000, 'score': 62.32368874549866, 'total_duration': 109.0933792591095, 'accumulated_submission_time': 62.32368874549866, 'accumulated_eval_time': 46.76959991455078, 'accumulated_logging_time': 0, 'global_step': 1, 'preemption_count': 0}), (1514, {'train/accuracy': 0.18895487487316132, 'train/loss': 4.249844551086426, 'validation/accuracy': 0.1698399931192398, 'validation/loss': 4.387373924255371, 'validation/num_examples': 50000, 'test/accuracy': 0.12890000641345978, 'test/loss': 4.780310153961182, 'test/num_examples': 10000, 'score': 572.3801600933075, 'total_duration': 636.7591044902802, 'accumulated_submission_time': 572.3801600933075, 'accumulated_eval_time': 64.32867097854614, 'accumulated_logging_time': 0.02817511558532715, 'global_step': 1514, 'preemption_count': 0}), (3030, {'train/accuracy': 0.35439252853393555, 'train/loss': 3.16475772857666, 'validation/accuracy': 0.32311999797821045, 'validation/loss': 3.354123592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.24500000476837158, 'test/loss': 3.896735429763794, 'test/num_examples': 10000, 'score': 1082.4337828159332, 'total_duration': 1164.5784351825714, 'accumulated_submission_time': 1082.4337828159332, 'accumulated_eval_time': 82.04390573501587, 'accumulated_logging_time': 0.05537152290344238, 'global_step': 3030, 'preemption_count': 0}), (4547, {'train/accuracy': 0.4696468412876129, 'train/loss': 2.565059185028076, 'validation/accuracy': 0.4343400001525879, 'validation/loss': 2.7238423824310303, 'validation/num_examples': 50000, 'test/accuracy': 0.3296000063419342, 'test/loss': 3.3760478496551514, 'test/num_examples': 10000, 'score': 1592.5839030742645, 'total_duration': 1692.2868733406067, 'accumulated_submission_time': 1592.5839030742645, 'accumulated_eval_time': 99.54879140853882, 'accumulated_logging_time': 0.08617043495178223, 'global_step': 4547, 'preemption_count': 0}), (6064, {'train/accuracy': 0.485072523355484, 'train/loss': 2.4722769260406494, 'validation/accuracy': 0.45179998874664307, 'validation/loss': 2.632256269454956, 'validation/num_examples': 50000, 'test/accuracy': 0.3456000089645386, 'test/loss': 3.28208327293396, 'test/num_examples': 10000, 'score': 2102.707985162735, 'total_duration': 2220.0234982967377, 'accumulated_submission_time': 2102.707985162735, 'accumulated_eval_time': 117.10866022109985, 'accumulated_logging_time': 0.11611032485961914, 'global_step': 6064, 'preemption_count': 0}), (7581, {'train/accuracy': 0.5335220098495483, 'train/loss': 2.235621452331543, 'validation/accuracy': 0.5026400089263916, 'validation/loss': 2.388488531112671, 'validation/num_examples': 50000, 'test/accuracy': 0.392300009727478, 'test/loss': 3.0264182090759277, 'test/num_examples': 10000, 'score': 2612.7207324504852, 'total_duration': 2747.8368368148804, 'accumulated_submission_time': 2612.7207324504852, 'accumulated_eval_time': 134.8591091632843, 'accumulated_logging_time': 0.14357876777648926, 'global_step': 7581, 'preemption_count': 0}), (9098, {'train/accuracy': 0.5826291441917419, 'train/loss': 2.0229527950286865, 'validation/accuracy': 0.5064799785614014, 'validation/loss': 2.3824005126953125, 'validation/num_examples': 
50000, 'test/accuracy': 0.3856000304222107, 'test/loss': 3.0475265979766846, 'test/num_examples': 10000, 'score': 3122.8992822170258, 'total_duration': 3275.9179759025574, 'accumulated_submission_time': 3122.8992822170258, 'accumulated_eval_time': 152.71290373802185, 'accumulated_logging_time': 0.17004680633544922, 'global_step': 9098, 'preemption_count': 0}), (10615, {'train/accuracy': 0.5702128410339355, 'train/loss': 2.0194029808044434, 'validation/accuracy': 0.5161600112915039, 'validation/loss': 2.2897284030914307, 'validation/num_examples': 50000, 'test/accuracy': 0.3993000090122223, 'test/loss': 2.9481253623962402, 'test/num_examples': 10000, 'score': 3633.0976436138153, 'total_duration': 3804.0056059360504, 'accumulated_submission_time': 3633.0976436138153, 'accumulated_eval_time': 170.5518193244934, 'accumulated_logging_time': 0.1971442699432373, 'global_step': 10615, 'preemption_count': 0}), (12132, {'train/accuracy': 0.5753945708274841, 'train/loss': 2.0027692317962646, 'validation/accuracy': 0.5300599932670593, 'validation/loss': 2.2071869373321533, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.8690314292907715, 'test/num_examples': 10000, 'score': 4143.2383596897125, 'total_duration': 4332.050138235092, 'accumulated_submission_time': 4143.2383596897125, 'accumulated_eval_time': 188.3957643508911, 'accumulated_logging_time': 0.23388051986694336, 'global_step': 12132, 'preemption_count': 0}), (13649, {'train/accuracy': 0.5742785334587097, 'train/loss': 2.003106117248535, 'validation/accuracy': 0.5323799848556519, 'validation/loss': 2.2081964015960693, 'validation/num_examples': 50000, 'test/accuracy': 0.4196000099182129, 'test/loss': 2.892209529876709, 'test/num_examples': 10000, 'score': 4653.37796998024, 'total_duration': 4860.886433124542, 'accumulated_submission_time': 4653.37796998024, 'accumulated_eval_time': 207.04141402244568, 'accumulated_logging_time': 0.26221203804016113, 'global_step': 13649, 'preemption_count': 0}), (15166, {'train/accuracy': 0.5779455900192261, 'train/loss': 2.057650327682495, 'validation/accuracy': 0.5369799733161926, 'validation/loss': 2.2513246536254883, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.942534923553467, 'test/num_examples': 10000, 'score': 5163.59117102623, 'total_duration': 5389.39173579216, 'accumulated_submission_time': 5163.59117102623, 'accumulated_eval_time': 225.25672912597656, 'accumulated_logging_time': 0.31658077239990234, 'global_step': 15166, 'preemption_count': 0}), (16683, {'train/accuracy': 0.5338209271430969, 'train/loss': 2.335331678390503, 'validation/accuracy': 0.5003600120544434, 'validation/loss': 2.478173017501831, 'validation/num_examples': 50000, 'test/accuracy': 0.38210001587867737, 'test/loss': 3.1922895908355713, 'test/num_examples': 10000, 'score': 5673.775738954544, 'total_duration': 5919.104665517807, 'accumulated_submission_time': 5673.775738954544, 'accumulated_eval_time': 244.7305908203125, 'accumulated_logging_time': 0.3483104705810547, 'global_step': 16683, 'preemption_count': 0}), (18201, {'train/accuracy': 0.6382134556770325, 'train/loss': 1.74356210231781, 'validation/accuracy': 0.5595200061798096, 'validation/loss': 2.1038706302642822, 'validation/num_examples': 50000, 'test/accuracy': 0.43620002269744873, 'test/loss': 2.7823822498321533, 'test/num_examples': 10000, 'score': 6183.962655305862, 'total_duration': 6448.986067771912, 'accumulated_submission_time': 6183.962655305862, 'accumulated_eval_time': 
264.35929918289185, 'accumulated_logging_time': 0.39099645614624023, 'global_step': 18201, 'preemption_count': 0}), (19718, {'train/accuracy': 0.5982142686843872, 'train/loss': 1.9227712154388428, 'validation/accuracy': 0.5423600077629089, 'validation/loss': 2.185263156890869, 'validation/num_examples': 50000, 'test/accuracy': 0.4272000193595886, 'test/loss': 2.8164870738983154, 'test/num_examples': 10000, 'score': 6694.032505512238, 'total_duration': 6979.200145483017, 'accumulated_submission_time': 6694.032505512238, 'accumulated_eval_time': 284.4493384361267, 'accumulated_logging_time': 0.4227602481842041, 'global_step': 19718, 'preemption_count': 0}), (21235, {'train/accuracy': 0.6011040806770325, 'train/loss': 1.9267860651016235, 'validation/accuracy': 0.5546199679374695, 'validation/loss': 2.1422059535980225, 'validation/num_examples': 50000, 'test/accuracy': 0.4358000159263611, 'test/loss': 2.809361457824707, 'test/num_examples': 10000, 'score': 7203.964419841766, 'total_duration': 7508.275420188904, 'accumulated_submission_time': 7203.964419841766, 'accumulated_eval_time': 303.52203822135925, 'accumulated_logging_time': 0.47086191177368164, 'global_step': 21235, 'preemption_count': 0}), (22753, {'train/accuracy': 0.5950454473495483, 'train/loss': 1.9752215147018433, 'validation/accuracy': 0.5528799891471863, 'validation/loss': 2.1724321842193604, 'validation/num_examples': 50000, 'test/accuracy': 0.4336000084877014, 'test/loss': 2.810899257659912, 'test/num_examples': 10000, 'score': 7714.026890993118, 'total_duration': 8037.667640447617, 'accumulated_submission_time': 7714.026890993118, 'accumulated_eval_time': 322.7941265106201, 'accumulated_logging_time': 0.5051746368408203, 'global_step': 22753, 'preemption_count': 0}), (24270, {'train/accuracy': 0.5795798897743225, 'train/loss': 2.023024320602417, 'validation/accuracy': 0.5407800078392029, 'validation/loss': 2.230668306350708, 'validation/num_examples': 50000, 'test/accuracy': 0.41540002822875977, 'test/loss': 2.895124673843384, 'test/num_examples': 10000, 'score': 8224.06004691124, 'total_duration': 8567.42837190628, 'accumulated_submission_time': 8224.06004691124, 'accumulated_eval_time': 342.4695653915405, 'accumulated_logging_time': 0.5343174934387207, 'global_step': 24270, 'preemption_count': 0}), (25788, {'train/accuracy': 0.6070232391357422, 'train/loss': 1.906052589416504, 'validation/accuracy': 0.5681399703025818, 'validation/loss': 2.081261396408081, 'validation/num_examples': 50000, 'test/accuracy': 0.4448000192642212, 'test/loss': 2.7579729557037354, 'test/num_examples': 10000, 'score': 8734.046847581863, 'total_duration': 9097.127324581146, 'accumulated_submission_time': 8734.046847581863, 'accumulated_eval_time': 362.12878465652466, 'accumulated_logging_time': 0.5639915466308594, 'global_step': 25788, 'preemption_count': 0}), (27305, {'train/accuracy': 0.6470025181770325, 'train/loss': 1.7079155445098877, 'validation/accuracy': 0.5748400092124939, 'validation/loss': 2.0490245819091797, 'validation/num_examples': 50000, 'test/accuracy': 0.4561000168323517, 'test/loss': 2.6967175006866455, 'test/num_examples': 10000, 'score': 9244.031145811081, 'total_duration': 9627.8626434803, 'accumulated_submission_time': 9244.031145811081, 'accumulated_eval_time': 382.8217294216156, 'accumulated_logging_time': 0.5991528034210205, 'global_step': 27305, 'preemption_count': 0}), (28822, {'train/accuracy': 0.6150151491165161, 'train/loss': 1.8384064435958862, 'validation/accuracy': 0.5671799778938293, 'validation/loss': 
2.073777675628662, 'validation/num_examples': 50000, 'test/accuracy': 0.4439000189304352, 'test/loss': 2.7257843017578125, 'test/num_examples': 10000, 'score': 9754.244331598282, 'total_duration': 10158.26204609871, 'accumulated_submission_time': 9754.244331598282, 'accumulated_eval_time': 402.9500343799591, 'accumulated_logging_time': 0.6338088512420654, 'global_step': 28822, 'preemption_count': 0}), (30340, {'train/accuracy': 0.6219905614852905, 'train/loss': 1.8566009998321533, 'validation/accuracy': 0.5748000144958496, 'validation/loss': 2.0822231769561768, 'validation/num_examples': 50000, 'test/accuracy': 0.4506000280380249, 'test/loss': 2.730398178100586, 'test/num_examples': 10000, 'score': 10264.379125356674, 'total_duration': 10689.885441303253, 'accumulated_submission_time': 10264.379125356674, 'accumulated_eval_time': 424.3793590068817, 'accumulated_logging_time': 0.6697630882263184, 'global_step': 30340, 'preemption_count': 0}), (31857, {'train/accuracy': 0.6304607391357422, 'train/loss': 1.7625677585601807, 'validation/accuracy': 0.5828399658203125, 'validation/loss': 1.97676682472229, 'validation/num_examples': 50000, 'test/accuracy': 0.460500031709671, 'test/loss': 2.6475884914398193, 'test/num_examples': 10000, 'score': 10774.378732919693, 'total_duration': 11221.44010066986, 'accumulated_submission_time': 10774.378732919693, 'accumulated_eval_time': 445.87978982925415, 'accumulated_logging_time': 0.7013595104217529, 'global_step': 31857, 'preemption_count': 0}), (33374, {'train/accuracy': 0.6193598508834839, 'train/loss': 1.826348066329956, 'validation/accuracy': 0.5787799954414368, 'validation/loss': 2.038647413253784, 'validation/num_examples': 50000, 'test/accuracy': 0.46250003576278687, 'test/loss': 2.679227352142334, 'test/num_examples': 10000, 'score': 11284.596626758575, 'total_duration': 11753.228493452072, 'accumulated_submission_time': 11284.596626758575, 'accumulated_eval_time': 467.390745639801, 'accumulated_logging_time': 0.738187313079834, 'global_step': 33374, 'preemption_count': 0}), (34891, {'train/accuracy': 0.6135203838348389, 'train/loss': 1.8042339086532593, 'validation/accuracy': 0.5674799680709839, 'validation/loss': 2.0255982875823975, 'validation/num_examples': 50000, 'test/accuracy': 0.4407000243663788, 'test/loss': 2.6929032802581787, 'test/num_examples': 10000, 'score': 11794.77813744545, 'total_duration': 12285.116560459137, 'accumulated_submission_time': 11794.77813744545, 'accumulated_eval_time': 489.0379819869995, 'accumulated_logging_time': 0.7739980220794678, 'global_step': 34891, 'preemption_count': 0}), (36408, {'train/accuracy': 0.6544762253761292, 'train/loss': 1.7204208374023438, 'validation/accuracy': 0.5828199982643127, 'validation/loss': 2.025986433029175, 'validation/num_examples': 50000, 'test/accuracy': 0.45920002460479736, 'test/loss': 2.709810733795166, 'test/num_examples': 10000, 'score': 12304.926926612854, 'total_duration': 12817.259620189667, 'accumulated_submission_time': 12304.926926612854, 'accumulated_eval_time': 510.97503638267517, 'accumulated_logging_time': 0.8083920478820801, 'global_step': 36408, 'preemption_count': 0}), (37926, {'train/accuracy': 0.6563695669174194, 'train/loss': 1.7108644247055054, 'validation/accuracy': 0.5983799695968628, 'validation/loss': 1.9706319570541382, 'validation/num_examples': 50000, 'test/accuracy': 0.48260003328323364, 'test/loss': 2.6029040813446045, 'test/num_examples': 10000, 'score': 12815.176861763, 'total_duration': 13348.463897228241, 'accumulated_submission_time': 
12815.176861763, 'accumulated_eval_time': 531.8736464977264, 'accumulated_logging_time': 0.8413448333740234, 'global_step': 37926, 'preemption_count': 0}), (39443, {'train/accuracy': 0.6421595811843872, 'train/loss': 1.6656968593597412, 'validation/accuracy': 0.5947999954223633, 'validation/loss': 1.8805724382400513, 'validation/num_examples': 50000, 'test/accuracy': 0.4724000096321106, 'test/loss': 2.5521347522735596, 'test/num_examples': 10000, 'score': 13325.226942777634, 'total_duration': 13879.367692947388, 'accumulated_submission_time': 13325.226942777634, 'accumulated_eval_time': 552.67098736763, 'accumulated_logging_time': 0.8749892711639404, 'global_step': 39443, 'preemption_count': 0}), (40961, {'train/accuracy': 0.6224888563156128, 'train/loss': 1.8308100700378418, 'validation/accuracy': 0.5813800096511841, 'validation/loss': 2.03011155128479, 'validation/num_examples': 50000, 'test/accuracy': 0.46330001950263977, 'test/loss': 2.654615640640259, 'test/num_examples': 10000, 'score': 13835.485067367554, 'total_duration': 14410.576689004898, 'accumulated_submission_time': 13835.485067367554, 'accumulated_eval_time': 573.5648620128632, 'accumulated_logging_time': 0.9091906547546387, 'global_step': 40961, 'preemption_count': 0}), (42478, {'train/accuracy': 0.6535195708274841, 'train/loss': 1.6985844373703003, 'validation/accuracy': 0.6078799962997437, 'validation/loss': 1.8977073431015015, 'validation/num_examples': 50000, 'test/accuracy': 0.4879000186920166, 'test/loss': 2.5333468914031982, 'test/num_examples': 10000, 'score': 14345.642942905426, 'total_duration': 14941.717395067215, 'accumulated_submission_time': 14345.642942905426, 'accumulated_eval_time': 594.4908349514008, 'accumulated_logging_time': 0.9433751106262207, 'global_step': 42478, 'preemption_count': 0}), (43995, {'train/accuracy': 0.6800063848495483, 'train/loss': 1.590664267539978, 'validation/accuracy': 0.5967199802398682, 'validation/loss': 1.9601261615753174, 'validation/num_examples': 50000, 'test/accuracy': 0.47510001063346863, 'test/loss': 2.590411424636841, 'test/num_examples': 10000, 'score': 14855.678381443024, 'total_duration': 15472.915107250214, 'accumulated_submission_time': 14855.678381443024, 'accumulated_eval_time': 615.5962023735046, 'accumulated_logging_time': 0.9775500297546387, 'global_step': 43995, 'preemption_count': 0}), (45512, {'train/accuracy': 0.6493741869926453, 'train/loss': 1.7071624994277954, 'validation/accuracy': 0.5888199806213379, 'validation/loss': 2.0002124309539795, 'validation/num_examples': 50000, 'test/accuracy': 0.45580002665519714, 'test/loss': 2.681110382080078, 'test/num_examples': 10000, 'score': 15365.652275562286, 'total_duration': 16004.144119977951, 'accumulated_submission_time': 15365.652275562286, 'accumulated_eval_time': 636.7960221767426, 'accumulated_logging_time': 1.010751485824585, 'global_step': 45512, 'preemption_count': 0}), (47029, {'train/accuracy': 0.6650390625, 'train/loss': 1.6022390127182007, 'validation/accuracy': 0.6084200143814087, 'validation/loss': 1.8593943119049072, 'validation/num_examples': 50000, 'test/accuracy': 0.4799000322818756, 'test/loss': 2.522620677947998, 'test/num_examples': 10000, 'score': 15875.610366106033, 'total_duration': 16535.40897345543, 'accumulated_submission_time': 15875.610366106033, 'accumulated_eval_time': 658.0466804504395, 'accumulated_logging_time': 1.043591022491455, 'global_step': 47029, 'preemption_count': 0}), (48546, {'train/accuracy': 0.6487563848495483, 'train/loss': 1.6832637786865234, 
'validation/accuracy': 0.5993599891662598, 'validation/loss': 1.918584942817688, 'validation/num_examples': 50000, 'test/accuracy': 0.47190001606941223, 'test/loss': 2.5920767784118652, 'test/num_examples': 10000, 'score': 16385.556366205215, 'total_duration': 17066.591827869415, 'accumulated_submission_time': 16385.556366205215, 'accumulated_eval_time': 679.2269690036774, 'accumulated_logging_time': 1.0766348838806152, 'global_step': 48546, 'preemption_count': 0}), (50063, {'train/accuracy': 0.6590401530265808, 'train/loss': 1.6971187591552734, 'validation/accuracy': 0.6092999577522278, 'validation/loss': 1.9145151376724243, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5815019607543945, 'test/num_examples': 10000, 'score': 16895.64519429207, 'total_duration': 17597.92692875862, 'accumulated_submission_time': 16895.64519429207, 'accumulated_eval_time': 700.4168322086334, 'accumulated_logging_time': 1.1097350120544434, 'global_step': 50063, 'preemption_count': 0}), (51580, {'train/accuracy': 0.6513472199440002, 'train/loss': 1.657335877418518, 'validation/accuracy': 0.6073200106620789, 'validation/loss': 1.8589073419570923, 'validation/num_examples': 50000, 'test/accuracy': 0.4838000237941742, 'test/loss': 2.5184545516967773, 'test/num_examples': 10000, 'score': 17405.740804433823, 'total_duration': 18129.24203300476, 'accumulated_submission_time': 17405.740804433823, 'accumulated_eval_time': 721.5812134742737, 'accumulated_logging_time': 1.142387866973877, 'global_step': 51580, 'preemption_count': 0}), (53098, {'train/accuracy': 0.6906289458274841, 'train/loss': 1.4674862623214722, 'validation/accuracy': 0.5989800095558167, 'validation/loss': 1.8677852153778076, 'validation/num_examples': 50000, 'test/accuracy': 0.48270002007484436, 'test/loss': 2.5035319328308105, 'test/num_examples': 10000, 'score': 17915.94573879242, 'total_duration': 18660.634664297104, 'accumulated_submission_time': 17915.94573879242, 'accumulated_eval_time': 742.713464975357, 'accumulated_logging_time': 1.1745285987854004, 'global_step': 53098, 'preemption_count': 0}), (54616, {'train/accuracy': 0.6725525856018066, 'train/loss': 1.5843250751495361, 'validation/accuracy': 0.6118199825286865, 'validation/loss': 1.859339952468872, 'validation/num_examples': 50000, 'test/accuracy': 0.47780001163482666, 'test/loss': 2.5555808544158936, 'test/num_examples': 10000, 'score': 18425.954810380936, 'total_duration': 19191.906791448593, 'accumulated_submission_time': 18425.954810380936, 'accumulated_eval_time': 763.9139442443848, 'accumulated_logging_time': 1.2138841152191162, 'global_step': 54616, 'preemption_count': 0}), (56133, {'train/accuracy': 0.6829958558082581, 'train/loss': 1.4818432331085205, 'validation/accuracy': 0.6251199841499329, 'validation/loss': 1.7404627799987793, 'validation/num_examples': 50000, 'test/accuracy': 0.5063000321388245, 'test/loss': 2.392498254776001, 'test/num_examples': 10000, 'score': 18936.050762176514, 'total_duration': 19723.13370156288, 'accumulated_submission_time': 18936.050762176514, 'accumulated_eval_time': 784.9846830368042, 'accumulated_logging_time': 1.2514803409576416, 'global_step': 56133, 'preemption_count': 0}), (57651, {'train/accuracy': 0.6757014989852905, 'train/loss': 1.5783743858337402, 'validation/accuracy': 0.619879961013794, 'validation/loss': 1.8317604064941406, 'validation/num_examples': 50000, 'test/accuracy': 0.49640002846717834, 'test/loss': 2.4753739833831787, 'test/num_examples': 10000, 'score': 19446.212296009064, 
'total_duration': 20254.494203805923, 'accumulated_submission_time': 19446.212296009064, 'accumulated_eval_time': 806.1234366893768, 'accumulated_logging_time': 1.2888352870941162, 'global_step': 57651, 'preemption_count': 0}), (59168, {'train/accuracy': 0.6541573405265808, 'train/loss': 1.6419929265975952, 'validation/accuracy': 0.6013599634170532, 'validation/loss': 1.8765325546264648, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5387802124023438, 'test/num_examples': 10000, 'score': 19956.23943376541, 'total_duration': 20785.629487276077, 'accumulated_submission_time': 19956.23943376541, 'accumulated_eval_time': 827.1725206375122, 'accumulated_logging_time': 1.3252980709075928, 'global_step': 59168, 'preemption_count': 0}), (60685, {'train/accuracy': 0.6724131107330322, 'train/loss': 1.6111648082733154, 'validation/accuracy': 0.6247599720954895, 'validation/loss': 1.8165513277053833, 'validation/num_examples': 50000, 'test/accuracy': 0.49490001797676086, 'test/loss': 2.476473093032837, 'test/num_examples': 10000, 'score': 20466.31075167656, 'total_duration': 21316.890612602234, 'accumulated_submission_time': 20466.31075167656, 'accumulated_eval_time': 848.3012022972107, 'accumulated_logging_time': 1.362497329711914, 'global_step': 60685, 'preemption_count': 0}), (62203, {'train/accuracy': 0.6910076141357422, 'train/loss': 1.4848798513412476, 'validation/accuracy': 0.611739993095398, 'validation/loss': 1.8428031206130981, 'validation/num_examples': 50000, 'test/accuracy': 0.4788000285625458, 'test/loss': 2.5234971046447754, 'test/num_examples': 10000, 'score': 20976.412611722946, 'total_duration': 21848.198214292526, 'accumulated_submission_time': 20976.412611722946, 'accumulated_eval_time': 869.4459004402161, 'accumulated_logging_time': 1.4001359939575195, 'global_step': 62203, 'preemption_count': 0}), (63721, {'train/accuracy': 0.6920240521430969, 'train/loss': 1.4649821519851685, 'validation/accuracy': 0.6265400052070618, 'validation/loss': 1.7699750661849976, 'validation/num_examples': 50000, 'test/accuracy': 0.5057000517845154, 'test/loss': 2.418595314025879, 'test/num_examples': 10000, 'score': 21486.580602407455, 'total_duration': 22379.547651052475, 'accumulated_submission_time': 21486.580602407455, 'accumulated_eval_time': 890.5707561969757, 'accumulated_logging_time': 1.433117389678955, 'global_step': 63721, 'preemption_count': 0}), (65239, {'train/accuracy': 0.6842514276504517, 'train/loss': 1.510816216468811, 'validation/accuracy': 0.6255399584770203, 'validation/loss': 1.7839473485946655, 'validation/num_examples': 50000, 'test/accuracy': 0.4951000213623047, 'test/loss': 2.4557323455810547, 'test/num_examples': 10000, 'score': 21996.856678962708, 'total_duration': 22911.09648013115, 'accumulated_submission_time': 21996.856678962708, 'accumulated_eval_time': 911.7887194156647, 'accumulated_logging_time': 1.464731216430664, 'global_step': 65239, 'preemption_count': 0}), (66757, {'train/accuracy': 0.6906688213348389, 'train/loss': 1.5202724933624268, 'validation/accuracy': 0.6343199610710144, 'validation/loss': 1.7723238468170166, 'validation/num_examples': 50000, 'test/accuracy': 0.5078000426292419, 'test/loss': 2.4403321743011475, 'test/num_examples': 10000, 'score': 22506.95446920395, 'total_duration': 23442.88888812065, 'accumulated_submission_time': 22506.95446920395, 'accumulated_eval_time': 933.426650762558, 'accumulated_logging_time': 1.4981484413146973, 'global_step': 66757, 'preemption_count': 0}), (68275, {'train/accuracy': 
0.6909080147743225, 'train/loss': 1.470110535621643, 'validation/accuracy': 0.6406999826431274, 'validation/loss': 1.7015717029571533, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3714752197265625, 'test/num_examples': 10000, 'score': 23017.093948602676, 'total_duration': 23974.409868717194, 'accumulated_submission_time': 23017.093948602676, 'accumulated_eval_time': 954.7498555183411, 'accumulated_logging_time': 1.5329856872558594, 'global_step': 68275, 'preemption_count': 0}), (69792, {'train/accuracy': 0.6947743892669678, 'train/loss': 1.4690238237380981, 'validation/accuracy': 0.6391599774360657, 'validation/loss': 1.700404167175293, 'validation/num_examples': 50000, 'test/accuracy': 0.5162000060081482, 'test/loss': 2.340641736984253, 'test/num_examples': 10000, 'score': 23527.16760492325, 'total_duration': 24505.717046260834, 'accumulated_submission_time': 23527.16760492325, 'accumulated_eval_time': 975.9260275363922, 'accumulated_logging_time': 1.5669758319854736, 'global_step': 69792, 'preemption_count': 0}), (71310, {'train/accuracy': 0.6975247263908386, 'train/loss': 1.482591152191162, 'validation/accuracy': 0.6232199668884277, 'validation/loss': 1.8254543542861938, 'validation/num_examples': 50000, 'test/accuracy': 0.4918000102043152, 'test/loss': 2.5100009441375732, 'test/num_examples': 10000, 'score': 24037.270278930664, 'total_duration': 25036.85267686844, 'accumulated_submission_time': 24037.270278930664, 'accumulated_eval_time': 996.903163433075, 'accumulated_logging_time': 1.599836826324463, 'global_step': 71310, 'preemption_count': 0}), (72829, {'train/accuracy': 0.7215601205825806, 'train/loss': 1.327860713005066, 'validation/accuracy': 0.6525799632072449, 'validation/loss': 1.6442086696624756, 'validation/num_examples': 50000, 'test/accuracy': 0.5234000086784363, 'test/loss': 2.286708116531372, 'test/num_examples': 10000, 'score': 24547.331042289734, 'total_duration': 25568.003078222275, 'accumulated_submission_time': 24547.331042289734, 'accumulated_eval_time': 1017.9354872703552, 'accumulated_logging_time': 1.6340680122375488, 'global_step': 72829, 'preemption_count': 0}), (74347, {'train/accuracy': 0.6916055083274841, 'train/loss': 1.4878246784210205, 'validation/accuracy': 0.6294199824333191, 'validation/loss': 1.7609314918518066, 'validation/num_examples': 50000, 'test/accuracy': 0.5049999952316284, 'test/loss': 2.4299111366271973, 'test/num_examples': 10000, 'score': 25057.428204774857, 'total_duration': 26099.292588233948, 'accumulated_submission_time': 25057.428204774857, 'accumulated_eval_time': 1039.0636780261993, 'accumulated_logging_time': 1.675804615020752, 'global_step': 74347, 'preemption_count': 0}), (75865, {'train/accuracy': 0.7079480290412903, 'train/loss': 1.436248540878296, 'validation/accuracy': 0.6432799696922302, 'validation/loss': 1.7123216390609741, 'validation/num_examples': 50000, 'test/accuracy': 0.5208000540733337, 'test/loss': 2.3495261669158936, 'test/num_examples': 10000, 'score': 25567.620626449585, 'total_duration': 26630.89289879799, 'accumulated_submission_time': 25567.620626449585, 'accumulated_eval_time': 1060.4125316143036, 'accumulated_logging_time': 1.7115974426269531, 'global_step': 75865, 'preemption_count': 0}), (77382, {'train/accuracy': 0.6989995241165161, 'train/loss': 1.422307014465332, 'validation/accuracy': 0.642579972743988, 'validation/loss': 1.6762522459030151, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3255693912506104, 
'test/num_examples': 10000, 'score': 26077.580560445786, 'total_duration': 27162.072484493256, 'accumulated_submission_time': 26077.580560445786, 'accumulated_eval_time': 1081.571738243103, 'accumulated_logging_time': 1.748826503753662, 'global_step': 77382, 'preemption_count': 0}), (78900, {'train/accuracy': 0.7405332922935486, 'train/loss': 1.2673977613449097, 'validation/accuracy': 0.6460199952125549, 'validation/loss': 1.6637077331542969, 'validation/num_examples': 50000, 'test/accuracy': 0.5200999975204468, 'test/loss': 2.316652536392212, 'test/num_examples': 10000, 'score': 26587.66230893135, 'total_duration': 27693.276193618774, 'accumulated_submission_time': 26587.66230893135, 'accumulated_eval_time': 1102.640297651291, 'accumulated_logging_time': 1.7789013385772705, 'global_step': 78900, 'preemption_count': 0}), (80417, {'train/accuracy': 0.7322424650192261, 'train/loss': 1.2937663793563843, 'validation/accuracy': 0.6529799699783325, 'validation/loss': 1.6502137184143066, 'validation/num_examples': 50000, 'test/accuracy': 0.527999997138977, 'test/loss': 2.2828421592712402, 'test/num_examples': 10000, 'score': 27097.790013074875, 'total_duration': 28224.572131872177, 'accumulated_submission_time': 27097.790013074875, 'accumulated_eval_time': 1123.7507123947144, 'accumulated_logging_time': 1.813563585281372, 'global_step': 80417, 'preemption_count': 0}), (81935, {'train/accuracy': 0.7216796875, 'train/loss': 1.3296536207199097, 'validation/accuracy': 0.6556800007820129, 'validation/loss': 1.635520577430725, 'validation/num_examples': 50000, 'test/accuracy': 0.5260000228881836, 'test/loss': 2.306215286254883, 'test/num_examples': 10000, 'score': 27607.79568052292, 'total_duration': 28755.640197753906, 'accumulated_submission_time': 27607.79568052292, 'accumulated_eval_time': 1144.751916885376, 'accumulated_logging_time': 1.8511121273040771, 'global_step': 81935, 'preemption_count': 0}), (83452, {'train/accuracy': 0.7293726205825806, 'train/loss': 1.2778840065002441, 'validation/accuracy': 0.664139986038208, 'validation/loss': 1.5573835372924805, 'validation/num_examples': 50000, 'test/accuracy': 0.534500002861023, 'test/loss': 2.211136817932129, 'test/num_examples': 10000, 'score': 28117.902702093124, 'total_duration': 29286.823399305344, 'accumulated_submission_time': 28117.902702093124, 'accumulated_eval_time': 1165.7663543224335, 'accumulated_logging_time': 1.8896336555480957, 'global_step': 83452, 'preemption_count': 0}), (84969, {'train/accuracy': 0.7102997303009033, 'train/loss': 1.3722602128982544, 'validation/accuracy': 0.6526600122451782, 'validation/loss': 1.63387930393219, 'validation/num_examples': 50000, 'test/accuracy': 0.5289000272750854, 'test/loss': 2.2834620475769043, 'test/num_examples': 10000, 'score': 28627.961901664734, 'total_duration': 29817.84640312195, 'accumulated_submission_time': 28627.961901664734, 'accumulated_eval_time': 1186.6677963733673, 'accumulated_logging_time': 1.9287693500518799, 'global_step': 84969, 'preemption_count': 0}), (86487, {'train/accuracy': 0.7192083597183228, 'train/loss': 1.318070411682129, 'validation/accuracy': 0.6582799553871155, 'validation/loss': 1.5826328992843628, 'validation/num_examples': 50000, 'test/accuracy': 0.5327000021934509, 'test/loss': 2.2320618629455566, 'test/num_examples': 10000, 'score': 29138.005053281784, 'total_duration': 30349.136724233627, 'accumulated_submission_time': 29138.005053281784, 'accumulated_eval_time': 1207.8558654785156, 'accumulated_logging_time': 1.964245080947876, 'global_step': 86487, 
'preemption_count': 0}), (88005, {'train/accuracy': 0.7746930718421936, 'train/loss': 1.0886952877044678, 'validation/accuracy': 0.6663599610328674, 'validation/loss': 1.5537863969802856, 'validation/num_examples': 50000, 'test/accuracy': 0.5420000553131104, 'test/loss': 2.211760997772217, 'test/num_examples': 10000, 'score': 29648.06053853035, 'total_duration': 30880.23536133766, 'accumulated_submission_time': 29648.06053853035, 'accumulated_eval_time': 1228.8410770893097, 'accumulated_logging_time': 1.9997587203979492, 'global_step': 88005, 'preemption_count': 0}), (89524, {'train/accuracy': 0.7538663744926453, 'train/loss': 1.2191698551177979, 'validation/accuracy': 0.6728999614715576, 'validation/loss': 1.5614254474639893, 'validation/num_examples': 50000, 'test/accuracy': 0.5437000393867493, 'test/loss': 2.203444719314575, 'test/num_examples': 10000, 'score': 30158.248270750046, 'total_duration': 31411.583546876907, 'accumulated_submission_time': 30158.248270750046, 'accumulated_eval_time': 1249.946064710617, 'accumulated_logging_time': 2.0324220657348633, 'global_step': 89524, 'preemption_count': 0}), (91043, {'train/accuracy': 0.7286351919174194, 'train/loss': 1.2893345355987549, 'validation/accuracy': 0.6548799872398376, 'validation/loss': 1.6241097450256348, 'validation/num_examples': 50000, 'test/accuracy': 0.5337000489234924, 'test/loss': 2.274324893951416, 'test/num_examples': 10000, 'score': 30668.21325492859, 'total_duration': 31942.735835552216, 'accumulated_submission_time': 30668.21325492859, 'accumulated_eval_time': 1271.077484369278, 'accumulated_logging_time': 2.065810203552246, 'global_step': 91043, 'preemption_count': 0}), (92561, {'train/accuracy': 0.7453164458274841, 'train/loss': 1.2298004627227783, 'validation/accuracy': 0.6710399985313416, 'validation/loss': 1.5516157150268555, 'validation/num_examples': 50000, 'test/accuracy': 0.5501000285148621, 'test/loss': 2.1891353130340576, 'test/num_examples': 10000, 'score': 31178.181513547897, 'total_duration': 32473.721732854843, 'accumulated_submission_time': 31178.181513547897, 'accumulated_eval_time': 1292.0364754199982, 'accumulated_logging_time': 2.102283000946045, 'global_step': 92561, 'preemption_count': 0}), (94079, {'train/accuracy': 0.7446189522743225, 'train/loss': 1.263240098953247, 'validation/accuracy': 0.677619993686676, 'validation/loss': 1.5534390211105347, 'validation/num_examples': 50000, 'test/accuracy': 0.5550000071525574, 'test/loss': 2.191549062728882, 'test/num_examples': 10000, 'score': 31688.31489801407, 'total_duration': 33004.9747774601, 'accumulated_submission_time': 31688.31489801407, 'accumulated_eval_time': 1313.0971965789795, 'accumulated_logging_time': 2.1376092433929443, 'global_step': 94079, 'preemption_count': 0}), (95597, {'train/accuracy': 0.7449776530265808, 'train/loss': 1.2335659265518188, 'validation/accuracy': 0.6818400025367737, 'validation/loss': 1.5183688402175903, 'validation/num_examples': 50000, 'test/accuracy': 0.5538000464439392, 'test/loss': 2.1452736854553223, 'test/num_examples': 10000, 'score': 32198.41109275818, 'total_duration': 33536.11962342262, 'accumulated_submission_time': 32198.41109275818, 'accumulated_eval_time': 1334.085800409317, 'accumulated_logging_time': 2.1747827529907227, 'global_step': 95597, 'preemption_count': 0}), (97115, {'train/accuracy': 0.7691525816917419, 'train/loss': 1.1535043716430664, 'validation/accuracy': 0.667199969291687, 'validation/loss': 1.590874195098877, 'validation/num_examples': 50000, 'test/accuracy': 0.5401000380516052, 
'test/loss': 2.221604347229004, 'test/num_examples': 10000, 'score': 32708.41788005829, 'total_duration': 34067.122673511505, 'accumulated_submission_time': 32708.41788005829, 'accumulated_eval_time': 1355.0201969146729, 'accumulated_logging_time': 2.2134897708892822, 'global_step': 97115, 'preemption_count': 0}), (98632, {'train/accuracy': 0.7757692933082581, 'train/loss': 1.1324656009674072, 'validation/accuracy': 0.690559983253479, 'validation/loss': 1.5002992153167725, 'validation/num_examples': 50000, 'test/accuracy': 0.5630000233650208, 'test/loss': 2.134718656539917, 'test/num_examples': 10000, 'score': 33218.47289562225, 'total_duration': 34598.53514504433, 'accumulated_submission_time': 33218.47289562225, 'accumulated_eval_time': 1376.3223690986633, 'accumulated_logging_time': 2.246415138244629, 'global_step': 98632, 'preemption_count': 0}), (100150, {'train/accuracy': 0.7731584906578064, 'train/loss': 1.1149073839187622, 'validation/accuracy': 0.6941999793052673, 'validation/loss': 1.4566799402236938, 'validation/num_examples': 50000, 'test/accuracy': 0.5652000308036804, 'test/loss': 2.10357928276062, 'test/num_examples': 10000, 'score': 33728.69375014305, 'total_duration': 35129.76139855385, 'accumulated_submission_time': 33728.69375014305, 'accumulated_eval_time': 1397.2703416347504, 'accumulated_logging_time': 2.2805235385894775, 'global_step': 100150, 'preemption_count': 0}), (101668, {'train/accuracy': 0.7729790806770325, 'train/loss': 1.1216973066329956, 'validation/accuracy': 0.6942600011825562, 'validation/loss': 1.4576067924499512, 'validation/num_examples': 50000, 'test/accuracy': 0.5682000517845154, 'test/loss': 2.1024677753448486, 'test/num_examples': 10000, 'score': 34238.80842423439, 'total_duration': 35660.899446964264, 'accumulated_submission_time': 34238.80842423439, 'accumulated_eval_time': 1418.23712515831, 'accumulated_logging_time': 2.314119815826416, 'global_step': 101668, 'preemption_count': 0}), (103187, {'train/accuracy': 0.7712850570678711, 'train/loss': 1.1430009603500366, 'validation/accuracy': 0.6977199912071228, 'validation/loss': 1.4706509113311768, 'validation/num_examples': 50000, 'test/accuracy': 0.5703999996185303, 'test/loss': 2.109886646270752, 'test/num_examples': 10000, 'score': 34749.06494665146, 'total_duration': 36192.19437289238, 'accumulated_submission_time': 34749.06494665146, 'accumulated_eval_time': 1439.2192113399506, 'accumulated_logging_time': 2.3478918075561523, 'global_step': 103187, 'preemption_count': 0}), (104705, {'train/accuracy': 0.7724210619926453, 'train/loss': 1.1543452739715576, 'validation/accuracy': 0.6925199627876282, 'validation/loss': 1.485355019569397, 'validation/num_examples': 50000, 'test/accuracy': 0.5617000460624695, 'test/loss': 2.1401867866516113, 'test/num_examples': 10000, 'score': 35259.32285261154, 'total_duration': 36723.48499917984, 'accumulated_submission_time': 35259.32285261154, 'accumulated_eval_time': 1460.195018529892, 'accumulated_logging_time': 2.381948232650757, 'global_step': 104705, 'preemption_count': 0}), (106224, {'train/accuracy': 0.8110052347183228, 'train/loss': 0.9716266393661499, 'validation/accuracy': 0.7090199589729309, 'validation/loss': 1.3921597003936768, 'validation/num_examples': 50000, 'test/accuracy': 0.5842000246047974, 'test/loss': 2.0155460834503174, 'test/num_examples': 10000, 'score': 35769.33108854294, 'total_duration': 37254.577083826065, 'accumulated_submission_time': 35769.33108854294, 'accumulated_eval_time': 1481.2203319072723, 'accumulated_logging_time': 
2.4178240299224854, 'global_step': 106224, 'preemption_count': 0}), (107743, {'train/accuracy': 0.7952407598495483, 'train/loss': 1.0415843725204468, 'validation/accuracy': 0.7023400068283081, 'validation/loss': 1.4332274198532104, 'validation/num_examples': 50000, 'test/accuracy': 0.5763000249862671, 'test/loss': 2.0579850673675537, 'test/num_examples': 10000, 'score': 36279.43704533577, 'total_duration': 37785.853261470795, 'accumulated_submission_time': 36279.43704533577, 'accumulated_eval_time': 1502.319188117981, 'accumulated_logging_time': 2.4656083583831787, 'global_step': 107743, 'preemption_count': 0}), (109260, {'train/accuracy': 0.80961012840271, 'train/loss': 0.9733060002326965, 'validation/accuracy': 0.7177000045776367, 'validation/loss': 1.3576146364212036, 'validation/num_examples': 50000, 'test/accuracy': 0.5911000370979309, 'test/loss': 1.9605129957199097, 'test/num_examples': 10000, 'score': 36789.401881456375, 'total_duration': 38316.75459957123, 'accumulated_submission_time': 36789.401881456375, 'accumulated_eval_time': 1523.1936659812927, 'accumulated_logging_time': 2.504333019256592, 'global_step': 109260, 'preemption_count': 0}), (110778, {'train/accuracy': 0.7960578799247742, 'train/loss': 1.0393996238708496, 'validation/accuracy': 0.7094399929046631, 'validation/loss': 1.413783073425293, 'validation/num_examples': 50000, 'test/accuracy': 0.5824000239372253, 'test/loss': 2.058758020401001, 'test/num_examples': 10000, 'score': 37299.51941990852, 'total_duration': 38847.8635661602, 'accumulated_submission_time': 37299.51941990852, 'accumulated_eval_time': 1544.1275751590729, 'accumulated_logging_time': 2.539213180541992, 'global_step': 110778, 'preemption_count': 0}), (112297, {'train/accuracy': 0.8116230964660645, 'train/loss': 0.9464977383613586, 'validation/accuracy': 0.721019983291626, 'validation/loss': 1.3188493251800537, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9455469846725464, 'test/num_examples': 10000, 'score': 37809.78449511528, 'total_duration': 39379.07870936394, 'accumulated_submission_time': 37809.78449511528, 'accumulated_eval_time': 1565.0159449577332, 'accumulated_logging_time': 2.5778493881225586, 'global_step': 112297, 'preemption_count': 0}), (113815, {'train/accuracy': 0.8452048897743225, 'train/loss': 0.8711603879928589, 'validation/accuracy': 0.7268399596214294, 'validation/loss': 1.3569468259811401, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9684876203536987, 'test/num_examples': 10000, 'score': 38319.7420566082, 'total_duration': 39910.09659743309, 'accumulated_submission_time': 38319.7420566082, 'accumulated_eval_time': 1586.0181345939636, 'accumulated_logging_time': 2.6128127574920654, 'global_step': 113815, 'preemption_count': 0}), (115334, {'train/accuracy': 0.8356186151504517, 'train/loss': 0.846179187297821, 'validation/accuracy': 0.7235400080680847, 'validation/loss': 1.3148432970046997, 'validation/num_examples': 50000, 'test/accuracy': 0.5975000262260437, 'test/loss': 1.953201413154602, 'test/num_examples': 10000, 'score': 38829.9924018383, 'total_duration': 40441.21931767464, 'accumulated_submission_time': 38829.9924018383, 'accumulated_eval_time': 1606.8307423591614, 'accumulated_logging_time': 2.6497488021850586, 'global_step': 115334, 'preemption_count': 0}), (116853, {'train/accuracy': 0.8401426672935486, 'train/loss': 0.8460429906845093, 'validation/accuracy': 0.7312600016593933, 'validation/loss': 1.2911760807037354, 
'validation/num_examples': 50000, 'test/accuracy': 0.6062000393867493, 'test/loss': 1.8956819772720337, 'test/num_examples': 10000, 'score': 39340.216069698334, 'total_duration': 40972.4295835495, 'accumulated_submission_time': 39340.216069698334, 'accumulated_eval_time': 1627.757281780243, 'accumulated_logging_time': 2.6862545013427734, 'global_step': 116853, 'preemption_count': 0}), (118372, {'train/accuracy': 0.8466597199440002, 'train/loss': 0.8141149878501892, 'validation/accuracy': 0.7381399869918823, 'validation/loss': 1.2623021602630615, 'validation/num_examples': 50000, 'test/accuracy': 0.6214000582695007, 'test/loss': 1.8717379570007324, 'test/num_examples': 10000, 'score': 39850.24843668938, 'total_duration': 41503.52567815781, 'accumulated_submission_time': 39850.24843668938, 'accumulated_eval_time': 1648.753799200058, 'accumulated_logging_time': 2.7309017181396484, 'global_step': 118372, 'preemption_count': 0}), (119889, {'train/accuracy': 0.8564253449440002, 'train/loss': 0.7718811631202698, 'validation/accuracy': 0.7484999895095825, 'validation/loss': 1.2095959186553955, 'validation/num_examples': 50000, 'test/accuracy': 0.6221000552177429, 'test/loss': 1.825244426727295, 'test/num_examples': 10000, 'score': 40360.281074762344, 'total_duration': 42034.37654709816, 'accumulated_submission_time': 40360.281074762344, 'accumulated_eval_time': 1669.5109317302704, 'accumulated_logging_time': 2.7693583965301514, 'global_step': 119889, 'preemption_count': 0}), (121408, {'train/accuracy': 0.8634805083274841, 'train/loss': 0.7633724212646484, 'validation/accuracy': 0.750220000743866, 'validation/loss': 1.2169595956802368, 'validation/num_examples': 50000, 'test/accuracy': 0.626800000667572, 'test/loss': 1.814558506011963, 'test/num_examples': 10000, 'score': 40870.47562837601, 'total_duration': 42565.58989524841, 'accumulated_submission_time': 40870.47562837601, 'accumulated_eval_time': 1690.4636988639832, 'accumulated_logging_time': 2.8127453327178955, 'global_step': 121408, 'preemption_count': 0}), (122926, {'train/accuracy': 0.896882951259613, 'train/loss': 0.6356571316719055, 'validation/accuracy': 0.7584199905395508, 'validation/loss': 1.1778696775436401, 'validation/num_examples': 50000, 'test/accuracy': 0.6371000409126282, 'test/loss': 1.7690379619598389, 'test/num_examples': 10000, 'score': 41380.697590112686, 'total_duration': 43096.77365708351, 'accumulated_submission_time': 41380.697590112686, 'accumulated_eval_time': 1711.363877773285, 'accumulated_logging_time': 2.851658821105957, 'global_step': 122926, 'preemption_count': 0}), (124445, {'train/accuracy': 0.8963049650192261, 'train/loss': 0.6321005821228027, 'validation/accuracy': 0.7647599577903748, 'validation/loss': 1.1482919454574585, 'validation/num_examples': 50000, 'test/accuracy': 0.6454000473022461, 'test/loss': 1.7265831232070923, 'test/num_examples': 10000, 'score': 41890.86552166939, 'total_duration': 43628.127166986465, 'accumulated_submission_time': 41890.86552166939, 'accumulated_eval_time': 1732.4812920093536, 'accumulated_logging_time': 2.8973982334136963, 'global_step': 124445, 'preemption_count': 0}), (125965, {'train/accuracy': 0.9003308415412903, 'train/loss': 0.6020267009735107, 'validation/accuracy': 0.7695800065994263, 'validation/loss': 1.123653769493103, 'validation/num_examples': 50000, 'test/accuracy': 0.6490000486373901, 'test/loss': 1.7115205526351929, 'test/num_examples': 10000, 'score': 42401.11342215538, 'total_duration': 44159.210112810135, 'accumulated_submission_time': 
42401.11342215538, 'accumulated_eval_time': 1753.2567028999329, 'accumulated_logging_time': 2.934680700302124, 'global_step': 125965, 'preemption_count': 0}), (127483, {'train/accuracy': 0.9035195708274841, 'train/loss': 0.5998629331588745, 'validation/accuracy': 0.7721199989318848, 'validation/loss': 1.1213932037353516, 'validation/num_examples': 50000, 'test/accuracy': 0.6554000377655029, 'test/loss': 1.6994982957839966, 'test/num_examples': 10000, 'score': 42911.07231712341, 'total_duration': 44690.04618239403, 'accumulated_submission_time': 42911.07231712341, 'accumulated_eval_time': 1774.066725730896, 'accumulated_logging_time': 2.978921890258789, 'global_step': 127483, 'preemption_count': 0}), (129001, {'train/accuracy': 0.9015664458274841, 'train/loss': 0.6057281494140625, 'validation/accuracy': 0.7714999914169312, 'validation/loss': 1.1205312013626099, 'validation/num_examples': 50000, 'test/accuracy': 0.6547000408172607, 'test/loss': 1.7005892992019653, 'test/num_examples': 10000, 'score': 43421.583512067795, 'total_duration': 45221.447182655334, 'accumulated_submission_time': 43421.583512067795, 'accumulated_eval_time': 1794.8810048103333, 'accumulated_logging_time': 3.0320894718170166, 'global_step': 129001, 'preemption_count': 0}), (130519, {'train/accuracy': 0.9035793542861938, 'train/loss': 0.5980676412582397, 'validation/accuracy': 0.7721799612045288, 'validation/loss': 1.119748592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.6549000144004822, 'test/loss': 1.6995123624801636, 'test/num_examples': 10000, 'score': 43931.55882525444, 'total_duration': 45752.34780359268, 'accumulated_submission_time': 43931.55882525444, 'accumulated_eval_time': 1815.7315328121185, 'accumulated_logging_time': 3.084369659423828, 'global_step': 130519, 'preemption_count': 0}), (132036, {'train/accuracy': 0.9046555757522583, 'train/loss': 0.5945121645927429, 'validation/accuracy': 0.7727400064468384, 'validation/loss': 1.1170986890792847, 'validation/num_examples': 50000, 'test/accuracy': 0.6539000272750854, 'test/loss': 1.6980239152908325, 'test/num_examples': 10000, 'score': 44441.67366838455, 'total_duration': 46283.51500082016, 'accumulated_submission_time': 44441.67366838455, 'accumulated_eval_time': 1836.7223308086395, 'accumulated_logging_time': 3.1229963302612305, 'global_step': 132036, 'preemption_count': 0}), (133554, {'train/accuracy': 0.9058513641357422, 'train/loss': 0.5935448408126831, 'validation/accuracy': 0.772599995136261, 'validation/loss': 1.117397427558899, 'validation/num_examples': 50000, 'test/accuracy': 0.6536000370979309, 'test/loss': 1.6978504657745361, 'test/num_examples': 10000, 'score': 44951.80677986145, 'total_duration': 46814.53054857254, 'accumulated_submission_time': 44951.80677986145, 'accumulated_eval_time': 1857.5457971096039, 'accumulated_logging_time': 3.1587300300598145, 'global_step': 133554, 'preemption_count': 0}), (135072, {'train/accuracy': 0.9068678021430969, 'train/loss': 0.5902974605560303, 'validation/accuracy': 0.7732200026512146, 'validation/loss': 1.1165798902511597, 'validation/num_examples': 50000, 'test/accuracy': 0.6538000106811523, 'test/loss': 1.698063611984253, 'test/num_examples': 10000, 'score': 45461.94407606125, 'total_duration': 47345.66832566261, 'accumulated_submission_time': 45461.94407606125, 'accumulated_eval_time': 1878.4849972724915, 'accumulated_logging_time': 3.197124719619751, 'global_step': 135072, 'preemption_count': 0}), (136590, {'train/accuracy': 0.9063496589660645, 'train/loss': 0.5868960022926331, 
'validation/accuracy': 0.7733599543571472, 'validation/loss': 1.11497962474823, 'validation/num_examples': 50000, 'test/accuracy': 0.6541000604629517, 'test/loss': 1.6960680484771729, 'test/num_examples': 10000, 'score': 45971.88573241234, 'total_duration': 47876.49062347412, 'accumulated_submission_time': 45971.88573241234, 'accumulated_eval_time': 1899.3071541786194, 'accumulated_logging_time': 3.2327637672424316, 'global_step': 136590, 'preemption_count': 0}), (138108, {'train/accuracy': 0.907645046710968, 'train/loss': 0.5866842865943909, 'validation/accuracy': 0.7729199528694153, 'validation/loss': 1.1145741939544678, 'validation/num_examples': 50000, 'test/accuracy': 0.6550000309944153, 'test/loss': 1.6953842639923096, 'test/num_examples': 10000, 'score': 46481.956923007965, 'total_duration': 48407.56366467476, 'accumulated_submission_time': 46481.956923007965, 'accumulated_eval_time': 1920.2494142055511, 'accumulated_logging_time': 3.2690396308898926, 'global_step': 138108, 'preemption_count': 0}), (139627, {'train/accuracy': 0.9070870280265808, 'train/loss': 0.5904383659362793, 'validation/accuracy': 0.7734000086784363, 'validation/loss': 1.1148360967636108, 'validation/num_examples': 50000, 'test/accuracy': 0.6565000414848328, 'test/loss': 1.696603775024414, 'test/num_examples': 10000, 'score': 46991.960359573364, 'total_duration': 48938.4100048542, 'accumulated_submission_time': 46991.960359573364, 'accumulated_eval_time': 1941.032898902893, 'accumulated_logging_time': 3.3056366443634033, 'global_step': 139627, 'preemption_count': 0}), (140000, {'train/accuracy': 0.9084422588348389, 'train/loss': 0.578050971031189, 'validation/accuracy': 0.7722199559211731, 'validation/loss': 1.1122545003890991, 'validation/num_examples': 50000, 'test/accuracy': 0.655500054359436, 'test/loss': 1.6934614181518555, 'test/num_examples': 10000, 'score': 47117.04019618034, 'total_duration': 49084.18835735321, 'accumulated_submission_time': 47117.04019618034, 'accumulated_eval_time': 1961.6837706565857, 'accumulated_logging_time': 3.3469996452331543, 'global_step': 140000, 'preemption_count': 0})], 'global_step': 140000} | |
I0914 20:52:45.258277 139785753851712 submission_runner.py:543] Timing: 47117.04019618034
I0914 20:52:45.258334 139785753851712 submission_runner.py:545] Total number of evals: 94
I0914 20:52:45.258379 139785753851712 submission_runner.py:546] ====================
I0914 20:52:45.258617 139785753851712 submission_runner.py:614] Final imagenet_resnet score: 47117.04019618034
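
Note: the results dump above is a list of (global_step, metrics_dict) tuples, and in every entry 'score' equals 'accumulated_submission_time'. The run ends at global step 140000 with validation/accuracy 0.7722 and a final score of 47117.04 s (roughly 13.1 hours of submission time) across 94 evals. As a minimal sketch (not part of the run above), the per-eval metrics can be recovered from a saved copy of this log with ast.literal_eval. The filename below is hypothetical, and the pattern assumes the metrics dicts contain only flat key/value pairs with no nested braces, as in the dump above.

# Minimal sketch: parse (global_step, metrics_dict) tuples back out of the log text.
import ast
import re

# Each eval entry is printed as "(step, {...})"; the metrics dicts hold only
# flat key/value pairs, so "[^{}]*" is enough to capture one dict.
TUPLE_RE = re.compile(r"\(\d+, \{[^{}]*\}\)")

def parse_eval_results(log_text):
    """Return the list of (global_step, metrics_dict) tuples found in the log."""
    return [ast.literal_eval(match) for match in TUPLE_RE.findall(log_text)]

if __name__ == '__main__':
    # Hypothetical path; point this at the tee'd log file from the run.
    with open('imagenet_resnet_momentum.log') as f:
        results = parse_eval_results(f.read())
    # Print the last few evals, e.g. validation accuracy and the running score.
    for step, metrics in results[-3:]:
        print(f"step {step:>6d}  validation/accuracy {metrics['validation/accuracy']:.4f}  "
              f"score {metrics['score']:.1f} s")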