@priyakasimbeg
Created March 29, 2024 20:54
resnet momentum 9-2023
python3 submission_runner.py --framework=jax --workload=imagenet_resnet --submission_path=reference_algorithms/target_setting_algorithms/jax_momentum.py --tuning_search_space=reference_algorithms/target_setting_algorithms/imagenet_resnet/tuning_search_space.json --data_dir=/data/imagenet/jax --num_tuning_trials=1 --experiment_dir=/experiment_runs --experiment_name=targets_check_jax/momentum_run_0 --overwrite=true --save_checkpoints=false --max_global_steps=140000 --imagenet_v2_data_dir=/data/imagenet/jax 2>&1 | tee -a /logs/imagenet_resnet_jax_09-14-2023-07-13-53.log
2023-09-14 07:13:58.404017: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning:
TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP).
For more information see: https://github.com/tensorflow/addons/issues/2807
warnings.warn(
I0914 07:14:16.944514 139785753851712 logger_utils.py:76] Creating experiment directory at /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax.
I0914 07:14:17.916458 139785753851712 xla_bridge.py:455] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Host Interpreter CUDA
I0914 07:14:17.917233 139785753851712 xla_bridge.py:455] Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
I0914 07:14:17.917379 139785753851712 xla_bridge.py:455] Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
I0914 07:14:17.923749 139785753851712 submission_runner.py:500] Using RNG seed 2760784846
I0914 07:14:23.812150 139785753851712 submission_runner.py:509] --- Tuning run 1/1 ---
I0914 07:14:23.812360 139785753851712 submission_runner.py:514] Creating tuning directory at /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1.
I0914 07:14:23.812535 139785753851712 logger_utils.py:92] Saving hparams to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/hparams.json.
I0914 07:14:23.996968 139785753851712 submission_runner.py:185] Initializing dataset.
I0914 07:14:24.013192 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet2012/5.1.0
I0914 07:14:24.023574 139785753851712 dataset_info.py:669] Fields info.[splits, supervised_keys] from disk and from code do not match. Keeping the one from code.
I0914 07:14:24.404851 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet2012 for split train, from /data/imagenet/jax/imagenet2012/5.1.0
I0914 07:14:25.602084 139785753851712 submission_runner.py:192] Initializing model.
I0914 07:14:36.559219 139785753851712 submission_runner.py:226] Initializing optimizer.
I0914 07:14:38.135653 139785753851712 submission_runner.py:233] Initializing metrics bundle.
I0914 07:14:38.135895 139785753851712 submission_runner.py:251] Initializing checkpoint and logger.
I0914 07:14:38.137257 139785753851712 checkpoints.py:915] Found no checkpoint files in /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1 with prefix checkpoint_
I0914 07:14:39.022184 139785753851712 submission_runner.py:272] Saving meta data to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/meta_data_0.json.
I0914 07:14:39.023321 139785753851712 submission_runner.py:275] Saving flags to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/flags_0.json.
I0914 07:14:39.033189 139785753851712 submission_runner.py:285] Starting training loop.
2023-09-14 07:15:37.344488: E external/xla/xla/service/rendezvous.cc:31] This thread has been waiting for 10 seconds and may be stuck:
2023-09-14 07:15:39.805411: E external/xla/xla/service/rendezvous.cc:36] Thread is unstuck! Warning above was a false-positive. Perhaps the timeout is too short.
I0914 07:15:41.341467 139620620691200 logging_writer.py:48] [0] global_step=0, grad_norm=0.5389119982719421, loss=6.927049160003662
I0914 07:15:41.356961 139785753851712 spec.py:320] Evaluating on the training split.
I0914 07:15:42.325510 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet2012/5.1.0
I0914 07:15:42.334606 139785753851712 dataset_info.py:669] Fields info.[splits, supervised_keys] from disk and from code do not match. Keeping the one from code.
I0914 07:15:42.417476 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet2012 for split train, from /data/imagenet/jax/imagenet2012/5.1.0
I0914 07:15:55.259609 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 07:15:56.705544 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet2012/5.1.0
I0914 07:15:56.731734 139785753851712 dataset_info.py:669] Fields info.[splits, supervised_keys] from disk and from code do not match. Keeping the one from code.
I0914 07:15:56.805313 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet2012 for split validation, from /data/imagenet/jax/imagenet2012/5.1.0
I0914 07:16:16.525549 139785753851712 spec.py:348] Evaluating on the test split.
I0914 07:16:17.323033 139785753851712 dataset_info.py:578] Load dataset info from /data/imagenet/jax/imagenet_v2/matched-frequency/3.0.0
I0914 07:16:17.328617 139785753851712 dataset_builder.py:528] Reusing dataset imagenet_v2 (/data/imagenet/jax/imagenet_v2/matched-frequency/3.0.0)
I0914 07:16:17.367002 139785753851712 logging_logger.py:49] Constructing tf.data.Dataset imagenet_v2 for split test, from /data/imagenet/jax/imagenet_v2/matched-frequency/3.0.0
I0914 07:16:28.126622 139785753851712 submission_runner.py:376] Time since start: 109.09s, Step: 1, {'train/accuracy': 0.0009367027669213712, 'train/loss': 6.9118571281433105, 'validation/accuracy': 0.0010400000028312206, 'validation/loss': 6.911978721618652, 'validation/num_examples': 50000, 'test/accuracy': 0.0014000000664964318, 'test/loss': 6.91181755065918, 'test/num_examples': 10000, 'score': 62.32368874549866, 'total_duration': 109.0933792591095, 'accumulated_submission_time': 62.32368874549866, 'accumulated_eval_time': 46.76959991455078, 'accumulated_logging_time': 0}
I0914 07:16:28.146100 139590321039104 logging_writer.py:48] [1] accumulated_eval_time=46.769600, accumulated_logging_time=0, accumulated_submission_time=62.323689, global_step=1, preemption_count=0, score=62.323689, test/accuracy=0.001400, test/loss=6.911818, test/num_examples=10000, total_duration=109.093379, train/accuracy=0.000937, train/loss=6.911857, validation/accuracy=0.001040, validation/loss=6.911979, validation/num_examples=50000
I0914 07:16:28.500410 139590329431808 logging_writer.py:48] [1] global_step=1, grad_norm=0.5436305403709412, loss=6.927357196807861
I0914 07:16:28.843997 139590321039104 logging_writer.py:48] [2] global_step=2, grad_norm=0.5420133471488953, loss=6.9325995445251465
I0914 07:16:29.183627 139590329431808 logging_writer.py:48] [3] global_step=3, grad_norm=0.5460174679756165, loss=6.930103302001953
I0914 07:16:29.517827 139590321039104 logging_writer.py:48] [4] global_step=4, grad_norm=0.5241100788116455, loss=6.91835880279541
I0914 07:16:29.859511 139590329431808 logging_writer.py:48] [5] global_step=5, grad_norm=0.5328718423843384, loss=6.935100078582764
I0914 07:16:30.198420 139590321039104 logging_writer.py:48] [6] global_step=6, grad_norm=0.5385733842849731, loss=6.926104545593262
I0914 07:16:30.534452 139590329431808 logging_writer.py:48] [7] global_step=7, grad_norm=0.5387418866157532, loss=6.923493385314941
I0914 07:16:30.869961 139590321039104 logging_writer.py:48] [8] global_step=8, grad_norm=0.5350285172462463, loss=6.930649280548096
I0914 07:16:31.209498 139590329431808 logging_writer.py:48] [9] global_step=9, grad_norm=0.5604594945907593, loss=6.93665075302124
I0914 07:16:31.547532 139590321039104 logging_writer.py:48] [10] global_step=10, grad_norm=0.5472524762153625, loss=6.926746368408203
I0914 07:16:31.883261 139590329431808 logging_writer.py:48] [11] global_step=11, grad_norm=0.5602211356163025, loss=6.931008338928223
I0914 07:16:32.220176 139590321039104 logging_writer.py:48] [12] global_step=12, grad_norm=0.5302987694740295, loss=6.921492576599121
I0914 07:16:32.559637 139590329431808 logging_writer.py:48] [13] global_step=13, grad_norm=0.5309913754463196, loss=6.919932842254639
I0914 07:16:32.898500 139590321039104 logging_writer.py:48] [14] global_step=14, grad_norm=0.5365314483642578, loss=6.923084259033203
I0914 07:16:33.235160 139590329431808 logging_writer.py:48] [15] global_step=15, grad_norm=0.5396639108657837, loss=6.920051574707031
I0914 07:16:33.573624 139590321039104 logging_writer.py:48] [16] global_step=16, grad_norm=0.5340028405189514, loss=6.925442218780518
I0914 07:16:33.910352 139590329431808 logging_writer.py:48] [17] global_step=17, grad_norm=0.5570560097694397, loss=6.911653518676758
I0914 07:16:34.247454 139590321039104 logging_writer.py:48] [18] global_step=18, grad_norm=0.545469343662262, loss=6.915667533874512
I0914 07:16:34.590326 139590329431808 logging_writer.py:48] [19] global_step=19, grad_norm=0.5328952670097351, loss=6.908725261688232
I0914 07:16:34.927754 139590321039104 logging_writer.py:48] [20] global_step=20, grad_norm=0.5461469888687134, loss=6.909825325012207
I0914 07:16:35.265347 139590329431808 logging_writer.py:48] [21] global_step=21, grad_norm=0.5239453911781311, loss=6.9054274559021
I0914 07:16:35.601920 139590321039104 logging_writer.py:48] [22] global_step=22, grad_norm=0.536472499370575, loss=6.912668228149414
I0914 07:16:35.942337 139590329431808 logging_writer.py:48] [23] global_step=23, grad_norm=0.5243082046508789, loss=6.909153461456299
I0914 07:16:36.279525 139590321039104 logging_writer.py:48] [24] global_step=24, grad_norm=0.5406576991081238, loss=6.90311336517334
I0914 07:16:36.630361 139590329431808 logging_writer.py:48] [25] global_step=25, grad_norm=0.5318547487258911, loss=6.890480995178223
I0914 07:16:36.971984 139590321039104 logging_writer.py:48] [26] global_step=26, grad_norm=0.5387895107269287, loss=6.902065277099609
I0914 07:16:37.316698 139590329431808 logging_writer.py:48] [27] global_step=27, grad_norm=0.5189248323440552, loss=6.895449638366699
I0914 07:16:37.653784 139590321039104 logging_writer.py:48] [28] global_step=28, grad_norm=0.5199077725410461, loss=6.890660285949707
I0914 07:16:38.001337 139590329431808 logging_writer.py:48] [29] global_step=29, grad_norm=0.5134077072143555, loss=6.892641067504883
I0914 07:16:38.346123 139590321039104 logging_writer.py:48] [30] global_step=30, grad_norm=0.526112973690033, loss=6.894507884979248
I0914 07:16:38.683314 139590329431808 logging_writer.py:48] [31] global_step=31, grad_norm=0.5195169448852539, loss=6.889142036437988
I0914 07:16:39.021874 139590321039104 logging_writer.py:48] [32] global_step=32, grad_norm=0.5332788825035095, loss=6.886667251586914
I0914 07:16:39.359480 139590329431808 logging_writer.py:48] [33] global_step=33, grad_norm=0.5372386574745178, loss=6.903325080871582
I0914 07:16:39.696024 139590321039104 logging_writer.py:48] [34] global_step=34, grad_norm=0.5352444648742676, loss=6.8947858810424805
I0914 07:16:40.031192 139590329431808 logging_writer.py:48] [35] global_step=35, grad_norm=0.5384101271629333, loss=6.880593776702881
I0914 07:16:40.370426 139590321039104 logging_writer.py:48] [36] global_step=36, grad_norm=0.5275368690490723, loss=6.8852620124816895
I0914 07:16:40.709239 139590329431808 logging_writer.py:48] [37] global_step=37, grad_norm=0.5337218642234802, loss=6.8845086097717285
I0914 07:16:41.057002 139590321039104 logging_writer.py:48] [38] global_step=38, grad_norm=0.5435961484909058, loss=6.879532814025879
I0914 07:16:41.394190 139590329431808 logging_writer.py:48] [39] global_step=39, grad_norm=0.551781177520752, loss=6.884259223937988
I0914 07:16:41.729692 139590321039104 logging_writer.py:48] [40] global_step=40, grad_norm=0.548706591129303, loss=6.873312950134277
I0914 07:16:42.069289 139590329431808 logging_writer.py:48] [41] global_step=41, grad_norm=0.5305836796760559, loss=6.873071670532227
I0914 07:16:42.407241 139590321039104 logging_writer.py:48] [42] global_step=42, grad_norm=0.5488153100013733, loss=6.874739646911621
I0914 07:16:42.750670 139590329431808 logging_writer.py:48] [43] global_step=43, grad_norm=0.5339630246162415, loss=6.8718953132629395
I0914 07:16:43.091645 139590321039104 logging_writer.py:48] [44] global_step=44, grad_norm=0.5474773049354553, loss=6.872103691101074
I0914 07:16:43.431213 139590329431808 logging_writer.py:48] [45] global_step=45, grad_norm=0.5713949203491211, loss=6.85746431350708
I0914 07:16:43.771043 139590321039104 logging_writer.py:48] [46] global_step=46, grad_norm=0.5614795684814453, loss=6.87053918838501
I0914 07:16:44.118433 139590329431808 logging_writer.py:48] [47] global_step=47, grad_norm=0.5435739755630493, loss=6.856149673461914
I0914 07:16:44.460348 139590321039104 logging_writer.py:48] [48] global_step=48, grad_norm=0.5413722991943359, loss=6.855347633361816
I0914 07:16:44.802573 139590329431808 logging_writer.py:48] [49] global_step=49, grad_norm=0.5439017415046692, loss=6.848852157592773
I0914 07:16:45.140948 139590321039104 logging_writer.py:48] [50] global_step=50, grad_norm=0.557495653629303, loss=6.874087810516357
I0914 07:16:45.481902 139590329431808 logging_writer.py:48] [51] global_step=51, grad_norm=0.5573470592498779, loss=6.860464096069336
I0914 07:16:45.824782 139590321039104 logging_writer.py:48] [52] global_step=52, grad_norm=0.5395156145095825, loss=6.848325729370117
I0914 07:16:46.168753 139590329431808 logging_writer.py:48] [53] global_step=53, grad_norm=0.5514255166053772, loss=6.842846393585205
I0914 07:16:46.508444 139590321039104 logging_writer.py:48] [54] global_step=54, grad_norm=0.5766259431838989, loss=6.854337215423584
I0914 07:16:46.847536 139590329431808 logging_writer.py:48] [55] global_step=55, grad_norm=0.548021137714386, loss=6.82659912109375
I0914 07:16:47.191233 139590321039104 logging_writer.py:48] [56] global_step=56, grad_norm=0.5588611364364624, loss=6.844107151031494
I0914 07:16:47.528039 139590329431808 logging_writer.py:48] [57] global_step=57, grad_norm=0.5480129718780518, loss=6.835235118865967
I0914 07:16:47.864345 139590321039104 logging_writer.py:48] [58] global_step=58, grad_norm=0.5657825469970703, loss=6.8115458488464355
I0914 07:16:48.199600 139590329431808 logging_writer.py:48] [59] global_step=59, grad_norm=0.5682162046432495, loss=6.814191818237305
I0914 07:16:48.537695 139590321039104 logging_writer.py:48] [60] global_step=60, grad_norm=0.5582947731018066, loss=6.820403099060059
I0914 07:16:48.878333 139590329431808 logging_writer.py:48] [61] global_step=61, grad_norm=0.561733067035675, loss=6.821713924407959
I0914 07:16:49.216136 139590321039104 logging_writer.py:48] [62] global_step=62, grad_norm=0.5706186294555664, loss=6.80854606628418
I0914 07:16:49.550317 139590329431808 logging_writer.py:48] [63] global_step=63, grad_norm=0.5717720985412598, loss=6.817758083343506
I0914 07:16:49.898351 139590321039104 logging_writer.py:48] [64] global_step=64, grad_norm=0.5713201761245728, loss=6.8105292320251465
I0914 07:16:50.235677 139590329431808 logging_writer.py:48] [65] global_step=65, grad_norm=0.5701669454574585, loss=6.792567253112793
I0914 07:16:50.571818 139590321039104 logging_writer.py:48] [66] global_step=66, grad_norm=0.5822696089744568, loss=6.782873630523682
I0914 07:16:50.909978 139590329431808 logging_writer.py:48] [67] global_step=67, grad_norm=0.584923505783081, loss=6.796526908874512
I0914 07:16:51.247188 139590321039104 logging_writer.py:48] [68] global_step=68, grad_norm=0.5622556209564209, loss=6.78099250793457
I0914 07:16:51.587746 139590329431808 logging_writer.py:48] [69] global_step=69, grad_norm=0.6046913266181946, loss=6.79979944229126
I0914 07:16:51.922210 139590321039104 logging_writer.py:48] [70] global_step=70, grad_norm=0.587131917476654, loss=6.79705810546875
I0914 07:16:52.263475 139590329431808 logging_writer.py:48] [71] global_step=71, grad_norm=0.5949347615242004, loss=6.78912353515625
I0914 07:16:52.607199 139590321039104 logging_writer.py:48] [72] global_step=72, grad_norm=0.5920963883399963, loss=6.785521507263184
I0914 07:16:52.947867 139590329431808 logging_writer.py:48] [73] global_step=73, grad_norm=0.5777257084846497, loss=6.789895534515381
I0914 07:16:53.290904 139590321039104 logging_writer.py:48] [74] global_step=74, grad_norm=0.5883252024650574, loss=6.7708892822265625
I0914 07:16:53.629848 139590329431808 logging_writer.py:48] [75] global_step=75, grad_norm=0.6013261079788208, loss=6.777152061462402
I0914 07:16:53.973304 139590321039104 logging_writer.py:48] [76] global_step=76, grad_norm=0.5913688540458679, loss=6.762724876403809
I0914 07:16:54.312757 139590329431808 logging_writer.py:48] [77] global_step=77, grad_norm=0.6064963936805725, loss=6.779829025268555
I0914 07:16:54.654629 139590321039104 logging_writer.py:48] [78] global_step=78, grad_norm=0.6053351759910583, loss=6.746735572814941
I0914 07:16:54.994006 139590329431808 logging_writer.py:48] [79] global_step=79, grad_norm=0.5931686758995056, loss=6.754239082336426
I0914 07:16:55.331990 139590321039104 logging_writer.py:48] [80] global_step=80, grad_norm=0.5849683880805969, loss=6.748963356018066
I0914 07:16:55.670944 139590329431808 logging_writer.py:48] [81] global_step=81, grad_norm=0.5978469252586365, loss=6.737304210662842
I0914 07:16:56.019518 139590321039104 logging_writer.py:48] [82] global_step=82, grad_norm=0.5988901853561401, loss=6.765451908111572
I0914 07:16:56.365621 139590329431808 logging_writer.py:48] [83] global_step=83, grad_norm=0.6001420617103577, loss=6.743647575378418
I0914 07:16:56.708692 139590321039104 logging_writer.py:48] [84] global_step=84, grad_norm=0.602545976638794, loss=6.7492852210998535
I0914 07:16:57.049725 139590329431808 logging_writer.py:48] [85] global_step=85, grad_norm=0.6386739611625671, loss=6.7281060218811035
I0914 07:16:57.395249 139590321039104 logging_writer.py:48] [86] global_step=86, grad_norm=0.6063979268074036, loss=6.74575138092041
I0914 07:16:57.735456 139590329431808 logging_writer.py:48] [87] global_step=87, grad_norm=0.6207473874092102, loss=6.7130446434021
I0914 07:16:58.072184 139590321039104 logging_writer.py:48] [88] global_step=88, grad_norm=0.6147616505622864, loss=6.738728046417236
I0914 07:16:58.412707 139590329431808 logging_writer.py:48] [89] global_step=89, grad_norm=0.6013393402099609, loss=6.706392288208008
I0914 07:16:58.749669 139590321039104 logging_writer.py:48] [90] global_step=90, grad_norm=0.6069715619087219, loss=6.70119571685791
I0914 07:16:59.086777 139590329431808 logging_writer.py:48] [91] global_step=91, grad_norm=0.6262779235839844, loss=6.752510070800781
I0914 07:16:59.431769 139590321039104 logging_writer.py:48] [92] global_step=92, grad_norm=0.6138813495635986, loss=6.705592155456543
I0914 07:16:59.780551 139590329431808 logging_writer.py:48] [93] global_step=93, grad_norm=0.6135011315345764, loss=6.708225250244141
I0914 07:17:00.120775 139590321039104 logging_writer.py:48] [94] global_step=94, grad_norm=0.6277234554290771, loss=6.7320170402526855
I0914 07:17:00.455492 139590329431808 logging_writer.py:48] [95] global_step=95, grad_norm=0.6197319030761719, loss=6.67308235168457
I0914 07:17:00.791080 139590321039104 logging_writer.py:48] [96] global_step=96, grad_norm=0.6064175367355347, loss=6.688239097595215
I0914 07:17:01.137901 139590329431808 logging_writer.py:48] [97] global_step=97, grad_norm=0.6338247060775757, loss=6.662160873413086
I0914 07:17:01.489339 139590321039104 logging_writer.py:48] [98] global_step=98, grad_norm=0.6168617606163025, loss=6.703256607055664
I0914 07:17:01.829795 139590329431808 logging_writer.py:48] [99] global_step=99, grad_norm=0.6174088716506958, loss=6.699215888977051
I0914 07:17:02.169407 139590321039104 logging_writer.py:48] [100] global_step=100, grad_norm=0.6163209080696106, loss=6.67160701751709
I0914 07:19:17.025958 139590329431808 logging_writer.py:48] [500] global_step=500, grad_norm=0.5776574611663818, loss=6.170225143432617
I0914 07:22:05.670546 139590321039104 logging_writer.py:48] [1000] global_step=1000, grad_norm=0.4634348154067993, loss=5.531285762786865
I0914 07:24:54.105559 139590329431808 logging_writer.py:48] [1500] global_step=1500, grad_norm=0.4536585807800293, loss=5.192131996154785
I0914 07:24:58.233243 139785753851712 spec.py:320] Evaluating on the training split.
I0914 07:25:05.431389 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 07:25:13.491719 139785753851712 spec.py:348] Evaluating on the test split.
I0914 07:25:15.792362 139785753851712 submission_runner.py:376] Time since start: 636.76s, Step: 1514, {'train/accuracy': 0.18895487487316132, 'train/loss': 4.249844551086426, 'validation/accuracy': 0.1698399931192398, 'validation/loss': 4.387373924255371, 'validation/num_examples': 50000, 'test/accuracy': 0.12890000641345978, 'test/loss': 4.780310153961182, 'test/num_examples': 10000, 'score': 572.3801600933075, 'total_duration': 636.7591044902802, 'accumulated_submission_time': 572.3801600933075, 'accumulated_eval_time': 64.32867097854614, 'accumulated_logging_time': 0.02817511558532715}
I0914 07:25:15.810337 139590530758400 logging_writer.py:48] [1514] accumulated_eval_time=64.328671, accumulated_logging_time=0.028175, accumulated_submission_time=572.380160, global_step=1514, preemption_count=0, score=572.380160, test/accuracy=0.128900, test/loss=4.780310, test/num_examples=10000, total_duration=636.759104, train/accuracy=0.188955, train/loss=4.249845, validation/accuracy=0.169840, validation/loss=4.387374, validation/num_examples=50000
I0914 07:27:59.783074 139590614619904 logging_writer.py:48] [2000] global_step=2000, grad_norm=0.39719879627227783, loss=4.812987327575684
I0914 07:30:48.130979 139590530758400 logging_writer.py:48] [2500] global_step=2500, grad_norm=0.34383267164230347, loss=4.628384590148926
I0914 07:33:36.385151 139590614619904 logging_writer.py:48] [3000] global_step=3000, grad_norm=0.34005293250083923, loss=4.377949237823486
I0914 07:33:45.896412 139785753851712 spec.py:320] Evaluating on the training split.
I0914 07:33:53.137686 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 07:34:01.295050 139785753851712 spec.py:348] Evaluating on the test split.
I0914 07:34:03.611705 139785753851712 submission_runner.py:376] Time since start: 1164.58s, Step: 3030, {'train/accuracy': 0.35439252853393555, 'train/loss': 3.16475772857666, 'validation/accuracy': 0.32311999797821045, 'validation/loss': 3.354123592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.24500000476837158, 'test/loss': 3.896735429763794, 'test/num_examples': 10000, 'score': 1082.4337828159332, 'total_duration': 1164.5784351825714, 'accumulated_submission_time': 1082.4337828159332, 'accumulated_eval_time': 82.04390573501587, 'accumulated_logging_time': 0.05537152290344238}
I0914 07:34:03.630025 139620444509952 logging_writer.py:48] [3030] accumulated_eval_time=82.043906, accumulated_logging_time=0.055372, accumulated_submission_time=1082.433783, global_step=3030, preemption_count=0, score=1082.433783, test/accuracy=0.245000, test/loss=3.896735, test/num_examples=10000, total_duration=1164.578435, train/accuracy=0.354393, train/loss=3.164758, validation/accuracy=0.323120, validation/loss=3.354124, validation/num_examples=50000
I0914 07:36:42.183385 139621082027776 logging_writer.py:48] [3500] global_step=3500, grad_norm=0.3057088851928711, loss=4.292041301727295
I0914 07:39:30.385200 139620444509952 logging_writer.py:48] [4000] global_step=4000, grad_norm=0.29683586955070496, loss=4.165511608123779
I0914 07:42:18.588014 139621082027776 logging_writer.py:48] [4500] global_step=4500, grad_norm=0.2872718274593353, loss=4.058863162994385
I0914 07:42:33.815204 139785753851712 spec.py:320] Evaluating on the training split.
I0914 07:42:41.004353 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 07:42:49.069682 139785753851712 spec.py:348] Evaluating on the test split.
I0914 07:42:51.320148 139785753851712 submission_runner.py:376] Time since start: 1692.29s, Step: 4547, {'train/accuracy': 0.4696468412876129, 'train/loss': 2.565059185028076, 'validation/accuracy': 0.4343400001525879, 'validation/loss': 2.7238423824310303, 'validation/num_examples': 50000, 'test/accuracy': 0.3296000063419342, 'test/loss': 3.3760478496551514, 'test/num_examples': 10000, 'score': 1592.5839030742645, 'total_duration': 1692.2868733406067, 'accumulated_submission_time': 1592.5839030742645, 'accumulated_eval_time': 99.54879140853882, 'accumulated_logging_time': 0.08617043495178223}
I0914 07:42:51.339463 139620444509952 logging_writer.py:48] [4547] accumulated_eval_time=99.548791, accumulated_logging_time=0.086170, accumulated_submission_time=1592.583903, global_step=4547, preemption_count=0, score=1592.583903, test/accuracy=0.329600, test/loss=3.376048, test/num_examples=10000, total_duration=1692.286873, train/accuracy=0.469647, train/loss=2.565059, validation/accuracy=0.434340, validation/loss=2.723842, validation/num_examples=50000
I0914 07:45:24.082047 139620452902656 logging_writer.py:48] [5000] global_step=5000, grad_norm=0.2850759029388428, loss=4.08101749420166
I0914 07:48:12.281002 139620444509952 logging_writer.py:48] [5500] global_step=5500, grad_norm=0.2590577006340027, loss=3.891889810562134
I0914 07:51:00.544494 139620452902656 logging_writer.py:48] [6000] global_step=6000, grad_norm=0.2445574253797531, loss=3.913510799407959
I0914 07:51:21.496845 139785753851712 spec.py:320] Evaluating on the training split.
I0914 07:51:28.630486 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 07:51:36.807730 139785753851712 spec.py:348] Evaluating on the test split.
I0914 07:51:39.056757 139785753851712 submission_runner.py:376] Time since start: 2220.02s, Step: 6064, {'train/accuracy': 0.485072523355484, 'train/loss': 2.4722769260406494, 'validation/accuracy': 0.45179998874664307, 'validation/loss': 2.632256269454956, 'validation/num_examples': 50000, 'test/accuracy': 0.3456000089645386, 'test/loss': 3.28208327293396, 'test/num_examples': 10000, 'score': 2102.707985162735, 'total_duration': 2220.0234982967377, 'accumulated_submission_time': 2102.707985162735, 'accumulated_eval_time': 117.10866022109985, 'accumulated_logging_time': 0.11611032485961914}
I0914 07:51:39.075066 139620436117248 logging_writer.py:48] [6064] accumulated_eval_time=117.108660, accumulated_logging_time=0.116110, accumulated_submission_time=2102.707985, global_step=6064, preemption_count=0, score=2102.707985, test/accuracy=0.345600, test/loss=3.282083, test/num_examples=10000, total_duration=2220.023498, train/accuracy=0.485073, train/loss=2.472277, validation/accuracy=0.451800, validation/loss=2.632256, validation/num_examples=50000
I0914 07:54:06.019992 139620444509952 logging_writer.py:48] [6500] global_step=6500, grad_norm=0.2569459080696106, loss=3.9474620819091797
I0914 07:56:54.227489 139620436117248 logging_writer.py:48] [7000] global_step=7000, grad_norm=0.2399711310863495, loss=3.8695645332336426
I0914 07:59:42.452682 139620444509952 logging_writer.py:48] [7500] global_step=7500, grad_norm=0.23975849151611328, loss=3.802103281021118
I0914 08:00:09.119609 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:00:16.465856 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:00:24.594934 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:00:26.870099 139785753851712 submission_runner.py:376] Time since start: 2747.84s, Step: 7581, {'train/accuracy': 0.5335220098495483, 'train/loss': 2.235621452331543, 'validation/accuracy': 0.5026400089263916, 'validation/loss': 2.388488531112671, 'validation/num_examples': 50000, 'test/accuracy': 0.392300009727478, 'test/loss': 3.0264182090759277, 'test/num_examples': 10000, 'score': 2612.7207324504852, 'total_duration': 2747.8368368148804, 'accumulated_submission_time': 2612.7207324504852, 'accumulated_eval_time': 134.8591091632843, 'accumulated_logging_time': 0.14357876777648926}
I0914 08:00:26.887415 139620469688064 logging_writer.py:48] [7581] accumulated_eval_time=134.859109, accumulated_logging_time=0.143579, accumulated_submission_time=2612.720732, global_step=7581, preemption_count=0, score=2612.720732, test/accuracy=0.392300, test/loss=3.026418, test/num_examples=10000, total_duration=2747.836837, train/accuracy=0.533522, train/loss=2.235621, validation/accuracy=0.502640, validation/loss=2.388489, validation/num_examples=50000
I0914 08:02:48.232521 139621090420480 logging_writer.py:48] [8000] global_step=8000, grad_norm=0.24518267810344696, loss=3.8370201587677
I0914 08:05:36.454721 139620469688064 logging_writer.py:48] [8500] global_step=8500, grad_norm=0.2367110252380371, loss=3.8449392318725586
I0914 08:08:24.702806 139621090420480 logging_writer.py:48] [9000] global_step=9000, grad_norm=0.23814517259597778, loss=3.7492542266845703
I0914 08:08:57.097401 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:09:04.500794 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:09:12.674903 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:09:14.951228 139785753851712 submission_runner.py:376] Time since start: 3275.92s, Step: 9098, {'train/accuracy': 0.5826291441917419, 'train/loss': 2.0229527950286865, 'validation/accuracy': 0.5064799785614014, 'validation/loss': 2.3824005126953125, 'validation/num_examples': 50000, 'test/accuracy': 0.3856000304222107, 'test/loss': 3.0475265979766846, 'test/num_examples': 10000, 'score': 3122.8992822170258, 'total_duration': 3275.9179759025574, 'accumulated_submission_time': 3122.8992822170258, 'accumulated_eval_time': 152.71290373802185, 'accumulated_logging_time': 0.17004680633544922}
I0914 08:09:14.969036 139620461295360 logging_writer.py:48] [9098] accumulated_eval_time=152.712904, accumulated_logging_time=0.170047, accumulated_submission_time=3122.899282, global_step=9098, preemption_count=0, score=3122.899282, test/accuracy=0.385600, test/loss=3.047527, test/num_examples=10000, total_duration=3275.917976, train/accuracy=0.582629, train/loss=2.022953, validation/accuracy=0.506480, validation/loss=2.382401, validation/num_examples=50000
I0914 08:11:30.622020 139621082027776 logging_writer.py:48] [9500] global_step=9500, grad_norm=0.2386445701122284, loss=3.8122379779815674
I0914 08:14:18.848359 139620461295360 logging_writer.py:48] [10000] global_step=10000, grad_norm=0.23931212723255157, loss=3.726754903793335
I0914 08:17:07.096012 139621082027776 logging_writer.py:48] [10500] global_step=10500, grad_norm=0.23180843889713287, loss=3.6859421730041504
I0914 08:17:45.199921 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:17:52.574122 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:18:00.787055 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:18:03.038870 139785753851712 submission_runner.py:376] Time since start: 3804.01s, Step: 10615, {'train/accuracy': 0.5702128410339355, 'train/loss': 2.0194029808044434, 'validation/accuracy': 0.5161600112915039, 'validation/loss': 2.2897284030914307, 'validation/num_examples': 50000, 'test/accuracy': 0.3993000090122223, 'test/loss': 2.9481253623962402, 'test/num_examples': 10000, 'score': 3633.0976436138153, 'total_duration': 3804.0056059360504, 'accumulated_submission_time': 3633.0976436138153, 'accumulated_eval_time': 170.5518193244934, 'accumulated_logging_time': 0.1971442699432373}
I0914 08:18:03.065747 139620452902656 logging_writer.py:48] [10615] accumulated_eval_time=170.551819, accumulated_logging_time=0.197144, accumulated_submission_time=3633.097644, global_step=10615, preemption_count=0, score=3633.097644, test/accuracy=0.399300, test/loss=2.948125, test/num_examples=10000, total_duration=3804.005606, train/accuracy=0.570213, train/loss=2.019403, validation/accuracy=0.516160, validation/loss=2.289728, validation/num_examples=50000
I0914 08:20:12.967016 139620461295360 logging_writer.py:48] [11000] global_step=11000, grad_norm=0.24939975142478943, loss=3.7546169757843018
I0914 08:23:01.209996 139620452902656 logging_writer.py:48] [11500] global_step=11500, grad_norm=0.23770606517791748, loss=3.735459566116333
I0914 08:25:49.416116 139620461295360 logging_writer.py:48] [12000] global_step=12000, grad_norm=0.2510932385921478, loss=3.805049180984497
I0914 08:26:33.239436 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:26:40.575763 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:26:48.802873 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:26:51.083399 139785753851712 submission_runner.py:376] Time since start: 4332.05s, Step: 12132, {'train/accuracy': 0.5753945708274841, 'train/loss': 2.0027692317962646, 'validation/accuracy': 0.5300599932670593, 'validation/loss': 2.2071869373321533, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.8690314292907715, 'test/num_examples': 10000, 'score': 4143.2383596897125, 'total_duration': 4332.050138235092, 'accumulated_submission_time': 4143.2383596897125, 'accumulated_eval_time': 188.3957643508911, 'accumulated_logging_time': 0.23388051986694336}
I0914 08:26:51.102065 139620444509952 logging_writer.py:48] [12132] accumulated_eval_time=188.395764, accumulated_logging_time=0.233881, accumulated_submission_time=4143.238360, global_step=12132, preemption_count=0, score=4143.238360, test/accuracy=0.414900, test/loss=2.869031, test/num_examples=10000, total_duration=4332.050138, train/accuracy=0.575395, train/loss=2.002769, validation/accuracy=0.530060, validation/loss=2.207187, validation/num_examples=50000
I0914 08:28:55.254952 139620452902656 logging_writer.py:48] [12500] global_step=12500, grad_norm=0.24407465755939484, loss=3.693490743637085
I0914 08:31:43.490170 139620444509952 logging_writer.py:48] [13000] global_step=13000, grad_norm=0.24153253436088562, loss=3.671910524368286
I0914 08:34:31.727334 139620452902656 logging_writer.py:48] [13500] global_step=13500, grad_norm=0.2499263882637024, loss=3.6926016807556152
I0914 08:35:21.274003 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:35:29.321512 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:35:37.648797 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:35:39.919688 139785753851712 submission_runner.py:376] Time since start: 4860.89s, Step: 13649, {'train/accuracy': 0.5742785334587097, 'train/loss': 2.003106117248535, 'validation/accuracy': 0.5323799848556519, 'validation/loss': 2.2081964015960693, 'validation/num_examples': 50000, 'test/accuracy': 0.4196000099182129, 'test/loss': 2.892209529876709, 'test/num_examples': 10000, 'score': 4653.37796998024, 'total_duration': 4860.886433124542, 'accumulated_submission_time': 4653.37796998024, 'accumulated_eval_time': 207.04141402244568, 'accumulated_logging_time': 0.26221203804016113}
I0914 08:35:39.956246 139620444509952 logging_writer.py:48] [13649] accumulated_eval_time=207.041414, accumulated_logging_time=0.262212, accumulated_submission_time=4653.377970, global_step=13649, preemption_count=0, score=4653.377970, test/accuracy=0.419600, test/loss=2.892210, test/num_examples=10000, total_duration=4860.886433, train/accuracy=0.574279, train/loss=2.003106, validation/accuracy=0.532380, validation/loss=2.208196, validation/num_examples=50000
I0914 08:37:38.445249 139620452902656 logging_writer.py:48] [14000] global_step=14000, grad_norm=0.24695931375026703, loss=3.6911535263061523
I0914 08:40:26.670311 139620444509952 logging_writer.py:48] [14500] global_step=14500, grad_norm=0.25017544627189636, loss=3.5895378589630127
I0914 08:43:14.935185 139620452902656 logging_writer.py:48] [15000] global_step=15000, grad_norm=0.2352597415447235, loss=3.5701823234558105
I0914 08:44:10.209634 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:44:17.744957 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:44:26.187425 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:44:28.424991 139785753851712 submission_runner.py:376] Time since start: 5389.39s, Step: 15166, {'train/accuracy': 0.5779455900192261, 'train/loss': 2.057650327682495, 'validation/accuracy': 0.5369799733161926, 'validation/loss': 2.2513246536254883, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.942534923553467, 'test/num_examples': 10000, 'score': 5163.59117102623, 'total_duration': 5389.39173579216, 'accumulated_submission_time': 5163.59117102623, 'accumulated_eval_time': 225.25672912597656, 'accumulated_logging_time': 0.31658077239990234}
I0914 08:44:28.446834 139620436117248 logging_writer.py:48] [15166] accumulated_eval_time=225.256729, accumulated_logging_time=0.316581, accumulated_submission_time=5163.591171, global_step=15166, preemption_count=0, score=5163.591171, test/accuracy=0.414900, test/loss=2.942535, test/num_examples=10000, total_duration=5389.391736, train/accuracy=0.577946, train/loss=2.057650, validation/accuracy=0.536980, validation/loss=2.251325, validation/num_examples=50000
I0914 08:46:21.164688 139620444509952 logging_writer.py:48] [15500] global_step=15500, grad_norm=0.2539088726043701, loss=3.7177305221557617
I0914 08:49:09.408946 139620436117248 logging_writer.py:48] [16000] global_step=16000, grad_norm=0.2508637309074402, loss=3.5680971145629883
I0914 08:51:57.652329 139620444509952 logging_writer.py:48] [16500] global_step=16500, grad_norm=0.26029929518699646, loss=3.609468698501587
I0914 08:52:58.664029 139785753851712 spec.py:320] Evaluating on the training split.
I0914 08:53:06.896972 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 08:53:15.780883 139785753851712 spec.py:348] Evaluating on the test split.
I0914 08:53:18.137967 139785753851712 submission_runner.py:376] Time since start: 5919.10s, Step: 16683, {'train/accuracy': 0.5338209271430969, 'train/loss': 2.335331678390503, 'validation/accuracy': 0.5003600120544434, 'validation/loss': 2.478173017501831, 'validation/num_examples': 50000, 'test/accuracy': 0.38210001587867737, 'test/loss': 3.1922895908355713, 'test/num_examples': 10000, 'score': 5673.775738954544, 'total_duration': 5919.104665517807, 'accumulated_submission_time': 5673.775738954544, 'accumulated_eval_time': 244.7305908203125, 'accumulated_logging_time': 0.3483104705810547}
I0914 08:53:18.170782 139620444509952 logging_writer.py:48] [16683] accumulated_eval_time=244.730591, accumulated_logging_time=0.348310, accumulated_submission_time=5673.775739, global_step=16683, preemption_count=0, score=5673.775739, test/accuracy=0.382100, test/loss=3.192290, test/num_examples=10000, total_duration=5919.104666, train/accuracy=0.533821, train/loss=2.335332, validation/accuracy=0.500360, validation/loss=2.478173, validation/num_examples=50000
I0914 08:55:04.959357 139620452902656 logging_writer.py:48] [17000] global_step=17000, grad_norm=0.25013065338134766, loss=3.6450700759887695
I0914 08:57:53.102471 139620444509952 logging_writer.py:48] [17500] global_step=17500, grad_norm=0.2470160871744156, loss=3.6803393363952637
I0914 09:00:41.390616 139620452902656 logging_writer.py:48] [18000] global_step=18000, grad_norm=0.24692237377166748, loss=3.684927225112915
I0914 09:01:48.390591 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:01:56.225740 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:02:05.743061 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:02:08.019340 139785753851712 submission_runner.py:376] Time since start: 6448.99s, Step: 18201, {'train/accuracy': 0.6382134556770325, 'train/loss': 1.74356210231781, 'validation/accuracy': 0.5595200061798096, 'validation/loss': 2.1038706302642822, 'validation/num_examples': 50000, 'test/accuracy': 0.43620002269744873, 'test/loss': 2.7823822498321533, 'test/num_examples': 10000, 'score': 6183.962655305862, 'total_duration': 6448.986067771912, 'accumulated_submission_time': 6183.962655305862, 'accumulated_eval_time': 264.35929918289185, 'accumulated_logging_time': 0.39099645614624023}
I0914 09:02:08.039974 139621082027776 logging_writer.py:48] [18201] accumulated_eval_time=264.359299, accumulated_logging_time=0.390996, accumulated_submission_time=6183.962655, global_step=18201, preemption_count=0, score=6183.962655, test/accuracy=0.436200, test/loss=2.782382, test/num_examples=10000, total_duration=6448.986068, train/accuracy=0.638213, train/loss=1.743562, validation/accuracy=0.559520, validation/loss=2.103871, validation/num_examples=50000
I0914 09:03:48.995682 139621090420480 logging_writer.py:48] [18500] global_step=18500, grad_norm=0.24664853513240814, loss=3.5926640033721924
I0914 09:06:37.140828 139621082027776 logging_writer.py:48] [19000] global_step=19000, grad_norm=0.2476327270269394, loss=3.5965888500213623
I0914 09:09:25.389448 139621090420480 logging_writer.py:48] [19500] global_step=19500, grad_norm=0.24864919483661652, loss=3.5664284229278564
I0914 09:10:38.143328 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:10:46.363858 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:10:55.978470 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:10:58.233433 139785753851712 submission_runner.py:376] Time since start: 6979.20s, Step: 19718, {'train/accuracy': 0.5982142686843872, 'train/loss': 1.9227712154388428, 'validation/accuracy': 0.5423600077629089, 'validation/loss': 2.185263156890869, 'validation/num_examples': 50000, 'test/accuracy': 0.4272000193595886, 'test/loss': 2.8164870738983154, 'test/num_examples': 10000, 'score': 6694.032505512238, 'total_duration': 6979.200145483017, 'accumulated_submission_time': 6694.032505512238, 'accumulated_eval_time': 284.4493384361267, 'accumulated_logging_time': 0.4227602481842041}
I0914 09:10:58.268679 139620436117248 logging_writer.py:48] [19718] accumulated_eval_time=284.449338, accumulated_logging_time=0.422760, accumulated_submission_time=6694.032506, global_step=19718, preemption_count=0, score=6694.032506, test/accuracy=0.427200, test/loss=2.816487, test/num_examples=10000, total_duration=6979.200145, train/accuracy=0.598214, train/loss=1.922771, validation/accuracy=0.542360, validation/loss=2.185263, validation/num_examples=50000
I0914 09:12:33.433119 139620444509952 logging_writer.py:48] [20000] global_step=20000, grad_norm=0.2531515657901764, loss=3.5633723735809326
I0914 09:15:21.484676 139620436117248 logging_writer.py:48] [20500] global_step=20500, grad_norm=0.2647220194339752, loss=3.608556032180786
I0914 09:18:09.739126 139620444509952 logging_writer.py:48] [21000] global_step=21000, grad_norm=0.2493944615125656, loss=3.5959181785583496
I0914 09:19:28.235945 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:19:35.889616 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:19:45.067148 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:19:47.308690 139785753851712 submission_runner.py:376] Time since start: 7508.28s, Step: 21235, {'train/accuracy': 0.6011040806770325, 'train/loss': 1.9267860651016235, 'validation/accuracy': 0.5546199679374695, 'validation/loss': 2.1422059535980225, 'validation/num_examples': 50000, 'test/accuracy': 0.4358000159263611, 'test/loss': 2.809361457824707, 'test/num_examples': 10000, 'score': 7203.964419841766, 'total_duration': 7508.275420188904, 'accumulated_submission_time': 7203.964419841766, 'accumulated_eval_time': 303.52203822135925, 'accumulated_logging_time': 0.47086191177368164}
I0914 09:19:47.331918 139621082027776 logging_writer.py:48] [21235] accumulated_eval_time=303.522038, accumulated_logging_time=0.470862, accumulated_submission_time=7203.964420, global_step=21235, preemption_count=0, score=7203.964420, test/accuracy=0.435800, test/loss=2.809361, test/num_examples=10000, total_duration=7508.275420, train/accuracy=0.601104, train/loss=1.926786, validation/accuracy=0.554620, validation/loss=2.142206, validation/num_examples=50000
I0914 09:21:16.821161 139621090420480 logging_writer.py:48] [21500] global_step=21500, grad_norm=0.2546178698539734, loss=3.6239407062530518
I0914 09:24:04.748127 139621082027776 logging_writer.py:48] [22000] global_step=22000, grad_norm=0.2460772544145584, loss=3.537613868713379
I0914 09:26:52.895702 139621090420480 logging_writer.py:48] [22500] global_step=22500, grad_norm=0.2539433240890503, loss=3.622434139251709
I0914 09:28:17.428775 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:28:24.868973 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:28:34.428437 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:28:36.700894 139785753851712 submission_runner.py:376] Time since start: 8037.67s, Step: 22753, {'train/accuracy': 0.5950454473495483, 'train/loss': 1.9752215147018433, 'validation/accuracy': 0.5528799891471863, 'validation/loss': 2.1724321842193604, 'validation/num_examples': 50000, 'test/accuracy': 0.4336000084877014, 'test/loss': 2.810899257659912, 'test/num_examples': 10000, 'score': 7714.026890993118, 'total_duration': 8037.667640447617, 'accumulated_submission_time': 7714.026890993118, 'accumulated_eval_time': 322.7941265106201, 'accumulated_logging_time': 0.5051746368408203}
I0914 09:28:36.720781 139620452902656 logging_writer.py:48] [22753] accumulated_eval_time=322.794127, accumulated_logging_time=0.505175, accumulated_submission_time=7714.026891, global_step=22753, preemption_count=0, score=7714.026891, test/accuracy=0.433600, test/loss=2.810899, test/num_examples=10000, total_duration=8037.667640, train/accuracy=0.595045, train/loss=1.975222, validation/accuracy=0.552880, validation/loss=2.172432, validation/num_examples=50000
I0914 09:30:00.012202 139620461295360 logging_writer.py:48] [23000] global_step=23000, grad_norm=0.2528955638408661, loss=3.527564287185669
I0914 09:32:48.240035 139620452902656 logging_writer.py:48] [23500] global_step=23500, grad_norm=0.257273405790329, loss=3.5579819679260254
I0914 09:35:36.496314 139620461295360 logging_writer.py:48] [24000] global_step=24000, grad_norm=0.2570257782936096, loss=3.555431365966797
I0914 09:37:06.786156 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:37:14.321404 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:37:24.197975 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:37:26.461626 139785753851712 submission_runner.py:376] Time since start: 8567.43s, Step: 24270, {'train/accuracy': 0.5795798897743225, 'train/loss': 2.023024320602417, 'validation/accuracy': 0.5407800078392029, 'validation/loss': 2.230668306350708, 'validation/num_examples': 50000, 'test/accuracy': 0.41540002822875977, 'test/loss': 2.895124673843384, 'test/num_examples': 10000, 'score': 8224.06004691124, 'total_duration': 8567.42837190628, 'accumulated_submission_time': 8224.06004691124, 'accumulated_eval_time': 342.4695653915405, 'accumulated_logging_time': 0.5343174934387207}
I0914 09:37:26.481554 139620444509952 logging_writer.py:48] [24270] accumulated_eval_time=342.469565, accumulated_logging_time=0.534317, accumulated_submission_time=8224.060047, global_step=24270, preemption_count=0, score=8224.060047, test/accuracy=0.415400, test/loss=2.895125, test/num_examples=10000, total_duration=8567.428372, train/accuracy=0.579580, train/loss=2.023024, validation/accuracy=0.540780, validation/loss=2.230668, validation/num_examples=50000
I0914 09:38:43.997312 139620452902656 logging_writer.py:48] [24500] global_step=24500, grad_norm=0.25301215052604675, loss=3.5550875663757324
I0914 09:41:32.170509 139620444509952 logging_writer.py:48] [25000] global_step=25000, grad_norm=0.2632233500480652, loss=3.5506157875061035
I0914 09:44:20.310847 139620452902656 logging_writer.py:48] [25500] global_step=25500, grad_norm=0.2634458839893341, loss=3.6524815559387207
I0914 09:45:56.501331 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:46:04.409524 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:46:13.919648 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:46:16.160629 139785753851712 submission_runner.py:376] Time since start: 9097.13s, Step: 25788, {'train/accuracy': 0.6070232391357422, 'train/loss': 1.906052589416504, 'validation/accuracy': 0.5681399703025818, 'validation/loss': 2.081261396408081, 'validation/num_examples': 50000, 'test/accuracy': 0.4448000192642212, 'test/loss': 2.7579729557037354, 'test/num_examples': 10000, 'score': 8734.046847581863, 'total_duration': 9097.127324581146, 'accumulated_submission_time': 8734.046847581863, 'accumulated_eval_time': 362.12878465652466, 'accumulated_logging_time': 0.5639915466308594}
I0914 09:46:16.183245 139621090420480 logging_writer.py:48] [25788] accumulated_eval_time=362.128785, accumulated_logging_time=0.563992, accumulated_submission_time=8734.046848, global_step=25788, preemption_count=0, score=8734.046848, test/accuracy=0.444800, test/loss=2.757973, test/num_examples=10000, total_duration=9097.127325, train/accuracy=0.607023, train/loss=1.906053, validation/accuracy=0.568140, validation/loss=2.081261, validation/num_examples=50000
I0914 09:47:27.752431 139621098813184 logging_writer.py:48] [26000] global_step=26000, grad_norm=0.25486627221107483, loss=3.5052614212036133
I0914 09:50:15.872663 139621090420480 logging_writer.py:48] [26500] global_step=26500, grad_norm=0.24980434775352478, loss=3.4703752994537354
I0914 09:53:04.141244 139621098813184 logging_writer.py:48] [27000] global_step=27000, grad_norm=0.25720298290252686, loss=3.4930596351623535
I0914 09:54:46.202930 139785753851712 spec.py:320] Evaluating on the training split.
I0914 09:54:54.460194 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 09:55:04.621035 139785753851712 spec.py:348] Evaluating on the test split.
I0914 09:55:06.895931 139785753851712 submission_runner.py:376] Time since start: 9627.86s, Step: 27305, {'train/accuracy': 0.6470025181770325, 'train/loss': 1.7079155445098877, 'validation/accuracy': 0.5748400092124939, 'validation/loss': 2.0490245819091797, 'validation/num_examples': 50000, 'test/accuracy': 0.4561000168323517, 'test/loss': 2.6967175006866455, 'test/num_examples': 10000, 'score': 9244.031145811081, 'total_duration': 9627.8626434803, 'accumulated_submission_time': 9244.031145811081, 'accumulated_eval_time': 382.8217294216156, 'accumulated_logging_time': 0.5991528034210205}
I0914 09:55:06.920699 139620452902656 logging_writer.py:48] [27305] accumulated_eval_time=382.821729, accumulated_logging_time=0.599153, accumulated_submission_time=9244.031146, global_step=27305, preemption_count=0, score=9244.031146, test/accuracy=0.456100, test/loss=2.696718, test/num_examples=10000, total_duration=9627.862643, train/accuracy=0.647003, train/loss=1.707916, validation/accuracy=0.574840, validation/loss=2.049025, validation/num_examples=50000
I0914 09:56:12.884320 139620461295360 logging_writer.py:48] [27500] global_step=27500, grad_norm=0.24228453636169434, loss=3.4076318740844727
I0914 09:59:01.164959 139620452902656 logging_writer.py:48] [28000] global_step=28000, grad_norm=0.25307339429855347, loss=3.5211498737335205
I0914 10:01:49.381948 139620461295360 logging_writer.py:48] [28500] global_step=28500, grad_norm=0.2569205164909363, loss=3.4712512493133545
I0914 10:03:37.166977 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:03:45.304134 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:03:55.034219 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:03:57.295296 139785753851712 submission_runner.py:376] Time since start: 10158.26s, Step: 28822, {'train/accuracy': 0.6150151491165161, 'train/loss': 1.8384064435958862, 'validation/accuracy': 0.5671799778938293, 'validation/loss': 2.073777675628662, 'validation/num_examples': 50000, 'test/accuracy': 0.4439000189304352, 'test/loss': 2.7257843017578125, 'test/num_examples': 10000, 'score': 9754.244331598282, 'total_duration': 10158.26204609871, 'accumulated_submission_time': 9754.244331598282, 'accumulated_eval_time': 402.9500343799591, 'accumulated_logging_time': 0.6338088512420654}
I0914 10:03:57.321540 139620444509952 logging_writer.py:48] [28822] accumulated_eval_time=402.950034, accumulated_logging_time=0.633809, accumulated_submission_time=9754.244332, global_step=28822, preemption_count=0, score=9754.244332, test/accuracy=0.443900, test/loss=2.725784, test/num_examples=10000, total_duration=10158.262046, train/accuracy=0.615015, train/loss=1.838406, validation/accuracy=0.567180, validation/loss=2.073778, validation/num_examples=50000
I0914 10:04:57.400686 139621090420480 logging_writer.py:48] [29000] global_step=29000, grad_norm=0.2559642493724823, loss=3.615403890609741
I0914 10:07:45.404992 139620444509952 logging_writer.py:48] [29500] global_step=29500, grad_norm=0.25525912642478943, loss=3.485414743423462
I0914 10:10:33.635075 139621090420480 logging_writer.py:48] [30000] global_step=30000, grad_norm=0.2619873881340027, loss=3.531353235244751
I0914 10:12:27.489350 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:12:35.666204 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:12:46.634149 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:12:48.918695 139785753851712 submission_runner.py:376] Time since start: 10689.89s, Step: 30340, {'train/accuracy': 0.6219905614852905, 'train/loss': 1.8566009998321533, 'validation/accuracy': 0.5748000144958496, 'validation/loss': 2.0822231769561768, 'validation/num_examples': 50000, 'test/accuracy': 0.4506000280380249, 'test/loss': 2.730398178100586, 'test/num_examples': 10000, 'score': 10264.379125356674, 'total_duration': 10689.885441303253, 'accumulated_submission_time': 10264.379125356674, 'accumulated_eval_time': 424.3793590068817, 'accumulated_logging_time': 0.6697630882263184}
I0914 10:12:48.940200 139621090420480 logging_writer.py:48] [30340] accumulated_eval_time=424.379359, accumulated_logging_time=0.669763, accumulated_submission_time=10264.379125, global_step=30340, preemption_count=0, score=10264.379125, test/accuracy=0.450600, test/loss=2.730398, test/num_examples=10000, total_duration=10689.885441, train/accuracy=0.621991, train/loss=1.856601, validation/accuracy=0.574800, validation/loss=2.082223, validation/num_examples=50000
I0914 10:13:42.962412 139621098813184 logging_writer.py:48] [30500] global_step=30500, grad_norm=0.24631181359291077, loss=3.4548144340515137
I0914 10:16:31.118767 139621090420480 logging_writer.py:48] [31000] global_step=31000, grad_norm=0.2661013603210449, loss=3.5372538566589355
I0914 10:19:19.433270 139621098813184 logging_writer.py:48] [31500] global_step=31500, grad_norm=0.26415982842445374, loss=3.5574021339416504
I0914 10:21:18.972899 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:21:27.011381 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:21:38.255725 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:21:40.473356 139785753851712 submission_runner.py:376] Time since start: 11221.44s, Step: 31857, {'train/accuracy': 0.6304607391357422, 'train/loss': 1.7625677585601807, 'validation/accuracy': 0.5828399658203125, 'validation/loss': 1.97676682472229, 'validation/num_examples': 50000, 'test/accuracy': 0.460500031709671, 'test/loss': 2.6475884914398193, 'test/num_examples': 10000, 'score': 10774.378732919693, 'total_duration': 11221.44010066986, 'accumulated_submission_time': 10774.378732919693, 'accumulated_eval_time': 445.87978982925415, 'accumulated_logging_time': 0.7013595104217529}
I0914 10:21:40.499169 139618842306304 logging_writer.py:48] [31857] accumulated_eval_time=445.879790, accumulated_logging_time=0.701360, accumulated_submission_time=10774.378733, global_step=31857, preemption_count=0, score=10774.378733, test/accuracy=0.460500, test/loss=2.647588, test/num_examples=10000, total_duration=11221.440101, train/accuracy=0.630461, train/loss=1.762568, validation/accuracy=0.582840, validation/loss=1.976767, validation/num_examples=50000
I0914 10:22:28.939013 139618850699008 logging_writer.py:48] [32000] global_step=32000, grad_norm=0.27015623450279236, loss=3.525996446609497
I0914 10:25:17.226516 139618842306304 logging_writer.py:48] [32500] global_step=32500, grad_norm=0.2672438621520996, loss=3.547415256500244
I0914 10:28:05.477488 139618850699008 logging_writer.py:48] [33000] global_step=33000, grad_norm=0.2667967975139618, loss=3.43361759185791
I0914 10:30:10.750766 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:30:18.859109 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:30:29.977935 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:30:32.261843 139785753851712 submission_runner.py:376] Time since start: 11753.23s, Step: 33374, {'train/accuracy': 0.6193598508834839, 'train/loss': 1.826348066329956, 'validation/accuracy': 0.5787799954414368, 'validation/loss': 2.038647413253784, 'validation/num_examples': 50000, 'test/accuracy': 0.46250003576278687, 'test/loss': 2.679227352142334, 'test/num_examples': 10000, 'score': 11284.596626758575, 'total_duration': 11753.228493452072, 'accumulated_submission_time': 11284.596626758575, 'accumulated_eval_time': 467.390745639801, 'accumulated_logging_time': 0.738187313079834}
I0914 10:30:32.285148 139618850699008 logging_writer.py:48] [33374] accumulated_eval_time=467.390746, accumulated_logging_time=0.738187, accumulated_submission_time=11284.596627, global_step=33374, preemption_count=0, score=11284.596627, test/accuracy=0.462500, test/loss=2.679227, test/num_examples=10000, total_duration=11753.228493, train/accuracy=0.619360, train/loss=1.826348, validation/accuracy=0.578780, validation/loss=2.038647, validation/num_examples=50000
I0914 10:31:15.049534 139621090420480 logging_writer.py:48] [33500] global_step=33500, grad_norm=0.27199679613113403, loss=3.4771883487701416
I0914 10:34:03.261353 139618850699008 logging_writer.py:48] [34000] global_step=34000, grad_norm=0.267585426568985, loss=3.463315010070801
I0914 10:36:51.504971 139621090420480 logging_writer.py:48] [34500] global_step=34500, grad_norm=0.2632179260253906, loss=3.4990382194519043
I0914 10:39:02.502562 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:39:10.827893 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:39:21.889250 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:39:24.149815 139785753851712 submission_runner.py:376] Time since start: 12285.12s, Step: 34891, {'train/accuracy': 0.6135203838348389, 'train/loss': 1.8042339086532593, 'validation/accuracy': 0.5674799680709839, 'validation/loss': 2.0255982875823975, 'validation/num_examples': 50000, 'test/accuracy': 0.4407000243663788, 'test/loss': 2.6929032802581787, 'test/num_examples': 10000, 'score': 11794.77813744545, 'total_duration': 12285.116560459137, 'accumulated_submission_time': 11794.77813744545, 'accumulated_eval_time': 489.0379819869995, 'accumulated_logging_time': 0.7739980220794678}
I0914 10:39:24.173407 139618833913600 logging_writer.py:48] [34891] accumulated_eval_time=489.037982, accumulated_logging_time=0.773998, accumulated_submission_time=11794.778137, global_step=34891, preemption_count=0, score=11794.778137, test/accuracy=0.440700, test/loss=2.692903, test/num_examples=10000, total_duration=12285.116560, train/accuracy=0.613520, train/loss=1.804234, validation/accuracy=0.567480, validation/loss=2.025598, validation/num_examples=50000
I0914 10:40:01.195227 139618842306304 logging_writer.py:48] [35000] global_step=35000, grad_norm=0.2558711767196655, loss=3.463052988052368
I0914 10:42:49.472806 139618833913600 logging_writer.py:48] [35500] global_step=35500, grad_norm=0.2669128179550171, loss=3.420599937438965
I0914 10:45:37.677465 139618842306304 logging_writer.py:48] [36000] global_step=36000, grad_norm=0.2667349874973297, loss=3.52689528465271
I0914 10:47:54.355814 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:48:02.683128 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:48:14.021110 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:48:16.292880 139785753851712 submission_runner.py:376] Time since start: 12817.26s, Step: 36408, {'train/accuracy': 0.6544762253761292, 'train/loss': 1.7204208374023438, 'validation/accuracy': 0.5828199982643127, 'validation/loss': 2.025986433029175, 'validation/num_examples': 50000, 'test/accuracy': 0.45920002460479736, 'test/loss': 2.709810733795166, 'test/num_examples': 10000, 'score': 12304.926926612854, 'total_duration': 12817.259620189667, 'accumulated_submission_time': 12304.926926612854, 'accumulated_eval_time': 510.97503638267517, 'accumulated_logging_time': 0.8083920478820801}
I0914 10:48:16.314396 139621082027776 logging_writer.py:48] [36408] accumulated_eval_time=510.975036, accumulated_logging_time=0.808392, accumulated_submission_time=12304.926927, global_step=36408, preemption_count=0, score=12304.926927, test/accuracy=0.459200, test/loss=2.709811, test/num_examples=10000, total_duration=12817.259620, train/accuracy=0.654476, train/loss=1.720421, validation/accuracy=0.582820, validation/loss=2.025986, validation/num_examples=50000
I0914 10:48:47.581541 139621090420480 logging_writer.py:48] [36500] global_step=36500, grad_norm=0.26472190022468567, loss=3.4355721473693848
I0914 10:51:35.596939 139621082027776 logging_writer.py:48] [37000] global_step=37000, grad_norm=0.26586592197418213, loss=3.414236545562744
I0914 10:54:23.842362 139621090420480 logging_writer.py:48] [37500] global_step=37500, grad_norm=0.2715505361557007, loss=3.4005072116851807
I0914 10:56:46.598515 139785753851712 spec.py:320] Evaluating on the training split.
I0914 10:56:54.445778 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 10:57:05.205040 139785753851712 spec.py:348] Evaluating on the test split.
I0914 10:57:07.497152 139785753851712 submission_runner.py:376] Time since start: 13348.46s, Step: 37926, {'train/accuracy': 0.6563695669174194, 'train/loss': 1.7108644247055054, 'validation/accuracy': 0.5983799695968628, 'validation/loss': 1.9706319570541382, 'validation/num_examples': 50000, 'test/accuracy': 0.48260003328323364, 'test/loss': 2.6029040813446045, 'test/num_examples': 10000, 'score': 12815.176861763, 'total_duration': 13348.463897228241, 'accumulated_submission_time': 12815.176861763, 'accumulated_eval_time': 531.8736464977264, 'accumulated_logging_time': 0.8413448333740234}
I0914 10:57:07.519643 139618833913600 logging_writer.py:48] [37926] accumulated_eval_time=531.873646, accumulated_logging_time=0.841345, accumulated_submission_time=12815.176862, global_step=37926, preemption_count=0, score=12815.176862, test/accuracy=0.482600, test/loss=2.602904, test/num_examples=10000, total_duration=13348.463897, train/accuracy=0.656370, train/loss=1.710864, validation/accuracy=0.598380, validation/loss=1.970632, validation/num_examples=50000
I0914 10:57:32.709061 139618842306304 logging_writer.py:48] [38000] global_step=38000, grad_norm=0.2711631655693054, loss=3.4340717792510986
I0914 11:00:20.877791 139618833913600 logging_writer.py:48] [38500] global_step=38500, grad_norm=0.2710915207862854, loss=3.424994945526123
I0914 11:03:09.109183 139618842306304 logging_writer.py:48] [39000] global_step=39000, grad_norm=0.2728542387485504, loss=3.4836416244506836
I0914 11:05:37.603592 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:05:45.487278 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:05:56.162706 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:05:58.400947 139785753851712 submission_runner.py:376] Time since start: 13879.37s, Step: 39443, {'train/accuracy': 0.6421595811843872, 'train/loss': 1.6656968593597412, 'validation/accuracy': 0.5947999954223633, 'validation/loss': 1.8805724382400513, 'validation/num_examples': 50000, 'test/accuracy': 0.4724000096321106, 'test/loss': 2.5521347522735596, 'test/num_examples': 10000, 'score': 13325.226942777634, 'total_duration': 13879.367692947388, 'accumulated_submission_time': 13325.226942777634, 'accumulated_eval_time': 552.67098736763, 'accumulated_logging_time': 0.8749892711639404}
I0914 11:05:58.423598 139621090420480 logging_writer.py:48] [39443] accumulated_eval_time=552.670987, accumulated_logging_time=0.874989, accumulated_submission_time=13325.226943, global_step=39443, preemption_count=0, score=13325.226943, test/accuracy=0.472400, test/loss=2.552135, test/num_examples=10000, total_duration=13879.367693, train/accuracy=0.642160, train/loss=1.665697, validation/accuracy=0.594800, validation/loss=1.880572, validation/num_examples=50000
I0914 11:06:17.864188 139621098813184 logging_writer.py:48] [39500] global_step=39500, grad_norm=0.2749280035495758, loss=3.5009050369262695
I0914 11:09:05.970704 139621090420480 logging_writer.py:48] [40000] global_step=40000, grad_norm=0.2732629179954529, loss=3.5085206031799316
I0914 11:11:54.222726 139621098813184 logging_writer.py:48] [40500] global_step=40500, grad_norm=0.27376991510391235, loss=3.424867868423462
I0914 11:14:28.716043 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:14:36.618146 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:14:47.344748 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:14:49.609976 139785753851712 submission_runner.py:376] Time since start: 14410.58s, Step: 40961, {'train/accuracy': 0.6224888563156128, 'train/loss': 1.8308100700378418, 'validation/accuracy': 0.5813800096511841, 'validation/loss': 2.03011155128479, 'validation/num_examples': 50000, 'test/accuracy': 0.46330001950263977, 'test/loss': 2.654615640640259, 'test/num_examples': 10000, 'score': 13835.485067367554, 'total_duration': 14410.576689004898, 'accumulated_submission_time': 13835.485067367554, 'accumulated_eval_time': 573.5648620128632, 'accumulated_logging_time': 0.9091906547546387}
I0914 11:14:49.632932 139618850699008 logging_writer.py:48] [40961] accumulated_eval_time=573.564862, accumulated_logging_time=0.909191, accumulated_submission_time=13835.485067, global_step=40961, preemption_count=0, score=13835.485067, test/accuracy=0.463300, test/loss=2.654616, test/num_examples=10000, total_duration=14410.576689, train/accuracy=0.622489, train/loss=1.830810, validation/accuracy=0.581380, validation/loss=2.030112, validation/num_examples=50000
I0914 11:15:03.108616 139620410959616 logging_writer.py:48] [41000] global_step=41000, grad_norm=0.2798670530319214, loss=3.474769115447998
I0914 11:17:51.292984 139618850699008 logging_writer.py:48] [41500] global_step=41500, grad_norm=0.2704616189002991, loss=3.3771965503692627
I0914 11:20:39.542866 139620410959616 logging_writer.py:48] [42000] global_step=42000, grad_norm=0.2555727958679199, loss=3.3959810733795166
I0914 11:23:19.824655 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:23:27.647799 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:23:38.533474 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:23:40.750660 139785753851712 submission_runner.py:376] Time since start: 14941.72s, Step: 42478, {'train/accuracy': 0.6535195708274841, 'train/loss': 1.6985844373703003, 'validation/accuracy': 0.6078799962997437, 'validation/loss': 1.8977073431015015, 'validation/num_examples': 50000, 'test/accuracy': 0.4879000186920166, 'test/loss': 2.5333468914031982, 'test/num_examples': 10000, 'score': 14345.642942905426, 'total_duration': 14941.717395067215, 'accumulated_submission_time': 14345.642942905426, 'accumulated_eval_time': 594.4908349514008, 'accumulated_logging_time': 0.9433751106262207}
I0914 11:23:40.773953 139618045392640 logging_writer.py:48] [42478] accumulated_eval_time=594.490835, accumulated_logging_time=0.943375, accumulated_submission_time=14345.642943, global_step=42478, preemption_count=0, score=14345.642943, test/accuracy=0.487900, test/loss=2.533347, test/num_examples=10000, total_duration=14941.717395, train/accuracy=0.653520, train/loss=1.698584, validation/accuracy=0.607880, validation/loss=1.897707, validation/num_examples=50000
I0914 11:23:48.501946 139621098813184 logging_writer.py:48] [42500] global_step=42500, grad_norm=0.26943811774253845, loss=3.3446433544158936
I0914 11:26:36.638270 139618045392640 logging_writer.py:48] [43000] global_step=43000, grad_norm=0.26483917236328125, loss=3.457627058029175
I0914 11:29:24.871920 139621098813184 logging_writer.py:48] [43500] global_step=43500, grad_norm=0.2760085463523865, loss=3.4224777221679688
I0914 11:32:10.842974 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:32:18.730613 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:32:29.689754 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:32:31.948379 139785753851712 submission_runner.py:376] Time since start: 15472.92s, Step: 43995, {'train/accuracy': 0.6800063848495483, 'train/loss': 1.590664267539978, 'validation/accuracy': 0.5967199802398682, 'validation/loss': 1.9601261615753174, 'validation/num_examples': 50000, 'test/accuracy': 0.47510001063346863, 'test/loss': 2.590411424636841, 'test/num_examples': 10000, 'score': 14855.678381443024, 'total_duration': 15472.915107250214, 'accumulated_submission_time': 14855.678381443024, 'accumulated_eval_time': 615.5962023735046, 'accumulated_logging_time': 0.9775500297546387}
I0914 11:32:31.970802 139620410959616 logging_writer.py:48] [43995] accumulated_eval_time=615.596202, accumulated_logging_time=0.977550, accumulated_submission_time=14855.678381, global_step=43995, preemption_count=0, score=14855.678381, test/accuracy=0.475100, test/loss=2.590411, test/num_examples=10000, total_duration=15472.915107, train/accuracy=0.680006, train/loss=1.590664, validation/accuracy=0.596720, validation/loss=1.960126, validation/num_examples=50000
I0914 11:32:33.989422 139620419352320 logging_writer.py:48] [44000] global_step=44000, grad_norm=0.2681794762611389, loss=3.3710594177246094
I0914 11:35:22.091144 139620410959616 logging_writer.py:48] [44500] global_step=44500, grad_norm=0.2750702202320099, loss=3.4190914630889893
I0914 11:38:10.329913 139620419352320 logging_writer.py:48] [45000] global_step=45000, grad_norm=0.2686624228954315, loss=3.4064462184906006
I0914 11:40:58.527068 139620410959616 logging_writer.py:48] [45500] global_step=45500, grad_norm=0.2688489556312561, loss=3.4332542419433594
I0914 11:41:01.977522 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:41:10.070076 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:41:20.937899 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:41:23.177372 139785753851712 submission_runner.py:376] Time since start: 16004.14s, Step: 45512, {'train/accuracy': 0.6493741869926453, 'train/loss': 1.7071624994277954, 'validation/accuracy': 0.5888199806213379, 'validation/loss': 2.0002124309539795, 'validation/num_examples': 50000, 'test/accuracy': 0.45580002665519714, 'test/loss': 2.681110382080078, 'test/num_examples': 10000, 'score': 15365.652275562286, 'total_duration': 16004.144119977951, 'accumulated_submission_time': 15365.652275562286, 'accumulated_eval_time': 636.7960221767426, 'accumulated_logging_time': 1.010751485824585}
I0914 11:41:23.199499 139618045392640 logging_writer.py:48] [45512] accumulated_eval_time=636.796022, accumulated_logging_time=1.010751, accumulated_submission_time=15365.652276, global_step=45512, preemption_count=0, score=15365.652276, test/accuracy=0.455800, test/loss=2.681110, test/num_examples=10000, total_duration=16004.144120, train/accuracy=0.649374, train/loss=1.707162, validation/accuracy=0.588820, validation/loss=2.000212, validation/num_examples=50000
I0914 11:44:07.496607 139621098813184 logging_writer.py:48] [46000] global_step=46000, grad_norm=0.2720436155796051, loss=3.4086406230926514
I0914 11:46:55.758412 139618045392640 logging_writer.py:48] [46500] global_step=46500, grad_norm=0.28809547424316406, loss=3.4885668754577637
I0914 11:49:44.016989 139621098813184 logging_writer.py:48] [47000] global_step=47000, grad_norm=0.2793656885623932, loss=3.47119140625
I0914 11:49:53.191527 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:50:01.044825 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:50:12.127666 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:50:14.442233 139785753851712 submission_runner.py:376] Time since start: 16535.41s, Step: 47029, {'train/accuracy': 0.6650390625, 'train/loss': 1.6022390127182007, 'validation/accuracy': 0.6084200143814087, 'validation/loss': 1.8593943119049072, 'validation/num_examples': 50000, 'test/accuracy': 0.4799000322818756, 'test/loss': 2.522620677947998, 'test/num_examples': 10000, 'score': 15875.610366106033, 'total_duration': 16535.40897345543, 'accumulated_submission_time': 15875.610366106033, 'accumulated_eval_time': 658.0466804504395, 'accumulated_logging_time': 1.043591022491455}
I0914 11:50:14.464774 139618045392640 logging_writer.py:48] [47029] accumulated_eval_time=658.046680, accumulated_logging_time=1.043591, accumulated_submission_time=15875.610366, global_step=47029, preemption_count=0, score=15875.610366, test/accuracy=0.479900, test/loss=2.522621, test/num_examples=10000, total_duration=16535.408973, train/accuracy=0.665039, train/loss=1.602239, validation/accuracy=0.608420, validation/loss=1.859394, validation/num_examples=50000
I0914 11:52:53.123323 139620410959616 logging_writer.py:48] [47500] global_step=47500, grad_norm=0.2651343047618866, loss=3.3844122886657715
I0914 11:55:41.347779 139618045392640 logging_writer.py:48] [48000] global_step=48000, grad_norm=0.28748777508735657, loss=3.477534770965576
I0914 11:58:29.551903 139620410959616 logging_writer.py:48] [48500] global_step=48500, grad_norm=0.29311808943748474, loss=3.514925241470337
I0914 11:58:44.444758 139785753851712 spec.py:320] Evaluating on the training split.
I0914 11:58:52.251591 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 11:59:03.331104 139785753851712 spec.py:348] Evaluating on the test split.
I0914 11:59:05.625081 139785753851712 submission_runner.py:376] Time since start: 17066.59s, Step: 48546, {'train/accuracy': 0.6487563848495483, 'train/loss': 1.6832637786865234, 'validation/accuracy': 0.5993599891662598, 'validation/loss': 1.918584942817688, 'validation/num_examples': 50000, 'test/accuracy': 0.47190001606941223, 'test/loss': 2.5920767784118652, 'test/num_examples': 10000, 'score': 16385.556366205215, 'total_duration': 17066.591827869415, 'accumulated_submission_time': 16385.556366205215, 'accumulated_eval_time': 679.2269690036774, 'accumulated_logging_time': 1.0766348838806152}
I0914 11:59:05.646985 139618028607232 logging_writer.py:48] [48546] accumulated_eval_time=679.226969, accumulated_logging_time=1.076635, accumulated_submission_time=16385.556366, global_step=48546, preemption_count=0, score=16385.556366, test/accuracy=0.471900, test/loss=2.592077, test/num_examples=10000, total_duration=17066.591828, train/accuracy=0.648756, train/loss=1.683264, validation/accuracy=0.599360, validation/loss=1.918585, validation/num_examples=50000
I0914 12:01:38.705673 139618036999936 logging_writer.py:48] [49000] global_step=49000, grad_norm=0.28258246183395386, loss=3.4261724948883057
I0914 12:04:26.898727 139618028607232 logging_writer.py:48] [49500] global_step=49500, grad_norm=0.28819113969802856, loss=3.4131767749786377
I0914 12:07:15.160165 139618036999936 logging_writer.py:48] [50000] global_step=50000, grad_norm=0.27262210845947266, loss=3.3209586143493652
I0914 12:07:35.770280 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:07:43.563357 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:07:54.683665 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:07:56.960196 139785753851712 submission_runner.py:376] Time since start: 17597.93s, Step: 50063, {'train/accuracy': 0.6590401530265808, 'train/loss': 1.6971187591552734, 'validation/accuracy': 0.6092999577522278, 'validation/loss': 1.9145151376724243, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5815019607543945, 'test/num_examples': 10000, 'score': 16895.64519429207, 'total_duration': 17597.92692875862, 'accumulated_submission_time': 16895.64519429207, 'accumulated_eval_time': 700.4168322086334, 'accumulated_logging_time': 1.1097350120544434}
I0914 12:07:56.982750 139620419352320 logging_writer.py:48] [50063] accumulated_eval_time=700.416832, accumulated_logging_time=1.109735, accumulated_submission_time=16895.645194, global_step=50063, preemption_count=0, score=16895.645194, test/accuracy=0.480600, test/loss=2.581502, test/num_examples=10000, total_duration=17597.926929, train/accuracy=0.659040, train/loss=1.697119, validation/accuracy=0.609300, validation/loss=1.914515, validation/num_examples=50000
I0914 12:10:24.317590 139621082027776 logging_writer.py:48] [50500] global_step=50500, grad_norm=0.2909661531448364, loss=3.479194164276123
I0914 12:13:12.551684 139620419352320 logging_writer.py:48] [51000] global_step=51000, grad_norm=0.27277132868766785, loss=3.3441390991210938
I0914 12:16:00.776785 139621082027776 logging_writer.py:48] [51500] global_step=51500, grad_norm=0.2970084249973297, loss=3.453324794769287
I0914 12:16:27.110894 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:16:34.974406 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:16:46.097953 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:16:48.275398 139785753851712 submission_runner.py:376] Time since start: 18129.24s, Step: 51580, {'train/accuracy': 0.6513472199440002, 'train/loss': 1.657335877418518, 'validation/accuracy': 0.6073200106620789, 'validation/loss': 1.8589073419570923, 'validation/num_examples': 50000, 'test/accuracy': 0.4838000237941742, 'test/loss': 2.5184545516967773, 'test/num_examples': 10000, 'score': 17405.740804433823, 'total_duration': 18129.24203300476, 'accumulated_submission_time': 17405.740804433823, 'accumulated_eval_time': 721.5812134742737, 'accumulated_logging_time': 1.142387866973877}
I0914 12:16:48.297658 139618045392640 logging_writer.py:48] [51580] accumulated_eval_time=721.581213, accumulated_logging_time=1.142388, accumulated_submission_time=17405.740804, global_step=51580, preemption_count=0, score=17405.740804, test/accuracy=0.483800, test/loss=2.518455, test/num_examples=10000, total_duration=18129.242033, train/accuracy=0.651347, train/loss=1.657336, validation/accuracy=0.607320, validation/loss=1.858907, validation/num_examples=50000
I0914 12:19:09.672078 139620410959616 logging_writer.py:48] [52000] global_step=52000, grad_norm=0.29126450419425964, loss=3.4079971313476562
I0914 12:21:57.867535 139618045392640 logging_writer.py:48] [52500] global_step=52500, grad_norm=0.28497904539108276, loss=3.4135935306549072
I0914 12:24:46.141434 139620410959616 logging_writer.py:48] [53000] global_step=53000, grad_norm=0.2890980541706085, loss=3.3926331996917725
I0914 12:25:18.535628 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:25:26.364237 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:25:37.400487 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:25:39.667940 139785753851712 submission_runner.py:376] Time since start: 18660.63s, Step: 53098, {'train/accuracy': 0.6906289458274841, 'train/loss': 1.4674862623214722, 'validation/accuracy': 0.5989800095558167, 'validation/loss': 1.8677852153778076, 'validation/num_examples': 50000, 'test/accuracy': 0.48270002007484436, 'test/loss': 2.5035319328308105, 'test/num_examples': 10000, 'score': 17915.94573879242, 'total_duration': 18660.634664297104, 'accumulated_submission_time': 17915.94573879242, 'accumulated_eval_time': 742.713464975357, 'accumulated_logging_time': 1.1745285987854004}
I0914 12:25:39.696571 139621098813184 logging_writer.py:48] [53098] accumulated_eval_time=742.713465, accumulated_logging_time=1.174529, accumulated_submission_time=17915.945739, global_step=53098, preemption_count=0, score=17915.945739, test/accuracy=0.482700, test/loss=2.503532, test/num_examples=10000, total_duration=18660.634664, train/accuracy=0.690629, train/loss=1.467486, validation/accuracy=0.598980, validation/loss=1.867785, validation/num_examples=50000
I0914 12:27:55.156162 139621107205888 logging_writer.py:48] [53500] global_step=53500, grad_norm=0.2903880178928375, loss=3.4212396144866943
I0914 12:30:43.397672 139621098813184 logging_writer.py:48] [54000] global_step=54000, grad_norm=0.2859453558921814, loss=3.338300943374634
I0914 12:33:31.304433 139621107205888 logging_writer.py:48] [54500] global_step=54500, grad_norm=0.28716540336608887, loss=3.3848648071289062
I0914 12:34:09.739545 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:34:17.555717 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:34:28.689805 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:34:30.940042 139785753851712 submission_runner.py:376] Time since start: 19191.91s, Step: 54616, {'train/accuracy': 0.6725525856018066, 'train/loss': 1.5843250751495361, 'validation/accuracy': 0.6118199825286865, 'validation/loss': 1.859339952468872, 'validation/num_examples': 50000, 'test/accuracy': 0.47780001163482666, 'test/loss': 2.5555808544158936, 'test/num_examples': 10000, 'score': 18425.954810380936, 'total_duration': 19191.906791448593, 'accumulated_submission_time': 18425.954810380936, 'accumulated_eval_time': 763.9139442443848, 'accumulated_logging_time': 1.2138841152191162}
I0914 12:34:30.967104 139620410959616 logging_writer.py:48] [54616] accumulated_eval_time=763.913944, accumulated_logging_time=1.213884, accumulated_submission_time=18425.954810, global_step=54616, preemption_count=0, score=18425.954810, test/accuracy=0.477800, test/loss=2.555581, test/num_examples=10000, total_duration=19191.906791, train/accuracy=0.672553, train/loss=1.584325, validation/accuracy=0.611820, validation/loss=1.859340, validation/num_examples=50000
I0914 12:36:40.470300 139620419352320 logging_writer.py:48] [55000] global_step=55000, grad_norm=0.2885455787181854, loss=3.4199295043945312
I0914 12:39:28.701473 139620410959616 logging_writer.py:48] [55500] global_step=55500, grad_norm=0.2843439280986786, loss=3.3483569622039795
I0914 12:42:16.934671 139620419352320 logging_writer.py:48] [56000] global_step=56000, grad_norm=0.2804867625236511, loss=3.3117964267730713
I0914 12:43:01.096180 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:43:08.891999 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:43:19.882472 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:43:22.166954 139785753851712 submission_runner.py:376] Time since start: 19723.13s, Step: 56133, {'train/accuracy': 0.6829958558082581, 'train/loss': 1.4818432331085205, 'validation/accuracy': 0.6251199841499329, 'validation/loss': 1.7404627799987793, 'validation/num_examples': 50000, 'test/accuracy': 0.5063000321388245, 'test/loss': 2.392498254776001, 'test/num_examples': 10000, 'score': 18936.050762176514, 'total_duration': 19723.13370156288, 'accumulated_submission_time': 18936.050762176514, 'accumulated_eval_time': 784.9846830368042, 'accumulated_logging_time': 1.2514803409576416}
I0914 12:43:22.194325 139618036999936 logging_writer.py:48] [56133] accumulated_eval_time=784.984683, accumulated_logging_time=1.251480, accumulated_submission_time=18936.050762, global_step=56133, preemption_count=0, score=18936.050762, test/accuracy=0.506300, test/loss=2.392498, test/num_examples=10000, total_duration=19723.133702, train/accuracy=0.682996, train/loss=1.481843, validation/accuracy=0.625120, validation/loss=1.740463, validation/num_examples=50000
I0914 12:45:25.954758 139618045392640 logging_writer.py:48] [56500] global_step=56500, grad_norm=0.28177163004875183, loss=3.2978594303131104
I0914 12:48:13.941949 139618036999936 logging_writer.py:48] [57000] global_step=57000, grad_norm=0.2938775420188904, loss=3.336643695831299
I0914 12:51:02.182384 139618045392640 logging_writer.py:48] [57500] global_step=57500, grad_norm=0.28899526596069336, loss=3.2880070209503174
I0914 12:51:52.388687 139785753851712 spec.py:320] Evaluating on the training split.
I0914 12:52:00.224894 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 12:52:11.284370 139785753851712 spec.py:348] Evaluating on the test split.
I0914 12:52:13.527475 139785753851712 submission_runner.py:376] Time since start: 20254.49s, Step: 57651, {'train/accuracy': 0.6757014989852905, 'train/loss': 1.5783743858337402, 'validation/accuracy': 0.619879961013794, 'validation/loss': 1.8317604064941406, 'validation/num_examples': 50000, 'test/accuracy': 0.49640002846717834, 'test/loss': 2.4753739833831787, 'test/num_examples': 10000, 'score': 19446.212296009064, 'total_duration': 20254.494203805923, 'accumulated_submission_time': 19446.212296009064, 'accumulated_eval_time': 806.1234366893768, 'accumulated_logging_time': 1.2888352870941162}
I0914 12:52:13.550995 139618036999936 logging_writer.py:48] [57651] accumulated_eval_time=806.123437, accumulated_logging_time=1.288835, accumulated_submission_time=19446.212296, global_step=57651, preemption_count=0, score=19446.212296, test/accuracy=0.496400, test/loss=2.475374, test/num_examples=10000, total_duration=20254.494204, train/accuracy=0.675701, train/loss=1.578374, validation/accuracy=0.619880, validation/loss=1.831760, validation/num_examples=50000
I0914 12:54:11.198793 139618045392640 logging_writer.py:48] [58000] global_step=58000, grad_norm=0.2972732484340668, loss=3.3535263538360596
I0914 12:56:59.446379 139618036999936 logging_writer.py:48] [58500] global_step=58500, grad_norm=0.2926310896873474, loss=3.381610631942749
I0914 12:59:47.653911 139618045392640 logging_writer.py:48] [59000] global_step=59000, grad_norm=0.2968601882457733, loss=3.4223320484161377
I0914 13:00:43.613631 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:00:51.421866 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:01:02.407345 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:01:04.662746 139785753851712 submission_runner.py:376] Time since start: 20785.63s, Step: 59168, {'train/accuracy': 0.6541573405265808, 'train/loss': 1.6419929265975952, 'validation/accuracy': 0.6013599634170532, 'validation/loss': 1.8765325546264648, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5387802124023438, 'test/num_examples': 10000, 'score': 19956.23943376541, 'total_duration': 20785.629487276077, 'accumulated_submission_time': 19956.23943376541, 'accumulated_eval_time': 827.1725206375122, 'accumulated_logging_time': 1.3252980709075928}
I0914 13:01:04.686016 139620410959616 logging_writer.py:48] [59168] accumulated_eval_time=827.172521, accumulated_logging_time=1.325298, accumulated_submission_time=19956.239434, global_step=59168, preemption_count=0, score=19956.239434, test/accuracy=0.480600, test/loss=2.538780, test/num_examples=10000, total_duration=20785.629487, train/accuracy=0.654157, train/loss=1.641993, validation/accuracy=0.601360, validation/loss=1.876533, validation/num_examples=50000
I0914 13:02:56.666364 139620419352320 logging_writer.py:48] [59500] global_step=59500, grad_norm=0.2902616560459137, loss=3.3645551204681396
I0914 13:05:44.877468 139620410959616 logging_writer.py:48] [60000] global_step=60000, grad_norm=0.2877161502838135, loss=3.365004777908325
I0914 13:08:33.128105 139620419352320 logging_writer.py:48] [60500] global_step=60500, grad_norm=0.2932965159416199, loss=3.345557928085327
I0914 13:09:34.795152 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:09:42.586314 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:09:53.636936 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:09:55.923867 139785753851712 submission_runner.py:376] Time since start: 21316.89s, Step: 60685, {'train/accuracy': 0.6724131107330322, 'train/loss': 1.6111648082733154, 'validation/accuracy': 0.6247599720954895, 'validation/loss': 1.8165513277053833, 'validation/num_examples': 50000, 'test/accuracy': 0.49490001797676086, 'test/loss': 2.476473093032837, 'test/num_examples': 10000, 'score': 20466.31075167656, 'total_duration': 21316.890612602234, 'accumulated_submission_time': 20466.31075167656, 'accumulated_eval_time': 848.3012022972107, 'accumulated_logging_time': 1.362497329711914}
I0914 13:09:55.951478 139618036999936 logging_writer.py:48] [60685] accumulated_eval_time=848.301202, accumulated_logging_time=1.362497, accumulated_submission_time=20466.310752, global_step=60685, preemption_count=0, score=20466.310752, test/accuracy=0.494900, test/loss=2.476473, test/num_examples=10000, total_duration=21316.890613, train/accuracy=0.672413, train/loss=1.611165, validation/accuracy=0.624760, validation/loss=1.816551, validation/num_examples=50000
I0914 13:11:42.044412 139618045392640 logging_writer.py:48] [61000] global_step=61000, grad_norm=0.2919887602329254, loss=3.329354763031006
I0914 13:14:30.159597 139618036999936 logging_writer.py:48] [61500] global_step=61500, grad_norm=0.29778042435646057, loss=3.3058791160583496
I0914 13:17:18.386161 139618045392640 logging_writer.py:48] [62000] global_step=62000, grad_norm=0.2837899625301361, loss=3.382122755050659
I0914 13:18:26.086736 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:18:33.855817 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:18:45.015492 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:18:47.231468 139785753851712 submission_runner.py:376] Time since start: 21848.20s, Step: 62203, {'train/accuracy': 0.6910076141357422, 'train/loss': 1.4848798513412476, 'validation/accuracy': 0.611739993095398, 'validation/loss': 1.8428031206130981, 'validation/num_examples': 50000, 'test/accuracy': 0.4788000285625458, 'test/loss': 2.5234971046447754, 'test/num_examples': 10000, 'score': 20976.412611722946, 'total_duration': 21848.198214292526, 'accumulated_submission_time': 20976.412611722946, 'accumulated_eval_time': 869.4459004402161, 'accumulated_logging_time': 1.4001359939575195}
I0914 13:18:47.254843 139618036999936 logging_writer.py:48] [62203] accumulated_eval_time=869.445900, accumulated_logging_time=1.400136, accumulated_submission_time=20976.412612, global_step=62203, preemption_count=0, score=20976.412612, test/accuracy=0.478800, test/loss=2.523497, test/num_examples=10000, total_duration=21848.198214, train/accuracy=0.691008, train/loss=1.484880, validation/accuracy=0.611740, validation/loss=1.842803, validation/num_examples=50000
I0914 13:20:27.308857 139620419352320 logging_writer.py:48] [62500] global_step=62500, grad_norm=0.30506154894828796, loss=3.297473669052124
I0914 13:23:15.523633 139618036999936 logging_writer.py:48] [63000] global_step=63000, grad_norm=0.3115837275981903, loss=3.338627815246582
I0914 13:26:03.682383 139620419352320 logging_writer.py:48] [63500] global_step=63500, grad_norm=0.31050509214401245, loss=3.2706823348999023
I0914 13:27:17.456019 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:27:25.201742 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:27:36.333347 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:27:38.580904 139785753851712 submission_runner.py:376] Time since start: 22379.55s, Step: 63721, {'train/accuracy': 0.6920240521430969, 'train/loss': 1.4649821519851685, 'validation/accuracy': 0.6265400052070618, 'validation/loss': 1.7699750661849976, 'validation/num_examples': 50000, 'test/accuracy': 0.5057000517845154, 'test/loss': 2.418595314025879, 'test/num_examples': 10000, 'score': 21486.580602407455, 'total_duration': 22379.547651052475, 'accumulated_submission_time': 21486.580602407455, 'accumulated_eval_time': 890.5707561969757, 'accumulated_logging_time': 1.433117389678955}
I0914 13:27:38.603337 139621090420480 logging_writer.py:48] [63721] accumulated_eval_time=890.570756, accumulated_logging_time=1.433117, accumulated_submission_time=21486.580602, global_step=63721, preemption_count=0, score=21486.580602, test/accuracy=0.505700, test/loss=2.418595, test/num_examples=10000, total_duration=22379.547651, train/accuracy=0.692024, train/loss=1.464982, validation/accuracy=0.626540, validation/loss=1.769975, validation/num_examples=50000
I0914 13:29:12.660554 139621098813184 logging_writer.py:48] [64000] global_step=64000, grad_norm=0.29441016912460327, loss=3.2818336486816406
I0914 13:32:00.863299 139621090420480 logging_writer.py:48] [64500] global_step=64500, grad_norm=0.2864547669887543, loss=3.258450508117676
I0914 13:34:49.087769 139621098813184 logging_writer.py:48] [65000] global_step=65000, grad_norm=0.29472991824150085, loss=3.2548041343688965
I0914 13:36:08.911741 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:36:16.670967 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:36:27.876635 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:36:30.129736 139785753851712 submission_runner.py:376] Time since start: 22911.10s, Step: 65239, {'train/accuracy': 0.6842514276504517, 'train/loss': 1.510816216468811, 'validation/accuracy': 0.6255399584770203, 'validation/loss': 1.7839473485946655, 'validation/num_examples': 50000, 'test/accuracy': 0.4951000213623047, 'test/loss': 2.4557323455810547, 'test/num_examples': 10000, 'score': 21996.856678962708, 'total_duration': 22911.09648013115, 'accumulated_submission_time': 21996.856678962708, 'accumulated_eval_time': 911.7887194156647, 'accumulated_logging_time': 1.464731216430664}
I0914 13:36:30.153545 139620419352320 logging_writer.py:48] [65239] accumulated_eval_time=911.788719, accumulated_logging_time=1.464731, accumulated_submission_time=21996.856679, global_step=65239, preemption_count=0, score=21996.856679, test/accuracy=0.495100, test/loss=2.455732, test/num_examples=10000, total_duration=22911.096480, train/accuracy=0.684251, train/loss=1.510816, validation/accuracy=0.625540, validation/loss=1.783947, validation/num_examples=50000
I0914 13:37:58.123054 139621082027776 logging_writer.py:48] [65500] global_step=65500, grad_norm=0.30739957094192505, loss=3.3116114139556885
I0914 13:40:46.180480 139620419352320 logging_writer.py:48] [66000] global_step=66000, grad_norm=0.3004262447357178, loss=3.3701276779174805
I0914 13:43:34.421802 139621082027776 logging_writer.py:48] [66500] global_step=66500, grad_norm=0.2919093072414398, loss=3.2609100341796875
I0914 13:45:00.284182 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:45:08.062767 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:45:19.342478 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:45:21.922167 139785753851712 submission_runner.py:376] Time since start: 23442.89s, Step: 66757, {'train/accuracy': 0.6906688213348389, 'train/loss': 1.5202724933624268, 'validation/accuracy': 0.6343199610710144, 'validation/loss': 1.7723238468170166, 'validation/num_examples': 50000, 'test/accuracy': 0.5078000426292419, 'test/loss': 2.4403321743011475, 'test/num_examples': 10000, 'score': 22506.95446920395, 'total_duration': 23442.88888812065, 'accumulated_submission_time': 22506.95446920395, 'accumulated_eval_time': 933.426650762558, 'accumulated_logging_time': 1.4981484413146973}
I0914 13:45:21.947076 139618036999936 logging_writer.py:48] [66757] accumulated_eval_time=933.426651, accumulated_logging_time=1.498148, accumulated_submission_time=22506.954469, global_step=66757, preemption_count=0, score=22506.954469, test/accuracy=0.507800, test/loss=2.440332, test/num_examples=10000, total_duration=23442.888888, train/accuracy=0.690669, train/loss=1.520272, validation/accuracy=0.634320, validation/loss=1.772324, validation/num_examples=50000
I0914 13:46:43.979573 139618045392640 logging_writer.py:48] [67000] global_step=67000, grad_norm=0.2981511354446411, loss=3.263331413269043
I0914 13:49:32.145080 139618036999936 logging_writer.py:48] [67500] global_step=67500, grad_norm=0.3048495054244995, loss=3.302417278289795
I0914 13:52:20.352500 139618045392640 logging_writer.py:48] [68000] global_step=68000, grad_norm=0.3056058883666992, loss=3.3564062118530273
I0914 13:53:52.119890 139785753851712 spec.py:320] Evaluating on the training split.
I0914 13:53:59.863528 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 13:54:11.162717 139785753851712 spec.py:348] Evaluating on the test split.
I0914 13:54:13.443134 139785753851712 submission_runner.py:376] Time since start: 23974.41s, Step: 68275, {'train/accuracy': 0.6909080147743225, 'train/loss': 1.470110535621643, 'validation/accuracy': 0.6406999826431274, 'validation/loss': 1.7015717029571533, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3714752197265625, 'test/num_examples': 10000, 'score': 23017.093948602676, 'total_duration': 23974.409868717194, 'accumulated_submission_time': 23017.093948602676, 'accumulated_eval_time': 954.7498555183411, 'accumulated_logging_time': 1.5329856872558594}
I0914 13:54:13.467223 139618045392640 logging_writer.py:48] [68275] accumulated_eval_time=954.749856, accumulated_logging_time=1.532986, accumulated_submission_time=23017.093949, global_step=68275, preemption_count=0, score=23017.093949, test/accuracy=0.515400, test/loss=2.371475, test/num_examples=10000, total_duration=23974.409869, train/accuracy=0.690908, train/loss=1.470111, validation/accuracy=0.640700, validation/loss=1.701572, validation/num_examples=50000
I0914 13:55:29.413370 139620419352320 logging_writer.py:48] [68500] global_step=68500, grad_norm=0.30467408895492554, loss=3.2854578495025635
I0914 13:58:17.648288 139618045392640 logging_writer.py:48] [69000] global_step=69000, grad_norm=0.3096533715724945, loss=3.2790355682373047
I0914 14:01:05.889391 139620419352320 logging_writer.py:48] [69500] global_step=69500, grad_norm=0.3090498149394989, loss=3.3411953449249268
I0914 14:02:43.574101 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:02:51.481779 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:03:02.490966 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:03:04.750303 139785753851712 submission_runner.py:376] Time since start: 24505.72s, Step: 69792, {'train/accuracy': 0.6947743892669678, 'train/loss': 1.4690238237380981, 'validation/accuracy': 0.6391599774360657, 'validation/loss': 1.700404167175293, 'validation/num_examples': 50000, 'test/accuracy': 0.5162000060081482, 'test/loss': 2.340641736984253, 'test/num_examples': 10000, 'score': 23527.16760492325, 'total_duration': 24505.717046260834, 'accumulated_submission_time': 23527.16760492325, 'accumulated_eval_time': 975.9260275363922, 'accumulated_logging_time': 1.5669758319854736}
I0914 14:03:04.773891 139620410959616 logging_writer.py:48] [69792] accumulated_eval_time=975.926028, accumulated_logging_time=1.566976, accumulated_submission_time=23527.167605, global_step=69792, preemption_count=0, score=23527.167605, test/accuracy=0.516200, test/loss=2.340642, test/num_examples=10000, total_duration=24505.717046, train/accuracy=0.694774, train/loss=1.469024, validation/accuracy=0.639160, validation/loss=1.700404, validation/num_examples=50000
I0914 14:04:15.037125 139621090420480 logging_writer.py:48] [70000] global_step=70000, grad_norm=0.3104066550731659, loss=3.3210484981536865
I0914 14:07:02.985107 139620410959616 logging_writer.py:48] [70500] global_step=70500, grad_norm=0.31107574701309204, loss=3.2496469020843506
I0914 14:09:51.206982 139621090420480 logging_writer.py:48] [71000] global_step=71000, grad_norm=0.3089660704135895, loss=3.335127353668213
I0914 14:11:34.908772 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:11:42.613535 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:11:53.646508 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:11:55.885930 139785753851712 submission_runner.py:376] Time since start: 25036.85s, Step: 71310, {'train/accuracy': 0.6975247263908386, 'train/loss': 1.482591152191162, 'validation/accuracy': 0.6232199668884277, 'validation/loss': 1.8254543542861938, 'validation/num_examples': 50000, 'test/accuracy': 0.4918000102043152, 'test/loss': 2.5100009441375732, 'test/num_examples': 10000, 'score': 24037.270278930664, 'total_duration': 25036.85267686844, 'accumulated_submission_time': 24037.270278930664, 'accumulated_eval_time': 996.903163433075, 'accumulated_logging_time': 1.599836826324463}
I0914 14:11:55.910848 139618028607232 logging_writer.py:48] [71310] accumulated_eval_time=996.903163, accumulated_logging_time=1.599837, accumulated_submission_time=24037.270279, global_step=71310, preemption_count=0, score=24037.270279, test/accuracy=0.491800, test/loss=2.510001, test/num_examples=10000, total_duration=25036.852677, train/accuracy=0.697525, train/loss=1.482591, validation/accuracy=0.623220, validation/loss=1.825454, validation/num_examples=50000
I0914 14:13:00.034183 139618036999936 logging_writer.py:48] [71500] global_step=71500, grad_norm=0.2995145320892334, loss=3.251258373260498
I0914 14:15:47.818787 139618028607232 logging_writer.py:48] [72000] global_step=72000, grad_norm=0.31369438767433167, loss=3.2654480934143066
I0914 14:18:35.882528 139618036999936 logging_writer.py:48] [72500] global_step=72500, grad_norm=0.30754441022872925, loss=3.236921787261963
I0914 14:20:26.003981 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:20:33.833399 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:20:44.757493 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:20:47.036332 139785753851712 submission_runner.py:376] Time since start: 25568.00s, Step: 72829, {'train/accuracy': 0.7215601205825806, 'train/loss': 1.327860713005066, 'validation/accuracy': 0.6525799632072449, 'validation/loss': 1.6442086696624756, 'validation/num_examples': 50000, 'test/accuracy': 0.5234000086784363, 'test/loss': 2.286708116531372, 'test/num_examples': 10000, 'score': 24547.331042289734, 'total_duration': 25568.003078222275, 'accumulated_submission_time': 24547.331042289734, 'accumulated_eval_time': 1017.9354872703552, 'accumulated_logging_time': 1.6340680122375488}
I0914 14:20:47.064872 139618028607232 logging_writer.py:48] [72829] accumulated_eval_time=1017.935487, accumulated_logging_time=1.634068, accumulated_submission_time=24547.331042, global_step=72829, preemption_count=0, score=24547.331042, test/accuracy=0.523400, test/loss=2.286708, test/num_examples=10000, total_duration=25568.003078, train/accuracy=0.721560, train/loss=1.327861, validation/accuracy=0.652580, validation/loss=1.644209, validation/num_examples=50000
I0914 14:21:44.860488 139621082027776 logging_writer.py:48] [73000] global_step=73000, grad_norm=0.3169178366661072, loss=3.2606658935546875
I0914 14:24:32.812964 139618028607232 logging_writer.py:48] [73500] global_step=73500, grad_norm=0.3276415467262268, loss=3.2667770385742188
I0914 14:27:21.031353 139621082027776 logging_writer.py:48] [74000] global_step=74000, grad_norm=0.3188159465789795, loss=3.289651393890381
I0914 14:29:17.197638 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:29:24.919732 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:29:36.035775 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:29:38.325842 139785753851712 submission_runner.py:376] Time since start: 26099.29s, Step: 74347, {'train/accuracy': 0.6916055083274841, 'train/loss': 1.4878246784210205, 'validation/accuracy': 0.6294199824333191, 'validation/loss': 1.7609314918518066, 'validation/num_examples': 50000, 'test/accuracy': 0.5049999952316284, 'test/loss': 2.4299111366271973, 'test/num_examples': 10000, 'score': 25057.428204774857, 'total_duration': 26099.292588233948, 'accumulated_submission_time': 25057.428204774857, 'accumulated_eval_time': 1039.0636780261993, 'accumulated_logging_time': 1.675804615020752}
I0914 14:29:38.350225 139618028607232 logging_writer.py:48] [74347] accumulated_eval_time=1039.063678, accumulated_logging_time=1.675805, accumulated_submission_time=25057.428205, global_step=74347, preemption_count=0, score=25057.428205, test/accuracy=0.505000, test/loss=2.429911, test/num_examples=10000, total_duration=26099.292588, train/accuracy=0.691606, train/loss=1.487825, validation/accuracy=0.629420, validation/loss=1.760931, validation/num_examples=50000
I0914 14:30:30.036252 139618036999936 logging_writer.py:48] [74500] global_step=74500, grad_norm=0.29929229617118835, loss=3.231595039367676
I0914 14:33:18.128115 139618028607232 logging_writer.py:48] [75000] global_step=75000, grad_norm=0.30234041810035706, loss=3.156550407409668
I0914 14:36:06.326229 139618036999936 logging_writer.py:48] [75500] global_step=75500, grad_norm=0.32938486337661743, loss=3.363729953765869
I0914 14:38:08.577289 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:38:16.598191 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:38:27.694827 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:38:29.926154 139785753851712 submission_runner.py:376] Time since start: 26630.89s, Step: 75865, {'train/accuracy': 0.7079480290412903, 'train/loss': 1.436248540878296, 'validation/accuracy': 0.6432799696922302, 'validation/loss': 1.7123216390609741, 'validation/num_examples': 50000, 'test/accuracy': 0.5208000540733337, 'test/loss': 2.3495261669158936, 'test/num_examples': 10000, 'score': 25567.620626449585, 'total_duration': 26630.89289879799, 'accumulated_submission_time': 25567.620626449585, 'accumulated_eval_time': 1060.4125316143036, 'accumulated_logging_time': 1.7115974426269531}
I0914 14:38:29.953331 139620419352320 logging_writer.py:48] [75865] accumulated_eval_time=1060.412532, accumulated_logging_time=1.711597, accumulated_submission_time=25567.620626, global_step=75865, preemption_count=0, score=25567.620626, test/accuracy=0.520800, test/loss=2.349526, test/num_examples=10000, total_duration=26630.892899, train/accuracy=0.707948, train/loss=1.436249, validation/accuracy=0.643280, validation/loss=1.712322, validation/num_examples=50000
I0914 14:39:15.698670 139621082027776 logging_writer.py:48] [76000] global_step=76000, grad_norm=0.312216192483902, loss=3.186450719833374
I0914 14:42:03.896099 139620419352320 logging_writer.py:48] [76500] global_step=76500, grad_norm=0.3170742392539978, loss=3.1856775283813477
I0914 14:44:51.996706 139621082027776 logging_writer.py:48] [77000] global_step=77000, grad_norm=0.31434234976768494, loss=3.184199810028076
I0914 14:46:59.946507 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:47:07.696892 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:47:18.790356 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:47:21.105739 139785753851712 submission_runner.py:376] Time since start: 27162.07s, Step: 77382, {'train/accuracy': 0.6989995241165161, 'train/loss': 1.422307014465332, 'validation/accuracy': 0.642579972743988, 'validation/loss': 1.6762522459030151, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3255693912506104, 'test/num_examples': 10000, 'score': 26077.580560445786, 'total_duration': 27162.072484493256, 'accumulated_submission_time': 26077.580560445786, 'accumulated_eval_time': 1081.571738243103, 'accumulated_logging_time': 1.748826503753662}
I0914 14:47:21.126562 139618028607232 logging_writer.py:48] [77382] accumulated_eval_time=1081.571738, accumulated_logging_time=1.748827, accumulated_submission_time=26077.580560, global_step=77382, preemption_count=0, score=26077.580560, test/accuracy=0.515400, test/loss=2.325569, test/num_examples=10000, total_duration=27162.072484, train/accuracy=0.699000, train/loss=1.422307, validation/accuracy=0.642580, validation/loss=1.676252, validation/num_examples=50000
I0914 14:48:01.154166 139618036999936 logging_writer.py:48] [77500] global_step=77500, grad_norm=0.31988903880119324, loss=3.2403721809387207
I0914 14:50:49.111113 139618028607232 logging_writer.py:48] [78000] global_step=78000, grad_norm=0.31698429584503174, loss=3.1237356662750244
I0914 14:53:37.223134 139618036999936 logging_writer.py:48] [78500] global_step=78500, grad_norm=0.3259633481502533, loss=3.25683856010437
I0914 14:55:51.240862 139785753851712 spec.py:320] Evaluating on the training split.
I0914 14:55:58.930301 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 14:56:10.042633 139785753851712 spec.py:348] Evaluating on the test split.
I0914 14:56:12.309500 139785753851712 submission_runner.py:376] Time since start: 27693.28s, Step: 78900, {'train/accuracy': 0.7405332922935486, 'train/loss': 1.2673977613449097, 'validation/accuracy': 0.6460199952125549, 'validation/loss': 1.6637077331542969, 'validation/num_examples': 50000, 'test/accuracy': 0.5200999975204468, 'test/loss': 2.316652536392212, 'test/num_examples': 10000, 'score': 26587.66230893135, 'total_duration': 27693.276193618774, 'accumulated_submission_time': 26587.66230893135, 'accumulated_eval_time': 1102.640297651291, 'accumulated_logging_time': 1.7789013385772705}
I0914 14:56:12.334219 139621090420480 logging_writer.py:48] [78900] accumulated_eval_time=1102.640298, accumulated_logging_time=1.778901, accumulated_submission_time=26587.662309, global_step=78900, preemption_count=0, score=26587.662309, test/accuracy=0.520100, test/loss=2.316653, test/num_examples=10000, total_duration=27693.276194, train/accuracy=0.740533, train/loss=1.267398, validation/accuracy=0.646020, validation/loss=1.663708, validation/num_examples=50000
I0914 14:56:46.298706 139621098813184 logging_writer.py:48] [79000] global_step=79000, grad_norm=0.3292911946773529, loss=3.169358015060425
I0914 14:59:34.547732 139621090420480 logging_writer.py:48] [79500] global_step=79500, grad_norm=0.329366534948349, loss=3.227858304977417
I0914 15:02:22.801796 139621098813184 logging_writer.py:48] [80000] global_step=80000, grad_norm=0.3188716173171997, loss=3.21577787399292
I0914 15:04:42.494943 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:04:50.231371 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:05:01.274785 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:05:03.605396 139785753851712 submission_runner.py:376] Time since start: 28224.57s, Step: 80417, {'train/accuracy': 0.7322424650192261, 'train/loss': 1.2937663793563843, 'validation/accuracy': 0.6529799699783325, 'validation/loss': 1.6502137184143066, 'validation/num_examples': 50000, 'test/accuracy': 0.527999997138977, 'test/loss': 2.2828421592712402, 'test/num_examples': 10000, 'score': 27097.790013074875, 'total_duration': 28224.572131872177, 'accumulated_submission_time': 27097.790013074875, 'accumulated_eval_time': 1123.7507123947144, 'accumulated_logging_time': 1.813563585281372}
I0914 15:05:03.629084 139618028607232 logging_writer.py:48] [80417] accumulated_eval_time=1123.750712, accumulated_logging_time=1.813564, accumulated_submission_time=27097.790013, global_step=80417, preemption_count=0, score=27097.790013, test/accuracy=0.528000, test/loss=2.282842, test/num_examples=10000, total_duration=28224.572132, train/accuracy=0.732242, train/loss=1.293766, validation/accuracy=0.652980, validation/loss=1.650214, validation/num_examples=50000
I0914 15:05:31.847938 139618045392640 logging_writer.py:48] [80500] global_step=80500, grad_norm=0.34364253282546997, loss=3.3060929775238037
I0914 15:08:19.974839 139618028607232 logging_writer.py:48] [81000] global_step=81000, grad_norm=0.3189429044723511, loss=3.134965419769287
I0914 15:11:07.884455 139618045392640 logging_writer.py:48] [81500] global_step=81500, grad_norm=0.3358222246170044, loss=3.146552085876465
I0914 15:13:33.672224 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:13:41.402096 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:13:52.418636 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:13:54.673453 139785753851712 submission_runner.py:376] Time since start: 28755.64s, Step: 81935, {'train/accuracy': 0.7216796875, 'train/loss': 1.3296536207199097, 'validation/accuracy': 0.6556800007820129, 'validation/loss': 1.635520577430725, 'validation/num_examples': 50000, 'test/accuracy': 0.5260000228881836, 'test/loss': 2.306215286254883, 'test/num_examples': 10000, 'score': 27607.79568052292, 'total_duration': 28755.640197753906, 'accumulated_submission_time': 27607.79568052292, 'accumulated_eval_time': 1144.751916885376, 'accumulated_logging_time': 1.8511121273040771}
I0914 15:13:54.701997 139621090420480 logging_writer.py:48] [81935] accumulated_eval_time=1144.751917, accumulated_logging_time=1.851112, accumulated_submission_time=27607.795681, global_step=81935, preemption_count=0, score=27607.795681, test/accuracy=0.526000, test/loss=2.306215, test/num_examples=10000, total_duration=28755.640198, train/accuracy=0.721680, train/loss=1.329654, validation/accuracy=0.655680, validation/loss=1.635521, validation/num_examples=50000
I0914 15:14:16.905740 139621098813184 logging_writer.py:48] [82000] global_step=82000, grad_norm=0.32502761483192444, loss=3.2584471702575684
I0914 15:17:05.131891 139621090420480 logging_writer.py:48] [82500] global_step=82500, grad_norm=0.33060410618782043, loss=3.2520453929901123
I0914 15:19:53.375931 139621098813184 logging_writer.py:48] [83000] global_step=83000, grad_norm=0.325259804725647, loss=3.173687219619751
I0914 15:22:24.842192 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:22:32.534203 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:22:43.610139 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:22:45.856654 139785753851712 submission_runner.py:376] Time since start: 29286.82s, Step: 83452, {'train/accuracy': 0.7293726205825806, 'train/loss': 1.2778840065002441, 'validation/accuracy': 0.664139986038208, 'validation/loss': 1.5573835372924805, 'validation/num_examples': 50000, 'test/accuracy': 0.534500002861023, 'test/loss': 2.211136817932129, 'test/num_examples': 10000, 'score': 28117.902702093124, 'total_duration': 29286.823399305344, 'accumulated_submission_time': 28117.902702093124, 'accumulated_eval_time': 1165.7663543224335, 'accumulated_logging_time': 1.8896336555480957}
I0914 15:22:45.884523 139618036999936 logging_writer.py:48] [83452] accumulated_eval_time=1165.766354, accumulated_logging_time=1.889634, accumulated_submission_time=28117.902702, global_step=83452, preemption_count=0, score=28117.902702, test/accuracy=0.534500, test/loss=2.211137, test/num_examples=10000, total_duration=29286.823399, train/accuracy=0.729373, train/loss=1.277884, validation/accuracy=0.664140, validation/loss=1.557384, validation/num_examples=50000
I0914 15:23:02.359415 139618045392640 logging_writer.py:48] [83500] global_step=83500, grad_norm=0.3362581431865692, loss=3.143125057220459
I0914 15:25:50.559870 139618036999936 logging_writer.py:48] [84000] global_step=84000, grad_norm=0.3290119767189026, loss=3.218841552734375
I0914 15:28:38.785134 139618045392640 logging_writer.py:48] [84500] global_step=84500, grad_norm=0.3364306390285492, loss=3.2182228565216064
I0914 15:31:15.978195 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:31:23.674424 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:31:34.592998 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:31:36.879657 139785753851712 submission_runner.py:376] Time since start: 29817.85s, Step: 84969, {'train/accuracy': 0.7102997303009033, 'train/loss': 1.3722602128982544, 'validation/accuracy': 0.6526600122451782, 'validation/loss': 1.63387930393219, 'validation/num_examples': 50000, 'test/accuracy': 0.5289000272750854, 'test/loss': 2.2834620475769043, 'test/num_examples': 10000, 'score': 28627.961901664734, 'total_duration': 29817.84640312195, 'accumulated_submission_time': 28627.961901664734, 'accumulated_eval_time': 1186.6677963733673, 'accumulated_logging_time': 1.9287693500518799}
I0914 15:31:36.904275 139618045392640 logging_writer.py:48] [84969] accumulated_eval_time=1186.667796, accumulated_logging_time=1.928769, accumulated_submission_time=28627.961902, global_step=84969, preemption_count=0, score=28627.961902, test/accuracy=0.528900, test/loss=2.283462, test/num_examples=10000, total_duration=29817.846403, train/accuracy=0.710300, train/loss=1.372260, validation/accuracy=0.652660, validation/loss=1.633879, validation/num_examples=50000
I0914 15:31:47.661670 139621090420480 logging_writer.py:48] [85000] global_step=85000, grad_norm=0.33524322509765625, loss=3.2073211669921875
I0914 15:34:35.556128 139618045392640 logging_writer.py:48] [85500] global_step=85500, grad_norm=0.344351589679718, loss=3.215855360031128
I0914 15:37:23.726473 139621090420480 logging_writer.py:48] [86000] global_step=86000, grad_norm=0.34118911623954773, loss=3.184333324432373
I0914 15:40:06.981885 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:40:14.750491 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:40:25.937859 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:40:28.169976 139785753851712 submission_runner.py:376] Time since start: 30349.14s, Step: 86487, {'train/accuracy': 0.7192083597183228, 'train/loss': 1.318070411682129, 'validation/accuracy': 0.6582799553871155, 'validation/loss': 1.5826328992843628, 'validation/num_examples': 50000, 'test/accuracy': 0.5327000021934509, 'test/loss': 2.2320618629455566, 'test/num_examples': 10000, 'score': 29138.005053281784, 'total_duration': 30349.136724233627, 'accumulated_submission_time': 29138.005053281784, 'accumulated_eval_time': 1207.8558654785156, 'accumulated_logging_time': 1.964245080947876}
I0914 15:40:28.195220 139620419352320 logging_writer.py:48] [86487] accumulated_eval_time=1207.855865, accumulated_logging_time=1.964245, accumulated_submission_time=29138.005053, global_step=86487, preemption_count=0, score=29138.005053, test/accuracy=0.532700, test/loss=2.232062, test/num_examples=10000, total_duration=30349.136724, train/accuracy=0.719208, train/loss=1.318070, validation/accuracy=0.658280, validation/loss=1.582633, validation/num_examples=50000
I0914 15:40:32.916276 139621082027776 logging_writer.py:48] [86500] global_step=86500, grad_norm=0.33634471893310547, loss=3.165295124053955
I0914 15:43:21.105581 139620419352320 logging_writer.py:48] [87000] global_step=87000, grad_norm=0.3431008756160736, loss=3.158895969390869
I0914 15:46:09.041963 139621082027776 logging_writer.py:48] [87500] global_step=87500, grad_norm=0.33978071808815, loss=3.1531012058258057
I0914 15:48:57.184348 139620419352320 logging_writer.py:48] [88000] global_step=88000, grad_norm=0.33860036730766296, loss=3.157572031021118
I0914 15:48:58.283360 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:49:05.941977 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:49:17.018113 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:49:19.268614 139785753851712 submission_runner.py:376] Time since start: 30880.24s, Step: 88005, {'train/accuracy': 0.7746930718421936, 'train/loss': 1.0886952877044678, 'validation/accuracy': 0.6663599610328674, 'validation/loss': 1.5537863969802856, 'validation/num_examples': 50000, 'test/accuracy': 0.5420000553131104, 'test/loss': 2.211760997772217, 'test/num_examples': 10000, 'score': 29648.06053853035, 'total_duration': 30880.23536133766, 'accumulated_submission_time': 29648.06053853035, 'accumulated_eval_time': 1228.8410770893097, 'accumulated_logging_time': 1.9997587203979492}
I0914 15:49:19.292289 139618036999936 logging_writer.py:48] [88005] accumulated_eval_time=1228.841077, accumulated_logging_time=1.999759, accumulated_submission_time=29648.060539, global_step=88005, preemption_count=0, score=29648.060539, test/accuracy=0.542000, test/loss=2.211761, test/num_examples=10000, total_duration=30880.235361, train/accuracy=0.774693, train/loss=1.088695, validation/accuracy=0.666360, validation/loss=1.553786, validation/num_examples=50000
I0914 15:52:06.000102 139618045392640 logging_writer.py:48] [88500] global_step=88500, grad_norm=0.3258196711540222, loss=3.0706851482391357
I0914 15:54:54.075734 139618036999936 logging_writer.py:48] [89000] global_step=89000, grad_norm=0.3466311991214752, loss=3.194755792617798
I0914 15:57:42.013890 139618045392640 logging_writer.py:48] [89500] global_step=89500, grad_norm=0.3438502848148346, loss=3.107487678527832
I0914 15:57:49.511776 139785753851712 spec.py:320] Evaluating on the training split.
I0914 15:57:57.318909 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 15:58:08.388863 139785753851712 spec.py:348] Evaluating on the test split.
I0914 15:58:10.616803 139785753851712 submission_runner.py:376] Time since start: 31411.58s, Step: 89524, {'train/accuracy': 0.7538663744926453, 'train/loss': 1.2191698551177979, 'validation/accuracy': 0.6728999614715576, 'validation/loss': 1.5614254474639893, 'validation/num_examples': 50000, 'test/accuracy': 0.5437000393867493, 'test/loss': 2.203444719314575, 'test/num_examples': 10000, 'score': 30158.248270750046, 'total_duration': 31411.583546876907, 'accumulated_submission_time': 30158.248270750046, 'accumulated_eval_time': 1249.946064710617, 'accumulated_logging_time': 2.0324220657348633}
I0914 15:58:10.640623 139621090420480 logging_writer.py:48] [89524] accumulated_eval_time=1249.946065, accumulated_logging_time=2.032422, accumulated_submission_time=30158.248271, global_step=89524, preemption_count=0, score=30158.248271, test/accuracy=0.543700, test/loss=2.203445, test/num_examples=10000, total_duration=31411.583547, train/accuracy=0.753866, train/loss=1.219170, validation/accuracy=0.672900, validation/loss=1.561425, validation/num_examples=50000
I0914 16:00:50.736585 139621098813184 logging_writer.py:48] [90000] global_step=90000, grad_norm=0.3367817997932434, loss=3.0911600589752197
I0914 16:03:38.763216 139621090420480 logging_writer.py:48] [90500] global_step=90500, grad_norm=0.3441648781299591, loss=3.13027024269104
I0914 16:06:26.761715 139621098813184 logging_writer.py:48] [91000] global_step=91000, grad_norm=0.3405533730983734, loss=3.1159348487854004
I0914 16:06:40.637633 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:06:48.270154 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 16:06:59.503044 139785753851712 spec.py:348] Evaluating on the test split.
I0914 16:07:01.769087 139785753851712 submission_runner.py:376] Time since start: 31942.74s, Step: 91043, {'train/accuracy': 0.7286351919174194, 'train/loss': 1.2893345355987549, 'validation/accuracy': 0.6548799872398376, 'validation/loss': 1.6241097450256348, 'validation/num_examples': 50000, 'test/accuracy': 0.5337000489234924, 'test/loss': 2.274324893951416, 'test/num_examples': 10000, 'score': 30668.21325492859, 'total_duration': 31942.735835552216, 'accumulated_submission_time': 30668.21325492859, 'accumulated_eval_time': 1271.077484369278, 'accumulated_logging_time': 2.065810203552246}
I0914 16:07:01.792829 139618045392640 logging_writer.py:48] [91043] accumulated_eval_time=1271.077484, accumulated_logging_time=2.065810, accumulated_submission_time=30668.213255, global_step=91043, preemption_count=0, score=30668.213255, test/accuracy=0.533700, test/loss=2.274325, test/num_examples=10000, total_duration=31942.735836, train/accuracy=0.728635, train/loss=1.289335, validation/accuracy=0.654880, validation/loss=1.624110, validation/num_examples=50000
I0914 16:09:35.603240 139620410959616 logging_writer.py:48] [91500] global_step=91500, grad_norm=0.33739516139030457, loss=3.0308287143707275
I0914 16:12:23.677797 139618045392640 logging_writer.py:48] [92000] global_step=92000, grad_norm=0.3435581922531128, loss=3.043058395385742
I0914 16:15:11.851698 139620410959616 logging_writer.py:48] [92500] global_step=92500, grad_norm=0.354044109582901, loss=3.1414432525634766
I0914 16:15:31.795955 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:15:39.450120 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 16:15:50.477826 139785753851712 spec.py:348] Evaluating on the test split.
I0914 16:15:52.755001 139785753851712 submission_runner.py:376] Time since start: 32473.72s, Step: 92561, {'train/accuracy': 0.7453164458274841, 'train/loss': 1.2298004627227783, 'validation/accuracy': 0.6710399985313416, 'validation/loss': 1.5516157150268555, 'validation/num_examples': 50000, 'test/accuracy': 0.5501000285148621, 'test/loss': 2.1891353130340576, 'test/num_examples': 10000, 'score': 31178.181513547897, 'total_duration': 32473.721732854843, 'accumulated_submission_time': 31178.181513547897, 'accumulated_eval_time': 1292.0364754199982, 'accumulated_logging_time': 2.102283000946045}
I0914 16:15:52.780585 139620410959616 logging_writer.py:48] [92561] accumulated_eval_time=1292.036475, accumulated_logging_time=2.102283, accumulated_submission_time=31178.181514, global_step=92561, preemption_count=0, score=31178.181514, test/accuracy=0.550100, test/loss=2.189135, test/num_examples=10000, total_duration=32473.721733, train/accuracy=0.745316, train/loss=1.229800, validation/accuracy=0.671040, validation/loss=1.551616, validation/num_examples=50000
I0914 16:18:20.689055 139621090420480 logging_writer.py:48] [93000] global_step=93000, grad_norm=0.3550674021244049, loss=3.0757088661193848
I0914 16:21:08.756993 139620410959616 logging_writer.py:48] [93500] global_step=93500, grad_norm=0.36305615305900574, loss=3.072169065475464
I0914 16:23:56.953319 139621090420480 logging_writer.py:48] [94000] global_step=94000, grad_norm=0.37776169180870056, loss=3.1176300048828125
I0914 16:24:22.947271 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:24:30.598288 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 16:24:41.735978 139785753851712 spec.py:348] Evaluating on the test split.
I0914 16:24:44.008029 139785753851712 submission_runner.py:376] Time since start: 33004.97s, Step: 94079, {'train/accuracy': 0.7446189522743225, 'train/loss': 1.263240098953247, 'validation/accuracy': 0.677619993686676, 'validation/loss': 1.5534390211105347, 'validation/num_examples': 50000, 'test/accuracy': 0.5550000071525574, 'test/loss': 2.191549062728882, 'test/num_examples': 10000, 'score': 31688.31489801407, 'total_duration': 33004.9747774601, 'accumulated_submission_time': 31688.31489801407, 'accumulated_eval_time': 1313.0971965789795, 'accumulated_logging_time': 2.1376092433929443}
I0914 16:24:44.032844 139618045392640 logging_writer.py:48] [94079] accumulated_eval_time=1313.097197, accumulated_logging_time=2.137609, accumulated_submission_time=31688.314898, global_step=94079, preemption_count=0, score=31688.314898, test/accuracy=0.555000, test/loss=2.191549, test/num_examples=10000, total_duration=33004.974777, train/accuracy=0.744619, train/loss=1.263240, validation/accuracy=0.677620, validation/loss=1.553439, validation/num_examples=50000
I0914 16:27:05.939068 139620410959616 logging_writer.py:48] [94500] global_step=94500, grad_norm=0.3512846827507019, loss=3.0405852794647217
I0914 16:29:54.129690 139618045392640 logging_writer.py:48] [95000] global_step=95000, grad_norm=0.3504914939403534, loss=3.1171083450317383
I0914 16:32:42.109276 139620410959616 logging_writer.py:48] [95500] global_step=95500, grad_norm=0.3645898997783661, loss=3.053126811981201
I0914 16:33:14.164235 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:33:21.834095 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 16:33:32.892427 139785753851712 spec.py:348] Evaluating on the test split.
I0914 16:33:35.152878 139785753851712 submission_runner.py:376] Time since start: 33536.12s, Step: 95597, {'train/accuracy': 0.7449776530265808, 'train/loss': 1.2335659265518188, 'validation/accuracy': 0.6818400025367737, 'validation/loss': 1.5183688402175903, 'validation/num_examples': 50000, 'test/accuracy': 0.5538000464439392, 'test/loss': 2.1452736854553223, 'test/num_examples': 10000, 'score': 32198.41109275818, 'total_duration': 33536.11962342262, 'accumulated_submission_time': 32198.41109275818, 'accumulated_eval_time': 1334.085800409317, 'accumulated_logging_time': 2.1747827529907227}
I0914 16:33:35.181872 139621090420480 logging_writer.py:48] [95597] accumulated_eval_time=1334.085800, accumulated_logging_time=2.174783, accumulated_submission_time=32198.411093, global_step=95597, preemption_count=0, score=32198.411093, test/accuracy=0.553800, test/loss=2.145274, test/num_examples=10000, total_duration=33536.119623, train/accuracy=0.744978, train/loss=1.233566, validation/accuracy=0.681840, validation/loss=1.518369, validation/num_examples=50000
I0914 16:35:50.819981 139621098813184 logging_writer.py:48] [96000] global_step=96000, grad_norm=0.36467593908309937, loss=3.0915420055389404
I0914 16:38:38.981382 139621090420480 logging_writer.py:48] [96500] global_step=96500, grad_norm=0.36556708812713623, loss=3.0985519886016846
I0914 16:41:27.161284 139621098813184 logging_writer.py:48] [97000] global_step=97000, grad_norm=0.36795929074287415, loss=2.9657955169677734
I0914 16:42:05.221490 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:42:12.871140 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 16:42:23.909234 139785753851712 spec.py:348] Evaluating on the test split.
I0914 16:42:26.155958 139785753851712 submission_runner.py:376] Time since start: 34067.12s, Step: 97115, {'train/accuracy': 0.7691525816917419, 'train/loss': 1.1535043716430664, 'validation/accuracy': 0.667199969291687, 'validation/loss': 1.590874195098877, 'validation/num_examples': 50000, 'test/accuracy': 0.5401000380516052, 'test/loss': 2.221604347229004, 'test/num_examples': 10000, 'score': 32708.41788005829, 'total_duration': 34067.122673511505, 'accumulated_submission_time': 32708.41788005829, 'accumulated_eval_time': 1355.0201969146729, 'accumulated_logging_time': 2.2134897708892822}
I0914 16:42:26.179425 139618045392640 logging_writer.py:48] [97115] accumulated_eval_time=1355.020197, accumulated_logging_time=2.213490, accumulated_submission_time=32708.417880, global_step=97115, preemption_count=0, score=32708.417880, test/accuracy=0.540100, test/loss=2.221604, test/num_examples=10000, total_duration=34067.122674, train/accuracy=0.769153, train/loss=1.153504, validation/accuracy=0.667200, validation/loss=1.590874, validation/num_examples=50000
I0914 16:44:36.011977 139620410959616 logging_writer.py:48] [97500] global_step=97500, grad_norm=0.36638134717941284, loss=3.09440541267395
I0914 16:47:24.182037 139618045392640 logging_writer.py:48] [98000] global_step=98000, grad_norm=0.382851243019104, loss=3.08982515335083
I0914 16:50:12.426895 139620410959616 logging_writer.py:48] [98500] global_step=98500, grad_norm=0.3698352873325348, loss=3.0431742668151855
I0914 16:50:56.266190 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:51:04.192825 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 16:51:15.348309 139785753851712 spec.py:348] Evaluating on the test split.
I0914 16:51:17.568398 139785753851712 submission_runner.py:376] Time since start: 34598.54s, Step: 98632, {'train/accuracy': 0.7757692933082581, 'train/loss': 1.1324656009674072, 'validation/accuracy': 0.690559983253479, 'validation/loss': 1.5002992153167725, 'validation/num_examples': 50000, 'test/accuracy': 0.5630000233650208, 'test/loss': 2.134718656539917, 'test/num_examples': 10000, 'score': 33218.47289562225, 'total_duration': 34598.53514504433, 'accumulated_submission_time': 33218.47289562225, 'accumulated_eval_time': 1376.3223690986633, 'accumulated_logging_time': 2.246415138244629}
I0914 16:51:17.593133 139618036999936 logging_writer.py:48] [98632] accumulated_eval_time=1376.322369, accumulated_logging_time=2.246415, accumulated_submission_time=33218.472896, global_step=98632, preemption_count=0, score=33218.472896, test/accuracy=0.563000, test/loss=2.134719, test/num_examples=10000, total_duration=34598.535145, train/accuracy=0.775769, train/loss=1.132466, validation/accuracy=0.690560, validation/loss=1.500299, validation/num_examples=50000
I0914 16:53:21.508238 139621090420480 logging_writer.py:48] [99000] global_step=99000, grad_norm=0.3786008358001709, loss=3.025963068008423
I0914 16:56:09.690086 139618036999936 logging_writer.py:48] [99500] global_step=99500, grad_norm=0.3736327886581421, loss=3.033701181411743
I0914 16:58:57.918751 139621090420480 logging_writer.py:48] [100000] global_step=100000, grad_norm=0.38231179118156433, loss=3.0271847248077393
I0914 16:59:47.846658 139785753851712 spec.py:320] Evaluating on the training split.
I0914 16:59:55.566396 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:00:06.535897 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:00:08.794648 139785753851712 submission_runner.py:376] Time since start: 35129.76s, Step: 100150, {'train/accuracy': 0.7731584906578064, 'train/loss': 1.1149073839187622, 'validation/accuracy': 0.6941999793052673, 'validation/loss': 1.4566799402236938, 'validation/num_examples': 50000, 'test/accuracy': 0.5652000308036804, 'test/loss': 2.10357928276062, 'test/num_examples': 10000, 'score': 33728.69375014305, 'total_duration': 35129.76139855385, 'accumulated_submission_time': 33728.69375014305, 'accumulated_eval_time': 1397.2703416347504, 'accumulated_logging_time': 2.2805235385894775}
I0914 17:00:08.818891 139621082027776 logging_writer.py:48] [100150] accumulated_eval_time=1397.270342, accumulated_logging_time=2.280524, accumulated_submission_time=33728.693750, global_step=100150, preemption_count=0, score=33728.693750, test/accuracy=0.565200, test/loss=2.103579, test/num_examples=10000, total_duration=35129.761399, train/accuracy=0.773158, train/loss=1.114907, validation/accuracy=0.694200, validation/loss=1.456680, validation/num_examples=50000
I0914 17:02:06.840601 139621107205888 logging_writer.py:48] [100500] global_step=100500, grad_norm=0.3828810453414917, loss=3.052368402481079
I0914 17:04:55.038065 139621082027776 logging_writer.py:48] [101000] global_step=101000, grad_norm=0.387489914894104, loss=3.0429494380950928
I0914 17:07:43.136696 139621107205888 logging_writer.py:48] [101500] global_step=101500, grad_norm=0.37475812435150146, loss=2.9192471504211426
I0914 17:08:38.965880 139785753851712 spec.py:320] Evaluating on the training split.
I0914 17:08:46.603290 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:08:57.649946 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:08:59.932699 139785753851712 submission_runner.py:376] Time since start: 35660.90s, Step: 101668, {'train/accuracy': 0.7729790806770325, 'train/loss': 1.1216973066329956, 'validation/accuracy': 0.6942600011825562, 'validation/loss': 1.4576067924499512, 'validation/num_examples': 50000, 'test/accuracy': 0.5682000517845154, 'test/loss': 2.1024677753448486, 'test/num_examples': 10000, 'score': 34238.80842423439, 'total_duration': 35660.899446964264, 'accumulated_submission_time': 34238.80842423439, 'accumulated_eval_time': 1418.23712515831, 'accumulated_logging_time': 2.314119815826416}
I0914 17:08:59.957205 139620410959616 logging_writer.py:48] [101668] accumulated_eval_time=1418.237125, accumulated_logging_time=2.314120, accumulated_submission_time=34238.808424, global_step=101668, preemption_count=0, score=34238.808424, test/accuracy=0.568200, test/loss=2.102468, test/num_examples=10000, total_duration=35660.899447, train/accuracy=0.772979, train/loss=1.121697, validation/accuracy=0.694260, validation/loss=1.457607, validation/num_examples=50000
I0914 17:10:51.731181 139620419352320 logging_writer.py:48] [102000] global_step=102000, grad_norm=0.37464380264282227, loss=2.9934191703796387
I0914 17:13:39.813981 139620410959616 logging_writer.py:48] [102500] global_step=102500, grad_norm=0.3883311152458191, loss=3.0084176063537598
I0914 17:16:28.027521 139620419352320 logging_writer.py:48] [103000] global_step=103000, grad_norm=0.38096585869789124, loss=2.983563184738159
I0914 17:17:30.245515 139785753851712 spec.py:320] Evaluating on the training split.
I0914 17:17:37.844074 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:17:49.018025 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:17:51.227629 139785753851712 submission_runner.py:376] Time since start: 36192.19s, Step: 103187, {'train/accuracy': 0.7712850570678711, 'train/loss': 1.1430009603500366, 'validation/accuracy': 0.6977199912071228, 'validation/loss': 1.4706509113311768, 'validation/num_examples': 50000, 'test/accuracy': 0.5703999996185303, 'test/loss': 2.109886646270752, 'test/num_examples': 10000, 'score': 34749.06494665146, 'total_duration': 36192.19437289238, 'accumulated_submission_time': 34749.06494665146, 'accumulated_eval_time': 1439.2192113399506, 'accumulated_logging_time': 2.3478918075561523}
I0914 17:17:51.252707 139621082027776 logging_writer.py:48] [103187] accumulated_eval_time=1439.219211, accumulated_logging_time=2.347892, accumulated_submission_time=34749.064947, global_step=103187, preemption_count=0, score=34749.064947, test/accuracy=0.570400, test/loss=2.109887, test/num_examples=10000, total_duration=36192.194373, train/accuracy=0.771285, train/loss=1.143001, validation/accuracy=0.697720, validation/loss=1.470651, validation/num_examples=50000
I0914 17:19:36.790855 139621098813184 logging_writer.py:48] [103500] global_step=103500, grad_norm=0.38000091910362244, loss=2.9304559230804443
I0914 17:22:24.911260 139621082027776 logging_writer.py:48] [104000] global_step=104000, grad_norm=0.39263468980789185, loss=2.971359968185425
I0914 17:25:13.136347 139621098813184 logging_writer.py:48] [104500] global_step=104500, grad_norm=0.40036970376968384, loss=3.0242748260498047
I0914 17:26:21.542417 139785753851712 spec.py:320] Evaluating on the training split.
I0914 17:26:29.194283 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:26:40.280743 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:26:42.518318 139785753851712 submission_runner.py:376] Time since start: 36723.48s, Step: 104705, {'train/accuracy': 0.7724210619926453, 'train/loss': 1.1543452739715576, 'validation/accuracy': 0.6925199627876282, 'validation/loss': 1.485355019569397, 'validation/num_examples': 50000, 'test/accuracy': 0.5617000460624695, 'test/loss': 2.1401867866516113, 'test/num_examples': 10000, 'score': 35259.32285261154, 'total_duration': 36723.48499917984, 'accumulated_submission_time': 35259.32285261154, 'accumulated_eval_time': 1460.195018529892, 'accumulated_logging_time': 2.381948232650757}
I0914 17:26:42.543345 139618045392640 logging_writer.py:48] [104705] accumulated_eval_time=1460.195019, accumulated_logging_time=2.381948, accumulated_submission_time=35259.322853, global_step=104705, preemption_count=0, score=35259.322853, test/accuracy=0.561700, test/loss=2.140187, test/num_examples=10000, total_duration=36723.484999, train/accuracy=0.772421, train/loss=1.154345, validation/accuracy=0.692520, validation/loss=1.485355, validation/num_examples=50000
I0914 17:28:21.862946 139620410959616 logging_writer.py:48] [105000] global_step=105000, grad_norm=0.4067330062389374, loss=3.011190891265869
I0914 17:31:09.761775 139618045392640 logging_writer.py:48] [105500] global_step=105500, grad_norm=0.39534327387809753, loss=2.948467493057251
I0914 17:33:57.917924 139620410959616 logging_writer.py:48] [106000] global_step=106000, grad_norm=0.40077218413352966, loss=3.005322217941284
I0914 17:35:12.584997 139785753851712 spec.py:320] Evaluating on the training split.
I0914 17:35:20.279434 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:35:31.368846 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:35:33.610364 139785753851712 submission_runner.py:376] Time since start: 37254.58s, Step: 106224, {'train/accuracy': 0.8110052347183228, 'train/loss': 0.9716266393661499, 'validation/accuracy': 0.7090199589729309, 'validation/loss': 1.3921597003936768, 'validation/num_examples': 50000, 'test/accuracy': 0.5842000246047974, 'test/loss': 2.0155460834503174, 'test/num_examples': 10000, 'score': 35769.33108854294, 'total_duration': 37254.577083826065, 'accumulated_submission_time': 35769.33108854294, 'accumulated_eval_time': 1481.2203319072723, 'accumulated_logging_time': 2.4178240299224854}
I0914 17:35:33.645757 139621090420480 logging_writer.py:48] [106224] accumulated_eval_time=1481.220332, accumulated_logging_time=2.417824, accumulated_submission_time=35769.331089, global_step=106224, preemption_count=0, score=35769.331089, test/accuracy=0.584200, test/loss=2.015546, test/num_examples=10000, total_duration=37254.577084, train/accuracy=0.811005, train/loss=0.971627, validation/accuracy=0.709020, validation/loss=1.392160, validation/num_examples=50000
I0914 17:37:06.574394 139621098813184 logging_writer.py:48] [106500] global_step=106500, grad_norm=0.4171896278858185, loss=2.970425605773926
I0914 17:39:54.659567 139621090420480 logging_writer.py:48] [107000] global_step=107000, grad_norm=0.38486701250076294, loss=2.8754992485046387
I0914 17:42:42.683755 139621098813184 logging_writer.py:48] [107500] global_step=107500, grad_norm=0.41185760498046875, loss=2.9692084789276123
I0914 17:44:03.787653 139785753851712 spec.py:320] Evaluating on the training split.
I0914 17:44:11.570801 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:44:22.618466 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:44:24.886515 139785753851712 submission_runner.py:376] Time since start: 37785.85s, Step: 107743, {'train/accuracy': 0.7952407598495483, 'train/loss': 1.0415843725204468, 'validation/accuracy': 0.7023400068283081, 'validation/loss': 1.4332274198532104, 'validation/num_examples': 50000, 'test/accuracy': 0.5763000249862671, 'test/loss': 2.0579850673675537, 'test/num_examples': 10000, 'score': 36279.43704533577, 'total_duration': 37785.853261470795, 'accumulated_submission_time': 36279.43704533577, 'accumulated_eval_time': 1502.319188117981, 'accumulated_logging_time': 2.4656083583831787}
I0914 17:44:24.914275 139618028607232 logging_writer.py:48] [107743] accumulated_eval_time=1502.319188, accumulated_logging_time=2.465608, accumulated_submission_time=36279.437045, global_step=107743, preemption_count=0, score=36279.437045, test/accuracy=0.576300, test/loss=2.057985, test/num_examples=10000, total_duration=37785.853261, train/accuracy=0.795241, train/loss=1.041584, validation/accuracy=0.702340, validation/loss=1.433227, validation/num_examples=50000
I0914 17:45:51.602435 139618036999936 logging_writer.py:48] [108000] global_step=108000, grad_norm=0.4147533178329468, loss=3.016648769378662
I0914 17:48:39.803782 139618028607232 logging_writer.py:48] [108500] global_step=108500, grad_norm=0.40793949365615845, loss=2.919987440109253
I0914 17:51:28.004422 139618036999936 logging_writer.py:48] [109000] global_step=109000, grad_norm=0.43241071701049805, loss=3.0008387565612793
I0914 17:52:54.913345 139785753851712 spec.py:320] Evaluating on the training split.
I0914 17:53:02.549693 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 17:53:13.555589 139785753851712 spec.py:348] Evaluating on the test split.
I0914 17:53:15.787854 139785753851712 submission_runner.py:376] Time since start: 38316.75s, Step: 109260, {'train/accuracy': 0.80961012840271, 'train/loss': 0.9733060002326965, 'validation/accuracy': 0.7177000045776367, 'validation/loss': 1.3576146364212036, 'validation/num_examples': 50000, 'test/accuracy': 0.5911000370979309, 'test/loss': 1.9605129957199097, 'test/num_examples': 10000, 'score': 36789.401881456375, 'total_duration': 38316.75459957123, 'accumulated_submission_time': 36789.401881456375, 'accumulated_eval_time': 1523.1936659812927, 'accumulated_logging_time': 2.504333019256592}
I0914 17:53:15.813561 139621098813184 logging_writer.py:48] [109260] accumulated_eval_time=1523.193666, accumulated_logging_time=2.504333, accumulated_submission_time=36789.401881, global_step=109260, preemption_count=0, score=36789.401881, test/accuracy=0.591100, test/loss=1.960513, test/num_examples=10000, total_duration=38316.754600, train/accuracy=0.809610, train/loss=0.973306, validation/accuracy=0.717700, validation/loss=1.357615, validation/num_examples=50000
I0914 17:54:36.872528 139621107205888 logging_writer.py:48] [109500] global_step=109500, grad_norm=0.4119095206260681, loss=2.924466609954834
I0914 17:57:25.075344 139621098813184 logging_writer.py:48] [110000] global_step=110000, grad_norm=0.42898768186569214, loss=2.928290605545044
I0914 18:00:13.132177 139621107205888 logging_writer.py:48] [110500] global_step=110500, grad_norm=0.42565202713012695, loss=2.8965280055999756
I0914 18:01:45.962886 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:01:53.526907 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:02:04.644013 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:02:06.896817 139785753851712 submission_runner.py:376] Time since start: 38847.86s, Step: 110778, {'train/accuracy': 0.7960578799247742, 'train/loss': 1.0393996238708496, 'validation/accuracy': 0.7094399929046631, 'validation/loss': 1.413783073425293, 'validation/num_examples': 50000, 'test/accuracy': 0.5824000239372253, 'test/loss': 2.058758020401001, 'test/num_examples': 10000, 'score': 37299.51941990852, 'total_duration': 38847.8635661602, 'accumulated_submission_time': 37299.51941990852, 'accumulated_eval_time': 1544.1275751590729, 'accumulated_logging_time': 2.539213180541992}
I0914 18:02:06.922950 139618036999936 logging_writer.py:48] [110778] accumulated_eval_time=1544.127575, accumulated_logging_time=2.539213, accumulated_submission_time=37299.519420, global_step=110778, preemption_count=0, score=37299.519420, test/accuracy=0.582400, test/loss=2.058758, test/num_examples=10000, total_duration=38847.863566, train/accuracy=0.796058, train/loss=1.039400, validation/accuracy=0.709440, validation/loss=1.413783, validation/num_examples=50000
I0914 18:03:21.900630 139618045392640 logging_writer.py:48] [111000] global_step=111000, grad_norm=0.43163755536079407, loss=2.9281558990478516
I0914 18:06:10.041956 139618036999936 logging_writer.py:48] [111500] global_step=111500, grad_norm=0.4414684474468231, loss=2.9374234676361084
I0914 18:08:58.126828 139618045392640 logging_writer.py:48] [112000] global_step=112000, grad_norm=0.4413857161998749, loss=2.953256845474243
I0914 18:10:37.223568 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:10:44.865906 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:10:55.847971 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:10:58.111961 139785753851712 submission_runner.py:376] Time since start: 39379.08s, Step: 112297, {'train/accuracy': 0.8116230964660645, 'train/loss': 0.9464977383613586, 'validation/accuracy': 0.721019983291626, 'validation/loss': 1.3188493251800537, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9455469846725464, 'test/num_examples': 10000, 'score': 37809.78449511528, 'total_duration': 39379.07870936394, 'accumulated_submission_time': 37809.78449511528, 'accumulated_eval_time': 1565.0159449577332, 'accumulated_logging_time': 2.5778493881225586}
I0914 18:10:58.137747 139621082027776 logging_writer.py:48] [112297] accumulated_eval_time=1565.015945, accumulated_logging_time=2.577849, accumulated_submission_time=37809.784495, global_step=112297, preemption_count=0, score=37809.784495, test/accuracy=0.595100, test/loss=1.945547, test/num_examples=10000, total_duration=39379.078709, train/accuracy=0.811623, train/loss=0.946498, validation/accuracy=0.721020, validation/loss=1.318849, validation/num_examples=50000
I0914 18:12:06.665533 139621090420480 logging_writer.py:48] [112500] global_step=112500, grad_norm=0.4168124198913574, loss=2.871910333633423
I0914 18:14:54.676797 139621082027776 logging_writer.py:48] [113000] global_step=113000, grad_norm=0.43743082880973816, loss=2.907787799835205
I0914 18:17:42.738482 139621090420480 logging_writer.py:48] [113500] global_step=113500, grad_norm=0.4323841631412506, loss=2.890209674835205
I0914 18:19:28.127635 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:19:35.779218 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:19:46.886007 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:19:49.129853 139785753851712 submission_runner.py:376] Time since start: 39910.10s, Step: 113815, {'train/accuracy': 0.8452048897743225, 'train/loss': 0.8711603879928589, 'validation/accuracy': 0.7268399596214294, 'validation/loss': 1.3569468259811401, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9684876203536987, 'test/num_examples': 10000, 'score': 38319.7420566082, 'total_duration': 39910.09659743309, 'accumulated_submission_time': 38319.7420566082, 'accumulated_eval_time': 1586.0181345939636, 'accumulated_logging_time': 2.6128127574920654}
I0914 18:19:49.157007 139618045392640 logging_writer.py:48] [113815] accumulated_eval_time=1586.018135, accumulated_logging_time=2.612813, accumulated_submission_time=38319.742057, global_step=113815, preemption_count=0, score=38319.742057, test/accuracy=0.595100, test/loss=1.968488, test/num_examples=10000, total_duration=39910.096597, train/accuracy=0.845205, train/loss=0.871160, validation/accuracy=0.726840, validation/loss=1.356947, validation/num_examples=50000
I0914 18:20:51.536039 139620410959616 logging_writer.py:48] [114000] global_step=114000, grad_norm=0.46374964714050293, loss=2.9240376949310303
I0914 18:23:39.450235 139618045392640 logging_writer.py:48] [114500] global_step=114500, grad_norm=0.44095996022224426, loss=2.8425261974334717
I0914 18:26:27.675474 139620410959616 logging_writer.py:48] [115000] global_step=115000, grad_norm=0.4671684503555298, loss=2.88558292388916
I0914 18:28:19.439936 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:28:27.056215 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:28:37.988294 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:28:40.252659 139785753851712 submission_runner.py:376] Time since start: 40441.22s, Step: 115334, {'train/accuracy': 0.8356186151504517, 'train/loss': 0.846179187297821, 'validation/accuracy': 0.7235400080680847, 'validation/loss': 1.3148432970046997, 'validation/num_examples': 50000, 'test/accuracy': 0.5975000262260437, 'test/loss': 1.953201413154602, 'test/num_examples': 10000, 'score': 38829.9924018383, 'total_duration': 40441.21931767464, 'accumulated_submission_time': 38829.9924018383, 'accumulated_eval_time': 1606.8307423591614, 'accumulated_logging_time': 2.6497488021850586}
I0914 18:28:40.279415 139618036999936 logging_writer.py:48] [115334] accumulated_eval_time=1606.830742, accumulated_logging_time=2.649749, accumulated_submission_time=38829.992402, global_step=115334, preemption_count=0, score=38829.992402, test/accuracy=0.597500, test/loss=1.953201, test/num_examples=10000, total_duration=40441.219318, train/accuracy=0.835619, train/loss=0.846179, validation/accuracy=0.723540, validation/loss=1.314843, validation/num_examples=50000
I0914 18:29:36.330388 139618045392640 logging_writer.py:48] [115500] global_step=115500, grad_norm=0.4787191152572632, loss=2.8335397243499756
I0914 18:32:24.424423 139618036999936 logging_writer.py:48] [116000] global_step=116000, grad_norm=0.44966191053390503, loss=2.8393120765686035
I0914 18:35:12.558569 139618045392640 logging_writer.py:48] [116500] global_step=116500, grad_norm=0.453999787569046, loss=2.8157007694244385
I0914 18:37:10.536283 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:37:18.147148 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:37:29.201172 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:37:31.462862 139785753851712 submission_runner.py:376] Time since start: 40972.43s, Step: 116853, {'train/accuracy': 0.8401426672935486, 'train/loss': 0.8460429906845093, 'validation/accuracy': 0.7312600016593933, 'validation/loss': 1.2911760807037354, 'validation/num_examples': 50000, 'test/accuracy': 0.6062000393867493, 'test/loss': 1.8956819772720337, 'test/num_examples': 10000, 'score': 39340.216069698334, 'total_duration': 40972.4295835495, 'accumulated_submission_time': 39340.216069698334, 'accumulated_eval_time': 1627.757281780243, 'accumulated_logging_time': 2.6862545013427734}
I0914 18:37:31.492592 139618045392640 logging_writer.py:48] [116853] accumulated_eval_time=1627.757282, accumulated_logging_time=2.686255, accumulated_submission_time=39340.216070, global_step=116853, preemption_count=0, score=39340.216070, test/accuracy=0.606200, test/loss=1.895682, test/num_examples=10000, total_duration=40972.429584, train/accuracy=0.840143, train/loss=0.846043, validation/accuracy=0.731260, validation/loss=1.291176, validation/num_examples=50000
I0914 18:38:21.166066 139620410959616 logging_writer.py:48] [117000] global_step=117000, grad_norm=0.47764918208122253, loss=2.888979434967041
I0914 18:41:09.129042 139618045392640 logging_writer.py:48] [117500] global_step=117500, grad_norm=0.4835554361343384, loss=2.7958009243011475
I0914 18:43:57.120205 139620410959616 logging_writer.py:48] [118000] global_step=118000, grad_norm=0.4646367132663727, loss=2.7950828075408936
I0914 18:46:01.562391 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:46:09.242262 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:46:20.153388 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:46:22.558940 139785753851712 submission_runner.py:376] Time since start: 41503.53s, Step: 118372, {'train/accuracy': 0.8466597199440002, 'train/loss': 0.8141149878501892, 'validation/accuracy': 0.7381399869918823, 'validation/loss': 1.2623021602630615, 'validation/num_examples': 50000, 'test/accuracy': 0.6214000582695007, 'test/loss': 1.8717379570007324, 'test/num_examples': 10000, 'score': 39850.24843668938, 'total_duration': 41503.52567815781, 'accumulated_submission_time': 39850.24843668938, 'accumulated_eval_time': 1648.753799200058, 'accumulated_logging_time': 2.7309017181396484}
I0914 18:46:22.587269 139621098813184 logging_writer.py:48] [118372] accumulated_eval_time=1648.753799, accumulated_logging_time=2.730902, accumulated_submission_time=39850.248437, global_step=118372, preemption_count=0, score=39850.248437, test/accuracy=0.621400, test/loss=1.871738, test/num_examples=10000, total_duration=41503.525678, train/accuracy=0.846660, train/loss=0.814115, validation/accuracy=0.738140, validation/loss=1.262302, validation/num_examples=50000
I0914 18:47:05.924679 139621107205888 logging_writer.py:48] [118500] global_step=118500, grad_norm=0.49744462966918945, loss=2.8599724769592285
I0914 18:49:54.139391 139621098813184 logging_writer.py:48] [119000] global_step=119000, grad_norm=0.47480571269989014, loss=2.7920033931732178
I0914 18:52:42.335139 139621107205888 logging_writer.py:48] [119500] global_step=119500, grad_norm=0.4592367112636566, loss=2.7162587642669678
I0914 18:54:52.652643 139785753851712 spec.py:320] Evaluating on the training split.
I0914 18:55:00.214142 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 18:55:11.169922 139785753851712 spec.py:348] Evaluating on the test split.
I0914 18:55:13.409800 139785753851712 submission_runner.py:376] Time since start: 42034.38s, Step: 119889, {'train/accuracy': 0.8564253449440002, 'train/loss': 0.7718811631202698, 'validation/accuracy': 0.7484999895095825, 'validation/loss': 1.2095959186553955, 'validation/num_examples': 50000, 'test/accuracy': 0.6221000552177429, 'test/loss': 1.825244426727295, 'test/num_examples': 10000, 'score': 40360.281074762344, 'total_duration': 42034.37654709816, 'accumulated_submission_time': 40360.281074762344, 'accumulated_eval_time': 1669.5109317302704, 'accumulated_logging_time': 2.7693583965301514}
I0914 18:55:13.440362 139620410959616 logging_writer.py:48] [119889] accumulated_eval_time=1669.510932, accumulated_logging_time=2.769358, accumulated_submission_time=40360.281075, global_step=119889, preemption_count=0, score=40360.281075, test/accuracy=0.622100, test/loss=1.825244, test/num_examples=10000, total_duration=42034.376547, train/accuracy=0.856425, train/loss=0.771881, validation/accuracy=0.748500, validation/loss=1.209596, validation/num_examples=50000
I0914 18:55:51.055367 139620419352320 logging_writer.py:48] [120000] global_step=120000, grad_norm=0.46088361740112305, loss=2.748404026031494
I0914 18:58:38.933666 139620410959616 logging_writer.py:48] [120500] global_step=120500, grad_norm=0.47682926058769226, loss=2.788738965988159
I0914 19:01:27.151314 139620419352320 logging_writer.py:48] [121000] global_step=121000, grad_norm=0.48673200607299805, loss=2.728959560394287
I0914 19:03:43.670358 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:03:51.275867 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:04:02.285306 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:04:04.623245 139785753851712 submission_runner.py:376] Time since start: 42565.59s, Step: 121408, {'train/accuracy': 0.8634805083274841, 'train/loss': 0.7633724212646484, 'validation/accuracy': 0.750220000743866, 'validation/loss': 1.2169595956802368, 'validation/num_examples': 50000, 'test/accuracy': 0.626800000667572, 'test/loss': 1.814558506011963, 'test/num_examples': 10000, 'score': 40870.47562837601, 'total_duration': 42565.58989524841, 'accumulated_submission_time': 40870.47562837601, 'accumulated_eval_time': 1690.4636988639832, 'accumulated_logging_time': 2.8127453327178955}
I0914 19:04:04.650436 139617894389504 logging_writer.py:48] [121408] accumulated_eval_time=1690.463699, accumulated_logging_time=2.812745, accumulated_submission_time=40870.475628, global_step=121408, preemption_count=0, score=40870.475628, test/accuracy=0.626800, test/loss=1.814559, test/num_examples=10000, total_duration=42565.589895, train/accuracy=0.863481, train/loss=0.763372, validation/accuracy=0.750220, validation/loss=1.216960, validation/num_examples=50000
I0914 19:04:35.909023 139617902782208 logging_writer.py:48] [121500] global_step=121500, grad_norm=0.49807053804397583, loss=2.7201364040374756
I0914 19:07:24.007127 139617894389504 logging_writer.py:48] [122000] global_step=122000, grad_norm=0.479455828666687, loss=2.7059950828552246
I0914 19:10:12.186707 139617902782208 logging_writer.py:48] [122500] global_step=122500, grad_norm=0.4708760380744934, loss=2.661046266555786
I0914 19:12:34.906721 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:12:42.492753 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:12:53.594218 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:12:55.807027 139785753851712 submission_runner.py:376] Time since start: 43096.77s, Step: 122926, {'train/accuracy': 0.896882951259613, 'train/loss': 0.6356571316719055, 'validation/accuracy': 0.7584199905395508, 'validation/loss': 1.1778696775436401, 'validation/num_examples': 50000, 'test/accuracy': 0.6371000409126282, 'test/loss': 1.7690379619598389, 'test/num_examples': 10000, 'score': 41380.697590112686, 'total_duration': 43096.77365708351, 'accumulated_submission_time': 41380.697590112686, 'accumulated_eval_time': 1711.363877773285, 'accumulated_logging_time': 2.851658821105957}
I0914 19:12:55.839606 139620410959616 logging_writer.py:48] [122926] accumulated_eval_time=1711.363878, accumulated_logging_time=2.851659, accumulated_submission_time=41380.697590, global_step=122926, preemption_count=0, score=41380.697590, test/accuracy=0.637100, test/loss=1.769038, test/num_examples=10000, total_duration=43096.773657, train/accuracy=0.896883, train/loss=0.635657, validation/accuracy=0.758420, validation/loss=1.177870, validation/num_examples=50000
I0914 19:13:21.037121 139620419352320 logging_writer.py:48] [123000] global_step=123000, grad_norm=0.48896467685699463, loss=2.7046561241149902
I0914 19:16:09.015657 139620410959616 logging_writer.py:48] [123500] global_step=123500, grad_norm=0.4708644151687622, loss=2.6549618244171143
I0914 19:18:56.923338 139620419352320 logging_writer.py:48] [124000] global_step=124000, grad_norm=0.5001617670059204, loss=2.6695520877838135
I0914 19:21:26.042989 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:21:33.706474 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:21:44.803596 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:21:47.160427 139785753851712 submission_runner.py:376] Time since start: 43628.13s, Step: 124445, {'train/accuracy': 0.8963049650192261, 'train/loss': 0.6321005821228027, 'validation/accuracy': 0.7647599577903748, 'validation/loss': 1.1482919454574585, 'validation/num_examples': 50000, 'test/accuracy': 0.6454000473022461, 'test/loss': 1.7265831232070923, 'test/num_examples': 10000, 'score': 41890.86552166939, 'total_duration': 43628.127166986465, 'accumulated_submission_time': 41890.86552166939, 'accumulated_eval_time': 1732.4812920093536, 'accumulated_logging_time': 2.8973982334136963}
I0914 19:21:47.187817 139618045392640 logging_writer.py:48] [124445] accumulated_eval_time=1732.481292, accumulated_logging_time=2.897398, accumulated_submission_time=41890.865522, global_step=124445, preemption_count=0, score=41890.865522, test/accuracy=0.645400, test/loss=1.726583, test/num_examples=10000, total_duration=43628.127167, train/accuracy=0.896305, train/loss=0.632101, validation/accuracy=0.764760, validation/loss=1.148292, validation/num_examples=50000
I0914 19:22:06.001334 139620410959616 logging_writer.py:48] [124500] global_step=124500, grad_norm=0.5023619532585144, loss=2.6471095085144043
I0914 19:24:53.974798 139618045392640 logging_writer.py:48] [125000] global_step=125000, grad_norm=0.5099729299545288, loss=2.6568284034729004
I0914 19:27:41.930681 139620410959616 logging_writer.py:48] [125500] global_step=125500, grad_norm=0.4939233958721161, loss=2.586808204650879
I0914 19:30:17.467931 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:30:24.997955 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:30:36.002361 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:30:38.243386 139785753851712 submission_runner.py:376] Time since start: 44159.21s, Step: 125965, {'train/accuracy': 0.9003308415412903, 'train/loss': 0.6020267009735107, 'validation/accuracy': 0.7695800065994263, 'validation/loss': 1.123653769493103, 'validation/num_examples': 50000, 'test/accuracy': 0.6490000486373901, 'test/loss': 1.7115205526351929, 'test/num_examples': 10000, 'score': 42401.11342215538, 'total_duration': 44159.210112810135, 'accumulated_submission_time': 42401.11342215538, 'accumulated_eval_time': 1753.2567028999329, 'accumulated_logging_time': 2.934680700302124}
I0914 19:30:38.274112 139621098813184 logging_writer.py:48] [125965] accumulated_eval_time=1753.256703, accumulated_logging_time=2.934681, accumulated_submission_time=42401.113422, global_step=125965, preemption_count=0, score=42401.113422, test/accuracy=0.649000, test/loss=1.711521, test/num_examples=10000, total_duration=44159.210113, train/accuracy=0.900331, train/loss=0.602027, validation/accuracy=0.769580, validation/loss=1.123654, validation/num_examples=50000
I0914 19:30:50.376403 139621107205888 logging_writer.py:48] [126000] global_step=126000, grad_norm=0.4852544665336609, loss=2.5603702068328857
I0914 19:33:38.312931 139621098813184 logging_writer.py:48] [126500] global_step=126500, grad_norm=0.518208384513855, loss=2.619767665863037
I0914 19:36:26.558245 139621107205888 logging_writer.py:48] [127000] global_step=127000, grad_norm=0.5091156363487244, loss=2.5884017944335938
I0914 19:39:08.269389 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:39:15.808310 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:39:26.886655 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:39:29.079470 139785753851712 submission_runner.py:376] Time since start: 44690.05s, Step: 127483, {'train/accuracy': 0.9035195708274841, 'train/loss': 0.5998629331588745, 'validation/accuracy': 0.7721199989318848, 'validation/loss': 1.1213932037353516, 'validation/num_examples': 50000, 'test/accuracy': 0.6554000377655029, 'test/loss': 1.6994982957839966, 'test/num_examples': 10000, 'score': 42911.07231712341, 'total_duration': 44690.04618239403, 'accumulated_submission_time': 42911.07231712341, 'accumulated_eval_time': 1774.066725730896, 'accumulated_logging_time': 2.978921890258789}
I0914 19:39:29.122376 139620410959616 logging_writer.py:48] [127483] accumulated_eval_time=1774.066726, accumulated_logging_time=2.978922, accumulated_submission_time=42911.072317, global_step=127483, preemption_count=0, score=42911.072317, test/accuracy=0.655400, test/loss=1.699498, test/num_examples=10000, total_duration=44690.046182, train/accuracy=0.903520, train/loss=0.599863, validation/accuracy=0.772120, validation/loss=1.121393, validation/num_examples=50000
I0914 19:39:35.166222 139620419352320 logging_writer.py:48] [127500] global_step=127500, grad_norm=0.5102914571762085, loss=2.6478493213653564
I0914 19:42:23.218772 139620410959616 logging_writer.py:48] [128000] global_step=128000, grad_norm=0.5169419050216675, loss=2.622182846069336
I0914 19:45:11.451178 139620419352320 logging_writer.py:48] [128500] global_step=128500, grad_norm=0.5053176283836365, loss=2.5826773643493652
I0914 19:47:59.658819 139620410959616 logging_writer.py:48] [129000] global_step=129000, grad_norm=0.4903814196586609, loss=2.593526601791382
I0914 19:47:59.666086 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:48:07.148158 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:48:18.186425 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:48:20.480490 139785753851712 submission_runner.py:376] Time since start: 45221.45s, Step: 129001, {'train/accuracy': 0.9015664458274841, 'train/loss': 0.6057281494140625, 'validation/accuracy': 0.7714999914169312, 'validation/loss': 1.1205312013626099, 'validation/num_examples': 50000, 'test/accuracy': 0.6547000408172607, 'test/loss': 1.7005892992019653, 'test/num_examples': 10000, 'score': 43421.583512067795, 'total_duration': 45221.447182655334, 'accumulated_submission_time': 43421.583512067795, 'accumulated_eval_time': 1794.8810048103333, 'accumulated_logging_time': 3.0320894718170166}
I0914 19:48:20.522889 139621082027776 logging_writer.py:48] [129001] accumulated_eval_time=1794.881005, accumulated_logging_time=3.032089, accumulated_submission_time=43421.583512, global_step=129001, preemption_count=0, score=43421.583512, test/accuracy=0.654700, test/loss=1.700589, test/num_examples=10000, total_duration=45221.447183, train/accuracy=0.901566, train/loss=0.605728, validation/accuracy=0.771500, validation/loss=1.120531, validation/num_examples=50000
I0914 19:51:08.502212 139621090420480 logging_writer.py:48] [129500] global_step=129500, grad_norm=0.5298535823822021, loss=2.646083354949951
I0914 19:53:56.488409 139621082027776 logging_writer.py:48] [130000] global_step=130000, grad_norm=0.5100540518760681, loss=2.5769078731536865
I0914 19:56:44.719903 139621090420480 logging_writer.py:48] [130500] global_step=130500, grad_norm=0.5042638778686523, loss=2.5659964084625244
I0914 19:56:50.530486 139785753851712 spec.py:320] Evaluating on the training split.
I0914 19:56:58.011437 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 19:57:09.085397 139785753851712 spec.py:348] Evaluating on the test split.
I0914 19:57:11.381052 139785753851712 submission_runner.py:376] Time since start: 45752.35s, Step: 130519, {'train/accuracy': 0.9035793542861938, 'train/loss': 0.5980676412582397, 'validation/accuracy': 0.7721799612045288, 'validation/loss': 1.119748592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.6549000144004822, 'test/loss': 1.6995123624801636, 'test/num_examples': 10000, 'score': 43931.55882525444, 'total_duration': 45752.34780359268, 'accumulated_submission_time': 43931.55882525444, 'accumulated_eval_time': 1815.7315328121185, 'accumulated_logging_time': 3.084369659423828}
I0914 19:57:11.410619 139618045392640 logging_writer.py:48] [130519] accumulated_eval_time=1815.731533, accumulated_logging_time=3.084370, accumulated_submission_time=43931.558825, global_step=130519, preemption_count=0, score=43931.558825, test/accuracy=0.654900, test/loss=1.699512, test/num_examples=10000, total_duration=45752.347804, train/accuracy=0.903579, train/loss=0.598068, validation/accuracy=0.772180, validation/loss=1.119749, validation/num_examples=50000
I0914 19:59:53.540511 139620410959616 logging_writer.py:48] [131000] global_step=131000, grad_norm=0.4992199242115021, loss=2.5815839767456055
I0914 20:02:41.781963 139618045392640 logging_writer.py:48] [131500] global_step=131500, grad_norm=0.5254312753677368, loss=2.7122814655303955
I0914 20:05:30.022560 139620410959616 logging_writer.py:48] [132000] global_step=132000, grad_norm=0.5144177079200745, loss=2.636721611022949
I0914 20:05:41.557423 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:05:49.096254 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:06:00.296557 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:06:02.548254 139785753851712 submission_runner.py:376] Time since start: 46283.52s, Step: 132036, {'train/accuracy': 0.9046555757522583, 'train/loss': 0.5945121645927429, 'validation/accuracy': 0.7727400064468384, 'validation/loss': 1.1170986890792847, 'validation/num_examples': 50000, 'test/accuracy': 0.6539000272750854, 'test/loss': 1.6980239152908325, 'test/num_examples': 10000, 'score': 44441.67366838455, 'total_duration': 46283.51500082016, 'accumulated_submission_time': 44441.67366838455, 'accumulated_eval_time': 1836.7223308086395, 'accumulated_logging_time': 3.1229963302612305}
I0914 20:06:02.574327 139621090420480 logging_writer.py:48] [132036] accumulated_eval_time=1836.722331, accumulated_logging_time=3.122996, accumulated_submission_time=44441.673668, global_step=132036, preemption_count=0, score=44441.673668, test/accuracy=0.653900, test/loss=1.698024, test/num_examples=10000, total_duration=46283.515001, train/accuracy=0.904656, train/loss=0.594512, validation/accuracy=0.772740, validation/loss=1.117099, validation/num_examples=50000
I0914 20:08:38.948357 139621098813184 logging_writer.py:48] [132500] global_step=132500, grad_norm=0.506624698638916, loss=2.62170147895813
I0914 20:11:27.108828 139621090420480 logging_writer.py:48] [133000] global_step=133000, grad_norm=0.4876885414123535, loss=2.5655786991119385
I0914 20:14:15.160693 139621098813184 logging_writer.py:48] [133500] global_step=133500, grad_norm=0.511705219745636, loss=2.5628979206085205
I0914 20:14:32.740298 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:14:40.237702 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:14:51.334104 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:14:53.563938 139785753851712 submission_runner.py:376] Time since start: 46814.53s, Step: 133554, {'train/accuracy': 0.9058513641357422, 'train/loss': 0.5935448408126831, 'validation/accuracy': 0.772599995136261, 'validation/loss': 1.117397427558899, 'validation/num_examples': 50000, 'test/accuracy': 0.6536000370979309, 'test/loss': 1.6978504657745361, 'test/num_examples': 10000, 'score': 44951.80677986145, 'total_duration': 46814.53054857254, 'accumulated_submission_time': 44951.80677986145, 'accumulated_eval_time': 1857.5457971096039, 'accumulated_logging_time': 3.1587300300598145}
I0914 20:14:53.591204 139618036999936 logging_writer.py:48] [133554] accumulated_eval_time=1857.545797, accumulated_logging_time=3.158730, accumulated_submission_time=44951.806780, global_step=133554, preemption_count=0, score=44951.806780, test/accuracy=0.653600, test/loss=1.697850, test/num_examples=10000, total_duration=46814.530549, train/accuracy=0.905851, train/loss=0.593545, validation/accuracy=0.772600, validation/loss=1.117397, validation/num_examples=50000
I0914 20:17:23.739792 139618045392640 logging_writer.py:48] [134000] global_step=134000, grad_norm=0.5241892337799072, loss=2.5950281620025635
I0914 20:20:11.909889 139618036999936 logging_writer.py:48] [134500] global_step=134500, grad_norm=0.5013359189033508, loss=2.549062967300415
I0914 20:23:00.125262 139618045392640 logging_writer.py:48] [135000] global_step=135000, grad_norm=0.49756449460983276, loss=2.585008382797241
I0914 20:23:23.762339 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:23:31.236675 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:23:42.431392 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:23:44.701690 139785753851712 submission_runner.py:376] Time since start: 47345.67s, Step: 135072, {'train/accuracy': 0.9068678021430969, 'train/loss': 0.5902974605560303, 'validation/accuracy': 0.7732200026512146, 'validation/loss': 1.1165798902511597, 'validation/num_examples': 50000, 'test/accuracy': 0.6538000106811523, 'test/loss': 1.698063611984253, 'test/num_examples': 10000, 'score': 45461.94407606125, 'total_duration': 47345.66832566261, 'accumulated_submission_time': 45461.94407606125, 'accumulated_eval_time': 1878.4849972724915, 'accumulated_logging_time': 3.197124719619751}
I0914 20:23:44.728093 139618036999936 logging_writer.py:48] [135072] accumulated_eval_time=1878.484997, accumulated_logging_time=3.197125, accumulated_submission_time=45461.944076, global_step=135072, preemption_count=0, score=45461.944076, test/accuracy=0.653800, test/loss=1.698064, test/num_examples=10000, total_duration=47345.668326, train/accuracy=0.906868, train/loss=0.590297, validation/accuracy=0.773220, validation/loss=1.116580, validation/num_examples=50000
I0914 20:26:08.701267 139621090420480 logging_writer.py:48] [135500] global_step=135500, grad_norm=0.5120027661323547, loss=2.583296298980713
I0914 20:28:56.874789 139618036999936 logging_writer.py:48] [136000] global_step=136000, grad_norm=0.5158140063285828, loss=2.5482170581817627
I0914 20:31:45.001054 139621090420480 logging_writer.py:48] [136500] global_step=136500, grad_norm=0.5265464186668396, loss=2.591872453689575
I0914 20:32:14.701688 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:32:22.171722 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:32:33.272115 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:32:35.523978 139785753851712 submission_runner.py:376] Time since start: 47876.49s, Step: 136590, {'train/accuracy': 0.9063496589660645, 'train/loss': 0.5868960022926331, 'validation/accuracy': 0.7733599543571472, 'validation/loss': 1.11497962474823, 'validation/num_examples': 50000, 'test/accuracy': 0.6541000604629517, 'test/loss': 1.6960680484771729, 'test/num_examples': 10000, 'score': 45971.88573241234, 'total_duration': 47876.49062347412, 'accumulated_submission_time': 45971.88573241234, 'accumulated_eval_time': 1899.3071541786194, 'accumulated_logging_time': 3.2327637672424316}
I0914 20:32:35.551081 139618036999936 logging_writer.py:48] [136590] accumulated_eval_time=1899.307154, accumulated_logging_time=3.232764, accumulated_submission_time=45971.885732, global_step=136590, preemption_count=0, score=45971.885732, test/accuracy=0.654100, test/loss=1.696068, test/num_examples=10000, total_duration=47876.490623, train/accuracy=0.906350, train/loss=0.586896, validation/accuracy=0.773360, validation/loss=1.114980, validation/num_examples=50000
I0914 20:34:53.602206 139618045392640 logging_writer.py:48] [137000] global_step=137000, grad_norm=0.5001352429389954, loss=2.6086196899414062
I0914 20:37:41.751103 139618036999936 logging_writer.py:48] [137500] global_step=137500, grad_norm=0.5193489193916321, loss=2.615438461303711
I0914 20:40:29.898882 139618045392640 logging_writer.py:48] [138000] global_step=138000, grad_norm=0.5259115099906921, loss=2.6131694316864014
I0914 20:41:05.654629 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:41:13.167435 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:41:24.345569 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:41:26.596942 139785753851712 submission_runner.py:376] Time since start: 48407.56s, Step: 138108, {'train/accuracy': 0.907645046710968, 'train/loss': 0.5866842865943909, 'validation/accuracy': 0.7729199528694153, 'validation/loss': 1.1145741939544678, 'validation/num_examples': 50000, 'test/accuracy': 0.6550000309944153, 'test/loss': 1.6953842639923096, 'test/num_examples': 10000, 'score': 46481.956923007965, 'total_duration': 48407.56366467476, 'accumulated_submission_time': 46481.956923007965, 'accumulated_eval_time': 1920.2494142055511, 'accumulated_logging_time': 3.2690396308898926}
I0914 20:41:26.624465 139621090420480 logging_writer.py:48] [138108] accumulated_eval_time=1920.249414, accumulated_logging_time=3.269040, accumulated_submission_time=46481.956923, global_step=138108, preemption_count=0, score=46481.956923, test/accuracy=0.655000, test/loss=1.695384, test/num_examples=10000, total_duration=48407.563665, train/accuracy=0.907645, train/loss=0.586684, validation/accuracy=0.772920, validation/loss=1.114574, validation/num_examples=50000
I0914 20:43:38.498144 139621098813184 logging_writer.py:48] [138500] global_step=138500, grad_norm=0.4836212694644928, loss=2.522245407104492
I0914 20:46:26.472573 139621090420480 logging_writer.py:48] [139000] global_step=139000, grad_norm=0.48201984167099, loss=2.5357580184936523
I0914 20:49:14.508329 139621098813184 logging_writer.py:48] [139500] global_step=139500, grad_norm=0.5080897212028503, loss=2.6238484382629395
I0914 20:49:56.659739 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:50:04.166987 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:50:15.245995 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:50:17.443364 139785753851712 submission_runner.py:376] Time since start: 48938.41s, Step: 139627, {'train/accuracy': 0.9070870280265808, 'train/loss': 0.5904383659362793, 'validation/accuracy': 0.7734000086784363, 'validation/loss': 1.1148360967636108, 'validation/num_examples': 50000, 'test/accuracy': 0.6565000414848328, 'test/loss': 1.696603775024414, 'test/num_examples': 10000, 'score': 46991.960359573364, 'total_duration': 48938.4100048542, 'accumulated_submission_time': 46991.960359573364, 'accumulated_eval_time': 1941.032898902893, 'accumulated_logging_time': 3.3056366443634033}
I0914 20:50:17.474641 139620410959616 logging_writer.py:48] [139627] accumulated_eval_time=1941.032899, accumulated_logging_time=3.305637, accumulated_submission_time=46991.960360, global_step=139627, preemption_count=0, score=46991.960360, test/accuracy=0.656500, test/loss=1.696604, test/num_examples=10000, total_duration=48938.410005, train/accuracy=0.907087, train/loss=0.590438, validation/accuracy=0.773400, validation/loss=1.114836, validation/num_examples=50000
I0914 20:52:22.570719 139785753851712 spec.py:320] Evaluating on the training split.
I0914 20:52:29.954459 139785753851712 spec.py:332] Evaluating on the validation split.
I0914 20:52:40.947678 139785753851712 spec.py:348] Evaluating on the test split.
I0914 20:52:43.221620 139785753851712 submission_runner.py:376] Time since start: 49084.19s, Step: 140000, {'train/accuracy': 0.9084422588348389, 'train/loss': 0.578050971031189, 'validation/accuracy': 0.7722199559211731, 'validation/loss': 1.1122545003890991, 'validation/num_examples': 50000, 'test/accuracy': 0.655500054359436, 'test/loss': 1.6934614181518555, 'test/num_examples': 10000, 'score': 47117.04019618034, 'total_duration': 49084.18835735321, 'accumulated_submission_time': 47117.04019618034, 'accumulated_eval_time': 1961.6837706565857, 'accumulated_logging_time': 3.3469996452331543}
I0914 20:52:43.254832 139618036999936 logging_writer.py:48] [140000] accumulated_eval_time=1961.683771, accumulated_logging_time=3.347000, accumulated_submission_time=47117.040196, global_step=140000, preemption_count=0, score=47117.040196, test/accuracy=0.655500, test/loss=1.693461, test/num_examples=10000, total_duration=49084.188357, train/accuracy=0.908442, train/loss=0.578051, validation/accuracy=0.772220, validation/loss=1.112255, validation/num_examples=50000
I0914 20:52:43.278700 139621082027776 logging_writer.py:48] [140000] global_step=140000, preemption_count=0, score=47117.040196
I0914 20:52:43.516733 139785753851712 checkpoints.py:490] Saving checkpoint at step: 140000
I0914 20:52:44.370062 139785753851712 checkpoints.py:422] Saved checkpoint at /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/checkpoint_140000
I0914 20:52:44.389781 139785753851712 checkpoint_utils.py:240] Saved checkpoint to /experiment_runs/targets_check_jax/momentum_run_0/imagenet_resnet_jax/trial_1/checkpoint_140000.
I0914 20:52:45.253473 139785753851712 submission_runner.py:540] Tuning trial 1/1
I0914 20:52:45.253736 139785753851712 submission_runner.py:541] Hyperparameters: Hyperparameters(learning_rate=4.131896390902391, beta1=0.9274758113254791, beta2=0.9978504782314613, warmup_steps=6999, decay_steps_factor=0.9007765761611038, end_factor=0.001, weight_decay=5.6687777311501786e-06, label_smoothing=0.2)
I0914 20:52:45.258056 139785753851712 submission_runner.py:542] Metrics: {'eval_results': [(1, {'train/accuracy': 0.0009367027669213712, 'train/loss': 6.9118571281433105, 'validation/accuracy': 0.0010400000028312206, 'validation/loss': 6.911978721618652, 'validation/num_examples': 50000, 'test/accuracy': 0.0014000000664964318, 'test/loss': 6.91181755065918, 'test/num_examples': 10000, 'score': 62.32368874549866, 'total_duration': 109.0933792591095, 'accumulated_submission_time': 62.32368874549866, 'accumulated_eval_time': 46.76959991455078, 'accumulated_logging_time': 0, 'global_step': 1, 'preemption_count': 0}), (1514, {'train/accuracy': 0.18895487487316132, 'train/loss': 4.249844551086426, 'validation/accuracy': 0.1698399931192398, 'validation/loss': 4.387373924255371, 'validation/num_examples': 50000, 'test/accuracy': 0.12890000641345978, 'test/loss': 4.780310153961182, 'test/num_examples': 10000, 'score': 572.3801600933075, 'total_duration': 636.7591044902802, 'accumulated_submission_time': 572.3801600933075, 'accumulated_eval_time': 64.32867097854614, 'accumulated_logging_time': 0.02817511558532715, 'global_step': 1514, 'preemption_count': 0}), (3030, {'train/accuracy': 0.35439252853393555, 'train/loss': 3.16475772857666, 'validation/accuracy': 0.32311999797821045, 'validation/loss': 3.354123592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.24500000476837158, 'test/loss': 3.896735429763794, 'test/num_examples': 10000, 'score': 1082.4337828159332, 'total_duration': 1164.5784351825714, 'accumulated_submission_time': 1082.4337828159332, 'accumulated_eval_time': 82.04390573501587, 'accumulated_logging_time': 0.05537152290344238, 'global_step': 3030, 'preemption_count': 0}), (4547, {'train/accuracy': 0.4696468412876129, 'train/loss': 2.565059185028076, 'validation/accuracy': 0.4343400001525879, 'validation/loss': 2.7238423824310303, 'validation/num_examples': 50000, 'test/accuracy': 0.3296000063419342, 'test/loss': 3.3760478496551514, 
'test/num_examples': 10000, 'score': 1592.5839030742645, 'total_duration': 1692.2868733406067, 'accumulated_submission_time': 1592.5839030742645, 'accumulated_eval_time': 99.54879140853882, 'accumulated_logging_time': 0.08617043495178223, 'global_step': 4547, 'preemption_count': 0}), (6064, {'train/accuracy': 0.485072523355484, 'train/loss': 2.4722769260406494, 'validation/accuracy': 0.45179998874664307, 'validation/loss': 2.632256269454956, 'validation/num_examples': 50000, 'test/accuracy': 0.3456000089645386, 'test/loss': 3.28208327293396, 'test/num_examples': 10000, 'score': 2102.707985162735, 'total_duration': 2220.0234982967377, 'accumulated_submission_time': 2102.707985162735, 'accumulated_eval_time': 117.10866022109985, 'accumulated_logging_time': 0.11611032485961914, 'global_step': 6064, 'preemption_count': 0}), (7581, {'train/accuracy': 0.5335220098495483, 'train/loss': 2.235621452331543, 'validation/accuracy': 0.5026400089263916, 'validation/loss': 2.388488531112671, 'validation/num_examples': 50000, 'test/accuracy': 0.392300009727478, 'test/loss': 3.0264182090759277, 'test/num_examples': 10000, 'score': 2612.7207324504852, 'total_duration': 2747.8368368148804, 'accumulated_submission_time': 2612.7207324504852, 'accumulated_eval_time': 134.8591091632843, 'accumulated_logging_time': 0.14357876777648926, 'global_step': 7581, 'preemption_count': 0}), (9098, {'train/accuracy': 0.5826291441917419, 'train/loss': 2.0229527950286865, 'validation/accuracy': 0.5064799785614014, 'validation/loss': 2.3824005126953125, 'validation/num_examples': 50000, 'test/accuracy': 0.3856000304222107, 'test/loss': 3.0475265979766846, 'test/num_examples': 10000, 'score': 3122.8992822170258, 'total_duration': 3275.9179759025574, 'accumulated_submission_time': 3122.8992822170258, 'accumulated_eval_time': 152.71290373802185, 'accumulated_logging_time': 0.17004680633544922, 'global_step': 9098, 'preemption_count': 0}), (10615, {'train/accuracy': 0.5702128410339355, 'train/loss': 
2.0194029808044434, 'validation/accuracy': 0.5161600112915039, 'validation/loss': 2.2897284030914307, 'validation/num_examples': 50000, 'test/accuracy': 0.3993000090122223, 'test/loss': 2.9481253623962402, 'test/num_examples': 10000, 'score': 3633.0976436138153, 'total_duration': 3804.0056059360504, 'accumulated_submission_time': 3633.0976436138153, 'accumulated_eval_time': 170.5518193244934, 'accumulated_logging_time': 0.1971442699432373, 'global_step': 10615, 'preemption_count': 0}), (12132, {'train/accuracy': 0.5753945708274841, 'train/loss': 2.0027692317962646, 'validation/accuracy': 0.5300599932670593, 'validation/loss': 2.2071869373321533, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.8690314292907715, 'test/num_examples': 10000, 'score': 4143.2383596897125, 'total_duration': 4332.050138235092, 'accumulated_submission_time': 4143.2383596897125, 'accumulated_eval_time': 188.3957643508911, 'accumulated_logging_time': 0.23388051986694336, 'global_step': 12132, 'preemption_count': 0}), (13649, {'train/accuracy': 0.5742785334587097, 'train/loss': 2.003106117248535, 'validation/accuracy': 0.5323799848556519, 'validation/loss': 2.2081964015960693, 'validation/num_examples': 50000, 'test/accuracy': 0.4196000099182129, 'test/loss': 2.892209529876709, 'test/num_examples': 10000, 'score': 4653.37796998024, 'total_duration': 4860.886433124542, 'accumulated_submission_time': 4653.37796998024, 'accumulated_eval_time': 207.04141402244568, 'accumulated_logging_time': 0.26221203804016113, 'global_step': 13649, 'preemption_count': 0}), (15166, {'train/accuracy': 0.5779455900192261, 'train/loss': 2.057650327682495, 'validation/accuracy': 0.5369799733161926, 'validation/loss': 2.2513246536254883, 'validation/num_examples': 50000, 'test/accuracy': 0.414900004863739, 'test/loss': 2.942534923553467, 'test/num_examples': 10000, 'score': 5163.59117102623, 'total_duration': 5389.39173579216, 'accumulated_submission_time': 5163.59117102623, 
'accumulated_eval_time': 225.25672912597656, 'accumulated_logging_time': 0.31658077239990234, 'global_step': 15166, 'preemption_count': 0}), (16683, {'train/accuracy': 0.5338209271430969, 'train/loss': 2.335331678390503, 'validation/accuracy': 0.5003600120544434, 'validation/loss': 2.478173017501831, 'validation/num_examples': 50000, 'test/accuracy': 0.38210001587867737, 'test/loss': 3.1922895908355713, 'test/num_examples': 10000, 'score': 5673.775738954544, 'total_duration': 5919.104665517807, 'accumulated_submission_time': 5673.775738954544, 'accumulated_eval_time': 244.7305908203125, 'accumulated_logging_time': 0.3483104705810547, 'global_step': 16683, 'preemption_count': 0}), (18201, {'train/accuracy': 0.6382134556770325, 'train/loss': 1.74356210231781, 'validation/accuracy': 0.5595200061798096, 'validation/loss': 2.1038706302642822, 'validation/num_examples': 50000, 'test/accuracy': 0.43620002269744873, 'test/loss': 2.7823822498321533, 'test/num_examples': 10000, 'score': 6183.962655305862, 'total_duration': 6448.986067771912, 'accumulated_submission_time': 6183.962655305862, 'accumulated_eval_time': 264.35929918289185, 'accumulated_logging_time': 0.39099645614624023, 'global_step': 18201, 'preemption_count': 0}), (19718, {'train/accuracy': 0.5982142686843872, 'train/loss': 1.9227712154388428, 'validation/accuracy': 0.5423600077629089, 'validation/loss': 2.185263156890869, 'validation/num_examples': 50000, 'test/accuracy': 0.4272000193595886, 'test/loss': 2.8164870738983154, 'test/num_examples': 10000, 'score': 6694.032505512238, 'total_duration': 6979.200145483017, 'accumulated_submission_time': 6694.032505512238, 'accumulated_eval_time': 284.4493384361267, 'accumulated_logging_time': 0.4227602481842041, 'global_step': 19718, 'preemption_count': 0}), (21235, {'train/accuracy': 0.6011040806770325, 'train/loss': 1.9267860651016235, 'validation/accuracy': 0.5546199679374695, 'validation/loss': 2.1422059535980225, 'validation/num_examples': 50000, 
'test/accuracy': 0.4358000159263611, 'test/loss': 2.809361457824707, 'test/num_examples': 10000, 'score': 7203.964419841766, 'total_duration': 7508.275420188904, 'accumulated_submission_time': 7203.964419841766, 'accumulated_eval_time': 303.52203822135925, 'accumulated_logging_time': 0.47086191177368164, 'global_step': 21235, 'preemption_count': 0}), (22753, {'train/accuracy': 0.5950454473495483, 'train/loss': 1.9752215147018433, 'validation/accuracy': 0.5528799891471863, 'validation/loss': 2.1724321842193604, 'validation/num_examples': 50000, 'test/accuracy': 0.4336000084877014, 'test/loss': 2.810899257659912, 'test/num_examples': 10000, 'score': 7714.026890993118, 'total_duration': 8037.667640447617, 'accumulated_submission_time': 7714.026890993118, 'accumulated_eval_time': 322.7941265106201, 'accumulated_logging_time': 0.5051746368408203, 'global_step': 22753, 'preemption_count': 0}), (24270, {'train/accuracy': 0.5795798897743225, 'train/loss': 2.023024320602417, 'validation/accuracy': 0.5407800078392029, 'validation/loss': 2.230668306350708, 'validation/num_examples': 50000, 'test/accuracy': 0.41540002822875977, 'test/loss': 2.895124673843384, 'test/num_examples': 10000, 'score': 8224.06004691124, 'total_duration': 8567.42837190628, 'accumulated_submission_time': 8224.06004691124, 'accumulated_eval_time': 342.4695653915405, 'accumulated_logging_time': 0.5343174934387207, 'global_step': 24270, 'preemption_count': 0}), (25788, {'train/accuracy': 0.6070232391357422, 'train/loss': 1.906052589416504, 'validation/accuracy': 0.5681399703025818, 'validation/loss': 2.081261396408081, 'validation/num_examples': 50000, 'test/accuracy': 0.4448000192642212, 'test/loss': 2.7579729557037354, 'test/num_examples': 10000, 'score': 8734.046847581863, 'total_duration': 9097.127324581146, 'accumulated_submission_time': 8734.046847581863, 'accumulated_eval_time': 362.12878465652466, 'accumulated_logging_time': 0.5639915466308594, 'global_step': 25788, 'preemption_count': 0}), 
(27305, {'train/accuracy': 0.6470025181770325, 'train/loss': 1.7079155445098877, 'validation/accuracy': 0.5748400092124939, 'validation/loss': 2.0490245819091797, 'validation/num_examples': 50000, 'test/accuracy': 0.4561000168323517, 'test/loss': 2.6967175006866455, 'test/num_examples': 10000, 'score': 9244.031145811081, 'total_duration': 9627.8626434803, 'accumulated_submission_time': 9244.031145811081, 'accumulated_eval_time': 382.8217294216156, 'accumulated_logging_time': 0.5991528034210205, 'global_step': 27305, 'preemption_count': 0}), (28822, {'train/accuracy': 0.6150151491165161, 'train/loss': 1.8384064435958862, 'validation/accuracy': 0.5671799778938293, 'validation/loss': 2.073777675628662, 'validation/num_examples': 50000, 'test/accuracy': 0.4439000189304352, 'test/loss': 2.7257843017578125, 'test/num_examples': 10000, 'score': 9754.244331598282, 'total_duration': 10158.26204609871, 'accumulated_submission_time': 9754.244331598282, 'accumulated_eval_time': 402.9500343799591, 'accumulated_logging_time': 0.6338088512420654, 'global_step': 28822, 'preemption_count': 0}), (30340, {'train/accuracy': 0.6219905614852905, 'train/loss': 1.8566009998321533, 'validation/accuracy': 0.5748000144958496, 'validation/loss': 2.0822231769561768, 'validation/num_examples': 50000, 'test/accuracy': 0.4506000280380249, 'test/loss': 2.730398178100586, 'test/num_examples': 10000, 'score': 10264.379125356674, 'total_duration': 10689.885441303253, 'accumulated_submission_time': 10264.379125356674, 'accumulated_eval_time': 424.3793590068817, 'accumulated_logging_time': 0.6697630882263184, 'global_step': 30340, 'preemption_count': 0}), (31857, {'train/accuracy': 0.6304607391357422, 'train/loss': 1.7625677585601807, 'validation/accuracy': 0.5828399658203125, 'validation/loss': 1.97676682472229, 'validation/num_examples': 50000, 'test/accuracy': 0.460500031709671, 'test/loss': 2.6475884914398193, 'test/num_examples': 10000, 'score': 10774.378732919693, 'total_duration': 
11221.44010066986, 'accumulated_submission_time': 10774.378732919693, 'accumulated_eval_time': 445.87978982925415, 'accumulated_logging_time': 0.7013595104217529, 'global_step': 31857, 'preemption_count': 0}), (33374, {'train/accuracy': 0.6193598508834839, 'train/loss': 1.826348066329956, 'validation/accuracy': 0.5787799954414368, 'validation/loss': 2.038647413253784, 'validation/num_examples': 50000, 'test/accuracy': 0.46250003576278687, 'test/loss': 2.679227352142334, 'test/num_examples': 10000, 'score': 11284.596626758575, 'total_duration': 11753.228493452072, 'accumulated_submission_time': 11284.596626758575, 'accumulated_eval_time': 467.390745639801, 'accumulated_logging_time': 0.738187313079834, 'global_step': 33374, 'preemption_count': 0}), (34891, {'train/accuracy': 0.6135203838348389, 'train/loss': 1.8042339086532593, 'validation/accuracy': 0.5674799680709839, 'validation/loss': 2.0255982875823975, 'validation/num_examples': 50000, 'test/accuracy': 0.4407000243663788, 'test/loss': 2.6929032802581787, 'test/num_examples': 10000, 'score': 11794.77813744545, 'total_duration': 12285.116560459137, 'accumulated_submission_time': 11794.77813744545, 'accumulated_eval_time': 489.0379819869995, 'accumulated_logging_time': 0.7739980220794678, 'global_step': 34891, 'preemption_count': 0}), (36408, {'train/accuracy': 0.6544762253761292, 'train/loss': 1.7204208374023438, 'validation/accuracy': 0.5828199982643127, 'validation/loss': 2.025986433029175, 'validation/num_examples': 50000, 'test/accuracy': 0.45920002460479736, 'test/loss': 2.709810733795166, 'test/num_examples': 10000, 'score': 12304.926926612854, 'total_duration': 12817.259620189667, 'accumulated_submission_time': 12304.926926612854, 'accumulated_eval_time': 510.97503638267517, 'accumulated_logging_time': 0.8083920478820801, 'global_step': 36408, 'preemption_count': 0}), (37926, {'train/accuracy': 0.6563695669174194, 'train/loss': 1.7108644247055054, 'validation/accuracy': 0.5983799695968628, 
'validation/loss': 1.9706319570541382, 'validation/num_examples': 50000, 'test/accuracy': 0.48260003328323364, 'test/loss': 2.6029040813446045, 'test/num_examples': 10000, 'score': 12815.176861763, 'total_duration': 13348.463897228241, 'accumulated_submission_time': 12815.176861763, 'accumulated_eval_time': 531.8736464977264, 'accumulated_logging_time': 0.8413448333740234, 'global_step': 37926, 'preemption_count': 0}), (39443, {'train/accuracy': 0.6421595811843872, 'train/loss': 1.6656968593597412, 'validation/accuracy': 0.5947999954223633, 'validation/loss': 1.8805724382400513, 'validation/num_examples': 50000, 'test/accuracy': 0.4724000096321106, 'test/loss': 2.5521347522735596, 'test/num_examples': 10000, 'score': 13325.226942777634, 'total_duration': 13879.367692947388, 'accumulated_submission_time': 13325.226942777634, 'accumulated_eval_time': 552.67098736763, 'accumulated_logging_time': 0.8749892711639404, 'global_step': 39443, 'preemption_count': 0}), (40961, {'train/accuracy': 0.6224888563156128, 'train/loss': 1.8308100700378418, 'validation/accuracy': 0.5813800096511841, 'validation/loss': 2.03011155128479, 'validation/num_examples': 50000, 'test/accuracy': 0.46330001950263977, 'test/loss': 2.654615640640259, 'test/num_examples': 10000, 'score': 13835.485067367554, 'total_duration': 14410.576689004898, 'accumulated_submission_time': 13835.485067367554, 'accumulated_eval_time': 573.5648620128632, 'accumulated_logging_time': 0.9091906547546387, 'global_step': 40961, 'preemption_count': 0}), (42478, {'train/accuracy': 0.6535195708274841, 'train/loss': 1.6985844373703003, 'validation/accuracy': 0.6078799962997437, 'validation/loss': 1.8977073431015015, 'validation/num_examples': 50000, 'test/accuracy': 0.4879000186920166, 'test/loss': 2.5333468914031982, 'test/num_examples': 10000, 'score': 14345.642942905426, 'total_duration': 14941.717395067215, 'accumulated_submission_time': 14345.642942905426, 'accumulated_eval_time': 594.4908349514008, 
'accumulated_logging_time': 0.9433751106262207, 'global_step': 42478, 'preemption_count': 0}), (43995, {'train/accuracy': 0.6800063848495483, 'train/loss': 1.590664267539978, 'validation/accuracy': 0.5967199802398682, 'validation/loss': 1.9601261615753174, 'validation/num_examples': 50000, 'test/accuracy': 0.47510001063346863, 'test/loss': 2.590411424636841, 'test/num_examples': 10000, 'score': 14855.678381443024, 'total_duration': 15472.915107250214, 'accumulated_submission_time': 14855.678381443024, 'accumulated_eval_time': 615.5962023735046, 'accumulated_logging_time': 0.9775500297546387, 'global_step': 43995, 'preemption_count': 0}), (45512, {'train/accuracy': 0.6493741869926453, 'train/loss': 1.7071624994277954, 'validation/accuracy': 0.5888199806213379, 'validation/loss': 2.0002124309539795, 'validation/num_examples': 50000, 'test/accuracy': 0.45580002665519714, 'test/loss': 2.681110382080078, 'test/num_examples': 10000, 'score': 15365.652275562286, 'total_duration': 16004.144119977951, 'accumulated_submission_time': 15365.652275562286, 'accumulated_eval_time': 636.7960221767426, 'accumulated_logging_time': 1.010751485824585, 'global_step': 45512, 'preemption_count': 0}), (47029, {'train/accuracy': 0.6650390625, 'train/loss': 1.6022390127182007, 'validation/accuracy': 0.6084200143814087, 'validation/loss': 1.8593943119049072, 'validation/num_examples': 50000, 'test/accuracy': 0.4799000322818756, 'test/loss': 2.522620677947998, 'test/num_examples': 10000, 'score': 15875.610366106033, 'total_duration': 16535.40897345543, 'accumulated_submission_time': 15875.610366106033, 'accumulated_eval_time': 658.0466804504395, 'accumulated_logging_time': 1.043591022491455, 'global_step': 47029, 'preemption_count': 0}), (48546, {'train/accuracy': 0.6487563848495483, 'train/loss': 1.6832637786865234, 'validation/accuracy': 0.5993599891662598, 'validation/loss': 1.918584942817688, 'validation/num_examples': 50000, 'test/accuracy': 0.47190001606941223, 'test/loss': 
2.5920767784118652, 'test/num_examples': 10000, 'score': 16385.556366205215, 'total_duration': 17066.591827869415, 'accumulated_submission_time': 16385.556366205215, 'accumulated_eval_time': 679.2269690036774, 'accumulated_logging_time': 1.0766348838806152, 'global_step': 48546, 'preemption_count': 0}), (50063, {'train/accuracy': 0.6590401530265808, 'train/loss': 1.6971187591552734, 'validation/accuracy': 0.6092999577522278, 'validation/loss': 1.9145151376724243, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5815019607543945, 'test/num_examples': 10000, 'score': 16895.64519429207, 'total_duration': 17597.92692875862, 'accumulated_submission_time': 16895.64519429207, 'accumulated_eval_time': 700.4168322086334, 'accumulated_logging_time': 1.1097350120544434, 'global_step': 50063, 'preemption_count': 0}), (51580, {'train/accuracy': 0.6513472199440002, 'train/loss': 1.657335877418518, 'validation/accuracy': 0.6073200106620789, 'validation/loss': 1.8589073419570923, 'validation/num_examples': 50000, 'test/accuracy': 0.4838000237941742, 'test/loss': 2.5184545516967773, 'test/num_examples': 10000, 'score': 17405.740804433823, 'total_duration': 18129.24203300476, 'accumulated_submission_time': 17405.740804433823, 'accumulated_eval_time': 721.5812134742737, 'accumulated_logging_time': 1.142387866973877, 'global_step': 51580, 'preemption_count': 0}), (53098, {'train/accuracy': 0.6906289458274841, 'train/loss': 1.4674862623214722, 'validation/accuracy': 0.5989800095558167, 'validation/loss': 1.8677852153778076, 'validation/num_examples': 50000, 'test/accuracy': 0.48270002007484436, 'test/loss': 2.5035319328308105, 'test/num_examples': 10000, 'score': 17915.94573879242, 'total_duration': 18660.634664297104, 'accumulated_submission_time': 17915.94573879242, 'accumulated_eval_time': 742.713464975357, 'accumulated_logging_time': 1.1745285987854004, 'global_step': 53098, 'preemption_count': 0}), (54616, {'train/accuracy': 0.6725525856018066, 
'train/loss': 1.5843250751495361, 'validation/accuracy': 0.6118199825286865, 'validation/loss': 1.859339952468872, 'validation/num_examples': 50000, 'test/accuracy': 0.47780001163482666, 'test/loss': 2.5555808544158936, 'test/num_examples': 10000, 'score': 18425.954810380936, 'total_duration': 19191.906791448593, 'accumulated_submission_time': 18425.954810380936, 'accumulated_eval_time': 763.9139442443848, 'accumulated_logging_time': 1.2138841152191162, 'global_step': 54616, 'preemption_count': 0}), (56133, {'train/accuracy': 0.6829958558082581, 'train/loss': 1.4818432331085205, 'validation/accuracy': 0.6251199841499329, 'validation/loss': 1.7404627799987793, 'validation/num_examples': 50000, 'test/accuracy': 0.5063000321388245, 'test/loss': 2.392498254776001, 'test/num_examples': 10000, 'score': 18936.050762176514, 'total_duration': 19723.13370156288, 'accumulated_submission_time': 18936.050762176514, 'accumulated_eval_time': 784.9846830368042, 'accumulated_logging_time': 1.2514803409576416, 'global_step': 56133, 'preemption_count': 0}), (57651, {'train/accuracy': 0.6757014989852905, 'train/loss': 1.5783743858337402, 'validation/accuracy': 0.619879961013794, 'validation/loss': 1.8317604064941406, 'validation/num_examples': 50000, 'test/accuracy': 0.49640002846717834, 'test/loss': 2.4753739833831787, 'test/num_examples': 10000, 'score': 19446.212296009064, 'total_duration': 20254.494203805923, 'accumulated_submission_time': 19446.212296009064, 'accumulated_eval_time': 806.1234366893768, 'accumulated_logging_time': 1.2888352870941162, 'global_step': 57651, 'preemption_count': 0}), (59168, {'train/accuracy': 0.6541573405265808, 'train/loss': 1.6419929265975952, 'validation/accuracy': 0.6013599634170532, 'validation/loss': 1.8765325546264648, 'validation/num_examples': 50000, 'test/accuracy': 0.4806000292301178, 'test/loss': 2.5387802124023438, 'test/num_examples': 10000, 'score': 19956.23943376541, 'total_duration': 20785.629487276077, 'accumulated_submission_time': 
19956.23943376541, 'accumulated_eval_time': 827.1725206375122, 'accumulated_logging_time': 1.3252980709075928, 'global_step': 59168, 'preemption_count': 0}), (60685, {'train/accuracy': 0.6724131107330322, 'train/loss': 1.6111648082733154, 'validation/accuracy': 0.6247599720954895, 'validation/loss': 1.8165513277053833, 'validation/num_examples': 50000, 'test/accuracy': 0.49490001797676086, 'test/loss': 2.476473093032837, 'test/num_examples': 10000, 'score': 20466.31075167656, 'total_duration': 21316.890612602234, 'accumulated_submission_time': 20466.31075167656, 'accumulated_eval_time': 848.3012022972107, 'accumulated_logging_time': 1.362497329711914, 'global_step': 60685, 'preemption_count': 0}), (62203, {'train/accuracy': 0.6910076141357422, 'train/loss': 1.4848798513412476, 'validation/accuracy': 0.611739993095398, 'validation/loss': 1.8428031206130981, 'validation/num_examples': 50000, 'test/accuracy': 0.4788000285625458, 'test/loss': 2.5234971046447754, 'test/num_examples': 10000, 'score': 20976.412611722946, 'total_duration': 21848.198214292526, 'accumulated_submission_time': 20976.412611722946, 'accumulated_eval_time': 869.4459004402161, 'accumulated_logging_time': 1.4001359939575195, 'global_step': 62203, 'preemption_count': 0}), (63721, {'train/accuracy': 0.6920240521430969, 'train/loss': 1.4649821519851685, 'validation/accuracy': 0.6265400052070618, 'validation/loss': 1.7699750661849976, 'validation/num_examples': 50000, 'test/accuracy': 0.5057000517845154, 'test/loss': 2.418595314025879, 'test/num_examples': 10000, 'score': 21486.580602407455, 'total_duration': 22379.547651052475, 'accumulated_submission_time': 21486.580602407455, 'accumulated_eval_time': 890.5707561969757, 'accumulated_logging_time': 1.433117389678955, 'global_step': 63721, 'preemption_count': 0}), (65239, {'train/accuracy': 0.6842514276504517, 'train/loss': 1.510816216468811, 'validation/accuracy': 0.6255399584770203, 'validation/loss': 1.7839473485946655, 'validation/num_examples': 
50000, 'test/accuracy': 0.4951000213623047, 'test/loss': 2.4557323455810547, 'test/num_examples': 10000, 'score': 21996.856678962708, 'total_duration': 22911.09648013115, 'accumulated_submission_time': 21996.856678962708, 'accumulated_eval_time': 911.7887194156647, 'accumulated_logging_time': 1.464731216430664, 'global_step': 65239, 'preemption_count': 0}), (66757, {'train/accuracy': 0.6906688213348389, 'train/loss': 1.5202724933624268, 'validation/accuracy': 0.6343199610710144, 'validation/loss': 1.7723238468170166, 'validation/num_examples': 50000, 'test/accuracy': 0.5078000426292419, 'test/loss': 2.4403321743011475, 'test/num_examples': 10000, 'score': 22506.95446920395, 'total_duration': 23442.88888812065, 'accumulated_submission_time': 22506.95446920395, 'accumulated_eval_time': 933.426650762558, 'accumulated_logging_time': 1.4981484413146973, 'global_step': 66757, 'preemption_count': 0}), (68275, {'train/accuracy': 0.6909080147743225, 'train/loss': 1.470110535621643, 'validation/accuracy': 0.6406999826431274, 'validation/loss': 1.7015717029571533, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3714752197265625, 'test/num_examples': 10000, 'score': 23017.093948602676, 'total_duration': 23974.409868717194, 'accumulated_submission_time': 23017.093948602676, 'accumulated_eval_time': 954.7498555183411, 'accumulated_logging_time': 1.5329856872558594, 'global_step': 68275, 'preemption_count': 0}), (69792, {'train/accuracy': 0.6947743892669678, 'train/loss': 1.4690238237380981, 'validation/accuracy': 0.6391599774360657, 'validation/loss': 1.700404167175293, 'validation/num_examples': 50000, 'test/accuracy': 0.5162000060081482, 'test/loss': 2.340641736984253, 'test/num_examples': 10000, 'score': 23527.16760492325, 'total_duration': 24505.717046260834, 'accumulated_submission_time': 23527.16760492325, 'accumulated_eval_time': 975.9260275363922, 'accumulated_logging_time': 1.5669758319854736, 'global_step': 69792, 
'preemption_count': 0}), (71310, {'train/accuracy': 0.6975247263908386, 'train/loss': 1.482591152191162, 'validation/accuracy': 0.6232199668884277, 'validation/loss': 1.8254543542861938, 'validation/num_examples': 50000, 'test/accuracy': 0.4918000102043152, 'test/loss': 2.5100009441375732, 'test/num_examples': 10000, 'score': 24037.270278930664, 'total_duration': 25036.85267686844, 'accumulated_submission_time': 24037.270278930664, 'accumulated_eval_time': 996.903163433075, 'accumulated_logging_time': 1.599836826324463, 'global_step': 71310, 'preemption_count': 0}), (72829, {'train/accuracy': 0.7215601205825806, 'train/loss': 1.327860713005066, 'validation/accuracy': 0.6525799632072449, 'validation/loss': 1.6442086696624756, 'validation/num_examples': 50000, 'test/accuracy': 0.5234000086784363, 'test/loss': 2.286708116531372, 'test/num_examples': 10000, 'score': 24547.331042289734, 'total_duration': 25568.003078222275, 'accumulated_submission_time': 24547.331042289734, 'accumulated_eval_time': 1017.9354872703552, 'accumulated_logging_time': 1.6340680122375488, 'global_step': 72829, 'preemption_count': 0}), (74347, {'train/accuracy': 0.6916055083274841, 'train/loss': 1.4878246784210205, 'validation/accuracy': 0.6294199824333191, 'validation/loss': 1.7609314918518066, 'validation/num_examples': 50000, 'test/accuracy': 0.5049999952316284, 'test/loss': 2.4299111366271973, 'test/num_examples': 10000, 'score': 25057.428204774857, 'total_duration': 26099.292588233948, 'accumulated_submission_time': 25057.428204774857, 'accumulated_eval_time': 1039.0636780261993, 'accumulated_logging_time': 1.675804615020752, 'global_step': 74347, 'preemption_count': 0}), (75865, {'train/accuracy': 0.7079480290412903, 'train/loss': 1.436248540878296, 'validation/accuracy': 0.6432799696922302, 'validation/loss': 1.7123216390609741, 'validation/num_examples': 50000, 'test/accuracy': 0.5208000540733337, 'test/loss': 2.3495261669158936, 'test/num_examples': 10000, 'score': 25567.620626449585, 
'total_duration': 26630.89289879799, 'accumulated_submission_time': 25567.620626449585, 'accumulated_eval_time': 1060.4125316143036, 'accumulated_logging_time': 1.7115974426269531, 'global_step': 75865, 'preemption_count': 0}), (77382, {'train/accuracy': 0.6989995241165161, 'train/loss': 1.422307014465332, 'validation/accuracy': 0.642579972743988, 'validation/loss': 1.6762522459030151, 'validation/num_examples': 50000, 'test/accuracy': 0.5154000520706177, 'test/loss': 2.3255693912506104, 'test/num_examples': 10000, 'score': 26077.580560445786, 'total_duration': 27162.072484493256, 'accumulated_submission_time': 26077.580560445786, 'accumulated_eval_time': 1081.571738243103, 'accumulated_logging_time': 1.748826503753662, 'global_step': 77382, 'preemption_count': 0}), (78900, {'train/accuracy': 0.7405332922935486, 'train/loss': 1.2673977613449097, 'validation/accuracy': 0.6460199952125549, 'validation/loss': 1.6637077331542969, 'validation/num_examples': 50000, 'test/accuracy': 0.5200999975204468, 'test/loss': 2.316652536392212, 'test/num_examples': 10000, 'score': 26587.66230893135, 'total_duration': 27693.276193618774, 'accumulated_submission_time': 26587.66230893135, 'accumulated_eval_time': 1102.640297651291, 'accumulated_logging_time': 1.7789013385772705, 'global_step': 78900, 'preemption_count': 0}), (80417, {'train/accuracy': 0.7322424650192261, 'train/loss': 1.2937663793563843, 'validation/accuracy': 0.6529799699783325, 'validation/loss': 1.6502137184143066, 'validation/num_examples': 50000, 'test/accuracy': 0.527999997138977, 'test/loss': 2.2828421592712402, 'test/num_examples': 10000, 'score': 27097.790013074875, 'total_duration': 28224.572131872177, 'accumulated_submission_time': 27097.790013074875, 'accumulated_eval_time': 1123.7507123947144, 'accumulated_logging_time': 1.813563585281372, 'global_step': 80417, 'preemption_count': 0}), (81935, {'train/accuracy': 0.7216796875, 'train/loss': 1.3296536207199097, 'validation/accuracy': 0.6556800007820129, 
'validation/loss': 1.635520577430725, 'validation/num_examples': 50000, 'test/accuracy': 0.5260000228881836, 'test/loss': 2.306215286254883, 'test/num_examples': 10000, 'score': 27607.79568052292, 'total_duration': 28755.640197753906, 'accumulated_submission_time': 27607.79568052292, 'accumulated_eval_time': 1144.751916885376, 'accumulated_logging_time': 1.8511121273040771, 'global_step': 81935, 'preemption_count': 0}), (83452, {'train/accuracy': 0.7293726205825806, 'train/loss': 1.2778840065002441, 'validation/accuracy': 0.664139986038208, 'validation/loss': 1.5573835372924805, 'validation/num_examples': 50000, 'test/accuracy': 0.534500002861023, 'test/loss': 2.211136817932129, 'test/num_examples': 10000, 'score': 28117.902702093124, 'total_duration': 29286.823399305344, 'accumulated_submission_time': 28117.902702093124, 'accumulated_eval_time': 1165.7663543224335, 'accumulated_logging_time': 1.8896336555480957, 'global_step': 83452, 'preemption_count': 0}), (84969, {'train/accuracy': 0.7102997303009033, 'train/loss': 1.3722602128982544, 'validation/accuracy': 0.6526600122451782, 'validation/loss': 1.63387930393219, 'validation/num_examples': 50000, 'test/accuracy': 0.5289000272750854, 'test/loss': 2.2834620475769043, 'test/num_examples': 10000, 'score': 28627.961901664734, 'total_duration': 29817.84640312195, 'accumulated_submission_time': 28627.961901664734, 'accumulated_eval_time': 1186.6677963733673, 'accumulated_logging_time': 1.9287693500518799, 'global_step': 84969, 'preemption_count': 0}), (86487, {'train/accuracy': 0.7192083597183228, 'train/loss': 1.318070411682129, 'validation/accuracy': 0.6582799553871155, 'validation/loss': 1.5826328992843628, 'validation/num_examples': 50000, 'test/accuracy': 0.5327000021934509, 'test/loss': 2.2320618629455566, 'test/num_examples': 10000, 'score': 29138.005053281784, 'total_duration': 30349.136724233627, 'accumulated_submission_time': 29138.005053281784, 'accumulated_eval_time': 1207.8558654785156, 
'accumulated_logging_time': 1.964245080947876, 'global_step': 86487, 'preemption_count': 0}), (88005, {'train/accuracy': 0.7746930718421936, 'train/loss': 1.0886952877044678, 'validation/accuracy': 0.6663599610328674, 'validation/loss': 1.5537863969802856, 'validation/num_examples': 50000, 'test/accuracy': 0.5420000553131104, 'test/loss': 2.211760997772217, 'test/num_examples': 10000, 'score': 29648.06053853035, 'total_duration': 30880.23536133766, 'accumulated_submission_time': 29648.06053853035, 'accumulated_eval_time': 1228.8410770893097, 'accumulated_logging_time': 1.9997587203979492, 'global_step': 88005, 'preemption_count': 0}), (89524, {'train/accuracy': 0.7538663744926453, 'train/loss': 1.2191698551177979, 'validation/accuracy': 0.6728999614715576, 'validation/loss': 1.5614254474639893, 'validation/num_examples': 50000, 'test/accuracy': 0.5437000393867493, 'test/loss': 2.203444719314575, 'test/num_examples': 10000, 'score': 30158.248270750046, 'total_duration': 31411.583546876907, 'accumulated_submission_time': 30158.248270750046, 'accumulated_eval_time': 1249.946064710617, 'accumulated_logging_time': 2.0324220657348633, 'global_step': 89524, 'preemption_count': 0}), (91043, {'train/accuracy': 0.7286351919174194, 'train/loss': 1.2893345355987549, 'validation/accuracy': 0.6548799872398376, 'validation/loss': 1.6241097450256348, 'validation/num_examples': 50000, 'test/accuracy': 0.5337000489234924, 'test/loss': 2.274324893951416, 'test/num_examples': 10000, 'score': 30668.21325492859, 'total_duration': 31942.735835552216, 'accumulated_submission_time': 30668.21325492859, 'accumulated_eval_time': 1271.077484369278, 'accumulated_logging_time': 2.065810203552246, 'global_step': 91043, 'preemption_count': 0}), (92561, {'train/accuracy': 0.7453164458274841, 'train/loss': 1.2298004627227783, 'validation/accuracy': 0.6710399985313416, 'validation/loss': 1.5516157150268555, 'validation/num_examples': 50000, 'test/accuracy': 0.5501000285148621, 'test/loss': 
2.1891353130340576, 'test/num_examples': 10000, 'score': 31178.181513547897, 'total_duration': 32473.721732854843, 'accumulated_submission_time': 31178.181513547897, 'accumulated_eval_time': 1292.0364754199982, 'accumulated_logging_time': 2.102283000946045, 'global_step': 92561, 'preemption_count': 0}), (94079, {'train/accuracy': 0.7446189522743225, 'train/loss': 1.263240098953247, 'validation/accuracy': 0.677619993686676, 'validation/loss': 1.5534390211105347, 'validation/num_examples': 50000, 'test/accuracy': 0.5550000071525574, 'test/loss': 2.191549062728882, 'test/num_examples': 10000, 'score': 31688.31489801407, 'total_duration': 33004.9747774601, 'accumulated_submission_time': 31688.31489801407, 'accumulated_eval_time': 1313.0971965789795, 'accumulated_logging_time': 2.1376092433929443, 'global_step': 94079, 'preemption_count': 0}), (95597, {'train/accuracy': 0.7449776530265808, 'train/loss': 1.2335659265518188, 'validation/accuracy': 0.6818400025367737, 'validation/loss': 1.5183688402175903, 'validation/num_examples': 50000, 'test/accuracy': 0.5538000464439392, 'test/loss': 2.1452736854553223, 'test/num_examples': 10000, 'score': 32198.41109275818, 'total_duration': 33536.11962342262, 'accumulated_submission_time': 32198.41109275818, 'accumulated_eval_time': 1334.085800409317, 'accumulated_logging_time': 2.1747827529907227, 'global_step': 95597, 'preemption_count': 0}), (97115, {'train/accuracy': 0.7691525816917419, 'train/loss': 1.1535043716430664, 'validation/accuracy': 0.667199969291687, 'validation/loss': 1.590874195098877, 'validation/num_examples': 50000, 'test/accuracy': 0.5401000380516052, 'test/loss': 2.221604347229004, 'test/num_examples': 10000, 'score': 32708.41788005829, 'total_duration': 34067.122673511505, 'accumulated_submission_time': 32708.41788005829, 'accumulated_eval_time': 1355.0201969146729, 'accumulated_logging_time': 2.2134897708892822, 'global_step': 97115, 'preemption_count': 0}), (98632, {'train/accuracy': 0.7757692933082581, 
'train/loss': 1.1324656009674072, 'validation/accuracy': 0.690559983253479, 'validation/loss': 1.5002992153167725, 'validation/num_examples': 50000, 'test/accuracy': 0.5630000233650208, 'test/loss': 2.134718656539917, 'test/num_examples': 10000, 'score': 33218.47289562225, 'total_duration': 34598.53514504433, 'accumulated_submission_time': 33218.47289562225, 'accumulated_eval_time': 1376.3223690986633, 'accumulated_logging_time': 2.246415138244629, 'global_step': 98632, 'preemption_count': 0}), (100150, {'train/accuracy': 0.7731584906578064, 'train/loss': 1.1149073839187622, 'validation/accuracy': 0.6941999793052673, 'validation/loss': 1.4566799402236938, 'validation/num_examples': 50000, 'test/accuracy': 0.5652000308036804, 'test/loss': 2.10357928276062, 'test/num_examples': 10000, 'score': 33728.69375014305, 'total_duration': 35129.76139855385, 'accumulated_submission_time': 33728.69375014305, 'accumulated_eval_time': 1397.2703416347504, 'accumulated_logging_time': 2.2805235385894775, 'global_step': 100150, 'preemption_count': 0}), (101668, {'train/accuracy': 0.7729790806770325, 'train/loss': 1.1216973066329956, 'validation/accuracy': 0.6942600011825562, 'validation/loss': 1.4576067924499512, 'validation/num_examples': 50000, 'test/accuracy': 0.5682000517845154, 'test/loss': 2.1024677753448486, 'test/num_examples': 10000, 'score': 34238.80842423439, 'total_duration': 35660.899446964264, 'accumulated_submission_time': 34238.80842423439, 'accumulated_eval_time': 1418.23712515831, 'accumulated_logging_time': 2.314119815826416, 'global_step': 101668, 'preemption_count': 0}), (103187, {'train/accuracy': 0.7712850570678711, 'train/loss': 1.1430009603500366, 'validation/accuracy': 0.6977199912071228, 'validation/loss': 1.4706509113311768, 'validation/num_examples': 50000, 'test/accuracy': 0.5703999996185303, 'test/loss': 2.109886646270752, 'test/num_examples': 10000, 'score': 34749.06494665146, 'total_duration': 36192.19437289238, 'accumulated_submission_time': 
34749.06494665146, 'accumulated_eval_time': 1439.2192113399506, 'accumulated_logging_time': 2.3478918075561523, 'global_step': 103187, 'preemption_count': 0}), (104705, {'train/accuracy': 0.7724210619926453, 'train/loss': 1.1543452739715576, 'validation/accuracy': 0.6925199627876282, 'validation/loss': 1.485355019569397, 'validation/num_examples': 50000, 'test/accuracy': 0.5617000460624695, 'test/loss': 2.1401867866516113, 'test/num_examples': 10000, 'score': 35259.32285261154, 'total_duration': 36723.48499917984, 'accumulated_submission_time': 35259.32285261154, 'accumulated_eval_time': 1460.195018529892, 'accumulated_logging_time': 2.381948232650757, 'global_step': 104705, 'preemption_count': 0}), (106224, {'train/accuracy': 0.8110052347183228, 'train/loss': 0.9716266393661499, 'validation/accuracy': 0.7090199589729309, 'validation/loss': 1.3921597003936768, 'validation/num_examples': 50000, 'test/accuracy': 0.5842000246047974, 'test/loss': 2.0155460834503174, 'test/num_examples': 10000, 'score': 35769.33108854294, 'total_duration': 37254.577083826065, 'accumulated_submission_time': 35769.33108854294, 'accumulated_eval_time': 1481.2203319072723, 'accumulated_logging_time': 2.4178240299224854, 'global_step': 106224, 'preemption_count': 0}), (107743, {'train/accuracy': 0.7952407598495483, 'train/loss': 1.0415843725204468, 'validation/accuracy': 0.7023400068283081, 'validation/loss': 1.4332274198532104, 'validation/num_examples': 50000, 'test/accuracy': 0.5763000249862671, 'test/loss': 2.0579850673675537, 'test/num_examples': 10000, 'score': 36279.43704533577, 'total_duration': 37785.853261470795, 'accumulated_submission_time': 36279.43704533577, 'accumulated_eval_time': 1502.319188117981, 'accumulated_logging_time': 2.4656083583831787, 'global_step': 107743, 'preemption_count': 0}), (109260, {'train/accuracy': 0.80961012840271, 'train/loss': 0.9733060002326965, 'validation/accuracy': 0.7177000045776367, 'validation/loss': 1.3576146364212036, 
'validation/num_examples': 50000, 'test/accuracy': 0.5911000370979309, 'test/loss': 1.9605129957199097, 'test/num_examples': 10000, 'score': 36789.401881456375, 'total_duration': 38316.75459957123, 'accumulated_submission_time': 36789.401881456375, 'accumulated_eval_time': 1523.1936659812927, 'accumulated_logging_time': 2.504333019256592, 'global_step': 109260, 'preemption_count': 0}), (110778, {'train/accuracy': 0.7960578799247742, 'train/loss': 1.0393996238708496, 'validation/accuracy': 0.7094399929046631, 'validation/loss': 1.413783073425293, 'validation/num_examples': 50000, 'test/accuracy': 0.5824000239372253, 'test/loss': 2.058758020401001, 'test/num_examples': 10000, 'score': 37299.51941990852, 'total_duration': 38847.8635661602, 'accumulated_submission_time': 37299.51941990852, 'accumulated_eval_time': 1544.1275751590729, 'accumulated_logging_time': 2.539213180541992, 'global_step': 110778, 'preemption_count': 0}), (112297, {'train/accuracy': 0.8116230964660645, 'train/loss': 0.9464977383613586, 'validation/accuracy': 0.721019983291626, 'validation/loss': 1.3188493251800537, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9455469846725464, 'test/num_examples': 10000, 'score': 37809.78449511528, 'total_duration': 39379.07870936394, 'accumulated_submission_time': 37809.78449511528, 'accumulated_eval_time': 1565.0159449577332, 'accumulated_logging_time': 2.5778493881225586, 'global_step': 112297, 'preemption_count': 0}), (113815, {'train/accuracy': 0.8452048897743225, 'train/loss': 0.8711603879928589, 'validation/accuracy': 0.7268399596214294, 'validation/loss': 1.3569468259811401, 'validation/num_examples': 50000, 'test/accuracy': 0.5951000452041626, 'test/loss': 1.9684876203536987, 'test/num_examples': 10000, 'score': 38319.7420566082, 'total_duration': 39910.09659743309, 'accumulated_submission_time': 38319.7420566082, 'accumulated_eval_time': 1586.0181345939636, 'accumulated_logging_time': 2.6128127574920654, 
'global_step': 113815, 'preemption_count': 0}), (115334, {'train/accuracy': 0.8356186151504517, 'train/loss': 0.846179187297821, 'validation/accuracy': 0.7235400080680847, 'validation/loss': 1.3148432970046997, 'validation/num_examples': 50000, 'test/accuracy': 0.5975000262260437, 'test/loss': 1.953201413154602, 'test/num_examples': 10000, 'score': 38829.9924018383, 'total_duration': 40441.21931767464, 'accumulated_submission_time': 38829.9924018383, 'accumulated_eval_time': 1606.8307423591614, 'accumulated_logging_time': 2.6497488021850586, 'global_step': 115334, 'preemption_count': 0}), (116853, {'train/accuracy': 0.8401426672935486, 'train/loss': 0.8460429906845093, 'validation/accuracy': 0.7312600016593933, 'validation/loss': 1.2911760807037354, 'validation/num_examples': 50000, 'test/accuracy': 0.6062000393867493, 'test/loss': 1.8956819772720337, 'test/num_examples': 10000, 'score': 39340.216069698334, 'total_duration': 40972.4295835495, 'accumulated_submission_time': 39340.216069698334, 'accumulated_eval_time': 1627.757281780243, 'accumulated_logging_time': 2.6862545013427734, 'global_step': 116853, 'preemption_count': 0}), (118372, {'train/accuracy': 0.8466597199440002, 'train/loss': 0.8141149878501892, 'validation/accuracy': 0.7381399869918823, 'validation/loss': 1.2623021602630615, 'validation/num_examples': 50000, 'test/accuracy': 0.6214000582695007, 'test/loss': 1.8717379570007324, 'test/num_examples': 10000, 'score': 39850.24843668938, 'total_duration': 41503.52567815781, 'accumulated_submission_time': 39850.24843668938, 'accumulated_eval_time': 1648.753799200058, 'accumulated_logging_time': 2.7309017181396484, 'global_step': 118372, 'preemption_count': 0}), (119889, {'train/accuracy': 0.8564253449440002, 'train/loss': 0.7718811631202698, 'validation/accuracy': 0.7484999895095825, 'validation/loss': 1.2095959186553955, 'validation/num_examples': 50000, 'test/accuracy': 0.6221000552177429, 'test/loss': 1.825244426727295, 'test/num_examples': 10000, 
'score': 40360.281074762344, 'total_duration': 42034.37654709816, 'accumulated_submission_time': 40360.281074762344, 'accumulated_eval_time': 1669.5109317302704, 'accumulated_logging_time': 2.7693583965301514, 'global_step': 119889, 'preemption_count': 0}), (121408, {'train/accuracy': 0.8634805083274841, 'train/loss': 0.7633724212646484, 'validation/accuracy': 0.750220000743866, 'validation/loss': 1.2169595956802368, 'validation/num_examples': 50000, 'test/accuracy': 0.626800000667572, 'test/loss': 1.814558506011963, 'test/num_examples': 10000, 'score': 40870.47562837601, 'total_duration': 42565.58989524841, 'accumulated_submission_time': 40870.47562837601, 'accumulated_eval_time': 1690.4636988639832, 'accumulated_logging_time': 2.8127453327178955, 'global_step': 121408, 'preemption_count': 0}), (122926, {'train/accuracy': 0.896882951259613, 'train/loss': 0.6356571316719055, 'validation/accuracy': 0.7584199905395508, 'validation/loss': 1.1778696775436401, 'validation/num_examples': 50000, 'test/accuracy': 0.6371000409126282, 'test/loss': 1.7690379619598389, 'test/num_examples': 10000, 'score': 41380.697590112686, 'total_duration': 43096.77365708351, 'accumulated_submission_time': 41380.697590112686, 'accumulated_eval_time': 1711.363877773285, 'accumulated_logging_time': 2.851658821105957, 'global_step': 122926, 'preemption_count': 0}), (124445, {'train/accuracy': 0.8963049650192261, 'train/loss': 0.6321005821228027, 'validation/accuracy': 0.7647599577903748, 'validation/loss': 1.1482919454574585, 'validation/num_examples': 50000, 'test/accuracy': 0.6454000473022461, 'test/loss': 1.7265831232070923, 'test/num_examples': 10000, 'score': 41890.86552166939, 'total_duration': 43628.127166986465, 'accumulated_submission_time': 41890.86552166939, 'accumulated_eval_time': 1732.4812920093536, 'accumulated_logging_time': 2.8973982334136963, 'global_step': 124445, 'preemption_count': 0}), (125965, {'train/accuracy': 0.9003308415412903, 'train/loss': 0.6020267009735107, 
'validation/accuracy': 0.7695800065994263, 'validation/loss': 1.123653769493103, 'validation/num_examples': 50000, 'test/accuracy': 0.6490000486373901, 'test/loss': 1.7115205526351929, 'test/num_examples': 10000, 'score': 42401.11342215538, 'total_duration': 44159.210112810135, 'accumulated_submission_time': 42401.11342215538, 'accumulated_eval_time': 1753.2567028999329, 'accumulated_logging_time': 2.934680700302124, 'global_step': 125965, 'preemption_count': 0}), (127483, {'train/accuracy': 0.9035195708274841, 'train/loss': 0.5998629331588745, 'validation/accuracy': 0.7721199989318848, 'validation/loss': 1.1213932037353516, 'validation/num_examples': 50000, 'test/accuracy': 0.6554000377655029, 'test/loss': 1.6994982957839966, 'test/num_examples': 10000, 'score': 42911.07231712341, 'total_duration': 44690.04618239403, 'accumulated_submission_time': 42911.07231712341, 'accumulated_eval_time': 1774.066725730896, 'accumulated_logging_time': 2.978921890258789, 'global_step': 127483, 'preemption_count': 0}), (129001, {'train/accuracy': 0.9015664458274841, 'train/loss': 0.6057281494140625, 'validation/accuracy': 0.7714999914169312, 'validation/loss': 1.1205312013626099, 'validation/num_examples': 50000, 'test/accuracy': 0.6547000408172607, 'test/loss': 1.7005892992019653, 'test/num_examples': 10000, 'score': 43421.583512067795, 'total_duration': 45221.447182655334, 'accumulated_submission_time': 43421.583512067795, 'accumulated_eval_time': 1794.8810048103333, 'accumulated_logging_time': 3.0320894718170166, 'global_step': 129001, 'preemption_count': 0}), (130519, {'train/accuracy': 0.9035793542861938, 'train/loss': 0.5980676412582397, 'validation/accuracy': 0.7721799612045288, 'validation/loss': 1.119748592376709, 'validation/num_examples': 50000, 'test/accuracy': 0.6549000144004822, 'test/loss': 1.6995123624801636, 'test/num_examples': 10000, 'score': 43931.55882525444, 'total_duration': 45752.34780359268, 'accumulated_submission_time': 43931.55882525444, 
'accumulated_eval_time': 1815.7315328121185, 'accumulated_logging_time': 3.084369659423828, 'global_step': 130519, 'preemption_count': 0}), (132036, {'train/accuracy': 0.9046555757522583, 'train/loss': 0.5945121645927429, 'validation/accuracy': 0.7727400064468384, 'validation/loss': 1.1170986890792847, 'validation/num_examples': 50000, 'test/accuracy': 0.6539000272750854, 'test/loss': 1.6980239152908325, 'test/num_examples': 10000, 'score': 44441.67366838455, 'total_duration': 46283.51500082016, 'accumulated_submission_time': 44441.67366838455, 'accumulated_eval_time': 1836.7223308086395, 'accumulated_logging_time': 3.1229963302612305, 'global_step': 132036, 'preemption_count': 0}), (133554, {'train/accuracy': 0.9058513641357422, 'train/loss': 0.5935448408126831, 'validation/accuracy': 0.772599995136261, 'validation/loss': 1.117397427558899, 'validation/num_examples': 50000, 'test/accuracy': 0.6536000370979309, 'test/loss': 1.6978504657745361, 'test/num_examples': 10000, 'score': 44951.80677986145, 'total_duration': 46814.53054857254, 'accumulated_submission_time': 44951.80677986145, 'accumulated_eval_time': 1857.5457971096039, 'accumulated_logging_time': 3.1587300300598145, 'global_step': 133554, 'preemption_count': 0}), (135072, {'train/accuracy': 0.9068678021430969, 'train/loss': 0.5902974605560303, 'validation/accuracy': 0.7732200026512146, 'validation/loss': 1.1165798902511597, 'validation/num_examples': 50000, 'test/accuracy': 0.6538000106811523, 'test/loss': 1.698063611984253, 'test/num_examples': 10000, 'score': 45461.94407606125, 'total_duration': 47345.66832566261, 'accumulated_submission_time': 45461.94407606125, 'accumulated_eval_time': 1878.4849972724915, 'accumulated_logging_time': 3.197124719619751, 'global_step': 135072, 'preemption_count': 0}), (136590, {'train/accuracy': 0.9063496589660645, 'train/loss': 0.5868960022926331, 'validation/accuracy': 0.7733599543571472, 'validation/loss': 1.11497962474823, 'validation/num_examples': 50000, 
'test/accuracy': 0.6541000604629517, 'test/loss': 1.6960680484771729, 'test/num_examples': 10000, 'score': 45971.88573241234, 'total_duration': 47876.49062347412, 'accumulated_submission_time': 45971.88573241234, 'accumulated_eval_time': 1899.3071541786194, 'accumulated_logging_time': 3.2327637672424316, 'global_step': 136590, 'preemption_count': 0}), (138108, {'train/accuracy': 0.907645046710968, 'train/loss': 0.5866842865943909, 'validation/accuracy': 0.7729199528694153, 'validation/loss': 1.1145741939544678, 'validation/num_examples': 50000, 'test/accuracy': 0.6550000309944153, 'test/loss': 1.6953842639923096, 'test/num_examples': 10000, 'score': 46481.956923007965, 'total_duration': 48407.56366467476, 'accumulated_submission_time': 46481.956923007965, 'accumulated_eval_time': 1920.2494142055511, 'accumulated_logging_time': 3.2690396308898926, 'global_step': 138108, 'preemption_count': 0}), (139627, {'train/accuracy': 0.9070870280265808, 'train/loss': 0.5904383659362793, 'validation/accuracy': 0.7734000086784363, 'validation/loss': 1.1148360967636108, 'validation/num_examples': 50000, 'test/accuracy': 0.6565000414848328, 'test/loss': 1.696603775024414, 'test/num_examples': 10000, 'score': 46991.960359573364, 'total_duration': 48938.4100048542, 'accumulated_submission_time': 46991.960359573364, 'accumulated_eval_time': 1941.032898902893, 'accumulated_logging_time': 3.3056366443634033, 'global_step': 139627, 'preemption_count': 0}), (140000, {'train/accuracy': 0.9084422588348389, 'train/loss': 0.578050971031189, 'validation/accuracy': 0.7722199559211731, 'validation/loss': 1.1122545003890991, 'validation/num_examples': 50000, 'test/accuracy': 0.655500054359436, 'test/loss': 1.6934614181518555, 'test/num_examples': 10000, 'score': 47117.04019618034, 'total_duration': 49084.18835735321, 'accumulated_submission_time': 47117.04019618034, 'accumulated_eval_time': 1961.6837706565857, 'accumulated_logging_time': 3.3469996452331543, 'global_step': 140000, 
'preemption_count': 0})], 'global_step': 140000}
I0914 20:52:45.258277 139785753851712 submission_runner.py:543] Timing: 47117.04019618034
I0914 20:52:45.258334 139785753851712 submission_runner.py:545] Total number of evals: 94
I0914 20:52:45.258379 139785753851712 submission_runner.py:546] ====================
I0914 20:52:45.258617 139785753851712 submission_runner.py:614] Final imagenet_resnet score: 47117.04019618034