@mdouze
Created August 26, 2019 15:03
(faiss_1.5.2) matthijs@devfair0144:~/faiss_versions/faiss_1Tcode/faiss/benchs/distributed_ondisk$ python distributed_kmeans.py --test 0
Clustering 100000 points in 128D to 1000 clusters, redo 1 times, 20 iterations
Preprocessing in 0.03 s
Iteration 19 (0.85 s, search 0.73 s): objective=3.04666e+09 imbalance=250.288 nsplit=3
(faiss_1.5.2) matthijs@devfair0144:~/faiss_versions/faiss_1Tcode/faiss/benchs/distributed_ondisk$ python distributed_kmeans.py --test 1
Clustering 100000 points in 128D to 1000 clusters, 20 iterations seed 1234
preproc...
done
Iteration 0 (0.11 s, search 0.10 s): objective=5.19019e+09 imbalance=250.883 nsplit=500
Iteration 1 (0.16 s, search 0.15 s): objective=3.57228e+09 imbalance=250.478 nsplit=291
Iteration 2 (0.21 s, search 0.19 s): objective=3.37028e+09 imbalance=250.383 nsplit=153
Iteration 3 (0.25 s, search 0.23 s): objective=3.25659e+09 imbalance=250.334 nsplit=84
Iteration 4 (0.31 s, search 0.29 s): objective=3.18893e+09 imbalance=250.313 nsplit=51
Iteration 5 (0.36 s, search 0.33 s): objective=3.14598e+09 imbalance=250.300 nsplit=27
Iteration 6 (0.40 s, search 0.37 s): objective=3.11773e+09 imbalance=250.293 nsplit=16
Iteration 7 (0.44 s, search 0.42 s): objective=3.09851e+09 imbalance=250.290 nsplit=11
Iteration 8 (0.49 s, search 0.46 s): objective=3.08491e+09 imbalance=250.287 nsplit=7
Iteration 9 (0.54 s, search 0.51 s): objective=3.07535e+09 imbalance=250.285 nsplit=3
Iteration 10 (0.58 s, search 0.55 s): objective=3.06785e+09 imbalance=250.284 nsplit=2
Iteration 11 (0.62 s, search 0.59 s): objective=3.06222e+09 imbalance=250.284 nsplit=1
Iteration 12 (0.67 s, search 0.64 s): objective=3.05779e+09 imbalance=250.283 nsplit=1
Iteration 13 (0.71 s, search 0.68 s): objective=3.05444e+09 imbalance=250.283 nsplit=0
Iteration 14 (0.76 s, search 0.72 s): objective=3.05168e+09 imbalance=250.283 nsplit=0
Iteration 15 (0.80 s, search 0.77 s): objective=3.04984e+09 imbalance=250.282 nsplit=0
Iteration 16 (0.85 s, search 0.81 s): objective=3.04834e+09 imbalance=250.282 nsplit=0
Iteration 17 (0.89 s, search 0.85 s): objective=3.04722e+09 imbalance=250.282 nsplit=0
Iteration 18 (0.94 s, search 0.90 s): objective=3.04616e+09 imbalance=250.282 nsplit=0
Iteration 19 (0.98 s, search 0.94 s): objective=3.04522e+09 imbalance=250.282 nsplit=0
(faiss_1.5.2) matthijs@devfair0144:~/faiss_versions/faiss_1Tcode/faiss/benchs/distributed_ondisk$ python distributed_kmeans.py --test 2
Clustering 100000 points in 128D to 1000 clusters, 20 iterations seed 1234
preproc...
done
Iteration 0 (0.13 s, search 0.12 s): objective=5.19019e+09 imbalance=250.883 nsplit=500
Iteration 1 (0.19 s, search 0.18 s): objective=3.57228e+09 imbalance=250.478 nsplit=291
Iteration 2 (0.24 s, search 0.23 s): objective=3.37028e+09 imbalance=250.383 nsplit=153
Iteration 3 (0.29 s, search 0.27 s): objective=3.25659e+09 imbalance=250.334 nsplit=84
Iteration 4 (0.34 s, search 0.31 s): objective=3.18893e+09 imbalance=250.313 nsplit=51
Iteration 5 (0.38 s, search 0.36 s): objective=3.14598e+09 imbalance=250.300 nsplit=27
Iteration 6 (0.43 s, search 0.41 s): objective=3.11773e+09 imbalance=250.293 nsplit=16
Iteration 7 (0.48 s, search 0.45 s): objective=3.09851e+09 imbalance=250.290 nsplit=11
Iteration 8 (0.53 s, search 0.50 s): objective=3.08491e+09 imbalance=250.287 nsplit=7
Iteration 9 (0.58 s, search 0.55 s): objective=3.07535e+09 imbalance=250.285 nsplit=3
Iteration 10 (0.62 s, search 0.59 s): objective=3.06785e+09 imbalance=250.284 nsplit=2
Iteration 11 (0.67 s, search 0.64 s): objective=3.06222e+09 imbalance=250.284 nsplit=1
Iteration 12 (0.72 s, search 0.68 s): objective=3.05779e+09 imbalance=250.283 nsplit=1
Iteration 13 (0.76 s, search 0.73 s): objective=3.05444e+09 imbalance=250.283 nsplit=0
Iteration 14 (0.81 s, search 0.77 s): objective=3.05168e+09 imbalance=250.283 nsplit=0
Iteration 15 (0.85 s, search 0.82 s): objective=3.04984e+09 imbalance=250.282 nsplit=0
Iteration 16 (0.90 s, search 0.87 s): objective=3.04834e+09 imbalance=250.282 nsplit=0
Iteration 17 (0.95 s, search 0.91 s): objective=3.04722e+09 imbalance=250.282 nsplit=0
Iteration 18 (1.00 s, search 0.96 s): objective=3.04616e+09 imbalance=250.282 nsplit=0
Iteration 19 (1.08 s, search 1.04 s): objective=3.04522e+09 imbalance=250.282 nsplit=0
(faiss_1.5.2) matthijs@devfair0144:~/faiss_versions/faiss_1Tcode/faiss/benchs/distributed_ondisk$ python distributed_kmeans.py --test 3
using 2 GPUs
Clustering 100000 points in 128D to 1000 clusters, 20 iterations seed 1234
preproc...
done
Iteration 0 (0.10 s, search 0.09 s): objective=5.19019e+09 imbalance=250.883 nsplit=500
Iteration 1 (0.12 s, search 0.11 s): objective=3.57228e+09 imbalance=250.477 nsplit=290
Iteration 2 (0.14 s, search 0.12 s): objective=3.37025e+09 imbalance=250.384 nsplit=155
Iteration 3 (0.15 s, search 0.14 s): objective=3.25678e+09 imbalance=250.334 nsplit=85
Iteration 4 (0.17 s, search 0.15 s): objective=3.18858e+09 imbalance=250.311 nsplit=52
Iteration 5 (0.18 s, search 0.17 s): objective=3.14527e+09 imbalance=250.298 nsplit=28
Iteration 6 (0.20 s, search 0.18 s): objective=3.11771e+09 imbalance=250.292 nsplit=16
Iteration 7 (0.22 s, search 0.20 s): objective=3.0983e+09 imbalance=250.289 nsplit=11
Iteration 8 (0.23 s, search 0.21 s): objective=3.08438e+09 imbalance=250.286 nsplit=7
Iteration 9 (0.25 s, search 0.23 s): objective=3.07427e+09 imbalance=250.283 nsplit=3
Iteration 10 (0.26 s, search 0.24 s): objective=3.06677e+09 imbalance=250.282 nsplit=0
Iteration 11 (0.28 s, search 0.26 s): objective=3.06085e+09 imbalance=250.282 nsplit=0
Iteration 12 (0.29 s, search 0.27 s): objective=3.05659e+09 imbalance=250.281 nsplit=0
Iteration 13 (0.31 s, search 0.29 s): objective=3.05308e+09 imbalance=250.281 nsplit=0
Iteration 14 (0.32 s, search 0.30 s): objective=3.05027e+09 imbalance=250.281 nsplit=0
Iteration 15 (0.34 s, search 0.31 s): objective=3.04799e+09 imbalance=250.281 nsplit=0
Iteration 16 (0.35 s, search 0.33 s): objective=3.04611e+09 imbalance=250.281 nsplit=0
Iteration 17 (0.37 s, search 0.34 s): objective=3.04455e+09 imbalance=250.281 nsplit=0
Iteration 18 (0.38 s, search 0.36 s): objective=3.04338e+09 imbalance=250.281 nsplit=0
Iteration 19 (0.40 s, search 0.37 s): objective=3.04262e+09 imbalance=250.281 nsplit=0
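
To compare the runs above (for example the CPU runs of `--test 1` and `--test 2` against the 2-GPU run of `--test 3`), the per-iteration lines can be parsed into structured records. The following is a minimal sketch that relies only on the line format visible in this log. The names `LINE_RE`, `IterationStats` and `parse_iteration` are hypothetical helpers introduced for illustration; they are not part of `distributed_kmeans.py` or of Faiss.

```python
import re
from typing import NamedTuple, Optional

# Matches per-iteration log lines such as:
#   Iteration 19 (0.40 s, search 0.37 s): objective=3.04262e+09 imbalance=250.281 nsplit=0
# The pattern is inferred from the log shown above, not from the script's source.
LINE_RE = re.compile(
    r"Iteration (\d+) \(([\d.]+) s, search ([\d.]+) s\): "
    r"objective=([\d.eE+-]+) imbalance=([\d.]+) nsplit=(\d+)"
)


class IterationStats(NamedTuple):
    iteration: int    # 0-based iteration index
    total_s: float    # cumulative wall-clock time reported in the line
    search_s: float   # cumulative time reported for the search step
    objective: float  # clustering objective reported by the script
    imbalance: float  # imbalance factor reported by the script
    nsplit: int       # value of the nsplit field


def parse_iteration(line: str) -> Optional[IterationStats]:
    """Return the parsed fields of one log line, or None if it is not an iteration line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    it, total, search, obj, imb, nsplit = m.groups()
    return IterationStats(
        int(it), float(total), float(search), float(obj), float(imb), int(nsplit)
    )


def parse_log(text: str) -> list:
    """Collect the iteration records from a block of log text, skipping other lines."""
    records = (parse_iteration(line) for line in text.splitlines())
    return [r for r in records if r is not None]
```

Feeding each run's output to `parse_log` gives one list of records per test, so the final objective or the ratio `search_s / total_s` can be compared across runs without reading the numbers by eye. Lines such as `preproc...` or `using 2 GPUs` are skipped.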

sosofun commented Aug 29, 2019

Hi, can you share the source code of distributed_kmeans.py? I am very interested in how it is implemented with Faiss. Thanks!
