@kmhofmann
Last active December 16, 2024 20:45
Building TensorFlow from source

Building TensorFlow from source (TF 2.1.0, Ubuntu 19.10)

Why build from source?

The official instructions on installing TensorFlow are here: https://www.tensorflow.org/install. If you want to install TensorFlow just using pip, you are running a supported Ubuntu LTS distribution, and you're happy to install the respective tested CUDA versions (which often are outdated), by all means go ahead. A good alternative may be to run a Docker image.

I am usually unhappy with installing what in effect are pre-built binaries. These binaries are often not compatible with the Ubuntu version I am running, the CUDA version that I have installed, and so on. Furthermore, they may be slower than binaries optimized for the target architecture, since certain instructions are not being used (e.g. AVX2, FMA).

So installing TensorFlow from source becomes a necessity. The official instructions on building TensorFlow from source are here: https://www.tensorflow.org/install/install_sources.

What they don't mention there is that on supposedly "unsupported" configurations (i.e. up-to-date Linux systems), this can be a task from hell. In fact, building TensorFlow either way is a veritable clusterfuck. I don't know if that is due to the inherent complexity of such a framework or just lazy engineering, but the TensorFlow developers are certainly not trying to make one's life easy. My conservative guess is that quite a few developer years have been wasted out there because of the seemingly bonkers choices that have been made during TensorFlow development.

Or should I say: Building TensorFlow is as intuitive as using its API? ;-)

Described configuration

I am describing the steps necessary to build TensorFlow in (currently) the following configuration:

  • Ubuntu 19.10
  • NVIDIA driver 440.44
  • CUDA 10.2 / cuDNN v7.6.5
  • TensorFlow v2.1.0

At the time of writing (2020-01-11), these were the latest available versions.

Note that I am not interested in running an outdated Ubuntu version (this includes the truly ancient 18.04 LTS), installing a CUDA/cuDNN version that is not the latest, or using a TensorFlow version that is not the latest. Regressing to any of these is nonsensical to me. Therefore, the below instructions may or may not be useful to you. Please also note that the instructions are likely outdated, since I only update them occasionally. Don't just copy these instructions, but check what the respective latest versions are and use these instead!

Prerequisites

Installing the NVIDIA driver

Download and install the latest NVIDIA graphics driver from here: https://www.nvidia.com/en-us/drivers/unix/. Note that every CUDA version requires a minimum version of the driver; check this beforehand. Ubuntu 19.10 offers installation of the NVIDIA driver version 435.00 through its built-in 'Additional Drivers' mechanism, but CUDA 10.2 requires a newer version that cannot be obtained this way.
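
The minimum-driver check can be scripted before running the CUDA installer. What follows is only a minimal sketch: it assumes the minimum for CUDA 10.2 is the 440.33.01 driver named in the CUDA 10.2 runfile (an assumption; confirm it against NVIDIA's release notes), and `parse_version` and `driver_is_new_enough` are hypothetical helper names.

```python
def parse_version(text):
    """Turn a dotted version such as '440.33.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_new_enough(installed, minimum):
    """Compare two dotted driver versions component by component."""
    return parse_version(installed) >= parse_version(minimum)


# Assumed minimum for CUDA 10.2, taken from the runfile name; verify it
# against NVIDIA's release notes before relying on it.
CUDA_10_2_MIN_DRIVER = "440.33.01"

print(driver_is_new_enough("440.44", CUDA_10_2_MIN_DRIVER))  # driver used in this guide
print(driver_is_new_enough("435.00", CUDA_10_2_MIN_DRIVER))  # version offered by 'Additional Drivers'
```

The installed version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed in as the first argument.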

The CUDA runfile also includes a version of the NVIDIA graphics driver, but I prefer to install the two separately, as installing them together can be more brittle on distributions that CUDA treats as "unsupported".

Installing CUDA

Download the latest CUDA version here. For example, I downloaded:

$ wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run

Here's the first roadblock: Ubuntu 19.10 ships with GCC 9.2.1 by default, but CUDA 10.2 pretends to only support Ubuntu 18.04 and GCC versions up to version 8. When trying to install CUDA on an up-to-date system, it will fail. Uhm... this is insane. I understand when code needs to be built with a certain minimum version of a compiler, but no well-written piece of software should ever specify a maximum version. You would now think that you can simply install GCC 8 (something along the lines of sudo apt install gcc-8 and running CC=$(which gcc-8) CXX=$(which g++-8) ./cuda_10.2.89_440.33.01_linux.run as root) and be happy, but alas, no. The CUDA installer conveniently disregards any such environment variables.

Time for more desperate measures. Go ahead and install CUDA like this:

$ sudo sh cuda_10.2.89_440.33.01_linux.run --override

The --override flag overrides the compiler check, and you can now go on. Deselect the driver if it was installed earlier, but install the rest. Try to build the samples. You will notice that this fails, again with a message such as

unsupported GNU version! gcc versions later than 8 are not supported!

Thanks for nothing, NVIDIA. Thankfully we can disable this error by commenting out the #error directive in /usr/local/cuda/include/crt/host_config.h. Do so. This is what it looks like for me:

#if defined(__GNUC__)

#if __GNUC__ > 8

//#error -- unsupported GNU version! gcc versions later than 8 are not supported!

#endif /* __GNUC__ > 8 */

I have no idea what the implications are, but so far I haven't found any. There's a similar section on Clang just below, in case you decide to compile TensorFlow with Clang. (I have not tried yet, but it should be a good adventure.)
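
If you reinstall CUDA often, the manual edit can be scripted. The following is a rough sketch rather than a tested tool: `disable_gnu_version_check` is a hypothetical helper that works on the header text only, so reading and writing the real file (as root, after making a backup) is left to you.

```python
import re

# Matches an active (not yet commented-out) "#error" line that carries the
# "unsupported GNU version" message.
_GNU_ERROR = re.compile(r"^([ \t]*)(#error\b.*unsupported GNU version.*)$", re.MULTILINE)


def disable_gnu_version_check(header_text):
    """Return header_text with the GCC version #error line commented out."""
    return _GNU_ERROR.sub(r"\1//\2", header_text)


sample = """#if defined(__GNUC__)

#if __GNUC__ > 8

#error -- unsupported GNU version! gcc versions later than 8 are not supported!

#endif /* __GNUC__ > 8 */
"""

patched = disable_gnu_version_check(sample)
print(patched)
```

Running the function a second time leaves the text unchanged, because the pattern only matches lines that still start with `#error`.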

Installing cuDNN

Just go here and follow the instructions. You'll have to log in, so downloading of the right cuDNN binary packages cannot be easily automated. Meh.

System packages

According to the official instructions, TensorFlow requires Python and pip:

$ sudo apt install python3-dev python3-pip

Installing Bazel

Bazel is Google's monster of a build system and is required to build TensorFlow.

Google apparently did not want to make developers' lives easy and use a de-facto standard build system such as CMake. Life could be so nice. No, Google is big and dangerous enough to force their own creation upon everyone and thus make everyone else's life miserable. I wouldn't complain if Bazel was nice and easy to use. But I don't think there was a single time when I built TensorFlow and did not have issues with Bazel.

And oh my, there are some issues right here: There are instructions on how to install Bazel using Ubuntu's APT repository mechanism. Forget those, they won't work for our purposes. Neither will compiling the latest Bazel version (2.0.0 at the time of writing) from source. This is because TensorFlow actually requires a pretty old version of Bazel (0.29.1, as opposed to 2.0.0 or greater) to be built with. I don't know if this says more about the state of Bazel or TensorFlow, but either way, it's not confidence inducing.

Okay, so let's just try to build the latest supported version of Bazel, 0.29.1, from source. We simply install the prerequisites mentioned in the instructions above, download the respective distribution build, compile it with env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh, and... it doesn't compile. :-( There's a beautiful error message saying error: ambiguating new declaration of 'long int gettid()'.

Long story short: some dependency of Bazel (gRPC) used some function names it shouldn't have been using, and fails in combination with glibc 2.30. This was fixed upstream several months ago, but Bazel developers didn't bother to fix it in a maintenance release (0.29.2 anyone?). They only updated the dependency in Bazel 2.0.0, which does not work with TensorFlow. D'oh.

Anyway. The easiest way to use Bazel for compiling TensorFlow 2.1.0 on Ubuntu 19.10 that I know of is to download a pre-built binary, e.g. using

$ wget https://github.com/bazelbuild/bazel/releases/download/0.29.1/bazel-0.29.1-linux-x86_64
$ mv bazel-0.29.1-linux-x86_64 bazel   # and make sure this is on the PATH
$ chmod +x bazel                       # the downloaded file is not executable by default

This is utterly sad.
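
Since the wrong Bazel version fails late and confusingly, it can be worth checking the version on the PATH first. This is a small sketch that assumes `bazel --version` prints a line of the form `bazel 3.5.0` (the format quoted in the comments below) and that 0.29.1 prints the same format; `parse_bazel_version` and `is_expected_bazel` are hypothetical names.

```python
import re


def parse_bazel_version(output):
    """Extract (major, minor, patch) from the output of `bazel --version`."""
    match = re.search(r"bazel (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("unrecognised bazel version output: %r" % output)
    return tuple(int(group) for group in match.groups())


def is_expected_bazel(output, expected="0.29.1"):
    """True if the reported Bazel version equals the expected one exactly."""
    return parse_bazel_version(output) == tuple(int(p) for p in expected.split("."))


print(is_expected_bazel("bazel 0.29.1"))  # the version this guide uses
print(is_expected_bazel("bazel 2.0.0"))   # too new for TensorFlow 2.1.0
```

In practice the `output` string would come from something like `subprocess.run(["bazel", "--version"], capture_output=True, text=True).stdout`.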

Building TensorFlow

Guess what: not fun either. Actually, the same issue of the gRPC dependency that plagued us with Bazel is coming back here. And this time, we have no choice but to actually fix it.

Cloning and patching

First clone the sources, and check out the desired branch. At the time of writing, v2.1.0 was the latest version; adjust if necessary.

  $ git clone https://github.com/tensorflow/tensorflow
  $ cd tensorflow
  $ git checkout v2.1.0

If we now just went ahead and tried to build TensorFlow, we would soon hit the same beautiful error message again as we hit when trying to compile Bazel 0.29.1.

To fix this, I have recreated the proposed fix on the sources that get downloaded by Bazel. See the resulting patch file in the Appendix below. Create a file named grpc_gettid_fix.patch and add it to the ./third_party directory of the TensorFlow repository.

We now need to add the information that the patch needs to be applied to the Bazel workspace file. See the Appendix for the diff, which also fixed another issue that would hit us during the build step. Apply this diff manually - it's only two lines in two files. (I'm not providing a complete, unified patch file here, because it's likely only valid and applicable for a short amount of time.)
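
For the workspace.bzl half of that diff, the manual edit amounts to inserting one line after `name = "grpc",`. Below is a hedged sketch of that single step; `add_grpc_patch` is a hypothetical helper operating on the file's text, and it does not cover the second change (removing the `--bin2c-path` line from the NCCL template).

```python
PATCH_LINE = 'patch_file = clean_dep("//third_party:grpc_gettid_fix.patch"),'


def add_grpc_patch(workspace_text):
    """Insert the patch_file attribute right after `name = "grpc",`."""
    if PATCH_LINE in workspace_text:
        return workspace_text  # already applied, nothing to do
    out = []
    for line in workspace_text.splitlines(keepends=True):
        out.append(line)
        if line.strip() == 'name = "grpc",':
            indent = line[: len(line) - len(line.lstrip())]
            if not line.endswith("\n"):
                out[-1] = line + "\n"
            out.append(indent + PATCH_LINE + "\n")
    return "".join(out)


sample = (
    "    tf_http_archive(\n"
    '        name = "grpc",\n'
    '        system_build_file = clean_dep("//third_party/systemlibs:grpc.BUILD"),\n'
    "    )\n"
)
print(add_grpc_patch(sample))
```

The new line reuses the indentation of the `name` line, and a second run is a no-op.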

Configuration

Create a Python 3 virtual environment, if you have not done this yet. For example:

  $ python3 -m venv ~/.virtualenvs/tf_dev

Activate it with source ~/.virtualenvs/tf_dev/bin/activate. This can later be deactivated with deactivate.

Install the Python packages mentioned in the official instructions:

$ pip install -U pip six numpy wheel setuptools mock 'future>=0.17.1'
$ pip install -U keras_applications --no-deps
$ pip install -U keras_preprocessing --no-deps

(If you choose to not use a virtual environment, you'll need to add --user to each of the above commands.)

Now run the TensorFlow configuration script

  $ ./configure

We all like interactive scripts called ./configure, don't we? (Whoever devised this atrocity has never used GNU tools before.)

Carefully go through the options. You can leave most defaults, but do specify the required CUDA compute capabilities (as below, or similar):

  CUDA support -> Y
  CUDA compute capability -> 5.2,6.1,7.0

Some of the compute capabilities of popular GPU cards might be good to know:

  • Maxwell TITAN X: 5.2
  • Pascal TITAN X (2016): 6.1
  • GeForce GTX 1080 Ti: 6.1
  • Tesla V100: 7.0

(See here for the full list.)
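
If you build for several machines, the capability string can be derived from the cards you care about. This is a minimal sketch using only the values listed above; `capabilities_for` is a hypothetical helper, and the table needs extending from NVIDIA's list for any other card.

```python
# Compute capabilities quoted in this guide; extend the table from NVIDIA's
# full list for other cards.
COMPUTE_CAPABILITY = {
    "Maxwell TITAN X": "5.2",
    "Pascal TITAN X (2016)": "6.1",
    "GeForce GTX 1080 Ti": "6.1",
    "Tesla V100": "7.0",
}


def capabilities_for(cards):
    """Return the sorted, de-duplicated, comma-separated capability list."""
    found = {COMPUTE_CAPABILITY[card] for card in cards}
    ordered = sorted(found, key=lambda cap: tuple(int(x) for x in cap.split(".")))
    return ",".join(ordered)


print(capabilities_for(["Tesla V100", "GeForce GTX 1080 Ti", "Pascal TITAN X (2016)"]))
```

The resulting string is what you would type at the compute capability prompt of ./configure.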

Building

Now we can start the TensorFlow build process.

$ bazel build --config=opt -c opt //tensorflow/tools/pip_package:build_pip_package

Totally intuitive, right? :-D This command will build TensorFlow using optimized settings for the current machine architecture.

  • Add -c dbg --strip=never in case you do not want debug symbols to be stripped (e.g. for debugging purposes). Usually, you won't need to add this option. Note that -c is the short form of --compilation_mode, so -c dbg also switches the build to debug mode.

  • Add --compilation_mode=dbg (or its short form -c dbg, in place of -c opt) to build in debug instead of release mode, i.e. without optimizations. You shouldn't do this unless you really want to.

This will take some time. Have a coffee, or two, or three. Cook some dinner. Watch a movie.

Building & installing the Python package

Once the above build step has completed without error, the remainder is now easy. Build the Python package, which the build_pip_package script puts into a predefined location (outside of the build tree, yay! </s>).

  $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

And install the built wheel package:

  $ pip install /tmp/tensorflow_pkg/tensorflow-2.1.0-cp37-cp37m-linux_x86_64.whl
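
The wheel file name depends on the Python version in the active environment, so the cp37-cp37m part above only applies to Python 3.7. Here is a small sketch for guessing the name, assuming the file name layout shown above; `wheel_filename` is a hypothetical helper, and simply listing /tmp/tensorflow_pkg is the authoritative check.

```python
import sys


def wheel_filename(tf_version, python_version=None, platform="linux_x86_64"):
    """Guess the wheel file name that build_pip_package produces."""
    major, minor = (python_version or sys.version_info)[:2]
    tag = "cp%d%d" % (major, minor)
    # CPython dropped the "m" ABI suffix starting with Python 3.8.
    abi = tag + "m" if (major, minor) < (3, 8) else tag
    return "tensorflow-%s-%s-%s-%s.whl" % (tf_version, tag, abi, platform)


print(wheel_filename("2.1.0", (3, 7)))
print(wheel_filename("2.1.0", (3, 8)))
```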

Testing the installation

Google suggests testing the TensorFlow installation with the following command:

$ python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

This does not make explicit use of CUDA yet, but will emit a whole bunch of initialization messages that can give an indication whether all libraries could be loaded. And it should print the requested sum.

It worked? Great! Be happy and hope you won't have to build TensorFlow again any time soon...


Appendix

Content of the grpc_gettid_fix.patch file, to be added to the third_party directory

diff -rc grpc/src/core/lib/gpr/log_linux.cc grpc-patched/src/core/lib/gpr/log_linux.cc
*** grpc/src/core/lib/gpr/log_linux.cc  2019-04-03 21:06:27.000000000 +0200
--- grpc-patched/src/core/lib/gpr/log_linux.cc  2019-12-22 17:17:02.000000000 +0100
***************
*** 40,46 ****
  #include <time.h>
  #include <unistd.h>

! static long gettid(void) { return syscall(__NR_gettid); }

  void gpr_log(const char* file, int line, gpr_log_severity severity,
               const char* format, ...) {
--- 40,46 ----
  #include <time.h>
  #include <unistd.h>

! static long sys_gettid(void) { return syscall(__NR_gettid); }

  void gpr_log(const char* file, int line, gpr_log_severity severity,
               const char* format, ...) {
***************
*** 70,76 ****
    gpr_timespec now = gpr_now(GPR_CLOCK_REALTIME);
    struct tm tm;
    static __thread long tid = 0;
!   if (tid == 0) tid = gettid();

    timer = static_cast<time_t>(now.tv_sec);
    final_slash = strrchr(args->file, '/');
--- 70,76 ----
    gpr_timespec now = gpr_now(GPR_CLOCK_REALTIME);
    struct tm tm;
    static __thread long tid = 0;
!   if (tid == 0) tid = sys_gettid();

    timer = static_cast<time_t>(now.tv_sec);
    final_slash = strrchr(args->file, '/');
diff -rc grpc/src/core/lib/gpr/log_posix.cc grpc-patched/src/core/lib/gpr/log_posix.cc
*** grpc/src/core/lib/gpr/log_posix.cc  2019-04-03 21:06:27.000000000 +0200
--- grpc-patched/src/core/lib/gpr/log_posix.cc  2019-12-22 17:17:30.000000000 +0100
***************
*** 31,37 ****
  #include <string.h>
  #include <time.h>

! static intptr_t gettid(void) { return (intptr_t)pthread_self(); }

  void gpr_log(const char* file, int line, gpr_log_severity severity,
               const char* format, ...) {
--- 31,37 ----
  #include <string.h>
  #include <time.h>

! static intptr_t sys_gettid(void) { return (intptr_t)pthread_self(); }

  void gpr_log(const char* file, int line, gpr_log_severity severity,
               const char* format, ...) {
***************
*** 86,92 ****
    char* prefix;
    gpr_asprintf(&prefix, "%s%s.%09d %7" PRIdPTR " %s:%d]",
                 gpr_log_severity_string(args->severity), time_buffer,
!                (int)(now.tv_nsec), gettid(), display_file, args->line);

    fprintf(stderr, "%-70s %s\n", prefix, args->message);
    gpr_free(prefix);
--- 86,92 ----
    char* prefix;
    gpr_asprintf(&prefix, "%s%s.%09d %7" PRIdPTR " %s:%d]",
                 gpr_log_severity_string(args->severity), time_buffer,
!                (int)(now.tv_nsec), sys_gettid(), display_file, args->line);

    fprintf(stderr, "%-70s %s\n", prefix, args->message);
    gpr_free(prefix);
diff -rc grpc/src/core/lib/iomgr/ev_epollex_linux.cc grpc-patched/src/core/lib/iomgr/ev_epollex_linux.cc
*** grpc/src/core/lib/iomgr/ev_epollex_linux.cc 2019-04-03 21:06:27.000000000 +0200
--- grpc-patched/src/core/lib/iomgr/ev_epollex_linux.cc 2019-12-22 17:18:12.000000000 +0100
***************
*** 1103,1109 ****
  }

  #ifndef NDEBUG
! static long gettid(void) { return syscall(__NR_gettid); }
  #endif

  /* pollset->mu lock must be held by the caller before calling this.
--- 1103,1109 ----
  }

  #ifndef NDEBUG
! static long sys_gettid(void) { return syscall(__NR_gettid); }
  #endif

  /* pollset->mu lock must be held by the caller before calling this.
***************
*** 1123,1129 ****
  #define WORKER_PTR (&worker)
  #endif
  #ifndef NDEBUG
!   WORKER_PTR->originator = gettid();
  #endif
    if (grpc_polling_trace.enabled()) {
      gpr_log(GPR_INFO,
--- 1123,1129 ----
  #define WORKER_PTR (&worker)
  #endif
  #ifndef NDEBUG
!   WORKER_PTR->originator = sys_gettid();
  #endif
    if (grpc_polling_trace.enabled()) {
      gpr_log(GPR_INFO,

git diff of the TensorFlow repository, identifying modified files

$ git diff
diff --git a/tensorflow/workspace.bzl b/tensorflow/workspace.bzl
index 77e605fe76..d2dcef48d7 100755
--- a/tensorflow/workspace.bzl
+++ b/tensorflow/workspace.bzl
@@ -514,6 +514,7 @@ def tf_repositories(path_prefix = "", tf_repo_name = ""):
     # WARNING: make sure ncteisen@ and vpai@ are cc-ed on any CL to change the below rule
     tf_http_archive(
         name = "grpc",
+        patch_file = clean_dep("//third_party:grpc_gettid_fix.patch"),
         sha256 = "67a6c26db56f345f7cee846e681db2c23f919eba46dd639b09462d1b6203d28c",
         strip_prefix = "grpc-4566c2a29ebec0835643b972eb99f4306c4234a3",
         system_build_file = clean_dep("//third_party/systemlibs:grpc.BUILD"),
diff --git a/third_party/nccl/build_defs.bzl.tpl b/third_party/nccl/build_defs.bzl.tpl
index 5719139855..5f5c3a1008 100644
--- a/third_party/nccl/build_defs.bzl.tpl
+++ b/third_party/nccl/build_defs.bzl.tpl
@@ -113,7 +113,6 @@ def _device_link_impl(ctx):
             "--cmdline=--compile-only",
             "--link",
             "--compress-all",
-            "--bin2c-path=%s" % bin2c.dirname,
             "--create=%s" % tmp_fatbin.path,
             "--embedded-fatbin=%s" % fatbin_h.path,
         ] + images,
@RubinXnibu

YES !!!! I have TF 2.2 and TF 1.15.3 both running with GPU and the same CUDA and drivers on the same box!!! OH YEA OH YEA !!!! Thank you for your post!

(To anyone wanting to duplicate this, I had no problems building TF 2.2 with above instructions, and for building TF 1.15.3 it is almost as easy, you just have to take care of a couple issues. You just change the checkout command to "git checkout v1.15.3", install and use bazel 0.26.1, and deal with some compile errors. Yes, the source required fixing four things and one fix in bazel 0.26.1's cache: The four fixes are here:
themikem/tensorflow@cc5645f
And for the cache, edit this file:
~/.cache/bazel/bazel//external/grpc/src/core/lib/gpr/log_linux.cc
and rename the function static long gettid(void) to mygettid(void) to avoid a conflict with system function of the same name.

Thanks again kmhofmann !! )

@jarlostensen

jarlostensen commented Jun 8, 2020

Great post, thanks for this. My experience was that it was "not straight forward" but that's to be expected.
I used pipenv btw.

A couple of things I had to fix were;

  • A header file somewhere (I forgot where, sorry) throws a compile time error if the GCC version > 8 (unbelievable) but this line can simply be commented out.

  • I had no end of trouble getting keras_application and keras_preprocessing installed, tbh, and in the end I had to install it in the "host" environment (i.e. I ran pip install keras) outside of the pipenv I was running. Perhaps my choice of virtenv was the problem, I don't know, but when I did this the dogging "import keras_preprocessing" error went away.

  • You need TIME, at least 12 Gigs of RAM and ideally as many cores as you can find. Otherwise you will wait. For a long time.

  • Consider putting the PATH setup and bazel command line in a shell script, you will rerun frequently so having that handy is good.

  • Be strong.

@stanwin00

Hello, great tutorial! but i keep getting this gcc compile error. Anyway I can fix this?
ensorflow/compiler/mlir/tensorflow/BUILD:175:1: C++ compilation of rule '//tensorflow/compiler/mlir/tensorflow:tensorflow' failed (Exit 1)
gcc: fatal error: Killed signal terminated program cc1plus

@xenon3dfx

xenon3dfx commented Jul 26, 2020

Hello everybody.

First of all, thank you Hoffman and the people who commented here for your contributions. This page is one of the best resources I have found, if not the best.

The GPU in my new intel i7 setup is an RTX 2060 SUPER. I would like to make Cuda work with Tensorflow. I have been for 6 days following this guide, actually doing nothing else.

  • I went through 4 complete fresh installs of Ubuntu 20.04. I tried different combinations of drivers: Nvidia 440.100, 450.51 and 450.57 , CUDA 10.1, 10.02 and 11.0, cuDNN 7 and 8. And everything installed from apt or from the downloaded packages.

In one of these 4 fresh installs, with 450 and Cuda 11 and cuDNN 8 I was able to successfully compile TF 2.2. I successfully installed the wheel file with pip, but unfortunately it did not work: I run different training processes, but when I opened nvidia-smi , the python process did not show, and just the CPU resources were available. Therefore, I understood that my driver/cuda/cudnn installation was broken. I was not able to fix it, so I formatted and started again. Unfortunately, I was not able to compile it again. I tried different ways to install the nVidia drivers, Cuda and cuDNN, to no avail.

I would like to start again, taking attention on the very basic, in a 5th fresh install. But before formatting, I would like to know if I was doing anything wrong.

  1. Could you please let me know a combination of versions that worked for you in Ubuntu 20.04?
     • NVIDIA drivers:
     • CUDA drivers:
     • cuDNN:
  2. Could you please let me know exactly, how did you install the nVidia drivers after a fresh install? Same for the Cuda drivers and cuDNN.

If you let me know how you installed these three items, I think I can take over and compile tensorflow again, to see if it works.

Again, thank you so much.

@kmhofmann
Author

Hi @xenon3dfx,
The exact versions of CUDA, cuDNN, and NVIDIA drivers that I used are mentioned above in the article (currently: NVIDIA driver 440.82, CUDA 10.2, cuDNN v7.6.5). As far as I remember I installed the NVIDIA driver through the Ubuntu-native built-in 'Additional Drivers' mechanism in this instance (since it was equal to the latest available version at the time). I installed the CUDA and cuDNN libraries using the files downloaded from the NVIDIA website (i.e. no apt!), as described here.

Note that the last update to the text was made in May, so apparently versions are outdated by now. Unfortunately I don't have time for another thorough update or any further testing at the moment, so this will have to wait until a later point in time (at least August).

If you use the documented (un)installation instructions for driver and CUDA/cuDNN libraries, you shouldn't have to do any fresh install of Ubuntu 20.04. Uninstalling using the official scripts (do consult the NVIDIA docs) will be sufficient to remove all parts, if necessary.

@jucendrero

Thank you for this amazing guide! I didn't expect that building TF from source would be so unintuitive. I've just installed TF 2.3 following your steps and everything worked perfectly fine for me in Ubuntu 20.04 (NVIDIA driver 450.51.05, CUDA 11, cuDNN v8.0.2).

@rsuprun

rsuprun commented Aug 28, 2020

Thank the heavens! This works!! This is the 4th time I've built TF from source, and everytime is an exploration in masochism. Just when I think I've figured out the gotchas, there's something new lying in wait. Thank you so much for such thorough instructions.

@FHermisch

Whoa, got it! For me it worked nearly completely as described, although I started with a slightly different setup: Ubuntu 18.04, Nvidia-Driver 450.51.06, CUDA 11.0.3, cuDNN 8.0.3.33, GCC 7.5.0.
And then there is the last step(!!!): I tried to execute the "python -c "import tensorflow as tf;" from the directory I was in (because I executed all the steps before there) - it failed for me. Some Googling brought up, that Python will mess with the directory structure and the imports and ends up with "ImportError: cannot import name 'function_pb2'".
Anyone who is stuck there: keep calm, change directory, (to ~ or something) and try again ;-)
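
A plausible explanation, stated here as an assumption rather than a confirmed diagnosis: `python -c` puts the current directory first on the import path, so running the test inside the checkout imports the `tensorflow` source folder instead of the installed wheel. The sketch below only detects that situation; `shadows_installed_package` is a hypothetical helper, and it uses throwaway directories instead of a real checkout.

```python
import os
import tempfile


def shadows_installed_package(directory, package="tensorflow"):
    """True if `directory` holds a folder that would shadow `package` on import."""
    return os.path.isdir(os.path.join(directory, package))


# Demonstration with throwaway directories instead of a real checkout.
with tempfile.TemporaryDirectory() as checkout, tempfile.TemporaryDirectory() as elsewhere:
    os.mkdir(os.path.join(checkout, "tensorflow"))
    print(shadows_installed_package(checkout))   # looks like the source tree
    print(shadows_installed_package(elsewhere))  # a safe place to run the test command
```

Calling it with `os.getcwd()` before the test command tells you whether changing directory is worth trying.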

@piotrv

piotrv commented Sep 27, 2020

I failed compiling TF 2.3.1 (segmentation faults), but with TF 2.3.0, I got it right.
The build process took ages with my I7-4770, almost 4 hours.
I choose bazelisk instead of directly bazel, as it is more convenient to setup

 $ bazel --version
 bazel 3.5.0

This version runs fine.

My config file:

 cat  .tf_configure.bazelrc

build --action_env PYTHON_BIN_PATH="/usr/bin/python3"
build --action_env PYTHON_LIB_PATH="/usr/lib/python3.8/site-packages"
build --python_path="/usr/bin/python3"
build --config=xla
build --action_env TF_CUDA_VERSION="11"
build --action_env TF_CUDNN_VERSION="8"
build --action_env TF_NCCL_VERSION=""
build --action_env TF_CUDA_PATHS="/opt/cuda,/usr/lib,/usr/include"
build --action_env CUDA_TOOLKIT_PATH="/opt/cuda"
build --action_env TF_CUDA_COMPUTE_CAPABILITIES="6.1"
build --action_env LD_LIBRARY_PATH="/opt/cuda/extras/CUPTI/lib64:/opt/cuda/extras/CUPTI/lib64"
build --action_env GCC_HOST_COMPILER_PATH="/usr/bin/gcc-9"
build --config=cuda
build:opt --copt=-march=native
build:opt --copt=-Wno-sign-compare
build:opt --host_copt=-march=native
build:opt --define with_default_optimizations=true
test --flaky_test_attempts=3
test --test_size_filters=small,medium
test --test_env=LD_LIBRARY_PATH
test:v1 --test_tag_filters=-benchmark-test,-no_oss,-no_gpu,-oss_serial
test:v1 --build_tag_filters=-benchmark-test,-no_oss,-no_gpu
test:v2 --test_tag_filters=-benchmark-test,-no_oss,-no_gpu,-oss_serial,-v1only
test:v2 --build_tag_filters=-benchmark-test,-no_oss,-no_gpu,-v1only
build --action_env TF_CONFIGURE_IOS="0"

I always like to (roughly) compare CPU and GPU, so I "embellished" the little Python test a bit:

import time

import tensorflow as tf

cpu_slot = 0
gpu_slot = 0
random_values = [24000, 24000]   # lower values if you get an OOM error
gpus = tf.config.experimental.list_physical_devices('GPU')


def tensor_ops(values):
    return tf.reduce_sum(tf.random.normal(values))


if not gpus:
    print("Sorry, no GPU available")
else:
    # Using CPU at slot 0
    with tf.device('/cpu:' + str(cpu_slot)):
        # Starting a timer
        start = time.monotonic()
        # Doing operations on CPU
        tensor_ops(random_values)
        # Recording how long it took on CPU
        end_CPU = time.monotonic() - start
    # Using the GPU at slot 0
    with tf.device('/gpu:' + str(gpu_slot)):
        # Starting a timer
        start = time.monotonic()
        # Doing operations on GPU
        tensor_ops(random_values)
        # Recording how long it took on GPU
        end_GPU = time.monotonic() - start
        print(f"Executing {random_values[0]} x {random_values[1]} tensor operation:\n")
        print("CPU took:", end_CPU)
        print("GPU took:", end_GPU)
        print(f"\nGPU is {end_CPU / end_GPU:.3f} times faster than CPU !")

Result on my "old" desktop machine with a GTX 1070:

Executing 24000 x 24000 tensor operation:

CPU took: 2.706507972000054
GPU took: 0.11730803399996148

GPU is 23.072 times faster than CPU !

@Jack-KW

Jack-KW commented Sep 28, 2020

Thanks! Helped a lot! Very detailed, comfortable to read article, and very easy to implement instructions.

@github-jeff

I think I am quite close. This is an excellent article. CUDA, and cuDNN appears to be installed, but I cannot seem to figure this error out when I am compiling via bazel 3.1.0. I'm hoping I just have a header file in the wrong location.

Loading: 0 packages loaded
currently loading: tensorflow/tools/pip_package
Fetching @local_config_cuda; fetching
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):

@sbatururimi

sbatururimi commented Nov 26, 2020

I was unable to compile v2.3.1 nor v2.3.0 with

  1. CUDA/cuDNN version: 11.1, 8.0.5, driver 455.45.01
  2. RTX2080Ti
    Always getting
ERROR: An error occurred during the fetch of repository 'local_config_cuda':

But seems to work with v2.4.0rc3 (still in progress). Any advice about how I could use 2.3.1 with the above settings?

Much appreciated!

Updates

  1. I was able to build v2.4.0rc3 successfully using the above Cuda/CuDNN and driver 455.46.01
  2. If, like me, you need an up to date version of tensorflow-text also, then follow the guide above and build v2.3.0 with the above drivers (worked for me)

@becageuse

Thanks for this! I'll keep that in mind if I ever need to use tf2 + cuda11.
I needed to use tf2 but my machine has cuda10.2. A friend suggested me to install tensorflow with conda and it worked like a charm with gpu support! Apparently it automatically installs a cudatoolkit 10.1 locally in your conda virtual env.

@Abilay99

Abilay99 commented Dec 4, 2020

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

@sbatururimi

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

It took me around 2.5h for the whole process

@Abilay99

Abilay99 commented Dec 4, 2020

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

It took me around 2.5h for the whole process

Cool!, What devices do you have? I have gpu: gtx 1650ti for notebook cpu: core i5 10th gen ram: 8gb. I started the installation process yesterday, it hasn't finished yet

@Abilay99

Abilay99 commented Dec 4, 2020

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

It took me around 2.5h for the whole process

Cool!, What devices do you have? I have gpu: gtx 1650ti for notebook cpu: core i5 10th gen ram: 8gb. I started the installation process yesterday, it hasn't finished yet

It took me around 13.5h for the whole process

@sbatururimi

Cool!, What devices do you have? I have gpu: gtx 1650ti for notebook cpu: core i5 10th gen ram: 8gb. I started the installation process yesterday, it hasn't finished yet

A PC build almost as a server (or Game PC configuration almost) :), RTX2080Ti, 64gb RAM, Intel Core i9-9900KS 4 GHz 8-Core Processor

@jediRey

jediRey commented Dec 10, 2020

Hi! Thank you for the complete instructions!
I compiled bazel successfully, however when building tensorflow (this step: bazel build --config=opt -c opt //tensorflow/tools/pip_package:build_pip_package), I get the following error:

ERROR: An error occurred during the fetch of repository 'eigen_archive':
java.io.IOException: Error downloading [https://storage.googleapis.com/mirror.tensorflow.org/gitlab.com/libeigen/eigen/-/archive/386d809bde475c65b7940f290efe80e6a05878c4/eigen-386d809bde475c65b7940f290efe80e6a05878c4.tar.gz, https://gitlab.com/libeigen/eigen/-/archive/386d809bde475c65b7940f290efe80e6a05878c4/eigen-386d809bde475c65b7940f290efe80e6a05878c4.tar.gz] to /home/rey/.cache/bazel/_bazel_rey/7a2ccf9885a6b6731b7d3780dd19d183/external/eigen_archive/eigen-386d809bde475c65b7940f290efe80e6a05878c4.tar.gz: GET returned 406 Not Acceptable

Any ideas? I'm using ubuntu 20.4, bazel 3.1.0, and I have tried with both tensorflow 2.3.0 & 2.3.1
Thank you,
Rey

@pribadihcr

pribadihcr commented Dec 18, 2020

Whoa, got it! For me it worked nearly completely as described, although I started with a slightly different setup: Ubuntu 18.04, Nvidia-Driver 450.51.06, CUDA 11.0.3, cuDNN 8.0.3.33, GCC 7.5.0.
And then there is the last step(!!!): I tried to execute the "python -c "import tensorflow as tf;" from the directory I was in (because I executed all the steps before there) - it failed for me. Some Googling brought up, that Python will mess with the directory structure and the imports and ends up with "ImportError: cannot import name 'function_pb2'".
Anyone who is stuck there: keep calm, change directory, (to ~ or something) and try again ;-)

Hi I still got the same error even I have changed the directory

@manojec054

Uffff, Finally completed. Thanks for the detailed explanation. Had to spend most time in adjusting bazel parameters because of less RAM.
Here is few details if anyone trying to build tensorflow with 8GB RAM.

- add "--jobs=2  --local_ram_resources 2048" to limit RAM consumption.
- Had to use gcc 7.0 
- If your GPU is Geforce GTX 1650 use compute capabilities 7.5. Its not listed in https://developer.nvidia.com/cuda-gpus
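
The limits above can be folded into the build command programmatically. This is only a sketch: the flag values are the ones quoted in this comment, not tuned recommendations, and `bazel_build_command` is a hypothetical helper that merely assembles the argument list.

```python
import shlex

TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def bazel_build_command(jobs=None, ram_mb=None):
    """Assemble the build command, optionally capping parallelism and RAM."""
    args = ["bazel", "build", "--config=opt", "-c", "opt"]
    if jobs is not None:
        args.append("--jobs=%d" % jobs)
    if ram_mb is not None:
        args.append("--local_ram_resources=%d" % ram_mb)
    args.append(TARGET)
    return args


# Values quoted above for a machine with 8GB RAM.
print(" ".join(shlex.quote(a) for a in bazel_build_command(jobs=2, ram_mb=2048)))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.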

@iaroslavragel

Hi @kmhofmann, how do run all those pip command? Package python-pip is no longer present in ubuntu 20.04. So is your pip pointing to pip3 actually? Or did you install it not from ubuntu repo?

@EnziinSystem

Build Tensorflow from source code is a real nightmare.
Also, I have CPU Core i7 and 8 cores with 16GB RAM but I halt built after 6 hours, my computer hangs.

@fouvy

fouvy commented Aug 10, 2021

For CUDA 11.0 and later, the include and lib files have been moved to
/usr/local/cuda-11.0/targets/x86_64-linux
so the fastest way to fix build errors such as a missing cuda.h or cusolver_common.h is to do this:

cp -r /usr/local/cuda-11.0/targets/x86_64-linux/lib/* /usr/local/cuda-11.0/lib64/
cp -r /usr/local/cuda-11.0/targets/x86_64-linux/include/* /usr/local/cuda-11.0/include/

@pnheinsohn

For any of those who are having trouble with @local_config_cuda because of some version incompatibility with libcudart.11.x.y, I suggest this Link: Step 3: Errors, where OP changes the version to 11.0 manually.

@mhoangvslev

I made a dockerised tensorflow-compiler that allows you to compile from source with minimal input from the end user
https://github.com/mhoangvslev/tensorflow-compiler

@clockzhong

I follow the instructions, but found the following errors:


WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/tensorflow/runtime/archive/093ed77f7d50f75b376f40a71ea86e08cedb8b80.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
DEBUG: /home/clock/.cache/bazel/_bazel_clock/3e6fc66f8ed17456d842535e756a7be2/external/bazel_tools/tools/cpp/lib_cc_configure.bzl:118:10: 
Auto-Configuration Warning: 'TMP' environment variable is not set, using 'C:\Windows\Temp' as default
WARNING: Download from https://mirror.bazel.build/github.com/bazelbuild/rules_cc/archive/081771d4a0e9d7d3aa0eed2ef389fa4700dfb23e.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found


After that, the build makes no further progress.

@lakpa-tamang9

lakpa-tamang9 commented Dec 22, 2022

I have the following error while installing the tensorflow package at the final step. Please help me resolve this.

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

The error started from here.

Collecting tb-nightly<2.4.0a0,>=2.3.0a0
  Using cached tb_nightly-2.3.0a20200722-py3-none-any.whl (6.8 MB)
Collecting google-pasta>=0.1.8
  Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Collecting absl-py>=0.7.0
  Using cached absl_py-1.3.0-py3-none-any.whl (124 kB)
Collecting scipy==1.4.1
  Using cached scipy-1.4.1.tar.gz (24.6 MB)
  Installing build dependencies ... error
  error: subprocess-exited-with-error
  
  × pip subprocess to install build dependencies did not run successfully.
  │ exit code: 1
  ╰─> [327 lines of output]
      Ignoring numpy: markers 'python_version == "3.5" and platform_system != "AIX"' don't match your environment
      Ignoring numpy: markers 'python_version == "3.6" and platform_system != "AIX"' don't match your environment

@g588928812

g588928812 commented Apr 16, 2023

Great guide! Just made TF work with CUDA 12.1 and a RTX 3090. Thanks!

@ilan1987

Thanks, you helped me a lot
