TensorFlow Install

Compiled, organized, and written by snow chuai, 2020/07/26


1. Installing and Verifying TensorFlow
1) Confirm that the host CPU supports AVX instructions
[root@srv1 ~]# cat /proc/cpuinfo | grep avx
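The same check can be sketched in Python. This is a minimal illustration, not part of the original procedure: the function name cpu_has_flag and the sample text are assumptions. Unlike the grep above, which also matches avx2 or avx512f as substrings, it looks for the exact flag.

```python
def cpu_has_flag(cpuinfo_text, flag="avx"):
    """Return True if any 'flags' line in cpuinfo_text lists exactly this flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


# Hypothetical excerpt in the format of /proc/cpuinfo, used only for illustration.
sample = "model name\t: Example CPU\nflags\t\t: fpu vme sse sse2 avx avx2\n"
print("AVX supported:", cpu_has_flag(sample))
```

On a real Linux host, pass the contents of /proc/cpuinfo instead of the sample string.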
2) Install Python 3.6
[root@srv1 ~]# yum install python3 -y
3) Install the required packages
[root@srv1 ~]# yum install python3-devel python3-virtualenv gcc gcc-c++ make -y
4) Install TensorFlow
[root@srv1 ~]# su - snow
[snow@srv1 ~]$ virtualenv --system-site-packages -p python3 ./venv
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/snow/venv/bin/python3
Also creating executable in /home/snow/venv/bin/python
Installing setuptools, pip, wheel...done.
[snow@srv1 ~]$ source ./venv/bin/activate
(venv) [snow@srv1 ~]$
(venv) [snow@srv1 ~]$ pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade tensorflow==2.0.0b1
# With TensorFlow 2.0 and later, NumPy 1.17 or newer produces many warnings, so downgrade NumPy to 1.16.
(venv) [snow@srv1 ~]$ pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade numpy==1.16.0
(venv) [snow@srv1 ~]$ pip3 show numpy
Name: numpy
Version: 1.16.0
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: None
License: BSD
Location: /home/cent/venv/lib/python3.6/site-packages
Requires:
Required-by: tensorflow, tb-nightly, Keras-Preprocessing, Keras-Applications, h5py
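The version rule above can be expressed as a small check. This is a sketch under the assumption that plain dotted version strings are enough; the helper names version_tuple and needs_downgrade are hypothetical, and the 1.17 threshold is the one given in this guide.

```python
def version_tuple(version):
    """Turn '1.16.0' into (1, 16, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def needs_downgrade(numpy_version, limit="1.17"):
    """True when numpy_version is at or above the limit used in this guide."""
    return version_tuple(numpy_version) >= version_tuple(limit)


print(needs_downgrade("1.16.0"))
print(needs_downgrade("1.19.5"))
```

Inside the virtualenv, the installed version string is available as numpy.__version__.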
5) Verify TensorFlow
(venv) [snow@srv1 ~]$ python3 -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2020-07-26 23:46:23.047533: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2801355000 Hz
2020-07-26 23:46:23.048084: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5633e28df970 executing computations on platform Host. Devices:
2020-07-26 23:46:23.048148: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
tf.Tensor(545.2815, shape=(), dtype=float32)
(venv) [snow@srv1 ~]$ python3 -c "import tensorflow as tf; hello = tf.constant('Hello, TensorFlow World'); tf.print(hello)"
2020-07-26 23:48:13.047533: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2801355000 Hz
2020-07-26 23:48:13.815812: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5633f4d64910 executing computations on platform Host. Devices:
2020-07-26 23:48:13.815899: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
Hello, TensorFlow World
2. TensorFlow with GPU Support
1) Install CUDA 10.1
2) Install Python 3.8
3) Download cuDNN (CUDA Deep Neural Network library)
https://developer.nvidia.com/rdp/cudnn-download
4) Install cuDNN
[root@srv1 ~]# tar zxvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
[root@srv1 ~]# cp ./cuda/include/cudnn.h /usr/local/cuda-10.1/include/
[root@srv1 ~]# cp -a ./cuda/lib64/libcudnn* /usr/local/cuda-10.1/lib64/
[root@srv1 ~]# ldconfig
[root@srv1 ~]# echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.1/extras/CUPTI/lib64' >> /etc/profile.d/cuda101.sh
[root@srv1 ~]# source /etc/profile.d/cuda101.sh
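After sourcing the profile script, it can be useful to confirm that the CUPTI directory really is on LD_LIBRARY_PATH. The following is a minimal sketch; the function name dir_on_library_path is an assumption made for illustration.

```python
import os


def dir_on_library_path(directory, ld_library_path):
    """Return True if directory is one of the colon-separated path entries."""
    wanted = os.path.normpath(directory)
    # Empty entries appear when the variable was unset before the export.
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


cupti = "/usr/local/cuda-10.1/extras/CUPTI/lib64"
print(dir_on_library_path(cupti, os.environ.get("LD_LIBRARY_PATH", "")))
```

The export line above produces a leading colon when LD_LIBRARY_PATH was previously empty, which is why empty entries are skipped.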
5) Install the required software
[root@srv1 ~]# yum install centos-release-scl-rh centos-release-scl -y
[root@srv1 ~]# sed -i -e "s/\]$/\]\npriority=10/g" /etc/yum.repos.d/CentOS-SCLo-scl.repo
[root@srv1 ~]# sed -i -e "s/\]$/\]\npriority=10/g" /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo
[root@srv1 ~]# sed -i -e "s/enabled=1/enabled=0/g" /etc/yum.repos.d/CentOS-SCLo-scl.repo
[root@srv1 ~]# sed -i -e "s/enabled=1/enabled=0/g" /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo
[root@srv1 ~]# yum --enablerepo=centos-sclo-rh install rh-python38-python-devel gcc gcc-c++ make -y
[root@srv1 ~]# pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple virtualenv
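To make the two sed expressions above easier to follow, here is a rough Python equivalent applied to an in-memory string. The sample repo text is made up for illustration, and the function names add_priority and disable_repos are assumptions; the sketch shows what the substitutions do, not how yum itself reads the files.

```python
import re


def add_priority(repo_text, priority=10):
    """Insert 'priority=N' on the line after every '[section]' header."""
    return re.sub(r"\]$", "]\npriority=%d" % priority, repo_text, flags=re.MULTILINE)


def disable_repos(repo_text):
    """Switch every enabled repository off, like s/enabled=1/enabled=0/g."""
    return repo_text.replace("enabled=1", "enabled=0")


# Hypothetical repo file content, only for demonstration.
sample = "[centos-sclo-rh]\nname=SCLo rh\nenabled=1\n"
print(disable_repos(add_priority(sample)))
```

With the repositories disabled by default, they are only consulted when named explicitly, as in the yum --enablerepo=centos-sclo-rh command above.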
6) Install TensorFlow 2.2
[root@srv1 ~]# su - snow
[snow@srv1 ~]$ virtualenv --system-site-packages -p python3 ./venv
[snow@srv1 ~]$ source ./venv/bin/activate
(venv) [snow@srv1 ~]$ pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade tensorflow==2.2.0
7) Verify
(venv) [snow@srv1 ~]$ python3 -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2020-07-26 23:58:07.037218: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
......
......
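The reduce_sum test shows that TensorFlow runs, but the log has to be read carefully to see whether a GPU was picked up. A more direct check is to ask TensorFlow which GPUs it can see. This is a sketch that assumes TensorFlow 2.1 or later, where tf.config.list_physical_devices is available; the wrapper name visible_gpus is hypothetical.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if there are none
    or TensorFlow is not importable in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(visible_gpus())
```

An empty list on a GPU host usually points back at the CUDA or cuDNN steps above.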
3. TensorFlow: Using Docker Images (CPU)
1) Install Docker-CE
[root@srv1 ~]# yum install yum-utils -y
[root@srv1 ~]# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
[root@srv1 ~]# yum install docker-ce -y
[root@srv1 ~]# systemctl enable --now docker
[root@srv1 ~]# vim /etc/docker/daemon.json
{
    "registry-mirrors": ["https://3laho3y3.mirror.aliyuncs.com"]
}
[root@srv1 ~]# systemctl restart docker
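A syntax error in daemon.json prevents the Docker daemon from starting, so it can help to generate or validate the file programmatically before restarting. The following is a minimal sketch; the helper name daemon_config is an assumption, and the mirror URL is the one used above.

```python
import json


def daemon_config(mirrors):
    """Render a daemon.json body that only sets registry mirrors."""
    return json.dumps({"registry-mirrors": list(mirrors)}, indent=4)


text = daemon_config(["https://3laho3y3.mirror.aliyuncs.com"])
json.loads(text)  # raises ValueError if the text is not valid JSON
print(text)
```

The same json.loads call can be pointed at the contents of an existing /etc/docker/daemon.json to check a hand-edited file.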
2) Download the TensorFlow image
[root@srv1 ~]# su - snow
[snow@srv1 ~]$ podman pull tensorflow/tensorflow:2.0.0-py3
[snow@srv1 ~]$ podman images
REPOSITORY                        TAG         IMAGE ID       CREATED        SIZE
docker.io/tensorflow/tensorflow   2.0.0-py3   90f5cb97b18f   9 months ago   1.09 GB
3) Run the TensorFlow container
[snow@srv1 ~]$ podman run --rm tensorflow/tensorflow:2.0.0-py3 \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2020-07-27 00:23:15.128039: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2801480000 Hz
......
......
[snow@srv1 ~]$ vim hello_tensorflow.py
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow World!')
tf.print(hello)
[snow@srv1 ~]$ podman run --rm -v $PWD:/tmp -w /tmp tensorflow/tensorflow:2.0.0-py3 python ./hello_tensorflow.py
2020-07-27 00:25:03.110934: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2801480000 Hz
2020-07-27 00:25:03.380641: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4add050 executing computations on platform Host. Devices:
2020-07-27 00:25:03.936027: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
Hello, TensorFlow World!
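In the last podman run command, -v mounts the current host directory at /tmp inside the container and -w makes /tmp the working directory, which is why ./hello_tensorflow.py resolves. The sketch below builds the same argument list in Python; the function name run_script_command and its parameters are hypothetical, and it only constructs the command without running it.

```python
import shlex


def run_script_command(image, script, host_dir, mount_point="/tmp", runtime="podman"):
    """Build the argv for running a host script inside a throwaway container."""
    return [
        runtime, "run", "--rm",
        "-v", "%s:%s" % (host_dir, mount_point),  # bind-mount the host directory
        "-w", mount_point,                        # start in the mounted directory
        image, "python", "./" + script,
    ]


cmd = run_script_command("tensorflow/tensorflow:2.0.0-py3", "hello_tensorflow.py", "/home/snow")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to subprocess.run on a host where the container runtime is installed.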
4) Configure SELinux
[root@srv1 ~]# vim my-python.te
module my-python 1.0;

require {
        type user_home_t;
        type container_t;
        type user_home_dir_t;
        class file { create ioctl open read unlink write };
        class dir { add_name remove_name write };
}

#============= container_t ==============
allow container_t user_home_dir_t:dir { add_name remove_name write };
allow container_t user_home_dir_t:file { create ioctl open read unlink write };
allow container_t user_home_t:file { ioctl open read };

[root@srv1 ~]# checkmodule -m -M -o my-python.mod my-python.te
[root@srv1 ~]# semodule_package --outfile my-python.pp --module my-python.mod
[root@srv1 ~]# semodule -i my-python.pp
5) Install and run the TensorFlow image that includes Jupyter Notebook
[snow@srv1 ~]$ podman pull tensorflow/tensorflow:2.0.0-py3-jupyter
[snow@srv1 ~]$ podman images
REPOSITORY                        TAG                 IMAGE ID       CREATED        SIZE
docker.io/tensorflow/tensorflow   2.0.0-py3-jupyter   c652a4fc8a4f   9 months ago   1.24 GB
docker.io/tensorflow/tensorflow   2.0.0-py3           90f5cb97b18f   9 months ago   1.09 GB
[snow@srv1 ~]$ podman run -dt --name jn -p 8888:8888 tensorflow/tensorflow:2.0.0-py3-jupyter
a20ffeda3160ea59d1efd4182fb96a93bae4df4c764befaadc57c1ee77fcbcd1
[snow@srv1 ~]$ podman exec jn bash -c "jupyter notebook list"
Currently running servers:
http://0.0.0.0:8888/?token=e72eefb6fa80c74d9640ec9db68a25fd68484d527126bdf6 :: /tf
6) Access Jupyter Notebook
Copy "http://0.0.0.0:8888/?token=e72eefb6fa80c74d9640ec9db68a25fd68484d527126bdf6" and replace 0.0.0.0 with the host's IP address, for example:
http://192.168.10.111:8888/?token=e72eefb6fa80c74d9640ec9db68a25fd68484d527126bdf6
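The address substitution in step 6 can also be done programmatically, which avoids copy-and-paste mistakes in the long token. This is a small sketch using the standard urllib.parse module; the function name with_host and the shortened token in the example are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def with_host(url, host):
    """Return url with its host replaced, keeping port, path, and token."""
    parts = urlsplit(url)
    port = ":%d" % parts.port if parts.port else ""
    return urlunsplit((parts.scheme, host + port, parts.path, parts.query, parts.fragment))


# Shortened placeholder token; use the one printed by "jupyter notebook list".
print(with_host("http://0.0.0.0:8888/?token=abc", "192.168.10.111"))
```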

4. TensorFlow: Using Docker Images (GPU)
1) Make sure the host supports AVX and has the NVIDIA graphics driver installed
2) Install Docker-CE
3) Install the NVIDIA Container Toolkit
[root@srv1 ~]# curl https://nvidia.github.io/nvidia-docker/centos8/nvidia-docker.repo > /etc/yum.repos.d/nvidia-docker.repo
[root@srv1 ~]# yum install nvidia-container-toolkit -y
4) Configure SELinux
[root@srv1 ~]# vim nvidiasmi.te
module nvidiasmi 1.0;

require {
        type container_runtime_tmpfs_t;
        type container_t;
        type xserver_misc_device_t;
        class file { open read };
        class chr_file { getattr ioctl open read write };
}

#============= container_t ==============
allow container_t container_runtime_tmpfs_t:file { open read };
allow container_t xserver_misc_device_t:chr_file { getattr ioctl open read write };

[root@srv1 ~]# checkmodule -m -M -o nvidiasmi.mod nvidiasmi.te
[root@srv1 ~]# semodule_package --outfile nvidiasmi.pp --module nvidiasmi.mod
[root@srv1 ~]# semodule -i nvidiasmi.pp
5) Run nvidia-smi
[root@srv1 ~]# docker run --gpus all --rm nvidia/cuda:11.0-base nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:05:00.0 Off |                  N/A |
| 27%   35C    P5    27W / 180W |      0MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@62ffc368879:/# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:05:00.0 Off |                  N/A |
| 27%   36C    P5    25W / 180W |      0MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@62ffc368879:/# exit
[root@srv1 ~]# podman images
REPOSITORY              TAG         IMAGE ID       CREATED        SIZE
docker.io/nvidia/cuda   11.0-base   27c1a1745519   4 days ago     125 MB
docker.io/nvidia/cuda   10.0-base   841d44dd4b3c   7 months ago   113 MB
6) Download the CUDA and TensorFlow images
[root@srv1 ~]# docker pull nvidia/cuda:10.1-base
[root@srv1 ~]# docker pull tensorflow/tensorflow:2.1.0-gpu-py3
[root@srv1 ~]# docker images
REPOSITORY                        TAG             IMAGE ID       CREATED        SIZE
docker.io/tensorflow/tensorflow   2.1.0-gpu-py3   e2a4af785bdb   6 months ago   4.13 GB
docker.io/nvidia/cuda             10.1-base       3b55548ae91f   7 months ago   109 MB
7) Verify
[root@srv1 ~]# docker run --gpus all --rm nvidia/cuda:10.1-base nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:05:00.0 Off |                  N/A |
| 27%   35C    P5    25W / 180W |      0MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
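Note that the header still reports CUDA Version 11.0 inside the 10.1-base image: that field reflects the highest CUDA version the host driver supports, not the toolkit shipped in the container. When scripting this check, the two version fields can be pulled out of the output with a regular expression. The sketch below is an illustration only; the function name parse_smi_header is an assumption.

```python
import re


def parse_smi_header(output):
    """Extract driver and CUDA versions from nvidia-smi output, or None."""
    match = re.search(r"Driver Version:\s*(\S+)\s+CUDA Version:\s*(\S+)", output)
    if match is None:
        return None
    return {"driver": match.group(1), "cuda": match.group(2)}


header = "| NVIDIA-SMI 450.57       Driver Version: 450.57       CUDA Version: 11.0     |"
print(parse_smi_header(header))
```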
8) Verify and run TensorFlow
[root@srv1 ~]# docker run --gpus all --rm tensorflow/tensorflow:2.1.0-gpu-py3 \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2020-07-27 00:48:25.131105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
......
......

 
