CUDA, cuDNN, Run Commands, ipykernel, TensorFlow

CUDA Version

cat /usr/local/cuda/version.txt

CUDA Archive

https://developer.nvidia.com/cuda-toolkit-archive

wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run

chmod +x cuda_9.0.176_384.81_linux-run

./cuda_9.0.176_384.81_linux-run --extract='/shared/why16gzl/'

./cuda-linux.9.0.176-22781540.run

In [6]:
import os
cuda_version_origin = os.popen('cat /usr/local/cuda/version.txt')
cuda_version_origin.read()
Out[6]:
'CUDA Version 10.1.168\n'
In [12]:
cuda_version_new = os.popen('nvcc --version')
cuda_version_new.read().split('\n')[-2]
Out[12]:
'Cuda compilation tools, release 9.0, V9.0.176'
In [20]:
os.popen('cat /shared/why16gzl/cuda-9.0/version.txt').read()
Out[20]:
'CUDA Version 9.0.176\n'
In [24]:
os.popen('cat /shared/why16gzl/cuda/cuda-9.2/version.txt').read()
Out[24]:
'CUDA Version 9.2.148\n'
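
The version checks above shell out to cat through os.popen. A minimal sketch of the same check in plain Python follows, assuming each toolkit keeps a version.txt at its root as the ones above do. The helper names parse_cuda_version and read_cuda_version are hypothetical, not part of any CUDA tooling.

```python
import re
from pathlib import Path
from typing import Optional


def parse_cuda_version(text: str) -> Optional[str]:
    """Extract '9.0.176' from a line such as 'CUDA Version 9.0.176'."""
    match = re.search(r"CUDA Version\s+(\d+(?:\.\d+)*)", text)
    return match.group(1) if match else None


def read_cuda_version(cuda_root: str) -> Optional[str]:
    """Read <cuda_root>/version.txt; return None when the file is missing."""
    version_file = Path(cuda_root) / "version.txt"
    if not version_file.is_file():
        return None
    return parse_cuda_version(version_file.read_text())


if __name__ == "__main__":
    # Paths taken from the cells above; adjust them to your own install.
    for root in ("/usr/local/cuda", "/shared/why16gzl/cuda-9.0"):
        print(root, "->", read_cuda_version(root))
```

Returning None for a missing file makes it easy to loop over several candidate install locations without a try/except around each one.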

cuDNN version

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

cuDNN Install

tar -xzvf cudnn-9.0-linux-x64-v7.1.tgz

cp cuda/include/cudnn.h /usr/local/cuda-9.0/include

cp cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64

chmod a+r /usr/local/cuda-9.0/include/cudnn.h /usr/local/cuda-9.0/lib64/libcudnn*

Note: to install under '/shared/why16gzl/cuda-9.0/' instead, replace '/usr/local/cuda-9.0/' with '/shared/why16gzl/cuda-9.0/' in the commands above.

In [42]:
os.popen('cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2').read().split('\n')
Out[42]:
['#define CUDNN_MAJOR 7',
 '#define CUDNN_MINOR 6',
 '#define CUDNN_PATCHLEVEL 3',
 '--',
 '#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)',
 '',
 '#include "driver_types.h"',
 '']
In [23]:
os.popen('cat /shared/why16gzl/cuda/cuda-9.2/include/cudnn.h | grep CUDNN_MAJOR -A 2').read().split('\n')
Out[23]:
['#define CUDNN_MAJOR 7',
 '#define CUDNN_MINOR 6',
 '#define CUDNN_PATCHLEVEL 4',
 '--',
 '#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)',
 '',
 '#include "driver_types.h"',
 '']
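
The grep output above still has to be read by eye. As a rough sketch, the three #define lines can be parsed into a version tuple and combined with the same formula the header uses for CUDNN_VERSION. The function names are hypothetical, and the sketch assumes the defines live in cudnn.h as they do for the cuDNN 7 headers shown here.

```python
import re
from typing import Optional, Tuple


def parse_cudnn_version(header_text: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patchlevel) from the text of cudnn.h, or None."""
    values = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+CUDNN_" + name + r"\s+(\d+)",
            header_text,
            re.MULTILINE,
        )
        if match is None:
            return None
        values.append(int(match.group(1)))
    return values[0], values[1], values[2]


def cudnn_version_number(version: Tuple[int, int, int]) -> int:
    """Same formula as the CUDNN_VERSION macro in the header."""
    major, minor, patch = version
    return major * 1000 + minor * 100 + patch


if __name__ == "__main__":
    sample = (
        "#define CUDNN_MAJOR 7\n"
        "#define CUDNN_MINOR 6\n"
        "#define CUDNN_PATCHLEVEL 3\n"
    )
    version = parse_cudnn_version(sample)
    print(version, cudnn_version_number(version))
```

For the header under /usr/include shown above this gives (7, 6, 3) and 7603.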

Run Commands (.bashrc)

In [32]:
os.popen('more ~/.bashrc').read().split('\n')[-11:-7]
Out[32]:
['#cuda-9.2',
 'export PATH=/shared/why16gzl/cuda/cuda-9.2/bin:$PATH',
 'export LD_LIBRARY_PATH=/shared/why16gzl/cuda/cuda-9.2/lib64',
 'export PATH="$PATH:$HOME/bin"']
In [30]:
os.popen('more ~/tf_1.12/bin/activate').read().split('\n')[-5:-1]
Out[30]:
['#cuda-9.0',
 'export PATH=/shared/why16gzl/cuda-9.0/bin${PATH:+:${PATH}} # new',
 'export LD_LIBRARY_PATH=/usr/local/lib:/shared/why16gzl/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}:/usr/lib # new',
 'export CUDA_HOME=/shared/why16gzl/cuda-9.0  #new']
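
The two snippets above differ in one detail: the .bashrc line assigns LD_LIBRARY_PATH outright, replacing whatever was set before, while the activate script uses the ${VAR:+:${VAR}} form, which appends the old value only when it is non-empty. The sketch below mimics that expansion in Python to make the difference concrete. prepend_path is a hypothetical helper, not a shell or library function.

```python
def prepend_path(new_dir: str, current: str) -> str:
    """Mimic the shell expansion  new_dir${VAR:+:${VAR}}.

    The ':' separator and the old value are added only when the old value
    is non-empty, so the result never ends with a stray ':'.
    """
    if current:
        return new_dir + ":" + current
    return new_dir


if __name__ == "__main__":
    cuda_lib = "/shared/why16gzl/cuda-9.0/lib64"
    # Variable previously unset or empty: no trailing separator.
    print(prepend_path(cuda_lib, ""))
    # Variable already set: the old entries are kept after the new one.
    print(prepend_path(cuda_lib, "/usr/local/lib"))
```

Avoiding the stray separator matters because, as far as I know, the dynamic loader treats an empty entry in LD_LIBRARY_PATH as the current directory.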

ipykernel (using jupyter notebooks with a virtual environment)

Inside this folder create a new virtual environment:

python -m venv tf_1.12

Then activate it:

source tf_1.12/bin/activate

Now, from inside the environment install ipykernel using pip:

pip install ipykernel

Then register the environment as a new Jupyter kernel:

ipython kernel install --user --name=tf_1.12.0

Kernel registered without ipykernel installed inside the env: the interpreter belongs to miniconda3

In [39]:
os.popen('more /home1/w/why16gzl/.local/share/jupyter/kernels/tf_1.12/kernel.json').read().split('\n')[4:6]
Out[39]:
[' "argv": [', '  "/home1/w/why16gzl/miniconda3/bin/python",']

Kernel registered with ipykernel installed inside the env: the interpreter belongs to the venv (tf_1.12)

In [40]:
os.popen('more /home1/w/why16gzl/.local/share/jupyter/kernels/tf_1.12.0/kernel.json').read().split('\n')[4:6]
Out[40]:
[' "argv": [', '  "/mnt/castor/seas_home/w/why16gzl/tf_1.12/bin/python",']
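
Both checks above slice kernel.json by line number, which breaks if the file layout changes. A small sketch that parses the JSON instead is shown below, assuming the standard kernelspec layout in which the first element of "argv" is the interpreter. kernel_interpreter is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Optional


def kernel_interpreter(kernel_json_path: str) -> Optional[str]:
    """Return the interpreter a Jupyter kernel launches (argv[0]), or None."""
    spec = json.loads(Path(kernel_json_path).read_text())
    argv = spec.get("argv", [])
    return argv[0] if argv else None


if __name__ == "__main__":
    # Path taken from the cell above; adjust it to your own kernel directory.
    path = Path.home() / ".local/share/jupyter/kernels/tf_1.12.0/kernel.json"
    if path.is_file():
        print(kernel_interpreter(str(path)))
    else:
        print("kernel.json not found at", path)
```

If the printed path points at miniconda3 rather than the venv, the kernel was registered from outside the environment and will not see the packages installed in tf_1.12.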

Install TensorFlow in Virtual Environment (tf_1.12)

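'pip install tensorflow-gpu==1.12.0' does not work in this environment: the venv runs Python 3.7, and TensorFlow 1.12.0 was published with wheels only up to Python 3.6. As a workaround, download the cp36 wheel and rename it so that pip accepts it: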

cd /shared/why16gzl

wget https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.12.0-cp36-cp36m-linux_x86_64.whl

mv tensorflow_gpu-1.12.0-cp36-cp36m-linux_x86_64.whl tensorflow_gpu-1.12.0-cp37-cp37m-linux_x86_64.whl

pip install tensorflow_gpu-1.12.0-cp37-cp37m-linux_x86_64.whl

In [41]:
import tensorflow as tf
tf.__version__
/mnt/castor/seas_home/w/why16gzl/tf_1.12/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/mnt/castor/seas_home/w/why16gzl/tf_1.12/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/mnt/castor/seas_home/w/why16gzl/tf_1.12/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/mnt/castor/seas_home/w/why16gzl/tf_1.12/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/mnt/castor/seas_home/w/why16gzl/tf_1.12/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/mnt/castor/seas_home/w/why16gzl/tf_1.12/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home1/w/why16gzl/miniconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.7
  return f(*args, **kwds)
Out[41]:
'1.12.0'
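
The rename above works because pip decides compatibility from the tags in the wheel's file name. The sketch below pulls the Python tag out of a wheel name and builds the tag of the running interpreter for comparison. Both function names are hypothetical, and running_python_tag assumes CPython.

```python
import sys


def wheel_python_tag(wheel_filename: str) -> str:
    """Return the Python tag of a wheel file name.

    Wheel names follow name-version[-build]-python-abi-platform.whl,
    so the Python tag is the third field from the end.
    """
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + wheel_filename)
    parts = wheel_filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel file name: " + wheel_filename)
    return parts[-3]


def running_python_tag() -> str:
    """Tag of the running interpreter, e.g. 'cp37' (assumes CPython)."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


if __name__ == "__main__":
    wheel = "tensorflow_gpu-1.12.0-cp36-cp36m-linux_x86_64.whl"
    print("wheel built for:", wheel_python_tag(wheel))
    print("running on:     ", running_python_tag())
```

Renaming the file changes only what pip sees. The compiled extension modules inside are still built for Python 3.6, which is what the RuntimeWarning in the output above reports.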

References

https://gist.github.com/zhanwenchen/e520767a409325d9961072f666815bb8

https://gist.github.com/niderhoff/52c514f20337516500084dd7b17af7e2

https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed

Remember to use rsync instead of scp to transfer files:

rsync -e "ssh -p 2212" -avgpolr ./Downloads/cudnn-9.0-linux-x64-v7.1.tar.zip why@junnan:/u01/why/Downloads

Use git-lfs:

git init

git lfs track "*.hdf5"

git add .

git add .gitattributes

git commit -m "add *.hdf5"

git remote add origin https://github.com/why2011btv/elmo_files.git

git push -u origin master

If the push fails, try removing any embedded .git directories (nested repositories) from the working tree.