I want to write Machine Learning (ML) applications, so I need to select tools that will help me do this. I have very little practical experience, so the purpose of this post is to document my discovery process in the hope that it also helps others on the same path.
I have a strong desire to one day take advantage of Google’s Tensor Processing Unit (TPU). Clearly, Google is aggressively re-positioning itself as an Artificial Intelligence (AI) company: they are hiring a large number of top people, they are leading certain types of research (such as with DeepMind), they are creating cloud-based services such as Google Cloud ML, they are incorporating ML into all their products, and with TensorFlow they have created an open-source software library for Machine Intelligence. To me, the attraction of using TensorFlow and the TPU includes:
- It leverages Google’s AI work and experience.
- It simplifies the development process by having the same company providing the library, processors etc.
- I can leverage the significant ML cloud infrastructure being built by Google.
- Google’s technologies are largely platform neutral and I would like to have that flexibility.
- Given the presence of Google, open source resources will gravitate to TensorFlow and therefore accelerate its on-going development.
Alternatives to TensorFlow
Major alternatives to TensorFlow include Torch and Theano. Torch is a framework written in Lua and is extensively used at Facebook’s AI Research Laboratory (FAIR), Twitter Cortex and NVIDIA. Facebook’s Yann LeCun is an important AI pioneer and his research papers, such as the latest on Tracking The World State with Recurrent Entity Networks, are very important in the development of AI. NVIDIA are key in providing GPUs and supercomputers for AI projects (such as the NVIDIA DGX-1 recently given to OpenAI). Torch and NVIDIA are a very strong combination and should be seriously considered for ML development. If following this path, it would be advisable to use Ubuntu, as most of the NVIDIA tools, such as DIGITS and their Deep Learning Software, use Ubuntu as the preferred operating environment.
Theano is a Python library and is used by the Montreal Institute for Learning Algorithms (University of Montreal), one of the leading AI research universities. Theano is one of the oldest complete libraries available; TensorFlow is considered the next generation improvement over Theano.
These libraries have their attractions, with Torch being the most interesting. So, although I will initially gravitate towards TensorFlow, it will also be important to continuously monitor developments in Torch and NVIDIA, and no doubt they will become part of my toolkit.
My software development environment includes a MacBook Pro (Retina, 15-inch, Mid 2015) running macOS Sierra version 10.12.3 and Homebrew. This version of the MacBook Pro does not have an NVIDIA GPU, so I cannot install a version of TensorFlow with GPU support.
In the longer term I plan to use Google’s Go Programming Language for software development. There is a Go binding to TensorFlow but it is currently experimental. The TensorFlow Python API is the most complete, so I will use that. The TensorFlow Python API supports Python 2.7 and Python 3.3+; I assume you have already installed Python and pip via Homebrew. The steps to install TensorFlow (without GPU support) are:
$ sudo easy_install pip
$ sudo easy_install --upgrade six
$ pip install tensorflow
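Since the Python API is stated above to support Python 2.7 and Python 3.3+, it may be worth confirming the interpreter version before or after running these commands. The following is only a minimal sketch of such a check: the name `is_supported_python` is my own invention and not part of TensorFlow, and the version bounds are simply the ones quoted in this post.

```python
import sys


def is_supported_python(version_info):
    """Return True if the version is 2.7 or 3.3+, the range quoted in this post.

    `is_supported_python` is a hypothetical helper, not a TensorFlow API.
    """
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    if major == 3:
        return minor >= 3
    return False


if __name__ == "__main__":
    ok = is_supported_python(sys.version_info)
    print("Python %d.%d supported: %s" % (sys.version_info[0], sys.version_info[1], ok))
```

Passing the version in as an argument, rather than reading `sys.version_info` inside the function, keeps the check easy to try against other version tuples.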
Once this is complete, you should be able to test the TensorFlow installation by running:
$ python
...
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42
>>>
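If the session above fails at the import line, a small script can make the failure easier to diagnose than an interactive traceback. This is a sketch using only the standard library; `installed_version` is a name I made up for illustration, and it relies on the common (but not universal) convention that a package exposes a `__version__` attribute.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, "unknown" if it has none,
    or None if the module cannot be imported.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_version("tensorflow")
    if version is None:
        print("tensorflow is not importable from this interpreter")
    else:
        print("tensorflow version: %s" % version)
```

Running it with the same `python` that `pip` installed into is the point of the exercise; a `None` result often just means a different interpreter is first on the PATH.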
If you choose to install from source, you will first need Bazel and a few Python dependencies.
Bazel is an open source tool that automates the building and testing of software. Google uses Blaze as its internal build tool and has open-sourced part of it as Bazel. To install Bazel and the Python dependencies, run the following:
$ brew install bazel
$ brew upgrade bazel
$ sudo easy_install -U six
$ sudo easy_install -U numpy
$ sudo easy_install wheel
$ sudo easy_install ipython
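Before moving on to the build, it can help to confirm that the command-line tools used in this post are actually on the PATH. A minimal sketch, assuming Python 3.3+ for `shutil.which`; `missing_tools` is a hypothetical helper name, and the list of tools is simply taken from the commands in this post.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` not found on PATH (Python 3.3+).

    Hypothetical helper for illustration only.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Tool names taken from the commands in this post.
    missing = missing_tools(["brew", "bazel", "git", "pip"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All build tools found")
```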
Next you will need to clone and configure the repository as follows:
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ ./configure
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.xx.x-py2-none-any.whl
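Because the wheel file name embeds the version number, one way to avoid typing it by hand is to list whatever the build step actually produced. The following is a sketch under that assumption; `find_wheels` is a hypothetical helper of my own, and `/tmp/tensorflow_pkg` is just the output directory used in the commands above.

```python
import glob
import os


def find_wheels(package_dir, prefix="tensorflow"):
    """Return sorted paths of wheel files in package_dir whose names start with prefix.

    Hypothetical helper; package_dir is whatever was passed to build_pip_package.
    """
    pattern = os.path.join(package_dir, prefix + "-*.whl")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # /tmp/tensorflow_pkg is the output directory used in the commands above.
    for path in find_wheels("/tmp/tensorflow_pkg"):
        print(path)
```

The printed path can then be passed to `pip install` in place of the placeholder file name.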
You will need to ensure that the name of the .whl file matches the version you built. You can then set up a development install, which lets you test the build from an interactive Python shell without reinstalling, as follows:
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
$ mkdir _python_build
$ cd _python_build
$ ln -s ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
$ ln -s ../tensorflow/tools/pip_package/* .
$ python setup.py develop
You can then test this implementation by running:
$ cd /usr/local/lib/python2.7/site-packages/tensorflow/models/image/mnist
$ python convolutional.py
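The path above is specific to my Python 2.7 site-packages directory. If your installation lives elsewhere, the package location can be looked up instead of hardcoded. This is a sketch assuming Python 3.4+ for `importlib.util.find_spec`; `package_subdir` is a hypothetical helper, and the `models/image/mnist` sub-path is copied from the command above and may not exist in every release.

```python
import importlib.util
import os


def package_subdir(package_name, *parts):
    """Return the path of a subdirectory inside an installed package,
    or None if the package is not installed (Python 3.4+).

    Hypothetical helper; it does not check that the subdirectory exists.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    root = list(spec.submodule_search_locations)[0]
    return os.path.join(root, *parts)


if __name__ == "__main__":
    # Sub-path taken from the command above; it may differ between releases.
    print(package_subdir("tensorflow", "models", "image", "mnist"))
```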