Artificial Intelligence (AI) Tools: Where do I start?
New to AI and want to apply the power of deep learning to your tough challenges? Riverside Research can help you get started. Our machine learning experts surveyed 10 current and up-and-coming deep learning frameworks and have the quick facts and factors for each.
It’s a great time to invest in AI, and there are more frameworks available than ever to get you on your way. The software framework that’s best for your organization depends on several factors. What’s most important is that you find one that meets your needs and then get comfortable using it. Of course it must be compatible with your IT environment, but there is a lot of flexibility. We hope this overview helps you choose the tool that’s best for your organization. With a little bit of understanding, we don’t think you’ll go wrong.
Key Factors When Selecting Deep Learning Tools
Each framework is based on design choices that make it more or less suited to specific tasks. First, you need to know what platform the system runs on during prototyping, training, and deployment. Second, you need to know the API options. Most deep learning tools operate natively on Linux, and all of the frameworks we include here are Linux compatible. If you need to run on Windows, you’ll want to avoid PyTorch, and if you use OS X, you’ll want to avoid CNTK. Similarly, the de facto API language is Python, which is used by every tool we surveyed except Torch and Deeplearning4j.
Central to design is the neural network itself, both how it is defined and how it processes data. Most frameworks use procedural scripting (traditional programming) to define the architecture; in Caffe, however, the architecture is defined by a static configuration file. The frameworks’ approaches to processing are differentiated by their error backpropagation methods, their means of parameter representation, and whether they use static or dynamic computation graphs. Some frameworks (Caffe, Caffe2, and CNTK) don’t support restricted Boltzmann machines or deep belief networks (RBMs/DBNs). We’ll discuss the highlights of what this means in the summary.
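To make the configuration-file approach concrete, here is a small Caffe-style prototxt fragment declaring a single fully connected (InnerProduct) layer. The layer and blob names are illustrative, not taken from any particular model:

```protobuf
layer {
  name: "fc1"              # this layer's name (arbitrary)
  type: "InnerProduct"     # Caffe's fully connected layer type
  bottom: "data"           # input blob
  top: "fc1"               # output blob
  inner_product_param {
    num_output: 64         # number of output units
  }
}
```

In a scripting-based framework, the same layer would instead be created by a line of code, so the network’s structure can be computed programmatically rather than fixed in a file.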
With the complexity of modern neural networks, performance is difficult to compare. Just as human effectiveness is hard to quantify, these systems are increasingly unique and complex. Each framework is optimized for different purposes, so a standardized test would inherently favor some tools over others. AI performance is also highly hardware-dependent, with even more variability than traditional computing architectures: it depends on the specifics of the machine, the configuration, and whether a given framework supports computation libraries for your hardware. Performance also covers more than running a trained network; it includes training one, so your needs will depend on the intended application. For deployment, forward-pass (processing and recognition) performance is important; for experimentation and training, backward-pass (backpropagation and parameter updating) performance matters more. Two useful measures we looked at when assessing performance were hardware utilization and speed, in particular the ability of a framework to operate in a distributed manner.
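The forward/backward distinction can be illustrated with a minimal, pure-Python sketch (not any framework’s actual API): a single linear “neuron” y = w·x + b, trained by gradient descent on a squared-error loss.

```python
# Forward vs. backward pass for one linear neuron y = w*x + b.
# Illustrative only -- real frameworks automate this over entire graphs.

def forward(w, b, x):
    # Forward pass: compute the prediction (what matters for deployment).
    return w * x + b

def backward(w, b, x, y_true):
    # Backward pass: compute gradients of the squared-error loss
    # (what matters during experimentation and training).
    y_pred = forward(w, b, x)
    d_loss = 2.0 * (y_pred - y_true)   # dL/dy_pred
    return d_loss * x, d_loss          # dL/dw, dL/db

w, b = 0.0, 0.0
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1
for _ in range(500):
    for x, y in data:
        dw, db = backward(w, b, x, y)
        w -= 0.01 * dw                 # parameter update
        b -= 0.01 * db

print(w, b)  # converges toward w = 2, b = 1
```

A deployment-only setting would call `forward` exclusively, which is why inference-oriented frameworks can skip the backward machinery entirely.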
Due to the fast-paced nature of AI development, community adoption is critical. Beyond the benefits of collaboration and available modules, a framework that falls out of common use can become outdated within months. For example, between our initial study and the publication of this article, it was announced that Theano would no longer be supported. So while the applicability of the tool to your project and your environment are both important, you’ll also want to verify that the tool is being used by others, so that there is a community of users to share and improve methodologies. For tools in robust use, note which organizations are using them so you can determine which collaboration sphere is best for your goals. For our survey, we focused on which communities (e.g., industry, academic, or scientific) are using each platform and on how it interfaces with a Model Zoo. Model Zoos are repositories of pre-trained networks (topologies and trained parameters), making it easier to join a community and leverage existing efforts to get you coding smarter and faster.
Those are the basics. One caveat: we don’t want to underemphasize the complexity of AI—reliably creating, training, and evolving deep learning capability takes experience and expertise. But tools with robust online help and documentation are increasingly available, so there’s never been a better time to get started.
Caffe is a framework developed by UC Berkeley AI Research and other contributors. It was created with expression, speed, and modularity in mind. Its design separates network configuration from implementation, and it is well known for a robust Model Zoo with dozens of pre-trained networks; in fact, every framework in this survey is compatible with the Caffe Model Zoo. Because Caffe networks are defined by configuration files rather than scripting, it is not ideal for large or complex networks. Additionally, our sense is that while the academically focused Caffe was popular with early adopters, it is on its way out, being surpassed by the industry-supported Caffe2.
Caffe2 is a follow-up framework to Caffe developed by Facebook. It was developed to optimize production and deployment, with speed and scalability as primary focuses. It maintains the same Model Zoo concept as Caffe, and Caffe models can be easily converted to run with Caffe2. While Caffe2 provides massive improvements on Caffe and is great for production capability, it’s not really intended for more experimental use or novel development. In fact, many adopters are taking a hybrid approach: using PyTorch for experimental design and Caffe2 for more mature production. And with Facebook and Microsoft’s new Open Neural Network Exchange (ONNX) format, converting trained PyTorch networks (or CNTK or MXNet, for that matter) to Caffe2 for inference is fairly straightforward.
Deeplearning4j is a framework developed and maintained by Skymind and other contributors. It’s unique in this survey because it’s written for the Java virtual machine (JVM) and is commercially supported by its main developer, Skymind, making it a solid choice for organizations that need robust industry support or favor Java.
The Microsoft Cognitive Toolkit (CNTK) is an AI framework developed by Microsoft. Originally built by Microsoft’s speech group and optimized for sequence processing with recurrent neural networks (RNNs), CNTK has become a robust framework with solid performance, scalability, and extensibility. It is well suited to pattern tasks such as natural language processing. One note: the 1-bit SGD component that enables its excellent distributed training performance is released under a non-commercial license. While CNTK looks great on paper, we’ve noted that its adoption is disproportionately low, another point to consider when looking longer term.
MXNet was developed by the Distributed (Deep) Machine Learning Community and was recently accepted into the Apache Incubator. It focuses on scalability and is optimized for distributed and cloud computing, as well as for deployment of trained models on low-end devices, making it well suited to deployable applications. Additionally, its new Gluon API abstracts away some of the framework’s lower-level complexity, providing a more user-friendly experience. We found no real downside to MXNet; it’s a capable framework with solid adoption. MXNet is also unique in that, while the popular frameworks are generally backed by a single industry giant, it was started in, and remains rooted in, academia, so it may be a good choice for those not looking to be tied to an industry development base.
TensorFlow is a toolkit developed by Google. It is a library for numerical computation using data flow graphs. In 2017 it surpassed Caffe to become the most popular framework in the deep learning community, and it now leads community adoption by a wide margin. It scales from distributed servers to mobile devices with a single API and is optimized for Google’s cloud-based Tensor Processing Units. TensorFlow is a good choice for new deep learning practitioners given its high adoption, the intuitive Keras front end, and the extensive help available online.
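The data-flow-graph idea can be sketched in a few lines of pure Python (this is a toy model of the define-then-run style, not TensorFlow’s actual API): the graph is built once, then executed repeatedly with different data, which is what lets a framework optimize and distribute the computation before any data flows.

```python
# Toy define-then-run data flow graph: build the structure first,
# evaluate it later. Names and classes here are illustrative.

class Node:
    def __init__(self, op, inputs):
        self.op, self.inputs = op, inputs

def placeholder():
    return Node("input", [])

def add(a, b):
    return Node("add", [a, b])

def mul(a, b):
    return Node("mul", [a, b])

def run(node, feed):
    # Walk the pre-built graph; its structure is fixed ahead of time.
    if node.op == "input":
        return feed[node]
    vals = [run(n, feed) for n in node.inputs]
    return vals[0] + vals[1] if node.op == "add" else vals[0] * vals[1]

# Build the graph once...
x, y = placeholder(), placeholder()
out = add(mul(x, x), y)            # out = x*x + y

# ...then run it many times with different inputs.
print(run(out, {x: 3.0, y: 1.0}))  # prints 10.0
```

Contrast this with the define-by-run style discussed under PyTorch below, where the graph is simply whatever operations the program executes.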
Theano was developed at the University of Montreal. It was originally created to efficiently evaluate mathematical expressions involving multidimensional arrays. Even though it’s the oldest framework in the survey, it has been well maintained and widely used for years. In our original study, we concluded it was a good choice for those looking for a stable and well-understood framework; however, it was recently announced that Theano development has been discontinued. So it’s best to steer clear of this option and place it in the history books as a notable pioneer of deep learning capability.
Torch is developed and maintained by Ronan Collobert (Facebook), Koray Kavukcuoglu (Google), Clement Farabet (Twitter), Soumith Chintala (Facebook), and contributors at many other research labs and companies. It implements a wide range of machine learning algorithms with a focus on parallel GPU performance. Its Lua-based, math-oriented interface may be preferred by some, but the lack of a Python interface is likely a downside for most machine learning and deep learning programmers.
PyTorch is developed and maintained primarily by researchers at Facebook AI. Its name may suggest a Python wrapper for Torch, but it is actually a completely new framework with tight integration between Python and its C/C++ backend. Its dynamic computation graph is well suited to tasks, such as natural language processing, that use variable-length inputs. Even though it’s in beta, the community favors PyTorch for prototype efforts due to its high flexibility. PyTorch, along with Keras and MXNet’s Gluon, is a good starting point for general deep learning functions and for new users.
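The “dynamic graph” advantage for variable-length inputs can be shown with a minimal, pure-Python sketch of the define-by-run style (illustrative only; PyTorch records real tensor operations this way):

```python
# Define-by-run sketch: the "graph" is whatever operations this loop
# happens to execute, so its depth follows the input length and
# sequences of different lengths need no padding or recompilation.

def run(sequence, w):
    tape = []              # records each op, enabling a backward pass
    h = 0.0
    for x in sequence:
        h = w * h + x      # one recurrent step
        tape.append(("step", x))
    return h, tape

h1, tape1 = run([1.0, 2.0], 0.5)
h2, tape2 = run([1.0, 2.0, 3.0, 4.0], 0.5)
print(len(tape1), len(tape2))  # graph depth tracks input length: 2 4
```

In a static-graph framework, handling both sequence lengths would require padding the inputs or building a separate graph per length; here the recorded tape simply grows with the data.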
Keras is unique in this survey: it is a high-level API wrapper that may be leading the charge on AI standardization. Developed by François Chollet, a Google employee, Keras runs on top of TensorFlow, Theano, MXNet, CNTK, and Deeplearning4j. It allows fast prototyping of networks and has modular, user-friendly implementations of a wide range of neural layers, cost functions, optimizers, initialization schemes, activation functions, and regularization schemes. As a wrapper, Keras is a great choice for starting out in deep learning because, like MXNet’s Gluon, it is simple and easy to use. For more novel networks, however, you’ll want to be able to get under the hood, and for those we’d steer you toward one of the underlying frameworks.
That concludes our high-level look at AI deep learning frameworks. Note that this field is moving quickly, and this information, current as of December 2017, will continue to change. It may be fast moving, but there is no time like the present to get started. We hope this will help you do that.
Keep in mind that even with the accessibility and ease of use of these frameworks, it still takes practice and expertise to effectively implement AI. That’s one reason why adoption is so important, and the online community is a great resource. For more targeted support, our machine learning experts specialize in designing custom solutions for challenging needs.
To follow the latest news from Riverside Research’s machine learning experts and from our Open Innovation Center, follow us:
This article is based on the research of Mr. Jason Demeter, University of Dayton, started under his internship with Riverside Research.