Review: Amazon SageMaker scales deep learning

Amazon SageMaker, a machine learning development and deployment service introduced at re:Invent 2017, cleverly sidesteps the eternal debate about the “best” machine learning and deep learning frameworks by supporting all of them at some level. While AWS has publicly supported Apache MXNet, its business is selling you cloud services, not telling you how to do your job.

SageMaker, as shown in the screenshot below, lets you create Jupyter notebook VM instances in which you can write code and run it interactively, initially for cleaning and transforming (feature engineering) your data. Once the data is prepared, notebook code can spawn training jobs on other instances and produce trained models that can be used for prediction. SageMaker also spares you from keeping massive GPU resources constantly attached to your development notebook environment by letting you specify the number and type of VM instances needed for each training and inference job.
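
To make that workflow concrete, here is a minimal sketch of how notebook code hands training off to separate instances using the SageMaker Python SDK. The bucket name, IAM role ARN, and container image URI are placeholders, and parameter names have shifted between SDK versions (for example, train_instance_type versus instance_type), so treat this as illustrative rather than copy-paste ready:

```python
# Sketch: launch a training job on its own fleet of instances from a notebook.
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # hypothetical role ARN

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algorithm:latest",  # placeholder image
    role=role,
    instance_count=2,                # number of training VMs, separate from the notebook
    instance_type="ml.p3.2xlarge",   # GPUs are attached only for the duration of the job
    output_path="s3://my-sagemaker-bucket/output",  # hypothetical bucket
    sagemaker_session=session,
)

# Point the job at data already prepared in S3; the notebook instance stays small.
estimator.fit({"train": "s3://my-sagemaker-bucket/prepared/train"})
```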

Trained models can be attached to endpoints that can be called as services. SageMaker relies on an S3 bucket (that you need to provide) for permanent storage, while notebook instances have their own temporary storage.
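
Continuing the sketch above, deploying the trained model to an endpoint and calling it looks roughly like this; the instance type is a placeholder and the exact predictor and serializer behavior depends on the model and SDK version:

```python
# Sketch: attach the trained model to an HTTPS endpoint and invoke it as a service.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",   # inference fleet sized independently of training
)

result = predictor.predict([0.5, 1.2, 3.4])  # payload format depends on the model and serializer
print(result)

# Endpoints bill for as long as they run, so tear them down when finished.
predictor.delete_endpoint()
```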

SageMaker provides 11 built-in algorithms, customized for its infrastructure, that you can train against your data. The documentation for each algorithm explains the recommended input format, whether it supports GPUs, and whether it supports distributed training. These algorithms cover many supervised and unsupervised learning use cases and reflect recent research, but you aren’t limited to what Amazon provides. You can also run your own TensorFlow or Apache MXNet Python code, both frameworks being pre-installed in the notebook environment, or supply a Docker image containing code written in essentially any language using any framework; a sketch of the TensorFlow path follows. A hyperparameter optimization layer is available as a preview for a limited number of beta testers.
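
The bring-your-own-script path uses a framework estimator rather than a built-in algorithm container. The sketch below assumes a hypothetical training script named train.py and placeholder role, bucket, and version strings; the exact constructor arguments depend on the SDK and framework versions in use:

```python
# Sketch: run your own TensorFlow training script via the framework estimator.
from sagemaker.tensorflow import TensorFlow

tf_estimator = TensorFlow(
    entry_point="train.py",            # hypothetical script containing your model code
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.11",          # placeholder version string
    py_version="py39",
)

# The script runs inside Amazon's managed TensorFlow container on the training instance.
tf_estimator.fit({"training": "s3://my-sagemaker-bucket/prepared/train"})
```

The same pattern applies to MXNet via the corresponding framework estimator, or to any other stack if you package the code in your own Docker image.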

Source: InfoWorld Big Data