Stopping the cluster backs up all data in the Hadoop /user directory and then releases all of the cluster's cloud resources. You can start the cluster again after it has stopped, and the backed-up data will be reloaded.

Do you still want to proceed?

Big SQL and Text Analytics Sandbox

Big SQL and Text Analytics Sandbox is a large, shared environment for data science. You can use it to run R, SQL, Spark, and Hadoop jobs. It is a high-performance cluster that demonstrates the advantages of parallelized processing of big data sets.

Getting Started with SQL on Hadoop

Find a wealth of valuable information at the IBM Hadoop Developer Site and Getting Started with Hadoop.

Follow the Big SQL on Hadoop tutorial in the Tech Sandbox environment. Some steps may not apply in the shared environment.

🔗 Hadoop Developer Site

🔗 Getting started for Hadoop

🔗 Big SQL on Hadoop tutorial

Data Scientist Workbench

Data Scientist Workbench is a collection of tools that make open source data science easy. Many of the tools in Data Scientist Workbench can connect to the Big SQL Technology Sandbox environment, giving you access to powerful, parallelized computing capacity.

🔗 Data Scientist Workbench

🔗 Execute R jobs from R Studio IDE

🔗 Work with Big SQL data in Jupyter notebooks