Salla Rönkä
Janne Hurskainen
30.08.2018 · Salla Rönkä · Janne Hurskainen

How to set up JupyterLab in AWS

If you want to give JupyterLab a try, these instructions help you get it up and running.

The previous post served as an introduction to JupyterLab. In this one, we give you hands-on instructions for setting it up, either with Docker or with the Amazon Deep Learning AMI.

 

Setting up JupyterLab in Docker

If you’re not familiar with Docker, it’s officially described as an open platform for developers and sysadmins to build, ship, and run distributed applications, whether on laptops, data-center VMs, or the cloud. Docker containers let a programmer or a data engineer isolate and package an application together with all the dependencies it needs (files, libraries, etc.).

One of the biggest pain points in data science is sharing and reproducing experiments. The usual obstacles are differing operating systems, mismatched versions and the lack of a shared process. Docker can help with these problems and could, along with Git, be another useful tool in a data scientist’s toolbox, in addition to actual data handling, analytics and machine learning skills.

If you now want to give JupyterLab a try, here is one way to get it up and running.

We created a t2.small instance in AWS; adjust the settings to match your own environment. Once the instance is running, SSH into it (ssh ec2-user@"add IPv4 Public IP here").

Set up Docker with these commands:

  1. sudo yum update -y
  2. sudo yum install -y docker
  3. sudo service docker start
  4. sudo usermod -a -G docker ec2-user

Now log out and back in again to pick up the new Docker group permissions. Verify that the ec2-user can run Docker commands without sudo by running:

docker info

Now, create a password. If you have IPython installed on your local machine, start Python from the command line:

python

In the console that opens up, run the following commands:

from IPython import lib

lib.passwd("give your password here")

You will be given a hashed password that you will need in the next phase. Optionally you can go here to obtain the hashed password.
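If you are curious what the hashed string contains, the following is a minimal sketch of a salted hash in the same algorithm:salt:digest layout as the examples in this post, using only the Python standard library. The helper names hash_password and check_password are hypothetical and exist only for this illustration. The exact salting details are an assumption, so keep using the library's own function to produce the hash for the real server.

```python
import hashlib
import secrets


def hash_password(passphrase: str, algorithm: str = "sha1", salt_len: int = 12) -> str:
    """Return 'algorithm:salt:digest' for a passphrase (illustrative only)."""
    # A random hex salt makes equal passwords produce different hashes.
    salt = secrets.token_hex(salt_len // 2)
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return f"{algorithm}:{salt}:{digest}"


def check_password(hashed: str, passphrase: str) -> bool:
    """Recompute the digest with the stored salt and compare the results."""
    algorithm, salt, expected = hashed.split(":", 2)
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return secrets.compare_digest(digest, expected)


if __name__ == "__main__":
    hashed = hash_password("give your password here")
    print(hashed)  # three colon-separated fields
    print(check_password(hashed, "give your password here"))
```

The point of the sketch is that the server only needs the salt and the digest to verify a login, so the plain-text password never has to appear in the Docker command or in a config file.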

Now you can move on to setting up JupyterLab. First, you need to select an image you want to use. You can find the options here. We chose the datascience-notebook. To set up JupyterLab you first need to get the user ID (uid).

You can find it with the id command, and it must be passed to the docker run command with the --user parameter. Otherwise user 1000 is used, which doesn’t have the required rights.

id
uid=500(ec2-user) gid=500(ec2-user) groups=500(ec2-user),10(wheel),497(docker)

docker run -d --rm --name give_good_instance_name --user 500 --group-add users -e JUPYTER_ENABLE_LAB=yes -p 8888:8888 -v /home/ec2-user/notebooks/:/home/jovyan jupyter/datascience-notebook start-notebook.sh --NotebookApp.password=sha1:eccd81fb997f:687eb21737928881c8cdd5fbee419434ba6060be

Note: If you copy-paste the Docker run command from this blog and it doesn’t work, the hyphens might be different compared to your own terminal. Try changing them. :)
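One way to sidestep the copy-paste problem is to assemble the command in Python, so every hyphen is typed in your own editor. The sketch below rebuilds the same argument list as the command above. The function name build_docker_run and its parameters are hypothetical names chosen for this example, and the password hash in the usage part is a placeholder, not a real hash.

```python
import os
import shlex
from typing import List


def build_docker_run(uid: int, name: str, notebook_dir: str, password_hash: str,
                     image: str = "jupyter/datascience-notebook",
                     port: int = 8888) -> List[str]:
    """Assemble the docker run arguments used in this post as a list of strings."""
    return [
        "docker", "run", "-d", "--rm",
        "--name", name,
        "--user", str(uid),                    # uid reported by the id command
        "--group-add", "users",
        "-e", "JUPYTER_ENABLE_LAB=yes",
        "-p", f"{port}:8888",                  # host port : container port
        "-v", f"{notebook_dir}:/home/jovyan",  # host directory for the notebooks
        image,
        "start-notebook.sh",
        f"--NotebookApp.password={password_hash}",
    ]


if __name__ == "__main__":
    cmd = build_docker_run(
        uid=os.getuid(),  # same number as uid= in the id output
        name="give_good_instance_name",
        notebook_dir="/home/ec2-user/notebooks/",
        password_hash="sha1:your-salt:your-digest",  # placeholder
    )
    # Print a shell-safe command line to review before running it.
    print(shlex.join(cmd))
```

After reviewing the printed line, the same list can be handed to subprocess.run(cmd, check=True) on the EC2 machine.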

Congratulations! JupyterLab is now running.

 

Here are some explanations for the parameters used:

-d = detached mode; the container runs in the background instead of being attached to your terminal.

--rm = the container is removed automatically when it is stopped. Leave this parameter out if you want to be able to restart the container later!

Some other useful commands with Docker are:

docker ps: shows all the running instances

docker ps -a: also shows the stopped instances

docker stop give_good_instance_name: stops the chosen instance

docker start give_good_instance_name: starts the chosen stopped instance
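Once docker ps lists the container, you can confirm from your own machine that the published port is reachable before opening the browser. The sketch below uses only the standard library. The helper names jupyterlab_url and is_port_open are hypothetical, and the default port 8888 is taken from the docker run command above.

```python
import socket


def jupyterlab_url(host: str, port: int = 8888, https: bool = False) -> str:
    """Build the address to type into the browser."""
    scheme = "https" if https else "http"
    return f"{scheme}://{host}:{port}"


def is_port_open(host: str, port: int = 8888, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

Call is_port_open with your instance's public DNS name. If it returns False even though the container is running, one thing to check is whether the instance's security group allows inbound traffic on that port. That is an assumption about your AWS settings, since they are not listed in this post.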

 

Accessing JupyterLab

Finally, you can access JupyterLab by opening your browser, typing in your EC2 instance’s public DNS address and adding :port number to the end (:8888 with the command above). Now, log in with the password you defined. You should see the launcher window:

 

 

Pick e.g. the Python 3 Notebook, upload a csv file and start playing! Here’s another picture of how a view can be quickly organized:

 

 

Have fun! If you don’t want to use Docker, here is another option for you.

 

Setting up JupyterLab with Amazon Deep Learning AMI

The Amazon Deep Learning AMI already contains Jupyter, but some additional software needs to be installed. Create an EC2 instance (use at least t2.small, otherwise the instance may run out of memory) and run the following commands on it:

conda config --system --prepend channels conda-forge
conda update -n base conda
conda install jupyterhub
conda update jupyterlab
conda update notebook
jupyter labextension install @jupyterlab/[email protected]^0.8.1
mkdir ssl
cd ssl
openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout "cert.key" -out "cert.pem" -batch

Then open ~/.jupyter/jupyter_notebook_config.py and paste the following at the end of the file:

c.NotebookApp.certfile = u'/home/ubuntu/ssl/cert.pem' # path to the certificate we generated
c.NotebookApp.keyfile = u'/home/ubuntu/ssl/cert.key' # path to the certificate key we generated
c.NotebookApp.ip = '*' # Listen on all network interfaces, not only localhost.
c.NotebookApp.open_browser = False # Do not open a browser window by default when using notebooks.
c.NotebookApp.password = 'sha1:fc216:3a35a98ed980b9...'

Replace the c.NotebookApp.password contents with the hashed password that you generated earlier. You can then start JupyterLab with the command jupyter lab.

The notebooks are saved to the instance’s local disk, but if you want, you can easily add S3 as a storage location for them. First run pip install s3contents and then modify the ~/.jupyter/jupyter_notebook_config.py file by adding the following at the end:

from s3contents import S3ContentsManager # needed so the class name below is defined
c.NotebookApp.contents_manager_class = S3ContentsManager
c.S3ContentsManager.bucket = "<name of your S3 bucket>"
c.S3ContentsManager.prefix = "<optional directory inside the bucket>"

Give it a try!
