The previous post served as an introduction to JupyterLab. In this one, we will give you hands-on instructions for setting it up, either with Docker or with the Amazon Deep Learning AMI.
Setting up JupyterLab in Docker
If you’re not familiar with Docker, it’s officially described as an open platform for developers and sysadmins to build, ship, and run distributed applications, whether on laptops, data-center VMs, or the cloud. Docker containers let a programmer or a data engineer isolate and package an application with all the dependencies it needs (files, libraries, etc.).
One of the biggest pain points in data science is sharing and reproducing experiments. The challenges include differing operating systems, dependency versioning, and the lack of a standardized process. Docker helps with these problems and can be, along with Git, another useful tool in a data scientist’s toolbox, complementing the actual data handling, analytics, and machine learning skills.
If you now want to give JupyterLab a try, here is one way to get it up and running.
We created a t2.small instance in AWS. Adjust the settings according to your own environment. Once you are done, SSH to the EC2 machine (ssh ec2-user@<add IPv4 Public IP here>).
Set up Docker with these commands:
sudo yum update -y
sudo yum install -y docker
sudo service docker start
sudo usermod -a -G docker ec2-user
Now log out and back in again to pick up the new Docker group permissions. Verify that the ec2-user can run Docker commands without sudo, e.g. by running docker info.
Now, create a password hash. If you have IPython installed on your local machine, you can run python on the command line.
In the console that opens up, run the following commands:
from IPython import lib
lib.passwd("give your password here")
You will be given a hashed password that you will need in the next phase. Optionally you can go here to obtain the hashed password.
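If you don’t have IPython handy, the same sha1:salt:hash format can be reproduced with the Python standard library alone. This is a minimal sketch based on how the Jupyter password format is laid out (SHA-1 digest of the passphrase plus a random hex salt); for real use, prefer IPython.lib.passwd itself:

```python
import hashlib
import random

def sha1_passwd(passphrase, salt=None):
    # Mirror the sha1:salt:hash layout produced by IPython.lib.passwd.
    # A 12-character random hex salt is generated when none is given.
    if salt is None:
        salt = "%012x" % random.getrandbits(48)
    digest = hashlib.sha1(passphrase.encode("utf-8") + salt.encode("ascii")).hexdigest()
    return "sha1:%s:%s" % (salt, digest)

print(sha1_passwd("give your password here"))
```

The resulting string is what goes into the --NotebookApp.password parameter later on.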
Now you can move on to setting up JupyterLab. First, you need to select an image you want to use. You can find the options here. We chose the datascience-notebook. To set up JupyterLab you first need to get the user ID (uid).
It can be found with the command id and must be passed to the docker run command. Otherwise user 1000 is used, which doesn’t have the required rights.
id
uid=500(ec2-user) gid=500(ec2-user) groups=500(ec2-user),10(wheel),497(docker)
docker run -d --rm --name give_good_instance_name --user 500 --group-add users -e JUPYTER_ENABLE_LAB=yes -p 8888:8888 -v /home/ec2-user/notebooks/:/home/jovyan jupyter/datascience-notebook start-notebook.sh --NotebookApp.password=sha1:eccd81fb997f:687eb21737928881c8cdd5fbee419434ba6060be
Note: If you copy-paste the Docker run command from this blog and it doesn’t work, the hyphens might be different compared to your own terminal. Try changing them. :)
Congratulations! JupyterLab is now running.
Here are some explanations for the parameters used:
-d = detached (daemon) mode: the container runs as a background process for as long as the server is running.
--rm = when the process is stopped, the Docker container is removed. Don’t add this parameter if you want to be able to restart the container later!
Some other useful commands with Docker are:
docker ps: shows all the running instances
docker ps -a: shows also the stopped instances
docker stop give_good_instance_name: stops the chosen instance
docker start give_good_instance_name: starts the chosen stopped instance
Finally, you can access JupyterLab by going to your browser and typing your EC2 instance’s public DNS address with :8888 (the port published in the docker run command) appended to the end. Now, log in with the password you defined. You should see the launcher window:
Pick e.g. the Python 3 Notebook, upload a csv file and start playing! Here’s another picture of how a view can be quickly organized:
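For a first cell in that Python 3 notebook, the standard library is already enough for a quick look at a CSV. A small sketch (the inline data here stands in for your uploaded file; with the datascience-notebook image you could just as well reach for pandas):

```python
import csv
import io

# Stand-in for an uploaded file; in the notebook you would use
# open("your_file.csv") with your own file instead.
raw = io.StringIO(
    "name,score\n"
    "alice,10\n"
    "bob,7\n"
)

rows = list(csv.DictReader(raw))
print(len(rows), "rows")                    # 2 rows
print(sum(int(r["score"]) for r in rows))   # total score: 17
```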
Have fun! If you don’t want to use Docker here is another option for you.
Setting up JupyterLab with Amazon Deep Learning AMI
The Amazon Deep Learning AMI already contains Jupyter, but some additional software needs to be installed. Create an EC2 instance (use at least t2.small, otherwise the instance may run out of memory) and run the following commands on it:
conda config --system --prepend channels conda-forge
conda update -n base conda
conda install jupyterhub
conda update jupyterlab
conda update notebook
jupyter labextension install @jupyterlab/[email protected]^0.8.1
mkdir ssl
cd ssl
openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout "cert.key" -out "cert.pem" -batch
Then open ~/.jupyter/jupyter_notebook_config.py and paste the following at the end of the file:
c.NotebookApp.certfile = u'/home/ubuntu/ssl/cert.pem'  # path to the certificate we generated
c.NotebookApp.keyfile = u'/home/ubuntu/ssl/cert.key'  # path to the certificate key we generated
c.NotebookApp.ip = '*'  # listen on all network interfaces
c.NotebookApp.open_browser = False  # do not open a browser window by default when using notebooks
c.NotebookApp.password = 'sha1:fc216:3a35a98ed980b9...'
Replace the c.NotebookApp.password contents with the hashed password that you generated earlier. You can then start JupyterLab with its usual start command (e.g. jupyter lab).
The notebooks are saved to the instance’s local disk, but if you want you can easily add S3 as a storage location for the notebooks. First run
pip install s3contents and then modify the ~/.jupyter/jupyter_notebook_config.py file by adding the following at the end (note the import, which the snippet needs):
from s3contents import S3ContentsManager

c.NotebookApp.contents_manager_class = S3ContentsManager
c.S3ContentsManager.bucket = "<name of your S3 bucket>"
c.S3ContentsManager.prefix = "<optional directory inside the bucket>"
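On EC2 the cleanest option is to let s3contents use the instance’s IAM role, but credentials can also be set explicitly in the same config file. The parameter names below come from the s3contents project README, so double-check them against the version you installed:

```python
# Optional: explicit AWS credentials for s3contents (otherwise the
# instance's IAM role / environment credentials are used).
c.S3ContentsManager.access_key_id = "<AWS access key id>"
c.S3ContentsManager.secret_access_key = "<AWS secret access key>"
```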
Give it a try!