JupyterHub for Teaching¶
Version: 1.0
Date: Jan 22, 2018
Abstract¶
This deployment is designed for teaching a small to medium group of trusted users.
This repository provides a simple, reusable reference deployment that installs and runs JupyterHub and nbgrader on a single server. The reference deployment follows best practices and has been used by Professor Brian Granger when teaching “Introduction to Data Science”.
Contents¶
Design goals¶
Instructors and maintainers¶
Instructors and maintainers who use this repository to deploy JupyterHub and nbgrader should end up with a deployment that is as simple as possible:
- No Docker use.
- NGINX as a frontend proxy that serves static assets and terminates SSL/TLS.
- A single server.
- Ansible for configuration.
- Optionally, use Let’s Encrypt for generating SSL certificates.
JupyterHub¶
- Start from:
- An empty Ubuntu latest stable server with SSH key based access.
- A valid DNS name.
- A formatted and mounted directory to use for user home directories.
- The assumption that all users of the system will be “trusted,” meaning that you would give them a user-level shell account on the server.
- Always have SSL/TLS enabled.
- Specify local drives to be mounted.
- Manage the running of jupyterhub and nbgrader using supervisor.
- Optionally, monitor the state of the server and set email alerts using NewRelic. The built-in monitoring of your cloud provider may also be used.
- Specify admin users of JupyterHub.
- Add the public SSH keys of GitHub users who need to be able to ssh to the server as root for administration.
- Manage users and authentication using either:
- Regular Unix users and PAM (Pluggable authentication modules)
- GitHub OAuth
nbgrader¶
- Run nbgrader and configure:
- The course name.
- The instructor’s username.
- Graders’ usernames.
- The location of the nbgrader config.
Students¶
End users of this deployment should be able to:
- Use the following Jupyter kernels:
- Python version 3 using the IPython kernel with the main Python libraries for data science.
- Bash kernel <https://github.com/takluyver/bash_kernel>
- Sign in using their GitHub or Unix credentials.
- Have a persistent home directory.
- Have outbound network access.
Installation Guide¶
Prerequisites¶
Start a server running the latest Ubuntu version.
Enable password-less SSH access for the root user.
Partition and format any local disks you want to mount.
Verify a valid DNS entry for the server.
Choose an SSL certificate source. Use either of these options:
- Let’s Encrypt
- Obtain a trusted SSL certificate and key for the server at that FQDN.
Check out the latest version of the repository, including the ansible-conda submodule:
$ git clone --recursive https://github.com/jupyterhub/jupyterhub-deploy-teaching.git
Create the hosts group¶
- Edit the ./hosts file to list the FQDNs of the hosts in the jupyterhub_hosts group.
- For each host, create a file in the ./host_vars directory with the name of the host, starting from ./host_vars/hostname.example.
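The two steps above can also be scripted. The following is a minimal Python sketch, not part of the repository: the helper name write_inventory is hypothetical, and it assumes that ./hosts is an INI-style Ansible inventory with a [jupyterhub_hosts] group header. Check the hosts file shipped with the repository before relying on it.

```python
import shutil
from pathlib import Path


def write_inventory(repo_dir, fqdns, group="jupyterhub_hosts"):
    """Write ./hosts and seed one ./host_vars file per host (hypothetical helper)."""
    repo = Path(repo_dir)

    # Assumption: INI-style inventory, a group header followed by one host per line.
    lines = [f"[{group}]"] + list(fqdns)
    (repo / "hosts").write_text("\n".join(lines) + "\n")

    example = repo / "host_vars" / "hostname.example"
    seeded = []
    for fqdn in fqdns:
        target = repo / "host_vars" / fqdn
        if not target.exists():  # never overwrite a file that was already edited
            shutil.copyfile(example, target)
        seeded.append(target)
    return seeded
```

Calling write_inventory(".", ["hub.example.org"]) from the repository root (hub.example.org is a placeholder) would leave a ./host_vars/hub.example.org file that still needs to be edited by hand.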
Secure your deployment¶
Create a cookie secret file, ./security/cookie_secret, using:
$ openssl rand -hex 1024 > ./security/cookie_secret
For additional information, see the cookie secret file section in the JupyterHub documentation.
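If openssl is not available, a similar file can be produced from Python. This is a sketch under stated assumptions: write_cookie_secret is a hypothetical helper, and unlike the shell redirect it refuses to overwrite an existing file and restricts the file to its owner.

```python
import os
import secrets
from pathlib import Path


def write_cookie_secret(path, num_bytes=1024):
    """Write num_bytes of randomness as hex text (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # O_EXCL fails if the file already exists; 0o600 requests owner-only access.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        # token_hex(n) returns 2 * n hexadecimal characters.
        handle.write(secrets.token_hex(num_bytes) + "\n")
    return path
```

For example, write_cookie_secret("./security/cookie_secret") writes 2048 hexadecimal characters followed by a newline.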
If you are using Let’s Encrypt, skip this step. Otherwise, install your SSL private key as ./security/ssl.key and certificate as ./security/ssl.crt.
Deploy with Ansible¶
Run ansible-playbook for the main deployment:
$ ansible-playbook -i hosts deploy.yml
Verify your deployment¶
SSH into the server:
$ ssh root@{hostname}
substituting your hostname for {hostname}. For example, ssh root@jupyter.org.
Reload supervisor:
$ supervisorctl reload
Configuring nbgrader¶
The nbgrader package will be installed with the reference deployment.
To run nbgrader’s formgrade application or use its notebook extensions, additional steps are needed.
Deploy formgrade¶
First, edit the deploy_formgrade.yml file with the information for each course you want to start formgrade for. Each course should have a unique nbgrader_course_id and nbgrader_port.
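Because every course needs its own nbgrader_course_id and nbgrader_port, a quick check before deploying can catch accidental reuse. The sketch below assumes the course settings have been copied into a list of Python dicts; that structure, the helper name find_duplicates, and the sample values are illustrative and do not reflect the actual layout of deploy_formgrade.yml.

```python
from collections import Counter


def find_duplicates(courses):
    """Return course ids and ports that appear more than once (hypothetical helper)."""
    id_counts = Counter(course["nbgrader_course_id"] for course in courses)
    port_counts = Counter(course["nbgrader_port"] for course in courses)
    return {
        "nbgrader_course_id": sorted(k for k, n in id_counts.items() if n > 1),
        "nbgrader_port": sorted(k for k, n in port_counts.items() if n > 1),
    }


# Illustrative values only: two courses that accidentally share a port.
courses = [
    {"nbgrader_course_id": "data301", "nbgrader_port": 5005},
    {"nbgrader_course_id": "data401", "nbgrader_port": 5005},
]
print(find_duplicates(courses))
```

An empty list under both keys means the course definitions do not collide.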
Second, make sure that each main instructor (the nbgrader_owner for each course) has logged into JupyterHub at least once. This ensures that their home directory has been created. The home directory of the main instructor is used for the main nbgrader course files. It is assumed that the main instructor will be running the nbgrader command line programs.
Third, run ansible-playbook to deploy formgrade:
$ ansible-playbook -i hosts deploy_formgrade.yml
Fourth, SSH into the JupyterHub server:
$ ssh {user}@{hostname}
Finally, restart jupyterhub and nbgrader by doing:
$ supervisorctl reload
Configuration notes¶
To limit the deployment to certain hosts, add -l hostname to the commands:
$ ansible-playbook -i hosts -l hostname deploy.yml
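When deployments are driven from a script, the same command line can be assembled in Python. This is a small sketch: playbook_command is a hypothetical helper, hub.example.org is a placeholder host, and only the -i and -l options shown above are used.

```python
import shlex


def playbook_command(playbook="deploy.yml", inventory="hosts", limit=None):
    """Build the ansible-playbook argument list (hypothetical helper)."""
    argv = ["ansible-playbook", "-i", inventory]
    if limit:
        argv += ["-l", limit]  # restrict the run to the named host
    argv.append(playbook)
    return argv


# Print the command for review; pass the list to subprocess.run(..., check=True) to execute it.
print(shlex.join(playbook_command(limit="hub.example.org")))
```

Returning a list rather than a string keeps the arguments safe to hand to subprocess without shell quoting.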
The logs for jupyterhub are in /var/log/jupyterhub.
The logs for nbgrader are in /var/log/nbgrader.
If you are not using GitHub OAuth, you will need to manually create users using adduser:
$ adduser --gecos "" username
Change the ansible configuration by editing ./ansible.cfg.
To manage the jupyterhub and nbgrader services, SSH into the server and run:
$ supervisorctl { start, stop, restart } jupyterhub
Troubleshooting: Saving and restoring users¶
In some situations, you may remount your users’ home directories into a new instance that has their home directories but not their user accounts. When recreating the same users, it is important that they all keep the same uids so that the new accounts own the existing home directories.
This is only relevant when using GitHub OAuth for users and authentication.
To save the list of usernames and uids in {{homedir}}/saved_users.txt:
$ ansible-playbook -i hosts saveusers.yml
Then, when you run deploy.yml, it will look for this file and if it exists, will create those users with those exact uids and home directories.
You can also manually create the users by running:
$ python3 create_users.py
in the home directory.
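The idea of recreating accounts with fixed uids can be sketched in Python. This is not the repository's create_users.py, and the file format is an assumption: the sketch pretends each line of the saved file holds a username and a uid separated by whitespace. Inspect the real saved_users.txt before adapting it. The helpers only build adduser command lines and do not run them.

```python
def parse_saved_users(text):
    """Parse 'username uid' lines (assumed format) into (name, uid) tuples."""
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, uid = line.split()
        users.append((name, int(uid)))
    return users


def adduser_commands(users, home_root="/home"):
    """Build one adduser command per user, pinning the uid and home directory."""
    return [
        ["adduser", "--gecos", "", "--disabled-password",
         "--uid", str(uid), "--home", f"{home_root}/{name}", name]
        for name, uid in users
    ]


# Illustrative input only.
for argv in adduser_commands(parse_saved_users("alice 1001\nbob 1002\n")):
    print(argv)
```

Pinning the uid is the important part: file ownership on the remounted disk is stored by uid, not by username.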
Using nbgrader¶
With the reference deployment, instructors can start to use nbgrader. This section contains a rough sketch of what that looks like. For full details see the nbgrader documentation.
Preparing class assignments - Instructor¶
To use nbgrader, an instructor will primarily use the nbgrader command line program.
Create a list of students and assignments¶
Before doing this, the instructor will need to edit the nbgrader_config.py file with a list of students and assignments as follows:
c.NbGrader.db_assignments = [dict(name="ps1")]
c.NbGrader.db_students = [
    dict(id="bitdiddle", first_name="Ben", last_name="Bitdiddle"),
    dict(id="hacker", first_name="Alyssa", last_name="Hacker"),
    dict(id="reasoner", first_name="Louis", last_name="Reasoner")
]
You can also add an email field to each student and a duedate field to each assignment.
Remember to add new assignments to the nbgrader_config.py file as the assignments are created.
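For a larger class, typing each student by hand is error-prone. The sketch below builds the same list of dicts from a CSV roster. The roster file, its column names (id, first_name, last_name, email), and the helper students_from_csv are assumptions made for illustration; they are not part of nbgrader.

```python
import csv
import io


def students_from_csv(handle):
    """Turn roster rows into dicts shaped like the db_students entries."""
    students = []
    for row in csv.DictReader(handle):
        student = {
            "id": row["id"].strip(),
            "first_name": row["first_name"].strip(),
            "last_name": row["last_name"].strip(),
        }
        email = (row.get("email") or "").strip()
        if email:  # email is optional, so add it only when present
            student["email"] = email
        students.append(student)
    return students


# Illustrative roster; in practice this would be read from a file.
roster = io.StringIO(
    "id,first_name,last_name,email\n"
    "bitdiddle,Ben,Bitdiddle,ben@example.org\n"
    "hacker,Alyssa,Hacker,\n"
)
print(students_from_csv(roster))
```

Inside nbgrader_config.py, the result could then be assigned to c.NbGrader.db_students in place of the hand-written list.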
Create an assignment directory¶
Create a directory for each assignment’s source:
$ cd ~/nbgrader/<course>
$ mkdir source/<assignment>
Copy notebooks into assignment directory¶
Copy notebooks into the assignment directory:
$ cp ~/Problem1.ipynb ~/nbgrader/<course>/source/<assignment>
$ cp ~/Problem2.ipynb ~/nbgrader/<course>/source/<assignment>
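The directory creation and copy steps can be combined in a short Python helper, which is convenient when an assignment has many notebooks. This is a sketch: stage_notebooks is a hypothetical name, and the layout it assumes is the ~/nbgrader/<course>/source/<assignment> structure described above.

```python
import shutil
from pathlib import Path


def stage_notebooks(course_dir, assignment, notebooks):
    """Copy notebooks into <course_dir>/source/<assignment> (hypothetical helper)."""
    source_dir = Path(course_dir).expanduser() / "source" / assignment
    source_dir.mkdir(parents=True, exist_ok=True)

    staged = []
    for notebook in notebooks:
        notebook = Path(notebook).expanduser()
        if notebook.suffix != ".ipynb":
            raise ValueError(f"not a notebook: {notebook}")
        # copy2 preserves timestamps and returns the destination path.
        staged.append(Path(shutil.copy2(notebook, source_dir)))
    return staged
```

For example, stage_notebooks("~/nbgrader/<course>", "<assignment>", ["~/Problem1.ipynb", "~/Problem2.ipynb"]), with the placeholders filled in, mirrors the commands above.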
Create a student version of an assignment¶
These notebooks should be prepared using the nbgrader “Create Assignment” cell toolbar. Now create the assignment:
$ nbgrader assign <assignment>
After creating the student versions of the notebooks, put them into the ~/nbgrader/<course>/release/<assignment> directory, and remember to remove your solutions.
Working with an assignment - Students¶
Fetch the assignment¶
At this point, students can fetch the assignment by doing:
$ nbgrader fetch --course <course> <assignment>
That will give students a copy of the assignment directory with all of the notebooks.
Submit an assignment solution¶
When students are done working on the notebooks, they can submit the assignment by doing:
$ nbgrader submit --course <course> <assignment>
Grading the assignments - Instructor¶
Collect student assignments¶
You can collect submitted assignments by doing:
$ nbgrader collect <assignment>
This puts the students’ submitted work into the ~/nbgrader/<course>/submitted/<assignment> directory.
Grade the assignments¶
To enter those notebooks into the nbgrader web grading system, run:
$ nbgrader autograde <assignment>
By default, this will rerun all of the students’ notebooks.
If you don’t want to run them:
$ nbgrader autograde --no-execute <assignment>
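The collect and autograde steps are usually run back to back, so they can be wrapped in a small script. This sketch uses only the subcommands and the --no-execute flag shown above; the helper names and the idea of running them from the course directory are assumptions.

```python
import subprocess


def grading_commands(assignment, execute=True):
    """Return the collect and autograde command lines for one assignment."""
    autograde = ["nbgrader", "autograde"]
    if not execute:
        autograde.append("--no-execute")  # skip rerunning the notebooks
    autograde.append(assignment)
    return [["nbgrader", "collect", assignment], autograde]


def grade(assignment, course_dir, execute=True):
    """Run both steps in course_dir, stopping at the first failure."""
    for argv in grading_commands(assignment, execute=execute):
        subprocess.run(argv, cwd=course_dir, check=True)


# Print the commands for review without running nbgrader.
for argv in grading_commands("ps1", execute=False):
    print(" ".join(argv))
```

Using check=True means a failed collect stops the script before autograde runs on incomplete submissions.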
Next steps¶
To see the full command line options for nbgrader, run:
$ nbgrader <subcommand> --help
Some other things you can do with nbgrader:
- Run collect and autograde commands for a single student or notebook.
- Collect a single assignment multiple times and regrade all or parts selectively.
Checklist for a JupyterHub teaching deployment¶
Documentation for teaching deployment: https://jupyterhub-deploy-teaching.readthedocs.io
Documentation for JupyterHub: https://jupyterhub.readthedocs.io
Notes¶
1. Prepare the server¶
- [ ] Server: running latest Ubuntu version
- [ ] SSH: enable password-less SSH for ubuntu user
- [ ] Local disks: partition and format
- [ ] DNS (domain name): valid entry for server
2. Install JupyterHub source¶
[ ] Source: Clone the latest jupyterhub-deploy-teaching repo using --recursive (needed for the ansible-conda submodule):
$ git clone --recursive https://github.com/jupyterhub/jupyterhub-deploy-teaching.git
3. Secure before deployment¶
[ ] Cookie secret file: Create ./security/cookie_secret
$ openssl rand -hex 1024 > ./security/cookie_secret
[ ] SSL:
- Let's Encrypt: No additional steps as Ansible will install for you.
- Third-party trusted SSL source: Install the SSL private key as ./security/ssl.key and the certificate as ./security/ssl.crt.
4. Create JupyterHub hosts group¶
- [ ] ./hosts file: Edit the file to list the FQDNs of the hosts in the jupyterhub_hosts group.
- [ ] hostname files: Use ./host_vars/hostname.example as a template to create and edit a hostname file for each host, and place the hostname files in the ./host_vars directory.
5. Configure admins¶
- [ ] List of admins is configured in jupyterhub_admin_users in the config file. Public SSH keys will be retrieved from GitHub.
6. Configure users¶
- [ ] If using PAM (Pluggable authentication modules), you will need to manually create users using adduser: adduser --gecos "" username.
- [ ] If using GitHub OAuth, add usernames to the jupyterhub_users list.
7. Add optional services¶
- [ ] Monitoring: New Relic
- [ ] Analytics: Google Analytics
- [ ] Assignment distribution and collection: nbgrader
- [ ] Grading: nbgrader
8. Deploy with Ansible¶
[ ] Deploy: Run ansible-playbook for the main deployment.
$ ansible-playbook -i hosts deploy.yml
9. Verify deployment and reload supervisor¶
[ ] Verify: SSH into the server:
$ ssh root@{hostname}
substituting your hostname for {hostname}. For example, ssh root@jupyter.org.
10. JupyterLab¶
Acknowledgment¶
Prof. Brian Granger, Cal Poly San Luis Obispo, authored this repository’s code to deploy JupyterHub for the course, DATA 301, “Introduction to Data Science.”
Thank you to Brian Granger and Jonathan Frederic, co-author of an earlier code prototype, for sharing their work.
Repository Contents¶
Ansible application¶
ansible.cfg¶
Custom configuration settings for the Ansible application
- We use it to customize root access, root privileges, and ssh connection length.
ansible-conda¶
Git submodule for the ansible-conda application