Overview#
Main features#
There is one main command,
kosu, which execute one of several sub-commands:build— Build a course or courses. Reads the course’s YAML file, finds and processes the notebooks inprod, compiles the course directory, and compresses everything into a ZIP file for course participants. This command has several options, seekosu build --help.clean— Remove the build files associated with a course or courses.init— Initialize a directory to start using kosu.publish— Publish a course or courses to AWS.test— Test that a course builds.
You may only ever need to use
initonce, when you first start usingkosu.There is a global control file,
.kosu.yaml, which contains some parameters you will want to set and maintain.All of the other commands take either a single course name, or the
--allflag, which applies the command to all the courses listed in.kosu.yamlunder theallkey.There is one control file per course, e.g.
geocomp.yaml. This file contains the metadata for the course, including the curriculum and a list of its notebooks.There is one main, common environment file,
environment.yaml. This contains packages to be installed for (i.e. common to) all courses. A course’s YAML control file lists any other packages to install for that class.
Course project structure#
The tool is designed to be used in a directory with the following structure:
foo
├── environment.yaml
├── example_course.yaml
├── images
│ └── agile_logo.png
├── .kosu.yaml
├── prod
│ ├── Interesting_notebook.ipynb
│ ├── Intro_to_Matplotlib.ipynb
│ ├── Intro_to_NumPy.ipynb
│ └── Intro_to_Python.ipynb
├── references
│ └── useful.pdf
├── scripts
│ └── example.py
└── templates
└── README.md
What happens during course compilation?#
The course folder is built by kosu build example_course in the following way:
A course folder is created in a (new if necessary) folder called
build.A course README is built by placing the
curriculumin the README template.The notebooks in the curriculum are processed as described below.
Images required by the notebooks (detected automatically) are copied into an
imagesfolder.Scripts are copied into
masterandnotebooksso they can be imported into any notebook.References are copied over into a
referencesfolder.We almost always import data from Amazon S3, making data files easy to detect. Accordingly, we check that data files mentioned in the notebooks are present in the S3 bucket.
If some data files are to be included in the course repo, they are downloaded from the bucket and unzipped if necessary.
A
condaenvironment file is constructed from the various requirements in the global env file and the course control YAML file. (Dependencies are not yet detected from the notebooks.)The course folder is optionally zipped and optionally uploaded to S3.
Build files are optionally cleaned up.
Jupyter Notebooks in the prod directory are processed in various ways:
Two copies of the Notebooks are included in the course folder: one goes into the
masterfolder, the other intonotebooks. Students use the latter in the class.Cells with
exercisetags are highlighted in green in the student notebooks.Cells with
infotags are highlighted in blue in the student notebooks.Cells with
hidetags are hidden in the student notebooks.Cell outputs are deleted in the student notebooks.