Running shell commands in Beaker Notebook

Data Science Notebooks like Beaker Notebook are a great way to not only explore and analyze data but also record the steps, so that the next Data Scientist can reproduce the results — just by clicking “Run”.

Few people like to spend time meticulously documenting their data analysis steps so to the degree that Data Science Notebooks can be “Self Documenting” — it greatly makes things a lot easier.

As Beaker Notebook can mix many programming languages within one Notebook like R, Python, and Groovy — most steps can be captured completely in a Notebook.

One case that is missing from Beaker however is the command line shell.  Bash is the default on Macintosh OSX, but the same is true for other shells, including the Windows shell or other Unix shells such as “csh” or “tcsh”.

Oftentimes shell commands are used for running data manipulation programs (like awk, perl, or sed), or running compiler processes (like maven or ant).

At Vital AI we use shell commands to run “vitalsigns” which compiles a data model into code, which is then used in data analysis, database queries, and machine learning processes (running inside Apache Spark).

It’s nice to run these within Beaker as not only a convenient way to avoid switching from Beaker to a terminal screen and back, but also to document these steps for reproducibility.

Fortunately with a little helper class it’s easy to run shell commands from Groovy cells in the Beaker Notebook.

The helper class is “RunBash.groovy” and is found on github here:
https://github.com/vital-ai/vital-data-utils/blob/master/src/main/groovy/ai/vital/data/utils/RunBash.groovy

Once a jar with this class is made available to Groovy via the Language Manager (see screenshot below), it can be used in a Groovy cell to run Bash scripts, like so:

import ai.vital.data.utils.RunBash

RunBash.enable() // hook .bash() to strings

//bash script begins here, in a Groovy multi-line string:
"""
echo \$VITAL_HOME

vitalsigns generate -o \${VITAL_HOME}/domain-ontology/vital-samples-0.1.0.owl -or
"""
.bash() // this runs the script</blockquote>

Here’s a screenshot of that running in Beaker:

runbash

Here’s a screenshot of the Language Manager (from the Notebook menu), with jars added to the Beaker Notebook classpath for Groovy.

beaker-languagemanager

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s