OK, if you are writing up intellectual property (IP) documents, say for patents or papers, you need a good base to work from. Right now, I'm trying to set up a workflow that is efficient and easy for, say, a patent attorney to understand. So here is the ideal approach:
- Be able to mix code, workflows, and text together
- Make sure this has revision control and you can recover what you wrote.
- Be able to reuse pieces
- Have a nice PDF or editable document output for non-technical people and a way for them to comment and provide revisions back
- Allow collaboration between different authors
This is actually pretty hard to achieve. When we started doing this five years ago, the solution was:
- Check everything into a GitHub repo
- Use Sphinx as a document assembly tool that allows imports
- Use Graphviz to create simple workflow diagrams
- Generate Word or PDFs out of this.
This worked pretty well, but it was technical and also you could not really integrate code into the explanation. And it certainly wasn't a graphical environment that made it easy for someone to just enter and look at things.
So in 2021, there are some great new methods for doing this. The biggest change has been the emergence of Jupyter as a nice interface that is graphical and lets at least technical folks look at things. This still leaves the problem of what a non-technical person can do. So a solution might be:
- GitHub storage. As before, GitHub is a great way to keep track of revisions and connect things together.
- Jupyter Notebooks and soon JupyterLab. Particularly with JupyterLab, there are now many plugins available as extensions. Right now, you can get structured workflows there but not arbitrary drawn graphics.
- Deepnote or Google Colab. Deepnote is a really nice collaboration tool that lets two people see each other's edits keystroke by keystroke. It does cost money compared with, say, Google Colab, but that is pretty cool.
- Google Docs after initial creation. This seems like a good compromise for attorneys and others to edit and comment.
A Solution: Deepnote with Jupyter Notebook and Graphviz
The folks at Deepnote are super responsive and answered all these questions:
- You can't log in to this forum from Safari when you are trying to use your Deepnote login to authenticate. It just loops. It does work on Google Chrome on macOS (Intel), though. Pretty strange, and it seems intermittent. I sent a bug report over to them.
- Also, you get caught by the popup blocker on Safari when trying to do a GitHub integration. This does work with Chrome, though.
- If you install a GitHub link after you already have a running browser session, the new directory does not appear in Terminal. You need a hard restart to get the new directory to appear in `/work`, so you might want to note that in the documentation. This is a real bug and they know about it.
- Documentation-wise, it is not obvious how you take existing files and put them into the GitHub repo. I think you may want to add to the documentation that you need to do a `mv * _your_new_repo_` and then the normal git push and so forth. I had expected this to work like Google Colab, but it's not in the UI.
- The final point is on Graphviz integration: output is handled slightly differently, so you might want to document that somewhere. You have to set the output format to PNG and import `Image`, and then you can render what Graphviz produces. They say there is a known bug with SVG output, but the PNG trick does work.
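The PNG trick from the last bullet can be sketched roughly as follows. This is not the exact code used on Deepnote; it is a minimal sketch that assumes the Graphviz `dot` binary is on the `PATH`, and the helper names `workflow_dot` and `render_png` as well as the step names are made up for illustration. It builds a small workflow graph, asks `dot` for PNG output instead of SVG, and leaves the display step to IPython's `Image`.

```python
import shutil
import subprocess


def workflow_dot(steps):
    """Build DOT source for a simple left-to-right chain of steps."""
    lines = ["digraph workflow {", "    rankdir=LR;"]
    for src, dst in zip(steps, steps[1:]):
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def render_png(dot_source):
    """Render DOT source to PNG bytes, or return None if `dot` is unavailable."""
    if shutil.which("dot") is None:
        return None
    try:
        result = subprocess.run(
            ["dot", "-Tpng"],  # -Tpng selects PNG output rather than SVG
            input=dot_source.encode("utf-8"),
            stdout=subprocess.PIPE,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return result.stdout


# Hypothetical workflow steps, for illustration only.
source = workflow_dot(["Draft", "Review", "File"])
png = render_png(source)

# In a notebook cell, show the PNG bytes through IPython's Image wrapper:
#   from IPython.display import Image
#   Image(data=png)
```

If you use the `graphviz` Python package instead, the same idea applies: set the output format to PNG and hand the resulting bytes to `Image`.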
For another 14 days: how to use JupyterLab on Deepnote
Well, in about 14 days JupyterLab becomes the default, replacing Jupyter Notebook, which is really nice. In the meantime, though, you need to feed it an image that has `JUPYTER_ENABLE_LAB=yes` set as an environment variable.
It is super cool that you let any arbitrary Docker image run. Is there any documentation on how to do this? I'd love to be able to run JupyterLab in your infrastructure, but I don't see any way to pass an environment flag to it. As pointed out, you need to add `-e` to the invocation or have some way to set a shell variable when doing the `docker run`.
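For reference, this is roughly what the `-e` flag looks like when you run one of the Jupyter images on your own machine. It is a minimal sketch: the helper `docker_run_command` is made up for illustration, and the image name `jupyter/tensorflow-notebook` and the port mapping are assumptions, not something Deepnote exposes. It only builds and prints the command, so nothing is actually started.

```python
import shlex


def docker_run_command(image, env=None, port=8888):
    """Return the argv list for `docker run`, passing each variable with -e."""
    argv = ["docker", "run", "--rm", "-p", f"{port}:8888"]
    for name, value in (env or {}).items():
        argv += ["-e", f"{name}={value}"]
    argv.append(image)
    return argv


# Assumed image name; substitute whichever image you actually built.
cmd = docker_run_command(
    "jupyter/tensorflow-notebook",
    {"JUPYTER_ENABLE_LAB": "yes"},
)
print(shlex.join(cmd))
# prints: docker run --rm -p 8888:8888 -e JUPYTER_ENABLE_LAB=yes jupyter/tensorflow-notebook
```

On a hosted service you do not control the `docker run` line, which is why the variable has to be set some other way.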
The trick here is to use GitHub to fork a Dockerfile repo from Jupyter, then build it with different environment parameters, and finally point Deepnote at the resulting Docker image.
- Go to Docker Hub and click on GitHub integration when you are creating your image.
- Fork the GitHub repo of the Docker images that you want. For instance, Jupyter has a complete collection of blessed Dockerfiles for notebooks, like the Tensorflow Notebook, which I just cloned and copied. Then, just by taking upstream downloads, I have a private copy of it for use at Richt's Docker Hub.
- The repo has different stacks all in one place. You don't need to have one repo per stack. So, for instance, if you have a `src` mono repo, then you can just point to the Dockerfile location.
- You can then connect it to a repository and specify the location of the Dockerfile. This means that for a repo that has lots of Dockerfiles, you just put in the source location.
- One thing that is very nice is that you can add the environment variables in a separate place, so you don't need to futz with the Dockerfile. This makes it trivial to add the variable `JUPYTER_ENABLE_LAB` and set that to True, and suddenly you have JupyterLab everywhere.
- Finally, you can tell Deepnote to run that Docker image and you are in JupyterLab! Or, for that matter, whatever you want to configure it to run! This is in the Environment section when you click on the Settings icon. Note that unless you are on a paid plan, you need to use a public image, but they do support Docker Hub, gcr.io and Amazon ECR. One small bug here is that when you click on add image, it looks like nothing happens even though the image gets added.
- Deepnote also runs an `init.ipynb`, which lets you run various code ahead of time.
JupyterLab: running it locally with extensions
For the many of us on Jupyter Notebook, what's the big deal about JupyterLab? Well, it brings a real debugger and other tools to Jupyter. In many ways it catches up to the forks that are Deepnote and Google Colab. The main thing is that the extensions are much more robust and it is much easier to add new modules. So here are some notes on running it:
- On a Mac, you can do a `brew install jupyterlab` and get it.
- But you probably want an environment to run it in, like conda or pipenv, so crank those up and install JupyterLab and then the extensions listed below, with `pipenv install jupyterlab` and then `jupyter lab` to run it.
- One note: it looks like JupyterLab does not run correctly in Safari. I'm pretty sure this has to do with some whitelisting I need to do, as it works fine in Chrome. I tried whitelisting it with Ghostery, and an inspection of the source shows that there is nothing coming from the Jupyter server.
The best extensions are easy to install, but they require that you pip install outside of JupyterLab, and they also modify the user interface. Much of this won't actually work with Deepnote, as they take over the user interface, but it will work on your local installations, so here goes. The most confusing thing about extensions is that some are node applications that you load with `jupyter labextension install` in JupyterLab 2.x, but with JupyterLab 3.x they are all pip packages.
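To check which of the two install styles applies to your environment, you can look at the installed JupyterLab major version. The sketch below is only an illustration of that 2.x versus 3.x rule of thumb; the helper names are made up, and the `<package>` and `<extension>` placeholders stand for whatever the extension's own README tells you to install.

```python
from importlib import metadata


def lab_major_version(version_string):
    """Parse the major version out of a string such as '3.0.16'."""
    return int(version_string.split(".")[0])


def extension_mechanism(version_string):
    """Describe how extensions are usually installed for this JupyterLab version."""
    if lab_major_version(version_string) >= 3:
        return "pip install <package>"
    return "jupyter labextension install <extension>"


def installed_lab_version():
    """Return the installed jupyterlab version, or None if it is not installed."""
    try:
        return metadata.version("jupyterlab")
    except metadata.PackageNotFoundError:
        return None


version = installed_lab_version()
if version is None:
    print("jupyterlab is not installed in this environment")
else:
    print(f"JupyterLab {version}: {extension_mechanism(version)}")
```

Run it inside the same conda or pipenv environment that you start `jupyter lab` from, since each environment can have a different version.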
- JupyterLab LSP. At last you can make Jupyter a full IDE, so now you get syntax support and definition jumping. To do this, you will need to do `pip install jupyterlab-lsp 'python-language-server[all]'` (the quotes keep shells like zsh from trying to expand the brackets). If this works, when you start Jupyter you should see the extensions successfully loaded. The `[all]` is important.
- Jupyter System Monitor. This just shows you the CPU usage and other facts about your system in the toolbar.
- Jupyter Debugger. This, along with the LSP, is one of the two things I really want; it is basically a way to do single-step debugging. While Jupyter Notebooks were OK with their REPL execution, this is really convenient. Many tools like Deepnote and Colab already have versions of this. It is a pain to set up, but nice to have.
- Jupyter Git. Again, this brings the standalone Jupyter up to the Colab and Deepnote levels so that you can commit things directly from Jupyter. Basically you get a pane and can open git repo items directly from the interface.
- JupyterLab NBDime. This lets you do diffs and merges of Jupyter notebooks, so from the command line you can do a `nbdiff first.ipynb second.ipynb` and the output looks sensible.
- Jupyter LaTeX. This would be great, but it doesn't support the latest version of Jupyter.
- JupyterLab Table of Contents. This is another convenience so you can see where you are in a big notebook. Colab has this and it's nice.
- JupyterLab Collapsible Headings. So you can fold and unfold your stuff with shortcut keys.
- JupyterLab Vim bindings. OK, for us power nerds, you have to have this! `:w` and `:s` work.
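On the NBDime item above: the reason a notebook-aware diff is useful is that an `.ipynb` file is JSON that also stores outputs and execution counts, so a plain text diff is noisy. The sketch below is not how nbdime works internally; it is a simplified standard-library illustration that compares only the cell sources, and the helper names and the two tiny notebooks are made up for the example.

```python
import difflib
import json


def source_lines(notebook_json):
    """Flatten a notebook's cell sources into lines, ignoring outputs and counts."""
    nb = json.loads(notebook_json)
    lines = []
    for index, cell in enumerate(nb.get("cells", [])):
        src = cell.get("source", "")
        text = "".join(src) if isinstance(src, list) else src
        lines.append(f"# cell {index} ({cell.get('cell_type', 'unknown')})")
        lines.extend(text.splitlines())
    return lines


def source_diff(first_json, second_json):
    """Unified diff of the cell sources of two notebooks."""
    return list(
        difflib.unified_diff(
            source_lines(first_json),
            source_lines(second_json),
            "first.ipynb",
            "second.ipynb",
            lineterm="",
        )
    )


def make_notebook(code, execution_count):
    """Build a one-cell notebook as a JSON string (illustrative data only)."""
    return json.dumps({
        "cells": [{
            "cell_type": "code",
            "execution_count": execution_count,
            "metadata": {},
            "outputs": [],
            "source": [code],
        }],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    })


first = make_notebook("print('hello')", 1)
second = make_notebook("print('hello, world')", 7)
diff = source_diff(first, second)
print("\n".join(diff))
```

The changed execution count does not show up in the result, only the edited line of code does. The real `nbdiff` goes much further and also handles outputs, metadata, and merges.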