Built on Jupyter Notebooks, the new toolkit is designed to reduce the complexity of model development tasks.
IBM today announced Elyra, a set of open-source artificial intelligence-centric extensions to Jupyter Notebooks and, more specifically, to the new JupyterLab user interface. Data scientists can use the tools in model development.
Such tools are in growing demand. The reason: Use of Jupyter Notebooks is becoming a de facto requirement in many data science and artificial intelligence (AI) projects. They are the digital equivalent of the lab notebook for research scientists. They capture all aspects of a data science project. Specifically, the open-source web application lets data scientists create and share documents that include live code, equations, visualizations, and annotative notes. The notebooks are widely used for projects, including data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and more.
Elyra provides a visual editor for building Notebook-based AI pipelines, simplifying the conversion of multiple notebooks into batch jobs or workflows. Because data scientists, machine learning engineers, and AI developers can use cloud-based resources to run their experiments faster, they become more productive and can spend more of their time applying their technical skills. The initial release of Elyra includes:
- Notebook Pipelines visual editor
- Ability to run notebooks as batch jobs
- Hybrid runtime support (based on Jupyter Enterprise Gateway)
- Python script execution capabilities within the editor
- Notebook versioning based on Git integration
- Reusable configuration for runtimes
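To make the idea of turning several notebooks into a batch workflow concrete, here is a minimal Python sketch. It is not Elyra's pipeline format or API. The notebook file names and the helper functions are hypothetical, and the sketch assumes each notebook is run non-interactively with the standard `jupyter nbconvert --execute` command, in an order that respects its dependencies.

```python
import subprocess
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each notebook maps to the notebooks it depends on.
PIPELINE = {
    "load_data.ipynb": [],
    "clean_data.ipynb": ["load_data.ipynb"],
    "train_model.ipynb": ["clean_data.ipynb"],
    "evaluate_model.ipynb": ["train_model.ipynb"],
}


def execution_order(pipeline):
    """Return the notebooks ordered so each one comes after its dependencies."""
    return list(TopologicalSorter(pipeline).static_order())


def batch_command(notebook):
    """Build the nbconvert command that executes one notebook without a UI."""
    return ["jupyter", "nbconvert", "--to", "notebook",
            "--execute", "--inplace", notebook]


def run_pipeline(pipeline, runner=subprocess.run):
    """Execute every notebook in dependency order, stopping on the first failure."""
    for notebook in execution_order(pipeline):
        runner(batch_command(notebook), check=True)


if __name__ == "__main__":
    # Print the planned order only; call run_pipeline(PIPELINE) to execute.
    for step in execution_order(PIPELINE):
        print(step)
```

Passing the runner as a parameter keeps the sketch testable without Jupyter installed. A visual editor such as the one described here would presumably generate the dependency graph for the user rather than have it written by hand.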
Key features and capabilities
Previously, IBM’s work on Jupyter Enterprise Gateway addressed the challenges around scaling enterprise workloads. Elyra addresses the challenges of making workload development easier.
In particular, Elyra takes advantage of the work IBM has done with Jupyter Enterprise Gateway to enable Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes, and OpenShift.
It simplifies the task of running notebooks interactively on cloud machines, giving users access to cloud-based resources with specialized hardware such as GPUs and TPUs.
Leveraging hybrid runtime support, Elyra exposes Python scripts as first-class citizens, allowing users to edit their scripts locally and execute them seamlessly against local or cloud-based resources.
Additionally, Elyra supports versioning based on Git integrations. The integrated support for Git repositories makes it easier to track changes, roll back to working versions of the code, keep backups, and, most importantly, share work among team members. This collaborative working environment fosters productivity.
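As an illustration of what rolling back to a working version means in plain Git terms, the following Python sketch restores a single notebook file to the state it had at an earlier commit. It does not use Elyra's Git extension. The function names, the notebook name, and the commit reference are hypothetical, and the sketch assumes the `git` command-line tool is installed and the notebook lives in a Git repository.

```python
import subprocess


def restore_command(path, commit):
    """Build the Git command that restores one file to its state at a commit."""
    # "--" separates the commit reference from the file path.
    return ["git", "checkout", commit, "--", path]


def restore_notebook(path, commit, repo_dir="."):
    """Run the restore inside repo_dir; raises CalledProcessError on failure."""
    subprocess.run(restore_command(path, commit), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Show the command for a hypothetical notebook and the previous commit.
    print(" ".join(restore_command("train_model.ipynb", "HEAD~1")))
```

Calling `restore_notebook("train_model.ipynb", "HEAD~1")` would overwrite the working copy of that file with the earlier version, so uncommitted changes to it would be lost.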
Data scientists using IBM Watson Studio can easily make use of Elyra. IBM recently announced new releases of IBM Cloud Pak for Data and Watson Studio, which added JupyterLab as a richer way to work with Notebooks, in addition to classic Jupyter Notebooks. The current version of Watson Studio already provides versioning based on Git integration, and IBM is continuing to work with the open-source community to incorporate other Elyra extensions.