Notebooks

This module provides a playground for users who want to write their own code, whether for an ETL pipeline or any other purpose. A notebook can contain multiple code blocks, each of which can be written and run independently to solve a specific purpose. The main features available under this module are:

  1. Write your own code.

  2. Support for multiple languages.

  3. ETL pipelines (stream/batch).

  4. Code snippets for periodic execution.

  5. Scheduler (cron jobs).

To create a notebook, go to BigData > Notebooks and click Add New. Fill in the basic details - name and description - then click the Next button.

BigData > Notebooks > Add New

To write your code snippets, click Add New CodeBlock. Configure the code block's details - language, ETL, type, and sample code - according to your use case, then proceed with the coding.

In this example, we will read data from a MongoDB collection using Apache Spark SQL in Scala.

BigData > Notebooks > Create > MongoDB Read
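A code block for this step might look like the following minimal Scala sketch. The connection URI, database name, and collection name are placeholders, and the snippet assumes the MongoDB Spark Connector (v10+, which uses the `mongodb` data source format) is available on the notebook's classpath; adjust the options to match your environment.

```scala
// Minimal sketch: read a MongoDB collection into a Spark DataFrame.
// URI, database, and collection below are placeholder values.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("mongo-read-example")
  // Assumed connection URI; replace with your MongoDB instance.
  .config("spark.mongodb.read.connection.uri", "mongodb://localhost:27017")
  .getOrCreate()

// Requires the MongoDB Spark Connector on the classpath.
val df = spark.read
  .format("mongodb")
  .option("database", "mydb")       // placeholder database name
  .option("collection", "mycoll")   // placeholder collection name
  .load()

df.printSchema()   // inspect the inferred schema
df.show(10)        // preview the first rows in the Debug output
```

Older versions of the connector use the `com.mongodb.spark.sql.DefaultSource` format and `spark.mongodb.input.uri` instead, so check which connector version your platform bundles.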

After writing the code, click the Save button, then click the Run button available in the code block's actions, and finally the Debug button to see the output of the code.

BigData > Notebooks > Create > MongoDB Read > Debug

Once you are satisfied with the code and its output, save the code block and then save the notebook for future access.

You can also schedule the execution of a code block. Click the Schedule button available in the code block's actions, define the scheduler details - Hourly, Daily, or a custom cron expression - and click the Save button.

BigData > Notebooks > CodeBlock > Schedule
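For the Custom option, some illustrative five-field cron expressions are shown below. These are standard cron syntax examples, not taken from the product itself; verify the exact cron dialect (and field count) your scheduler expects.

```
0 * * * *        # hourly, at minute 0
0 2 * * *        # daily, at 02:00
*/15 * * * *     # every 15 minutes
0 9 * * MON-FRI  # weekdays at 09:00
```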

To enable the configured scheduler, click the green play button (the Schedule Job button) at the top left of the code block. Once enabled, the code block will execute periodically according to the defined schedule.

BigData > Notebooks > CodeBlock > Scheduler > Enable
