Overview

This document describes what users need to do to progress in their Data Management journey with Streams by Datanexions, CLI version.

Streams by Datanexions, CLI version, is the edition of our no-code Data Management solution that runs in a terminal window (cmd or PowerShell on Windows, Terminal on macOS), whereas the enterprise edition is used from a web user interface.

This document therefore walks through the files that a CLI user manages in order to design a target database, load it with data combined from multiple sources, and automatically update the metadata in a Data Governance tool.

Folder Organization

All command-line options are explained in the Installation part and are not repeated here. This part instead describes the content of each file that the user has to modify or create.

Under the streams-<version>-cli/ folder extracted from the streams-<version>-cli.zip file, there is a template folder, myStreamsEnvironment/, provided as a starting point for a Data Management environment dedicated to a specific target database. It can be renamed and copied anywhere in your file system.

We will call <environmentFolderPath> the full file system path to that folder, and <environment folder> the name of the folder itself, whatever name you decide to give it. The following structure shows all the sub-folders, properties files and configuration files that may exist over time:

└── <environment folder>/
    ├── environment.json
    ├── target.connection.json
    ├── config/
    │   ├── glossary.<globalLabel>.json
    │   ├── variants.json
    │   └── objects.json
    ├── inputFiles/
    └── projects/
        ├── <project folder>/
        │   ├── project.json
        │   └── config/
        │       ├── glossary.<localLabel>.json
        │       ├── variants.json
        │       ├── objects.json
        │       ├── <connection 1 Name>.connection.json
        │       ├── <connection 1 Name>.scope.json
        │       ├── <connection 1 Name>.stream.<streamName1>.json
        │       ├── <connection 1 Name>.stream.<streamName2>.json
        │       ├── ...
        │       ├── <connection 1 Name>.stream.<streamNameN>.json
        │       ├── <connection 2 Name>.connection.json
        │       ├── <connection 2 Name>.scope.json
        │       ├── <connection 2 Name>.stream.<streamName1>.json
        │       ├── <connection 2 Name>.stream.<streamName2>.json
        │       ├── ...
        │       ├── <connection 2 Name>.stream.<streamNameN>.json
        │       └── ...
        ├── <project 2 folder>/
        │   └── ...
        └── <project N folder>/
            └── ...

Explanation of Key Elements:

<environment folder>/: The home directory of all projects that target a specific database.
environment.json: The configuration file for the target database. It contains its type (DATABASE_TYPE) and sub-type (DATABASE_NAME).
target.connection.json: Contains the detailed settings for connecting to the target database.
inputFiles/: Optional folder containing source files that Streams ingests before transformation (Excel files, for example).
<project n folder>/: Each of these folders holds the Streams content for one functional subset of your data journey towards the target database.
project.json: The configuration file that names the containers in the target database (buckets, scopes and collections; databases, instances and tables; and so on, depending on the database) used to store metadata (data models) and data.
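The file names under a project's config/ folder follow a regular pattern, which the following Python sketch makes explicit. It is only an illustration and not part of Streams: the function classify_config_file and the ConfigFile type are hypothetical, and the sketch assumes that connection names do not themselves contain ".stream.".

```python
from typing import NamedTuple, Optional


class ConfigFile(NamedTuple):
    kind: str                  # "glossary", "variants", "objects", "connection", "scope" or "stream"
    connection: Optional[str]  # connection name, when the file belongs to one
    label: Optional[str]       # glossary label or stream name, when present


def classify_config_file(filename: str) -> Optional[ConfigFile]:
    """Map a config/ file name to its role, or None if it matches no pattern."""
    if not filename.endswith(".json"):
        return None
    stem = filename[: -len(".json")]
    if stem in ("variants", "objects"):
        return ConfigFile(stem, None, None)
    if stem.startswith("glossary."):
        return ConfigFile("glossary", None, stem[len("glossary."):])
    if stem.endswith(".connection"):
        return ConfigFile("connection", stem[: -len(".connection")], None)
    if stem.endswith(".scope"):
        return ConfigFile("scope", stem[: -len(".scope")], None)
    if ".stream." in stem:
        # Assumption: the first ".stream." separates connection and stream name.
        connection, stream_name = stem.split(".stream.", 1)
        return ConfigFile("stream", connection, stream_name)
    return None
```

For instance, with a made-up connection called crm, classify_config_file("crm.stream.customers.json") yields the kind "stream", the connection "crm" and the stream name "customers".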

Environment Configuration

environment.json

Defines the target database type for your Streams environment.

Location: <environmentFolderPath>/environment.json

Structure

{
  "TARGET_DATABASE": {
    "DATABASE_TYPE": "NOSQL",
    "DATABASE_NAME": "COUCHBASE"
  }
}

Configuration Options

DATABASE_TYPE

Value: "NOSQL"

Currently, only NoSQL databases are supported.

DATABASE_NAME

Supported Values:

"COUCHBASE"
"MONGODB"
"CASSANDRA"
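Because only a few values are accepted, it can be convenient to check environment.json before running a Streams command. The Python sketch below is a hypothetical helper, not a Streams feature: the function name validate_environment is invented for this example, and it assumes the values must match the uppercase spellings listed above exactly.

```python
import json
from typing import List

# Values taken from the lists above; exact, case-sensitive matching is an assumption.
SUPPORTED_TYPES = ("NOSQL",)
SUPPORTED_NAMES = ("COUCHBASE", "MONGODB", "CASSANDRA")


def validate_environment(text: str) -> List[str]:
    """Return a list of problems found in the content of environment.json."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    target = config.get("TARGET_DATABASE") if isinstance(config, dict) else None
    if not isinstance(target, dict):
        return ["missing TARGET_DATABASE object"]

    problems = []
    if target.get("DATABASE_TYPE") not in SUPPORTED_TYPES:
        problems.append(f"unsupported DATABASE_TYPE: {target.get('DATABASE_TYPE')!r}")
    if target.get("DATABASE_NAME") not in SUPPORTED_NAMES:
        problems.append(f"unsupported DATABASE_NAME: {target.get('DATABASE_NAME')!r}")
    return problems
```

An empty list means that the file matches the structure and values documented here.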

Examples

Couchbase

{
  "TARGET_DATABASE": {
    "DATABASE_TYPE": "NOSQL",
    "DATABASE_NAME": "COUCHBASE"
  }
}

MongoDB

{
  "TARGET_DATABASE": {
    "DATABASE_TYPE": "NOSQL",
    "DATABASE_NAME": "MONGODB"
  }
}

Cassandra

{
  "TARGET_DATABASE": {
    "DATABASE_TYPE": "NOSQL",
    "DATABASE_NAME": "CASSANDRA"
  }
}

Usage

This file is read at the beginning of every Streams operation to determine:

How to interpret target.connection.json
How to structure project.json
Which database driver to use
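To make that read-first step concrete, here is a minimal Python sketch of how a script of your own could read the same file from an environment folder. The function load_target_database is hypothetical and says nothing about how Streams reads the file internally.

```python
import json
from pathlib import Path
from typing import Tuple


def load_target_database(environment_folder_path: str) -> Tuple[str, str]:
    """Return (DATABASE_TYPE, DATABASE_NAME) from <environmentFolderPath>/environment.json."""
    env_file = Path(environment_folder_path) / "environment.json"
    with env_file.open(encoding="utf-8") as handle:
        target = json.load(handle)["TARGET_DATABASE"]
    return target["DATABASE_TYPE"], target["DATABASE_NAME"]
```

A script could then branch on the returned pair, for example to decide which fields to expect in target.connection.json.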
