Load Data

There are three ways to load a dataset into the nnodely framework:

  1. From a directory: each file represents one simulation, and its lines are consecutive in time.

  2. From a dictionary: each entry in the dictionary represents one variable.

  3. From a pandas DataFrame.

[ ]:
# uncomment the command below to install the nnodely package
#!pip install nnodely

from nnodely import *
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>-- nnodely_v1.5.0 --<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

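The following cell creates a small network: a FIR filter applied to a 0.05 s time window of in1, trained so that its output matches the last sample of target. The model is neuralized with a sample time of 0.01 s.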

[2]:
in1 = Input('in1')
target = Input('target')
relation = Fir(in1.tw(0.05))
output = Output('out', relation)

model = Modely(visualizer=TextVisualizer())
model.addMinimize('out', output, target.last())
model.neuralizeModel(0.01)
================================ nnodely Model =================================
{'Constants': {},
 'Functions': {},
 'Info': {'SampleTime': 0.01,
          'nnodely_version': '1.5.0',
          'ns': [5, 0],
          'ntot': 5,
          'num_parameters': 5},
 'Inputs': {'in1': {'dim': 1, 'ns': [5, 0], 'ntot': 5, 'tw': [-0.05, 0]},
            'target': {'dim': 1, 'ns': [1, 0], 'ntot': 1, 'sw': [-1, 0]}},
 'Minimizers': {'out': {'A': 'Fir2', 'B': 'SamplePart4', 'loss': 'mse'}},
 'Outputs': {'out': 'Fir2'},
 'Parameters': {'PFir3W': {'dim': 1,
                           'tw': 0.05,
                           'values': [[0.7577804327011108],
                                      [0.1862850785255432],
                                      [0.5226411819458008],
                                      [0.8208074569702148],
                                      [0.10860830545425415]]}},
 'Relations': {'Fir2': ['Fir', ['TimePart1'], 'PFir3W', None, 0],
               'SamplePart4': ['SamplePart', ['target'], -1, [-1, 0]],
               'TimePart1': ['TimePart', ['in1'], -1, [-0.05, 0]]}}
================================================================================
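As a quick check of the summary above, the 'ns' entry of in1 can be reproduced by dividing the time window by the sample time. This is only a sketch of the arithmetic; the helper name samples_in_window is hypothetical and not part of nnodely.

```python
def samples_in_window(tw, sample_time):
    """Number of samples covered by a time window of length tw (hypothetical helper)."""
    # round() absorbs floating-point noise such as 0.05 / 0.01 = 5.000000000000001
    return round(tw / sample_time)

n_in1 = samples_in_window(0.05, 0.01)
print(n_in1)  # 5, matching 'ns': [5, 0] for in1
```

This is why in1 later appears with shape (N, 5, 1): each training sample carries five past values of the input.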

Load a dataset using a directory

To load a dataset from a directory, you must specify a name for the dataset, the folder path, and the structure of the data (the format argument), so that the framework knows which column feeds each input of the network.

[3]:
train_folder = 'data'
data_struct = ['in1', '', 'target']
model.loadData(name='dataset', source=train_folder, format=data_struct)
============================ nnodely Model Dataset =============================
Dataset Name:                 dataset
Number of files:              1
Total number of samples:      28
Shape of target:              (28, 1, 1)
Shape of in1:                 (28, 5, 1)
================================================================================
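The contents of the 'data' folder are not shown here. As a rough illustration, the sketch below writes a three-column file that would fit the ['in1', '', 'target'] structure. The file name, the comma delimiter, the absence of a header, and the reading of the empty string as "column not used" are all assumptions, not confirmed nnodely behaviour.

```python
import csv
import tempfile
from pathlib import Path

def write_example_file(folder, n_rows=32):
    """Write one simulation file with columns: in1, unused column, target."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "simulation_0.csv"  # hypothetical file name
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        for k in range(n_rows):
            x = 0.1 * k
            # The middle column (here a row counter) is the one skipped by ''
            writer.writerow([x, k, 2 * x - 3])
    return path

example_dir = tempfile.mkdtemp()
example_path = write_example_file(example_dir)

# Read the file back to confirm its layout
with example_path.open(newline="") as f:
    rows = list(csv.reader(f))
print(len(rows), len(rows[0]))
```

With a five-sample window on in1, a file of 32 rows would give 32 - 5 + 1 = 28 samples, which is consistent with the count reported above, although the real file may differ.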

You can also specify further parameters, such as the number of lines to skip (skiplines), the delimiter between values (delimiter), and whether the file contains a header (header).

[4]:
model.loadData(name='dataset_2', source=train_folder, format=data_struct, skiplines=4, delimiter='\t', header=None)
============================ nnodely Model Dataset =============================
Dataset Name:                 dataset_2
Number of files:              1
Total number of samples:      24
Shape of target:              (24, 1, 1)
Shape of in1:                 (24, 5, 1)
================================================================================

Load a dataset from a custom dictionary

You can build your own dataset as a dictionary that contains every input of the network and pass it to the ‘source’ argument.

[5]:
import numpy as np
data_x = np.array(range(10))
data_a = 2
data_b = -3
dataset = {'in1': data_x, 'target': (data_a*data_x) + data_b}

model.loadData(name='dataset_3', source=dataset)
============================ nnodely Model Dataset =============================
Dataset Name:                 dataset_3
Number of files:              1
Total number of samples:      6
Shape of target:              (6, 1, 1)
Shape of in1:                 (6, 5, 1)
================================================================================
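The sample count and shapes reported above follow from sliding a five-sample window over the ten values. The sketch below reproduces those shapes with plain NumPy; it illustrates the windowing idea and is not nnodely's internal implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data_x = np.arange(10)
target = 2 * data_x - 3
window = 5  # samples of in1 seen by the Fir (0.05 s at 0.01 s sample time)

# One row per sample: 10 - 5 + 1 = 6 overlapping windows of in1
in1_windows = sliding_window_view(data_x, window)[..., np.newaxis]
# target.last() keeps only the most recent value of each window
target_last = target[window - 1:].reshape(-1, 1, 1)

print(in1_windows.shape)  # (6, 5, 1)
print(target_last.shape)  # (6, 1, 1)
```

In general, N time-ordered rows yield N - window + 1 samples, which also explains the 96 samples obtained from 100 rows.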

Load a dataset from a pandas DataFrame

You can also use a pandas DataFrame as the source when loading a dataset into the nnodely framework.

[6]:
import pandas as pd
# Create a DataFrame with linearly spaced values for each input
df = pd.DataFrame({
    'in1': np.linspace(1,100,100, dtype=np.float32),
    'target': np.linspace(1,100,100, dtype=np.float32)})

model.loadData(name='dataset_4', source=df)
============================ nnodely Model Dataset =============================
Dataset Name:                 dataset_4
Number of files:              1
Total number of samples:      96
Shape of target:              (96, 1, 1)
Shape of in1:                 (96, 5, 1)
================================================================================

Resampling a pandas DataFrame

If the DataFrame has a column representing time, you can use those values to resample the dataset at the sample time of the neuralized network.

[7]:
df = pd.DataFrame({
    'time': np.array([1.0,1.5,2.0,4.0,4.5,5.0,7.0,7.5,8.0,8.5], dtype=np.float32),
    'in1': np.linspace(1,10,10, dtype=np.float32),
    'target': np.linspace(1,10,10, dtype=np.float32)})

model.loadData(name='dataset_resampled', source=df, resampling=True)
============================ nnodely Model Dataset =============================
Dataset Name:                 dataset_resampled
Number of files:              1
Total number of samples:      747
Shape of target:              (747, 1, 1)
Shape of in1:                 (747, 5, 1)
================================================================================

Get Samples from the Dataset

Once a dataset is loaded, you can draw random samples from it. Set the ‘window’ argument to choose how many consecutive samples to get from the given dataset, and ‘index’ to select a specific starting time instant.

[8]:
sample = model.getSamples(dataset='dataset_4', window=5)
model(sample, sampled=True)
[8]:
{'out': [49.65475082397461,
  52.050872802734375,
  54.44699478149414,
  56.843116760253906,
  59.23923873901367]}
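The printed outputs can be checked by hand, because the network is a single FIR: each output is the dot product of a five-sample window of in1 with the PFir3W values listed in the model summary. The sketch below assumes the first weight multiplies the oldest sample, and uses a start index inferred from the printed numbers; getSamples picks it at random, so another run would give different values.

```python
import numpy as np

# PFir3W values copied from the model summary
weights = np.array([0.7577804327011108, 0.1862850785255432,
                    0.5226411819458008, 0.8208074569702148,
                    0.10860830545425415])

in1 = np.linspace(1, 100, 100)  # same values as the 'in1' column of dataset_4
start = 18                      # inferred: the first window is 19, 20, 21, 22, 23
n_samples = 5                   # window=5 in getSamples

# Stack five consecutive, overlapping windows and apply the FIR to each
windows = np.stack([in1[start + k:start + k + 5] for k in range(n_samples)])
out = windows @ weights

print(out)  # close to the 'out' values printed above
```

Consecutive outputs differ by the sum of the weights (about 2.396), because every element of the window increases by 1 from one sample to the next.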