
Dataset

Combines all tables and the task definition into a dataset ready for training.

| Name | Type | Description | Optional |
| --- | --- | --- | --- |
| connector | SnowflakeConnector | The connector object used for sending requests to the GNN reasoner. | No |
| dataset_name | str | A user-defined name for the dataset, which must comply with Snowflake object identifier rules. | No |
| tables | list of GNNTable instances | A collection of table objects that constitute the dataset. | No |
| task_description | NodeTask or LinkTask | The target task assigned to the GNN. | No |

The constructor returns an instance of the Dataset class.

from relationalai_gnns import Dataset

dataset = Dataset(
    connector=connector,
    dataset_name="my_first_dataset",
    tables=[table_with_ckey_1, table_with_ckey_2, table_with_foreign_keys],
    task_description=node_task,
)
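The dataset_name requirement can be checked before the Dataset is constructed. The following is a minimal sketch, assuming the name is used as an unquoted Snowflake identifier (it starts with a letter or underscore, continues with letters, digits, underscores, or dollar signs, and is at most 255 characters long). The helper is_valid_dataset_name is hypothetical and not part of relationalai_gnns; the library may apply different or additional checks.

```python
import re

# Assumption: unquoted Snowflake identifier rules apply to dataset_name.
# First character: letter or underscore. Remaining characters: letters,
# digits, underscores, or dollar signs. Maximum total length: 255.
_UNQUOTED_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]{0,254}")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name looks like a valid unquoted Snowflake identifier."""
    return _UNQUOTED_IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my_first_dataset", "1st_dataset", "my-dataset"]:
        print(candidate, is_valid_dataset_name(candidate))
```

Running such a check locally surfaces a bad name before any request is sent through the connector.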
| Name | Description | Type |
| --- | --- | --- |
| experiment_name | A dataset is uniquely identified by its experiment name, which is automatically generated in the format dataset_name_task_type_task_name. The experiment name is also visible through the JobMonitor. | str |
| metadata_dict | A dictionary describing the dataset’s tables and task. | dict |
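To illustrate the experiment_name format, the sketch below joins the three parts with underscores. The helper name and the example task type and task name values are hypothetical; the exact strings the library uses for the task type are not specified here, so treat the result as an illustration of the pattern only.

```python
def expected_experiment_name(dataset_name: str, task_type: str, task_name: str) -> str:
    """Compose a name following the dataset_name_task_type_task_name pattern."""
    return "_".join([dataset_name, task_type, task_name])


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the result.
    print(expected_experiment_name("my_first_dataset", "node", "churn"))
```

In practice the generated value can be read directly from dataset.experiment_name rather than rebuilt by hand.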
| Name | Description | Returns |
| --- | --- | --- |
| visualize_dataset | Generates a visual representation of the dataset schema, including its tables and the defined task. | pydot.core.Dot |
| print_data_config | Prints, in JSON format, a dictionary describing the dataset’s tables and task. | None |

visualize_dataset generates a visual representation of the dataset schema, including its tables and the defined task.

| Name | Type | Description | Optional |
| --- | --- | --- | --- |
| show_dtypes | bool | Whether to show the data types of each column. Default is False. | Yes |

Returns a graph visualization of the dataset as a pydot.core.Dot object.

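from IPython.display import Image, display

# Build the schema graph (a pydot object) for the dataset.
graph = dataset.visualize_dataset()
# Render the graph to PNG bytes; pydot needs Graphviz installed for this step.
image = Image(graph.create_png())
# Show the image inline in a notebook.
display(image)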
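Outside a notebook, the same graph can be written to disk instead of displayed. The sketch below is one possible approach, assuming create_png() returns the PNG as bytes (as it does in pydot) and that Graphviz is installed. The helper save_schema_png is hypothetical and not part of relationalai_gnns; show_dtypes is the parameter documented above.

```python
from pathlib import Path


def save_schema_png(dataset, path, show_dtypes=False):
    """Render the dataset schema and write it to path as a PNG file.

    dataset is expected to expose visualize_dataset(show_dtypes=...),
    returning an object with a create_png() method that yields bytes.
    """
    graph = dataset.visualize_dataset(show_dtypes=show_dtypes)
    png_bytes = graph.create_png()
    output_path = Path(path)
    output_path.write_bytes(png_bytes)
    return output_path
```

A call such as save_schema_png(dataset, "schema.png", show_dtypes=True) would then produce a file that also lists each column's data type.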

print_data_config prints, in JSON format, a dictionary describing the dataset’s tables and task.

dataset.print_data_config()
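Because print_data_config returns None, its output has to be captured if it is needed as a string, for example to log it or save it alongside an experiment. The sketch below is a minimal way to do that, assuming the method writes to standard output with ordinary Python printing. The helper capture_data_config is hypothetical and not part of relationalai_gnns.

```python
import contextlib
import io


def capture_data_config(dataset) -> str:
    """Return the text that dataset.print_data_config() writes to stdout."""
    buffer = io.StringIO()
    # Temporarily route stdout into the buffer while the method prints.
    with contextlib.redirect_stdout(buffer):
        dataset.print_data_config()
    return buffer.getvalue()
```

If the printed text is valid JSON, json.loads can turn the captured string back into a dictionary; the metadata_dict attribute gives access to the dictionary directly without going through printed output.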