Dataset
Combines all tables and the task definition into a dataset ready for training.
Parameters
| Name | Type | Description | Optional |
|---|---|---|---|
connector | SnowflakeConnector | The connector object used for sending requests to the GNN reasoner. | No |
dataset_name | str | A user-defined name for the dataset, which must comply with Snowflake object identifier rules. | No |
tables | list of GNNTable instances | A collection of table objects that constitute the dataset. | No |
task_description | NodeTask or LinkTask | The target task assigned to the GNN. | No |
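The `dataset_name` constraint can be checked before constructing the `Dataset`. The following is a minimal sketch, assuming the name is treated as an unquoted Snowflake identifier (a letter or underscore first, then letters, digits, underscores, or dollar signs, with at most 255 characters). The helper `is_valid_dataset_name` is hypothetical and not part of `relationalai_gnns`; the library may apply its own or additional checks.

```python
import re

# Unquoted Snowflake identifier: letter or underscore first, then
# letters, digits, underscores, or dollar signs; 255 characters max.
_UNQUOTED_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def is_valid_dataset_name(name: str) -> bool:
    """Hypothetical pre-check; not part of relationalai_gnns."""
    return len(name) <= 255 and _UNQUOTED_IDENTIFIER.fullmatch(name) is not None


print(is_valid_dataset_name("my_first_dataset"))  # True
print(is_valid_dataset_name("1st-dataset"))       # False
```

Running a check like this first can surface a naming problem before any request is sent through the connector.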
Returns
An instance of the Dataset class.
Example
```python
from relationalai_gnns import Dataset

# connector, the GNNTable objects, and node_task are assumed to be defined earlier.
dataset = Dataset(
    connector=connector,
    dataset_name="my_first_dataset",
    tables=[table_with_ckey_1, table_with_ckey_2, table_with_foreign_keys],
    task_description=node_task,
)
```
Attributes
| Name | Description | Type |
|---|---|---|
experiment_name | A dataset is uniquely identified by its experiment name, which is automatically generated in the format dataset_name_task_type_task_name. The experiment name is also visible through the JobMonitor. | str |
metadata_dict | A dictionary describing the dataset’s tables and task. | dict |
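To illustrate the documented format `dataset_name_task_type_task_name`, here is a sketch that assembles such a string. The helper `expected_experiment_name` and the example values for the task type and task name are hypothetical. The library generates the real experiment name, and its exact spelling of the task type may differ, so read `dataset.experiment_name` for the actual value.

```python
def expected_experiment_name(dataset_name: str, task_type: str, task_name: str) -> str:
    """Join the three parts with underscores, per the documented format."""
    return "_".join([dataset_name, task_type, task_name])


# Placeholder values, for illustration only.
print(expected_experiment_name("my_first_dataset", "node", "my_task"))
# my_first_dataset_node_my_task
```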
Methods
| Name | Description | Returns |
|---|---|---|
visualize_dataset | Generates a visual representation of the dataset schema, encompassing its tables and the defined task. | pydot.core.dot |
print_data_config | Prints, in JSON format, a dictionary describing the dataset’s tables and task. | None |
.visualize_dataset()
Generates a visual representation of the dataset schema, encompassing its tables and the defined task.
Parameters
| Name | Type | Description | Optional |
|---|---|---|---|
show_dtypes | bool | Whether to show the data types of each column. Default is False. | Yes |
Returns
A graph visualization object of the dataset, of class pydot.core.dot.
Example
```python
from IPython.display import Image, display

# Render the schema as a PNG and show it inline in a notebook
graph = dataset.visualize_dataset()
png_image = Image(graph.create_png())
display(png_image)
```

```python
from graphviz import Source

graph = dataset.visualize_dataset()

# Experiment with font size and plot size to get a good visualization
for node in graph.get_nodes():
    node.set('fontsize', "16")

graph.set_graph_defaults(size="10,10!")  # Increase graph size

src = Source(graph.to_string())
src  # Display in notebook
```
.print_data_config()
Prints, in JSON format, a dictionary describing the dataset’s tables and task.
Example
```python
dataset.print_data_config()
```
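Because print_data_config() only prints, the metadata_dict attribute is the natural source when the configuration needs to be stored. The following is a minimal sketch, assuming metadata_dict is JSON-serializable (suggested by the JSON output of print_data_config(), but not stated explicitly). The helper `save_metadata` is hypothetical, and the dictionary below is a placeholder, not real library output.

```python
import json
from pathlib import Path


def save_metadata(metadata: dict, path: str) -> None:
    """Write a metadata dictionary to disk as indented JSON."""
    Path(path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")


# Placeholder standing in for dataset.metadata_dict.
placeholder_metadata = {"tables": ["table_a", "table_b"], "task": "placeholder"}
save_metadata(placeholder_metadata, "dataset_config.json")
```

With a real dataset, the call would be `save_metadata(dataset.metadata_dict, "dataset_config.json")`.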