LinkTask
Defines a link prediction GNN learning task.
Parameters
| Name | Type | Description | Optional |
|---|---|---|---|
| connector | SnowflakeConnector | The connector object used for sending requests to the GNN engine. | No |
| name | str | The name of the task. It can be anything that describes the task at hand, but it must comply with Snowflake object identifier rules. | No |
| task_data_source | Dict | A dictionary mapping split names to table paths. For training, "train" and "validation" keys are required, while "test" is optional. For inference-only workflows, only "test" is required. Each value is a fully qualified Snowflake table name in Database.Schema.Table format. Multiple splits may reference the same table. | No |
| source_entity_column | ForeignKey | A foreign key that specifies the name of the source entity column in the training, validation, and test tables, and references the corresponding source entity GNNTable and its column. The column identified by this foreign key represents the source node in the task. The node IDs contained in the foreign key’s column_name must match, or be a subset of, the values in the column specified by the foreign key’s link_to attribute. | No |
| target_entity_column | ForeignKey | A foreign key that specifies the name of the target entity column in the training, validation, and test tables, and references the corresponding target entity GNNTable and its column. The column identified by this foreign key represents the target node in the task. The node IDs contained in the foreign key’s column_name must match, or be a subset of, the values in the column specified by the foreign key’s link_to attribute. (Users can choose not to provide targets for the test table.) | No |
| task_type | TaskType | The type of the link task: either TaskType.LINK_PREDICTION or TaskType.REPEATED_LINK_PREDICTION. | No |
| time_column | str | If the dataset includes a time-based dimension, you can specify a timestamp column to incorporate temporal dependencies. Only one time column is supported. For details, see the Time Columns section. | Yes |
| evaluation_metric | EvaluationMetric | The evaluation metric to optimize for. | Yes |
| current_time | bool | If set to False, the current time of the task table is reduced by one time unit. This is useful when the task table does not need to see values from the database tables at the same timestamp. | Yes |
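The split rules described for task_data_source can be checked before a task is created. The following is a minimal sketch in plain Python; `check_task_data_source` is a hypothetical helper and not part of relationalai_gnns, and its name check is a simplification that assumes unquoted identifiers without dots.

```python
import re

# Hypothetical helper, not part of relationalai_gnns. It only mirrors the
# split rules stated in the task_data_source description.
# Simplified pattern for Database.Schema.Table: three dot-separated parts,
# assuming unquoted identifiers (quoted names containing dots are not handled).
_FQ_NAME = re.compile(r"^[^.\s]+\.[^.\s]+\.[^.\s]+$")


def check_task_data_source(task_data_source):
    """Return "training" or "inference" for a valid mapping, else raise ValueError."""
    allowed = {"train", "validation", "test"}
    splits = set(task_data_source)

    unknown = splits - allowed
    if unknown:
        raise ValueError(f"unknown split names: {sorted(unknown)}")

    for split, table in task_data_source.items():
        if not _FQ_NAME.match(table):
            raise ValueError(
                f"{split!r} is not in Database.Schema.Table format: {table!r}"
            )

    # Training needs both "train" and "validation"; "test" is optional.
    if {"train", "validation"} <= splits:
        return "training"
    # Inference-only workflows carry just the "test" split.
    if splits == {"test"}:
        return "inference"
    raise ValueError(
        'expected "train" and "validation" (optionally "test"), or only "test"'
    )


mode = check_task_data_source(
    {
        "train": "DATABASE.SCHEMA.TRAIN",
        "validation": "DATABASE.SCHEMA.VALIDATION",
        "test": "DATABASE.SCHEMA.TEST",
    }
)
```

Running a check like this on the client side catches a missing "validation" key before any request is sent, but the engine's own validation remains the authority.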
Examples
Link Prediction Task With Time
As shown in the figure, this dataset contains three tables:
- customers with candidate key customer_id
- articles (products) with candidate key article_id
- transactions with two foreign keys: customer_id linking to the customers table, and article_id linking to the articles table, as well as a time column t_dat
Each row in the transactions table shows that a specific customer (customer_id) bought a specific product (article_id) on a specific date (t_dat).
Our task (purchase_task) is a recommendation task (link_prediction): given a customer and a date, we want to recommend articles the customer is likely to purchase. The time column is required so that the model does not see future transactions of a customer.
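To make the "no future transactions" idea concrete, here is a small plain-Python sketch that filters a customer's history by a prediction date. It illustrates the concept only and is not how the GNN engine implements temporal sampling; `visible_transactions` and its `inclusive` flag are hypothetical names, and the sample rows are made up.

```python
from datetime import date


def visible_transactions(transactions, customer_id, as_of, inclusive=False):
    """Return the customer's transactions that fall before the prediction date.

    With inclusive=True, rows dated exactly `as_of` are also returned.
    """
    def is_visible(t_dat):
        return t_dat <= as_of if inclusive else t_dat < as_of

    return [
        row
        for row in transactions
        if row["customer_id"] == customer_id and is_visible(row["t_dat"])
    ]


# Made-up rows shaped like the transactions table described above.
transactions = [
    {"customer_id": 1, "article_id": 10, "t_dat": date(2020, 9, 1)},
    {"customer_id": 1, "article_id": 11, "t_dat": date(2020, 9, 8)},
    {"customer_id": 1, "article_id": 12, "t_dat": date(2020, 9, 15)},
    {"customer_id": 2, "article_id": 10, "t_dat": date(2020, 9, 2)},
]

# History available when predicting for customer 1 on 2020-09-08.
history = visible_transactions(transactions, customer_id=1, as_of=date(2020, 9, 8))
```

The purchase on 2020-09-15 is excluded in both modes, which is the leakage the time column is meant to prevent.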
In this example, the source_entity_column links to the customers table, since we are making predictions about customers, and the target_entity_column links to the articles table, since we are predicting which articles the customers are likely to purchase next.
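Because the node IDs in each entity column must match, or be a subset of, the values in the linked column, it can help to check this on a local sample before defining the task. The sketch below uses plain Python sets; `missing_node_ids` is a hypothetical helper and the ID values are invented for illustration.

```python
def missing_node_ids(task_ids, entity_ids):
    """Return task IDs that do not appear in the linked entity column."""
    return sorted(set(task_ids) - set(entity_ids))


# Invented sample values standing in for customers.customer_id
# and articles.article_id.
customers_customer_id = [1, 2, 3, 4]
articles_article_id = [10, 11, 12]

# Invented (customer_id, article_id) pairs from a training table.
train_rows = [(1, 10), (2, 12), (4, 11)]

missing_sources = missing_node_ids(
    (customer for customer, _ in train_rows), customers_customer_id
)
missing_targets = missing_node_ids(
    (article for _, article in train_rows), articles_article_id
)
```

An empty result for both lists means the subset requirement holds for the sampled rows.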

In this case, the task can be defined as follows:
    from relationalai_gnns import LinkTask, TaskType, ForeignKey

    # The task_data_source maps each dataset split to the corresponding table name.
    link_task = LinkTask(
        connector=connector,
        name="recommendation_task",
        task_data_source={
            "train": "DATABASE.SCHEMA.TRAIN",
            "test": "DATABASE.SCHEMA.TEST",
            "validation": "DATABASE.SCHEMA.VALIDATION"
        },
        source_entity_column=ForeignKey(column_name='customer_id', link_to='customers.customer_id'),
        target_entity_column=ForeignKey(column_name='article_id', link_to='articles.article_id'),
        time_column="timestamp",
        task_type=TaskType.LINK_PREDICTION
    )

Repeated Link Prediction Task Setting Evaluation Metric
    from relationalai_gnns import LinkTask, TaskType, ForeignKey
    from relationalai_gnns import EvaluationMetric

    rep_link_task = LinkTask(
        connector=connector,
        name="my_link_task",
        task_data_source={
            "train": "DATABASE.SCHEMA.TRAIN",
            "test": "DATABASE.SCHEMA.TEST",
            "validation": "DATABASE.SCHEMA.VALIDATION"
        },
        source_entity_column=ForeignKey(column_name='source_ids', link_to='TableWithCKey1.Id1'),
        target_entity_column=ForeignKey(column_name='target_ids', link_to='TableWithCKey2.Id2'),
        task_type=TaskType.REPEATED_LINK_PREDICTION,
        evaluation_metric=EvaluationMetric(name="link_prediction_map", eval_at_k=12)
    )

Inference-Only Task
If you only need to run inference (no training), you can create a task with just a "test" split:

    from relationalai_gnns import LinkTask, TaskType, ForeignKey

    link_task = LinkTask(
        connector=connector,
        name="inference_recommendation_task",
        task_data_source={
            "test": "DATABASE.SCHEMA.TEST"
        },
        source_entity_column=ForeignKey(column_name='source_ids', link_to='TableWithCKey1.Id1'),
        time_column="timestamp",
        task_type=TaskType.LINK_PREDICTION
    )

Inference-only tasks can be used with trainer.predict() (by passing the task inside a new Dataset) but cannot be passed to trainer.fit() or trainer.fit_predict().
Methods
LinkTask inherits from GNNTable, so it has the same methods. It additionally provides a show_task() method:
.show_task()
Prints the task metadata schema and task details.
Example
    rep_link_task.show_task()

Attributes
.source_entity_column
Retrieves source_entity_column. Cannot be set after initialization. It is read-only.
.target_entity_column
Retrieves target_entity_column. Cannot be set after initialization. It is read-only.
.current_time
Retrieves current_time. Cannot be set after initialization. It is read-only.
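For readers unfamiliar with read-only attributes in Python, the sketch below shows one common way such behavior is achieved: a property with no setter. This is an assumption made for illustration, not a statement about how LinkTask is implemented; `ReadOnlyExample` is a hypothetical class.

```python
class ReadOnlyExample:
    """Hypothetical class whose attribute is fixed at initialization."""

    def __init__(self, current_time):
        self._current_time = current_time

    @property
    def current_time(self):
        # No setter is defined, so assignment raises AttributeError.
        return self._current_time


example = ReadOnlyExample(current_time=False)

try:
    example.current_time = True
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True
```

Under this pattern, changing a value means constructing a new object with the desired arguments rather than mutating the existing one.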