
Emgraph

Emgraph (Embedding graphs) is a Python library for graph representation learning.

It provides a simple API to design, train, and evaluate graph embedding models. You can also extend the base models to develop your own.


Installation

Install the latest version of Emgraph:

$ pip install emgraph
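
To confirm the installation, a quick import check is enough. Note that the `__version__` attribute is an assumption; a clean import by itself already means the package is available:

```python
# Minimal post-install sanity check.
# Assumes emgraph may expose __version__; falls back to a plain success message if not.
import emgraph

print(getattr(emgraph, "__version__", "emgraph imported successfully"))
```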

Quick start

Embedding the WordNet11 graph with the TransE model:

```python
from sklearn.metrics import brier_score_loss
from scipy.special import expit

from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import TransE


def train_transe(data):
    # Train a TransE model on the training split and score the test triples
    model = TransE(batches_count=64, seed=0, epochs=20, k=100, eta=20,
                   optimizer='adam', optimizer_params={'lr': 0.0001},
                   loss='pairwise', verbose=True, large_graphs=False)
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)
    scores = train_transe(data=wn11_dataset)
    print("Scores: ", scores)
    # predict() returns raw scores; squash them with the logistic sigmoid (expit)
    # before comparing against the 0/1 test labels with the Brier score
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
```

Evaluating a ComplEx model after training:
```python
import numpy as np

from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import ComplEx
from emgraph.evaluation import evaluate_performance


def complex_performance(data):
    model = ComplEx(batches_count=10, seed=0, epochs=20, k=150, eta=1,
                    loss='nll', optimizer='adam')
    model.fit(np.concatenate((data['train'], data['valid'])))
    filter_triples = np.concatenate((data['train'], data['valid'], data['test']))
    ranks = evaluate_performance(data['test'][:5], model=model,
                                 filter_triples=filter_triples,
                                 corrupt_side='s+o', use_default_protocol=False)
    return ranks


if __name__ == '__main__':
    wn18_dataset = BaseDataset.load_dataset(DatasetType.WN18)
    ranks = complex_performance(data=wn18_dataset)
    print("ranks {}".format(ranks))
```
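
The `ranks` array returned by `evaluate_performance` can be turned into the usual link-prediction summary metrics with plain NumPy. The snippet below is not part of Emgraph's API, just standard post-processing of the ranks:

```python
import numpy as np


def summarize_ranks(ranks):
    """Compute MRR and Hits@N from an array of filtered ranks."""
    ranks = np.asarray(ranks, dtype=float)
    mrr = np.mean(1.0 / ranks)          # mean reciprocal rank
    hits_at_1 = np.mean(ranks <= 1)     # fraction of test triples ranked first
    hits_at_10 = np.mean(ranks <= 10)   # fraction ranked in the top 10
    return mrr, hits_at_1, hits_at_10


# Example: summarize the ranks produced by complex_performance() above
# mrr, h1, h10 = summarize_ranks(ranks)
# print(f"MRR: {mrr:.3f}, Hits@1: {h1:.3f}, Hits@10: {h10:.3f}")
```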

More examples

Embedding the WordNet11 graph with the DistMult model:

```python
from sklearn.metrics import brier_score_loss
from scipy.special import expit

from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import DistMult


def train_dist_mult(data):
    model = DistMult(batches_count=1, seed=555, epochs=20, k=10,
                     loss='pairwise', loss_params={'margin': 5})
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)
    scores = train_dist_mult(data=wn11_dataset)
    print("Scores: ", scores)
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
```
Algorithms table

|   | Model    | Reference |
|---|----------|-----------|
| 1 | TransE   | Translating Embeddings for Modeling Multi-relational Data |
| 2 | ComplEx  | Complex Embeddings for Simple Link Prediction |
| 3 | HolE     | Holographic Embeddings of Knowledge Graphs |
| 4 | DistMult | Embedding Entities and Relations for Learning and Inference in Knowledge Bases |
| 5 | ConvE    | Convolutional 2D Knowledge Graph Embeddings |
| 6 | ConvKB   | A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network |
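
The models in the table are trained through the same fit/predict interface shown above, so swapping one in is a small change. As a minimal sketch (the exact HolE hyperparameters below are assumptions based on the TransE and DistMult examples, not values from the Emgraph documentation):

```python
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import HolE

# Assumed to mirror the TransE/DistMult constructors above; check the Emgraph
# documentation for HolE-specific hyperparameters before relying on these values.
model = HolE(batches_count=10, seed=0, epochs=20, k=100, eta=5,
             loss='pairwise', optimizer='adam')

wn18_dataset = BaseDataset.load_dataset(DatasetType.WN18)
model.fit(wn18_dataset['train'])
scores = model.predict(wn18_dataset['test'])
```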

Call for Contributions

The Emgraph project welcomes your expertise and enthusiasm!

Ways to contribute to Emgraph:

Issues

If you encounter any issue in the code, please report it here. Even better, fork the repository on GitHub and open a pull request.


Features

- [x] Support CPU/GPU
- [x] Vectorized operations
- [x] Preprocessors
- [x] Dataset loader
- [x] Standard API
- [x] Documentation
- [x] Test driven development

If you find this project helpful, please consider giving it a :star:.

License

Released under the BSD license

Credit

This repository is a transformation of the AmpliGraph library to TensorFlow 2, implemented with a modular architecture. It also draws inspiration from PyKEEN and Spectral; credit goes to these excellent projects.

Contact