Emgraph (Embedding graphs) is a Python library for graph representation learning.

It provides a simple API for designing, training, and evaluating graph embedding models. You can build on the base models to develop your own model.

Install the latest version of Emgraph:

$ pip install emgraph

Quick start

Embedding the WordNet11 graph using the TransE model:

from sklearn.metrics import brier_score_loss, log_loss
from scipy.special import expit
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import TransE

def train_transe(data):
    # Configure the TransE model, then fit it on the training triples.
    model = TransE(batches_count=64, seed=0, epochs=20, k=100, eta=20,
                   optimizer='adam', optimizer_params={'lr': 0.0001},
                   loss='pairwise', verbose=True, large_graphs=False)
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores

if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)
    scores = train_transe(data=wn11_dataset)
    print("Scores: ", scores)
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
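
The example above passes the raw model scores through `expit` before computing the Brier score. As a rough illustration of what those two steps do, here is a minimal NumPy-only sketch. The helper names `sigmoid` and `brier_score` and the sample scores and labels are hypothetical and are not part of Emgraph; they mirror `scipy.special.expit` and `sklearn.metrics.brier_score_loss` only for binary 0/1 labels.

```python
import numpy as np


def sigmoid(x):
    """Logistic function: maps any real-valued score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def brier_score(labels, probs):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    labels = np.asarray(labels, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return float(np.mean((probs - labels) ** 2))


# Hypothetical raw triple scores and ground-truth labels, for illustration only.
raw_scores = np.array([2.0, -1.0, 0.0, 3.0])
labels = np.array([1, 0, 1, 1])

probs = sigmoid(raw_scores)
print("Probabilities:", probs)
print("Brier score:", brier_score(labels, probs))
```

A lower Brier score means the squashed scores sit closer to the true labels; 0 is a perfect match and 1 is the worst case for binary labels.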

Evaluating the ComplEx model after training:

```python
import numpy as np
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import ComplEx
from emgraph.evaluation import evaluate_performance


def complex_performance(data):
    model = ComplEx(batches_count=10, seed=0, epochs=20, k=150, eta=1,
                    loss='nll', optimizer='adam')
    # Fit on the training and validation triples together.
    model.fit(np.concatenate((data['train'], data['valid'])))
    # Known true triples, used to filter corruptions during ranking.
    filter_triples = np.concatenate((data['train'], data['valid'], data['test']))
    # Rank only the first five test triples to keep the example fast.
    ranks = evaluate_performance(data['test'][:5], model=model,
                                 filter_triples=filter_triples,
                                 corrupt_side='s+o',
                                 use_default_protocol=False)
    return ranks


if __name__ == '__main__':
    wn18_dataset = BaseDataset.load_dataset(DatasetType.WN18)
    ranks = complex_performance(data=wn18_dataset)
    print("ranks {}".format(ranks))
```
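
The ranks returned by the evaluation step are usually summarized as mean reciprocal rank (MRR) and Hits@N. The following is a minimal NumPy sketch of those two summaries. The helpers `mrr` and `hits_at_n` and the sample ranks are hypothetical and written only for illustration; Emgraph may ship its own metric functions, which are not shown here. The helpers flatten their input, so they accept either a 1-D or a 2-D array of ranks.

```python
import numpy as np


def mrr(ranks):
    """Mean reciprocal rank: the average of 1/rank over all ranked triples."""
    ranks = np.asarray(ranks, dtype=float).ravel()
    return float(np.mean(1.0 / ranks))


def hits_at_n(ranks, n):
    """Fraction of ranked triples whose rank is at most n."""
    ranks = np.asarray(ranks).ravel()
    return float(np.mean(ranks <= n))


# Hypothetical ranks (1 is the best possible rank), for illustration only.
example_ranks = np.array([[1, 4], [2, 1], [10, 3]])

print("MRR:", mrr(example_ranks))
print("Hits@1:", hits_at_n(example_ranks, 1))
print("Hits@3:", hits_at_n(example_ranks, 3))
```

Both summaries lie between 0 and 1, and higher is better: a model that always ranks the true triple first scores 1.0 on each.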

More examples

Embedding the WordNet11 graph using the DistMult model:

```python
from sklearn.metrics import brier_score_loss, log_loss
from scipy.special import expit
from emgraph.datasets import BaseDataset, DatasetType
from emgraph.models import DistMult


def train_dist_mult(data):
    model = DistMult(batches_count=1, seed=555, epochs=20, k=10,
                     loss='pairwise', loss_params={'margin': 5})
    # Fit on the training triples, then score the test triples.
    model.fit(data['train'])
    scores = model.predict(data['test'])
    return scores


if __name__ == '__main__':
    wn11_dataset = BaseDataset.load_dataset(DatasetType.WN11)
    scores = train_dist_mult(data=wn11_dataset)
    print("Scores: ", scores)
    # Map raw scores to (0, 1) with the logistic function before scoring calibration.
    print("Brier score loss:", brier_score_loss(wn11_dataset['test_labels'], expit(scores)))
```
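
The DistMult example sets `loss='pairwise'` with a margin of 5. As a rough guide to what a pairwise margin loss computes, here is a NumPy sketch of the common max-margin formulation, in which a higher score means a more plausible triple. The function `pairwise_margin_loss` and the sample scores are hypothetical, and whether Emgraph's implementation matches this form exactly is an assumption.

```python
import numpy as np


def pairwise_margin_loss(pos_scores, neg_scores, margin=5.0):
    """Sum of max(0, margin + negative score - positive score) over all pairs.

    A pair contributes zero loss once the positive triple outscores its
    corrupted counterpart by at least `margin`.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    return float(np.sum(np.maximum(0.0, margin + neg - pos)))


# Hypothetical scores for two (positive, negative) triple pairs.
pos_scores = np.array([10.0, 2.0])
neg_scores = np.array([1.0, 0.0])

# First pair: the gap of 9 exceeds the margin, so it contributes 0.
# Second pair: the gap of 2 falls short of the margin by 3, so it contributes 3.
print("Loss:", pairwise_margin_loss(pos_scores, neg_scores, margin=5.0))
```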

Algorithms table

| # | Model | Reference |
|---|-------|-----------|
| 1 | TransE | Translating Embeddings for Modeling Multi-relational Data |
| 2 | ComplEx | Complex Embeddings for Simple Link Prediction |
| 3 | HolE | Holographic Embeddings of Knowledge Graphs |
| 4 | DistMult | Embedding Entities and Relations for Learning and Inference in Knowledge Bases |
| 5 | ConvE | Convolutional 2D Knowledge Graph Embeddings |
| 6 | ConvKB | A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network |
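
To give a feel for how two of the listed models score a triple (head, relation, tail), here is a NumPy sketch of the textbook scoring functions from the TransE and DistMult papers. It illustrates the published formulations, not Emgraph's internal code; the function names and the toy embeddings are hypothetical.

```python
import numpy as np


def transe_score(h, r, t, norm=1):
    """TransE: negative distance between (head + relation) and tail.

    The score is highest (zero) when h + r equals t exactly.
    """
    return -float(np.linalg.norm(h + r - t, ord=norm))


def distmult_score(h, r, t):
    """DistMult: trilinear product sum_i h_i * r_i * t_i."""
    return float(np.sum(h * r * t))


# Toy 3-dimensional embeddings, for illustration only.
h = np.array([0.1, 0.2, 0.3])
r = np.array([0.4, 0.0, -0.1])
t = np.array([0.9, -0.5, 0.7])

print("TransE score:", transe_score(h, r, t))
print("DistMult score:", distmult_score(h, r, t))
```

One consequence visible in the code: the DistMult score is unchanged when head and tail are swapped, so this formulation cannot tell a relation from its inverse, whereas the TransE score generally changes.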

Call for Contributions

The Emgraph project welcomes your expertise and enthusiasm!

Ways to contribute to Emgraph:


If you encounter any issue in the code, please report it. Better still, fork the repository on GitHub and/or create a pull request.

- [x] Support CPU/GPU
- [x] Vectorized operations
- [x] Preprocessors
- [x] Dataset loader
- [x] Standard API
- [x] Documentation
- [x] Test driven development

Released under the BSD license


This repository is an adaptation of the AmpliGraph library for TensorFlow 2, implemented with a modular architecture. It also draws inspiration from PyKEEN and Spectral. Credit is extended to these exceptional projects.
