A Joint Embedding Method of Relations and Attributes for Entity Alignment

Entity alignment links entities that refer to the same real-world objects across different knowledge graphs (KGs). Existing knowledge-embedding-based entity alignment methods mostly regard a KG as a set of relation triples, while ignoring the attributes and attribute values in the KG. However, attribute information provides a valid supplement to relation triples, alleviates the relation universality and information incompleteness problems of relation triples, and improves the accuracy of the entity alignment task. In this paper, we make the first attempt towards combining relation and attribute triples for entity alignment. We divide a KG into relation triples and attribute triples, and use the parameter sharing (PS) joint method and translation-based knowledge embedding methods to embed them jointly. In addition, we design two strategies, direct accumulation and weight assignment, to explore the effect of the relation and attribute triples' embeddings on performance. The experimental results show that our method significantly improves the Hits@1, Hits@10 and Mean Rank metrics compared to the baseline and achieves state-of-the-art results on the entity alignment task. The source code for this paper is available at https://github.com/ChengRui536/RAKRL.


I. INTRODUCTION
A knowledge graph (KG) is a knowledge organization form adopted to describe various concepts, entities and their corresponding relations and attributes in the real world. It is usually expressed as triples, which are further divided into relation triples and attribute triples. Existing large-scale KGs, mainly FreeBase [1], DBpedia [2] and Wikidata [3], have been widely used in various applications, e.g., search engines, intelligent assistants, translation systems, question answering systems and so on.
However, a single existing KG usually suffers from low information coverage, incomplete knowledge description and low knowledge quality, and different KGs usually exhibit strong heterogeneity and much knowledge repetition, which is not conducive to data sharing and integration. How to fuse different KGs into one KG with wide knowledge coverage and high knowledge correctness has therefore become an urgent problem for KG-based applications. Entity alignment is the most critical technology in knowledge fusion.
Several families of methods have been applied to the entity alignment task. The most traditional are supervised methods, which learn a model from annotated data [4], [5]. Nevertheless, these models are typically inextensible and inflexible, and heavily rely on annotated data, so they are usually time consuming and labor intensive. To avoid these disadvantages, traditional unsupervised methods based on probabilistic models [6], similarity measures [7] and hierarchical graph models have been proposed. However, traditional unsupervised methods are usually based on some assumptions or predefined similarity metrics. To compensate for the flaws of supervised methods and traditional unsupervised methods, unsupervised knowledge embedding methods [8]-[13] were proposed. However, these methods all regard a KG as a set of relation triples, ignoring the attribute triples in it. Yet attribute information is very useful. For example, for the triple (China, area, 9634057), because '9634057' is not an entity, it is not a relation triple, so existing unsupervised knowledge embedding methods will discard this data, although it is very useful for aligning the entity 'China'.
To the best of our knowledge, most existing entity alignment methods via knowledge embedding are incapable of distinguishing relation triples from attribute ones.
To solve this problem, we make the first attempt towards combining relation and attribute triples for entity alignment: we divide a KG into relation triples and attribute triples, and use the parameter sharing joint method and translation-based knowledge embedding methods to embed them jointly. Hence, our method is able to fully utilize the rich internal data of KGs without discarding data, and is flexible enough to be used in various practical application scenarios.
The main challenges of our method are: how to embed relation and attribute triples separately, and how to combine the embedding results of relation and attribute triples to achieve better performance. In this paper, according to the characteristics of relation and attribute triples, we use the most mainstream and most widely used translation-based knowledge embedding methods (TransE and PTransE) to embed them; and to achieve better performance, we design two strategies for combining the embedding results of relation and attribute triples, namely the direct accumulation strategy and the weight assignment strategy.
The main contributions of this paper are:
- We make the first attempt towards combining relation and attribute triples for the entity alignment task: we divide a KG into relation triples and attribute triples and embed them simultaneously using translation-based knowledge embedding methods. This method fully utilizes the rich internal data of a KG and has good practicality and applicability.
- Two strategies, i.e., the direct accumulation strategy and the weight assignment strategy, are designed to explore the effect of the relation and attribute triples' embedding results on entity alignment experiments.
- We evaluated our model on entity alignment tasks; the results show that our method significantly improves the Hits@1, Hits@10 and Mean Rank metrics compared to the baseline and obtains state-of-the-art results.

The rest of this paper is organized as follows: Section II briefly introduces related work; our method is described in Section III; the obtained results are illustrated in Section IV; and finally Section V concludes the paper and gives several open lines of future research.

II. RELATED WORK

A. Entity Alignment
Depending on whether annotated data is used, entity alignment methods can be divided into supervised methods and unsupervised methods [14].
Supervised Methods. Supervised methods learn models from annotated data, and mainly include attribute-comparison-based methods, cluster-based methods and active learning methods. [4] proposed a method that automatically generates nonlinear attribute weights to align instances. Nevertheless, supervised methods heavily rely on annotated data and are neither extensible nor flexible.
Traditional Unsupervised Methods. Traditional unsupervised methods mainly include probabilistic methods, similarity-based methods and hierarchical graph models. [6] proposed the PARIS system based on probabilistic methods, which can align knowledge bases without adjusting parameters or training data, but cannot deal with structural heterogeneity. [7] proposed an unsupervised framework, CoLink, for the user identity linking problem. The framework models both the attribute-based alignment model and the relation-based alignment model as binary classifiers, and trains them in an iterative co-training manner. However, traditional unsupervised methods are often based on some assumptions or predefined similarity metrics, which brings limitations.

Unsupervised Knowledge Embedding Methods.
Knowledge embedding methods encode the information of a KG into a continuous low-dimensional semantic space, in which entities with the same meaning or related entities are often close to each other. This approach is suitable for large-scale KG alignment tasks. [9] solved the cross-lingual and heterogeneous entity alignment problem by jointly embedding the global structure information of heterogeneous KGs. However, this method requires rich relational and structural information and lacks applicability. [10] constructed transformations between different vector spaces using TransE for multi-lingual knowledge alignment. [11] proposed cross-KG training to simultaneously learn the embeddings of two different KGs. [12] proposed a joint embedding method to iteratively align entities, but this method ignores attribute triples. [14] proposed an iterative entity alignment method, SEEA, based on self-learning and knowledge embedding. This method clearly distinguishes attributes and relations in a KG, but only attribute triples are embedded in its experiments. [15] divided knowledge embedding methods into translation distance models and semantic matching models. The former use a distance-based scoring function, and the latter use a similarity-based scoring function. This paper mainly investigates translation distance models, including TransE and PTransE. TransE [16] embeds entities and relations into a low-dimensional vector space, treats the relation r in each triple (h, r, t) as a translation from h to t, and makes h + r ≈ t as much as possible by constantly adjusting h, r and t. However, this method only considers the direct relations of triples in knowledge embedding, ignoring the rich inference relations that exist between entities. Therefore, [17] proposed a path-based embedding model, PTransE, which adds multi-step path embeddings between entities to TransE. This paper is based on unsupervised knowledge embedding methods; we use TransE and PTransE for the entity alignment experiments.
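The translation assumption h + r ≈ t can be illustrated with a minimal sketch (NumPy; the helper name and toy vectors are our own, not from the paper's codebase):

```python
import numpy as np

def transe_energy(h, r, t, norm=1):
    """Energy of a triple (h, r, t): distance between h + r and t.
    Lower energy means the triple is considered more plausible."""
    diff = h + r - t
    return np.sum(np.abs(diff)) if norm == 1 else np.sqrt(np.sum(diff ** 2))

# Toy 4-dimensional embeddings where h + r equals t exactly.
h = np.array([0.1, 0.2, 0.3, 0.4])
r = np.array([0.5, 0.0, -0.1, 0.2])
t = h + r
assert transe_energy(h, r, t) == 0.0           # perfect translation
assert transe_energy(h, r, t + 1.0) > 0.0      # corrupted tail is penalized
```

Training pushes the energy of observed triples below the energy of corrupted ones, which is what the margin-based losses later in the paper formalize.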

III. OUR METHOD
The symbols used in this paper are explained in Table I. Without loss of generality, we experiment on two KGs with different structures, expressed as G1 = (E1, R1, A1, V1, RT1, AT1) and G2 = (E2, R2, A2, V2, RT2, AT2). We input the aligned seed set L, the relation triple sets RT1 and RT2, and the attribute triple sets AT1 and AT2 to perform the entity alignment task.

A. Overall Structure

Our method has three parts: the relation triples joint embedding part, the attribute triples joint embedding part, and the co-iterative entity alignment part, as shown in Fig. 1.
In order to explore the influence of relation triples and attribute triples on entity alignment performance, we design two objective function calculation strategies, namely the direct accumulation strategy and the weight assignment strategy. With the direct accumulation strategy, the objective function is defined as:

S = S_r + S_a + S_i

With the weight assignment strategy, the objective function is:

S = β·S_r + (1 − β)·S_a + S_i

where S_r, S_a and S_i respectively represent the scores of the relation triples joint embedding part, the attribute triples joint embedding part and the co-iterative entity alignment part, and β ∈ (0, 1) represents the weight of the relation triples joint embedding part.
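The two strategies can be sketched with a hypothetical helper (assuming the weight applies to the relation part, with 1 − β on the attribute part, as the text describes):

```python
def overall_score(s_r, s_a, s_i, beta=None):
    """Combine the three part scores: direct accumulation when beta is
    None, otherwise the weight assignment strategy with beta in (0, 1)."""
    if beta is None:
        return s_r + s_a + s_i                   # direct accumulation
    return beta * s_r + (1 - beta) * s_a + s_i   # weight assignment

assert overall_score(1.0, 2.0, 3.0) == 6.0
assert overall_score(1.0, 2.0, 3.0, beta=0.5) == 4.5
```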

B. Relation Triples Joint Embedding
In this part, we use the parameter sharing (PS) method with TransE and PTransE for relation triples joint embedding. This part can also be implemented with other joint and knowledge embedding methods. We define the score function of this part as:

S_r = S_PS + S_RE

where S_PS and S_RE represent the scores of the PS joint model and the relation triples embedding method, respectively.

Parameter Sharing Joint Model
Since aligned entities have the same meaning in different KGs, they can intuitively share the same embedding. Formally, for each aligned entity pair (e, e′), we define e ≡ e′. The parameter sharing (PS) model [18] calibrates the entities of G1 and G2 into the same semantic space simply and effectively. Therefore, the score function of this model enforces e ≡ e′ over the aligned seed set L:

S_PS = Σ_{(e, e′) ∈ L} ∥e − e′∥
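Parameter sharing can be implemented by giving both entities of each seed pair the same embedding index, so e and e′ literally share parameters during training. A minimal sketch (the helper name and KG labels are illustrative, not from the released code):

```python
def build_shared_index(entities_kg1, entities_kg2, seeds):
    """Assign one embedding id per entity; aligned seed pairs share an id."""
    index = {}
    next_id = 0
    for e1, e2 in seeds:                         # aligned pair -> same id
        index[("KG1", e1)] = index[("KG2", e2)] = next_id
        next_id += 1
    for kg, ents in (("KG1", entities_kg1), ("KG2", entities_kg2)):
        for e in ents:                           # remaining entities get fresh ids
            if (kg, e) not in index:
                index[(kg, e)] = next_id
                next_id += 1
    return index

idx = build_shared_index(["China", "Beijing"], ["Chine", "Pékin"],
                         [("China", "Chine")])
assert idx[("KG1", "China")] == idx[("KG2", "Chine")]    # shared parameters
assert idx[("KG1", "Beijing")] != idx[("KG2", "Pékin")]  # still unaligned
```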

Relation Triples Embedding
Building on the PS joint model, in this part we use TransE and PTransE to embed the relation triples into a semantic vector space.
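PTransE composes a multi-step relation path into a single translation between entities. A minimal sketch of the additive composition operator (one of several operators proposed for PTransE; the helper is our own illustration):

```python
import numpy as np

def path_energy(h, relations, t, norm=1):
    """Additive path composition: the embedding of a multi-step path
    (r1, ..., rn) is the sum of its relation vectors, and the composed
    path should also translate h to t."""
    p = np.sum(relations, axis=0)
    diff = h + p - t
    return np.sum(np.abs(diff)) if norm == 1 else np.sqrt(np.sum(diff ** 2))

# A two-step path whose composed translation lands exactly on t.
h = np.array([0.0, 0.0])
r1 = np.array([1.0, 0.0])
r2 = np.array([0.0, 1.0])
t = np.array([1.0, 1.0])
assert path_energy(h, [r1, r2], t) == 0.0
```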

C. Attribute Triples Joint Embedding
As in Section III-B, since an attribute triple at = (e, a, v) is composed of an entity, an attribute and an attribute value, there is no multi-step relation path information, so we use the PS method and TransE for attribute triples joint embedding. We define the score function of this part as:

S_a = S_PS + S_AE

where S_PS and S_AE represent the scores of the PS joint model and the attribute triples embedding method, respectively.

Parameter Sharing Joint Model
This is the same as the PS joint model in Section III-B, and its score function takes the same form.

Attribute Triples Embedding
Building on the PS joint model, in this part we use TransE to embed the attribute triples of different KGs into a semantic vector space.
TransE. Similarly, we regard the attribute a in each triple (e, a, v) as a translation from e to v, constantly adjusting e, a and v to make e + a ≈ v as much as possible. Its energy function is therefore defined as:

E(e, a, v) = ∥e + a − v∥

We also use a margin-based score function as the training target, defined as:

S_AE = Σ_{(e,a,v) ∈ AT} Σ_{(e′,a′,v′) ∈ AT−} max(0, γ + E(e, a, v) − E(e′, a′, v′))

where the negative set AT− is derived by randomly replacing one of the three components of an attribute triple (e, a, v).
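A minimal sketch of this margin-based objective with random corruption (NumPy; the helper names and the toy embedding table are our own, not from the released code):

```python
import random
import numpy as np

def margin_loss(emb, triple, neg_triple, gamma=1.0):
    """max(0, gamma + E(pos) - E(neg)) for one positive and one
    corrupted attribute triple, using L1 translation energy."""
    def energy(e, a, v):
        return np.sum(np.abs(emb[e] + emb[a] - emb[v]))
    return max(0.0, gamma + energy(*triple) - energy(*neg_triple))

def corrupt(triple, entities, attributes, values):
    """Build a negative triple by replacing one component at random."""
    e, a, v = triple
    slot = random.randrange(3)
    if slot == 0:
        return (random.choice(entities), a, v)
    if slot == 1:
        return (e, random.choice(attributes), v)
    return (e, a, random.choice(values))

emb = {"e": np.zeros(2), "a": np.ones(2), "v": np.ones(2), "v2": np.zeros(2)}
# Positive triple has zero energy; a bad negative gives zero loss,
# an equally good "negative" incurs the full margin.
assert margin_loss(emb, ("e", "a", "v"), ("e", "a", "v2")) == 0.0
assert margin_loss(emb, ("e", "a", "v"), ("e", "a", "v")) == 1.0
```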

D. Co-iterative Entity Alignment
Based on the relation triples joint embedding part and the attribute triples joint embedding part, we can perform entity alignment according to the semantic distance between entities in the unified semantic space. The semantic distance is defined as:

d(e1, e2) = ∥e1 − e2∥_{L1/L2}, ∀ e1 ∈ E1, e2 ∈ E2

Thus, for an unaligned entity e1 in one KG, we can find the nearest entity ê2 in the other KG:

ê2 = argmin_{e2 ∈ E2} d(e1, e2)

In addition, we use a distance threshold θ and consider ê2 very likely to be the aligned entity of e1 only if d(e1, ê2) < θ; otherwise, ê2 cannot be the aligned entity of e1. Because the relation and attribute triples of the same KG share the same entity set E, the newly aligned entities can help update the relation triples joint embedding space and the attribute triples joint embedding space simultaneously, so that more aligned entities can be found. Therefore, we adopt two iterative strategies for joint embedding and entity alignment from the baseline [12], namely hard alignment (HA) and soft alignment (SA).
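A minimal sketch of the nearest-neighbour search with the distance threshold θ (function and variable names are illustrative):

```python
import numpy as np

def align(e1, candidates, emb1, emb2, theta, norm=1):
    """Find the nearest entity in the other KG; accept it as the
    alignment of e1 only if the distance is below the threshold theta."""
    def dist(a, b):
        d = emb1[a] - emb2[b]
        return np.sum(np.abs(d)) if norm == 1 else np.sqrt(np.sum(d ** 2))
    best = min(candidates, key=lambda e2: dist(e1, e2))
    return best if dist(e1, best) < theta else None

emb1 = {"x": np.array([0.0, 0.0])}
emb2 = {"a": np.array([0.1, 0.0]), "b": np.array([5.0, 5.0])}
assert align("x", ["a", "b"], emb1, emb2, theta=1.0) == "a"
assert align("x", ["a", "b"], emb1, emb2, theta=0.05) is None  # too far
```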
Hard Alignment. This strategy simply applies the PS model to newly aligned entities, i.e., e2 ≡ e1, adds each newly aligned entity pair (e1, e2) to the aligned seed set L, and then updates the relation triples joint embedding space and the attribute triples joint embedding space to perform further entity alignment. Since each newly aligned entity pair is directly added to L, its score function takes the same form as that of the PS joint model.

Soft Alignment. When a newly aligned entity pair is wrong, HA causes an error accumulation problem. The SA strategy therefore assigns a reliability score to each newly aligned entity pair: for each newly aligned pair (e1, e2), a mapping R: (e1, e2) → [0, 1] is defined to calculate its reliability:

R(e1, e2) = σ(k(θ − d(e1, e2)))

where σ(·) is the sigmoid function and k is a hyperparameter satisfying k ∈ ℝ+.
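The reliability mapping can be sketched directly from the formula (parameter names follow the text; the sigmoid is written out with `math.exp`):

```python
import math

def reliability(dist, theta, k):
    """Soft-alignment reliability R(e1, e2) = sigmoid(k * (theta - dist)):
    pairs much closer than theta approach 1, distant pairs approach 0."""
    return 1.0 / (1.0 + math.exp(-k * (theta - dist)))

assert reliability(0.0, 3.0, 1.0) > 0.9                  # very close pair
assert reliability(10.0, 3.0, 1.0) < 0.1                 # distant pair
assert abs(reliability(3.0, 3.0, 1.0) - 0.5) < 1e-9      # exactly at theta
```

The hyperparameter k controls how sharply reliability drops as the distance crosses the threshold θ.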
The score function of the SA strategy with the direct accumulation strategy is defined as:

S = S_r + S_a + Σ_{(e1, e2)} R(e1, e2)·d(e1, e2)

while with the weight assignment strategy it is defined as:

S = β·S_r + (1 − β)·S_a + Σ_{(e1, e2)} R(e1, e2)·d(e1, e2)

where the sum runs over the newly aligned entity pairs.

IV. EXPERIMENTS

B. Evaluation Metrics
Two evaluation metrics are used for the entity alignment task: (1) the average rank of the correct entity (i.e., Mean Rank); (2) the proportion of correct answers ranked in the top 10 and the top 1 (i.e., Hits@10 and Hits@1). Higher Hits@10 and Hits@1 and a lower Mean Rank indicate better alignment.
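Given the (1-based) rank of the correct entity for each test case, both metrics are straightforward to compute; a minimal sketch:

```python
def hits_and_mean_rank(ranks, k=10):
    """ranks: for each test entity, the 1-based rank of the correct
    aligned entity among all candidates. Returns (Hits@k, Mean Rank)."""
    hits_k = sum(1 for r in ranks if r <= k) / len(ranks)
    mean_rank = sum(ranks) / len(ranks)
    return hits_k, mean_rank

ranks = [1, 3, 12, 1, 7]
h10, mr = hits_and_mean_rank(ranks, k=10)
assert h10 == 0.8        # 4 of the 5 ranks fall within the top 10
assert mr == 4.8         # (1 + 3 + 12 + 1 + 7) / 5
```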

C. Datasets
The relation triple dataset in this paper comes from FB15k [16], which contains 14,951 entities, 1,345 relations and 592,213 triples. The attribute triple dataset comes from FB15K-237 [20], a subset of FB15k, whose train, validation and test sets contain 272,115, 17,535 and 20,466 triples, respectively. We name the relation triple dataset RDFB and the attribute triple dataset ADFB.
RDFB. We construct this dataset by randomly dividing the triples in FB15k into two equal-size subsets RT1 and RT2, with an overlap ratio O = 0.5 between RT1 and RT2. The entity set E and relation set R of the two KGs are the same, and the alignment seed set L is selected from the most common entities.
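The split can be sketched as follows, assuming the overlap ratio O is the shared fraction of the full triple set, so each subset holds (1 + O)/2 of all triples (this matches the subset sizes in Table II, e.g. 0.75 × 592,213 ≈ 444,160):

```python
import random

def split_with_overlap(triples, overlap=0.5, seed=0):
    """Split triples into two equal-size subsets whose shared portion is
    `overlap` of the full set; each subset holds (1 + overlap)/2 of it."""
    rng = random.Random(seed)
    shuffled = triples[:]
    rng.shuffle(shuffled)
    n_shared = int(len(shuffled) * overlap)
    shared = shuffled[:n_shared]          # appears in both subsets
    rest = shuffled[n_shared:]
    half = len(rest) // 2
    return shared + rest[:half], shared + rest[half:]

t1, t2 = split_with_overlap(list(range(100)), overlap=0.5)
assert len(t1) == len(t2) == 75                 # (1 + 0.5)/2 of 100
assert len(set(t1) & set(t2)) == 50             # shared fraction = 0.5
```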
ADFB. We integrate the train, validation and test sets of FB15K-237 to form a dataset that contains 310,116 triples, 310,116 attributes and 310,116 attribute values. As with the RDFB dataset, we randomly divide these attribute triples into two equal-size subsets AT1 and AT2, with an overlap ratio O = 0.5 between AT1 and AT2. The entity set E, attribute set A and attribute value set V of the two KGs are the same. The entity set E of RDFB and ADFB is the same. The settings of the datasets are shown in Table II.

TABLE II: DATASET SETTINGS

Dataset | |R|  | |E|   | |RT1|  | |RT2|  | |A|    | |V|    | |AT1|  | |AT2|  | |L|  | O
RDFB    | 1345 | 14951 | 444159 | 444160 | -      | -      | -      | -      | 5000 | 0.5
ADFB    | -    | 14951 | -      | -      | 310116 | 310116 | 232587 | 232587 | -    | 0.5

D. Experiment Settings
For comparison, we follow the experiment settings of the baseline, using stochastic gradient descent (SGD) as the optimizer; the knowledge embeddings E = {e | e ∈ E}, R = {r | r ∈ R}, A = {a | a ∈ A} and V = {v | v ∈ V} are initialized from a normal distribution; the dissimilarity measure in TransE is realized by the L1 norm; all models use the same dimension n = 50 and 3000 epochs, and the embeddings of entities, relations, attributes and attribute values all have the same dimension. For the hyperparameters, we set γ = 1.0, k = 1.0 and learning rate λ = 0.001; for hard alignment θ = 1.0 and for soft alignment θ = 3.0. Soft alignment is performed every 500 iterations starting from the 1000th iteration.

E. Experiment Results
The obtained experiment results are illustrated in Table III, from which we draw the following conclusions: (1) Our method significantly improves entity alignment performance and achieves state-of-the-art results. This shows that using the attribute triples of a KG significantly improves entity alignment performance. (2) The results of the direct accumulation strategy and the weight assignment strategy differ little. This may be because the relation triples and attribute triples are initialized in the same way in knowledge embedding and optimized by the same method, so the difference between them is small. (3) The PTransE-based approach outperforms the corresponding TransE-based approach, indicating that more comprehensive knowledge embedding yields more accurate alignment. (4) The soft alignment strategy is better than the corresponding hard alignment strategy, possibly because the hard alignment strategy causes the error accumulation problem.

Furthermore, we show the results of the baseline method and our method under different iteration numbers in Table IV and Fig. 2, and find that: (1) Regardless of the iteration number, our method is better than the corresponding baseline method. Even the HA model of our method is better than the SA model of the baseline method, and the results of our method at the 1000th epoch are better than those of the baseline method at the 3000th epoch. This proves that adding attribute triples is superior to changing the iterative alignment strategy or increasing the iteration number. (2) The performance of all methods improves as the number of iterations increases. From the 500th to the 3000th iteration, the metrics of the soft alignment strategy show a relatively stable trend, while the hard alignment metrics are unstable. This may be due to hard alignment's error accumulation problem. To balance performance and efficiency, we finally report the results at the 3000th iteration.

V. CONCLUSION AND FUTURE WORK
This paper makes the first attempt towards combining relation and attribute triples for the entity alignment task, by dividing a KG into relation triples and attribute triples and using the parameter sharing joint method and translation-based knowledge embedding methods to embed them jointly. Moreover, to verify the influence of the relation and attribute triples' embedding results on performance, we design the direct accumulation strategy and the weight assignment strategy. The experimental results show that our method obviously improves entity alignment performance and achieves state-of-the-art results. The source code for this paper can be obtained from https://github.com/ChengRui536/RAKRL.
In the future, we will explore the following research directions: (1) consideration of the rich external information in KGs (such as descriptive text) for entity alignment; (2) application of our model in real medical knowledge graph entity alignment tasks; (3) analyzing the performance of other effective knowledge embedding models with the adoption of our method.