
Measuring the similarity of concepts is a basic task in artificial intelligence, with applications such as image retrieval, collaborative filtering, and public opinion guidance. As a powerful tool for expressing uncertain concepts, the cloud model (CM) underlies the similarity measure based on cloud model (SMCM), which is widely used to measure the similarity between two concepts. However, current studies on SMCM have two main limitations: (1) similarity measures based on conceptual intension lack an interpretable way of merging the numerical characteristics and cannot discriminate between some different concepts; (2) similarity measures based on conceptual extension are unstable and inefficient. To address these problems, this paper proposes an uncertain distribution-based similarity measure of cloud model (UDCM). By analyzing the definition of the CM, we propose a new notion, complete uncertainty, which includes first-order and second-order uncertainty, to calculate the uncertainty more accurately. Then, based on the difference between the complete uncertainty of two concepts, we introduce the computing process of UDCM and some of its properties. Finally, we show its advantages by comparison with other methods and verify its validity by experiments.

Judging the similarity of concepts is basic to human cognition and is also a fundamental task in artificial intelligence. It plays a crucial role in semantic information retrieval systems [

From the perspective of conceptual intension, the similarity of the numerical characteristics can depict the similarity of concepts. However, because different numerical characteristics carry different meanings in expressing the complex uncertainty of a concept, a reasonable method of merging the numerical characteristics to measure similarity is still lacking.

Figure

By contrast, from the perspective of conceptual extension, the similarity measure is computed by random realization, which directly reflects the similarity through abundant samples of the concepts. The more samples are generated, the more accurate the SMCM is, so acquiring an accurate SMCM requires excessive computing time. Besides, because of the random realization, the results of SMCM differ from run to run, which can be illustrated by the following example:

Shooting results of three shooters.

Supposing two cloud models

The above analyses show that SMCM still needs further study. To address these problems, this paper proposes a new notion called complete uncertainty to depict the whole uncertainty in the process from the numerical characteristics to the conceptual extension. Then, a new SMCM is presented based on complete uncertainty, which reflects the similarity of the uncertain distributions of two concepts. Compared with the SMCM based on extension, the new SMCM gives an invariable result. Compared with the SMCM based on intension, it merges the numerical characteristics in a more reasonable way. Moreover, because the new SMCM reflects the complete uncertainty of the CM, it can acquire more accurate similarity results between two concepts.

The remainder of this paper is organized as follows. The related definitions of the CM and current SMCM are introduced in Section

Results of 10 similarity measures of

Run | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
Similarity measure | 0.9858 | 0.9766 | 0.9522 | 0.9912 | 0.9832 | 0.9732 | 0.9655 | 0.9557 | 0.9321 | 0.9815 |
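The run-to-run variation in the ten results above can be summarized with a few descriptive statistics. The following is a minimal sketch that uses only the ten values listed in the table; the variable names are illustrative and the choice of statistics is an assumption, not part of the original method.

```python
import statistics

# The ten similarity values reported in the table above, in order.
runs = [0.9858, 0.9766, 0.9522, 0.9912, 0.9832,
        0.9732, 0.9655, 0.9557, 0.9321, 0.9815]

mean_similarity = statistics.mean(runs)      # average over the ten runs
spread = max(runs) - min(runs)               # gap between best and worst run
sample_std = statistics.stdev(runs)          # sample standard deviation

print(f"mean  = {mean_similarity:.4f}")      # mean  = 0.9697
print(f"range = {spread:.4f}")               # range = 0.0591
print(f"std   = {sample_std:.4f}")
```

A range of about 0.06 between repeated runs on the same pair of concepts illustrates why a measure based on random realization is considered unstable.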

In this section, we review related concepts of the CM and current methods for measuring the similarity of cloud models.

Let

As an important form of the CM, the Gaussian cloud model is widely applied owing to the universality of the Gaussian distribution, and this paper discusses only the Gaussian cloud model. The Gaussian cloud model introduces three numerical characteristics including

In cognitive computing, cloud drops are called conceptual extension, i.e., samples of a concept. The numerical characteristics, expectation

Characteristic curves of
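How cloud drops arise from the three numerical characteristics can be sketched in code. The example below assumes the standard forward cloud transformation (FCT) for the Gaussian cloud model with expectation Ex, entropy En, and hyper entropy He: draw En' from N(En, He^2), draw the drop x from N(Ex, En'^2), and assign the certainty degree exp(-(x - Ex)^2 / (2 En'^2)). The function name `forward_cloud`, its parameters, and the example values are illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def forward_cloud(ex, en, he, n, seed=None):
    """Generate n cloud drops (x, mu) for a Gaussian cloud model (Ex, En, He).

    Sketch of the standard forward cloud transformation; names are hypothetical.
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        # Second-order randomness: the entropy itself is a random variable.
        en_prime = rng.gauss(en, he)
        # First-order randomness: the drop is drawn with the sampled entropy.
        x = rng.gauss(ex, abs(en_prime))
        if en_prime == 0.0:
            mu = 1.0  # degenerate case: the drop coincides with Ex
        else:
            mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops


# Illustrative parameters (assumed, not from the paper).
drops = forward_cloud(ex=0.0, en=1.0, he=0.1, n=5000, seed=42)
sample_mean = sum(x for x, _ in drops) / len(drops)
```

Because every call draws fresh random numbers unless a seed is fixed, any similarity computed from such drops changes between runs, which is the instability discussed for extension-based measures.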

To describe the similarity between two cloud concepts, many SMCMs have been proposed. Generally speaking, a suitable similarity measure should yield correct conclusions in specific situations and should offer discriminability, efficiency, stability, and interpretability. Zhang et al. [

A list of similarity measures of CM.

Criterion | CS | LICM | MCM | ECM |
---|---|---|---|---|
Discriminability | High | Low | Low | Low |
Efficiency | Low | High | High | High |
Stability | Low | High | High | High |
Interpretability | High | Low | Medium | Medium |

From Table

From discussion in Section

Two uncertainties and relationships among numerical characteristics.

Based on formula (

Let

In Definition

Figure

Next, we try to calculate

Firstly, we should find the relationship between membership functions of two fuzzy sets when their elements are equal. For each

Then,

So, (

It is obvious that membership functions of both fuzzy sets

The remainder is to calculate the integral in formula (

Function

If two intervals

In order to explain calculation of Definition

We divide

Then, we calculate

There is a special situation of Definition

Illustration of Definition

Results of

−0.1000 | 0.1222 | 0.3444 | 0.5667 | 0.7889 | 1.0111 | 1.2333 | 1.4556 | 1.6778 | 1.9000 |
---|---|---|---|---|---|---|---|---|---|
0.4462 | 0.1638 | 0.0089 | 8.9e-4 | 0.0442 | 3.1e-7 | 5.8e-4 | 0.0048 | 0.0131 | 0.0233 |

UDCM with variation of

Let

Since

Hence, the similarity of

The UDCM of

To show that UDCM has a high ability to distinguish two different concepts, we have the following theorem.

Let

To prove this lemma, we must employ measure theory. Sufficiency: suppose that

Necessity: suppose

Let

Obviously, sufficiency can be proved by Theorem

In order to demonstrate high discriminability of UDCM clearly as Theorem

Similarity values of LICM, ECM, MCM, and UDCM.

LICM | ECM | MCM | UDCM | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

1.0000 | 0.9986 | 0.9601 | 1.0000 | 0.4898 | 0.5714 | 1.0000 | 0.6065 | 0.8732 | 1.0000 | 1.0000 | 0.1634 | 0.4000 | 0.2149 | |||

— | 1.0000 | 0.9986 | 0.9601 | — | 1.0000 | 0.4898 | 0.2665 | — | 1.0000 | 0.6190 | 0.6065 | — | 1.0000 | 0.0851 | 0.1230 | |

— | — | 1.0000 | 0.9530 | — | — | 1.0000 | 0.5714 | — | — | 1.0000 | 0.8732 | — | — | 1.0000 | 0.1091 | |

— | — | – | 1.0000 | — | — | — | 1.0000 | — | — | — | 1.0000 | — | — | — | 1.0000 |

An appropriate SMCM is important for time series classification. BCT preserves uncertain features while reducing the dimensions. CM has been applied in many related domains of time series [

As shown in Figure

Comparison among SMCM.

To verify that the similarity results accord with the uncertainty distribution of concepts, we measure the similarity of four shooters’ performance. We suppose cloud models of four shooters’ performance

Score statistics of 100 shots simulated by FCT.

0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 2 | 2 | 4 | 3 | 12 | 17 | 19 | 29 | 12 |
42 | 6 | 7 | 5 | 9 | 9 | 6 | 5 | 5 | 5 | 1 |
37 | 2 | 5 | 4 | 2 | 9 | 9 | 13 | 10 | 9 | 0 |
2 | 7 | 4 | 10 | 11 | 11 | 11 | 18 | 10 | 10 | 6 |
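As a quick consistency check on the score statistics above, the counts in each row can be summed and converted into a mean score. This is a minimal sketch that assumes the header row lists the scores 0 to 10 and that the four data rows correspond to the four shooters in order; the variable names are illustrative.

```python
# Scores 0..10 (assumed to be the header row of the table above).
scores = list(range(11))

# Hit counts per score, one row per shooter, in table order.
counts = [
    [0, 0, 2, 2, 4, 3, 12, 17, 19, 29, 12],
    [42, 6, 7, 5, 9, 9, 6, 5, 5, 5, 1],
    [37, 2, 5, 4, 2, 9, 9, 13, 10, 9, 0],
    [2, 7, 4, 10, 11, 11, 11, 18, 10, 10, 6],
]

# Each row should account for all 100 simulated shots.
totals = [sum(row) for row in counts]

# Mean score per shooter: count-weighted average of the scores.
means = [
    sum(s * c for s, c in zip(scores, row)) / sum(row)
    for row in counts
]

for i, (total, mean) in enumerate(zip(totals, means), start=1):
    print(f"row {i}: shots = {total}, mean score = {mean:.2f}")
# row 1: shots = 100, mean score = 7.65
# row 2: shots = 100, mean score = 2.82
# row 3: shots = 100, mean score = 3.83
# row 4: shots = 100, mean score = 5.66
```

The mean alone does not capture the shape of each histogram, which is why the comparison in the text relies on the whole distribution rather than on a single summary value.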

Results of shooting and histograms of score statistics.

Table

Comparison among different similarity measures.

Similarity measure | |||
---|---|---|---|

Extension distribution | 0.2586 | ||

LICM | 0.6111 | 0.9820 | |

ECM | 0.4809 | 0.7780 | |

UDCM | 0.1614 | 0.1714 |

In multicriteria group decision-making (MCDM), linguistic variables are a good choice for expressing personal judgments. Various linguistic variables are widely applied in different fields, e.g., 2-tuple linguistic model [

(continued from Example

To acquire a more accurate similarity, Wang et al. [

FDCM with variation of

Stability of FDCM and UDCM.

Comparison of FDCM and UDCM in MCDM.

Method | Final group decision matrix | Overall score of different | Ranking | Time cost | |||||
---|---|---|---|---|---|---|---|---|---|

FDCM | (4.864 0.352 0.024) | (3.922 0.369 0.025) | (4.536 0.343 0.023) | 5000 | 3.152 | 3.782 | 4.272 | ||

(5.420 0.386 0.026) | (5.465 0.344 0.023) | (4.165 0.333 0.022) | 50000 | 3.169 | 3.780 | 4.289 | 103.117s | ||

(4.725 0.373 0.025) | (6.295 0.378 0.025) | (5.519 0.352 0.023) | 500000 | 3.175 | 3.782 | 4.285 | |||

UDCM | (5.582 0.356 0.024) | (3.437 0.380 0.025) | (5.469 0.347 0.023) | 5000 | 3.294 | 3.488 | 3.888 | ||

(4.411 0.381 0.025) | (5.073 0.353 0.024) | (4.238 0.312 0.021) | 50000 | 3.286 | 3.497 | 3.862 | 17.423s | ||

(4.045 0.366 0.025) | (5.925 0.361 0.024) | (4.693 0.323 0.021) | 500000 | 3.286 | 3.497 | 3.871 |

The similarity of concepts is a fundamental topic in uncertain artificial intelligence. By means of FCT and BCT, the CM realizes the bidirectional cognitive transformation between the intension and extension of a concept. Furthermore, the CM reflects the uncertainty of a qualitative concept itself and reveals the objective relationship between probability and fuzziness in the uncertain concept. Distribution is a significant form of expression that is widely used to describe uncertain phenomena. Based on this, we propose a new similarity measure, UDCM, and introduce its calculation in detail. Because it employs complete uncertainty, UDCM yields similarity results that accord with the uncertain distribution and can therefore offer useful guidance in comprehensive evaluation. Besides, UDCM has the merits of discriminability and stability and is an effective tool for cognitive computing. Finally, UDCM is also a framework for SMCM: employing different forms of second-order uncertainty yields different results. In the future, the selection of uncertainty forms for different situations also deserves study.

The data used to support the findings of this study are included within the article.

The authors declare that they have no conflicts of interest.

This work was supported by Innovation and Exploration Project of Guizhou Province (QKHPTRC [2017]5727–06), PhD Initiation Fund of Zunyi Normal University (ZSBS[2019]04), and PhD Training Program of Chongqing University of Posts and Telecommunication (no. BYJS201902).