How to compute text similarity with the simhash function
Posted in: Python Q&A
2024-07-23 Bounty: 20 credits
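I used the Simhash module to compute the Hamming distance between two texts, but I don't know how to use simhash's built-in functions to go one step further and turn that into a similarity score.
If I have to write a custom function instead, how should the calculation be done? Any help would be appreciated.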
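from simhash import Simhash

# Build a fingerprint for each text
hash1 = Simhash('what is your real answer he asked the criminal when they firstly meet each other in ')
hash2 = Simhash('the criminal did not tell anybody his name and motivation when they')

# Printing a Simhash object only shows its default repr
print('hash1', hash1)
print('hash2', hash2)

# Hamming distance: number of differing bits between the two fingerprints
print('hash1.distance(hash2):', hash1.distance(hash2))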
>>--hash1 <simhash.Simhash object at 0x000002026372E310>
>>--hash2 <simhash.Simhash object at 0x000002026372E730>
>>--hash1.distance(hash2): 28
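One common way to turn a Hamming distance into a similarity score is to normalize it by the fingerprint length: similarity = 1 - distance / bits. The sketch below is only an illustration under assumptions, not a function shipped with the simhash package: `hamming_distance` and `similarity_from_distance` are hypothetical helper names, and the default of 64 bits assumes the fingerprints were built with the package's default width. As far as I know the package exposes `distance()` but no ready-made similarity method, so treat that as an assumption as well.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two integer fingerprints differ."""
    return bin(a ^ b).count("1")


def similarity_from_distance(distance: int, bits: int = 64) -> float:
    """Map a Hamming distance in [0, bits] to a similarity in [0.0, 1.0].

    bits is the fingerprint width; 64 is assumed here as the default.
    """
    if not 0 <= distance <= bits:
        raise ValueError("distance must be between 0 and bits")
    return 1.0 - distance / bits


# Using the distance reported in the post (28 differing bits out of 64)
print(similarity_from_distance(28))  # 0.5625
```

With the objects from the post, the call would be `similarity_from_distance(hash1.distance(hash2))`. A score of 0.5625 is close to what two unrelated texts tend to produce, since random 64-bit fingerprints differ in about half their bits; near-duplicates are usually identified by a small distance threshold rather than by the normalized score alone.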