  • Language: Java
  • License: Apache License 2.0
  • Created about 13 years ago
  • Updated over 3 years ago

Mmseg Analysis for Elasticsearch

The Mmseg Analysis plugin integrates the Lucene mmseg4j analyzer (http://code.google.com/p/mmseg4j/) into Elasticsearch and supports custom dictionaries.

The plugin ships with three analyzers (mmseg_maxword, mmseg_complex, mmseg_simple), three tokenizers with the same names (mmseg_maxword, mmseg_complex, mmseg_simple), and one token filter (cut_letter_digit).
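
One way to compare the three analyzers is to run the same text through Elasticsearch's _analyze API. The sketch below assumes a 5.x node at http://localhost:9200 with the plugin installed. The helper names analyze_body and mmseg_analyze are made up for this example, and the body builder does no JSON escaping, so keep quotes and backslashes out of the text.

```shell
# Hypothetical helpers; assumes Elasticsearch 5.x at $ES_URL with the mmseg plugin installed.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the JSON body for an _analyze request.
# $1 = analyzer name, $2 = text (no JSON escaping is done here).
analyze_body() {
    printf '{"analyzer":"%s","text":"%s"}\n' "$1" "$2"
}

# Send the text through the given analyzer and print the raw JSON response.
mmseg_analyze() {
    curl -s -XPOST "$ES_URL/_analyze" \
         -H 'Content-Type: application/json' \
         -d "$(analyze_body "$1" "$2")"
}
```

Calling mmseg_analyze mmseg_maxword "中国渔船" and then mmseg_analyze mmseg_complex "中国渔船" should show how the tokens differ between analyzers.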

Versions

Mmseg version   ES version
master          5.x -> master
5.5.2           5.5.2
5.4.3           5.4.3
5.3.2           5.3.2
5.2.2           5.2.2
5.1.2           5.1.2
1.10.1          2.4.1
1.9.5           2.3.5
1.8.1           2.2.1
1.7.0           2.1.1
1.5.0           2.0.0
1.4.0           1.7.0
1.3.0           1.6.0
1.2.1           0.90.2
1.1.2           0.20.1
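
For scripted installs, the table can be turned into a small lookup. This is only a sketch: the function name mmseg_version_for_es is made up for this example, it covers just the tagged releases listed above (not the master row), and any Elasticsearch version missing from the table is reported as an error rather than guessed.

```shell
# Hypothetical helper: map an Elasticsearch version to the plugin release listed in the table.
# $1 = Elasticsearch version, e.g. 2.4.1
mmseg_version_for_es() {
    case "$1" in
        5.5.2|5.4.3|5.3.2|5.2.2|5.1.2) echo "$1" ;;   # 5.x releases share the ES version number
        2.4.1)  echo "1.10.1" ;;
        2.3.5)  echo "1.9.5" ;;
        2.2.1)  echo "1.8.1" ;;
        2.1.1)  echo "1.7.0" ;;
        2.0.0)  echo "1.5.0" ;;
        1.7.0)  echo "1.4.0" ;;
        1.6.0)  echo "1.3.0" ;;
        0.90.2) echo "1.2.1" ;;
        0.20.1) echo "1.1.2" ;;
        *) echo "no listed mmseg release for Elasticsearch $1" >&2; return 1 ;;
    esac
}
```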

Package

mvn package

Install

Download the plugin from https://github.com/medcl/elasticsearch-analysis-mmseg/releases, unzip it, and place it in Elasticsearch's plugins folder.

Alternatively, install it with the plugin command: ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-mmseg/releases/download/v5.5.2/elasticsearch-analysis-mmseg-5.5.2.zip
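
If the version changes often, the download URL can be built from the version number. The helper below is a sketch with a made-up name (mmseg_release_url). It copies the pattern of the 5.5.2 link above and assumes other releases follow the same file naming, which this README does not confirm.

```shell
# Hypothetical helper: build the release URL for a plugin version.
# Assumption: every release follows the naming of the 5.5.2 link; only 5.5.2 is confirmed here.
# $1 = plugin version, e.g. 5.5.2
mmseg_release_url() {
    printf 'https://github.com/medcl/elasticsearch-analysis-mmseg/releases/download/v%s/elasticsearch-analysis-mmseg-%s.zip\n' "$1" "$1"
}
```

It would be used as ./bin/elasticsearch-plugin install "$(mmseg_release_url 5.5.2)".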

Mapping Configuration

Here is a quick example:

1. Create an index

curl -XPUT http://localhost:9200/index

2. Create a mapping

curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
    "properties": {
        "content": {
            "type": "text",
            "term_vector": "with_positions_offsets",
            "analyzer": "mmseg_maxword",
            "search_analyzer": "mmseg_maxword"
        }
    }
}'

3. Index some documents

curl -XPOST http://localhost:9200/index/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'

curl -XPOST http://localhost:9200/index/fulltext/2 -d'
{"content":"公安部:各地校车将享最高路权"}
'

curl -XPOST http://localhost:9200/index/fulltext/3 -d'
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
'

curl -XPOST http://localhost:9200/index/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'

4. Query with highlighting

curl -XPOST http://localhost:9200/index/fulltext/_search  -d'
{
    "query" : { "term" : { "content" : "中国" }},
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "content" : {}
        }
    }
}
'
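
A term query is not analyzed, so it only matches when the query string is exactly one of the tokens produced at index time. A match query runs the query text through the field's search analyzer first, which is usually more forgiving for longer phrases. The sketch below builds such a body with the same highlight tags; match_query_body is a made-up helper name and it does no JSON escaping.

```shell
# Hypothetical helper: print a search body using a match query plus the highlight settings above.
# $1 = field name, $2 = query text (no JSON escaping is done here).
match_query_body() {
    printf '{"query":{"match":{"%s":"%s"}},"highlight":{"pre_tags":["<tag1>","<tag2>"],"post_tags":["</tag1>","</tag2>"],"fields":{"%s":{}}}}\n' "$1" "$2" "$1"
}
```

For example: curl -XPOST http://localhost:9200/index/fulltext/_search -d "$(match_query_body content 中国渔船)"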

Here is the query result:


{
    "took": 14,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 2,
        "hits": [
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "4",
                "_score": 2,
                "_source": {
                    "content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"
                },
                "highlight": {
                    "content": [
                        "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首 "
                    ]
                }
            },
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "3",
                "_score": 2,
                "_source": {
                    "content": "中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"
                },
                "highlight": {
                    "content": [
                        "均每天扣1艘<tag1>中国</tag1>渔船 "
                    ]
                }
            }
        ]
    }
}
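
To check quickly which terms were highlighted, the response can be piped through a small filter. This sketch assumes the <tag1> tags from the query above and a grep that supports -o (GNU and BSD grep both do). The function name highlighted_terms is made up for this example.

```shell
# Hypothetical helper: read a search response on stdin and print each <tag1>-highlighted term,
# one per line. Assumes grep supports -o (not required by POSIX).
highlighted_terms() {
    grep -o '<tag1>[^<]*</tag1>' | sed 's/<[^>]*>//g'
}
```

A search response can be piped straight into it, for example by appending | highlighted_terms to the curl command from step 4.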

Have fun.
