  • Stars: 267
  • Rank: 153,621 (Top 4%)
  • Language: Java
  • License: Apache License 2.0
  • Created: over 9 years ago
  • Updated: about 2 years ago

Repository Details

Superword is an open source Java project dedicated to the study of English word analysis and assisted reading. It covers, among other things, spelling similarity, definition similarity, pronunciation similarity, spelling transformation rules, prefixes and dynamic prefixes, suffixes and dynamic suffixes, roots, compound words, and assisted reading of text, web pages, and books.
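
To make the notion of spelling similarity concrete, here is a minimal sketch that scores two words by Levenshtein edit distance. This is an assumed, commonly used measure chosen for illustration; the text above does not say which algorithm Superword actually uses, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Illustrative only: Levenshtein edit distance as one possible measure of spelling similarity. */
final class SpellingSimilaritySketch {

    /** Returns the minimum number of single-character insertions, deletions, or substitutions. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Maps the distance to a score in [0, 1]; 1.0 means identical spelling, ignoring case. */
    static double similarity(String a, String b) {
        String x = a.toLowerCase(Locale.ROOT);
        String y = b.toLowerCase(Locale.ROOT);
        int longest = Math.max(x.length(), y.length());
        return longest == 0 ? 1.0 : 1.0 - (double) editDistance(x, y) / longest;
    }

    public static void main(String[] args) {
        // "affect" and "effect" differ by a single substitution.
        System.out.println(editDistance("affect", "effect"));
        System.out.println(similarity("affect", "effect"));
    }
}
```

Normalizing by the longer word keeps scores comparable across word lengths; a real system would likely combine such a score with pronunciation or definition similarity.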

Donate to support Superword

Getting Started:

1. Install JDK 8

    Add the $JAVA_HOME/bin directory to the $PATH environment variable and make sure the java command works:
    
    java -version
        java version "1.8.0_60"
        
    Tip:
    JDK 8 is required; JDK 7 will not work.
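
The same version check can be done from Java itself. The sketch below parses the `java.version` system property and rejects anything older than Java 8. The class and method names are hypothetical. The instructions above only state that JDK 7 does not work, so the sketch assumes newer JDKs should not be rejected, although they are not confirmed to work either.

```java
/** Illustrative helper: extracts the major Java version from a java.version string. */
final class JavaVersionCheck {

    /** Examples: "1.8.0_60" gives 8, "1.7.0_80" gives 7, "11.0.2" gives 11, "9-ea" gives 9. */
    static int majorVersion(String version) {
        String[] parts = version.split("[._+-]");
        int first = Integer.parseInt(parts[0]);
        // Before Java 9 the scheme was 1.<major>.<minor>, so the real major is the second field.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        int major = majorVersion(version);
        if (major < 8) {
            System.err.println("Java " + version + " is too old; JDK 8 is required.");
            System.exit(1);
        }
        System.out.println("Java " + version + " (major " + major + ") detected.");
    }
}
```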
        
2. Get the superword source code

    git clone https://github.com/ysc/superword.git
    cd superword
    
    We suggest that you register a GitHub account, fork the superword project to your own account,
    and then clone the source code from your fork.
    This makes it easy to use GitHub pull requests for collaborative development.
    
3. Configure the MySQL database

    MySQL character encoding: UTF-8
    Server IP address: 127.0.0.1
    Server port: 3306
    Database: superword
    User name: root
    Password: (empty)
    
    Note:
    If your MySQL password is not empty, set your password on line 49 of the file
    src/main/java/org/apdplat/superword/tools/MySQLUtils.java.
    
    Execute the scripts from the MySQL command line:
    source src/main/resources/mysql/superword.sql
    source src/main/resources/mysql/word_definition.sql
    source src/main/resources/mysql/word_pronunciation.sql
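
Before starting the application, it can help to confirm that the database is reachable with the settings listed in this step. The following is a minimal sketch, not the project's MySQLUtils class. The class name is hypothetical, and it assumes the MySQL Connector/J driver is on the classpath and that the `useUnicode` and `characterEncoding` URL properties are an acceptable way to request UTF-8.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Illustrative connectivity check; not the project's MySQLUtils class. */
final class MySqlConnectionSketch {
    static final String HOST = "127.0.0.1";
    static final int PORT = 3306;
    static final String DATABASE = "superword";
    static final String USER = "root";
    static final String PASSWORD = ""; // empty by default; change it if your MySQL password is set

    /** Builds a Connector/J style JDBC URL that requests UTF-8 encoding. */
    static String jdbcUrl() {
        return "jdbc:mysql://" + HOST + ":" + PORT + "/" + DATABASE
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Assumption: the MySQL Connector/J driver jar is available on the classpath.
        try (Connection conn = DriverManager.getConnection(jdbcUrl(), USER, PASSWORD)) {
            System.out.println("Connected to database: " + conn.getCatalog());
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

If the connection fails, check that the MySQL server is running on port 3306 and that the superword database was created by the scripts in this step.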

4. Run the project

    mvn jetty:run

5. Use the system

    Open the following address in a browser: http://localhost:8080/index.jsp
    Note: The first access to the system may be a little slow, so please be patient.

Engaging in complex language behavior requires various kinds of knowledge of language:

Phonetics and Phonology — knowledge about linguistic sounds
Morphology — knowledge of the meaningful components of words
Syntax — knowledge of the structural relationships between words
Semantics — knowledge of meaning
Pragmatics — knowledge of the relationship of meaning to the goals and intentions of the speaker
Discourse — knowledge about linguistic units larger than a single utterance

Resource downloads

4000 Essential English Words

Audio files from the Merriam-Webster dictionary covering 11053 words: Download address

Audio files from the Oxford dictionary covering 31222 words: Download address

HTML pages from the Oxford dictionary covering 33376 words: Download address, Parse Program

HTML pages from the Merriam-Webster dictionary covering 59809 words: Download address, Parse Program

HTML pages from the old version of the iCIBA dictionary covering 61809 words: Download address, Parse Program

HTML pages from the new version of the iCIBA dictionary covering 63777 words: Download address, Parse Program

HTML pages from the youdao dictionary covering 63789 words: Download address, Parse Program

249 PDF e-books related to the IT field and software development: it-software-domain.zip

Related articles

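A method for estimating vocabulary size using a random-sampling gradient descent algorithm

How to learn English from movies correctly and quickly

Building your own personalized search engine with Java 8

Word usage analysis of 192 software books

2000 high-frequency special words from the software development field, with selected example sentences

A study of the transformation rules between English words that sound or look alike

Distinguishing 986 groups of synonyms

3211 words and their antonyms

13054 words and their number of senses

3054 phrases and idioms

1208 compound words

Grouping words by 76 fine-grained parts of speech

Analysis of the role of 996 roots in the vocabularies of the major exam syllabuses

Analysis of the role of 113 prefixes in the vocabularies of the major exam syllabuses

Analysis of the role of 151 suffixes in the vocabularies of the major exam syllabuses

Analysis of standalone words in the major exam syllabus vocabularies that have no root, prefix, or suffix

Analysis of words in the major exam syllabus vocabularies that have a prefix, a suffix, and a root

1011 hyphenated compound words found in the JDK source code and more than 200 software books

Mastering 2898 words thoroughly with 1691 selected sentences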

https://travis-ci.org/ysc/superword

More Repositories

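1. QuestionAnsweringSystem (Java, 1,957 stars): QuestionAnsweringSystem is a question answering system implemented in Java that automatically analyzes questions and returns candidate answers.

2. word (Java, 1,812 stars): A distributed Chinese word segmentation component in Java (the word segmenter).

3. cws_evaluation (Lex, 948 stars): The Java open source project cws_evaluation evaluates and compares the segmentation quality of Chinese word segmenters.

4. APDPlat (JavaScript, 521 stars): APDPlat is short for Application Product Development Platform, an application-level product development platform.

5. data-generator (Java, 278 stars): If you work in big-data BI and want to compare how solutions such as MySQL, GreenPlum, Elasticsearch, Hive, Spark SQL, Presto, Impala, Drill, HAWQ, Druid, Pinot, Kylin, ClickHouse, and Kudu perform, you need a standard data set for testing. This open source project generates such standard data.

6. search (Java, 225 stars): A meta search engine.

7. HtmlExtractor (Java, 157 stars): HtmlExtractor is a Java component for precise, template-based extraction of structured information from web pages.

8. jsearch (Java, 92 stars): jsearch is a high-performance full-text search toolkit.

9. rank (Java, 66 stars): rank is an SEO tool for analyzing how a website is indexed and ranked by search engines.

10. realtime-log (Java, 30 stars): Real-time logging for microservices.

11. short-text-search (Java, 18 stars): A custom-built, precise short-text search service.

12. word_web (Java, 17 stars): Centralized management of the word segmenter's resources through a web server.

13. counter (Java, 17 stars): Atomic counters and daily API call limits in a distributed environment.

14. high-availability (Java, 17 stars): Ensuring continuous high availability, high performance, and load balancing of services.

15. baby-typing-game (HTML, 10 stars): A typing game for children aged 2 to 6.

16. borm (Java, 10 stars): Object persistence for big data.

17. ysc.github.com (CSS, 2 stars): ysc.github.com

18. luke (Java, 1 star): Automatically exported from code.google.com/p/luke

19. AudiobooksForKids (1 star): A collection of best-selling audiobooks for kids, from timeless classics to popular series.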