Discover China's Leading Open Source Projects: Explore top-notch open source initiatives hailing from the vibrant tech community of China.
lw-lin/CoolplaySpark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
geekyouth/SZT-bigdata
Shenzhen Metro big-data passenger flow analysis system 🚇🚄🌟
smallnest/C1000K-Servers
⚡ High performance websocket servers implemented by Spray-can, Netty, undertow, jetty, Vert.x, Grizzly, node.js and Go. It supports 1,200,000 active websocket connections
jacksu/utils4s
A collection of test cases and related materials gathered while using Scala and Spark
tminglei/slick-pg
Slick extensions for PostgreSQL
XiaoMi/MiNLP
XiaoMi Natural Language Processing Toolkits
CSUG/HouseMD
HouseMD is an awesome diagnosing tool better than BTrace
MethodJiao/PkpmSpark
Awesome 3D data mining, data analysis & recommendation
xubo245/SparkLearning
Learning Apache Spark, including code and data. Most parts can run locally.
Centaur/repox
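Make sbt more responsive
luochana/News_recommend
A Spark-based news recommendation system, including a crawler project, a web site, and a Spark recommendation system
Ldpe2G/DeepLearningForFun
Implementation of some interesting ideas of deep learning.
baolibin/Bigdata
A learning path through big-data processing technologies (continuously updated...). Bigdata notes --> slowly but surely~ The technologies covered include offline processing, real-time processing, OLAP, and so on, such as hadoop, spark, flink, hive, hbase, oozie..., as well as big-data projects such as user profiling and data warehousing. Anyone interested is welcome to join the development...
scalad/LayIM
An instant messaging (IM) application based on HTML5 WebSocket. It uses Gradle to integrate Scala, SpringBoot, Spring MVC, Mybatis, Redis, and more, with the LayIm framework on the front end
zhengruifeng/spark-libFM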
An implementation of Factorization Machines (LibFM)
LeechanX/Netflix-Recommender-with-Spark
An offline and real-time recommendation system for Netflix movies, based on Apache Spark
titicaca/spark-iforest
Isolation Forest on Spark
smallnest/douban-recommender
A Douban movie recommendation system implemented with Spark ML
neoremind/kraps-rpc
An RPC framework leveraging the Spark RPC module
Qihoo360/XSQL
Unified SQL Analytics Engine Based on SparkSQL
cookeem/CookIM
Distributed WebSocket-based web chat application built on Akka.
daizikaikou/learningSpark
Scala code written while learning Spark; the tool used is IDEA2017.1.6. Stars welcome
SidneyXu/AndroidDemoIn4Languages
Comparison among Java, Groovy, Scala, Kotlin in Android Development.
eryk/squant
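SQuant is a quantitative development toolkit written in Scala. It provides out-of-the-box A-share stock data and forex data (as a docker image), along with an efficient backtesting framework and trading module, making quantitative investment research convenient for Java/Scala enthusiasts. QQ group: 281599099; WeChat official account: Python量化交易实战. Yes, I have switched to Python...
LinMingQiang/sparkstreaming
💥 🚀 Wraps sparkstreaming to adjust batch time dynamically (computation runs as soon as data arrives); 🚀 supports adding and removing topics at runtime; 🚀 wraps sparkstreaming 1.6 - kafka 010 to support SSL.
qifun/stateless-future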
Asynchronous programming in fully featured Scala syntax.
yaooqinn/spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apache Kyuubi
aliyun/aliyun-emapreduce-datasources
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
JerryLead/SparkLearning
Learning to write Spark examples
youzan/gatling-dubbo
A gatling plugin for running load tests on Apache Dubbo (https://github.com/apache/incubator-dubbo) and other parts of the Java ecosystem.
TianLangStudio/DataXServer
Provides remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) for DataX (https://github.com/alibaba/DataX)
LoveLonelyTime/Bergamot
An exquisite superscalar RV32GC processor.
qindongliang/streaming-offset-to-zk
A small project for manually managing offsets in ZooKeeper when integrating Spark Streaming with Kafka
liguohua-bigdata/simple-flink
xiaogp/recsys_spark
Spark SQL implementations of ItemCF, UserCF, and Swing; recommender systems, recommendation algorithms, collaborative filtering
19801201/SpinalHDL_CNN_Accelerator
CNN accelerator implemented with Spinal HDL
alibaba/SparkCube
SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.
chucheng92/SwordOffer
🔥 Solutions to the 剑指offer interview problems (implemented in Java & Scala)
STHSF/TextRank
The PageRank-based TextRank method, applicable to Chinese keyword, phrase, and summary extraction. The code is written in Scala.
Qihoo360/XLearning-XDML
extremely distributed machine learning
SidneyXu/JGSK
A comparison of the features of four languages: Java, Groovy, Scala, Kotlin
dyweb/scrala
Unmaintained 🐳 ☕ 🕷️ Scala crawler(spider) framework, inspired by scrapy, created by @gaocegege
JerryLead/ApacheSparkBook
BaiGang/spark_multiboost
An implementation of the multi-class/multi-label classifier, of which the training is carried out using AdaBoost.MH on Apache Spark.
Kent7306/akkaflow
akkaflow is a distributed, highly available DAG workflow scheduling tool built on the akka architecture. It can assign child nodes to cluster machines for parallel execution, making efficient use of cluster resources.
aliyun/MaxCompute-Spark
MaxCompute spark demo for building a runnable application.
PasaLab/marlin
A Distributed Matrix Operations Library Built on Top of Spark
qingmang-team/chanamq
Open source AMQP messaging broker based on Akka
IronmanJay/UserBehaviorAnalysis
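Simulates an e-commerce system that has been live for some time: based on the large volume of user behavior data collected, big-data technology (Flink) is used for in-depth mining and analysis to derive business metrics of interest and strengthen risk control. The data falls into two broad categories: user behavior-habit data and business behavior data. User behavior-habit data includes login method, the time and duration of sessions, pages clicked and viewed, page dwell time, page navigation, and so on; from it, traffic statistics and popular-item statistics are computed and user characteristics are mined in depth. Business behavior data falls into two kinds. The first is behavior that clearly expresses user interest, such as favoriting, liking, rating, and reviewing items; it is analyzed in depth to build user profiles and then produce a personalized list of recommended items for each user. The second is routine business operations, such as login and order payment, which are monitored for anomalies to support risk control.
qiniu/QStreaming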
A simplified, lightweight ETL pipeline framework for building stream/batch processing applications on top of Apache Spark
GuoNingNing/fire-spark
A Spark scaffolding project that standardizes the Spark development, deployment, and testing workflow.
zlb1028/learning-flink
wangzaixiang/scala-sql
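scala SQL api
titicaca/spark-gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
xiaofateng/BinlogUpdatetoHive
Real-time incremental import of MySQL data into Hive
share23/Food_Recommender
A food and restaurant recommendation system based on Spark Streaming + ALS
oeljeklaus-you/SparkCore
Spark source code analysis, mainly covering the SparkContext source, Executor process startup, Stage partitioning, Task execution, and new features in Spark 2.0
howardlau1999/yatcpu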
Yet another toy CPU.
xlturing/spark-journey
Spark example code
qf6101/topwords
Implementation of paper: Deng K, Bol P K, Li K J, et al. On the unsupervised analysis of domain-specific Chinese texts[J]. Proceedings of the National Academy of Sciences, 2016: 201516510.
Centaur/scalaconsole
Scala REPL in a GUI
hibayesian/spark-fm
A parallel implementation of factorization machines based on Spark
Ldpe2G/PCANet
Convert the MATLAB code of PCANet to C++ & Scala
aiyanbo/sbt-dependency-updates
⬆️ SBT plugin that can check Maven and Ivy repositories for dependency and plugin updates
TsinghuaDatabaseGroup/AI4DBCode
Codes for building an AI-native database
wanghan0501/UserSessionBehaviorOfflineAnalysis
Sichuan University / 拓思爱诺 offline analysis project for user session behavior data
xieyuheng/study
Study of language design and implementation.
rison168/spark-profile-tags
An enterprise-grade user profiling project based on Spark
massquantity/dismember
Advanced Retrieval Algorithms for Decomposing Large-Scale Candidate Set into Pieces.
jrthe42/aloha
Aloha: a distributed task scheduling and management framework
wulei-bj-cn/potatoes
scalad/SpringBoot-Scala
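It is fair to say that Spark's popularity in recent years has driven Scala's growth. Scala combines features of object-oriented and functional programming and offers more purely lambda-expressed functional solutions for business logic, with syntax more concise and convenient than the lambdas introduced in Java 8. SpringBoot gives Spring a faster, more convenient way of working that no longer requires writing large numbers of configuration files. For a Scala enthusiast, using SpringBoot together with Scala greatly reduces development time and the amount of code
wulei-bj-cn/learn-spark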
godpan/akka-demo
Some demos for Akka
molikto/mlang
Towards changing things and see if it proofs
goodrain/realtime-message-system
Akka-based distributed real-time message exchange system
notyy/scalaSnippet
Code snippets accumulated at work and in various Scala training courses
yaooqinn/spark-ranger
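Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
yizt/aiia_elec_miner
"AIIA" Cup - State Grid - vocabulary mining for the electric power domain
liumingmusic/HadoopLearning
A complete set of introductory big-data tutorials, including the most basic centos and maven. The big-data part mainly covers hdfs, mr, yarn, hbase, kafka, scala, sparkcore, sparkstreaming, and sparksql. The tutorials include all source code demos and online documentation.
xubo245/CarbonDataLearning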
Apache CarbonData Learning
yaooqinn/itachi
A library that brings useful functions from various modern database management systems to Apache Spark
JerryCatLeung/deepwalk_node2vector_eges
Code implementations of deepwalk, node2vector, and the Alibaba paper: Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba
JoeWoo/hadoop-spark-hive-cluster-docker
hadoop-spark-hive-cluster-docker
renchunxiao/scala-learn
Scala programming basics, plus the exercises from the book 快学scala (Scala for the Impatient)
cloudwu/efkbgfx
A bgfx renderer for effekseer runtime
xlturing/spark-streaming-action
The code of book: Spark Streaming Action
pkeropen/BigData-News
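A real-time big-data system project for a news site, based on Spark2.2
thestyleofme/user-behavior-analysis
User behavior analysis based on Flink
thestyleofme/flink-explore
Real-time mysql/oracle data synchronization based on canal/kafka connect, plus flink rest api, flink sql, and udf
frb502/spark-skewed-join-hint
A custom SparkSQL Hint optimizer that resolves JOIN data skew caused by hot-spot data
jxnu-liguobin/SpringBoot-SecKill-Scala
An enhanced version of the 慕课网 (imooc) flash-sale (seckill) system implemented in Scala (Java version included), Scala v1
YCG09/xgbspark-text-classification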
XGBoost on Spark for Chinese Text Classification
sjyttkl/spark_learning
尚硅谷 big-data Spark course, latest 2019 edition; Spark learning
zhangslob/learning-spark
Learning Spark from scratch; big-data learning
foldright/sbt-one-log
🌳 sbt-one-log resolve the logging dependencies chaos in your development, just make logging work as you expect and follow the best practice, automatically.
ojlm/pea
Distributed load-testing engine. A distributed stress tool based on gatling
chensoul/learning-spark
Learning to write Spark examples
TopSpoofer/hbrdd
A library for bulk-importing data into HBase from Spark
jizhang/spark-sandbox
A playground for Spark jobs.