• Stars
    star
    3,050
  • Rank 14,770 (Top 0.3 %)
  • Language
    Java
  • License
    Apache License 2.0
  • Created almost 5 years ago
  • Updated 5 months ago


Repository Details

DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

DSS


Introduction

       DataSphere Studio (DSS for short) is the one-stop data application development and management portal of WeDataSphere, developed by WeBank.

       With its pluggable integration framework and the computing middleware Linkis, DSS can easily integrate various upper-layer data application systems, making data development simple and easy to use.

       DataSphere Studio is positioned as a data application development portal whose closed loop covers the entire process of data application development. With a unified UI, its workflow-like graphical drag-and-drop development experience covers the entire lifecycle of data application development: data import, desensitization and cleaning, data analysis, data mining, quality inspection, visualization, scheduling, and data output applications.

       By building on the connection, reuse, and simplification capabilities of Linkis, DSS natively offers financial-grade high concurrency, high availability, multi-tenant isolation, and resource management.

UI preview

       Please be patient; the GIF takes some time to load.

DSS-V1.0 GIF

Core features

1. One-stop, full-process application development management UI

       DSS is highly integrated. Currently integrated components include:

       1. Data Development IDE Tool - Scriptis

       2. Data Visualization Tool - Visualis (Based on the open source project Davinci contributed by CreditEase)

       3. Data Quality Management Tool - Qualitis

       4. Workflow scheduling tool - Schedulis

       5. Data Exchange Tool - Exchangis

       6. Data Api Service - DataApiService

       7. Streaming Application Development Management Tool - Streamis

       8. One-stop machine Learning Platform - Prophecis

       9. Workflow Task Scheduling Tool - DolphinScheduler (In Code Merging)

       10. Help documentation and beginner's guide - UserGuide (In Code Merging)

       11. Data Model Center - DataModelCenter (In development)

       For the DSS versions compatible with the above components, see the Compatibility list of integrated components.

       With a pluggable framework architecture, DSS is designed to let users quickly integrate new data application tools or replace tools that DSS has already integrated. For example, Scriptis can be replaced with Zeppelin, and Schedulis with DolphinScheduler.
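
       The idea of swapping one integrated tool for another can be pictured with a minimal sketch. The class `ToolRegistrySketch`, its methods, and the slot names "ide" and "scheduler" are hypothetical names chosen for illustration and are not part of the DSS code base; the sketch only shows the general pattern of a replaceable slot, not how DSS implements it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not a DSS class.
class ToolRegistrySketch {
    // Maps a functional slot (for example "scheduler") to the tool that fills it.
    private final Map<String, String> slots = new LinkedHashMap<>();

    /** Registers a tool for a slot and returns the tool it replaced, or null. */
    String plug(String slot, String tool) {
        return slots.put(slot, tool);
    }

    /** Returns the tool currently filling the slot, or null if the slot is empty. */
    String toolFor(String slot) {
        return slots.get(slot);
    }

    public static void main(String[] args) {
        ToolRegistrySketch dss = new ToolRegistrySketch();
        dss.plug("ide", "Scriptis");
        dss.plug("scheduler", "Schedulis");
        // Replace the scheduler without touching the other slots.
        String replaced = dss.plug("scheduler", "DolphinScheduler");
        System.out.println(replaced + " -> " + dss.toolFor("scheduler"));
        // prints "Schedulis -> DolphinScheduler"
    }
}
```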

DSS one-stop video

2. AppConn, based on Linkis, defines a unique design concept

       AppConn is the core concept that enables DSS to easily and quickly integrate various upper-layer web systems.

       AppConn, an application connector, defines a set of unified three-level integration protocols for the front end and back end, allowing external data application systems to easily and quickly become part of DSS data application development.

       The three-level specifications of AppConn are: the first-level SSO specification, the second-level organizational structure specification, and the third-level development process specification.

       DSS arranges multiple AppConns in series to form a workflow that supports both real-time execution and scheduled execution. Users can complete the end-to-end development of a data application with simple drag-and-drop operations.

       Since AppConn is integrated with Linkis, the external data application systems share its capabilities of resource management, concurrency limiting, and high performance. AppConn also allows context to be shared across systems, which frees external data applications from application silos.
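
       The three-level specification and the idea of chaining AppConns in series can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the DSS API: the interfaces `SsoSpec`, `OrganizationSpec`, `DevelopmentSpec`, and `AppConn`, the helper `toy`, and the method `runInSeries` are all hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are taken from the DSS code base.
class AppConnSketch {

    /** First level: single sign-on into the external system. */
    interface SsoSpec {
        boolean login(String user);
    }

    /** Second level: map a DSS project onto the external system's own structure. */
    interface OrganizationSpec {
        String createProject(String workspace, String project);
    }

    /** Third level: run one development step as a workflow node. */
    interface DevelopmentSpec {
        String execute(String projectRef, String input);
    }

    /** An AppConn bundles the three levels for one external system. */
    interface AppConn extends SsoSpec, OrganizationSpec, DevelopmentSpec {
        String name();
    }

    /** Toy AppConn that accepts any non-empty user and appends its name to the input. */
    static AppConn toy(String name) {
        return new AppConn() {
            public String name() { return name; }
            public boolean login(String user) { return user != null && !user.isEmpty(); }
            public String createProject(String workspace, String project) {
                return name + ":" + workspace + "/" + project;
            }
            public String execute(String projectRef, String input) {
                return input + "->" + name;
            }
        };
    }

    /** Runs the AppConns one after another, feeding each output to the next node. */
    static String runInSeries(List<AppConn> nodes, String user, String workspace,
                              String project, String input) {
        String data = input;
        for (AppConn node : nodes) {
            if (!node.login(user)) {
                throw new IllegalStateException("SSO failed for " + node.name());
            }
            String ref = node.createProject(workspace, project);
            data = node.execute(ref, data);
        }
        return data;
    }

    public static void main(String[] args) {
        List<AppConn> flow = new ArrayList<>();
        flow.add(toy("exchange"));
        flow.add(toy("quality"));
        flow.add(toy("visualize"));
        System.out.println(runInSeries(flow, "alice", "ws1", "demo", "raw"));
        // prints "raw->exchange->quality->visualize"
    }
}
```

       In this sketch each node must pass all three levels before it contributes to the workflow, which mirrors the ordering of the specification; how DSS actually enforces the levels is not described here.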

3. Workspace, as the management unit

       With Workspace as the management unit, DSS organizes and manages the business applications of the various data application systems, defines a set of common standards for collaborative development in workspaces across data application systems, and provides user role management capabilities.
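
       As a rough picture of a workspace acting as the management unit with role management, consider the sketch below. The class `WorkspaceSketch`, the role names, and the rule that only `ADMIN` and `DEVELOPER` may edit are assumptions made for this example; the actual roles and permissions in DSS may differ.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: not a DSS class.
class WorkspaceSketch {
    // Role names are assumptions for this example.
    enum Role { ADMIN, DEVELOPER, ANALYST, VIEWER }

    private final String name;
    private final Map<String, Set<Role>> members = new HashMap<>();

    WorkspaceSketch(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    /** Adds a role to a user, creating the membership entry if needed. */
    void grant(String user, Role role) {
        members.computeIfAbsent(user, u -> EnumSet.noneOf(Role.class)).add(role);
    }

    /** In this sketch only admins and developers may edit workspace content. */
    boolean canEdit(String user) {
        Set<Role> roles = members.get(user);
        return roles != null
                && (roles.contains(Role.ADMIN) || roles.contains(Role.DEVELOPER));
    }

    public static void main(String[] args) {
        WorkspaceSketch ws = new WorkspaceSketch("risk-analytics");
        ws.grant("alice", Role.DEVELOPER);
        ws.grant("bob", Role.VIEWER);
        System.out.println(ws.canEdit("alice") + " " + ws.canEdit("bob"));
        // prints "true false"
    }
}
```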

4. Integrated data application components

       DSS has integrated a variety of upper-layer data application systems by implementing multiple AppConns, which meet most of the data development needs of users.

       If desired, new data application systems can also be easily integrated to replace or enrich DSS's data application development process. Click me to learn how to quickly integrate new application systems.

| Component | Description | DSS0.X compatible version (DSS0.9.1 recommended) | DSS1.0 compatible version (DSS1.1.0 recommended) |
| --- | --- | --- | --- |
| Linkis | Computing middleware Apache Linkis. By providing standard interfaces such as REST/WebSocket/JDBC/SDK, it lets upper-layer applications easily connect to and access underlying engines such as MySQL/Spark/Hive/Presto/Flink. | Linkis0.11.0 recommended (Released) | >= Linkis1.1.1 (Released) |
| DataApiService | (Third-party application tool built into DSS) Data API service. SQL scripts can be quickly published as RESTful interfaces, providing REST access to the outside world. | Not supported | DSS1.1.0 recommended (Released) |
| Scriptis | (Third-party application tool built into DSS) A data analysis web tool that supports writing SQL, PySpark, HiveQL, and other scripts online and submitting them to [Linkis](https://github.com/WeBankFinTech/Linkis). | DSS0.9.1 recommended (Released) | DSS1.1.0 recommended (Released) |
| Schedulis | Workflow task scheduling system developed on top of Azkaban, with financial-grade features such as high performance, high availability, and multi-tenant resource isolation. | Schedulis0.6.1 recommended (Released) | >= Schedulis0.7.0 (Released) |
| EventCheck | (Third-party application tool built into DSS) Provides signal communication capabilities across business, engineering, and workflow. | DSS0.9.1 recommended (Released) | DSS1.1.0 recommended (Released) |
| SendEmail | (Third-party application tool built into DSS) Provides the ability to send data; the result sets of other workflow nodes can be sent by email. | DSS0.9.1 recommended (Released) | DSS1.1.0 recommended (Released) |
| Qualitis | Data quality verification tool, providing data verification capabilities such as data integrity and correctness. | Qualitis0.8.0 recommended (Released) | >= Qualitis0.9.2 (Released) |
| Streamis | Streaming application development management tool. It supports the release of Flink Jar and Flink SQL, and provides development, debugging, and production management capabilities for streaming applications, such as start-stop, status monitoring, and checkpoint. | Not supported | >= Streamis0.2.0 (Released) |
| Prophecis | A one-stop machine learning platform that integrates multiple open source machine learning frameworks. Prophecis' MLFlow can be connected to DSS workflow through AppConn. | Not supported | >= Prophecis 0.3.2 (Released) |
| Exchangis | A data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources. The upcoming Exchangis1.0 will work with DSS workflow. | Not supported | = Exchangis1.0.0 (Released) |
| Visualis | A data visualization BI tool developed on top of Davinci, an open source project of CreditEase. It provides users with financial-grade data visualization capabilities in terms of data security. | Visualis0.5.0 recommended | = Visualis1.0.0 (Released) |
| DolphinScheduler | Apache DolphinScheduler, a distributed and easily scalable visual workflow task scheduling platform. Supports one-click publishing of DSS workflows to DolphinScheduler. | Not supported | DolphinScheduler1.3.X (Released) |
| UserGuide | (Third-party application tool to be built into DSS) Contains help documents, a beginner's guide, dark mode skinning, etc. | Not supported | >= DSS1.1.0 (Released) |
| DataModelCenter | (Third-party application tool to be built into DSS) Mainly provides data warehouse planning, data model development, and data asset management capabilities. Data warehouse planning includes subject domains, data warehouse hierarchies, modifiers, etc.; data model development includes indicators, dimensions, metrics, wizard-based table building, etc.; data assets are connected to Apache Atlas to provide data lineage capabilities. | Not supported | Planned in DSS1.2.0 (under development) |
| UserManager | (Third-party application tool built into DSS) Automatically initializes all user environments necessary for a new DSS user, including creating Linux users, various user paths, directory authorization, etc. | DSS0.9.1 recommended (Released) | Planned |
| Airflow | Supports publishing DSS workflows to Apache Airflow for scheduled execution. | PR not yet merged | Not supported |

Demo Trial environment

       Because DataSphere Studio supports script execution, it carries high security risks, and the isolation of the WeDataSphere Demo environment has not been completed. Since many users have asked about the Demo environment, we decided to first issue invitation codes to the community and accept trial applications from enterprises and organizations.

       If you want to try out the Demo environment, please join the DataSphere Studio community user group (see the end of this document) and contact the WeDataSphere Group Robot to get an invitation code.

       DataSphereStudio Demo environment user registration page: click me to enter

       DataSphereStudio Demo environment login page: click me to enter

Download

       Please go to the DSS Releases Page to download a compiled version or a source code package of DSS.

Compile and deploy

       Please follow Compile Guide to compile DSS from source code.

       Please refer to Deployment Documents to do the deployment.

Examples and Guidance

       You can find examples and guidance for how to use DSS in User Manual.

Documents

       For a complete list of documents for DSS1.0, see DSS-Doc

       The following is the installation guide for DSS-related AppConn plugins:

Architecture

DSS Architecture

Usage Scenarios

      DataSphere Studio is suitable for the following scenarios:

      1. Scenarios in which big data platform capability is being prepared or initialized but no data application tools are available.

      2. Scenarios in which users already have big data foundation platform capabilities but with only a few data application tools.

      3. Scenarios in which users have big data foundation platform capabilities and comprehensive data application tools, but suffer from strong isolation and high learning costs because those tools have not been integrated.

      4. Scenarios in which users have big data foundation platform capabilities and comprehensive data application tools, some of which have been integrated, but lack unified and standardized specifications.

Contributing

       Contributions are always welcome. We need more contributors to build DSS together, whether through code, documentation, or other support that helps the community.

       For code and documentation contributions, please follow the contribution guide.

Communication

       For any questions or suggestions, please kindly submit an issue.

       You can scan the QR code below to join our WeChat and QQ groups for a more immediate response.

communication

Who is using DSS

       We opened an issue where users can give feedback and record who is using DSS.

       Since its first release in 2019, DSS has accumulated more than 700 trial companies and 1000+ sandbox trial users across diverse industries, including finance, banking, telecommunications, manufacturing, and internet companies.

License

       DSS is under the Apache 2.0 license. See the License file for details.

More Repositories

1

fes.js

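Fes.js is an easy-to-use front-end application solution based on Vue 3. Built on the ideas of convention, configuration, and componentization, it lets users focus solely on building page content with components. The learning curve is gentle and it is easy to get started. It has stabilized after being refined across multiple projects. The rich Vue 3 ecosystem and Fes.js plugins make business development simpler and faster.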
JavaScript
1,390
star
2

Scriptis

Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management, and intelligent diagnosis.
Vue
806
star
3

Qualitis

Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve various data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis
Java
693
star
4

WeDataSphere

WeDataSphere is a financial grade, one-stop big data platform suite.
653
star
5

Prophecis

Prophecis is a one-stop cloud native machine learning platform.
Go
475
star
6

Exchangis

Exchangis is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources.
Java
447
star
7

Schedulis

Schedulis is a high-performance workflow task scheduling system with financial-grade features such as high availability and multi-tenancy. It supports the Linkis computing middleware and has been integrated into the data application development portal DataSphere Studio.
Java
387
star
8

Visualis

Visualis is a BI tool for data visualization. It provides financial-grade data visualization capabilities on the basis of data security and permissions, based on the open source project Davinci contributed by CreditEase.
TypeScript
261
star
9

wxa

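🖖 A lightweight mini program development framework. It can be adopted incrementally, focuses on native mini program development, and provides better engineering and code reuse capabilities to improve development efficiency and the development experience.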
JavaScript
226
star
10

Dockin

WeBank's open-source container platform based on private cloud.
209
star
11

WeBank-all-Project

A collection of the addresses of all projects that WeBank has participated in or established.
202
star
12

DeFiBus

DeFiBus is a decentralized financial message bus for microservices. It provides request/reply, unicast, multicast, broadcast, delayed messages, etc., and also provides service governance capabilities and operation tools.
Java
191
star
13

KoalaForm

Low-code forms for middle- and back-office front ends.
TypeScript
145
star
14

fes-design

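A Vue 3 component library written in TypeScript. High performance, with support for on-demand import, internationalization, and theme configuration, and adapted for low-code use.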
TypeScript
139
star
15

WeTrident

A one-stop app development kit that helps developers quickly build apps that are ready to be launched and operated in production.
JavaScript
125
star
16

DataSphereStudio-Doc

DataSphereStudio documents.
110
star
17

Streamis

Streaming application development and management system, based on Linkis and DSS, planning to provide the workflow-like graphical drag-and-drop development capability.
Java
103
star
18

Dockin-Installer

Production-grade highly available container platform
Shell
59
star
19

incubator-linkis-doc

incubator-linkis-doc has been migrated to incubator-linkis-website: https://github.com/apache/incubator-linkis-website
40
star
20

Dockin-RM

Dockin container platform resource manager is the core module for application definition and container instance management
Java
40
star
21

Dockin-Ops

dockin ops is a project used to handle the exec request for kubernetes under supervision
Go
37
star
22

wt-console

A lightweight, extendable react-native developer and tester tool
JavaScript
34
star
23

Dockin-CNI

kubernetes cni plugin, support fixed ip
Go
31
star
24

eventmesh-connector-defibus

connector for defibus in eventmesh
Java
8
star
25

wxa-vscode

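A WeChat mini program development assistant. It works out of the box; once installed you get: code auto-completion and formatting; syntax highlighting and checking (including wxml and wxs files); code snippet hints; single-file component support.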
TypeScript
8
star
26

TractionWidget

Traction widget
Vue
7
star
27

react-native-rtext

JavaScript
6
star
28

WeCloudStack

Webank Financial Cloud (native) tech Stack for building cloud (native) applications.
1
star
29

wxa-templates

1
star