  • Stars: 1,993
  • Rank: 22,235 (Top 0.5 %)
  • Language: C#
  • License: MIT License
  • Created about 5 years ago
  • Updated 5 months ago

Repository Details

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

.NET for Apache® Spark™

.NET for Apache Spark provides high performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular Dataframe and SparkSQL aspects of Apache Spark, for working with structured data, and Spark Structured Streaming, for working with streaming data.
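As a rough illustration of the DataFrame and SparkSQL APIs described above, the following minimal sketch reads a JSON file, filters it with the DataFrame API, and then runs an equivalent query through SparkSQL. It assumes the Microsoft.Spark NuGet package is referenced and that the app is launched through spark-submit on a working Apache Spark installation. The file name people.json, its name/age fields, and the app name are hypothetical placeholders, not taken from the project.

```csharp
using Microsoft.Spark.Sql;

namespace DataFrameSketch
{
    internal static class Program
    {
        private static void Main(string[] args)
        {
            // Hypothetical input: a JSON-lines file with "name" and "age" fields.
            string path = args.Length > 0 ? args[0] : "people.json";

            // Create (or reuse) the Spark session for this app.
            SparkSession spark = SparkSession
                .Builder()
                .AppName("dataframe-sparksql-sketch") // placeholder app name
                .GetOrCreate();

            // Read the file with an explicit (assumed) schema.
            DataFrame people = spark.Read().Schema("name STRING, age INT").Json(path);

            // DataFrame API: keep rows where age > 21 and project two columns.
            DataFrame adults = people.Filter(people["age"] > 21).Select("name", "age");
            adults.Show();

            // SparkSQL: register a temporary view and express the same query in SQL.
            people.CreateOrReplaceTempView("people");
            spark.Sql("SELECT name, age FROM people WHERE age > 21").Show();

            spark.Stop();
        }
    }
}
```

Both queries should print the same rows, which gives a quick check that the DataFrame and SQL paths agree on a given input.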

.NET for Apache Spark is compliant with .NET Standard - a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer.

.NET for Apache Spark runs on Windows, Linux, and macOS using .NET 6, or Windows using .NET Framework. It also runs on all major cloud providers including Azure HDInsight Spark, Amazon EMR Spark, AWS & Azure Databricks.

Note: We currently have a Spark Project Improvement Proposal JIRA at SPIP: .NET bindings for Apache Spark to work with the community towards getting .NET support by default into Apache Spark. We highly encourage you to participate in the discussion.

Supported Apache Spark

Apache Spark    .NET for Apache Spark
2.4*            v2.1.1
3.0             v2.1.1
3.1             v2.1.1
3.2             v2.1.1

*2.4.2 is not supported.

Releases

.NET for Apache Spark releases are available on the repository's GitHub Releases page, and NuGet packages are available on nuget.org.

Get Started

These instructions will show you how to run a .NET for Apache Spark app using .NET 6.

Building from Source

Building from source is very easy and the whole process (from cloning to being able to run your app) should take less than 15 minutes!

Instructions are available for:

  • Windows
  • Ubuntu

Samples

There are two types of samples/apps in the .NET for Apache Spark repo:

  • Getting Started - .NET for Apache Spark code focused on simple and minimalistic scenarios.

  • End-to-End apps/scenarios - Real-world examples of industry-standard benchmarks, use cases, and business applications implemented using .NET for Apache Spark.

We welcome contributions to both categories!

Analytics scenarios, with a description and the available samples for each:

Dataframes and SparkSQL

Simple code snippets to help you get familiarized with the programmability experience of .NET for Apache Spark.

  • Basic (C#, F#) - Getting Started

Structured Streaming

Code snippets to show you how to utilize Apache Spark's Structured Streaming (2.3.1, 2.3.2, 2.4.1, Latest).

  • Word Count (C#, F#) - Getting Started
  • Windowed Word Count (C#, F#) - Getting Started
  • Word Count on data from Kafka (C#, F#) - Getting Started

TPC-H Queries

Code to show you how to author complex queries using .NET for Apache Spark.

  • TPC-H Functional (C#) - End-to-End app
  • TPC-H SparkSQL (C#) - End-to-End app
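
To give a feel for the Structured Streaming word count scenario listed above, here is a minimal sketch, not the repository's sample itself. It assumes the Microsoft.Spark NuGet package is referenced, that the app is launched through spark-submit, and that a text source is listening on a TCP socket. The default host and port values and the app name are hypothetical placeholders.

```csharp
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Streaming;
using static Microsoft.Spark.Sql.Functions;

namespace StreamingWordCountSketch
{
    internal static class Program
    {
        private static void Main(string[] args)
        {
            // Hypothetical defaults; pass a real host and port as arguments.
            string host = args.Length > 0 ? args[0] : "localhost";
            string port = args.Length > 1 ? args[1] : "9999";

            SparkSession spark = SparkSession
                .Builder()
                .AppName("streaming-word-count-sketch") // placeholder app name
                .GetOrCreate();

            // Each line received on the socket becomes a row with one "value" column.
            DataFrame lines = spark
                .ReadStream()
                .Format("socket")
                .Option("host", host)
                .Option("port", port)
                .Load();

            // Split every line on spaces and emit one row per word.
            DataFrame words = lines.Select(
                Explode(Split(lines["value"], " ")).Alias("word"));

            // Running count per word across the whole stream.
            DataFrame wordCounts = words.GroupBy("word").Count();

            // "complete" mode re-emits the full table of counts on each trigger.
            StreamingQuery query = wordCounts
                .WriteStream()
                .OutputMode("complete")
                .Format("console")
                .Start();

            query.AwaitTermination();
        }
    }
}
```

For local experiments, a tool such as netcat (for example, nc -lk 9999) is one common way to feed lines into the socket; the console sink then prints the updated counts after each micro-batch.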

Contributing

We welcome contributions! Please review our contribution guide.

Inspiration and Special Thanks

This project would not have been possible without the outstanding work from the following communities:

  • Apache Spark: Unified Analytics Engine for Big Data, the underlying backend execution engine for .NET for Apache Spark
  • Mobius: C# and F# language binding and extensions to Apache Spark, a precursor project to .NET for Apache Spark from the same Microsoft group.
  • PySpark: Python bindings for Apache Spark, one of the implementations .NET for Apache Spark derives inspiration from.
  • SparkR: one of the implementations .NET for Apache Spark derives inspiration from.
  • Apache Arrow: A cross-language development platform for in-memory data. This library provides .NET for Apache Spark with efficient ways to transfer column-major data between the JVM and .NET CLR.
  • Pyrolite: Java and .NET interface to Python's pickle and Pyro protocols. This library provides .NET for Apache Spark with efficient ways to transfer row-major data between the JVM and .NET CLR.
  • Databricks: Unified analytics platform. Many thanks to all the suggestions from them towards making .NET for Apache Spark run on Azure and AWS Databricks.

How to Engage, Contribute and Provide Feedback

The .NET for Apache Spark team encourages contributions, both issues and PRs. The first step is to find an existing issue you want to contribute to; if you cannot find one, open a new issue.

Support

.NET for Apache Spark is an open source project under the .NET Foundation and does not come with Microsoft Support unless otherwise noted by the specific product. For issues with or questions about .NET for Apache Spark, please create an issue. The community is active and is monitoring submissions.

.NET Foundation

The .NET for Apache Spark project is part of the .NET Foundation.

Code of Conduct

This project has adopted the code of conduct defined by the Contributor Covenant to clarify expected behavior in our community. For more information, see the .NET Foundation Code of Conduct.

License

.NET for Apache Spark is licensed under the MIT license.
