• This repository has been archived on 22/Jan/2020

Repository Details

A rendering web crawler for Apache Mesos.

RENDLER ⁉️

A rendering web-crawler framework for Apache Mesos.

YES RENDLER

See the accompanying slides for more context.

RENDLER consists of three main components:

  • CrawlExecutor extends mesos.Executor
  • RenderExecutor extends mesos.Executor
  • RenderingCrawler extends mesos.Scheduler and launches tasks with the executors

Quick Start with Vagrant

Requirements

Start the mesos-demo VM

$ wget http://downloads.mesosphere.io/demo/mesos.box -O /tmp/mesos.box
$ vagrant box add --name mesos-demo /tmp/mesos.box
$ git clone https://github.com/mesosphere/RENDLER.git
$ cd RENDLER
$ vagrant up

Now that the VM is running, you can view the Mesos Web UI here: http://10.141.141.10:5050

You can see that one slave is registered and that some CPUs and memory are idle. So let's start RENDLER!

Run RENDLER in the mesos-demo VM

Check the implementations of the RENDLER scheduler in the python, go, scala, and cpp directories. Run instructions accompany each implementation.

Feel free to contribute your own!

Generating a pdf of your render graph output

With GraphViz (which dot) installed:

vagrant@mesos:hostfiles $ bin/make-pdf
Generating '/home/vagrant/hostfiles/result.pdf'

Open result.pdf in your favorite viewer to see the rendered result!

Sample Output

[Image: sample crawl]

Shutting down the mesos-demo VM

# Exit out of the VM
vagrant@mesos:hostfiles $ exit
# Stop the VM
$ vagrant halt
# To delete all traces of the vagrant machine
$ vagrant destroy

Rendler Architecture

Crawl Executor

  • Interprets incoming tasks' task.data field as a URL
  • Fetches the resource and extracts links from the document.
  • Sends a framework message to the scheduler containing the crawl result.
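The link-extraction step of the crawl executor can be sketched with the Python standard library alone. This is a minimal illustration, not the repository's implementation: the names `LinkExtractor` and `extract_links` are hypothetical, and the sketch works on an HTML string that has already been fetched, so it needs no network access or Mesos bindings.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links such as "/a" against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs linked from an HTML document, in order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="/a">A</a> <a href="http://foo.co/b">B</a> <a name="x">no href</a>'
print(extract_links("http://foo.co", page))
```

In a real executor, the returned list would become the links field of the crawl result sent back to the scheduler.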

Render Executor

  • Interprets incoming tasks' task.data field as a URL
  • Fetches the resource and saves a png image to a location accessible to the scheduler.
  • Sends a framework message to the scheduler containing the render result.

Intermediate Data Structures

We define some common data types to facilitate communication between the scheduler and the executors. Their default representation is JSON.

results.CrawlResult(
    "1234",                                 # taskId
    "http://foo.co",                        # url
    ["http://foo.co/a", "http://foo.co/b"]  # links
)
results.RenderResult(
    "1234",                                 # taskId
    "http://foo.co",                        # url
    "http://dl.mega.corp/foo.png"           # imageUrl
)
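One way to model these two types and their JSON form is with dataclasses. This is a sketch under assumptions: the field names are taken from the comments above, but using them as the JSON keys is a guess, and the helpers `encode` and `decode_crawl` are hypothetical rather than part of the repository's results module.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CrawlResult:
    taskId: str
    url: str
    links: List[str]


@dataclass
class RenderResult:
    taskId: str
    url: str
    imageUrl: str


def encode(result):
    """Serialize either result type to a JSON string."""
    return json.dumps(asdict(result))


def decode_crawl(payload):
    """Rebuild a CrawlResult from its JSON string (assumed key names)."""
    return CrawlResult(**json.loads(payload))


crawl = CrawlResult("1234", "http://foo.co", ["http://foo.co/a", "http://foo.co/b"])
render = RenderResult("1234", "http://foo.co", "http://dl.mega.corp/foo.png")
print(encode(crawl))
print(encode(render))
```

An executor would send the encoded string as the body of its framework message, and the scheduler would decode it on receipt.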

Rendler Scheduler

Data Structures

  • crawlQueue: list of urls
  • renderQueue: list of urls
  • processedURLs: set of urls
  • crawlResults: list of url tuples
  • renderResults: map of urls to imageUrls

Scheduler Behavior

The scheduler accepts one URL as a command-line parameter to seed the render and crawl queues.

  1. For each URL, create a task in both the render queue and the crawl queue.

  2. Upon receipt of a crawl result, add an element to the crawl results adjacency list. Append to the render and crawl queues each URL that is not present in the set of processed URLs. Add these enqueued URLs to the set of processed URLs.

  3. Upon receipt of a render result, add an element to the render results map.

  4. The crawl and render queues are drained in first-come, first-served (FCFS) order at a rate determined by the resource offer stream. When the queues are empty, the scheduler declines resource offers to make them available to other frameworks running on the cluster.
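The bookkeeping described in the steps above can be sketched without any Mesos bindings. This is an illustration under stated assumptions, not the repository's scheduler: the class `RendlerState` and its method names are hypothetical, a resource offer is reduced to a number of task slots, the seed URL is marked as processed when enqueued, and crawl tasks are handed out before render tasks (the text does not specify the order between the two queues).

```python
from collections import deque


class RendlerState:
    """Queue and result bookkeeping for the scheduler (no Mesos calls)."""

    def __init__(self, seed_url):
        self.crawl_queue = deque()
        self.render_queue = deque()
        self.processed_urls = set()
        self.crawl_results = []   # adjacency list of (url, link) tuples
        self.render_results = {}  # url -> imageUrl
        self._enqueue(seed_url)

    def _enqueue(self, url):
        # Step 1: every URL gets both a crawl task and a render task.
        self.crawl_queue.append(url)
        self.render_queue.append(url)
        self.processed_urls.add(url)

    def on_crawl_result(self, url, links):
        # Step 2: record the edges, enqueue only URLs not seen before.
        for link in links:
            self.crawl_results.append((url, link))
            if link not in self.processed_urls:
                self._enqueue(link)

    def on_render_result(self, url, image_url):
        # Step 3: remember where the rendered image was saved.
        self.render_results[url] = image_url

    def next_tasks(self, slots):
        # Step 4: drain the queues in FCFS order, up to `slots` tasks.
        # An empty list means the offer should be declined.
        tasks = []
        while len(tasks) < slots:
            if self.crawl_queue:
                tasks.append(("crawl", self.crawl_queue.popleft()))
            elif self.render_queue:
                tasks.append(("render", self.render_queue.popleft()))
            else:
                break
        return tasks


state = RendlerState("http://foo.co")
first = state.next_tasks(2)
state.on_crawl_result("http://foo.co", ["http://foo.co/a", "http://foo.co"])
state.on_render_result("http://foo.co", "http://dl.mega.corp/foo.png")
second = state.next_tasks(1)
third = state.next_tasks(5)
idle = state.next_tasks(5)
print(first, second, third, idle)
```

Because the link back to the seed is already in the processed set, it is recorded in the adjacency list but not enqueued again, which is what keeps the crawl from looping.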
