• Stars
    star
    1,966
  • Rank 23,583 (Top 0.5 %)
  • Language
    Go
  • License
    BSD 3-Clause "New...
  • Created about 8 years ago
  • Updated over 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A search engine which can hold 100 trillion lines of log data.

波塞冬:Poseidon

波塞冬,是希腊神话中的海神,在这里是寓意着海量数据的主宰者。

Poseidon 系统是一个日志搜索平台,可以在数百万亿条、数百PB大小的日志数据中快速分析和检索特定字符串。 360公司是一个安全公司,在追踪 APT(高级持续威胁)事件时,经常需要在海量的历史日志数据中检索某些信息, 例如某个恶意样本在某个时间段内的活动情况。在 Poseidon 系统出现之前,都是写 Map/Reduce 计算任务在 Hadoop 集群中做计算, 一次任务所需的计算时间从数小时到数天不等,大大制约了 APT 事件的追踪效率。 Poseidon 系统就是为了解决这个需求,能在几秒钟内从数百万亿条规模的数据集中找出我们需要的数据,大大提高工作效率; 同时,这些数据不需要额外存储,仍然存放在Hadoop集群中,节省了大量存储和计算资源。该系统可以应用于任何结构化或非结构化海量(从万亿到千万亿规模)数据的查询检索需求。

Quick Start

所用技术

  • 倒排索引:构建日志搜索引擎的核心技术
  • Hadoop:用于存放原始数据和索引数据,并用来运行Map/Reduce程序来构建索引
  • Java:构建索引时是用Java开发的Map/Reduce程序
  • Golang:检索程序是用Golang开发的
  • Redis/Memcached:用于存储 Meta 元数据信息

目录结构

builder

这里存放的是数据生成工具

  • doc :将原始日志转换为Poseidon格式的数据。
  • docmeta :将Doc相关的元数据信息写入NoSQL库中的工具。
  • index :从原始日志生成倒排索引数据的程序工具,是Hadoop 的 Map/Reduce 作业程序。
  • indexmeta :将倒排索引的元数据写入NoSQL库中的工具。

common

目前仅仅用来存放该项目中用到的 protobuf 定义

docs

存放了相关的技术文档。

service

这里存放的是各个HTTP微服务服务的程序

  • hdfsreader :读取HDFS中某个文件路径的一段数据。
    • /service/hdfsreader
  • idgenerator :全局的ID生成中心
    • /service/idgenerator
  • meta :针对存放Meta信息的NoSQL提供统一的HTTP接口服务
    • /service/meta/business/doc/get : DocGzMeta 信息查询接口
    • /service/meta/business/doc/set : DocGzMeta 信息更新接口
    • /service/meta/business/index/get : InvertedIndexGzMeta 信息查询接口
    • /service/meta/business/index/set : InvertedIndexGzMeta 信息更新接口
  • searcher :Poseidon搜索引擎的核心检索服务
  • proxy :searcher的一个代理,并能实现跨时间的查询服务
  • allinone : 为简化部署,将 idgenerator/meta/searcher/proxy 四个微服务集成在一个进程中,提供统一的服务接口

其他

  • qq交流群:21557451

More Repositories

1

RePlugin

RePlugin - A flexible, stable, easy-to-use Android Plug-in Framework
Java
7,261
star
2

Atlas

A high-performance and stable proxy for MySQL, it is developed by Qihoo's DBA and infrastructure team
C
4,650
star
3

wayne

Kubernetes multi-cluster management and publishing platform
TypeScript
3,706
star
4

evpp

A modern C++ network library for developing high performance network services in TCP/UDP/HTTP protocols.
C++
3,564
star
5

ArgusAPM

Powerful, comprehensive (Android) application performance management platform. 360线上移动性能检测平台
Java
2,673
star
6

safe-rules

详细的C/C++编程规范指南,由360质量工程部编著,适用于桌面、服务端及嵌入式软件系统。
2,363
star
7

Quicksql

A Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Java
2,057
star
8

QConf

Qihoo Distributed Configuration Management System
C++
1,865
star
9

hbox

AI on Hadoop
Java
1,727
star
10

phptrace

A tracing and troubleshooting tool for PHP scripts.
C
1,677
star
11

mysql-sniffer

mysql-sniffer is a network traffic analyzer tool for mysql, it is developed by Qihoo DBA and infrastructure team
C
845
star
12

huststore

High-performance Distributed Storage
C
823
star
13

doraemon

Doraemon is a Prometheus based monitor system
JavaScript
655
star
14

logkafka

Collect logs and send lines to Apache Kafka
C++
500
star
15

zeppelin

A Scalable, High-Performance Distributed Key-Value Platform
C++
399
star
16

tensornet

C++
316
star
17

qbusbridge

The Apache Kafka Client SDK
C++
292
star
18

360zhinao

360zhinao
Python
274
star
19

XSQL

Unified SQL Analytics Engine Based on SparkSQL
Scala
210
star
20

WatchAD2.0

WatchAD2.0是一款针对域威胁的日志分析与监控系统
CSS
206
star
21

zendAPI

The C++ wrapper of zend engine
C++
183
star
22

mongosync

mongosync is simple && useful tool to sync data between mongo replicaSet, it is developed by Qihoo's DBA and infrastructure team
C++
154
star
23

artdumper

从oat文件中dump出来dex的工具
C++
138
star
24

influx-proxy

influxdb HA
Go
128
star
25

kmemcache

linux kernel memcache server
C
126
star
26

XLearning-XDML

extremely distributed machine learning
Scala
123
star
27

simcc

A simple C++ common base library used in Qihoo 360
C++
116
star
28

nemo

A library that provide multiply data structure. Such as map, hash, list, set. We build these data structure base on rocksdb as the storage layer for Pika https://github.com/OpenAtomFoundation/pika .
C++
115
star
29

ngx_http_subrange_module

Split one big HTTP/Range request to multiple subrange requesets
C
107
star
30

blackwidow

A library implements REDIS commands(Strings, Hashes, Lists, Sorted Sets, Sets, Keys, HyperLogLog) based on rocksdb, as the storage layer for Pika https://github.com/OpenAtomFoundation/pika .
C++
99
star
31

QNAT

C
88
star
32

Mario

A Library that make the write from synchronous to asynchronous.
C++
78
star
33

Luwak

利用预训练语言模型从非结构化威胁报告中提取 MITRE ATT&CK TTP 信息
Python
68
star
34

mpic

A C++ embedded library of multiple processes framework developed and used at Qihoo360.
C++
50
star
35

nemo-rocksdb

Add TTL feature on rocksdb, and compatible with rocksdb
C++
44
star
36

dgl-operator

The DGL Operator makes it easy to run Deep Graph Library (DGL) graph neural network training on Kubernetes
Go
44
star
37

ironwill

Useful iOS components for your project. 健壮且有用的OC代码, 可以直接在你的iOS应用中使用.
Objective-C
37
star
38

elog

A erlang log nif
C++
28
star
39

rust-jsonnet

rust-jsonnet - The Google Jsonnet( operation data template language) for rust
Rust
24
star
40

zeppelin-gateway

Object Gateway Provide Applications with a RESTful Gateway to zeppelin
C++
23
star
41

zeppelin-client

Client Library for zeppelin
C++
21
star
42

luajit-jsonnet

The Google Jsonnet( operation data template language) for Luajit
C++
16
star
43

HTTPSLayer

PHP
16
star
44

CReSS

Cross-model Retrieval between 13C NMR Spectrum and Structure
Python
15
star
45

wayne-backend-plugins

Wayne backend plugins
Go
13
star
46

gpstall

Stall Postgres' insert command
C++
8
star
47

cloud-website

360 cloud official website
PHP
8
star
48

wayne-frontend-plugins

Wayne UI Plugins
TypeScript
7
star
49

SEEChat

一见多模态对话模型
Python
5
star
50

wiki

wiki for qihoo infrastructure team
2
star
51

se-office

se-office扩展,提供基于开放标准的全功能办公生产力套件,基于浏览器预览和编辑office。
JavaScript
1
star