SecBERT
SecBERT is a BERT model trained on cyber security text; it has learned cyber security domain knowledge.
- SecBERT is trained on papers from a corpus of cyber security text.
- SecBERT has its own vocabulary (secvocab) that is built to best match the training corpus. We trained both SecBERT and SecRoBERTa versions.
Downloading Trained Models
SecBERT models are now installable directly within the Hugging Face framework:
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# SecBERT
tokenizer = AutoTokenizer.from_pretrained("jackaduma/SecBERT")
model = AutoModelForMaskedLM.from_pretrained("jackaduma/SecBERT")

# SecRoBERTa
tokenizer = AutoTokenizer.from_pretrained("jackaduma/SecRoBERTa")
model = AutoModelForMaskedLM.from_pretrained("jackaduma/SecRoBERTa")
```
Pretrained-Weights
We release the PyTorch version of the trained models. The PyTorch version is created using the Hugging Face library, and this repo shows how to use it.
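If you want the weight files on disk (for example, for offline use), one option is to save the downloaded checkpoint locally; the target path below is just an example, not a path used by this repo.

```python
# Illustrative only: cache the PyTorch weights in a local directory.
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("jackaduma/SecBERT")
model = AutoModelForMaskedLM.from_pretrained("jackaduma/SecBERT")

# The target directory is an arbitrary example path.
tokenizer.save_pretrained("./secbert-weights")
model.save_pretrained("./secbert-weights")

# Later, load from the local directory instead of the Hub.
model = AutoModelForMaskedLM.from_pretrained("./secbert-weights")
```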
Using SecBERT in your own model
SecBERT models include all the necessary files to be plugged into your own model and are in the same format as BERT.
If you use PyTorch, refer to Hugging Face's repo, which provides detailed instructions on using BERT models.
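As an illustration (not code from this repo), the sketch below loads SecBERT with AutoModel and adds a hypothetical classification head on top of the [CLS] representation, the same way you would with any BERT checkpoint.

```python
# Sketch only: SecBERT as the encoder inside your own PyTorch model,
# with a hypothetical binary classification head.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SecBERTClassifier(nn.Module):
    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("jackaduma/SecBERT")
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls_embedding = outputs.last_hidden_state[:, 0]  # [CLS] token representation
        return self.classifier(cls_embedding)

tokenizer = AutoTokenizer.from_pretrained("jackaduma/SecBERT")
model = SecBERTClassifier()
batch = tokenizer(["Suspicious PowerShell activity detected."], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
```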
Fill Mask
We propose to build a language model that works on cyber security text; as a result, it can improve downstream tasks (NER, text classification, semantic understanding, Q&A) in the cyber security domain.
First, the commands below run the fill-mask pipeline for Google's BERT, AllenAI's SciBERT, and our SecBERT.
```bash
cd lm
python eval_fillmask_lm.py
```
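The actual comparison lives in lm/eval_fillmask_lm.py; the snippet below is only a hedged sketch of the idea, comparing top fill-mask predictions from BERT, SciBERT, and SecBERT on a cyber security sentence we made up for illustration.

```python
# Sketch of the comparison idea -- not the contents of eval_fillmask_lm.py.
from transformers import pipeline

MODELS = {
    "BERT": "bert-base-uncased",
    "SciBERT": "allenai/scibert_scivocab_uncased",
    "SecBERT": "jackaduma/SecBERT",
}

sentence = "The malware connected to its command and control [MASK]."

for name, checkpoint in MODELS.items():
    fill_mask = pipeline("fill-mask", model=checkpoint, tokenizer=checkpoint)
    top = fill_mask(sentence)[0]
    print(f"{name:8s} -> {top['token_str']} ({top['score']:.3f})")
```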
Downstream-tasks
TODO
Star-History
Donation
If this project helps you reduce development time, you can buy me a cup of coffee :)
AliPay(支付宝)
WechatPay(微信)
License
MIT © Kun