ML News
May 2023
- 25: BLIP-Diffusion: Image generation model with zero-shot editing, style transfer and more (tweet, paper)
- 24: QLoRA: 4-bit finetuning (tweet, paper, demo)
- 22: BigCode Evaluation harness for code LLMs (tweet)
- 18: VisualGLM-6B (code, model)
- 16: PEFT Whisper (tweet, code)
- 12: Spacy-llm, Integrating LLMs into structured NLP pipelines, Explosion (tweet, code)
- 11: ProteinGeneration: diffusion in protein sequence space (tweet, paper)
- 10: PaLM 2, Google (blog)
- 9: StarChat: Creating a Coding Assistant with StarCoder (tweet, blog, demo)
- 5: OpenLLaMA: An Open Reproduction of LLaMA (tweet, code, model)
- 4: StarCoder, A State-of-the-Art LLM for Code (tweet, blog, code, model)
- 3: Pi, Personal Intelligence (tweet, demo)
April 2023
- 28: StableVicuna, RLHF LLM Chatbot (tweet, blog)
- 28: FastChat-T5: Compact, commercial-friendly chatbot (tweet, code)
- 28: DeepFloyd IF: State-of-the-art text-to-image model that can also generate text (tweet, demo)
- 25: HuggingChat (tweet, UI)
- 25: MOSS, a 16B tool-augmented language model (tweet, code)
- 25: NeMo Guardrails, NVIDIA (blog, code)
- 25: Track Anything: Segment Anything Meets Videos (tweet)
- 24: Scaling Transformer to 1M tokens (tweet, code)
- 20: Run Whisper 70x faster with JAX and TPU (tweet)
- 19: StableLM (tweet)
- 16: CAMEL: Physics, Chemistry and Biology datasets (tweet)
- 15: Open Assistant (tweet, website)
- 15: ControlNet 1.1 (tweet)
- 7: SegGPT: Segmenting Everything in context (tweet)
- 6: Vicuna-7B weights are released (tweet)
- 6: StackLLaMA (tweet)
- 6: VideoCrafter: text to video model (tweet)
- 6: Generative Novel View synthesis (tweet)
- 5: SAM - Segment anything (tweet)
- 5: ChatArena, multi-agent game environments for LLMs (tweet)
- 5: Kandinsky 2.1 for image generation (tweet)
- 5: LLaMA-Adapter (tweet)
- 5: Latent Video Diffusion Models for long video generation (tweet)
- 4: MolFeat, a hub of molecular featurizers (tweet)
- 4: LangChain announced their $10M seed round (tweet)
- 4: Kandinsky 2.1 (tweet)
- 4: IGEL, an instruction-tuned German LLM (tweet)
- 4: Koala-13B: A Dialogue Model for Academic Research (tweet, blog)
- 4: Baize: An Open-Source chat model with PEFT (tweet)
- 3: Vicuna-13B weights are released (tweet)
- 3: A Survey of Large Language Models (tweet)
March 2023
- 30: BloombergGPT: A Large Language Model for Finance (tweet, paper)
- 30: Nucleotide Transformer, SOTA Genomics (tweet, code)
- 30: ColossalChat (blog, code)
- 29: GeoV-9b (tweet, code, weights, colab)
- 29: Spanish BERTIN GPT-J-6B Alpaca and Alpaca LoRA (tweet)
- 29: LLaMA Adapter (tweet, code, paper)
- 28: PRESTO dataset (github)
- 28: OpenFlamingo (tweet, blog)
- 28: Raven RWKV (RWKV fine-tuned on Alpaca and CodeAlpaca) (tweet, demo)
- 28: Cerebras-GPT (tweet, models)
- 28: GPT4All (tweet, code)
- 28: Replit Partners with Google Cloud (tweet)
- 27: LLaMA voice chat + Siri TTS (tweet)
- 26: Japanese Alpaca LoRA (tweet, demo, report)
- 26: LLaMA voice chat (tweet)
- 24: Text2Video-Zero (tweet, code)
- 24: Dolly (tweet, code, demo)
- 22: Alpaca LoRA as a chatbot (tweet, code)
- 20: Runway Gen-2 (tweet)
- 24: SwissBERT (tweet, blog)
- 17: Alpacoom: BLOOM fine-tuned on Alpaca's dataset using LoRA (tweet, model)
- 16: Alpaca LoRA: instruct-tune LLaMA on consumer hardware (tweet, code)
- 14: Claude, Anthropic (blog)
- 14: ChatGLM-6B (code, model)