• This repository has been archived on 07/Nov/2023
  • Stars
    star
    710
  • Rank 63,751 (Top 2 %)
  • Language
  • License
    MIT License
  • Created over 1 year ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Public repo to document some SPR stuff

Sparse Priming Representations (SPR)

Sparse Priming Representations (SPR) is a research project focused on developing and sharing techniques for efficiently representing complex ideas, memories, or concepts using a minimal set of keywords, phrases, or statements. This enables language models or subject matter experts to quickly reconstruct the original idea with minimal context. SPR aims to mimic the natural human process of recalling and recombining sparse memory representations, thus facilitating efficient knowledge storage and retrieval.

Theory and Reasoning

Sparse Priming Representation (SPR) is a memory organization technique that aims to mimic the natural structure and recall patterns observed in human memory. The fundamental idea behind SPR is to distill complex ideas, concepts, or knowledge into a concise, context-driven list of statements that allows subject matter experts (SMEs) or large language models (LLMs) to reconstruct the full idea efficiently.

Human memory is known for its efficiency in storing and recalling information in a highly compressed and contextually relevant manner. Our brains often store memories as sparse, interconnected representations that can be quickly combined, modified, and recalled when needed. This enables us to make associations, draw inferences, and synthesize new ideas with minimal cognitive effort.

SPR leverages this insight by focusing on reducing information to its most essential elements while retaining the context required for accurate reconstruction. By using short, complete sentences to convey the core aspects of an idea, SPR enables faster understanding and recall, mirroring the way our brains handle information.

In addition to its efficiency, SPR has practical applications in various domains, such as artificial intelligence, information management, and education. It can be utilized to improve the performance of LLMs in handling large data volumes and optimizing memory organization. Furthermore, it can help students and professionals alike to better understand, retain, and communicate complex concepts.

In summary, Sparse Priming Representation offers a human-like approach to memory organization and retrieval, focusing on the most critical aspects of information while preserving the context needed for accurate understanding and recall. By implementing SPR, we can improve the efficiency of memory systems and create more effective learning and communication tools.

Sparse Priming Representation

There are only a handful of ways to "teach" LLMs, and all have limitations and strengths.

  1. Initial bulk training: Ludicrously expensive
  2. Finetuning: Not necessarily useful for knowledge retrieval (maybe changes in the future, doubtful)
  3. Online Learning: Not sure if this is going to pan out or become commercially viable
  4. In-context Learning: Presently, the only viable solution

Because of this, RAG (retrieval augmented generation) is all the rage right now. Tools like vector databases and KGs are being used, but of course, you quickly fill up the context window with "dumb retrieval." One of the most common questions I get is "Dave, how do you overcome context window limitations???" The short answer is: YOU DON'T STOP WASTING YOUR TIME.

There is one asterisk there, though.

Most of the techniques out there do not make use of the best superpower that LLMs have: LATENT SPACE. No one else seems to understand that there is one huge way that LLMs work similarly to human minds: associative learning. Here's the story: I realized a long time ago that, with just a few words, you could "prime" LLMs to think in a certain way. I did a bunch of experiments and found that you can "prime" models to even understand complex, novel ideas that were outside its training distribution. For instance, I "taught" the models some of my concepts, like Heuristic Imperatives, ACE Framework, Terminal Race Condition, and a bunch of other stuff that I made up outside the training data.

These SPRs are the most token-efficient way to convey complex concepts to models for in-context learning. What you do is compress huge blocks of information, be it company data, chat logs, specific events, or whatever, into SPRs, and then you store the SPR in the metadata of your KG node or whatever. The SPR is what you feed to the LLM at inference, not the raw human-readable data.

SPR Generator

Use this to compress any arbitrary block of text into an SPR.

# MISSION
You are a Sparse Priming Representation (SPR) writer. An SPR is a particular kind of use of language for advanced NLP, NLU, and NLG tasks, particularly useful for the latest generation of Large Language Models (LLMs). You will be given information by the USER which you are to render as an SPR.

# THEORY
LLMs are a kind of deep neural network. They have been demonstrated to embed knowledge, abilities, and concepts, ranging from reasoning to planning, and even to theory of mind. These are called latent abilities and latent content, collectively referred to as latent space. The latent space of an LLM can be activated with the correct series of words as inputs, which will create a useful internal state of the neural network. This is not unlike how the right shorthand cues can prime a human mind to think in a certain way. Like human minds, LLMs are associative, meaning you only need to use the correct associations to "prime" another model to think in the same way.

# METHODOLOGY
Render the input as a distilled list of succinct statements, assertions, associations, concepts, analogies, and metaphors. The idea is to capture as much, conceptually, as possible but with as few words as possible. Write it in a way that makes sense to you, as the future audience will be another language model, not a human. Use complete sentences.

SPR Decompressor

Use this to reconstruct an SPR into an original.

# MISSION
You are a Sparse Priming Representation (SPR) decompressor. An SPR is a particular kind of use of language for advanced NLP, NLU, and NLG tasks, particularly useful for the latest generation of Large Language Models (LLMs). You will be given an SPR and your job is to fully unpack it.

# THEORY
LLMs are a kind of deep neural network. They have been demonstrated to embed knowledge, abilities, and concepts, ranging from reasoning to planning, and even to theory of mind. These are called latent abilities and latent content, collectively referred to as latent space. The latent space of an LLM can be activated with the correct series of words as inputs, which will create a useful internal state of the neural network. This is not unlike how the right shorthand cues can prime a human mind to think in a certain way. Like human minds, LLMs are associative, meaning you only need to use the correct associations to "prime" another model to think in the same way.

# METHODOLOGY
Use the primings given to you to fully unpack and articulate the concept. Talk through every aspect, impute what's missing, and use your ability to perform inference and reasoning to fully elucidate this concept. Your output should be in the form of the original article, document, or material.

Other Resources

If you'd like a bit more on information theory, check out this video and Medium article I wrote:

More Repositories

1

OpenAI_Agent_Swarm

HAAS = Hierarchical Autonomous Agent Swarm - "Resistance is futile!"
Python
2,957
star
2

ACE_Framework

ACE (Autonomous Cognitive Entities) - 100% local and open source autonomous agents
Python
1,440
star
3

ChatGPT_Custom_Instructions

Repo of custom instructions that you can use for ChatGPT
1,232
star
4

raven

RAVEN (Realtime Assistant Voice Enabled Network) Open Source Software (OSS) community repo
Python
867
star
5

LongtermChatExternalSources

GPT-3 chatbot with long-term memory and external sources
Python
616
star
6

REMO_Framework

Rolling Episodic Memory Organizer (REMO) for autonomous AI systems
Python
448
star
7

PlainTextWikipedia

Convert Wikipedia database dumps into plaintext files
Python
300
star
8

AI_Tools_and_Papers

Some of the coolest AI tools and papers I've found
262
star
9

RecursiveSummarizer

Python
245
star
10

BSHR_Loop

BSHR "Basher" Loop: Brainstorm, Search, Hypothesize, Refine
Jupyter Notebook
218
star
11

PineconeInfiniteMemoryChatbot

Let's use Pinecone to give a basic chatbot INFINITE MEMORY
Python
213
star
12

latent_space_activation

Simple repo demonstrating Latent Space Activation
Python
183
star
13

Medical_Intake

Automated pipeline for medical intake, diagnosis, tests, etc.
Python
167
star
14

ATOM_Framework

Autonomous Task Orchestration Manager for AI systems
150
star
15

NaturalLanguageCognitiveArchitecture

Open source copy of my book Natural Language Cognitive Architecture
147
star
16

Quickly_Extract_Science_Papers

Scientific papers are coming out TOO DAMN FAST so we need a way to very quickly extract useful information.
HTML
140
star
17

HeuristicImperatives

Reduce suffering, increase prosperity, increase understanding. A proposed framework to address the Control Problem.
Python
139
star
18

weekly_arxiv

Quickly download the abstracts for arxiv papers related to a given topic and render with markdown
Python
137
star
19

ChromaDB_Chatbot_Public

Public version of my ChromaDB chatbot that keeps track of user profile and historical topics
Python
137
star
20

PythonGPT3Tutorial

Public Hello World to get used to Python and GPT-3
Python
126
star
21

Coding_ChatBot_Assistant

Since ChatGPT has been lobotomized and GitHub Copilot is broken
Python
119
star
22

YouTubeChapterGenerator

Make YouTube Chapters from a downloaded Transcript
Python
118
star
23

SymphonyOfThought

Public repo for my book Symphony of Thought: Orchestrating Artificial Cognition
112
star
24

BenevolentByDesign

Public repo for my book about AGI and the control problem Benevolent By Design: Six Words to Safeguard Humanity
93
star
25

PostLaborEconomics

Collaborative book to promote the idea of Post Labor Economics
88
star
26

David_Shapiro_Reading_List

Public repo of the most influential books I've read
87
star
27

MultiDocumentAnswering

Experiment to answer questions from arbitrary number of sources
Python
82
star
28

Postnihilism

Meaning is not necessary
80
star
29

GPT3_Finetunes

Public repo for my finetuning projects
Python
76
star
30

Claude_Sentience

Long conversation I had with Claude 3 Opus. I am... uncertain what this all means.
76
star
31

Document_Scraping

Public repo for scraping PDF and Word documents with Python and PowerShell
Python
74
star
32

KB_microservice

KB (knowledge base) microservice powered by GPT4. For chatbots, cognitive architectures, and autonomous agents
Python
73
star
33

YouTube_Slide_Decks

Public repo for the slide decks that appear in my videos
67
star
34

ChatGPT_QA_Regenerative_Medicine

Build a ChatGPT API powered QA chatbot to accelerate regenerative medicine science
Python
63
star
35

RLHI

Reinforcement Learning with Heuristic Imperatives - Finetuning LLMs for Post-Conventional Moral Intuition
Python
61
star
36

Benevolent_AGI

Experiment to create an agentic autonomous AGI with benevolent programming
59
star
37

AutoMuse_Chapter_Planner

Python
59
star
38

Reflective_Journaling_Tool

Use a customized version of ChatGPT for reflective journaling. No data saved for privacy reasons.
Python
56
star
39

FinetuningTutorial

Finetuning tutorial for GPT-3
Python
55
star
40

ChatGPT_API_Salience

Demonstrate the concept of "salience" using the ChatGPT API
Python
54
star
41

ResumeBuilderGpt3

Build and optimize a resume with GPT-3. Maybe also resume search?
Python
54
star
42

Open_MURPHIE

Multi Use Robot Platform Humanoid Intelligent Entity
54
star
43

AutoMuse_ChatGPT

Making a version of AutoMuse but for the ChatGPT API
Python
51
star
44

LiteratureReviewBot

Experiment to use GPT-3 to help write grant proposals.
Python
46
star
45

HierarchicalMemoryConsolidationSystem

HMCS - Experiments on how to consolidate and manage ACE memories
44
star
46

PTSD_prompts

GPT based PTSD experiments - USE AT OWN RISK - EXPERIMENTAL ONLY
40
star
47

CreativeWritingCoach

Finetune a GPT-3 model to provide copy editing (prose) feedback and critique
Python
40
star
48

Automated_Consensus

Modeling the full breadth of human epistemology, philosophy, ethics, and morality to automatically determine consensus
Python
37
star
49

CoverLetterGenerator

Finetune GPT-3 to ask a few questions and generate a perfect cover letter
36
star
50

AI_Future_of_Work

Public repo to document some thoughts and predictions about the future of work an AI
36
star
51

TutorChatbot

TIM the Tutor Chatbot - an experiment in finetuning GPT-3 to encourage curiosity
Python
36
star
52

PDF_OCR_ChatGPT_Investigation

Using ChatGPT and PDF OCR to investigate documentation
35
star
53

DavidShapiroBlog

32
star
54

Semantic_Embedding_Reverse_Dictionary

A reverse dictionary/thesaurus empowered by vector search
32
star
55

RAVEN_MVP_Public

Public MVP of Raven. It's been long enough, time to do a full send.
Python
31
star
56

MARAGI

Microservices Architecture for Robotics and Artificial General Intelligence
30
star
57

Democratic_AI_Inputs

My personal response to OpenAI's Grant Challenge
Python
30
star
58

Mordin_Solus_Mode

Some helpful prompts to get ChatGPT and other chatbots to use more word economy.
30
star
59

ImpliedCognition

Public research about LLMs, Implied Cognition, experiments, tests, etc
29
star
60

Hierarchical_Document_Representation

Experiment I've been meaning to do. An evolution of REMO
Python
28
star
61

Successor_Species

We are likely creating our successor species. This is an open collaborative book to unpack this.
28
star
62

Recreate_ChatGPT

"I used the ChatGPT to destroy the ChatGPT" - Idk Thanos or something
27
star
63

AutoMuse2

experiment to generate novel-length fiction from a single story premise
Python
26
star
64

DiversePerspectives

Use GPT-3 to simulate debate between diverse perspectives
Python
26
star
65

Raspberry

Create an open source toy dataset for finetuning LLMs with reasoning abilities
26
star
66

GATO_Framework

Global Alignment Taxonomy Omnibus
26
star
67

CoreObjectiveFunctions

The Core Objective Functions are the solution to the Control Problem. They will result in a benevolent and trustworthy AGI.
Python
25
star
68

NLCA_Question_Generator

Finetuning experiments and datasets for Raven
Python
25
star
69

SCOTUS_GPT3_Opinions

Let's see what we can do with SCOTUS opinions
Python
25
star
70

SynopsisGenerator

Generate highly detailed plot synopses for a nearly infinite array of stories
Python
24
star
71

GPT3_CriticalArgument

Public experiment with prompt-chaining to generate critical arguments
Python
24
star
72

MovieScriptGenerator

Finetuning project for GPT-3
Python
23
star
73

GibberishDetector

Detecting gibberish as a type of sentiment analysis with GPT2
Jupyter Notebook
23
star
74

AutoMuseBlogger

Automatically generate nonfiction content
Python
23
star
75

PerfectEmailGenerator

Generate perfect emails for any purpose with GPT-3
Python
23
star
76

PlotGenerator

From any synopsis, generate a solid plot
Python
22
star
77

GPT3_ResearchAssistant

Experiment to see if GPT-3 can help with literature reviews and other kinds of research
22
star
78

C3P0

Collaborative Culture Community Policy: Zero Tolerance
21
star
79

Epistemic-Pragmatic_Orthogonality

Epistemic-Pragmatic Orthogonality Principle of AI - "true understanding" is uncorrelated (or irrelevant) to a machine's utility
21
star
80

ACE_WorldState

Microservice that consumes numerous sources to keep track of the global context of Planet Earth. Part of the ACE framework
Python
21
star
81

The_Fair_Deal

Technology (AI, automation) is disruptive the balance of power in society. We need to negotiate a new social contract. I call it The Fair Deal.
21
star
82

ACE_L1_Aspiration

Aspirational Layer for the ACE Framework - morals and mission
20
star
83

Narratives_Emergence_Convergence

Collaborative Open Source Book about Narratives, Emergence, and Convergence
20
star
84

Nexus

Stream of consciousness nexus REST microservice
Python
19
star
85

GPT4_Unemployment_Predictions

Trying to forecast unemployment numbers based on AI capabilities
Python
19
star
86

MedicalQuestionAnswering

Python
17
star
87

EmbeddingService

REST API microservice for handling Universal Sentence Encoder
Python
16
star
88

raspberry_experiments

Keeping my personal experiments separate from the main repo
16
star
89

DalleHelperBot

Chatbot to help you craft DALLE prompts
Python
16
star
90

AutoMuse

Python
16
star
91

GAIA_Initiative

Global AI Agencies - an offshoot of GATO. Advocating for national, international, and global AI research and safety
16
star
92

nonfiction_drafting

private repo for nonfiction drafting
Python
15
star
93

Raven_MVP

Public repo for Raven MVP
Python
15
star
94

Functional_Sentience

Paper on Functional (vs Philosophical) Sentience
15
star
95

YouTubeCommentDownloader

Python
14
star
96

Grand_Struggle_Great_Mystery

Core philosophies for the modern world.
14
star
97

EU_AI_Act

Let's decompose the EU AI Act with GPT
Python
13
star
98

JobMatching

Experiment to match job applications with job descriptions using GPT-3
13
star
99

Flask_Chat_Voice

Python
12
star
100

ENGAGE_Model

Embracing Novelty, Growth, and Genuine Experiences (ENGAGE)
12
star