• This repository has been archived on 22/Oct/2023
  • Language: Python
  • License: MIT License

Rolling Episodic Memory Organizer (REMO) for autonomous AI systems

REMO

  • REMO: Rolling Episodic Memory Organizer. Efficient, scalable memory management. Organizes conversational data into taxonomical ranks. Each rank clusters semantically similar elements. Powerful tool for context-aware AI systems. Improves conversational capabilities, recall accuracy.
  • Purpose: Assist AI systems in recalling relevant information. Enhance performance, maintain context. Supports natural language queries. Returns taxonomies of memory.
  • Structure: Tree-like, hierarchical. Bottom rank - message pairs. Higher ranks - summaries. Embeddings via Universal Sentence Encoder v5. Clustering by cosine similarity. Message pairs utilized because smallest semantic unit with context.
  • Functionality: Add new messages, rebuild tree, search tree. Passive microservice, memory management autonomic. Utilizes FastAPI REST API. Handles memory in concise, efficient manner.

Note: this code is still in early alpha. Expect bugs, and expect that further testing is needed!

EDIT: Someone implemented REMO with LangFlow: https://github.com/hunter-meloche/REMO-langflow

Figure: Binary Search Tree with Lightning Strike

Executive Summary

REMO (Rolling Episodic Memory Organizer) is an AI-powered microservice that organizes large volumes of text data, such as chat logs, into a hierarchical taxonomy. The taxonomy is constructed using summaries of message pairs and message clusters, allowing users to easily search and navigate through the conversation history. REMO utilizes the Universal Sentence Encoder for generating embeddings and clustering algorithms for organizing the data. The microservice is built using FastAPI, providing a simple and easy-to-use RESTful API.

Requirements

To run REMO, you will need the following:

  • Python 3.7 or higher
  • FastAPI
  • TensorFlow
  • TensorFlow Hub
  • scikit-learn
  • openai
  • PyYAML

Installation

Note: You may need to change tensorflow to tensorflow-macos in your requirements.txt file on certain OS X machines.

  1. Run pip install -r requirements.txt
  2. Create a key_openai.txt file and put your OpenAI API key inside.

Usage

  1. Start the FastAPI server: uvicorn remo:app --reload
  2. Interact with the API using a REST client or web browser: http://localhost:8000

API Endpoints

  • POST /add_message: Add a new message to REMO. Speaker, timestamp, and content required.
  • GET /search: Search the taxonomy for relevant nodes. Query can be any string, such as messages, context, or whatever you want.
  • POST /rebuild_tree: Trigger a full tree rebuilding event. This deletes everything above L2_message_pairs and regenerates all clusters.
  • POST /maintain_tree: Trigger a tree maintenance event. This attempts to fit the most recent message pairs into the current tree structure, or create new nodes.
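
As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) requests using only Python's standard library. The parameter names `speaker`, `timestamp`, `message`, and `query`, and the choice to pass them in the query string, are assumptions; check remo.py for the actual signatures.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # default uvicorn address from the Usage section


def build_add_message(speaker: str, timestamp: float, content: str) -> Request:
    """Build a POST /add_message request (parameter names are assumed)."""
    params = urlencode({"speaker": speaker, "timestamp": timestamp, "message": content})
    return Request(f"{BASE_URL}/add_message?{params}", method="POST")


def build_search(query: str) -> Request:
    """Build a GET /search request (the 'query' parameter name is assumed)."""
    return Request(f"{BASE_URL}/search?{urlencode({'query': query})}", method="GET")


def build_trigger(endpoint: str) -> Request:
    """Build a POST request for /rebuild_tree or /maintain_tree."""
    if endpoint not in ("rebuild_tree", "maintain_tree"):
        raise ValueError(f"unknown endpoint: {endpoint}")
    return Request(f"{BASE_URL}/{endpoint}", method="POST")
```

With the server running, any of these requests can be sent by passing it to `urllib.request.urlopen`.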

File Structure

  • remo.py: The main FastAPI application file.
  • utils.py: Utility functions for processing, clustering, and maintaining the taxonomy.
  • README.md: Documentation for the REMO project.

Folder Structure and YAML Files

The REMO microservice organizes conversation data into a hierarchical folder structure, with each folder representing a different taxonomical rank. Each folder contains YAML files that store the conversation data and associated metadata. Below is an overview of the folder structure and the content of the YAML files.

Folder Structure

REMO/
├── L1_raw_logs/
├── L2_message_pairs/
├── L3_summaries/
├── L4_summaries/
├── ...

Description of folders

  • L1_raw_logs: This folder contains the raw conversation logs. Each YAML file in this folder represents a single message with its associated metadata.
  • L2_message_pairs: This folder contains message pairs, which are created by combining two consecutive raw logs. Each YAML file in this folder represents a message pair with its associated metadata and embeddings.
  • L3_summaries: This folder contains summaries of message pairs. Summaries are created by clustering message pairs and generating a concise representation of the cluster. The structure of this folder is similar to L2_message_pairs, with each YAML file representing a summary and its associated metadata and embeddings.
  • ...: Additional taxonomical ranks can be created as needed when clustering summaries at higher levels.
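
The step from L1_raw_logs to L2_message_pairs can be sketched as follows. This is a minimal illustration, not the code in utils.py: it assumes an overlapping (sliding-window) pairing, that a pair takes the timestamp of its second message, and a hypothetical `file` key naming each raw log's YAML file.

```python
from typing import Dict, List


def make_message_pairs(raw_logs: List[Dict]) -> List[Dict]:
    """Combine each raw log with the one that follows it (assumed sliding window)."""
    # Sort by timestamp so "consecutive" means consecutive in time.
    ordered = sorted(raw_logs, key=lambda m: m["timestamp"])
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        pairs.append({
            "content": (
                f"{first['speaker']}: {first['content']}\n"
                f"{second['speaker']}: {second['content']}"
            ),
            "speaker": f"{first['speaker']}, {second['speaker']}",
            "timestamp": second["timestamp"],       # assumption: use the later message's time
            "files": [first["file"], second["file"]],  # hypothetical 'file' key on raw logs
        })
    return pairs
```

With this pairing, n raw logs yield n - 1 message pairs; a non-overlapping scheme would yield roughly half as many.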

YAML Files

YAML files in the REMO folder structure store conversation data and associated metadata. YAML was selected because it is easy for humans to read when debugging and browsing. The structure of a YAML file is as follows:

content: <conversation_content>
speaker: <speaker_name> (only applicable for raw logs and message pairs)
timestamp: <timestamp>
vector: <embedding_vector>
files: <list_of_child_files> (only applicable for message pairs and summaries at higher ranks)

  • content: The conversation content, which can be a single message, a message pair, or a summary.
  • speaker: The name of the speaker for a single message or a message pair.
  • timestamp: The timestamp of the conversation, applicable for single messages or message pairs.
  • vector: The embedding vector generated from the conversation content, which is used for clustering and searching.
  • files: A list of child files that belong to the current summary (applicable only for summaries at higher taxonomical ranks).
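
One way to model such a record in Python before writing it to disk is sketched below. The class name `MemoryRecord` and the `to_dict` helper are hypothetical and not part of REMO; optional fields are simply omitted when they do not apply to a given rank.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class MemoryRecord:
    """Hypothetical in-memory form of one REMO YAML file."""
    content: str
    timestamp: float
    vector: List[float]
    speaker: Optional[str] = None      # raw logs and message pairs only
    files: Optional[List[str]] = None  # child files, where applicable

    def to_dict(self) -> dict:
        """Return only the fields that are set, in declaration order."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```

The resulting dictionary could then be serialized with PyYAML, for example via `yaml.safe_dump(record.to_dict())`.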

Explanation of REMO Logic

REMO organizes chat logs into a hierarchical taxonomy using a combination of semantic embeddings and clustering techniques. The process can be understood through the following steps:

  1. Semantic Embeddings: Each message or message pair is converted into a high-dimensional semantic vector using the Universal Sentence Encoder. These vectors capture the meaning and context of the text, allowing for accurate comparisons between different messages.

  2. Clustering: The semantic vectors are grouped together using clustering algorithms, such as k-means clustering. This process creates clusters of related messages, which can be represented by summaries at different levels of the taxonomy.

  3. Summarization: AI language models, like GPT-3, are used to generate summaries of message pairs or clusters. These summaries provide a concise and coherent representation of the underlying conversations, making it easier for users to quickly understand the content.

  4. Taxonomy Construction: The resulting clusters and summaries are organized into a hierarchical structure, similar to a tree. Each level of the tree represents a different level of detail, with the top levels containing general summaries and the lower levels containing more specific information.

  5. Maintenance: As new messages are added to the system, REMO can efficiently integrate them into the existing taxonomy through periodic tree maintenance events. This ensures that the system remains up-to-date and relevant, even as new conversations are added.

The hierarchical structure created by REMO allows users to easily navigate and search through large volumes of conversation data. By starting at the top levels of the taxonomy and drilling down to the lower levels, users can efficiently explore the content and gain insights without getting overwhelmed by the details.
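
The clustering and summarization steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions: it substitutes a greedy cosine-similarity threshold for k-means, and the `summarize` and `embed` callables are placeholders for the language model and the Universal Sentence Encoder. The `file` key and the 0.8 threshold are likewise hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cluster_by_similarity(vectors: List[Vector], threshold: float = 0.8) -> List[List[int]]:
    """Greedy clustering: join the first cluster whose centroid is similar enough."""
    clusters: List[List[int]] = []
    for index, vector in enumerate(vectors):
        for members in clusters:
            if cosine_similarity(vector, centroid([vectors[j] for j in members])) >= threshold:
                members.append(index)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([index])
    return clusters


def build_next_rank(
    nodes: List[Dict],
    summarize: Callable[[List[str]], str],
    embed: Callable[[str], List[float]],
    threshold: float = 0.8,
) -> List[Dict]:
    """Cluster one rank of nodes and produce one summary node per cluster."""
    parents = []
    for members in cluster_by_similarity([n["vector"] for n in nodes], threshold):
        summary = summarize([nodes[i]["content"] for i in members])
        parents.append({
            "content": summary,
            "vector": embed(summary),
            "files": [nodes[i]["file"] for i in members],
        })
    return parents
```

Calling `build_next_rank` repeatedly on its own output, until a single rank is small enough, would give the kind of tree described above.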

Explanation of the Returned Taxonomy and its Value

The taxonomy returned by REMO is a hierarchical structure that presents conversation data at varying levels of granularity. Each level of the taxonomy represents a different level of detail, with higher levels providing general summaries and lower levels offering more specific information. This structure enables users to explore and understand large amounts of conversation data efficiently.

The value and usefulness of the returned taxonomy lie in its ability to:

  1. Simplify Navigation: The hierarchical structure allows users to navigate conversation data in a logical and organized manner. Users can start at the top levels, which provide an overview of the main topics, and then delve deeper into the lower levels to explore specific conversations or details.

  2. Improve Searchability: With the taxonomy in place, users can quickly and accurately find relevant conversations based on their search queries. The system identifies the most relevant nodes in the taxonomy and returns a list of associated summaries, allowing users to pinpoint the desired information without sifting through countless unrelated messages.

  3. Enhance Understanding: The summaries generated at each level of the taxonomy provide concise and coherent representations of the underlying conversations. This makes it easier for users to grasp the main ideas and context of the conversations without needing to read through every individual message.

  4. Facilitate Knowledge Discovery: By organizing conversations into meaningful clusters and summaries, the taxonomy helps users uncover new insights and connections between different topics or ideas. This can lead to a deeper understanding of the conversation data and the identification of previously unrecognized patterns or trends.

  5. Optimize Scalability: The hierarchical structure of the taxonomy allows REMO to efficiently handle large volumes of conversation data. As new messages are added, the system can quickly integrate them into the existing taxonomy through periodic maintenance events, ensuring that the taxonomy remains up-to-date and relevant.
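
A maintenance event of the kind described in the last point might look like the sketch below: a new message pair is attached to the most similar existing summary node, or becomes a new node if nothing is close enough. The 0.75 threshold, the `file` key, and the function name are assumptions; a real implementation would presumably also regenerate the affected summary.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def maintain(parents: List[Dict], new_pair: Dict, threshold: float = 0.75) -> Dict:
    """Fit new_pair into the closest parent node, or create a new node for it."""
    best = max(
        parents,
        key=lambda p: cosine_similarity(p["vector"], new_pair["vector"]),
        default=None,
    )
    if best is not None and cosine_similarity(best["vector"], new_pair["vector"]) >= threshold:
        # Close enough: attach the pair to the existing node.
        best["files"].append(new_pair["file"])
        return best
    # Nothing similar enough: the pair seeds a new node in the rank above.
    node = {
        "content": new_pair["content"],
        "vector": list(new_pair["vector"]),
        "files": [new_pair["file"]],
    }
    parents.append(node)
    return node
```

Because only one rank is scanned per new pair, this is much cheaper than a full rebuild, at the cost of clusters drifting over time.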

Example REMO Taxonomy

The following example is imaginary, but it illustrates the value. A returned taxonomy starts broad, vague, and generic. This can be useful when working with chatbots, as they frequently lose context. However, as the taxonomy drills down, it becomes more specific, quickly giving context as well as detail. Furthermore, the recursive summarization scheme of REMO results in temporally invariant recall, which means that all memories are treated equally, no matter how old they are.

Example Query:

How does REMO handle salience?

Example Taxonomy:

  • Rank 3:
    • "Discussion about REMO, a memory system for chatbots, which uses a tree-like structure for organizing and retrieving memories."
  • Rank 2:
    • "Exploration of different strategies for handling salience in REMO, focusing on clustering conversational pairs and searching through tree levels."
  • Rank 1:
    • "With this search tree, you can very quickly zero in on the most salient memories AND it is intrinsically organized like a web/graph so you can "walk" to nearby memories. You can even add an algorithm so the "lightning strike" will fork a few times, like "grab the top and second most salient memories" all the way down."

You can see that the highest rank provides some context: what is REMO, and what is it for? Then you can see that the taxonomy drills down into clustering strategies. Finally, the lowest rank recalls a specific line of dialog.
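
The top-down "lightning strike" search with optional forks, as described in the Rank 1 quote, can be sketched as below. This is an illustrative sketch only: it assumes the tree is held in memory with a hypothetical `children` key on each node, whereas REMO stores child references as file lists on disk, and it takes an already-embedded query vector.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lightning_search(roots: List[Dict], query_vector: Sequence[float], forks: int = 1) -> List[List[str]]:
    """Descend from the top rank, keeping the `forks` most similar nodes per rank.

    Returns one list of contents per rank, from broadest to most specific.
    """
    taxonomy: List[List[str]] = []
    frontier = roots
    while frontier:
        ranked = sorted(
            frontier,
            key=lambda node: cosine_similarity(node["vector"], query_vector),
            reverse=True,
        )
        hits = ranked[:forks]
        taxonomy.append([node["content"] for node in hits])
        # Only the children of the selected nodes are considered at the next rank.
        frontier = [child for node in hits for child in node.get("children", [])]
    return taxonomy
```

With `forks=1` the search follows a single path down the tree; larger values widen the strike at each rank in exchange for more comparisons.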
