• Stars
    star
    457
  • Rank 95,750 (Top 2 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 3 years ago
  • Updated almost 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/casual, active/passive, and many more. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.

PyPI - License visitors

Styleformer

A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/casual, active/passive, and many more.For instance, understand What makes text formal or casual/informal.

Table of contents

Usecases for Styleformer

Area 1: Data Augmentation

  • Augment training datasets with various fine-grained language styles.

Area 2: Post-processing

  • Apply style transfers to machine generated text.
  • e.g.
    • Refine a Summarised text to active voice + formal tone.
    • Refine a Translated text to more casual tone to reach younger audience.

Area 3: Controlled paraphrasing

  • Formal <=> Casual and Active <=> style transfers adds a notion of control over how we paraphrase when compared to free-form paraphrase where there is control or guarantee over the paraphrases.

Area 4: Assisted writing

  • Integrate this to any human writing interfaces like email clients, messaging tools or social media post authoring tools. Your creativity is your limit to te uses.
  • e.g.
    • Polish an email with business tone for professional uses.

Installation

pip install git+https://github.com/PrithivirajDamodaran/Styleformer.git

Quick Start

python [IMPORTANT] If you are using in notebook, use the below line to login: from huggingface_hub import notebook_login notebook_login() else use: huggingface-cli login

Casual to Formal (Available now !)

from styleformer import Styleformer
import torch
import warnings
warnings.filterwarnings("ignore")

'''
#uncomment for re-producability
def set_seed(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

set_seed(1234)
'''

# style = [0=Casual to Formal, 1=Formal to Casual, 2=Active to Passive, 3=Passive to Active etc..]
sf = Styleformer(style = 0) 

source_sentences = [
"I am quitting my job",
"Jimmy is on crack and can't trust him",
"What do guys do to show that they like a gal?",
"i loooooooooooooooooooooooove going to the movies.",
"That movie was fucking awesome",
"My mom is doing fine",
"That was funny LOL" , 
"It's piece of cake, we can do it",
"btw - ur avatar looks familiar",
"who gives a crap?",
"Howdy Lucy! been ages since we last met.",
"Dude, this car's dope!",
"She's my bestie from college",
"I kinda have a feeling that he has a crush on you.",
"OMG! It's finger-lickin' good.",
]   

for source_sentence in source_sentences:
    target_sentence = sf.transfer(source_sentence)
    print("-" *100)
    print("[Casual] ", source_sentence)
    print("-" *100)
    if target_sentence is not None:
        print("[Formal] ",target_sentence)
        print()
    else:
        print("No good quality transfers available !")
[Casual]  I am quitting my job
[Formal]  I will be stepping down from my job.
----------------------------------------------------------------------------------------------------
[Casual]  Jimmy is on crack and can't trust him
[Formal]  Jimmy is a crack addict I cannot trust him
----------------------------------------------------------------------------------------------------
[Casual]  What do guys do to show that they like a gal?
[Formal]  What do guys do to demonstrate their affinity for women?
----------------------------------------------------------------------------------------------------
[Casual]  i loooooooooooooooooooooooove going to the movies.
[Formal]  I really like to go to the movies.
----------------------------------------------------------------------------------------------------
[Casual]  That movie was fucking awesome
[Formal]  That movie was wonderful.
----------------------------------------------------------------------------------------------------
[Casual]  My mom is doing fine
[Formal]  My mother is doing well.
----------------------------------------------------------------------------------------------------
[Casual]  That was funny LOL
[Formal]  That was hilarious
----------------------------------------------------------------------------------------------------
[Casual]  It's piece of cake, we can do it
[Formal]  The whole process is simple and is possible.
----------------------------------------------------------------------------------------------------
[Casual]  btw - ur avatar looks familiar
[Formal]  Also, your avatar looks familiar.
----------------------------------------------------------------------------------------------------
[Casual]  who gives a crap?
[Formal]  Who cares?
----------------------------------------------------------------------------------------------------
[Casual]  Howdy Lucy! been ages since we last met.
[Formal]  Hello, Lucy It has been a long time since we last met.
----------------------------------------------------------------------------------------------------
[Casual]  Dude, this car's dope!
[Formal]  I find this car very appealing.
----------------------------------------------------------------------------------------------------
[Casual]  She's my bestie from college
[Formal]  She is my best friend from college.
----------------------------------------------------------------------------------------------------
[Casual]  I kinda have a feeling that he has a crush on you.
[Formal]  I have a feeling that he is attracted to you.
----------------------------------------------------------------------------------------------------
[Casual]  OMG! It's finger-lickin' good.
[Formal]  It is so good, it is delicious.
----------------------------------------------------------------------------------------------------

Formal to Casual (Available now !)

from styleformer import Styleformer
import warnings
warnings.filterwarnings("ignore")

# style = [0=Casual to Formal, 1=Formal to Casual, 2=Active to Passive, 3=Passive to Active etc..]
sf = Styleformer(style = 1) 
import torch
def set_seed(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

set_seed(1212)

source_sentences = [
"I would love to meet attractive men in town",
"Please leave the room now",
"It is a delicious icecream",
"I am not paying this kind of money for that nonsense",
"He is on cocaine and he cannot be trusted with this",
"He is a very nice man and has a charming personality",
"Let us go out for dinner",
"We went to Barcelona for the weekend. We have a lot of things to tell you.",
]   

for source_sentence in source_sentences:
    # inference_on = [-1=Regular model On CPU, 0-998= Regular model On GPU, 999=Quantized model On CPU]
    target_sentence = sf.transfer(source_sentence, inference_on=-1, quality_filter=0.95, max_candidates=5)
    print("[Formal] ", source_sentence)
    if target_sentence is not None:
        print("[Casual] ",target_sentence)
    else:
        print("No good quality transfers available !")
    print("-" *100)        
[Formal]  I would love to meet attractive men in town
[Casual]  i want to meet hot guys in town
----------------------------------------------------------------------------------------------------
[Formal]  Please leave the room now
[Casual]  leave the room now.
----------------------------------------------------------------------------------------------------
[Formal]  It is a delicious icecream
[Casual]  It is a yummy icecream
----------------------------------------------------------------------------------------------------
[Formal]  I am not paying this kind of money for that nonsense
[Casual]  But I'm not paying this kind of money for that crap
----------------------------------------------------------------------------------------------------
[Formal]  He is on cocaine and he cannot be trusted with this
[Casual]  he is on coke and he can't be trusted with this
----------------------------------------------------------------------------------------------------
[Formal]  He is a very nice man and has a charming personality
[Casual]  he is a really nice guy with a cute personality.
----------------------------------------------------------------------------------------------------
[Formal]  Let us go out for dinner
[Casual]  let's hang out for dinner.
----------------------------------------------------------------------------------------------------
[Formal]  We went to Barcelona for the weekend. We have a lot of things to tell you.
[Casual]  hehe..we went to barcelona for the weekend..we got a lot of things to tell ya..
----------------------------------------------------------------------------------------------------

Active to Passive (Available now !)

# style = [0=Casual to Formal, 1=Formal to Casual, 2=Active to Passive, 3=Passive to Active etc..]
sf = Styleformer(style = 2) 

Passive to Active (Available now !)

# style = [0=Casual to Formal, 1=Formal to Casual, 2=Active to Passive, 3=Passive to Active etc..]
sf = Styleformer(style = 3) 

Knobs

# inference_on = [-1=Regular model On CPU, 0-998= Regular model On GPU, 999=Quantized model On CPU]
target_sentence = sf.transfer(source_sentence, inference_on=-1, quality_filter=0.95, max_candidates=5)

Models

Model Type Status
prithivida/informal_to_formal_styletransfer Seq2Seq Beta
prithivida/formal_to_informal_styletransfer Seq2Seq Beta
prithivida/active_to_passive_styletransfer Seq2Seq Beta
prithivida/passive_to_active_styletransfer Seq2Seq Beta
prithivida/positive_to_negative_styletransfer Seq2Seq WIP
prithivida/negative_to_positive_styletransfer Seq2Seq WIP

Dataset

  • The casual <=> formal dataset was generated using ideas mentioned in reference paper 1
  • The positive <=> negative dataset was generated using ideas mentioned in reference paper 3
  • Fined tuned on T5 on a Tesla T4 GPU and it took ~2 hours to train each of the above models with batch_size = 16 and epochs = 5.(Will share training args shortly)

Benchmark

  • TBD

Streamlit Demo

pip install streamlit
streamlit run streamlit_app.py

References

Citation

  • TBD

More Repositories

1

Gramformer

A framework for detecting, highlighting and correcting grammatical errors on natural language text. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Python
1,422
star
2

Parrot_Paraphraser

A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
Python
823
star
3

FlashRank

Ultra-lite & Super-fast SoTA cross-encoder based re-ranking for your search & retrieval pipelines. Created by Prithivi Da, open for PRs & Collaborations.
Python
66
star
4

ZSIC

Zero Shot Image Classification but more, Supports Multilingual labelling and a variety of CNN based models for a vision backbone by using OpenAI CLIP for $ conscious uses (Super simple, so a 10th-grader could totally write this but since no 10th-grader did, I did) - Prithivi Da
Python
42
star
5

Alt-ZSC

Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models to do ZSC. Hence, can be lightweight + supports more languages without trading-off accuracy. (Super simple, a 10th-grader could totally write this but since no 10th-grader did, I did) - Prithivi Da
Python
35
star
6

NLP-Experiments

NLP related python solutions and recipes I built
Jupyter Notebook
18
star
7

MQTTPush4Android

MQTT Push Notification Service for Android
Java
14
star
8

vision-language-modelling-series

Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations
Jupyter Notebook
13
star
9

Applied-ML-from-1st-principles

Companion Repo for the book The Applied ML Field Manual, Prithiviraj Damodaran
Jupyter Notebook
12
star
10

RealTimeNotification

AWS SNS Equivalent - LMAX Disruptor/MQTT/RabbitMQ/ based Real-time x-Platform Notification Framework
Java
7
star
11

WhatTheFood

An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.
3
star
12

langid-benchmark

A benchmark of off-the-shelf models that detects language in text - Contributions welcome
Python
3
star
13

Whatsapp--

A variant of whatsapp called "whatsapp--" with cassandra
JavaScript
2
star
14

AirportTaxiDemandPrediction

Airport Taxi Demand Forecast ML problem
Jupyter Notebook
2
star
15

VisionLanguageProgress

Repository to track the progress in V+L Multimodal Learning, including the datasets and the state-of-the-art of the current pre-trained models on various V+L tasks
2
star
16

ICECAGE

Image CEntric CAption Generation Evaluation
1
star
17

ATMFraudMonitor

CEP for Banking systems powered by Apache Flink
Java
1
star
18

PrithivirajDamodaran

1
star
19

IndeedSalaryPrediction

Indeed Salary Prediction
Jupyter Notebook
1
star