Repository information
This repository contains data and code for the paper below:
Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information
Sudha Rao ([email protected]) and Hal Daumé III ([email protected])
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018)
Downloading data
- Download the clarification questions dataset from Google Drive here: https://go.umd.edu/clarification_questions_dataset
- cp -r clarification_questions_dataset/data ranking_clarification_questions/data
- Download word embeddings trained on the Stack Exchange data dump here: https://go.umd.edu/stackexchange_embeddings
- cp -r stackexchange_embeddings/embeddings ranking_clarification_questions/embeddings
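The two copy steps above can be wrapped in a small shell helper that fails early when a download is missing and refuses to copy into a destination that already exists (a recursive copy into an existing directory would nest the source inside it). This is only a sketch: the function name copy_into_repo is hypothetical and not part of the repository, and it assumes both downloads were unpacked in the current directory under the names used above.

```shell
# Hypothetical helper (not part of the repository): copy a downloaded
# directory into the repository checkout.
copy_into_repo() {
    src="$1"    # e.g. clarification_questions_dataset/data
    dest="$2"   # e.g. ranking_clarification_questions/data

    # Stop if the download has not been unpacked where we expect it.
    if [ ! -e "$src" ]; then
        echo "missing: $src (download and unpack it first)" >&2
        return 1
    fi

    # Refuse to overwrite: copying into an existing directory would
    # create a nested copy such as data/data.
    if [ -e "$dest" ]; then
        echo "already exists: $dest (remove it to copy again)" >&2
        return 1
    fi

    # Create the parent directory if needed, then copy recursively.
    mkdir -p "$(dirname "$dest")"
    cp -r "$src" "$dest"
}
```

From the directory holding both downloads, it would be called as `copy_into_repo clarification_questions_dataset/data ranking_clarification_questions/data` and `copy_into_repo stackexchange_embeddings/embeddings ranking_clarification_questions/embeddings`.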
The dataset above contains clarification questions for these three Stack Exchange sites:
- askubuntu.com
- unix.stackexchange.com
- superuser.com
Running models on the data above
To run models on a combination of the three sites above, see ranking_clarification_questions/src/models/README
Generating data for other sites
To generate clarification questions for a different Stack Exchange site, see ranking_clarification_questions/src/data_generation/README
Retraining Stack Exchange word embeddings
To retrain word embeddings on a newer version of the Stack Exchange data dump, see ranking_clarification_questions/src/embedding_generation/README
Contact information
Please contact Sudha Rao ([email protected]) with any questions or feedback.