A Web Service for the CPAN
MetaCPAN aims to provide a free, open web service which provides metadata for CPAN modules.
REST API
MetaCPAN is based on Elasticsearch, so it provides a RESTful interface as well
as the option to create complex queries. The
docs/
directory provides a good
starting point for REST access to MetaCPAN.
Expanding Your Author Info
MetaCPAN allows authors to add custom metadata about themselves to the index. Log in to MetaCPAN to add more information about yourself.
Installing Your Own MetaCPAN:
If you want to run MetaCPAN locally, we encourage you to start with metacpan-docker. However, you may still find some info here:
Troubleshooting Elasticsearch
You can restart Elasticsearch (ES) manually if you need to troubleshoot.
sudo service elasticsearch restart
If you are unable to access [[http://localhost:9200]] (give it a few seconds) you should kill the Elasticsearch process and run it in foreground to see the debug output
sudo service elasticsearch stop
cd /opt/elasticsearch
sudo bin/elasticsearch -f
If you get a "Can't start up: not enough memory" error when trying to start Elasticsearch, you likely need to update your JRE. On Ubuntu:
# fixes "not enough memory" errors
sudo apt-get install openjdk-6-jre
(Note: If you intend to try indexing a full MiniCPAN, you may find that Elasticsearch wants to use more open filehandles than your system allows by default. This script can be used to start ES with the appropriate ulimit adjustment).
Run the test suite
The test suite accesses Elasticsearch on port 9900. The developer VM should have a dedicated test instance running in the background already, but if you want to run it manually:
cd /opt/elasticsearch
sudo bin/elasticsearch -f -Des.http.port=9900 -Des.cluster.name=testing
Then run the test suite:
cd /home/metacpan/metacpan-api
./bin/prove t
The test suite has to pass all tests.
Create the ElasticSearch Index
./bin/run bin/metacpan mapping --delete
--delete
will drop all indices first to clear the index from test data.
Begin Indexing Your Modules
./bin/run bin/metacpan release /path/to/cpan/authors/id/
You should note that you can index either your CPAN mirror or a minicpan mirror. You can even index just parts of a mirror:
./bin/run bin/metacpan release /path/to/cpan/authors/id/{A,B}
Tag the Latest Releases
./bin/run bin/metacpan latest --cpan /path/to/cpan/
Index Author Data
./bin/run bin/metacpan author --cpan /path/to/cpan/
Note that minicpan doesn't provide the 00whois.xml file which is used to generate the index; you will have to download it manually (it is in the authors/ directory) in order to index authors.
wget -O /path/to/cpan/authors/00whois.xml cpan.cpantesters.org/authors/00whois.xml
It also doesn't include author.json files, so that data will also be missing unless you get it from somewhere else.
Set Up Proxy in Front of ElasticSearch
Start API server on port 5000
./bin/run plackup -p 5000 -r
This will start a single-threaded test server. If you need extra performance, use Starman
instead.
Notes
For a full list of options:
./bin/run bin/metacpan release --help
Contributing:
If you'd like to get involved, find us at #metacpan on irc.perl.org or open an issue on GitHub and let us know what you'd like to start working on.
IRC
You can find us at #metacpan on irc.perl.org Access it via web interface.