trec_eval is the standard tool used by the TREC community for
evaluating an ad hoc retrieval run, given the results file and a
standard set of judged results.

------------------------------------------------------------------------------
Installation: Should be as easy as typing "make" in the source directory.
If you wish the trec_eval binary to be placed in a standard location, alter
the first line of Makefile appropriately.

------------------------------------------------------------------------------
Testing: sample input and output files are included in the directory test.
"make quicktest" will perform some sample simple evaluations and compare
the results.

------------------------------------------------------------------------------
Usage: Most options can be ignored.  The only one most folks will need
is the "-q" flag, to indicate whether to output official results for
individual queries as well as the averages over all queries.
Official TREC usage might be something like
    trec_eval -q -c -M1000 official_qrels submitted_results
to ensure correct evaluation if submitted_results doesn't have results
for all queries, or returns more than 1000 documents per query.

If you wish to output only one particular measure:
    trec_eval -m measure[.params] rel_info_file results_file

------------------------------------------------------------------------------
Change Log (only recent)
------------------------------------------------------------------------------
12/31/08 comments and documentation of Zscore file format corrected.
        trec_eval.c get_zscores.c
2/25/08 Version 9.0alpha.  Complete rewrite of entire trec_eval (needed for
        a long time!).
        Complete separation of individual measure calculations - computers
        are now fast enough so can afford recalculation of lots of
        intermediate values.  Should be much easier to add measures to, and
        much easier to add new input file formats with associated measures.
        Parameters for measures (eg, cutoffs for P) can be specified on the
        command line.  Choice of measures can be specified on the command
        line.
        An initial set of preference evaluation measures (with their own
        input rel_info format) has been added.
        Help now gives targeted measure and format descriptions.  Try
            trec_eval -h -m all_prefs -R prefs
        to get info on preference measures and formats, for instance.
        All internal calculations are in double rather than float.  Yields
        minor variations in output at rare times; mostly when going from a
        double percentage to a corresponding doc cutoff (eg, in
        iprec_at_recall).
        All globally known procedure names or variables now begin with 'te_'
        to allow incorporation of procedures in other programs.
        Measures added:
            ndcg, ndcg_cut, set_F, success, map_avgjg, P_avgjg,
            various preference evaluation measures.
        Measures renamed:
            Rprec-mult_* was *R-prec
            set_P was exact_prec
            set_recall was exact_recall
            set_relative_P was exact_relative_prec
            set_map was exact_unranked_avg_prec
            gm_map was gm_ap
            11pt_avg was 11-pt_avg
            P_* was P*
            recall_* was recall*
            relative_P_* was relative_P*
            iprec_at_recall_* was ircl_prn.*
        Measures dropped for now:
            3-pt_avg avg_doc_prec avg_relative_prec
            exact_relative_unranked_avg_prec map_at_R int_map
            exact_int_R_rcl_prec int_map_at_R unranked_avg_prec*
            relative_unranked_avg_prec* rcl_at_142_nonrel fallout_recall_*
            int_*R-prec micro_prec micro_recall micro_bpref
            bpref variants
            time base measures.
        Input formats added:
            prefs - allows expression of preferences
            qrels_prefs - same as standard qrels, except treated as prefs
            qrels_jg - same as standard qrels, except allows judgment sets
                       from multiple users (judgment groups).

Version 8.1, Added infAP, minor bug fixes
7/24/06 Improved infAP comments (implementation verified by Yilmaz).
        trec_eval_help.c: allow longer measure explanations.
6/27/06 get_opt.c Fixed error message
6/22/06 Added measure infAP (Aslam et al) to allow judging only sample of
        pools.
        -1 for rel in qrels file interpreted as pool doc not judged.
6/22/06 trvec_teval.c: fixed bugs in calculation of bpref if multiple
        relevance levels were used and a non-default relevance level was
        given.  (Eg. A doc with rel level of 2 was counted as unjudged rather
        than judged nonrel if a relevance level of 3 was needed to consider
        relevant.)
4/5/06  Changed comments in README, trec_eval.c, trec_eval_help.c files which
        incorrectly claimed queries with no relevant docs are ignored (this
        was true with very old versions of trec_eval).  Now reads that
        queries with no relevance information are ignored.
        Giorgio Di Nunzio and Nicola Ferro,
------------------------------------------------------------------------------
Version 8.0, full bpref bug fix, see file bpref_bug.  I decided to up the
version number since bpref results are incompatible with previous results
(though the changes are small).
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Files:
Main procedure:
    trec_eval.c
---
Procedures to read input files of various formats:
    formats.c           Mapping names of input file formats to input
                        procedures
    get_qrels.c         Read the standard judged documents (qrels format)
    get_qrels_jg.c      Read qrels format with multiple judgment groups per
                        query
    get_prefs.c         Read preferences judgments instead of doc judgments -
                        see formats.c for full description.
    get_qrels_prefs.c   Read qrels_jg format file, interpret as prefs file.
    get_trec_results.c  Read the standard result file (trec_results format).
---
Procedures to merge rel_info and results from input form into form that
measures can easily use, if they wish:
    form_res_rels.c     'qrels' and 'trec_results' into RES_RELS format.
    form_res_rels_jg.c  'qrels_jg' and 'trec_results' into RES_RELS_JG format.
    form_pref_counts.c  ('prefs' or 'qrels_prefs') and 'trec_results' format
---
The actual measures:
    measures.c          Associates measure name with parameters and init,
                        calculation, accumulation, printing procedures
    meas_*.c            Common procedures used by many measures for init,
                        acc, printing.
    m_<measure_name>.c  measure specific procedures
---
Miscellaneous:
    Makefile            Compile and test trec_eval
    README              This file
    CHANGELOG           Recent changes
    test                Directory of collection of sample input and output
                        for trec_eval
    trec_eval.h         Basic evaluation structures.
    functions.h         Prototype declarations of measure procedures.
    sysfunc.h
    common.h
    bpref_bug           Description of bug in bpref that existed in trec_eval
                        versions 6 through 7.3.
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Adding a new measure.  Assuming it uses standard input files:
1. In m_<new_measure>.c, write any measure specific procedures needed to
       initialize measure
       calculate measure
       accumulate measure (adding one topic's value to summary value over
           topics)
       calculate the ending average for a measure.
       print single query measure value
       print final measure value (and cleanup if needed)
   Most measures require only a new calculate measure procedure - the other
   procedures are generic and already implemented depending on the type of
   measure (has cutoffs and value for each cutoff, parameters, is a float,
   etc).  See functions.h for these generic procedures.
2. In same file, construct TREC_MEAS te_meas* entry pointing to above
   procedures and any default cutoffs or parameters.
3. Add pointer to that TREC_MEAS entry in "measures.c"
4. Add measure source file to Makefile
------------------------------------------------------------------------------
Adding a new file format
1. Implement reading of new format in get_<new_format>.c, with returned top
   level output of type ALL_REL_INFO or ALL_RESULTS.
   The individual topic returned values will be in a format dependent form
   which will be passed to the appropriate measures.
2. Add format to appropriate format list in formats.c
3. Add measures to take advantage of format (see above)
4. To use, invoke trec_eval with -R or -T values, and -m measures that
   are appropriate.
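
------------------------------------------------------------------------------
Illustration (not part of trec_eval): the procedure roles listed under
"Adding a new measure" (calculate one topic's value, accumulate it into a
summary, compute the ending average) can be sketched in a few lines of Python.
This is only a rough model of the idea, not the C implementation. Every name
here (Measure, average_precision, evaluate) and the in-memory dictionaries
standing in for the qrels and results files are assumptions made for the
example. Topics with no relevance information are skipped, as the change log
entry of 4/5/06 describes.

```python
# Illustrative sketch only: hypothetical names, not trec_eval's C API.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Measure:
    """Bundles a measure name with its per-topic and summary procedures."""
    name: str
    calc: Callable[[List[str], Dict[str, int]], float]  # one topic's value
    total: float = 0.0  # running sum over topics
    topics: int = 0     # number of topics accumulated so far

    def accumulate(self, value: float) -> None:
        """Add one topic's value to the summary value over topics."""
        self.total += value
        self.topics += 1

    def average(self) -> float:
        """Ending average over all accumulated topics."""
        return self.total / self.topics if self.topics else 0.0


def average_precision(ranking: List[str], rels: Dict[str, int]) -> float:
    """Sum of precision at each retrieved relevant doc, over all relevant docs."""
    num_rel = sum(1 for level in rels.values() if level > 0)
    if num_rel == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, docno in enumerate(ranking, start=1):
        if rels.get(docno, 0) > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_rel


def evaluate(results: Dict[str, List[str]],
             qrels: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Run every registered measure over the topics that have judgments."""
    measures = [Measure("map", average_precision)]  # plays the role of a registry
    for measure in measures:
        for topic, ranking in results.items():
            if topic not in qrels:  # no relevance information: ignore topic
                continue
            measure.accumulate(measure.calc(ranking, qrels[topic]))
    return {m.name: m.average() for m in measures}


# Toy data: topic -> {docno: relevance level} and topic -> ranked docnos.
qrels = {"1": {"d1": 1, "d2": 0, "d3": 1}, "2": {"d4": 1}}
results = {"1": ["d1", "d2", "d3"], "2": ["d5", "d4"], "3": ["d9"]}

for name, value in evaluate(results, qrels).items():
    print(f"{name} {value:.4f}")  # prints: map 0.6667
```

In this toy run topic 1 scores (1/1 + 2/3) / 2, topic 2 scores 1/2, and topic 3
is ignored because it has no judgments, so the average is 2/3. Adding another
measure in this model means writing one more calculate function and listing it
next to "map", which loosely mirrors steps 1 to 3 above.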