Data projects at Vox Media
What makes for a good data project?
Good data projects extend the reporting process, using quantitative information to inform and educate. They should help flesh out a story and reveal, through analysis, information that would not otherwise be apparent.
Guidelines
- Include guidelines for processing the data, or at the very least document the changes made to the source and include that documentation alongside the output.
- Context should always accompany data and aid in the understanding of a dataset.
- Add information about how to contribute to the project. Also include contribution information for one-off projects that likely will not be updated. Example:
This project is shared as-is. Bugs, issues, and pull requests may
not be readily addressed.
- List the authors and/or contact information.
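The first guideline above, documenting changes made to the source, can be sketched as a small script that records each processing step as it runs. This is only an illustration, not an established Vox Media workflow: the column names, the sample rows, and the `clean_rows` and `format_changelog` helpers are all hypothetical.

```python
import csv
import io


def clean_rows(rows, changes):
    """Return cleaned copies of rows, appending one note to `changes` per step."""
    cleaned = []
    dropped = 0
    for row in rows:
        # Skip rows with no value in the (hypothetical) "population" column.
        if not row["population"].strip():
            dropped += 1
            continue
        cleaned.append({
            "state": row["state"].strip(),
            # Strip thousands separators so the value parses as an integer.
            "population": int(row["population"].replace(",", "")),
        })
    changes.append(f"Dropped {dropped} row(s) with a blank population value.")
    changes.append("Removed thousands separators from population and converted it to an integer.")
    return cleaned


def format_changelog(source, changes):
    """Format the recorded changes as plain text to ship alongside the output."""
    lines = [f"Source: {source}", "Changes made to the source data:"]
    lines += [f"- {note}" for note in changes]
    return "\n".join(lines) + "\n"


# Made-up sample rows standing in for a real source file.
RAW = 'state,population\nAlpha,"1,200"\nBeta,\nGamma,"3,450"\n'

changes = []
rows = clean_rows(csv.DictReader(io.StringIO(RAW)), changes)
print(format_changelog("example.csv (hypothetical)", changes))
```

Saving the formatted text in a file next to the processed output keeps the record of changes together with the data it describes.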
Copyright and licensing
- Provide the origin of the dataset.
- Make sure we have permission to include the dataset in the repo.
- Be clear on what exactly we are claiming as copyright, if relevant. For example, if we're using public domain government data, then what we claim as copyright is any script that transforms the dataset into something usable for a project. The data itself is still in the public domain.
- This is the open source license that goes out with other Vox Media projects:
Copyright (c) 2015, Vox Media, Inc. All rights reserved.
BSD license
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list
of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or other
materials provided with the distribution.
Neither the name of the copyright holder nor the names of its contributors may be
used to endorse or promote products derived from this software without specific
prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
Contact
We're still discussing loose guidelines for how we should use the repo. Do you have any ideas for how to make data more transparent and accessible? Get in touch with us at [email protected] or drop us a note in the issues.