Thank you for contributing to the U.S. City Open Data Census, a joint community effort undertaken by contributors around the country, just like you.
The following tutorial will help you make your contribution to the U.S. City Open Data Census.
There are two main ways you can contribute to the U.S. City Open Data Census:
You will be asked a series of questions about the dataset and about your knowledge of open data. Your answers help us understand who is submitting and interpret the more subjective questions about data findability and usability. If you would like more detail on what a particular question is asking, click the question-mark sign next to that question to display its help text.
The Census seeks to measure the publication of open government data. It is therefore important to ensure that your submission refers to data that government is responsible for. A submission is only valid if the government is responsible for producing, managing, or publishing the submitted dataset. Usually, data will be found on a government website. In rare cases, the official location is somewhere else (this may be the case for state-owned enterprises or contractors delivering public services for government). You can verify such a source by looking for a disclaimer on the website stating that the data is produced by government, or by calling a government official. Answer “Yes” if the data is collected by government or by a third party officially representing government. Tell us which agency collects or provides the data. If you cannot find evidence that the government collects the data, provide a comment explaining why.
This question measures whether the data is accessible online from the government without mandatory bureaucratic barriers. It is important that no such barriers exist because they can deter people from accessing data. Answer “Yes” if the data is made available by the government on a public website. Answer “No” if the data is NOT available online, or is available online only after registering, requesting the data from a civil servant via email, completing a contact form, or going through some other administrative process.
This question evaluates whether the dataset is stored centrally or across several websites and sub-pages. Where data is published matters because it affects how easily the data can be found. Tell us the different websites, as well as sub-pages on a website (if applicable), where you found dataset characteristics. If you find the same information on several websites and sub-pages, only document those that enable you to answer the following questions with “Yes.”
Tell us whether you can find all required characteristics online, checking the box for each one as you verify that the dataset has it. This is important to make sure that the dataset has all of the information required. If the dataset does not have all required components, you should mark the dataset as missing and explain why with a comment.
If you are unsure whether a part is contained in the dataset or not, you can ask on the Open Knowledge discussion forum.
The data is free if you don’t have to pay for it.
This question measures whether you can download all of the information to your computer in a few easy steps. Downloads can be organized by month or year, or broken down into subfiles for very large data files. Answer “No” if you can only view the data (but not download it), if you have to go through many manual steps to download the data, or if you can only retrieve a small part of a large dataset at a time (for instance through a search interface).
Data is often only useful if it is provided in a timely manner, but different kinds of information need to be updated at different intervals. For example, data on 311 requests should ideally be updated daily, while annual budget data is only published once a year. This question measures whether data is provided in a timely manner. Please base your answer on the date on which you answer this question. The dataset may have metadata that says how often it is updated; also check when it was most recently updated. Be careful with publication dates and check whether the published data is actually up to date. Answer “No” if you cannot determine a date or if the data is outdated, and use the comment function to explain why the data is outdated.
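The timeliness check can be pictured as a small rule. The following is only a rough sketch, not the Census's own method: the function name `is_timely` and the idea of comparing the data's age against a single expected update interval are assumptions made for illustration, and the Census does not publish exact thresholds in this tutorial.

```python
from datetime import date, timedelta
from typing import Optional


def is_timely(last_updated: Optional[date],
              expected_interval: timedelta,
              answered_on: date) -> bool:
    """Return True if the data looks up to date on the day you answer.

    Hypothetical rule: the data counts as timely when the time since the
    last update does not exceed the expected update interval.
    """
    if last_updated is None:
        # No date could be determined, so the answer is "No".
        return False
    return answered_on - last_updated <= expected_interval


# 311 requests are expected daily; annual budgets roughly once a year.
daily = timedelta(days=1)
yearly = timedelta(days=365)

print(is_timely(date(2024, 1, 14), daily, date(2024, 1, 15)))
print(is_timely(date(2022, 1, 1), yearly, date(2024, 1, 15)))
```

In practice you would still read the dataset's metadata and judge for yourself; a rule like this only makes the comparison explicit.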
Tell us the file formats of the data. We automatically compare them against a list of file formats that are considered machine-readable and open. A file format is called machine-readable if your computer can process, access, and modify the dataset characteristics (see B3) that you find in a data file. For example, a JPG image of a map embedded on a website is not considered machine-readable. The Census considers formats to be “open” if they can be fully processed with at least one free and open-source software tool. These formats allow more people to use the data, because people do not need to buy specific software to open it. Important: The Census uses a less rigid definition of open formats. It may consider some file formats to be open even if their source code is not. What counts is that the file format can be fully processed with an open-source software tool.
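To give a feel for how reported formats might be compared against such a list, here is a minimal sketch. The two sets below are illustrative assumptions, not the Census's actual reference list, and `classify_formats` is a hypothetical helper name.

```python
# Illustrative lists only; the Census maintains its own reference list.
MACHINE_READABLE = {"csv", "json", "xml", "geojson"}
OPEN = {"csv", "json", "xml", "geojson"}


def classify_formats(formats):
    """Sort reported file formats into the categories described above."""
    # Normalize entries such as ".CSV" or " csv " to "csv".
    normalized = {f.strip().lower().lstrip(".") for f in formats}
    return {
        "machine_readable": sorted(normalized & MACHINE_READABLE),
        "open": sorted(normalized & OPEN),
        "unrecognized": sorted(normalized - MACHINE_READABLE - OPEN),
    }


# A CSV download plus a JPG image of a map.
print(classify_formats(["CSV", ".jpg"]))
```

A format that lands in "unrecognized" here is simply absent from the illustrative lists; whether the Census accepts it depends on its own list.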
Data may be in a machine-readable format like an XLS spreadsheet, but it might still contain unstructured information (like notes randomly written in a column). Such data often has to be cleaned before it becomes usable. Tell us how much effort it takes for you to use the data. Document any relevant feedback on usability in the comment section.
The Census allows only one submission per dataset. However, you can still help by commenting on a current submission and by proposing changes in a new topic on the Open Knowledge discussion forum. Leaving detailed notes in the comment field supports the review process, too.