Guide to NSF Data Management Plans
This guide outlines a writing strategy for creating an NSF Data Management Plan based on the NSF Grant Proposal Guide (Chapter II.C.2.j). Before you start writing, please carefully review the Grant Proposal Guide and the Frequently Asked Questions provided by the NSF for official instructions. Be sure to consult the individual Directorates and Divisions for additional guidelines. This guide assumes that you have read and understood the NSF instructions and have determined which requirements apply to your proposal.
Identify all relevant requirements
An NSF solicitation may have data management requirements in three places. Comply with the requirements in this order:
- Solicitation
- Directorate or Division
- NSF Grant Proposal Guide
Your plan should describe what data or research products you will produce and how you will format, document, store, and preserve them. Your plan should also describe what data you will share with other researchers and how you will distribute those data. The NSF expects that you will, as much as possible, share data and other products with other researchers. Your plan should address data management and sharing during your research and after your research is complete.
Note: If ethical, legal, contractual, or technical conditions prevent you from sharing or distributing data, you still have to submit a data management plan with your NSF proposal. In this situation, your plan should explain why you cannot share data.
In most cases, your data management strategy will reflect the unique needs of your project and the prevailing norms and practices in your field. You should make reference to those norms and practices whenever appropriate.
If your work will not produce data
You have to submit a plan with every proposal, but it is acceptable to state that your project will not produce data. Here's a good example.
However, if you are working on a capacity-building project or educational program, explain how you will manage the products of that work. For example, if you record videos of a workshop, how will you manage and share them?
Identify a public-access repository, archive, or database before you start writing
Ideally, you should use a public-access repository, archive, or database to preserve and share data. If possible, identify a potential repository and review its submission requirements before you start drafting a plan. The submission requirements will shape your data management strategy and provide material for your plan. In some cases, you may have to use different repositories for different types of data. We can help you locate potential archives, data centers, or databases.
Note: You may be able to archive final data and other research products at the Digital Repository at the University of Maryland (DRUM). DRUM is managed and maintained by the University Libraries. Please contact us if you're interested in this option (email: firstname.lastname@example.org).
If you cannot find an appropriate data repository or archive, contact the NSF Program Officer associated with your solicitation for direction.
Dealing with multiple investigators, institutions, and sponsors
If your research involves multiple investigators or teams, either domestic or international, your plan should describe how you will harmonize and synchronize data management and post-project data sharing. At a minimum, indicate who is responsible for data management and sharing.
If your research involves multiple funding sources or partnerships, your plan should describe how you will accommodate and balance the data management expectations of the different sponsors or partners.
The amount of detail in your plan will depend on the characteristics of your research project. While your plan can be short, this guide will prompt you to provide more detail in a few important places.
It can be helpful to identify any relevant rules, standards, or codes of practice that will affect how you manage and share data. This demonstrates to the reviewers that your data management practices are consistent with the standards in your field.
In certain situations, it is appropriate to state that a particular issue will not affect your plan. For example, if there are no restrictions, limitations, or conditions on data sharing, it can be useful to state that fact (e.g. “This project will not collect or produce confidential or sensitive data.”).
These examples include real and mock plans. If you would like to contribute a plan, please email us: email@example.com
Earth and environmental sciences, ecology, and biology
Engineering and mathematics
NSF ENG: Designing systems to meet availability targets (No data collected or generated)
Social and behavioral sciences
This guide covers topics that frequently appear in Data Management Plans, but each NSF Directorate has its own guidelines, and, in some cases, Divisions have their own guidelines. Consult the individual Directorates and Divisions for official instructions. We provide information and advice about the following topics:
- Roles and responsibilities
- Types of data, and what data will be kept
- Format and documentation
- Access and sharing
- Re-use and re-distribution
- Long-term archiving and preservation, and budgeting for these activities
To help you complete your plan, we break down each topic into a series of basic questions. Your answers will provide the content for your plan. Not all questions will be relevant to your research.
For this topic, identify the individuals who will collect, organize, process, analyze, and share data. Outline their basic responsibilities.
Who will manage data during your project?
If various collaborators and students will be managing data, how will you monitor their work?
If you will be collaborating with researchers at other institutions, how will you harmonize and synchronize data management?
If a PI or co-PI leaves the institution, how will you ensure that data and documentation are not lost? How will you transfer responsibility to another member of the team?
For this topic, describe “the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project” (Grant Proposal Guide, Chapter II.C.2.j). The NSF has a broad definition of data, including “data, publications, samples, physical collections, software and models” (NSF FAQ #1).
Use this section to outline how you will store and manage your data during the project. You should also identify what data you will retain and share after the project is complete.
What types of data will you produce, how much, and for how long? (This is about the volume and variety of data.)
What are the data sources? Are you collecting data yourself or using publicly available data from open-access repositories or data centers?
What instruments or software are involved?
What is your plan for data storage, security, and backup during your project?
- If you prepared an IRB application for your project, you may be able to adapt this information from your IRB.
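If your backup strategy involves copying files to a second location, it can help to say in your plan how you will confirm that the copies are intact. The following is a minimal sketch of one such check, comparing SHA-256 checksums between a working folder and its backup. The function names and folder layout are hypothetical, and nothing in the NSF instructions requires this particular method.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir: Path, backup_dir: Path) -> list[str]:
    """List relative paths of source files that are missing or differ in the backup.

    Hypothetical helper: assumes the backup mirrors the source folder structure.
    """
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file() or sha256_of(source_file) != sha256_of(backup_file):
            problems.append(relative.as_posix())
    return problems
```

An empty list means every file in the working folder has an identical copy in the backup. A check like this could be run on a schedule that you describe in your plan.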
Of all the data you will collect or produce, what data will you retain after the project is finished, and why?
- Consider what data are necessary for replication and what data may stimulate new research in your field and beyond. See our criteria for retaining and sharing data for additional considerations.
- If possible, reinforce your decision to retain data with reference to potential user communities. Who could use your data?
- If you choose not to retain certain data, explain why.
For this topic, describe “the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)” (Grant Proposal Guide, Chapter II.C.2.j).
Data formats are usually the file formats that you use to save, transfer, and share data.
Metadata refers to all the information about your data that another researcher would need to understand and use your data for replication, reference, or new research. This is basically documentation and annotation. Depending on how you document your research, you may already collect this information in a codebook, data dictionary, readme file, or lab notebook. Typically, the metadata will include general information about your project, data collection methods, data processing, meaning of any codes or abbreviations, terms and conditions of use, software required, inventory of data files, and so on.
Note: You do not have to include any metadata in your plan, only a description of your method of documentation.
Tip: If you identify a potential public-access repository or archive before you start writing your plan, the data managers can often direct you to specific data formats and metadata standards.
What data structures and file formats will you use to capture and store data during your project?
What data structures and file formats will you use to share data after your project is finished?
- Many commercial software formats and instrument formats are not suitable for public access and long-term preservation because they can only be opened and manipulated by the software or instrument that created them. See our format recommendations for open alternatives.
How will you record documentation for your data? Will you use a standard form of metadata?
- In some fields, metadata is highly standardized and requires specific information. If this applies to you, identify the standard.
- If there is no commonly used metadata standard appropriate to your situation, state that fact and describe how you will document your data.
Where will you store and back up your documentation?
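Where no metadata standard applies, a plain-text readme file is one common way to record the kinds of information listed above (project description, methods, meaning of codes, inventory of data files). The sketch below shows how such a readme might be assembled. The function name, field names, and layout are assumptions made for illustration, not a format prescribed by the NSF or by any repository.

```python
from pathlib import Path


def build_readme(title: str, methods: str, codes: dict[str, str], data_dir: Path) -> str:
    """Assemble a plain-text readme with project details and a file inventory.

    Hypothetical layout: adapt the sections to the norms of your field.
    """
    lines = [
        f"Project: {title}",
        "",
        "Data collection methods:",
        f"  {methods}",
        "",
        "Codes and abbreviations:",
    ]
    for code, meaning in sorted(codes.items()):
        lines.append(f"  {code}: {meaning}")
    lines += ["", "Inventory of data files:"]
    # Record each file's path relative to the data folder, plus its size.
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            relative = path.relative_to(data_dir).as_posix()
            lines.append(f"  {relative} ({path.stat().st_size} bytes)")
    return "\n".join(lines) + "\n"
```

The returned text can be saved next to the data (for example as `README.txt`) and stored and backed up together with it, so that the documentation travels with the files it describes.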
For this topic, describe “policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements” (Grant Proposal Guide, Chapter II.C.2.j).
What data will you share with other researchers?
Who will have access to your data? How will they find and obtain your data after your project is complete?
- The audiences to consider in this situation are usually other researchers (in your field and beyond) and the general public. In some cases, depending on the nature of your project, you can share data with both groups without restriction. In other cases, you may be able to share data with other researchers but not the general public. Explain any such conditions.
- If you have identified a public-access repository or archive for your data, you will have to comply with their policies and requirements for access and sharing. Note any restrictions on access.
- You may be able to share final data and other research products through the Digital Repository at the University of Maryland (DRUM). Please contact us if you're interested in this option (email: firstname.lastname@example.org).
- You may be able to use your personal website, or your team's website, to share data. However, there is always a risk that the data files will be moved or deleted at some point. If you choose this option, we encourage you to contact us. We may be able to archive a permanent copy of your data in the Libraries.
- If you use data from a public-access repository, you may be able to refer people to the original data rather than distribute it yourself. In this case, you should provide links to these data in any documentation and publications.
- Depending on the nature of your data or the availability of public-access repositories, you may have to stipulate that your data “will be available on request.” However, you should avoid this method if possible. The NSF is increasingly dissatisfied with this method, viewing it as a barrier to efficient public access. If you are compelled to take this approach, contact the Program Officer in advance for guidance.
- If you cannot find an appropriate data repository or archive, contact the Program Officer for direction.
How soon will other researchers or the public have access to your data?
- The NSF expects that “all data will be made available after a reasonable length of time” (NSF FAQ #9). Individual Directorates and Divisions may have specific requirements—check your Directorate and Division guidelines.
- If there is no explicit length of time in the NSF instructions, answer this item with reference to the customary practices in your field. Making your data available when you publish your initial findings is typical, but norms vary by field. Delays that exceed customary practices will require more substantial justification.
If you produce confidential or sensitive data, how will the measures you take to protect subjects affect public access?
If you are working under the terms of an IRB, how will they affect public access?
Are there any additional federal, institutional, professional, or sponsor regulations that will affect public access?
Will you have any special security provisions or data use agreements?
Are there any intellectual property issues, such as ownership, copyright, or potential commercialization, that will affect public access?
- For research products generated under an NSF award, all intellectual property developed by researchers and students and all intellectual property rights therein shall belong to the University unless an exception or waiver is granted. In many cases, this will not prevent you from sharing data and other materials with researchers or the general public, but conditions apply when your activities involve materials transfer, inventions, patents, royalties from inventions, third-party contracts, and other special circumstances. Contact the Office of Research Administration for guidance on your situation.
For this topic, describe “policies and provisions for re-use, re-distribution, and the production of derivatives” (Grant Proposal Guide, Chapter II.C.2.j). This section is chiefly about terms and conditions of use.
Are there any intellectual property issues that will affect re-use and re-distribution?
- If your project uses data, software, or materials that belong to another individual, group, or institution, you must comply with any terms, conditions, permissions, licenses, or agreements specified by the data owner. Note any conditions that affect re-use.
- If you have identified a repository or archive for your data, you will have to comply with their policies and requirements for re-use and re-distribution. Note any conditions that affect re-use.
Will you make your data available with specific terms and conditions, licenses, or disclaimers?
- If there are no legal or contractual issues that will affect re-use or re-distribution, the simplest option is to refrain from adding any terms or conditions. However, you may wish to insist upon attribution, citation, or another form of credit whenever someone uses your data.
For this topic, describe “plans for archiving data, samples, and other research products, and for preservation of access to them” (Grant Proposal Guide, Chapter II.C.2.j). If you plan to deposit your data at a repository, archive, or database, your response for this section may overlap with information in section 3.3. Use this section to address the long-term disposition of your data.
Will you submit your data to a repository or archive for long-term archiving and preservation? Which one?
- In some cases, you may have to use different repositories for different data types.
- You may be able to archive final data and other research products at the Digital Repository at the University of Maryland (DRUM). Please contact us if you're interested in this option (email: email@example.com).
- If you cannot find an appropriate data repository, contact the Program Officer for direction.
For how long will you (or a repository) preserve your data?
- This depends on a wide variety of factors. The NSF instructs you to retain data for at least three years, but there may be additional instructions in the solicitation or the requirements mandated by the Directorate or Division.
- UMD’s retention policy for most research records is a minimum of seven years after the completion of research. Different terms apply to investigational new drugs and investigational devices (UMD Records Schedule, Item 84).
- If you conduct research under HIPAA regulations, you should plan to retain data for a minimum of six years.
- Data related to patents should be retained for the life of the patent.
- In addition, consider the potential value of your data in temporal terms: will the value increase, decrease, or remain constant over time? Social and environmental observations that cannot be recreated may increase in value.
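When several of the retention requirements above apply to the same data, the longest one governs. The sketch below works through that arithmetic using the minimums quoted in this guide. The function and parameter names are hypothetical, and the calculation is a simplification: it treats every period as a number of years counted from the same starting date, which may not hold for your project, so check each policy for its actual start date.

```python
from typing import Optional

# Minimum retention periods in years, as quoted in this guide.
NSF_MINIMUM_YEARS = 3
UMD_MINIMUM_YEARS = 7
HIPAA_MINIMUM_YEARS = 6


def minimum_retention_years(
    umd_research_record: bool = True,
    hipaa: bool = False,
    patent_years_remaining: Optional[int] = None,
    solicitation_years: Optional[int] = None,
) -> int:
    """Return the longest minimum retention period among the rules that apply.

    Hypothetical helper: patent life and solicitation terms vary, so they
    are passed in by the caller rather than assumed.
    """
    candidates = [NSF_MINIMUM_YEARS]
    if umd_research_record:
        candidates.append(UMD_MINIMUM_YEARS)
    if hipaa:
        candidates.append(HIPAA_MINIMUM_YEARS)
    if patent_years_remaining is not None:
        candidates.append(patent_years_remaining)
    if solicitation_years is not None:
        candidates.append(solicitation_years)
    return max(candidates)
```

For example, `minimum_retention_years(hipaa=True)` gives 7, because the UMD minimum is longer than both the NSF and HIPAA minimums. This is only a floor: data whose value grows over time may deserve a longer commitment.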
If there are costs associated with long-term archiving and preservation, such as deposit fees at a repository, how will you cover them?
- You can request funds in your proposal budget (Line G2).
This work is licensed under a Creative Commons Attribution 3.0 Unported License