File management and formats
Before starting to gather data, a folder hierarchy and naming conventions should be chosen and documented. Furthermore, file formats must be considered to enable long-term data reuse.
Updated: 29 April, 2019
File management includes folder structures and naming conventions, plus the choice of appropriate formats.
Creating a folder hierarchy
With regard to folder structures, a good way to organise your research data is to create a hierarchy of folders [1]. For example, you could use one folder for each project, with two subfolders:
- “Data”, to store your research data, including images, databases, media, etc.
- “Documentation”, to store all relevant project documents, such as the methodology and consent forms.
The more complex the project, the more detailed your folder structure can be. If you are working alone or in a small team, you might wish to use just a handful of subfolders. However, if you work on a large project you could create a range of subfolders to suit the team’s needs. As an example, under “Data”, you might create a subfolder for each data type, such as “Databases”, “Images”, and “Sounds”. Similarly, under “Documentation” you might create a subfolder for each category of document, such as “Methodology”, “Consent forms”, and “Information sheets”.
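As an illustration, the hierarchy described above could be created with a short script so that every project starts from the same layout. This is a minimal sketch only: the `LAYOUT` dictionary and the `create_project_folders` function are hypothetical names, and the subfolder names simply reuse the examples given in the text.

```python
from pathlib import Path

# Hypothetical layout that mirrors the example folder names in the text.
# Adapt the names to whatever your team has agreed and documented.
LAYOUT = {
    "Data": ["Databases", "Images", "Sounds"],
    "Documentation": ["Methodology", "Consent forms", "Information sheets"],
}


def create_project_folders(root, layout=LAYOUT):
    """Create the folder hierarchy under `root` and return the created paths."""
    created = []
    for top_level, subfolders in layout.items():
        for subfolder in subfolders:
            path = Path(root) / top_level / subfolder
            # parents=True creates "Data" or "Documentation" when missing;
            # exist_ok=True makes the script safe to run more than once.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling `create_project_folders("MyProject")` would create `MyProject/Data/Images`, `MyProject/Documentation/Methodology`, and so on. A small team could shorten `LAYOUT` to just a handful of subfolders.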
Naming your files to support collaboration
When it comes to file naming, we recommend using simple but meaningful names. Best practice [2] includes the following:
- Capital letters should be used to delimit words, in place of spaces or underscores
- File names and paths should avoid unnecessary repetition and redundancy
- Numbers should always include at least two digits (i.e. 01 to 09 instead of 1 to 9)
Naming should follow a convention agreed by you and the other project members. For instance, you may decide that “InterviewTra07JD20180214” means “Interview Transcript 7, written by John Doe, on 14/02/2018”. You will need to describe such conventions as part of your study metadata.
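A convention like this can also be applied and checked programmatically, which helps keep names consistent across a team. The sketch below assumes the structure of the example name (document type, a number of at least two digits, two or three capital initials, and a date written as YYYYMMDD). The function names `build_file_name` and `parse_file_name` are hypothetical, and your own convention may differ.

```python
import re
from datetime import date, datetime

# Assumed pattern: letters for the document type, a number of at least two
# digits, two or three capital initials, then an eight-digit date (YYYYMMDD).
NAME_PATTERN = re.compile(
    r"^(?P<doc_type>[A-Za-z]+?)"
    r"(?P<number>\d{2,})"
    r"(?P<initials>[A-Z]{2,3})"
    r"(?P<date>\d{8})$"
)


def build_file_name(doc_type, number, initials, created):
    """Assemble a name; the number is zero-padded to at least two digits."""
    return f"{doc_type}{number:02d}{initials}{created:%Y%m%d}"


def parse_file_name(name):
    """Split a name into its parts, or raise ValueError if it does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the naming convention")
    return {
        "doc_type": match["doc_type"],
        "number": int(match["number"]),
        "initials": match["initials"],
        "date": datetime.strptime(match["date"], "%Y%m%d").date(),
    }


name = build_file_name("InterviewTra", 7, "JD", date(2018, 2, 14))
print(name)
print(parse_file_name(name))
```

Writing the date as year, month, day means that files sort chronologically when listed by name.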
Choosing the right file format
Finally, choosing appropriate file formats [3] is key to ensuring data is reusable, as some formats become obsolete over time and may make your research inaccessible. You can use whatever software or format is convenient during your research, but when sharing the data you should follow best practice to ensure your work can be reused in the future. The US Library of Congress maintains a recommended formats statement [4], which we invite you to consult. The statement lists file formats in order of preference by output type, including recommended metadata fields.
When saving research data for sharing, you might need to convert your working files. If you do so, always check that the conversion succeeded and introduced no errors (e.g. missing values, wrong characters, lost text formatting, or reduced resolution).
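For tabular data, some of these checks can be partly automated. The following is a rough sketch that assumes both the original and the converted file are available as UTF-8 CSV exports. It compares row, column, and empty-cell counts and looks for the Unicode replacement character (U+FFFD), which often signals a character-encoding problem. The function names are hypothetical, and such a check supplements a manual inspection rather than replacing it.

```python
import csv


def summarise_csv(path, encoding="utf-8"):
    """Collect simple counts that should survive a lossless conversion."""
    with open(path, newline="", encoding=encoding) as handle:
        rows = list(csv.reader(handle))
    cells = [cell for row in rows for cell in row]
    return {
        "rows": len(rows),
        "columns": max((len(row) for row in rows), default=0),
        "empty_cells": sum(1 for cell in cells if cell.strip() == ""),
        # U+FFFD usually appears when text was decoded with the wrong encoding.
        "replacement_chars": sum(cell.count("\ufffd") for cell in cells),
    }


def compare_conversion(original_path, converted_path):
    """Return a list of detected problems; an empty list means none were found."""
    before = summarise_csv(original_path)
    after = summarise_csv(converted_path)
    problems = []
    for key in ("rows", "columns", "empty_cells"):
        if before[key] != after[key]:
            problems.append(f"{key}: {before[key]} before, {after[key]} after")
    if after["replacement_chars"] > 0:
        problems.append(
            f"{after['replacement_chars']} replacement character(s) in converted file"
        )
    return problems
```

An empty result only shows that these particular counts match. Problems such as altered text formatting or reduced image resolution still need to be checked by opening the converted file.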
Further reading
Footnotes
- [1] UKDS - Organising data https://www.ukdataservice.ac.uk/manage-data/format/organising
- [2] Naming conventions https://www.ed.ac.uk/records-management/guidance/records/practical-guidance/naming-conventions
- [3] File Formats http://www.ands.org.au/__data/assets/pdf_file/0003/731775/File-Formats.pdf
- [4] Recommended Formats Statement https://www.loc.gov/preservation/resources/rfs/