Introduction#

A comparison between research projects conducted two decades ago and those of today reveals a marked increase in the demands placed upon early career researchers (ECRs; Weissgerber, 2021). This can be attributed, in part, to the need for larger sample sizes (Fan et al., 2014; Marx, 2013; Zook et al., 2017), the adoption of novel practices such as pre-registration and new dissemination channels (Ross-Hellauer et al., 2020; Tripathy et al., 2017), the growing use of advanced computational and statistical techniques such as machine learning (Bolt et al., 2021; Bzdok et al., 2017), and the implementation of cutting-edge technologies such as virtual reality (Matthews, 2018). All of these factors increase the time required to successfully undertake such research endeavors (Powell, 2016). Accordingly, the motivation and eagerness many ECRs feel during the first year of their work is increasingly accompanied by feelings of being overwhelmed (Kismihók et al., 2022; Levecque et al., 2017), as many project choices have to be made and a variety of skills need to be learned that determine the long-term success of one’s first research project.

At this stage, most ECRs lack the necessary expertise and experience to make these important decisions. Moreover, many common terms need to be understood and learned. Learning this ‘language of science’ can be difficult for ECRs (Parsons et al., 2022; see also Table 1). In addition, institutions and supervisors often provide researchers with a relatively fixed array of available resources which are conventionally used, such as software subscriptions or in-house software. These tools are often expensive and bound to the institution itself (i.e., may become unavailable when the researcher changes institutions or works from home). On top of that, limited (subscription-based) resources might not only impede, but also prevent good scientific practice. Accordingly, many open access tools have been proposed to facilitate life as an ECR. However, these resources are often spread out and hard to find or to compare with each other in terms of reliability, validity, usability, and practicality. Moreover, these resources can be difficult to learn, in particular if there is limited support by supervisors. Taken together, these difficulties are time-consuming and create a (potentially error-prone) resource-labyrinth, further exacerbating the uncertainty of how, and with which tools, high-quality science can be achieved.

The Resource#

Therefore, a comprehensive overview of curated resources that cover all parts of a research project is warranted. To address this issue, we created ARIADNE – a living and interactive resource navigator that helps to use and search a dynamically updated database of resources (see also Figure 1 and exemplary resources marked with ➜ in the subsequent text). We named our tool ARIADNE because we aim to help ECRs navigate the ‘labyrinth’ of research tools and resources, much like the mythological Ariadne helped Theseus navigate the labyrinth (see, e.g., here). Our tool spans the whole research cycle, helps ECRs identify and find relevant resources, and is available as a dynamic interface for easier use and searchability. The open-access database covers a constantly growing list of resources that are useful for each step of a research project, ranging from planning and designing a study, through collecting and analyzing the data, to writing up and disseminating the findings. In doing so, we put an emphasis on open and reproducible science practices, as these practices are becoming more and more valued and even mandatory (Kent et al., 2022).

Our step-by-step graphical interface to ARIADNE, including detailed usage instructions and the corresponding source code, is freely available online. The tool is divided into two sections: a web-hosted Jupyter Book detailing the aforementioned steps and a Cytoscape network showcasing the steps dynamically and interactively. We selected this framework due to the robust capabilities offered by GitHub for project management, development, and deployment within a single platform. By combining Jupyter Books with GitHub’s integrated web hosting, our online book is freely accessible. Additionally, GitHub Pages can host tools generated via a JavaScript backend. We have exploited this feature to develop our network tool using a Cytoscape.js (Franz et al., 2016) backend. Furthermore, we have secured the backend by locally hosting the Cytoscape.js instance within our project’s repository. This approach safeguards the project from potential disruptions caused by significant changes in the Cytoscape.js backend, as we maintain our own copy of the Cytoscape.js instance.
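To illustrate the general approach, the following is a minimal configuration sketch of how a Cytoscape.js network is typically initialized in the browser. The node ids, labels, container id, and layout shown here are purely illustrative assumptions, not taken from ARIADNE’s actual data or source code:

```javascript
// Minimal Cytoscape.js initialization sketch (illustrative names only).
// Assumes a <div id="cy"> exists in the page and cytoscape.js is loaded.
const cy = cytoscape({
  container: document.getElementById('cy'),
  elements: [
    // nodes representing (hypothetical) research-cycle steps
    { data: { id: 'design', label: 'Study design' } },
    { data: { id: 'pilot',  label: 'Piloting' } },
    // an edge connecting the two steps
    { data: { id: 'e1', source: 'design', target: 'pilot' } }
  ],
  style: [
    { selector: 'node', style: { label: 'data(label)' } },
    { selector: 'edge', style: { 'curve-style': 'bezier' } }
  ],
  layout: { name: 'breadthfirst' }
});
```

Hosting such a script, together with a pinned local copy of the Cytoscape.js library, inside the repository is what allows GitHub Pages to serve the interactive network without any external dependency.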

Moreover, through issues and pull requests, users of our tool can contribute future updates, ideas, new resources, and bug fixes. Therefore, we confidently characterize our tool as a community-driven, open-source tool. In the fast-paced realm of research, where new tools are frequently introduced, the imperative of being “community-driven” is effectively met through the myriad opportunities afforded by GitHub.

ARIADNE is supported by the Leibniz Institute for Psychology (ZPID) and the Interest Group for Open and Reproducible Science (IGOR) of the section Biological Psychology and Neuropsychology, which forms part of the German Psychological Society (DGPs). This community-driven resource is thus embedded in organizational structures of psychology in Germany, which will aid with long-term storage and scaling plans going forward.

In the following, we divide the research cycle into 10 steps that determine the quality and the success of research projects. We describe the challenges and choices to be made in each step and provide curated resources from ARIADNE for each of them: 1) project start, 2) study design, 3) study implementation, 4) piloting, 5) data collection, 6) data validation, 7) data analysis, 8) writing, 9) publication, and 10) dissemination. We also introduce key terms relevant in each step, ultimately aiming to facilitate training and communication between experts and people starting out in the world of academia (see Table 1 and italicized words in the main text; for open science-related terms see Parsons et al., 2022). Lastly, we provide a checklist with questions one might ask at each step of a research project in Table 2.

Figure 1. Instructions for using the tool, illustrated with exemplary resources.

Table 1. Mini glossary of science-related terminology, sorted alphabetically.

Term

Definition

References

Corresponding author

The corresponding author is typically the researcher who takes primary responsibility for communication regarding the manuscript, during both pre-publication and post-publication phases. This usually includes communication with the publisher, the readers, and handling requests for data-sharing. Note that different journals may have different requirements for corresponding authors.

Pain (2021)

Cover letter

A letter to the editor of a scientific journal that is submitted together with a manuscript. It outlines the importance of the study and summarizes key findings and contributions to the field. Some journals explicitly require such a letter, while others actively discourage it.

Palminteri (2023)

CRediT Statement

A taxonomy of 14 roles that can be assumed when being part of a research project. The statement can be included at the end of a manuscript to transparently report which author assumed which roles.

Brand et al. (2015); Tay (2021)

Data wrangling / munging

The process of transforming and mapping data from one “raw” data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics.

Endel & Piringer (2015); Kandel et al. (2011)

Early career researcher

An individual who is early in their academic career, typically ranging from graduate or PhD student to postdoc level, sometimes including young principal investigators such as junior professors.

Bazeley (2003); Laudel & Gläser (2008)

First author

The first author is the person listed first in an author list of a manuscript. In many fields, it is the person who has done most of the hands-on work and who has taken on a pivotal role in the research project. Shared co-first authorship is possible when two (or more) authors provided equal first-author-level contributions.

Riesenberg (1990)

Garden of Forking Paths / Researcher degrees of freedom

Metaphor for the many (analytic) decisions that researchers can make, leading to many possible outcomes. The multitude of possible decisions can give rise to questionable research practices such as p-hacking or hypothesizing after the results are known (HARKing).

Gelman & Loken (2013); Botvinik-Nezer et al. (2020)

h-factor

A controversially discussed metric proposed by Hirsch (2005) to assess a researcher’s (or journal’s) scientific output by combining publication quantity and impact (i.e., citations). h is defined as the highest number of papers of an individual (or journal) with at least h citations (e.g., h = 3 means having 3 papers with at least 3 citations each).

Hirsch (2005)

Impact factor

A metric used to evaluate the relative importance of a scholarly journal in a particular field by measuring the average number of citations received per article published in that journal over a specific period of time. It is calculated by dividing the number of citations a journal receives in a given year by the total number of articles it published in the preceding two years. Although controversially discussed, it is commonly used to assess the quality and significance of research and has become an influential factor in the academic publishing industry.

Sharma et al. (2014)

Ivory tower

A metaphor for academia, portraying scientists as a group of closed-off individuals living in a tower and discussing scientific progress only amongst themselves, limiting the outreach of scientific results.

Bond & Paterson (2005); Lewis (2018)

Lab book

A scientific record-keeping tool, also known as a laboratory notebook, used by researchers, scientists, and students to document their research projects, experiments, observations, data, and findings.

Schnell (2015); Guerrero et al. (2019)

Open-access

When scholarly content (such as software, data, materials, or output) is published in a way that is freely available to everybody.

Evans & Reimer (2009)

Paywall

A digital barrier implemented by academic publishers restricting access to scholarly content (e.g., articles) to researchers or institutions that have paid for a subscription (or a one-time access). These costs are intended to cover processes associated with editing, peer-reviewing, and formatting; however, paradoxically, they limit dissemination and potentially hinder scientific progress. Hence, some researchers advocate for open access publishing models to promote equity in knowledge distribution.

Barbour et al. (2006); Day et al. (2020)

Peer review

The act of giving feedback on a manuscript under consideration at a scientific journal. Typically, a minimum of two reviewers that are experts in the field are invited to comment on a manuscript. Subsequently, editors make a decision whether to accept or reject the submission and authors can be asked to revise their work based on reviewers’ comments.

Jana (2019)

Pilot study

A pilot study is a small-scale preliminary investigation that is conducted before a larger research project or study to test the feasibility of the research design, methods, and instruments. The primary purpose of a pilot study is to identify potential problems and areas for improvement in the research protocol, which can be rectified before conducting the actual study.

Arain et al. (2010); In (2017); Thabane et al. (2010)

Postprint

The accepted or published version of a manuscript in a scientific journal. Postprints can often be shared on public repositories to make them accessible to everyone and bypass the paywall. Note that journal-specific policies (e.g., embargo periods) need to be considered.

Harnad (2003)

Power analysis

A statistical method used in research to determine the sample size needed for a study to achieve a desired level of statistical power. Statistical power refers to the ability of a study to detect a significant effect (or difference) between groups or conditions when a true effect (or difference) exists. Crucially, if a study is underpowered (i.e., the sample size is too small), researchers may not be able to detect significant effects even if they are present. Conversely, if a study is overpowered (i.e., the sample size is too large), resources may be wasted and the study may be unnecessarily expensive or time-consuming.

Kemal (2020)

Preprint

A version of a manuscript that has not yet been peer-reviewed and published in a scientific journal, but is uploaded to an open-access online repository, usually at the time of submission to a journal. Since preprints have not undergone the established scientific quality-control process (i.e., peer review), they usually include a brief note that the reported findings should be examined with caution by practitioners, journalists, and policymakers.

Hoy (2020); Wingen et al. (2022)

Rebuttal

A written response to a criticism made against a research manuscript or proposal. It aims to refute or dispute opposing arguments by presenting counter-evidence or alternative interpretations or theories. Thus, rebuttals are an important aspect of peer review processes, which allows for the improvement of scientific work through constructive feedback or critical discourse.

Palminteri (2023)

Registered Report

A type of scholarly article format that involves a two-stage peer review process. In this format, authors submit a detailed research proposal or protocol to a journal, which is then peer-reviewed before any data is collected. If the proposal is deemed to be methodologically sound and potentially impactful, the journal agrees in advance to publish the results of the study, regardless of the outcome.

Henderson & Chambers (2022)

Revise and Resubmit

An outcome resulting from the submission of a manuscript to a scientific journal. The manuscript is rejected in its current form, but the authors are invited to revise and resubmit their work after incorporating feedback from reviewers.

Kornfield (2019)

Scooped

A slang term used when one’s research idea, study, or result is claimed by other researchers, e.g., by publishing it first.

Laine (2017)

Senior author

The senior author is the lead person (e.g., classically the principal investigator; PI), primarily associated with funding, supervision, or major responsibility for the project. Shared co-senior authorship is possible when two (or more) authors provided equal senior-author-level contributions.

Pain (2021)

Standard operating procedure (SOP)

Documents or materials describing study procedures or processes for the purpose of establishing and managing data quality and reproducibility.

Manghani (2011)

Type I error rate

Type I or alpha error rate in statistics refers to the probability of rejecting a null hypothesis when it is actually true. In other words, it is the likelihood of obtaining a statistically significant result by chance alone, without any true underlying effect.

Banerjee et al. (2009)

Type II error rate

Type II or beta error rate in statistics refers to the probability of failing to reject the null hypothesis when the alternative hypothesis is actually true. Beta is used in power analyses (power = 1 − beta).

Hartgerink et al. (2017)
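The bibliometric definitions in Table 1 (h-factor and impact factor) are simple enough to compute directly. The following is a minimal Python sketch; the citation counts and article numbers are made-up illustrative values, not real data:

```python
def h_index(citations):
    """Highest h such that at least h papers have >= h citations each
    (Hirsch, 2005)."""
    h = 0
    # Rank papers from most to least cited; h grows while the paper at
    # rank r still has at least r citations.
    for rank, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def impact_factor(citations_this_year, articles_prev_two_years):
    """Citations in a given year to items from the two preceding years,
    divided by the number of articles published in those two years."""
    return citations_this_year / articles_prev_two_years

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers with at least 4 citations
print(impact_factor(250, 100))    # 2.5
```

As the table notes, both metrics are controversially discussed; the code merely makes their arithmetic definitions concrete.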
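The power-analysis entry in Table 1 can likewise be made concrete. The sketch below uses the common normal-approximation formula for a two-sided, two-sample comparison of means, with only the Python standard library; note that exact t-distribution-based tools (e.g., G*Power) return slightly larger sample sizes, so treat this as an approximation:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided two-sample mean comparison,
    via the normal approximation: n = 2 * ((z_{1-alpha/2} + z_power) / d)^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_power = z.inv_cdf(power)          # ~0.84 for power = .80
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))  # 63 per group for a medium effect (Cohen's d = 0.5)
```

The formula also shows the trade-offs described in the glossary: smaller effect sizes, stricter alpha levels, or higher desired power all inflate the required sample size.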

Table 2. Checklist of relevant questions for each step of the research process.

Step

Questions

1) Project start

What is the gap in the literature and my resulting research question?

Is funding available to conduct the project?

What are the time plan and work packages of the project?

Who is responsible for what in the project?

2) Study design

What are the hypotheses and how can they be tested?

Which independent variables (IVs) are manipulated?

Which dependent variables (DVs) need to be measured?

Is approval by an ethics/institutional review board (IRB) needed?

How large should the sample be?

3) Study implementation

What measures are most fitting (tasks, questionnaires, etc.)?

What stimuli need to be created (e.g., pictures, videos, text)?

Which programming environment should be used?

4) Piloting

Is the study feasible?

Do all manipulations work as intended?

5) Data collection

How can we make sure the data is safely stored, accessible, and backed up?

Is the data collected in a way that protects private and sensitive information (e.g., of participants)?

6) Data validation

How can we ensure the quality and accuracy of the data?

How can we store the data reproducibly?

7) Data analysis

What are specific analysis pipelines and programs that can be used for specific types of data (e.g., EEG, (f)MRI, behavior)?

What open-source software is a good alternative to proprietary products?

Which tools allow complete replicability of an analysis pipeline, independent of the specific operating system of a user or continuous software updates?

How are results visualized in a captivating, yet transparent and maximally inclusive way?

8) Writing the manuscript

What is the scope of the paper?

What is the target audience and journal?

How to write a convincing abstract?

How to properly credit authors?

How to find and cite sources correctly?

Which frameworks allow one to conveniently write a reproducible manuscript?

9) Publication

Where to upload data, code, materials, and/or a preprint?

Is the published data FAIR (“Findable, Accessible, Interoperable, and Reusable”)?

How to write a cover letter?

How to write a rebuttal to reviewer comments?

10) Dissemination

How to design a poster for a conference?

How to prepare a scientific presentation?

How to present your research to a lay audience?