This is a guide to the AIDOS Lab, a research lab interested in geometry, topology, and machine learning. The guide contains information for prospective graduate students, as well as some general advice that might be helpful for those navigating academic systems. Feedback is always welcome—please reach out to bastian@rieck.me.

A Guide to the AIDOS Lab

Dr. Bastian Rieck

Version from 8th July 2024

Changelog

0.0.18

(2024-07-06) — Rewrote and extended many sections. Marking sections that are drafts or important.

0.0.17

(2024-04-26) — More on philosophy, writing, getting ideas, and my own responsibilities.

0.0.16

(2024-02-13) — Fixed some style issues and included slides on scientific writing.

0.0.15

(2023-06-01) — Discussion of a welcoming lab environment; provided more guidance for authorship rules.

0.0.14

(2023-02-21) — Added link to wiki. Removed more technical content, since this is best stored and discussed elsewhere.

0.0.13

(2023-01-27) — Updated personnel list. Added section about how to apply for vacations. Also added a new section about meetings, including their different types and purposes.

0.0.12

(2022-10-13) — Basic introduction to cluster setup, with a discussion of the necessary administrative request, as well as a brief recipe for installing Miniconda.

0.0.11

(2022-09-01) — More details on graduate school admission. Added special ID numbers for travelling. Added links for IT systems.

0.0.10

(2022-08-15) — Improved title page and quotes. Added discussion on shared authorships. Added special notation section. Added section on getting ideas.

0.0.9

(2022-07-09) — Added notes on research proposals

0.0.8

(2022-06-27) — Added notes on travelling

0.0.7

(2022-06-03) — Added sources

0.0.6

(2022-06-03) — Added section on internships

0.0.5

(2022-05-09) — Expanded on philosophy and additional details

0.0.4

(2022-04-11) — Added chapter on undergraduates; updated personnel list

0.0.3

(2022-03-21) — Added notes for collaborations

0.0.2

(2022-03-16) — Documents for doctoral studies

0.0.1

(2022-03-15) — First draft

Acknowledgements

I am grateful to Leslie O’Bray and Julius von Rohrscheidt for providing feedback on different versions of this guide. Likewise, Simon Wengert and Johanna Sommer made a lot of incisive comments that added additional depth to this guide—thank you very much for that!

License

This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0 or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

1 Introduction

This is a guide for everyone interested in the AIDOS Lab. Whether you are a prospective student or already working with us in any capacity, this guide is here to answer some of the most relevant questions while also providing an overview of what we stand for. Most of the contents of this guide pertain to students, but the technical and logistical details are relevant for postdoctoral researchers, visiting scholars, and anyone who interacts with us on a close basis.

1.1 The origins of this guide

I observed that the most successful research groups all had some form of guide that serves as an introduction to the lab, expands on certain logistical details, and so on. At the same time, I am unaware of a guide that also directly addresses expectations for Ph.D. students and other lab members. I therefore also included brief sections on topics that are not commonly discussed in such guides—but that inevitably crop up time and again—to provide a good basis for further in-person talks or one-on-one meetings.

As a secondary reason, I want to provide some structure that all of us can refer to and benefit from. I am not a big proponent of informal structures that might arise if responsibilities are not clearly defined.1 This does not mean that I believe that everything needs to be codified; on the contrary! I therefore see this guide not as ‘laying down the law’, but more as the creation of a common scaffold for all of us. We may easily disagree on some of the finer points, but in my experience, it is better to clearly describe expectations.

1 For an excellent discussion of the problem of informal power structures, see this essay on power in organisations

1.2 About the lab

Founded in late 2021, the AIDOS Lab is extremely young, giving us all the opportunity to help shape it. The word ‘AIDOS’ has two meanings that complement each other well. The first meaning refers to our mission statement, viz. to develop Artificial Intelligence for Discovering Obscured Shapes. Like the guide that you are reading, this short moniker should serve as a gentle scaffold for our research.

The second meaning originates from the Greek word ‘αἰδώς,’ which means ‘awe,’ ‘reverence,’ or ‘humility.’ This awe or humility should serve as one of our guiding principles when we work on challenging problems in healthcare research, aiming to improve our world using machine learning or other tools. The principle of humility is so central because it is easy to lose one’s sense of identity amidst things that are vying for our attention. As scientists, we are entrusted by society to think about long-term problems, and this trust requires a certain humble frame of mind.

True humility is not thinking less of yourself, it is thinking of yourself less. (C.S. Lewis)

In that spirit, please enjoy your stay with the lab! I am looking forward to doing great work together. I have tried to keep the tone rather light, even though the topics this guide addresses are certainly of great importance to us. Please let me know if I missed anything that you deem important. Ideally, this guide should be a living document, and a constant reflection of how we want to conduct our research and our daily interactions. As Heraclitus writes: ‘All is flux, nothing is stationary.’

On other lab names

If you have been around for a while in science, you are probably aware of the fact that most labs are named after the PI. You will thus find a lot of instances of ‘Smith Lab,’ for example. Many institutions follow this unfortunate paradigm and would thus prefer to refer to us as the ‘Rieck Lab.’ I believe that a name should also carry a mission as opposed to being tied to a single person.

1.3 Our philosophy

Before we get started and lose ourselves in the jungle of additional information, I want to point out the great paper by Markowetz [3], which I encourage you to read. The title already encapsulates how I feel about our work: indeed, you are not working for me but with me. In case you are interested in learning more about the background of such a philosophy, the overall umbrella term is participative leadership. Here, in no particular order, are three tenets arising from this philosophy.

  • 1. Communication is key. If you have updates on a project or need advice on something that is blocking you, make yourself heard. I am very happy to give you feedback on anything, but you should also engage with your fellow lab members. Often, even briefly talking or writing about your specific problem can help you get ‘unstuck.’ The same holds when you want to try out new ideas. Do not be afraid to briefly talk them through.

  • 2. You are in charge. While I will help guide you and your research, you are in the driver’s seat. Be proactive, reach out to others if need be, and, following the previous point, communicate what is going on. For instance, if something is not working, let me—and your project partners—know so that we can jointly figure out what is going on. Moreover, you do not have to ask for permission if you want to try out new things. Find a course that interests you? Take it! See a paper that is inspiring? Read it and think about it. You get the idea—I trust you to use your time wisely; you have full autonomy concerning the research directions.

  • 3. Ask questions. Do not be afraid to ask for clarification or help, in particular if you are unsure about how to proceed. A big part of science involves navigating unknown intellectual terrain, which can be daunting. Ask away and start discussions left and right. It will help you in the long run!

1.4 Our research

Broadly speaking, our lab is all about developing methods. I always like to use the term toolsmith for myself, echoing a great coinage by the late Fred Brooks [1]. This means that, foremost, we should be interested in developing new methods, new tools, new ways of looking at things. At the same time, this should never go at the expense of utility. Notice that utility is not the same as having a direct application! A method can provide utility by solving a problem more elegantly and with fewer assumptions, for instance. Ideally, a new tool should bring new insights. What I want to avoid is research along the lines of ‘solving somebody else’s problem using somebody else’s method.’ That is not to say that we should ignore existing literature or insist on always rolling our own solution, but it should provide an overall goal for us all to strive for.

Navigating this requires some wisdom and a lot of reflection. A quick litmus test is to ask yourself the question ‘Why am I doing this?’ At the beginning of my own Ph.D. research, I sometimes could only answer ‘Because I can.’ This is not a good motivation, and I have since thought much more deeply about this. Nowadays, I find that the dictum ‘theory without practice is empty’ sums up my views nicely. Some more thoughts about this may be found in chapter 3 and, more specifically, section 3.5.

I am aware that striving for novelty and utility is not necessarily easy, and many students are afraid of working on what they construe to be incremental advances.2 Notice, however, that scientific progress always advances incrementally in some sense. Big leaps do happen, of course, and it is good to strive for them, but at the same time, I am a big advocate of intellectual humility and giving generous credit to those who came before us. As Frank Herbert wrote: ‘From the top of the mountain, you cannot see the mountain.’ I construe this to mean that problems, once conquered, cease to look like big challenges. That does not mean that they were conquered without any effort, though. My advice to everyone, in particular if they are just starting out in science, is to focus more on the things you can do and contribute and worry about all the other aspects later. Your research is an ongoing conversation. Let it evolve and change over time!

2 This is also a comment one might get from reviewers, unfortunately. I wish we would move beyond these shallow notions of (perceived) novelty.

2 General guidelines

This chapter covers guidelines that are of general interest to anyone working with the lab or for the lab. Guidelines are ordered alphabetically. If you only read two sections of this chapter, they should be section 2.7, which outlines how members of the lab should comport themselves, and section 2.8, which provides suggestions for creating a welcoming environment.

2.1 Availability

As a general rule of thumb, all lab members should be available1 during normal European business hours, say from 10:00–16:00. This makes it possible for other groups to interact with us and greatly simplifies any collaborations. I will not blindly enforce this rule, but I expect you to notify me if you are not available during a certain period of time. Moreover, as a token of respect to all other lab members, everyone should usually be present for the lab meetings.

1 This means either being present in your office or being responsive to messages or calls when you are working from home. It does not mean that you are glued to your screen at all times.

2.2 Authorship rules

When submitting a paper to a machine learning venue, the order of authors should reflect their contribution and relevance for the project. Typically, this will mean that junior members, such as Ph.D. and graduate students, will be placed before more senior members. To assess whether a person is qualified to be an author on a manuscript, we will follow a set of rules. Anyone listed on a paper as an author has to satisfy all of the following criteria:2

  • 1. They have made substantial intellectual contributions to at least a subset of the work described in the manuscript.

  • 2. They have participated in writing or revising the manuscript.

  • 3. They are aware of the existence of the manuscript and of the intent to submit the manuscript to a certain venue.

  • 4. They agree with the content of the manuscript and are willing to vouch for the correctness of the work.

In particular, there will not be any ‘courtesy authorships.’ Unless you raise this as an issue with me, I will typically assume that there are no conflicts with these guidelines. Let me know well before the submission if this is not the case! Following these rules is a way for all of us to demonstrate scientific integrity. The main idea behind these rules is that everyone listed as an author should be comfortable with giving a talk on the work. If this is not the case, something is wrong.

I realise that this policy cannot be enforced in larger collaborations unless all participants are aware of this. Certainly, I do not expect any students to handle authorship disputes—you should always feel comfortable to bring such matters to my attention. Ensuring fair treatment in scientific matters is one of my primary responsibilities.

On shared authorships

In machine learning and many other scientific domains, it is common practice to share authorships. This is a nod to the inherent complexity and teamwork involved in large-scale projects. I encourage you to think about sharing authorships with other Ph.D. students whenever this is appropriate. Please involve me in such discussions early on. If you intend to write a publication-based dissertation, pay attention to what constitutes a ‘core’ publication for your thesis.

2 This policy is an amalgamation of the rules instituted by ETH Zurich and the ACM Criteria for Authorship.

2.3 Coding style

Since we will be working mostly with Python, here is the ‘house style’ that I urge everyone to adopt. Most of these rules are meant to make it easy to develop software together. They also make it easy for me (and others) to read your code and interact with it.

  • 1. Use a code formatter. It completely obviates any need for discussions about personal styles. If you integrate it into your editor or your GitHub Actions workflows, you can write your code in your preferred style, and let the formatter take care of the rest. Personally, I like black for code formatting because it is easy to use.

  • 2. Follow the Python style guide. The style guide, also known as PEP 8, provides numerous hints as to how to structure your code. Some of this is redundant if you already use a code formatter, but the advice on how to build better software and structure your classes is worth a read.

  • 3. Use ‘docstrings’ to document your code. If you are just drafting your code, you do not need good documentation. But as soon as the dust has settled somewhat, it is a good idea to think about writing some documentation in case you want others to build on your project. Notice that not every project needs to be perfectly documented, but you substantially increase the impact and overall utility of your work if you make it easy for others to perceive the salient points of your project. See the NumPy documentation guide for great examples of how to craft good documentation.
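To make the three points above more concrete, here is a minimal sketch of what a small function could look like once it has been run through a formatter and documented with a NumPy-style docstring. The function name `pairwise_distances` and its parameters are hypothetical and only serve as an illustration; they are not taken from any lab code base.

```python
"""Illustrative module showing the house style (hypothetical example)."""

import math
from typing import List, Sequence


def pairwise_distances(points: Sequence[Sequence[float]]) -> List[List[float]]:
    """Compute all pairwise Euclidean distances of a point cloud.

    Parameters
    ----------
    points : sequence of sequence of float
        Input point cloud. All points must have the same dimension.

    Returns
    -------
    list of list of float
        Symmetric matrix ``D`` with ``D[i][j]`` being the Euclidean
        distance between ``points[i]`` and ``points[j]``.

    Raises
    ------
    ValueError
        If the points do not all have the same dimension.

    Examples
    --------
    >>> pairwise_distances([(0.0, 0.0), (3.0, 4.0)])
    [[0.0, 5.0], [5.0, 0.0]]
    """
    # Collect the distinct dimensions; more than one means ragged input.
    dimensions = {len(point) for point in points}
    if len(dimensions) > 1:
        raise ValueError("All points must have the same dimension.")

    # Plain nested loops keep the sketch free of third-party dependencies.
    return [[math.dist(p, q) for q in points] for p in points]
```

If this were saved in a file, running `black` on that file should leave it unchanged, and `python -m doctest` on the same file would check the example in the docstring. A short, runnable example in the documentation is often the quickest way for others to see what a function is for.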

2.4 Collaboration

Science is fundamentally a collaborative endeavour. Given our research interests, we are in the special situation of working with mathematics as well as code. This makes our collaborations exciting—but may also be a source of conflict when writing code or papers together. Here are some suggestions to mitigate this (see also section 2.7 for our code of conduct):

  • 1. Understand that you will make mistakes. In the context of science, it is important to be honest about such mistakes. I would rather prevent us from publishing something that we know is wrong than hope that no one notices it. Truth is paramount, and making mistakes provides a learning opportunity for all of us.

  • 2. You are not your code or your paper. It is useful to ‘depersonalise’ criticism and not take it personally. I know that this is easier said than done, but we should all at least strive for that. The goal is to do outstanding science—and we should take criticism as a way to improve. There is value even in negative results or approaches that do not work. Do not be dismayed by this.

  • 3. There’s always a bigger fish. The goal of collaborations is to expand your knowledge. There will be others who know more about a specific facet of the research. This is perfectly normal; there’s no need to be intimidated or scared—they will probably feel the same about you!

2.5 Confidentiality

We will adhere to the highest standards of confidentiality when it comes to data entrusted to us. This means that we ensure that confidential information—including, but not limited to, patient data—remains confidential. I will treat our one-on-one meetings as confidential and will not discuss them with anyone else without your approval.3 If you are in doubt whether you are allowed to share certain information, please ask me! In particular when it comes to information that is sensitive or falls under privacy laws, it is better to get explicit approval. If I consent to sharing something, it will also be my responsibility and you will be protected.

If you are working on your private hardware, ensure that the necessary steps are taken to protect potentially sensitive information. For instance, your smartphone should be protected using a PIN and, ideally, be encrypted. Make sure to lock your screen and protect your device when leaving it in a public space.

Confidentiality should never prevent you from talking about your ideas or sharing them with other people. This section only covers the protection of any form of sensitive data from a legal point of view, in particular when such data contains information that could be potentially linked to other persons. Moreover, I want members of the lab to feel safe when it comes to discussing potentially sensitive matters among each other; such information must never be divulged to third parties, unless consent is given by all parties involved.

3 For conflict resolution, a high-level dissemination of the contents of a meeting might be required. I will discuss this prior to divulging any information to anyone else, though. See also section 2.6.

2.6 Conflict resolution

We are all human, and as such, conflicts might arise naturally. Please follow the ‘least said, soonest mended’ policy and inform me immediately if you perceive a conflict that you cannot resolve or that you want my opinion on. If the conflict involves me, I am very open to discussing it with you and understanding how I can improve as a supervisor and a person. If you do not feel comfortable having such a discussion with me, please discuss it with other members of the Institute of AI for Health (for instance, Carsten Marr).

2.7 Code of conduct

We are working together with the mission to improve the world—be it ever so slightly! Excellent science can only arise in an excellent work environment. I therefore expect all of us to follow certain guidelines in our daily dealings with others:

  • 1. Be kind. If someone performs an action that hurts you, assume no ill intent. It is easy to jump to conclusions or aim to ‘condense fact from the vapour of nuance.’ Resist the urge to follow an emotional response and, if you can, sleep on it.

  • 2. Be professional. We are all humans and come in different shapes, forms, colours, and beliefs. Treat everyone with respect and always hold yourself to the highest standards. Follow the principles outlined below.

  • 3. Be open. Good science is achieved by collaborating with others. All of us are here to learn. Do not be afraid to ask simple questions, and do not put down those who do.

When representing the lab at a conference or at another venue, remember that we serve the public. We are graciously financed from public sources and should be accountable to the public. I personally find that the Nolan Principles embody the ideals that we should strive for as scientists. Here is my take on them, with slight modifications:

  • 1. Selflessness. We should act in terms of the interest of science itself. This does not mean that we are not allowed to have our own agenda, but said agenda should never be at odds with scientific integrity (see the second point).

  • 2. Integrity. We must act with integrity in our daily work. This includes, for example, that we shall always portray the work of others faithfully and to the best of our knowledge; we never wilfully ignore mentioning related work in papers to make our work look better.

  • 3. Objectivity. We must be as fair and impartial as possible and never let our inherent biases influence our decisions.

  • 4. Accountability. We are accountable to the public for our scientific work and should always be willing to explain our decisions.

  • 5. Transparency. We should act as if our decisions are fully transparent and open. The public has a right to understand what we are doing (see previous point).

  • 6. Honesty. We should always act in a truthful manner. This also means sharing ‘bad news,’ such as experiments that do not work. There will always be setbacks and things that do not work, but we must be able to rely on the fundamental veracity of members of the lab.

In software projects, I often summarise this code of conduct as follows:

Developing is a fundamentally collaborative endeavour. Whenever human beings interact, there is a potential for conflict. At the same time, there is an even greater potential to build something that is ‘bigger than the sum of its parts.’ In this spirit, all contributors shall be aware of the following guidelines:

  • Be tolerant of opposing views.

  • Be mindful of the way you utter your critiques; ensure that what you write is objective and does not attack a person or contain any disparaging remarks.

  • Be forgiving when interpreting the words and actions of others; always assume good intentions.

We want contributors to participate in a respectful, collaborative, and overall productive space here. Any harassment will not be tolerated.

By the by: I do not think that such a code of conduct is strictly necessary, but I believe that it is useful to write down some values that we should adopt.

2.8 Creating a welcoming environment

Many of the guidelines mentioned in this guide have the explicit goal of creating a welcoming environment in the lab. Our research provides us with unprecedented access to people from many different cultures and walks of life. Yet, segregation and prejudices based on our visible and invisible differences continue to haunt and mar human relations. We need to work hard to address these issues properly, lest our biases lead to certain groups being prevented from excelling. As part of that, we should all strive to make our lab environment welcoming for all. Among the myriad things we can do to improve our daily interactions, I want to highlight two important aspects:

  • Whenever possible, conduct your discussions in English, to make it easier for others to join in, and to prevent excluding anyone.

  • Always use the names and pronouns that a person wants you to use to address them. Some of our fellow researchers are suffering from deadnaming by governmental organisations. Let us be a force for good here!

2.9 Independence

In a very real sense of the word, you are the captain of your journey through academia. I expect you to be the person in charge of your projects and work independently. You get to decide where to go and where to take things; my job is to provide guidance and support you in your endeavours (see section 2.15 for more thoughts on this). My support and involvement can always be adjusted to match your needs. There are projects where you might prefer close guidance, while in other projects, you might prefer to be left to your own devices. You are in charge!

2.10 Internships

I am a big proponent of internships and secondments. They are an excellent opportunity for lab members to learn new skills, get to collaborate with other labs, and change their scenery for a spell. I will support all internships—whether they are academic or with labs from industry—to the best of my capabilities. My preference is for lab members to not jump into internships right away, but an excellent time could be during the middle of your planned stay in the lab. That way, there will not be any time pressure such as a looming deadline for handing in your Ph.D. thesis. Moreover, this will also enable you to share some of the experiences with the group. I would very much like to use internships as a way to obtain new skills, perspectives, and ideas for both the intern and the lab.

One great way to plan such a secondment involves ELLIS. As soon as you meet the requirements for joining ELLIS as a Ph.D. student, you are eligible to join the research group of any other ELLIS member for some time. Make use of this opportunity and discuss it with me early on if you are interested.

2.11 Lab books

I strongly encourage everyone to maintain a lab book. This is a great way to structure information, store intermediate results, and also get an idea of what one has been working on at a certain point in time. Different types of lab books work for different people. I personally maintain a lab book as a text-only Markdown document because I like it to be available on every computer I use. Your mileage may vary—I also enjoy taking handwritten notes or using my iPad for this purpose. Find something that works for you and do not be afraid to try out multiple ways of keeping notes.

You may also use the lab book to create an agenda for meetings with me. This is certainly not a must—you decide whether to share the lab book with me or not. The upside of having a lab book is that you will always be able to understand where you spent the bulk of your time; this can be very beneficial for weeks that just do not feel that busy.

2.12 Meetings

Meetings, whether remote or in-person, are a double-edged sword: if done well, they are inspiring and very conducive to research. If done badly, they can suck the joy out of your day. We strive—obviously—to only have the first type of meeting. Notice that there are different kinds of meetings with different modalities and goals. For instance, if we are collaborating closely, we will have regular one-on-one meetings. These meetings are time that belongs to you, and you can fill it with anything that comes to your mind. I might have some direct questions to check in with you, but apart from that, you fully control the flow of these meetings; indeed, you can and should feel free to ask me anything. The primary purpose of these meetings, next to being able to see each other on a regular basis, is for me to help you get unstuck. We can talk about some of your research issues or obstacles you face and there is the distinct possibility that just by talking about them, you will be able to overcome your obstacles or formulate new plans of attack.

A good way to think about me (i.e. your supervisor) is to consider me an input–output machine.4 In our one-on-one meetings, you provide me with some input, which I use to generate some output, typically in the form of feedback or suggestions. However, I cannot create useful output from empty inputs. To be able to help you effectively, I need you to provide me with something.

By contrast, there are project meetings. These meetings are more about discussing the progress of projects, chiming in when it comes to certain aspects of the project, or discussing new ideas and directions. Project meetings typically comprise more people, and there is always the risk of zoning out if it is not your turn to say something. We should resist that urge and think alongside our collaborators, though, because that way, we can make more rapid progress. Project meetings run the risk of feeling less effective, but we will try to keep them short and to the point. The most important thing to take from such meetings is an understanding of who is responsible for what. With some of my collaborators, I tend to use the terminology of the ‘token,’ i.e. if I have the ‘writing token,’ then I am responsible for writing until I pass the token to someone else. Being upfront about the responsibilities is a great way to make rapid progress.

Regardless of the type of meeting, if at any point it is unclear to you what was discussed or what you should do, reach out to me. There is absolutely no shame in asking for clarifications, and I would rather re-discuss a potentially tricky aspect of a project multiple times than not make any progress at all. The same holds of course in case you disagree with a direction or are concerned about its feasibility. Please raise such concerns with me; despite what some established researchers would like you to believe, we are neither all-knowing nor all-powerful. We may miss issues just like everyone else.

4 This is from a great blog post by Austin Henley.

2.13 Mistakes

We are all human and make mistakes. ‘To err is human, to forgive divine.’ In that spirit, it is important to realise that mistakes happen and are a natural part of doing research. We should always strive to be open about mistakes that we make. For instance, if the calculations for a paper under revision are incorrect, it is critically relevant for team members to be aware of it as early as possible. At the very least, I should be made aware of it. I want everyone to feel confident in ‘confessing’ whenever a non-trivial mistake happens. Together, we can rectify the mistake and—most importantly—learn from it.

2.14 Research ideas and proposals

One of the most common dilemmas that I faced during my Ph.D. was elucidating my research ideas and proposals to myself and others. During my postdoctoral stay in Karsten Borgwardt’s group, I condensed my own experiences into a set of guiding principles, shown in the form of questions to shape a research project. Whenever you embark on a research endeavour, these questions may hopefully help shape and sharpen your ideas:

  • 1. What is the main idea of your project? Try to be as succinct as possible and refrain from using too much jargon.

  • 2. What are existing approaches, i.e. potential comparison partners, in this context? How do they measure up to your proposal?

  • 3. How do you measure success in this project? Notice that ‘success’ can take many forms; a review paper, for instance, might be considered successful if it manages to describe and categorise a specific research domain.

  • 4. Why is your project relevant? Try to see this from multiple angles and think about the potential audience of the project. A project might be relevant because it solves an existing problem; it might also be relevant because it builds a bridge between two disciplines, etc.

Additional resources

  • The Heilmeier Catechism is a famous list of questions George H. Heilmeier, a director of DARPA, used to pose. I personally do not like phrasing this as a catechism, but the overall spirit of assessing research proposals goes in a similar direction.

2.15 Responsibilities

Every person in the lab has certain responsibilities, which we mention here to provide an understanding of the expectations of different roles in the lab. The descriptions are kept relatively general on purpose; they should serve to provide a suitable overview. These responsibilities are paraphrased from the work guidelines of the DIB Lab, headed by Titus Brown.

Bastian’s responsibilities

  • Provide scientific guidance and leadership with respect to ongoing and upcoming projects. Depending on the project, this can mean different levels of involvement. Please just ask if you want to change my level of involvement in your project.

  • Be available for discussions, brainstorming sessions, and meetings whenever necessary. If you cannot get a hold of me in real life, here are some ways to contact me virtually: for questions with a time-critical component, I prefer Slack. For longer questions that can be answered asynchronously, e-mail is best. I am also happy to hop on any meeting tool with you or meet in real life on campus.

  • Provide guidance for the careers of lab members and assist them in reaching their career goals by, for instance, writing letters of recommendation or pointing out opportunities for advancement. I am happy to discuss your career with you, and will always find the time to do so.

  • Ensure funding and acquire new funding.

  • Resolve scientific and personal disputes both inside and outside the lab.

Postdoc responsibilities

  • Develop research projects or identify potential new research directions, either on their own or with the help of the PI.

  • Pursue projects and collaborations that are aligned with their own career goals. Such projects do not necessarily have to involve any other members of the lab.

  • Present their work at lab meetings and other gatherings of lab members.

  • Support the PI by providing scientific leadership within the lab.

Ph.D. student responsibilities

  • Develop research projects or identify potential new research directions with the help of postdocs and/or the PI.

  • Work towards the goal of obtaining a Ph.D. with their research.

  • Present their work at lab meetings and other gatherings of lab members.

2.16 Social media

I encourage you to use social media in order to interact with other researchers and promote your own research. Setting up a blog is a simple way to get people excited about projects. In particular for machine learning projects, a blog post, with its more informal tone and no restrictions on length, can often turn out to be extremely helpful for others and ‘put you on the map.’ Your own research portfolio can be easily hosted on GitHub, for instance. Approach me if you are interested in additional hosting options and want some feedback on creating your portfolio.

In addition to maintaining such a website, I can also recommend creating a Twitter profile and following researchers or institutions that are aligned with your research interest. While the signal-to-noise ratio might leave something to be desired, Twitter can be a valuable source of fellowships, job advertisements, and papers. I highly recommend reading Cheplygina et al. [2] for tips on how to get started.

3 For (prospective) Ph.D. students

This chapter discusses your role as a (prospective) Ph.D. student in our lab. It outlines your privileges, your responsibilities, and aims to give general advice on certain issues, such as structuring your work, solving issues, and taking vacations.

If you are about to start your Ph.D. with us, make sure that you use the first few weeks to establish a sustainable work routine. The subsequent sections aim to provide more details about this, and you should make sure to find something that works for you. The other lab members and I are happy to share our own thoughts about this!

3.1 Onboarding checklist

This section is still very much a work-in-progress. It is updated whenever a new process arises that can be automated. Some of these steps will have to be performed manually by you; however, if you see something that can be improved, please tell me!1

  • Your equipment will be provided by the centre. Normally, this should happen on your first day; details about that day will be sent to you in a separate e-mail by our HR coordinator (see section 5.3 for the current team).

  • You should be part of our Slack already. If not, drop me an e-mail.

  • Create a GitHub account or, if you already have one, add your university e-mail address to it. Then inform me so that I can add you to our GitHub organization.

  • Study the guidelines of HELENA, i.e. our local graduate school, to learn about the registration process. Being a registered graduate student confers certain benefits, including additional access to great courses and travel support. Since I am affiliated with the Technical University of Munich, you also need to check their rules for being enrolled in the roster of graduate students.

1 This is a general statement, pertaining to all the things discussed in this guide.

3.2 What is the goal of a Ph.D.?

The goal of a Ph.D. is to increase our knowledge of the world. This sounds lofty and exalted, but it really boils down to providing some additional insights into data or phenomena. There are numerous ways, none fundamentally ‘better’ than the other, to achieve this goal. In some cases, your insights will pertain to addressing complex issues, for instance contributing to our understanding of illnesses, such as Alzheimer’s disease. In other cases, your contributions may well be of a more theoretical nature—showing that a certain algorithmic approach is feasible, for example. Your individual trajectory might also oscillate between theory and practice. The important thing is that you will be able to point towards novel insights that you and your research helped unveil.

This is the primary goal of a Ph.D. A germane set of ancillary goals involves the things you might usually associate with pursuing a Ph.D., viz. (i) reading publications, (ii) publishing papers at conferences, in journals, and in workshops, (iii) discussing your work with other scientists, and (iv) writing a dissertation thesis. While important and conducive to reaching the primary goal, these are, in some sense, much more mundane tasks than contributing to the totality of human knowledge. I am stressing this point because every Ph.D. journey is different—it is research, after all—and not every Ph.D. student might end up with the same number of publications afterwards. This does not detract from the overall value of the endeavour! It is tough to resist the ‘counting’ mentality, but please keep in mind this overall goal and Goodhart’s Law, which I paraphrase as follows: ‘When a measure becomes a target, it ceases to be a good measure.’ Thus, do not aim for publication output but aim for knowledge first. It is better to graduate with a few excellent publications than lots of mediocre ones.

3.3 Time management

It is your responsibility to manage your time during the Ph.D., and there is no ‘one-size-fits-all approach.’ However, there are certain patterns that have proven effective, so I want to spell them out explicitly:

  • Make sure to allocate time for reading. You are a scientist, so reading papers is a core activity that you should do in your work time.

  • Take your time to document your thoughts and experiments. This will be very useful when writing your thesis. See also section 2.11 on keeping a lab book.

  • Set aside some time for ‘free-form learning.’ If you encounter a concept, a skill, or a technology that interests you, dive into it! It is often through serendipity that new research ideas are created.

Depending on the way you are internally wired and motivated, it might be useful to have regular mini-deadlines or milestone check-ups. Raise these things in our one-on-one meetings to make sure that I am aware of your needs.

3.4 Work–life balance

There is more to life than research; even if we all enjoy it, everyone needs to recharge their batteries. I encourage everyone to keep a larger ‘identity’ beyond ‘I am doing research all the time’. Your output will suffer if you do not take care of your overall well-being. As a general reminder that supersedes everything else, consider the ‘Do as I say, not as I do’ rule; I am writing this section as much for myself as I am writing it for you!

3.4.1 ‘Shipping is also a feature’

When you are writing code for your project, working on a paper, or writing a grant proposal, always remember that ‘Shipping is also a feature’. If you never submit your work to the scrutiny of the peer review process, you will never get any critical feedback. More importantly, your work will never be seen and thus also never be used by others. You are depriving yourself and the world by this. Thus, resist the urge to write the perfect paper. Your goal should be to iterate towards perfection—but you should always take opportunities to disseminate your work to a wider audience. I am not saying that you should intentionally lower the quality of your work. Just be on the lookout for the point in time at which your return on investment is just not that big any more. It’s better to submit a paper that you consider to be 90% finished than to wait for years until you finish the remaining 10%.

The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time. (Tom Cargill, Bell Labs)

3.4.2 ‘It’s a marathon, not a sprint’

Good research takes time. Do not fall into the trap of trying to solve everything at once. Your Ph.D. should be treated as a marathon first and foremost; avoid the ‘workaholism’ culture that you will sometimes find (not only) in academia. If you are bragging about not sleeping sufficiently, or if you find yourself repeatedly discussing how many things you sacrificed for your work, you might be on the wrong track. Your capacity to do deep work is diminished if you are sleep-deprived; your capacity to enjoy this marvellous journey might be killed as well. Resist this at all costs!

In my experience, people who obsess over how little sleep they need to be (barely) functioning are often highly inefficient. Of course, they put in a lot of hours, but they do not accomplish a lot because their priorities are not well-aligned. If you are always at ‘red alert’, the term will lose its meaning. There might be times when you need to put in more hours to get something done, and it is a good idea to approach these times as a well-rested individual rather than an almost burnt-out candle. See also section 5.4 and section 5.5 for some practical tips about taking your vacations.

3.4.3 ‘To thine own self be true’

Resist the urge to compare yourself to others. I know that this is easier said than done, given that you will meet a lot of smart, ingenious, hard-working people. However, be aware that everyone has a different background and a different journey. While you can be inspired by what others are doing, you should also aim to focus on your own unique strengths.

3.5 Getting ideas

When you are just starting out with your Ph.D. project, you may feel that you do not have an understanding of the community and how to create ‘good’ research ideas. While there is no easy recipe, I am fond of a process that involves continuously refining and iterating on ideas to sharpen them. To actually get ideas in the first place, think about gaps in the literature or potential extensions. What is not yet properly addressed? Where would additional work be required? Even if such a gap may seem small and insignificant to you at first, it may often grow into something magnificent over time.

Gold is where you find it. (American proverb)

Related to this, be sure to talk about your ideas; getting feedback from others helps you start the important refinement loop I mentioned above. When giving feedback, make sure you always follow the ‘Yes, and…’ rule of improvisational comedy. By this, I mean that you should strive to build on ideas that others present, rather than shooting them down for being impractical or infeasible. A good idea can often arise from reducing a seemingly bad one to its core constituents.

3.6 Publishing

You will probably hear the stupid saying ‘Publish or perish’ quite often. Don’t be alarmed by this. In my view, publishing is one of the joys of research—you get to ‘wrap’ your thoughts and experiments into a nice paper and set it free for others to use. If you are new to this—and most of you reading this probably are—do not be daunted. I and the other group members will accompany you during this journey.

3.6.1 Writing style

Unless other style rules apply,2 we will follow the style guide of the Federal Chancellery. Essentially, this means that our papers will be written in British English with ‘-yse’/‘-ise’ suffixes (e.g. analyse and optimise instead of analyze and optimize). This serves as a way to make our publications consistent and is more efficient than having to come up with new rules every time.

2 In existing projects, in particular those involving US-based collaborators, other writing styles can be followed.

3.6.2 Some writing tips

Writing is an art in and of itself. It is the single most important skill you have to grow during your Ph.D.; good writing will serve you well throughout your career. When writing a paper, you are competing for the attention of the reader. Good writing helps you keep that attention, while bad writing drives readers away, and once you have lost a reader’s attention, you have lost the reader for good. Hence, when writing a paper, your overall goal should be to write concisely. This entails getting rid of empty phrases and using short verbs whenever possible. Personal pet peeves of mine include writing ‘utilise’ when ‘use’ would do, or using ‘methodology’ instead of plain ‘methods.’ Remember that papers are supposed to be read by many people; concise writing improves accessibility. Here are some additional tips that I find generally useful:

  • Prefer verb forms over nouns. English is great in that we can use either the verb form or the noun form of a word. However, in scientific writing, the verb form should be preferred because it is clearer. Compare ‘This enables the calculation of topological features by the neural network’ versus ‘The neural network can calculate topological features.’ I am arguing that in the second form, it is clearer who is acting (the neural network).

  • Check whether absolutes are required. It is rare3 that we use absolute words (such as ‘never’ or ‘always’) in scientific writing.

  • Use the active voice. The active voice increases the precision of your writing and reduces its complexity at the same time.

Finally, be aware that ‘the perfect is the enemy of the good.’ It is natural to start with a (bad) first draft and refine it iteratively. The important thing is to start.

3 What, never? No, never! What, never? Well, hardly ever!

3.6.3 Common mistakes in writing

In addition to the advice on writing and developing your own style in the previous section, this section lists several common mistakes that are best avoided.

  • 1. Inconsistent citations or bibliographic entries: Unfortunately, most bibliographic entries that you can download from sources like Google Scholar are neither correct nor consistent. You will find different spellings of author names, missing umlauts or other characters, and much more. Avoid this by meticulously correcting bibliographic entries. A good bibliography is the hallmark of good scholarship.

  • 2. Excel-like tables: Probably due to our over-exposure to Excel, we tend to format our tables in a way that is not conducive to improved information processing. Make sure to use good packages for formatting high-quality tables. If you are using LaTeX, the booktabs package is great and gives a lot of tips on how to create high-quality tables. Together with a package like siunitx, you can format and align your numbers consistently, making it a pleasure to read and interact with your data.

3.7 Writing code

Just like writing a paper, writing code is more of a craft than a science. However, it is an important skill to learn: good code can make it easier for people to use your methods, thus substantially increasing their impact and reach. Good code can also facilitate collaborations and serve to ensure that results are reproducible. The following quote should serve as your guiding principle when writing code:

Programs must be written for people to read, and only incidentally for machines to execute. (Harold Abelson)

It will thus pay off to make sure that the next person reading the code has a rough idea of what is going on. Note that the next person reading the code may be you most of the time, for instance when collecting results for a follow-up publication. Here are the most important guidelines that I have distilled from past projects:

  • 1. Use a version control system to manage your code. We use git together with GitHub. Take the time to learn the basics (see below for additional resources); it will save you a lot of time in the long run!

  • 2. Limit your line lengths to a reasonable number. I personally prefer 72 since I am coding a lot on my laptop.

  • 3. When programming Python, follow the PEP 8 style (see below). If you collaborate with multiple persons, enforce4 a coding style for the full project—this makes code more consistent and easier to maintain.

4 There are tools like black or gray for formatting your code automatically.
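To make these guidelines a bit more tangible, here is a minimal sketch of a small Python module that follows PEP 8 and keeps all lines below 72 characters. The function name, the docstring layout, and the example data are hypothetical and only serve as an illustration; your project may well use different conventions.

```python
"""Summary statistics for experiment runs (illustrative only)."""

from statistics import mean, stdev


def summarise_scores(scores):
    """Return the mean and sample standard deviation of scores.

    Parameters
    ----------
    scores : list of float
        Accuracy values, one per experiment run. At least two
        values are required for the standard deviation.

    Returns
    -------
    tuple of float
        The mean and the sample standard deviation.
    """
    if len(scores) < 2:
        raise ValueError("Need at least two scores.")

    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Hypothetical accuracies from three runs of one experiment.
    run_scores = [0.5, 0.75, 1.0]
    average, spread = summarise_scores(run_scores)
    print(f"accuracy: {average:.2f} +/- {spread:.2f}")
```

The descriptive name, the docstring, and the explicit error message are there for the next person reading the code, which may well be you. A formatter can take care of the layout automatically, so that you can focus on names and documentation.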

3.8 Conclusion

I wish you an amazing time with the lab and hope you find joy in your research! If all of these suggestions are too much, feel free to ingest them at another point in time. Again, this document should only serve as a gentle scaffold—as opposed to a prison—for all of us.

4 For (prospective) undergraduate students

This chapter discusses the role of a (prospective) undergraduate student in our lab. You might be considering working on some project with us, for example as part of your bachelor’s or master’s thesis, or as a stand-alone stay. While many of the general recommendations in chapter 3 apply, your situation is special and deserves some more in-depth remarks.

4.1 What is the goal of your stay with us?

Your primary goal is to become an expert in a topic and learn how to communicate efficiently and effectively about this topic by means of a project report or a written thesis. Ideally, your work will result in a publication in an appropriate venue.1 Since your project is typically shorter than a Ph.D. project, we will ensure that the individual learning goals are feasible.

1 Typically, such publications will be pursued after your main work has been finished. A publication will not count towards your grade and, depending on the project, publications might be easier or harder to obtain.

4.2 Potential topics

When it comes to selecting a topic for your stay with us, there is a large amount of flexibility. Bearing in mind overall feasibility, I am particularly fond of projects that either involve improving an existing method, or that add a new facet—such as a new application domain—to an existing method. Of course, ‘the sky is the limit;’ if you have a research idea that you want to pursue with us, I am happy to discuss it with you!

For some examples of potential topics, check the publication lists of members of the lab. You can also approach us with your project ideas, even if you think that they might not be a great fit for us. The only caveat is that we want to ensure that we are capable of providing outstanding supervision to you. If your topic is completely outside our area of expertise, we might not be able to fulfil our end of the supervision agreement, potentially making it harder for you to finish the project in case of obstacles.

4.3 Writing a thesis or a project report

The writing tips outlined in section 3.6 apply to a written thesis or a project report as well. Here, unlike for a scientific publication, your goal should be to strive for a comprehensive treatment of the topic at hand. The idea is for you to demonstrate that you have become proficient in a topic and are capable of communicating said proficiency to others. Hence, your thesis or project report is a perfect opportunity to practise your writing skills. Regardless of your career path, such skills will always be useful.

A common mistake in thesis writing is an incorrect level of detail, i.e. putting too much emphasis on too many details. Paradoxically, this can be detrimental to progress and become an additional source of stress, so it is best avoided. While it is important for your thesis to be self-contained and readable by, say, a fellow student of yours, there is no need to recapitulate any common definitions. I realise that this is a vague point; what is common for one person might be news for another one. A good rule of thumb is to think about basic required undergraduate courses. If the material is covered in them, it is probably fine to exclude it.

5 Miscellaneous information

This is the more ‘pedestrian’ part of the guide. Some administrative information appears to be unavoidable.

5.1 Important identification numbers

When interacting with our IT systems, you might be asked about certain ID numbers or ‘PSP Elements.’ Here are some relevant ones that you might have to use:

Number          Usage
G-540007-001    For all expenses related to AIH. Use this for departmental retreats and events, for instance.
G-540003-001    For all expenses related to our group. This is the primary number you need to use for billing expenses to our budget when travelling.

5.2 IT systems

Getting started with Unix or Linux (or even the command-line) can feel daunting at first, but your efforts will pay off. Most of the code you will write as part of your research career will run in one variant of a Unix system, so all the skills will remain useful for a long time to come. As a member of the lab, you will have access to the IT resources of the institute. Our local IT services have written an excellent guide in their wiki. Be sure to familiarise yourself with this to set up your own computing environment most efficiently. Lab members can also contribute little helper scripts or so-called dotfiles in the dotfiles repository. You will find additional information and recipes about working with the cluster in our own wiki.

Requesting a cluster account

By default, users do not have access to the compute cluster. You have to create a request for a cluster account (HPC account) in SPIT. As part of this request, or with a separate one, you may suggest that your default login shell be changed to bash (or any other shell you had in mind, such as zsh). This will simplify the subsequent installation slightly and probably provide a more comfortable working environment.

5.3 Personnel

This is a list of important personnel and their roles in the centre as well as in the research group. I will keep this list up to date in order to provide all of us with knowledge about whom to contact for certain things.

Person                   Role
Antunes, Marianne        HR Manager
Bauer, Florian           Procurement (hardware, software)
Feest, Christoph         AIH Director, Helmholtz.AI Director
Singiali, Bom Bahadur    System Engineer (cluster)

5.4 Travelling

Travelling is a staple of the scientific life, and you should make use of the opportunity to visit conferences, workshops, or other research groups. To be properly reimbursed for the travels, follow these steps:

  • 1. Discuss the proposed trip with me and create a travel request in SPIT. Make sure that this request is created prior to the actual travels; for legal reasons, we are not allowed to travel for work without this type of authorisation. Select either Carsten (Marr) or Christoph (Feest) as the approver of the request.

  • 2. While travelling, collect all additional costs that can be reimbursed (taxi rides, additional food if none was provided by the venue, and so on).

  • 3. After your trip, you will be automatically mailed a form for writing down your expenses. Send this, along with all receipts of your additional costs, to the administrative assistant handling travel reimbursements. The easiest way is to use the following e-mail address: fa-travel@helmholtz-muenchen.de.

As always, notify me about any hiccups in this process. You are entitled to compensation for food and other costs; if necessary, I will discuss any issues with the administration. The usual caveats for travelling apply, of course: being tax-funded ‘servants of the people,’ we are supposed to be frugal when it comes to booking reservations or vehicles. As such, it might be necessary for us to share accommodations or use a cheaper means of transportation if reasonably feasible.

5.5 Vacation

Make sure to take time off to rest and recharge! Research can be a burden at times, so it is always a good idea to step away for some time. Applying for vacation time is easy; just follow these steps:

  • 1. Create a request for vacation in SPIT. Make sure that this request is created prior to the actual vacation; backdating a vacation is possible but takes a lot of effort. I have been there; trust me. When creating the request, put me as the approver.

  • 2. Notify me about the request. The system sends out notifications by itself, but it is always good to have this on my radar.

  • 3. Enjoy your time off!

  • 4. Consider sharing some memories afterwards with the group. (This step is optional, but I always like to hear tall tales of adventure.)

5.6 What-the-heck list

In particular if you are just starting out in the lab, I would highly appreciate it if you brought things that made you go ‘What the heck?’ to my attention. In the spirit of continuous improvement, it is important to know where things are going wrong, and it often takes a new set of eyeballs to perceive these issues. Ideally, I would like to prevent fires rather than putting them out later on.

Bibliography

  • [1] Frederick P. Brooks. ‘The Computer Scientist as Toolsmith II’. In: Communications of the ACM 39.3 (1996), pp. 61–68. doi: 10.1145/227234.227243.

  • [2] Veronika Cheplygina et al. ‘Ten simple rules for getting started on Twitter as a scientist’. In: PLOS Computational Biology 16.2 (2020), pp. 1–9. doi: 10.1371/journal.pcbi.1007513.

  • [3] Florian Markowetz. ‘You Are Not Working for Me; I Am Working with You’. In: PLOS Computational Biology 11.9 (2015), pp. 1–8. doi: 10.1371/journal.pcbi.1004387.

Inspiration

This guide is inspired by several similar documents. The primary source is the Syllabus for Eric’s PhD Students written by Eric Gilbert. Moreover, Ryan Cotterell provides helpful hints on his supervision style, some of which have been incorporated in this guide as well.