This is a guide to the AIDOS Lab, a research lab interested in geometry, topology, and machine learning. The guide contains information for prospective graduate students, as well as some general advice that might be helpful for those navigating academic systems. Feedback is always welcome—please reach out to bastian@rieck.me.
Changelog
- 0.0.19 (2024-10-31): Removed administrative information to focus on research. Added a lot of information on getting research ideas and publishing them.
- 0.0.18 (2024-07-06): Rewrote and extended many sections. Marking sections that are drafts or important.
- 0.0.17 (2024-04-26): More on philosophy, writing, getting ideas, and my own responsibilities.
- 0.0.16 (2024-02-13): Fixed some style issues and included slides on scientific writing.
- 0.0.15 (2023-06-01): Discussion of a welcoming lab environment; provided more guidance for authorship rules.
- 0.0.14 (2023-02-21): Added link to wiki. Removed more technical content, since this is best stored and discussed elsewhere.
- 0.0.13 (2023-01-27): Updated personnel list. Added section about how to apply for vacations. Also added a new section about meetings, including their different types and purposes.
- 0.0.12 (2022-10-13): Basic introduction to cluster setup, with a discussion of the necessary administrative request, as well as a brief recipe for installing Miniconda.
- 0.0.11 (2022-09-01): More details on graduate school admission. Added special ID numbers for travelling. Added links for IT systems.
- 0.0.10 (2022-08-15): Improved title page and quotes. Added discussion on shared authorships. Added special notation section. Added section on getting ideas.
- 0.0.9 (2022-07-09): Added notes on research proposals.
- 0.0.8 (2022-06-27): Added notes on travelling.
- 0.0.7 (2022-06-03): Added sources.
- 0.0.6 (2022-06-03): Added section on internships.
- 0.0.5 (2022-05-09): Expanded on philosophy and additional details.
- 0.0.4 (2022-04-11): Added chapter on undergraduates; updated personnel list.
- 0.0.3 (2022-03-21): Added notes for collaborations.
- 0.0.2 (2022-03-16): Documents for doctoral studies.
- 0.0.1 (2022-03-15): First draft.
Acknowledgements
I am grateful to Leslie O’Bray and Julius von Rohrscheidt for providing feedback on different versions of this guide. Likewise, Simon Wengert and Johanna Sommer made a lot of incisive comments that added additional depth to this guide—thank you very much for that!
License
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0 or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
1 Introduction
This is a guide for everyone interested in the AIDOS Lab. Whether you are a prospective student or already working with us in any capacity, this guide is here to answer some of the most relevant questions while also providing an overview of what we stand for. Most of the contents of this guide pertain to students, but they may also be relevant for postdoctoral researchers, visiting scholars, and anyone who interacts with us on a close basis.
1.1 The origins of this guide
I observed that the most successful research groups all had some form of guide that serves as an introduction to the lab, expands on certain logistical details, and so on. At the same time, I am unaware of a guide that also directly addresses expectations for Ph.D. students and other lab members. I therefore also included brief sections on topics that are not commonly discussed in such guides—but that inevitably crop up time and again—to provide a good basis for further in-person talks or one-on-one meetings.
As a secondary reason, I want to provide some structure that all of us can refer to and benefit from. I am not a big proponent of informal structures that might arise if responsibilities are not clearly defined.1 This does not mean that I believe that everything needs to be codified; on the contrary! I see this guide therefore not as ‘laying down the law’, but more as the creation of a common scaffold for all of us. We may easily disagree on some of the finer points, but in my experience, it is better to clearly describe expectations.
1 For an excellent discussion of the problem of informal power structures, see this essay on power in organisations
1.2 About the lab
Founded in late 2021, the AIDOS Lab is extremely young, giving us all the opportunity to help shape it. The word ‘AIDOS’ has two meanings that complement each other well. The first meaning refers to our mission statement, viz. to develop Artificial Intelligence for Discovering Obscured Shapes. Like the guide that you are reading, this short moniker should serve as a gentle scaffold for our research.
The second meaning originates from the Greek word ‘αἰδώς,’ which means ‘awe,’ ‘reverence,’ or ‘humility.’ This awe or humility should serve as one of our guiding principles when we work on challenging problems in healthcare research, aiming to improve our world using machine learning or other tools. The principle of humility is so central because it is easy to lose one’s sense of identity amidst things that are vying for our attention. As scientists, we are entrusted by society to think about long-term problems, and this trust requires a certain humble frame of mind.
True humility is not thinking less of yourself, it is thinking of yourself less. (C.S. Lewis)
In that spirit, please enjoy your stay with the lab! I am looking forward to doing great work together. I have tried to keep the tone rather light, even though the topics this guide addresses are certainly of great importance to us. Please let me know if I missed anything that you deem important. Ideally, this guide should be a living document, and a constant reflection of how we want to conduct our research and our daily interactions. As Heraclitus writes: ‘All is flux, nothing is stationary.’
On other lab names. If you have been around for a while in science, you are probably aware of the fact that most labs are named after the PI. You will thus find a lot of instances of ‘Smith Lab,’ for example. Many institutions follow this unfortunate paradigm and would thus prefer to refer to us as the ‘Rieck Lab.’ I believe that a name should also carry a mission as opposed to being tied to a single person.
1.3 Our philosophy
Before we get started and lose ourselves in the jungle of additional information, I want to point out the great paper by Markowetz [4], which I encourage you to read. The title already encapsulates how I feel about our work: indeed, you are not working for me but with me. In case you are interested in learning more about the background of such a philosophy, the overall umbrella term is participative leadership. Here, in no particular order, are three tenets arising from this philosophy.
1. Communication is key. If you have updates on a project or need advice on something that is blocking you, make yourself heard. I am very happy to give you feedback on anything, but you should also engage with your fellow lab members. Often, even briefly talking or writing about your specific problem can help you get ‘unstuck.’ The same holds when you want to try out new ideas. Do not be afraid to briefly talk them through.
2. You are in charge. While I will help guide you and your research, you are in the driver’s seat. Be proactive, reach out to others if need be, and, following the previous point, communicate what is going on. For instance, if something is not working, let me—and your project partners—know so that we can jointly figure out what is going on. Moreover, you do not have to ask for permission if you want to try out new things. Find a course that interests you? Take it! See a paper that is inspiring? Read it and think about it. You get the idea—I trust you to use your time wisely; you have full autonomy concerning the research directions.
3. Ask questions. Do not be afraid to ask for clarification or help, in particular if you are unsure about how to proceed. A big part of science involves navigating unknown intellectual terrain, which can be daunting. Ask away and start discussions left and right. It will help you in the long run!
1.4 Our research
Broadly speaking, our lab is all about developing methods. I always like to use the term toolsmith for myself, echoing a great coinage by the late Fred Brooks [1]. This means that, foremost, we should be interested in developing new methods, new tools, new ways of looking at things. At the same time, this should never come at the expense of utility. Notice that utility is not the same as having a direct application! A method can provide utility by solving a problem more elegantly and with fewer assumptions, for instance. Ideally, a new tool should bring new insights. What I want to avoid is research along the lines of ‘solving somebody else’s problem using somebody else’s method.’ That is not to say that we should ignore existing literature or insist on always rolling our own solution, but it should provide an overall goal for us all to strive for.
Navigating this requires some wisdom and a lot of reflection. A quick litmus test is to ask yourself the question ‘Why am I doing this?’ At the beginning of my own Ph.D. research, I sometimes could only answer ‘Because I can.’ This is not a good motivation, and I have since thought much more deeply about this. Nowadays, I find that the dictum ‘theory without practice is empty’ sums up my views nicely. Some more thoughts about this may be found in chapter 3 and, more specifically, section 3.4.
I am aware that striving for novelty and utility is not necessarily easy, and many students are afraid of working on what they construe to be incremental advances.2 Notice, however, that scientific progress always advances incrementally in some sense. Big leaps do happen, of course, and it is good to strive for them, but at the same time, I am a big advocate of intellectual humility and giving generous credit to those who came before us. As Frank Herbert wrote: ‘From the top of the mountain, you cannot see the mountain.’ I construe this to mean that problems, once conquered, cease to look like big challenges. That does not mean that they were conquered without any effort, though. My advice to everyone, in particular if they are just starting out in science, is to focus more on the things you can do and contribute and worry about all the other aspects later. Your research is an ongoing conversation. Let it evolve and change over time!
I am personally less an advocate of what one could call ‘leaderboard science,’ i.e. the attempt to beat the state of the art, often achieved by smart, incremental changes to existing methods. I have the utmost respect for these amazing engineering feats that often result in considerable performance improvements, so if you want to do this, you will have my full and unequivocal support—I just find this particular type of research quite hectic since you always run the risk of being scooped (meaning that someone else got better results with similar tricks). Of course, that is not to say that you should not strive for good (predictive or computational) performance, on the contrary! I just prefer the initial focus to lie on the method and an analysis of its advantages and limitations.
2 This is also a comment one might get from reviewers, unfortunately. I wish we would move beyond these shallow notions of (perceived) novelty.
2 General guidelines
This chapter covers guidelines that are of general interest to anyone working with the lab or for the lab. Guidelines are ordered roughly alphabetically. If you only read two sections of this chapter, they should be section 2.8, which outlines how members of the lab should comport themselves, and section 2.9, which provides suggestions for creating a welcoming environment.
2.1 Advice
My primary role as your mentor is to give advice. Make use of this opportunity and take my words at face value; when I say ‘You could look into this,’ I really mean nothing else than ‘Hey, this might be an interesting direction.’ When you absolutely need to do something, for instance because I find that something fundamental is missing from a preprint, I will tell you directly. I know that there is a tendency to read between the lines of what PIs are saying; just ask me directly in case something is unclear!
2.2 Availability
As a general rule of thumb, all lab members should be available1 during normal European business hours, say from 10:00 to 16:00. This makes it possible for other groups to interact with us and greatly simplifies any collaborations. I will not blindly enforce this rule, but I expect you to notify me if you are not available during a certain period of time. Moreover, as a token of respect to all other lab members, everyone should usually be present for the lab meetings.
1 This means either being present in your office or being responsive to messages or calls when you are working from home. It does not mean that you are glued to your screen at all times.
2.3 Authorship rules
When submitting a paper to a machine learning venue, the order of authors should reflect their contribution and relevance for the project. Typically, this will mean that junior members, such as Ph.D. and graduate students, will be placed before more senior members. To assess whether a person is qualified to be an author on a manuscript, we will follow a set of rules. Anyone listed on a paper as an author has to satisfy all of the following criteria:2
1. They have made substantial intellectual contributions to at least a subset of the work described in the manuscript.
2. They have participated in writing or revising the manuscript.
3. They are aware of the existence of the manuscript and of the intent to submit the manuscript to a certain venue.
4. They agree with the content of the manuscript and are willing to vouch for the correctness of the work.
In particular, there will not be any ‘courtesy authorships.’ Unless you raise this as an issue with me, I will typically assume that there are no conflicts with these guidelines. Let me know well before the submission if this is not the case! Following these rules is a way for all of us to demonstrate scientific integrity. The main idea behind these rules is that everyone listed as an author should be comfortable with giving a talk on the work. If this is not the case, something is wrong.
I realise that this policy cannot be enforced in larger collaborations unless all participants are aware of this. Certainly, I do not expect any students to handle authorship disputes—you should always feel comfortable to bring such matters to my attention. Ensuring fair treatment in scientific matters is one of my primary responsibilities.
On shared authorships. In machine learning and many other scientific domains, it is common practice to share authorships. This is a nod to the inherent complexity and teamwork involved in large-scale projects. I encourage you to think about sharing authorships with other Ph.D. students whenever this is appropriate. Please involve me in such discussions early on. If you intend to write a publication-based dissertation, pay attention to what constitutes a ‘core’ publication for your thesis.
2 This policy is an amalgamation of the rules instituted by ETH Zurich and the ACM Criteria for Authorship.
2.4 Coding style
Just like writing a paper, writing code is more of a craft than a science. However, it is an important skill to learn: good code can make it easier for people to use your methods, thus substantially increasing their impact and reach. Good code can also facilitate collaborations and serve to ensure that results are reproducible. The following quote should serve as your guiding principle when writing code:
Programs must be written for people to read, and only incidentally for machines to execute. (Harold Abelson)
It will thus pay off to make sure that the next person reading the code has a rough idea of what is going on. Note that the next person reading the code may be you most of the time, for instance when collecting results for a follow-up publication. Since we will be working mostly with Python, here is the ‘house style’ that I urge everyone to adopt. Most of these rules are meant to make it easy to develop software together. They also make it easy for me (and others) to read your code and interact with it.
1. Use a version control system to manage your code. We use git together with GitHub. Take the time to learn the basics (see below for additional resources); it will save you a lot of time in the long run!
2. Use a code formatter. It completely obviates any need for discussions about personal styles. If you integrate it into your editor or your GitHub Actions, you can write your code in your preferred style, and let your editor format it for you. Personally, I like black for code formatting because it is easy to use.
3. Follow the Python style guide. The style guide, also known as PEP 8, provides numerous hints as to how to structure your code. Some of this is redundant if you already use a code formatter, but the advice on how to build better software and structure your classes is worth a read.
4. Use ‘docstrings’ to document your code. If you are just drafting your code, you do not need good documentation. But as soon as the dust has settled somewhat, it is a good idea to think about writing some documentation in case you want others to build on your project. Notice that not every project needs to be perfectly documented, but you are substantially increasing the impact and overall utility of your work if you make it easy for others to perceive the salient points of your project. See the NumPy documentation guide for great examples of how to craft good documentation.
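As a small illustration of the points on style and docstrings, here is a minimal sketch of a documented function written in the NumPy docstring style. The function `pairwise_distances` is hypothetical and chosen only for illustration; it is not taken from any lab code base. The sketch assumes Python 3.9 or newer and uses only the standard library.

```python
"""Hypothetical example module illustrating the documentation style."""

import math
from typing import Sequence


def pairwise_distances(points: Sequence[Sequence[float]]) -> list[list[float]]:
    """Compute all pairwise Euclidean distances of a point cloud.

    Parameters
    ----------
    points : sequence of sequence of float
        Input point cloud. All points must have the same dimension.

    Returns
    -------
    list of list of float
        Symmetric matrix ``D``, with ``D[i][j]`` being the Euclidean
        distance between ``points[i]`` and ``points[j]``.

    Raises
    ------
    ValueError
        If the points do not all have the same dimension.

    Examples
    --------
    >>> pairwise_distances([(0.0, 0.0), (3.0, 4.0)])
    [[0.0, 5.0], [5.0, 0.0]]
    """
    # Check the input up front so that the error message is informative.
    if len({len(p) for p in points}) > 1:
        raise ValueError("All points must have the same dimension.")

    return [[math.dist(p, q) for q in points] for p in points]
```

The `Parameters`, `Returns`, and `Examples` sections are what most readers look for first. The `Examples` section can additionally be checked automatically with the standard `doctest` module, which keeps documentation and code from drifting apart.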
Additional resources
2.5 Collaboration
Science is fundamentally a collaborative endeavour. Given our research interests, we are in the special situation of working with mathematics as well as code. This makes our collaborations exciting—but may also be a source of conflict when writing code or papers together. Here are some suggestions to mitigate this (see also section 2.8 for our code of conduct):
1. Understand that you will make mistakes. In the context of science, it is important to be honest about such mistakes. I would rather prevent us from publishing something that we know is wrong than hope that no one notices it. Truth is paramount, and making mistakes provides a learning opportunity for all of us.
2. You are not your code or your paper. It is useful to ‘depersonalise’ criticism and not take it personally. I know that this is easier said than done, but we should all at least strive for that. The goal is to do outstanding science—and we should take criticism as a way to improve. There is value even in negative results or approaches that do not work. Do not be dismayed by this.
3. There’s always a bigger fish. The goal of collaborations is to expand your knowledge. There will be others who know more about a specific facet of the research. This is perfectly normal; there’s no need to be intimidated or scared—they will probably feel the same about you!
Additional resources
2.6 Confidentiality
We will adhere to the highest standards of confidentiality when it comes to data entrusted to us. This means that we ensure that confidential information—including, but not limited to, patient data—remains confidential. I will treat our one-on-one meetings as confidential and will not discuss them with anyone else without your approval.3 If you are in doubt whether you are allowed to share certain information, please ask me! In particular when it comes to information that is sensitive or falls under privacy laws, it is better to get explicit approval. If I consent to sharing something, it will also be my responsibility and you will be protected.
If you are working on your private hardware, ensure that the necessary steps are taken to protect potentially sensitive information. For instance, your smartphone should be protected using a PIN and, ideally, be encrypted. Make sure to lock your screen and protect your device when leaving it in a public space.
Confidentiality should never prevent you from talking about your ideas or sharing them with other people. This section only covers the protection of any form of sensitive data from a legal point of view, in particular when such data contains information that could be potentially linked to other persons. Moreover, I want members of the lab to feel safe when it comes to discussing potentially sensitive matters among each other; such information must never be divulged to third parties, unless consent is given by all parties involved.
3 For conflict resolution, a high-level dissemination of the contents of a meeting might be required. I will discuss this prior to divulging any information to anyone else, though. See also section 2.7.
2.7 Conflict resolution
We are all human, and as such, conflicts might arise naturally. Please follow the ‘least said, soonest mended’ policy and inform me immediately if you perceive a conflict that you cannot resolve or that you want my opinion on. If the conflict involves me, I am very open to discussing it with you and understanding how I can improve as a supervisor and a person. If you do not feel comfortable having such a discussion with me, please discuss it with other members of our institutions.
2.8 Code of conduct
We are working together with the mission to improve the world—be it ever so slightly! Excellent science can only arise in an excellent work environment. I therefore expect all of us to follow certain guidelines in our daily dealings with others:
1. Be kind. If someone performs an action that hurts you, assume no ill intent. It is easy to jump to conclusions or aim to ‘condense fact from the vapour of nuance.’ Resist the urge to follow an emotional response and, if you can, sleep on it.
2. Be professional. We are all humans and come in different shapes, forms, colours, and beliefs. Treat everyone with respect and always hold yourself to the highest standards. Follow the principles outlined below.
3. Be open. Good science is achieved by collaborating with others. All of us are here to learn. Do not be afraid to ask simple questions, and do not belittle those who do.
When representing the lab at a conference or at another venue, remember that we serve the public. We are graciously financed from public sources and should be accountable to the public. I personally find that the Nolan Principles embody the ideals that we should strive for as scientists. Here is my take on them, with slight modifications:
1. Selflessness. We should act in the interest of science itself. This does not mean that we are not allowed to have our own agenda, but said agenda should never be at odds with scientific integrity (see the second point).
2. Integrity. We must act with integrity in our daily work. This includes, for example, that we shall always portray the work of others faithfully and to the best of our knowledge; we never wilfully ignore mentioning related work in papers to make our work look better.
3. Objectivity. We must be as fair and impartial as possible and never let our inherent biases influence our decisions.
4. Accountability. We are accountable to the public for our scientific work and should always be willing to explain our decisions.
5. Transparency. We should act as if our decisions are fully transparent and open. The public has a right to understand what we are doing (see previous point).
6. Honesty. We should always act in a truthful manner. This also means sharing ‘bad news,’ such as experiments that do not work. There will always be setbacks and things that do not work, but we must be able to rely on the fundamental veracity of members of the lab.
In software projects, I often summarise this code of conduct as follows:
Developing is a fundamentally collaborative endeavour. Whenever human beings interact, there is a potential for conflict. At the same time, there is an even greater potential to build something that is ‘bigger than the sum of its parts.’ In this spirit, all contributors shall be aware of the following guidelines:
• Be tolerant of opposing views.
• Be mindful of the way you utter your critiques; ensure that what you write is objective and does not attack a person or contain any disparaging remarks.
• Be forgiving when interpreting the words and actions of others; always assume good intentions.
We want contributors to participate in a respectful, collaborative, and overall productive space here. Any harassment will not be tolerated.
By the by: I do not think that such a code of conduct is strictly necessary, but I believe that it is useful to write down some values that we should adopt.
2.9 Creating a welcoming environment
Many of the guidelines mentioned in this guide have the explicit goal of creating a welcoming environment in the lab. Our research provides us with unprecedented access to people from many different cultures and walks of life. Yet, segregation and prejudices based on our visible and invisible differences continue to haunt and mar human relations. We need to work hard to address these issues properly, lest our biases lead to certain groups being prevented from excelling. As part of that, we should all strive to make our lab environment welcoming for all. Among the myriad of things we can do to improve our daily interactions, I want to highlight two important aspects:
• Whenever possible, conduct your discussions in English, to make it easier for others to join in, and to prevent excluding anyone.
• Always use the names and pronouns that a person wants you to use to address them. Some of our fellow researchers are suffering from deadnaming by governmental organisations. Let us be a force for good here!
2.10 Independence
In a very real sense of the word, you are the captain of your journey through academia. I expect you to be the person in charge of your projects and work independently. You get to decide where to go and where to take things; my job is to provide guidance and support you in your endeavours (see section 2.16 for more thoughts on this). My support and involvement can always be adjusted to match your needs. There are projects where you might prefer close guidance, while in other projects, you might prefer to be left to your own devices. You are in charge!
2.11 Internships
I am a big proponent of internships and secondments. They are an excellent opportunity for lab members to learn new skills, collaborate with other labs, and change their scenery for a spell. I will support all internships, whether they are academic or with labs from industry, to the best of my capabilities. My preference is for lab members to not jump into internships right away, but an excellent time could be during the middle of your planned stay in the lab. That way, there will not be any time pressure such as a looming deadline for handing in your Ph.D. thesis. Moreover, this will also enable you to share some of the experiences with the group. I would very much like to use internships as a way to obtain new skills, perspectives, and ideas for both the intern and the lab.
One great way to plan such a secondment involves ELLIS. As soon as you meet the requirements for joining ELLIS as a Ph.D. student, you are eligible to join the research group of any other ELLIS member for some time. Make use of this opportunity and discuss it with me early on if you are interested.
2.12 Lab books
I strongly encourage everyone to maintain a lab book. This is a great way to structure information, store intermediate results, and also get an idea of what one has been working on at a certain point in time. Different types of lab books work for different people. I personally maintain a lab book as a text-only Markdown document because I like it to be available on every computer I use. Your mileage may vary—I also enjoy taking handwritten notes or using my iPad for this purpose. Find something that works for you and do not be afraid to try out multiple ways of keeping notes.
You may also use the lab book to create an agenda for meetings with me. This is certainly not a must—you decide whether to share the lab book with me or not. The upside of having a lab book is that you will always be able to understand where you spent the bulk of your time; this can be very beneficial for weeks that just do not feel that busy.
Additional resources
2.13 Meetings
Meetings, whether remote or in-person, are a double-edged sword: if done well, they are inspiring and very conducive to research. If done badly, they can suck the joy out of your day. We strive, obviously, to only have the first type of meeting. Notice that there are different kinds of meetings with different modalities and goals. For instance, if we are collaborating closely, we will have regular one-on-one meetings. These meetings are time that belongs to you, and you can fill it with anything that comes to your mind. I might have some direct questions to check in with you, but apart from that, you fully control the flow of these meetings; indeed, you can and should feel free to ask me anything. The primary purpose of these meetings, next to being able to see each other on a regular basis, is for me to help you get unstuck. We can talk about some of your research issues or obstacles you face and there is the distinct possibility that just by talking about that, you will be able to overcome your obstacles or formulate new plans of attack.
A good way to think about me (i.e. your supervisor) is to consider me an input–output machine.4 In our one-on-one meetings, you provide me with some input, which I use to generate some output, typically in the form of feedback or suggestions. However, I cannot create useful output from empty inputs. To be able to help you effectively, you need to provide me with something.
By contrast, there are some project meetings. These meetings are more about discussing the progress of projects, chiming in when it comes to certain aspects of the project, or discussing new ideas and directions. Project meetings typically comprise more people, and there is always the risk of zoning out if it is not your turn to say something. We should resist that urge and think alongside our collaborators, though, because that way, we can make more rapid progress. Project meetings run the risk of feeling less effective, but we will try to keep them short and to the point. The most important thing to take from such meetings is an understanding of who is responsible for what. With some of my collaborators, I tend to use the terminology of the ‘token,’ i.e. if I have the ‘writing token,’ then I am responsible for writing until I pass the token to someone else. Being upfront about the responsibilities is a great way to make rapid progress.
Regardless of the type of meeting, if at any point it is unclear to you what was discussed or what you should do, reach out to me. There is absolutely no shame in asking for clarifications, and I would rather re-discuss a potentially tricky aspect of a project multiple times than not make any progress at all. The same holds of course in case you disagree with a direction or are concerned about its feasibility. Please raise such concerns with me; despite what some established researchers would like you to believe, we are neither all-knowing nor all-powerful. We may miss issues just like everyone else.
4 This is from a great blog post by Austin Henley.
Additional resources
2.14 Mistakes
We are all human and make mistakes. ‘To err is human, to forgive divine.’ In that spirit, it is important to realise that mistakes happen and are a natural part of doing research. We should always strive to be open about mistakes that we make. For instance, if the calculations for a paper under revision are incorrect, it is critically relevant for team members to be aware of it as early as possible. At the very least, I should be made aware of it. I want everyone to feel confident in ‘confessing’ whenever a non-trivial mistake happens. Together, we can rectify the mistake and—most importantly—learn from it.
2.15 Research ideas and proposals
One of the most common dilemmas that I faced during my Ph.D. was elucidating my research ideas and proposals to myself and others. During my postdoctoral stay in Karsten Borgwardt’s group, I condensed my own experiences into a set of guiding principles, shown in the form of questions to shape a research project. Whenever you embark on a research endeavour, these questions may hopefully help shape and sharpen your ideas:
-
1. What is the main idea of your project? Try to be as succinct as possible and refrain from using too much jargon.
-
2. What are existing approaches, i.e. potential comparison partners, in this context? How do they measure up to your proposal?
-
3. How do you measure success in this project? Notice that ‘success’ can take many forms; a review paper, for instance, might be considered successful if it manages to describe and categorise a specific research domain.
-
4. Why is your project relevant? Try to see this from multiple angles and think about the potential audience of the project. A project might be relevant because it solves an existing problem; it might also be relevant because it builds a bridge between two disciplines, etc.
One actionable insight from these questions is that we can start a document in which you provide some (preliminary) answers to them. Such a document may then serve as the precursor for a longer research project.
Additional resources
-
• The Heilmeier Catechism is a famous list of questions George H. Heilmeier, a director of DARPA, used to pose. I personally do not like phrasing this as a catechism, but the overall spirit of assessing research proposals goes in a similar direction.
2.16 Responsibilities
Every person in the lab has certain responsibilities, which we mention here to provide an understanding of the expectations of different roles in the lab. The descriptions are kept relatively general on purpose; they should serve to provide a suitable overview. These responsibilities are paraphrased from the work guidelines of the DIB Lab, headed by Titus Brown.
PI responsibilities
-
• Provide scientific guidance and leadership with respect to ongoing and upcoming projects. Depending on the project, this can mean different levels of involvement. Please just ask if you want to change my level of involvement in your project.
-
• Be available for discussions, brainstorming sessions, and meetings whenever necessary. If you cannot get a hold of me in real life, here are some ways to contact me virtually: for questions with a time-critical component, I prefer Slack. For longer questions that can be answered asynchronously, e-mail is best. I am also happy to hop on any meeting tool with you or meet in real life on campus.
-
• Provide guidance for the careers of lab members and assist them in reaching their career goals by, for instance, writing letters of recommendation or pointing out opportunities for advancement. I am happy to discuss your career with you, and will always find the time to do so.
-
• Ensure funding and acquire new funding.
-
• Resolve scientific and personal disputes both inside and outside the lab.
Postdoc responsibilities
-
• Develop research projects or identify potential new research directions; either on their own or with the help of the PI.
-
• Pursue projects and collaborations that are aligned with their own career goals. Such projects do not necessarily have to involve any other members of the lab.
-
• Present their work at lab meetings and other gatherings of lab members.
-
• Support the PI by providing scientific leadership within the lab.
Ph.D. student responsibilities
-
• Develop research projects or identify potential new research directions with the help of postdocs and/or the PI.
-
• Work towards the goal of obtaining a Ph.D. with their research.
-
• Present their work at lab meetings and other gatherings of lab members.
2.17 Social media
I encourage you to use social media in order to interact with other researchers and promote your own research. Setting up a blog is a simple way to get people excited about projects. In particular for machine learning projects, a blog post, with its more informal tone and no restrictions on length, can often turn out to be extremely helpful for others and ‘put you on the map.’ Your own research portfolio can be easily hosted on GitHub, for instance. Approach me if you are interested in additional hosting options and want some feedback on creating your portfolio.
In addition to maintaining such a website, I can also recommend creating a Twitter profile and following researchers or institutions that are aligned with your research interests. While the signal-to-noise ratio might leave something to be desired, Twitter can be a valuable source of fellowships, job advertisements, and papers. I highly recommend reading Cheplygina et al. [2] for tips on how to get started.
2.18 What-the-heck list
In particular if you are just starting out in the lab, I would highly appreciate it if you brought things that made you go ‘What the heck?’ to my attention. In the spirit of continuous improvement, it is important to know where things are going wrong, and it often takes a new set of eyeballs to perceive these issues. Ideally, I would like to prevent fires rather than putting them out later on.
3 For (prospective) Ph.D. students
This chapter discusses your role as a (prospective) Ph.D. student in our lab. It outlines your privileges, your responsibilities, and aims to give general advice on certain issues, such as structuring your work, solving issues, and taking vacations.
If you are about to start your Ph.D. with us, make sure that you use the first few weeks to establish a sustainable work routine. The subsequent sections aim to provide more details about this, and you should make sure to find something that works for you. The other lab members and I are happy to give our own thoughts about this!
3.1 What is the goal of a Ph.D.?
The goal of a Ph.D. is to increase our knowledge of the world. This sounds lofty and exalted, but it really boils down to providing some additional insights into data or phenomena. There are numerous ways, none fundamentally ‘better’ than the other, to achieve this goal. In some cases, your insights will pertain to addressing complex issues, for instance contributing to our understanding of illnesses, such as Alzheimer’s disease. In other cases, your contributions may well be of a more theoretical nature—showing that a certain algorithmic approach is feasible, for example. Your individual trajectory might also oscillate between theory and practice. The important thing is that you will be able to point towards novel insights that you and your research helped unveil.
This is the primary goal of a Ph.D. A germane set of ancillary goals involves the things you might usually associate with pursuing a Ph.D., viz. (i) reading publications, (ii) publishing papers at conferences, in journals, and in workshops, (iii) discussing your work with other scientists, and (iv) writing a dissertation thesis. While important and conducive to reaching the primary goal, these are, in some sense, much more mundane tasks than contributing to the totality of human knowledge. I am stressing this point because every Ph.D. journey is different—it is research, after all—and not every Ph.D. student might end up with the same number of publications afterwards. This does not detract from the overall value of the endeavour! It is tough to resist the ‘counting’ mentality, but please keep in mind this overall goal and Goodhart’s Law, which I paraphrase as follows: ‘When a measure becomes a target, it ceases to be a good measure.’ Thus, do not aim for publication output but aim for knowledge first. It is better to graduate with a few excellent publications than lots of mediocre ones.
3.2 Time management
It is your responsibility to manage your time during the Ph.D., and there is no ‘one-size-fits-all approach.’ However, there are certain patterns that have proven effective, so I want to spell them out explicitly:
-
• Make sure to allocate time for reading. You are a scientist, so reading papers is a core activity that you should do in your work time.
-
• Take your time to document your thoughts and experiments. This will be very useful when writing your thesis. See also section 2.12 on keeping a lab book.
-
• Set aside some time for ‘free-form learning.’ If you encounter a concept, a skill, or a technology that interests you, dive into it! It is often through serendipity that new research ideas are created.
Depending on the way you are internally wired and motivated, it might be useful to have regular mini-deadlines or milestone check-ups. Raise these things in our one-on-one meetings to make sure that I am aware of your needs.
3.3 Work–life balance
There is more to life than research; even if we all enjoy it, everyone needs to recharge their batteries. I encourage everyone to keep a larger ‘identity’ beyond ‘I am doing research all the time’. Your output will suffer if you do not take care of your overall well-being. As a general reminder that supersedes everything else, consider the ‘Do as I say, not as I do’ rule; I am writing this section as much for myself as I am writing it for you!
3.3.1 ‘Shipping is also a feature’
When you are writing code for your project, working on a paper, or writing a grant proposal, always remember that ‘Shipping is also a feature’. If you never submit your work to the scrutiny of the peer review process, you will never get any critical feedback. More importantly, your work will never be seen and thus also never be used by others. You are depriving yourself and the world by this. Thus, resist the urge to write the perfect paper. Your goal should be to iterate towards perfection—but you should always take opportunities to disseminate your work to a wider audience. I am not saying that you should intentionally lower the quality of your work. Just be on the lookout for the point in time at which your return on investment is just not that big any more. It’s better to submit a paper that you consider to be 90% finished than to wait for years until you finish the remaining 10%.
The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time. (Tom Cargill, Bell Labs)
3.3.2 ‘It’s a marathon, not a sprint’
Good research takes time. Do not fall into the trap of trying to solve everything at once. Your Ph.D. should be treated as a marathon first and foremost; avoid the ‘workaholism’ culture that you will sometimes find (not only) in academia. If you are bragging about not sleeping sufficiently, or if you find yourself repeatedly discussing how many things you sacrificed for your work, you might be on the wrong track. Your capacity to do deep work is diminished if you are sleep-deprived; your capacity to enjoy this marvellous journey might be killed as well. Resist this at all costs!
In my experience, people who obsess over how little sleep they need to be (barely) functioning are often highly inefficient. Of course, they put in a lot of hours, but they do not accomplish a lot because their priorities are not well-aligned. If you are always at ‘red alert’, the term will lose its meaning. There might be times when you need to put in more hours to get something done, and it is a good idea to approach these times as a well-rested individual rather than an almost burnt-out candle.
3.3.3 ‘To thine own self be true’
Resist the urge to compare yourself to others. I know that this is easier said than done, given that you will meet a lot of smart, ingenious, hard-working people. However, be aware that everyone has a different background and a different journey. While you can be inspired by what others are doing, you should also aim to focus on your own unique strengths.
3.4 Getting ideas
When you are just starting out with your Ph.D. project, you may feel that you do not have an understanding of the community and how to create ‘good’ research ideas. While there is no easy recipe, I am fond of a process that involves continuously refining and iterating on ideas to sharpen them. To actually get ideas in the first place, think about gaps in the literature or potential extensions. What is not yet properly addressed? Where would additional work be required? Even if such a gap may seem small and insignificant to you at first, it may often grow into something magnificent over time.
Gold is where you find it. (American proverb)
Related to this, be sure to talk about your ideas; getting feedback from others helps you start the important refinement loop I mentioned above. When giving feedback, make sure you always follow the ‘Yes, and…’ rule of improvisational comedy. By this, I mean that you should strive to build on ideas that others present, rather than shooting them down for being impractical or infeasible. A good idea can often arise from reducing a seemingly bad one to its core constituents.
As an alternative to the aforementioned process, I am also very happy to share ‘my’ ideas with you—I keep a backlog of interesting ideas, and I would be more than happy to see some of them implemented in the real world. That is not to say that these ideas are always sound, but they are in all likelihood useful as a rough guideline. I am not tied to any of these ideas, though. Hence, if you find that something does not work and cannot be realised in the way I had envisioned it, it is a good idea to inform me, because knowing that certain directions are just not fruitful is very valuable. As you mature and move further in your Ph.D. research, I would expect you to come up with new ideas on your own and discuss them with me and other lab members. Note that it is not useful to be too critical of an idea in advance; aim to follow the ‘Yes, and…’ approach by building and improving, or refining, ideas until they really shine. Do not be confused or worried by the novelty of an idea. Novelty is a fuzzy concept that is often thrown around in reviews. Unless someone else did it the way you did it, your idea has novelty. Novelty can easily arise from very different directions:
-
• Novelty can come from combining existing things into a new whole. I would argue that, if you are historically inclined, this is how many interesting concepts are created.
-
• Novelty can come from providing a different perspective on things and linking them. I personally love papers that show that two concepts are actually instances of a third, shared concept. This approach—which has a distinct smell of juicy category theory to it—can help others make new discoveries or tie together seemingly disparate fields.
-
• Novelty can come from studying something that no one has ever studied before, or describing an algorithm that has no precedent. This is probably the definition that most people have in mind but I find it hard to come up with examples in mathematics or computer science where one cannot find a precursor of an idea.
Novelty should definitely not be equated with difficulty or utility. These are separate categories altogether, and I wish we would not discuss novelty at all in reviews. If you find yourself referring to the novelty of a method or the lack thereof, a good exercise is to replace ‘novelty’ by ‘beauty.’ If you find that this replacement makes sense, you should try to find more substantial arguments for discussing a paper.
3.5 Publishing
You will probably hear the stupid saying ‘Publish or perish’ quite often. Don’t be alarmed by this. In my view, publishing is one of the joys of research—you get to ‘wrap’ your thoughts and experiments into a nice paper and set it free for others to use. If you are new to this—and most of you reading this probably are—do not be daunted. I and the other group members will accompany you during this journey.
3.5.1 Writing philosophy
Here are some guidelines that are very useful when writing a paper. First and foremost, you should understand that a paper is more like one contribution to a larger scientific discussion. A research agenda can easily give rise to multiple papers—as long as the core idea of each of your papers is strong, it is a good idea to publish it. This will enable other researchers to benefit from it and build their own work on ours. Here are some general thoughts on writing a paper:
-
• Do not put off writing until the last minute. Ideally, start your draft early on and fill it with notes, topic sentences, or a general scaffolding for your ideas. Writing is a highly iterative process and you should be prepared to treat it as such. Many people would rather only produce ‘polished’ text directly instead of filling a document with word salad, but this is largely a matter of training. Nothing is more daunting to fill than an empty page, so why not fill it with something first and then improve it?
-
• Your paper will be largely assessed by your readers based on the merits of its text (and its figures). Hence, treat the writing as the single most important task when it comes to submitting work. All supplementary materials like the code can typically be polished after the fact but the paper is literally the first exposure to your ideas that a reader encounters. Make it worth their while.
-
• When starting a new paper, begin with the core idea of the paper. What do you want to achieve? What should the reader get from reading your paper? See section 2.15 for more thoughts on how to write down such ideas.
-
• Every section in the paper should have a specific purpose. For instance, if you make a claim about the properties of your method, you need a section—experimental or theoretical—to back up the claim.
-
• Every theoretical statement should support your method. Avoid ‘cosmetic mathematics,’ i.e. mathematical formalism for the sake of making the paper appear more ‘sound,’ at all costs. This is but one of the troubling trends plaguing research; see Lipton and Steinhardt [3] for an in-depth discussion on the topic.
-
• When discussing empirical/experimental results, make sure to use the strongest possible baseline. Before trying to squeeze out more performance for your own method, first take a look at what other methods can achieve. This is one way to ensure that we are actually making progress; all too often, papers train their comparison partners only once, while subjecting their own method to extensive hyperparameter tuning. This can lead to incorrect conclusions and hamper progress. See Tönshoff et al. [5] for a recent example of how to overturn incorrect conclusions arising from insufficient hyperparameter tuning.
3.5.2 Writing style
If you do not have a preference for other styles,1 follow the style guide of the Federal Chancellery. Essentially, this means that your papers will be written in British English with ‘-yse’/‘-ise’ suffixes (e.g. analyse and optimise instead of analyze and optimize). This serves as a way to make your publications consistent and is more efficient than having to come up with new rules every time. You can choose another writing style. The important thing is to be consistent about it within one publication.
1 In existing projects, in particular those involving US-based collaborators, other writing styles can be followed.
3.5.3 Writing tips
Writing is an art in and of itself, and it is the single most important skill you have to grow during your Ph.D.; good writing will serve you well throughout your career. When writing a paper, you are competing for the attention of the reader. Good writing helps you keep that attention, while bad writing drives readers away, and once you have lost a reader’s attention, you have lost the reader for good. Hence, when writing a paper, your overall goal should be to write concisely. This entails getting rid of empty phrases and using short verbs whenever possible. Personal pet peeves of mine include writing ‘utilise’ when ‘use’ would do, or using ‘methodology’ instead of plain ‘methods;’ see also section 3.5.4 for more of these. Remember that papers are supposed to be read by many people; concise writing improves accessibility. Here are some additional tips that I find to be of general use:
-
• Prefer verb forms over nouns. English is great in that we can use either the verb form or the noun form of a word. However, in scientific writing, the verb form should be preferred because it is clearer. Compare ‘This enables the calculation of topological features by the neural network’ versus ‘The neural network can calculate topological features.’ I am arguing that in the second form, it is clearer who is acting (the neural network).
-
• Check whether absolutes are required. It is rare2 that we use absolute words (such as ‘never’ or ‘always’) in scientific writing.
-
• Use the active voice. The active voice increases the precision of your writing and reduces its complexity at the same time.
Finally, be aware that ‘the perfect is the enemy of the good.’ It is natural to start with a (bad) first draft and refine it iteratively. Nature abhors a vacuum, so it is best to start with something.
Additional resources
-
• Dreyer’s English: An Utterly Correct Guide to Clarity and Style
-
• Levelling Up Your Scientific Writing: Some additional slides emphasising the themes mentioned above.
-
• The Elements of Style: Read this extremely short (par for the course!) and insightful book. Ignore the grammar advice, but focus on the writing tips. The book has received its share of criticism, but it is still an excellent way to get you started. Plus, we are not in the business of writing the next great English novel, so any writing tips that help us get our point across in a better fashion should be celebrated.
3.5.4 Writing pet peeves (common mistakes in writing)
Next to the content on writing and developing your own style in the previous section, this section mentions several of my pet peeves, which should best be avoided. While I understand that these are somewhat personal preferences, I will give you a brief justification for each of them. However, reserve such updates for the final pass through documents. Once we have produced great content in the paper, we can polish the writing and the general ‘look-and-feel,’ but this should not be prioritised over strong experiments and strong text.
-
1. Inconsistent citations or bibliographic entries: Unfortunately, most bibliographic entries that you can download from sources like Google Scholar are neither correct nor consistent. You will find different spellings of author names, missing umlauts or other characters, and much more. Avoid this by meticulously correcting bibliographic entries. A good bibliography is the hallmark of good scholarship. Common mistakes include:
-
• Using a wrong spelling of author names, in particular names with special characters in them. BibTeX is actually great at handling those, but you need to give it some hints for some names. For instance, the Dutch ‘van’ or the German ‘von’ can be used as the prefix of a family name. To make sure that BibTeX understands this, flip the author name around. Instead of writing ‘Julius von Rohrscheidt,’ which treats the ‘von’ incorrectly as part of the first name, write ‘von Rohrscheidt, Julius.’ That way, BibTeX understands which part belongs to the first name.
-
• Using the wrong venue or a preprint. As you know, I am a big fan of preprints, but once a paper is published, make sure to cite the published version.
-
• Not using correct capitalisation of proper nouns (write ‘Bayes’ and ‘Euler’ instead of ‘bayes’ and ‘euler’) and abbreviations (write ‘ICML’ and ‘TDA’ instead of ‘icml’ and ‘tda’). This is actually not only my personal rule but considered proper English.
-
• Duplicating information, such as having a DOI and a URL pointing to the DOI itself.
Here is an example of one of my own papers, showing you the (wrong!) BibTeX you get from the publisher:
@article{VANDAELE202185,
  title    = {Stable topological signatures for metric trees through graph approximations},
  author   = {Robin Vandaele and Bastian Rieck and Yvan Saeys and Tijl {De Bie}},
  year     = 2021,
  journal  = {Pattern Recognition Letters},
  volume   = 147,
  pages    = {85--92},
  doi      = {https://doi.org/10.1016/j.patrec.2021.03.035},
  issn     = {0167-8655},
  url      = {https://www.sciencedirect.com/science/article/pii/S0167865521001306},
  keywords = {Topological data analysis, Algebraic topology, Persistent homology, Proximity graphs, Metric trees, Cell trajectory inference},
}
Here, the doi field is formatted incorrectly, while the issn field3 and the url field are redundant. A better version of this entry would look like this:
@article{Vandaele21a,
  title   = {Stable topological signatures for metric trees through graph approximations},
  author  = {Robin Vandaele and Bastian Rieck and Yvan Saeys and Tijl {De Bie}},
  year    = 2021,
  journal = {Pattern Recognition Letters},
  volume  = 147,
  pages   = {85--92},
  doi     = {10.1016/j.patrec.2021.03.035},
}
Notice that I removed the keywords field because, to my knowledge, there is not a single bibliography style that makes gainful use of this information. If the publisher cannot get it right, let us see what Google Scholar does to another paper:
@article{morris2023weisfeiler,
  title     = {Weisfeiler and leman go machine learning: The story so far},
  author    = {Morris, Christopher and Lipman, Yaron and Maron, Haggai and Rieck, Bastian and Kriege, Nils M and Grohe, Martin and Fey, Matthias and Borgwardt, Karsten},
  year      = 2023,
  journal   = {The Journal of Machine Learning Research},
  publisher = {JMLRORG},
  volume    = 24,
  number    = 1,
  pages     = {15865--15923}
}
Next to the incorrect capitalisation (it should be ‘Leman’), many things are wrong with this entry: The journal, publisher, number (issue), and pages are all incorrect. Here is a better entry:
@article{Morris23a,
  title   = {Weisfeiler and Leman go Machine Learning: The Story so Far},
  author  = {Morris, Christopher and Lipman, Yaron and Maron, Haggai and Rieck, Bastian and Kriege, Nils M. and Grohe, Martin and Fey, Matthias and Borgwardt, Karsten},
  year    = 2023,
  journal = {Journal of Machine Learning Research},
  volume  = 24,
  number  = 333,
  pages   = {1--59},
}
Moreover, depending on the bibliography style, make sure that ‘Weisfeiler’ and ‘Leman’ remain capitalised. You can escape such names (or abbreviations) using curly braces. For instance, you could protect yourself against errors by writing {W}eisfeiler and {L}eman in the title field. For other words in the title, it is best to let your selected bibliography style do the work. Please capitalise the names of venues, though: It should be ‘Advances in Neural Information Processing Systems,’ not ‘advances in neural information processing systems.’ When you use abbreviations for the venues, make sure to use them consistently. Personally, I believe they are best avoided because there are no conferences where the length of the bibliography counts towards the page limit.
-
-
2. Excel-like tables: Probably due to our over-exposure to Excel, we tend to format our tables in a way that is not conducive to improved information processing. Make sure to use good packages for formatting high-quality tables. If you are using LaTeX, the booktabs package is great and gives a lot of tips on how to create high-quality tables. Together with a package like siunitx, you can format and align your numbers consistently, making it a pleasure to read and interact with your data.
-
3. Incorrect use of citations. Make use of the right citation commands of the respective packages. For instance, natbib offers \citep for parenthetical citations, i.e. citations that follow a sentence or word, and \citet for in-text citations, i.e. citations that can be treated as a word. For biblatex, you can use \parencite and \textcite. The reason for this rule is that different citation styles can cause ‘bare’ citations to be rendered incorrectly, thus causing additional work when rewriting or resubmitting a paper. By communicating your intent for a citation, you pass the control back to LaTeX and the bibliography package, making your paper read well independent of the citation style.
-
4. Incorrect use of internal references. If you use the cleveref package, you can just use \cref and it will automatically provide you with the right label, i.e. a label indicating a section, a figure, or a table. Make sure to not mix ‘bare’ \ref commands with this for consistency reasons. A similar functionality is provided by the \autoref command of the hyperref package, by the way.
-
5. Illegible labels in figures. Prefer vector graphics with consistent fonts over raster graphics. When raster graphics cannot be avoided, make sure that their resolution is sufficiently large. Also make sure that all items in figures are readable. In my experience, an additional round of polishing figures can pay off and really increase the readability and accessibility of a paper. Often, casual readers (and reviewers) just take a look at the figures, so make sure that they convey the right message.
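As a minimal sketch of the table advice in item 2, the following LaTeX document combines the rules from booktabs with the number alignment from siunitx. The method names, data set names, and all numbers are placeholders chosen for illustration; they are not results from any paper.

```latex
\documentclass{article}

\usepackage{booktabs} % provides \toprule, \midrule, and \bottomrule
\usepackage{siunitx}  % provides the S column type for aligned numbers

\begin{document}

\begin{table}
  \centering
  \caption{Accuracy (in \%) of two placeholder methods on two placeholder data sets.}
  \label{tab:placeholder-results}
  % Each S column aligns its numbers at the decimal marker; `table-format=2.1`
  % reserves space for two integer digits and one decimal digit.
  \begin{tabular}{l S[table-format=2.1] S[table-format=2.1]}
    \toprule
    % Text in an S column has to be wrapped in braces.
    Method   & {Data set A} & {Data set B} \\
    \midrule
    Method 1 & 71.3 &  8.9 \\
    Method 2 & 74.0 & 12.5 \\
    \bottomrule
  \end{tabular}
\end{table}

\end{document}
```

Note that the table uses no vertical rules and only three horizontal ones, which is the layout the booktabs documentation argues for.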
3 When was the last time you went to a library and asked for a bibliographical item based on its ISSN?
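The bibliography, citation, and cross-referencing advice in items 1, 3, and 4 can be sketched in one small LaTeX document. This is only an illustration under a few assumptions: the file name example.bib, the section labels, the example sentences, and the choice of the plainnat style are mine, while the two bibliography entries are the corrected ones from above, rewritten in the ‘Last, First’ form.

```latex
% For a family name with a prefix, use the 'Last, First' form in the author
% field, for instance: author = {von Rohrscheidt, Julius}
\begin{filecontents}{example.bib}
@article{Vandaele21a,
  title   = {Stable topological signatures for metric trees through graph approximations},
  author  = {Vandaele, Robin and Rieck, Bastian and Saeys, Yvan and {De Bie}, Tijl},
  year    = 2021,
  journal = {Pattern Recognition Letters},
  volume  = 147,
  pages   = {85--92},
  doi     = {10.1016/j.patrec.2021.03.035},
}
@article{Morris23a,
  title   = {{W}eisfeiler and {L}eman go Machine Learning: The Story so Far},
  author  = {Morris, Christopher and Lipman, Yaron and Maron, Haggai and Rieck, Bastian and Kriege, Nils M. and Grohe, Martin and Fey, Matthias and Borgwardt, Karsten},
  year    = 2023,
  journal = {Journal of Machine Learning Research},
  volume  = 24,
  number  = 333,
  pages   = {1--59},
}
\end{filecontents}

\documentclass{article}

\usepackage{natbib}   % provides \citet and \citep
\usepackage{hyperref}
\usepackage{cleveref} % provides \cref; load it after hyperref

\begin{document}

\section{Background}
\label{sec:background}

% In-text citation: the citation acts as a word in the sentence.
\citet{Morris23a} review the role of the Weisfeiler--Leman algorithm in
machine learning.
% Parenthetical citation: the citation follows the statement it supports.
Graph approximations have been used to obtain topological signatures for
metric trees~\citep{Vandaele21a}.

\section{Methods}
\label{sec:methods}

% \cref adds the right label type ('section') on its own.
We build on the material in \cref{sec:background}.

\bibliographystyle{plainnat}
\bibliography{example}

\end{document}
```

Compiling this requires the usual sequence of pdflatex, bibtex, and two more pdflatex runs. Because the text only states the intent of each citation, switching to a numeric style later should not require touching any sentence.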
3.5.5 Submission checklist
Having finished your preprint or published paper, here is a small ‘pre-flight checklist’ of common issues to look out for:
-
□ For a submission: Is the paper properly anonymised?
-
□ For a camera-ready version: Is the author list correct (including all affiliations)?
-
□ Are you using the right template for the specific venue?
-
□ Are there any ‘obvious’ typos or inconsistencies in spelling?
-
□ Are abbreviations used consistently?
-
□ Are all figures and tables referenced?
-
□ Are there any duplicate labels for figures and tables?
-
□ Are all figures sized properly and have a sufficiently high resolution?
-
□ Are there any missing bibliographic references (typically indicated by ‘??’ in the text)?
-
□ When using biblatex: Are you using \parencite and \textcite correctly?
-
□ When using natbib: Are you using \citep and \citet correctly?
-
□ When using cleveref: Are you using \cref instead of \ref to reference things?
3.6 Conclusion
I wish you an amazing time with the lab and hope you find joy in your research! If all of these suggestions are too much, feel free to ingest them at another point in time. Again, this document should only serve as a gentle scaffold—as opposed to a prison—for all of us.
4 For (prospective) undergraduate students
This chapter discusses the role of a (prospective) undergraduate student in our lab. You might be considering working on some project with us, for example as part of your bachelor’s or master’s thesis, or as a stand-alone stay. While many of the general recommendations in chapter 3 apply, your situation is special and deserves some more in-depth remarks.
4.1 What is the goal of your stay with us?
Your primary goal is to become an expert in a topic and learn how to communicate efficiently and effectively about this topic by means of a project report or a written thesis. Ideally, your work will result in a publication in an appropriate venue.1 Since the duration of your project is typically shorter, we will ensure that the individual learning goals are feasible.
1 Typically, such publications will be pursued after your main work has been finished. A publication will not count towards your grade and, depending on the project, publications might be easier or harder to obtain.
4.2 Potential topics
When it comes to selecting a topic for your stay with us, there is a large amount of flexibility. Bearing in mind overall feasibility, I am particularly fond of projects that either involve improving an existing method, or that add a new facet—such as a new application domain—to an existing method. Of course, ‘the sky is the limit;’ if you have a research idea that you want to pursue with us, I am happy to discuss it with you!
For some examples of potential topics, check the publication lists of members of the lab. You can also approach us with your project ideas, even if you think that they might not be a great fit for us. The only caveat is that we want to ensure that we are capable of providing outstanding supervision to you. If your topic is completely outside our area of expertise, we might not be able to fulfil our end of the supervision agreement, potentially making it harder for you to finish the project in case of obstacles.
4.3 Writing a thesis or a project report
The writing tips outlined in section 3.5 apply to a written thesis or a project report as well. Here, unlike for a scientific publication, your goal should be a comprehensive treatment of the topic at hand. The idea is for you to demonstrate that you have become proficient in a topic and are capable of communicating said proficiency to others. Hence, your thesis or project report is a perfect opportunity to practise your writing skills. Regardless of your career path, such skills will always be useful.
A common mistake in thesis writing is choosing the wrong level of detail, i.e. putting too much emphasis on too many details. Paradoxically, this can be detrimental to progress and become an additional source of stress, so it is best avoided. While it is important for your thesis to be self-contained and readable by, say, a fellow student of yours, there is no need to recapitulate common definitions. I realise that this is a vague point; what is common knowledge for one person might be news to another. A good rule of thumb is to think about the basic required undergraduate courses. If the material is covered in them, it is probably fine to exclude it.
Additional resources
Bibliography
-
[1] Frederick P. Brooks. ‘The Computer Scientist as Toolsmith II’. In: Communications of the ACM 39.3 (1996), pp. 61–68. doi: 10.1145/227234.227243.
-
[2] Veronika Cheplygina et al. ‘Ten simple rules for getting started on Twitter as a scientist’. In: PLOS Computational Biology 16.2 (2020), pp. 1–9. doi: 10.1371/journal.pcbi.1007513.
-
[3] Zachary C. Lipton and Jacob Steinhardt. ‘Troubling Trends in Machine Learning Scholarship. Some ML papers suffer from flaws that could mislead the public and stymie future research’. In: Queue 17.1 (2019), pp. 45–77. doi: 10.1145/3317287.3328534.
-
[4] Florian Markowetz. ‘You Are Not Working for Me; I Am Working with You’. In: PLOS Computational Biology 11.9 (2015), pp. 1–8. doi: 10.1371/journal.pcbi.1004387.
-
[5] Jan Tönshoff et al. ‘Where Did the Gap Go? Reassessing the Long-Range Graph Benchmark’. In: The Second Learning on Graphs Conference. 2023. url: https://openreview.net/forum?id=rIUjwxc5lj.
Inspiration
This guide is inspired by several similar documents. The primary source is the Syllabus for Eric’s PhD Students written by Eric Gilbert. Moreover, Ryan Cotterell provides helpful hints on his supervision style, some of which have been incorporated in this guide as well.