Tobias Grosser

Rebuttals

I provide below some advice on how to write a strong paper rebuttal. Please contact me for comments and feedback.

A rebuttal is THE means to influence the PC discussion

Reviewers have very little time to read your rebuttal. In the worst case, they will skim over it, read only parts, and not remember the comments of the other reviewers (or even their own) in detail. When your paper is discussed in the PC, most other PC members will see the rebuttal for the first time and skim it for less than 10 seconds.

In 1-5 minutes at the PC meeting when your paper is discussed, reviewers try to find 3-5 reasons why your paper is either an accept or a reject. They look for evidence that:

  • The claimed contribution is novel
  • The work is not incremental
  • The paper is finished (does not lack important ideas)
  • The paper is technically sound
  • The claimed contributions are supported by your evaluation

The primary objective of a rebuttal is to influence this 1-5 minute PC discussion in your favor by providing arguments that support the above points. In addition, a rebuttal should correct potential misunderstandings and errors in the paper.

The structure of a rebuttal

A good rebuttal:

  • Highlights your principal arguments
  • Makes it easy to extract information relevant for a specific reviewer
  • Addresses reviewer questions while also providing the typical evidence a PC is looking for (novelty, non-incremental, completeness, soundness, experimental support).

The following structure has proven successful:

  • Thank all reviewers for their constructive feedback and promise to incorporate their suggested corrections.
  • Provide 3-5 principal arguments (sorted by importance) for all reviewers
    • State each argument (not the problem) directly in the title
    • Follow it with the list of reviewers whose questions this argument answers
    • Support your argument. Don't answer specific questions; instead, provide evidence for your argument and make sure this evidence addresses the individual questions raised in the reviews.
  • For each reviewer, address the remaining questions and technicalities relevant only to that individual reviewer.

If space is an issue, shorten from the back. Never shorten your principal arguments; instead, use short bullet-point lists for individual questions. If necessary, drop non-critical questions rather than shortening your principal arguments.

How should I go about writing my rebuttal?

Start by answering the questions of each reviewer individually, the same way you would answer an email inline:

Review A:

I am concerned ... I also wonder ... Finally, ...

Answers for A:

> I am concerned

This concern is invalid.


> I also wonder

This is correct.

> Finally, 

This problem does not apply in our case, because ...

In a second step, you 1) identify questions and feedback common to multiple reviewers and 2) sort the feedback by priority (a: important for acceptance, b: technical question worth addressing, c: minor formatting/typo remark). Questions raised by multiple reviewers that are important for acceptance are factored out into principal arguments, technical questions worth addressing remain in the individual sections, and minor comments are dropped, as they are already covered by the opening sentence that promises corrections.

What to avoid?

  • Do not plainly agree with major criticism: If reviewers point out problems with your paper, do not simply concede. Instead, show a perspective under which your argumentation is strong, and then potentially agree to make amendments that address the concerns raised by your reviewers.

  • Do not provide new data: It is often explicitly stated that a rebuttal is to clarify questions, not to provide new insights and arguments for accepting your paper. If you provide new data in a rebuttal, this data has not been seen by the PC before and cannot be evaluated in the short time frame of a rebuttal. As a result, the PC is likely to conclude that the paper without this data is incomplete and will reject the paper to enable a resubmission of the completed paper at a later conference. The exception to this rule is providing data for clarification. E.g., you ran one specific configuration of a dataset and reviewers are curious whether another configuration would perform better. In this case you can provide this data, but make an argument that it a) does not change the story of the paper, b) clarifies a question in the review, and c) is not work you simply lacked the time to complete earlier.

  • Do not plainly disagree: If reviewers raise concerns regarding an argument in your paper, do not just restate that you believe you are right (e.g., 'we claim that only >>objects<< need to be considered in our analysis'). Instead, try to explain under which perspective the reviewers' concerns would be true and show why they do not apply in your case.

FAQ

  • What happens if I exceed the typical 500-word limit? If the submission system enforces a hard limit of, say, 500 words, you obviously need to shorten the rebuttal to fit. If the system allows more than 500 words, still make sure the core rebuttal fits within 500 words. If additional clarifications would help, it is good practice to deliver this information in an optional appendix.

  • How well must the rebuttal be written?

    The principal arguments must be worded exceptionally well. I typically spend at least half a day on these alone; when collaborating with students, this takes over a day. The remaining answers should also be well written, but 1-2 passes are normally sufficient.

  • How do I handle mistakes or wrong statements in my paper? Mistakes happen and can never be completely avoided. To handle them professionally, it is important not to hide them, but to acknowledge their existence and explain how they will be corrected. While acknowledging minor problems will not put your paper at risk, there is sometimes a conflict of interest between defending your work and acknowledging mistakes. Academic integrity must always take precedence! However, if reviewers point out non-trivial problems, it is often wise to put your answer in context and be clear about how the story of the paper remains unaffected.

  • A reviewer finds a strong angle of attack and she is right. What do I do?

    Sometimes attack is the best defense. In some cases reviewers point out major weaknesses in parts of your work. Instead of trying to prove them wrong, acknowledge these weaknesses and argue why you deliberately accepted them, for example, to enable other use cases or to make your work strong according to a different metric.

    A good example is the following response from our cache model paper (the full rebuttal appears below). Reviewers raised concerns that our work failed to model many important hardware details. Our response was:

Deliberately avoiding over-detailed modeling yields a responsive cache model (A)

Our evaluation demonstrates that HayStack has good performance and accuracy in
practice, despite purposefully not modeling all hardware details. We understand
the concerns of ReviewerA regarding modeling additional details of the cache
hierarchy. As PolyCache demonstrates, detailed models come at significantly
higher cost and yield only minor gains in accuracy.
  • How should I prepare for a meeting with my supervisor?

    Make sure the following questions can be answered:

    • Are there any open technical questions you could not answer?

    • What are the most important questions (should be at the top)?

Example

The following rebuttal was submitted to PLDI'19 for our paper A Fast Analytical Model of Fully Associative Caches. The original reviewers were named A/B/C/D and are addressed accordingly. The scores for this paper were 'CBCAA': two strong accepts, one accept, and two weak rejects. The paper was accepted for publication after the rebuttal.

We thank all reviewers for their constructive feedback and will incorporate
their corrections.


**Orders of magnitude faster cache modeling enables interactive developer feedback (A/C)**

Our objective is to close the cache modeling problem for computational kernels
by providing almost immediate results compared to the compute-heavy previous
approaches. By focusing on this one problem, we provide the core building block
for manual and next-generation automatic program optimization. Already today we
enable interactive developer feedback to support manual optimization and our
collaborators even started work on a plugin that presents detailed cache miss
information in the source code editor.

We strongly believe easily accessible cache miss information is key for memory
hierarchy aware programming. Only developers who have immediate feedback on the
cost of individual memory accesses can select program variants that minimize
data movement.

**Deliberately avoiding over-detailed modeling yields a responsive cache model (A)**

Our evaluation demonstrates that HayStack has good performance and accuracy in
practice, despite purposefully not modeling all hardware details. We understand
the concerns of ReviewerA regarding modeling additional details of the cache
hierarchy. As PolyCache demonstrates, detailed models come at significantly
higher cost and yield only minor gains in accuracy.

**Cache performance is inherently problem-size dependent (A/D)**

The cache behavior of a program and the selection of good code transformations
are inherently problem size dependent. Computing the absolute number of cache
misses without knowing the relevant problem size is impossible. HayStack has no
problem with parsing parametric loop bounds and in this case asks the developer
to provide typical problem sizes.

**We count cache misses analytically without explicit enumeration of all memory
accesses (C/D)**

The Barvinok algorithm is a very sophisticated generalization of Gauss’ formula,
which sums the numbers in the interval [1, 2, 3, …, n-2, n-1, n] symbolically as
n(n+1)/2. By counting the stack distances and the cache misses symbolically,
we can in most cases derive the cache misses immediately without explicit
enumeration.

**Counting with the Barvinok algorithm is fast in practice (C/D)**

Similar to the Simplex algorithm, the Barvinok algorithm has an exponential
worst case complexity but is fast in practice. The complexity of the algorithm
does not depend on the volume but on the structure of the counted sets. Our
implementation keeps this structure simple and as visible in our evaluation
ensures efficient counting in practice.

**ReviewerA**

As discussed by Hoefler et al. [20], the distribution of performance results is
usually skewed to the right. The median better represents the center of such
skewed distributions (besides being more outlier robust).  The tiled heat-3D
kernel triggered a performance bug in the Barvinok library. We addressed the
problem and now compute the cache misses in ~500s.  Table 1 selects 6
representative kernels with subdomain enumeration (blue bars in Fig. 14). We
will provide the data for all blue kernels in the final version of the paper.
PolyCache uses an entirely different and costly algorithm and requires the 1024
cores to achieve the published execution times. This is a problem for an
interactive tool that is supposed to run on a laptop.  We run Dinero as a QEMU
plugin and compute the trace on-the-fly. We did not measure the trace
generation cost separately but the plugin mode is much more efficient than
generating the trace in advance.

**ReviewerB**

The tool already reports the cache misses for every statement (even each memory
access) and also has the set of conflicting statements that contribute to the
cache miss readily available (Fig. 5, F intersect B). We will detail this point
in the final version of the paper (thanks for this idea).  We correct the
composition operator.

**ReviewerC**

HayStack uses the pet compiler infrastructure to parse C code and returns the
cache misses for every memory access of the program.  Manually guessing the
memory footprint of individual loop nests is an ad-hoc approach developers need
to revert to today. Yet this approach is imprecise and does not scale to
complex kernels such as cholesky.  The error is |#measured misses-#computed
misses|/#total accesses.

**ReviewerD**

The Barvinok algorithm makes the cost of HayStack problem size independent
(asymptotically faster than simulation) except when enumerating all dimensions
(not observed).  All experiments model only the L1 and L2 caches except for
Fig. 13.