# Problem setter's handbook

Judges

The concept of the judge is fundamental to what makes Sphere Engine Problems so powerful. In previous chapters, we discussed both test case judges and master judges to show that they are responsible for the user submission correctness verification. This chapter covers the topic of built-in general purpose judges along with representative examples.

### Test case judges

We will recall that the test case judge is the program responsible for evaluating a solution to a programming problem in the context of a single test case. Typically, the judge checks the compliance of the output data with the data generated by the user's program.

Test case judge returns a piece of information about the results of the solution execution:

- verdict (see verdicts)
- score (optional)
- execution time
- memory consumption

As we have already discussed, in most cases it is enough to utilize one of the built-in test case judges that we discuss in this section.

#### Ignores extra white spaces

The output data of the user's program must match the reference output data with the accuracy of the extra white spaces (spaces, tabulations, newline characters, and carriage return characters). The judge's operation is equivalent to the exact comparison of standardized data, where the standardization process involves the removal of the initial and final white spaces and the transformation of the remaining continuous chains of white spaces into a single space. This test case judge doesn't set a score (the score is always equal to 0).

The

`prime number`

problem (i.e. for a number`n`

determine whether`n`

is a prime number and then return`1`

as a result. Otherwise the result is`0`

).

The problem of prime factorization of the number (i.e., for a number

`n`

find the prime factors`p`

that_{1}, p_{2}, ..., p_{k}`n = p`

). It is known that the prime factorization of the number is unique up to the order of prime factors. Therefore, if we require an output specification to write a sorted list of factors, there is only one correct answer to the problem._{1}• p_{2}• ... • p_{k}

#### Ignoring floating point errors up to 0.01

The judge treats white spaces (spaces, tabulations, newline characters, and
carriage return characters) identically to the **Ignores extra white spaces**
test case judge. In addition, all numbers occurring in the output data of the
user's program must match the corresponding figures in the reference output data
to two decimal places (i.e., `0.01`

). This means that if `x`

and `y`

are the
numbers being compared, we find them matching if `|x - y| < 0.01`

. This test
case judge doesn't set a score (the score is always equal to 0).

The problem of the triangle area, i.e. for given integer side lengths

`a, b, c`

calculate the area of the triangle. In general, the result is not an integer, but a high level of precision is not required (i.e., the precision up to 0.01 is enough).

#### Ignoring floating point errors up to 0.000001

The judge works analogously to the previous one. However, it compares numbers with greater precision, so it is useful
for problems that require greater accuracy. The required precision is up to `0.000001`

.

The problem of the value of the

`sine`

function. The range of the`sine`

function is`[-1,1]`

, therefore precision is important.

#### Score is source length

Solution correctness is verified in the same way as in the case of the **Ignores
extra white spaces** test case judge. The only difference is the score, which in
this case is equal to the length of the user's program source code.

This test case judge works exceptionally well for problems with an optimization side-goal, which is the implementation of the correct solution with as few characters as possible.

The problem of determining first

`n`

numbers in the decimal expansion of the number`π`

for given`n`

. The challenge is to solve this problem with the shortest possible source code.

#### Academy

It works exactly the same as the **Ignoring extra white spaces** judge, but in
addition it gives access to the difference between model output file and the
user's output.

#### Exact judge

The output data of the user's program must be identical to the reference output data.

### Master judges

Master judge is a program responsible for summarizing the results from individual test cases. On the basis of the partial results, the following components of the result are determined:

- the final verdict (see verdicts)
- final score
- overall execution time
- overall memory consumption

The master judge can arbitrarily set each part of that information based on the information received from test case judges (see diagram of submission flow).

##### Important

The final verdict and the score can be freely combined based on verdicts and scores from test case judges.

As we have already discussed, in most cases it is enough to utilize one of the built-in master judges that we are going to examine.

#### Generic master judge

The master judge accepts the solution if all test cases have been solved. The final parameters are determined as follows:

**final score**– the average value from all test cases**overall execution time**- total execution time of all test cases**overall memory consumption**- maximum memory consumption from all test cases

Test case Verdict Score Time Memory 1 accepted 2 1s 1000kB 2 accepted 4 4s 1500kB Final resultaccepted35s1500kBWhen all test cases obtain the positive verdict (i.e.

`accepted`

), then the final verdict is also accepted. The final score is an average of scores, the overall execution time is a sum and the overall memory consumption is a maximum value over all test cases.

Test case Verdict Score Time Memory 1 accepted 2 1s 1000kB 2 wrong answer 0 4s 1500kB 3 time limit exceeded 0 0s 1000kB Final resultwrong answer01s1000kBWhen any test case fails, the final verdict is inherited from the first failed test case. In this example, the problem has three test cases, the second and third of which fail. The final verdict is inherited from the second test case.

##### Tip

The **Generic master judge** is a proper choice when the problem setter requires
that the solution fulfills all their requirements i.e. it is absolutely correct
and sufficiently efficient.

#### Score is % of correctly solved sets

The judge accepts the solution if at least one test case has been correctly solved. The final parameters are determined as follows:

**final score**- the percentage of correctly solved test cases**overall execution time**- total execution time of all accepted test cases**overall memory consumption**- maximum memory consumption from all accepted test cases

Test case Verdict Score Time Memory 1 accepted 0 1s 1000kB 2 wrong answer 0 4s 1500kB 3 wrong answer 0 1s 1500kB 4 accepted 0 2s 2000kB 5 accepted 0 1s 1000kB Final resultaccepted604s2000kBWhen at least one test case obtains the positive verdict (i.e.

`accepted`

), then the final verdict is also positive. The final score is a percentage of correctly solved test cases (in this case`60%`

because 3 of 5 test cases are solved correctly), the overall execution time is a sum and the overall memory consumption is a maximum value over all test cases.

Let us consider the

`power function`

problem. For integer numbers`a`

and`b`

calculate the value of`a`

.^{b}

- The first test case can deliver only input instances for which the result is in the standard numeric type scope (e.g. the result is not greater than
`2`

).^{32}- The second test case requires the solution to manage a large number as the result.
These two test cases give information on solution correctness but without considering efficiency aspects.

- The third test case is designed to test how efficient the solution is. Therefore, it is designed to reject slow (inefficient) algorithms. Only fast (more efficient) algorithms should be able to pass successfully.
The least advanced (but in some way correct) solutions will pass the first test case only and achieve a result of

`33%`

. More advanced solutions (properly handling big numbers) are able to pass the first and the second test and achieve the result of`66%`

. To achieve the best result of`100%`

the solution needs to implement both big numbers and fast power algorithm to pass all three test cases.