python – Alfonso Sánchez-Beato's blog

These days, in the covid world we live in we have multiple videoconferences every day, with colleagues, family, friends… The average number of people attending has grown, and when I was in a call some time ago, I wondered, what would be the “optimal” way of splitting the videconference application window among the people in the call? What would be the definition of “optimal”? And, in how many ways could the space be split?, as sometimes you might prefer, for instance, one where space is shared equally or one where the current speaker has predominance.

This intrigued me, and I started some research to try to find answers to these questions. I started with the assumption that the videoconference application owns a window on the screen, which is a rectangle (we will call it the bounding rectangle), and we want to divide it in $N$ different rectangles, being $N$ the number of people in the call. We can define different criteria for optimizing like, say, dividing equally the window area among the callers. However, for that we need to define mathematically the dividing rectangles, and here is where the first difficulty arises, as the mathematical description changes depending on how you split the window. For instance, two of the ways of dividing the bounding rectangle (let’s call that “rectangulation”) for $N=3$ are as displayed:

Figure 1 – two possible ways of dividing a rectangle in three

Here we assign a width and a height to each rectangle. Note that optimizing means assigning values to these $w_i$ and $h_i$ to fulfill the optimizations criteria, which is like moving the “walls” between rectangles. The height $h$ and width $w$ of the bounding rectangle are considered fixed. In each rectangulation, it is possible to expose the implicit restrictions by equations that relate widths and heights of the different rectangles. For the first one in the image, the equations are:

$$\begin{array}{lcl}
w_1+w_2 & = & w \\
w_1 & = & w_3 \\
h_1+h_3 & = & h \\
h_1+h_3 & = & h_2 \\
\end{array}$$

For the second one:

$$\begin{array}{lcl}
w_1+w_2+w_3 & = & w \\
h_1 & = & h \\
h_1 & = & h_2 \\
h_2 & = & h_3
\end{array}$$

When we optimize we need to take into account these restrictions, which implies that for each rectangulation we have a different optimization problem with different solutions due to having different restrictions/equations in each case. The conclusion is that we need to solve the optimization problem for all possible rectangulations for a given $N$, and then compare the different optimal values to select the best one. In summary, the steps we need to follow are

Find all possible rectangulations for $N$ callers
Decide an optimization criteria
Solve the optimization problem for all rectangulations, finding the best sizes for each way of splitting the bounding rectangle
Choose the best rectangulation by finding out which of them has the optimal solution that fulfills better the optimization criteria

In the following sections we will go through each of these steps. I have developed some python code that implements the algorithms and that includes some unit tests that drive them and will be mentioned in different parts of the article. You can find it in github. If on an Ubuntu system, you can install dependencies and clone the repository from the command line with:

$ sudo apt install python3-pip
$ pip3 install numpy matplotlib scipy sympy
$ git clone https://github.com/alfonsosanchezbeato/optimal-rectangulations

N.B. In the mathematical expressions we use one-based indexes but, for practical reasons, the code uses zero-based indexes.

Rectangulating a rectangle

There is quite a bit of mathematical literature on the topic of rectangulations, motivated by applications like placing elements in integrated circuits. Depending on how you define them there can be more or less ways to “rectangulate” a rectangle. We want to take into account all possible rectangulations, but at the same time we need to be careful to avoid duplications. In our framework, a single rectangulation is one that can represent all rectangle sets that can fulfill a given set of equations as those shown at the beginning. Practically, that means that we can slide segments in the rectangulation (without changing the vertical or horizontal orientation) along the limiting segments, as that will not change the relationship that the equations describe. Each of these rectangulations is a class of equivalence of all possible rectangle sets described by a set of equations, and we need to find each of them and then find the optimal rectangles sizes for them, so we cover all possibilities. It turns out that each of these equivalence classes can be represented by what is called in the literature mosaic floorplans or diagonal rectangulations.

How to obtain these rectangulations? I will follow (mostly) the method explained in ¹. First, we create a matrix of size $N$x$N$, which conceptually splits the bounding rectangle in cells. We will call it the rectangulation matrix. We number the rectangles consecutively and put the numbers in the matrix diagonal. Then, we choose a permutation (any can do) for the numbers. This permutation determines how the rectangulation will look like. Finally, we follow the order in the permutation to create the rectangles, one by one. To build each rectangle, the rules are

Start from the free cell as left and bottom as possible that can include the cell with the rectangle number
Do not include cells with numbers for other rectangles
Make it as big as possible, with the restriction that it must not surpass the right of the rectangle below it and it must not surpass the top of the rectangle to its left. The bottom and left walls of the bounding rectangle are considered bottom and left rectangles for rectangles touching them.

This defines a map between permutations and diagonal rectangulations that we will call $\rho$. Figure 2 illustrates this procedure for the rectangulation $\rho(4,1,3,5,2)$.

Figure 2 – rectangulation for $\rho(4,1,3,5,2)$

This is implemented in the do_diagonal_rectangulation() function. There is a test to create random rectangulations that can be run with

$ ./rect_test.py TestRectangles.test_diagonal_rectangulation_random

Figure 3 shows an example of a run, with numbered rectangles. The sizes are as returned by do_diagonal_rectangulation(), so we can see that all rectangles have part of them in the top-left to bottom-right diagonal.

Figure 3 – rectangulation for N=10

It turns out that this map between permutations and diagonal rectangulations is surjective, so it would be interesting to filter out the permutations that produce duplicated rectangulations to reduce the number of calculations. Fortunately, it has been proved that it is possible to establish a bijection between a subset of permutations, called Baxter permutations, and diagonal rectangulations².

Baxter permutations

Baxter permutations can be defined by using the generalized pattern avoidance framework³. The way it works is that a sequence is Baxter if there is no subsequence of 4 elements that you can find in it that matches one of the generalized patterns 3-14-2 and 2-41-3. This is an interesting topic but I do not want to digress too much: I have based my implementation on the definition of section 2 of ⁴. The function is_baxter_permutation() checks whether a sequence is Baxter or not by generating all possible subsequences of 4 elements and checking if they avoid the mentioned patterns. An example using it can be run with

$ ./rect_test.py TestRectangles.test_baxter_permutation

Using Baxter permutations to filter out permutations can save lots of time for bigger $N$. For instance, for $N=10$, only around 10% of the permutations are Baxter. There is a closed form formula for the number of Baxter permutations as a function of $N$⁵, so we can compare easily with the total number of permutations, $N!$.

Rectangle equations

Once we have a rectangulation, we need to extract the equations that define the relations between widths and heights across the rectangles. To do this, we can loop through the rectangulation matrix and detect the limits between rectangles, which define horizontal and vertical segments. When found, we will create an equation that defines that the sum of the widths/heights of the rectangles at each side of the segment must be equal. There will be also two equations for the bounding rectangle that make the sum of width/height of rectangles at the side be equal to the total width/height. Note that these are 2 and not 4 equations, as 2 of the 4 we can extract would be linearly dependent on the full set of equations (this can be seen because if we take the equation on one side and follow the parallel segments equations while moving to the opposite side we can deduce the rectangles that touch the bounding rectangle at the opposite limit). Example of the equation sets that would be extracted are those we have already seen for the rectangulations in figure 1. In the slightly more complex case of figure 2, the equations would be

$$\begin{array}{lcl}
w_1+w_2 & = & w \\
w_2 & = & w_3+w_5 \\
w_1+w_3 & = & w_4 \\
h_1+h_4 & = & h \\
h_1 & = & h_2+h_3 \\
h_3+h_4 & = & h_5
\end{array}$$

The function that implements this algorithm is build_rectangulation_equations(). It returns a matrix with the equation coefficients that will be used when optimizing the rectangulations.

The final number of linearly independent equations we will get will be always $N+1$. This is a corollary of Lemma 2.3 from ⁶. This lemma defines two operations, “Flip” and “Rotate”, that allow transforming any rectangulation on another one by a finite number of these operations. These operations do not modify the number of segments in the rectangulation. Therefore, we could transform any rectangulation in one having only, say, vertical segments. This rectangulation would have $N-1$ segments splitting the bounding rectangle, which implies that the original rectangulation had also $N-1$ segments, because flips and rotations keep the number of segments the same. Finally, after considering two additional equations for the bounding rectangle we find out that the total number of equations for any rectangulation would be $N+1$.

In each rectangulation, some of the segments will be horizontal, defining $J$ equations, and others will be vertical, defining $K$ equations, with $J+K=N+1$ if we consider now two of the sides of the bounding rectangle segments. We can then define a set of coefficients $a_{j,i}$ and $b_{k,i}$ with $j \in \{1,\ldots,J\}$, $k \in \{1,\ldots,K\}$ and $i \in \{1,\ldots,N\}$ to have finally the equations

$$\begin{array}{lcl}
a_{1,i}w_i & = & w \\
a_{j,i}w_i & = & 0 & \textrm{for } j \in \{2,\ldots,J\}\\
b_{1,i}h_i & = & h \\
b_{k,i}h_i & = & 0& \textrm{for } k \in \{2,\ldots,K\}
\end{array}$$

where the two of them with independent term will be the equations to account for the limits imposed by the bounding rectangle. Note that here we are passing all non-independent terms to the left, so the values for the coefficients can be $0$, $1$ or $-1$.

Optimization problem

Now that have defined arithmetically a rectangulation, we need to decide, what do we want to optimize for? Depending on the circumstances, we might want to optimize with different targets. For instance, we might want to split a window equally among callers, but depending on the rectangulation that could produce very narrow rectangles either horizontally or vertically. So, this cannot be the only thing to consider. In the end, I decided to adopt two criteria:

Make the area the same for all callers as far as possible
Preserve the camera aspect ratio so we can display camera output without cropping

These two criteria overdetermine the system (although this would not be the case if we had selected only criterion 1, as in this case we would have $2N$ equations in total), so it is not possible to achieve both fully at the same time. So, I used a ponderated optimization function with a coefficient $c$ to balance between the two parts of the function. With that and taking into account the previously calculated rectangulation equations, we can pose the optimization problem as

$$\begin{aligned}
\min_{w_1,\ldots,w_N,h_1,\ldots,h_N \in \mathbb{R}^{2N}} F(w_1,\ldots,w_N,h_1,\ldots,h_N) = \\
= \sum_{i=1}^N{\left(w_i h_i – \frac{w h}{N}\right)}^2 + c\,h^2 \sum_{i=1}^N{\left(w_i – k h_i\right)}^2 \\
\textrm{subject to} \quad
\begin{aligned}
a_{1,i}w_i = w \\
a_{j,i}w_i = 0 \\
b_{1,i}h_i = h \\
b_{k,i}h_i = 0
\end{aligned}
\end{aligned}$$

where $w$ and $h$ are the width and height of the bounding rectangle and $w_i$ and $h_i$ are the width and height of the $i$-th rectangle of the rectangulation. The coefficient $k$ is the desirable width over height ratio for the rectangles, which we will want usually to be the same as the usual camera ratio between $x$ and $y$ resolution. In the first term of the objective function $F$ the difference between the rectangles areas and the area that we would have if the total area was equally split is calculated, while the second term measures how far we are from the target aspect ratio in each rectangle.

Note that in the equation we multiply $c$ by the the squared total height. This is needed so resulting sizes are scaled when $w$ and $h$ are scaled. That is, when multiplying the size of the bounding rectangle by a constant $q > 0$ so $w \rightarrow qw$ and $h \rightarrow qh$, we would like the sizes of the solution to scale by the same value. We can see that is actually the case if we pose a new optimization problem for the scaled bounding rectangle (with new variables $w_i^\prime$ and $h_i^\prime$) and then substitute in the new target function $F^\prime$:

$$
\begin{aligned}
F^\prime = \sum_{i=1}^N{\left(w_i^\prime h_i^\prime – \frac{q^2 w h}{N}\right)}^2 + c\,q^2 h^2 \sum_{i=1}^N{\left(w_i^\prime – k h_i^\prime\right)}^2 \\
= q^4 \sum_{i=1}^N{\left(\frac{w_i^\prime h_i^\prime}{q^2} – \frac{w h}{N}\right)}^2 + c\,q^4 h^2 \sum_{i=1}^N{\left(\frac{w_i^\prime}{q} – k \frac{h_i^\prime}{q}\right)}^2
\end{aligned}
$$

The constant $q^4$ does not affect the location of the minima of the function, so we can ignore it. Then, if we do the change of variable $w_i^\prime = q w_i$ and $h_i^\prime = q h_i$ we end up with the original objective function $F$, which proves that the minima of $F^\prime$ are $q$ times the minima of $F$. The constraints also behave well under the change of variables: the equations without independent terms are unaffected by scaling, and for the other two, when we replace $w$ by $qw$ and $h$ by $qh$ we see that $q$ can go to the left side of the equations and we would have $w_i^\prime/q$ and $h_i^\prime/q$ in all terms, showing that with the change of variables we would have the original minimization problem.

What this is telling us is that solutions to this problem depend only on $N$, $k$, $c$ and the ratio $w/h$, minus scaling. If we have a solution for a set of these variables, we know the solution if we scale proportionally the bounding rectangle.

Something similar happens if we scale only one of the dimensions, with the difference that this would happen only if we scale $k$ too, so $w \rightarrow qw$ and $k \rightarrow qk$ would scale widths by $q$, while $h \rightarrow qh$ and $k \rightarrow k/q$ would scale heights by $k$.

Finally, this is defining the solution to one rectangulation, determined by a given Baxter permutation that we will call $r$, that defines a set of optimal values $\{w_1^r,\ldots,w_N^r,h_1^r,\ldots,h_N^r\}$. The final solution would be the one that minimizes $F$ across the optimal values for all possible rectangulations. If we create a set $S_{B(N)}$ where each element is a set with the optimal solution to a rectangulation, being $B(N)$ the total number of rectangulations, the global optimal values would be

$$
\min_{\{w_1^r,\ldots,w_N^r,h_1^r,\ldots,h_N^r\} \in S_{B(N)}} F(w_1^r,\ldots,w_N^r,h_1^r,\ldots,h_N^r)
$$

Solving the optimization problem

It is possible to solve the optimization problem of the last section in a fully analytical way. For that, we can use Lagrange multipliers to include the restrictions in a new optimization function, and then find the derivatives to be able to find the extrema. The python sympy module comes in handy for solving analytically this problem: this is implemented by the solve_rectangle_eqs() function. There is a test that uses it:

$ ./rect_test.py TestRectangles.test_lagrange_method_analitycal

However, it turned out that when we have enough rectangles, sympy can take a very long time to get a solution, and we have many problems to solve, one for each possible rectangulation. Therefore, I used in the end the minimize function from scipy library to solve numerically the problem. We can calculate analytically the Jacobian for the objective function and provide good initial estimates by using the proportions from the diagonal rectangulation. The minimize function can also incorporate constraints, so we can easily add the rectangulation equations to the mix. Scipy will choose the SLSQP algorithm to perform the minimization, due to these constraints. The results are consistently fast and accurate. The function that implements this is minimize_rectangulation(). Some of the tests that use it can be run with:

$ ./rect_test.py TestRectangles.test_scipy_minimize
$ ./rect_test.py TestRectangles.test_diagonal_rectangulation_15rect
$ ./rect_test.py TestRectangles.test_diagonal_rectangulation_random

The solution for the rectangulation in figure 3 for a window of size 320×180 for $c=0.05$ and $k=1.5$ can be seen in figure 4. The area element of $F$ is predominant in this case.

Figure 4 – figure 3 rectangulation with sizes minimizing the target function

Results

Now that we have everything in place we can answer the questions we posed at the beginning of the post. For instance, in a videoconference with 5 callers, what would be the ideal window split? For this, the function get_best_rect_for_window() calculates all possible rectangulations for a given $N$, making sure we filter non-Baxter permutations, then minimizes according to the target function we defined previously (by using minimize_rectangulation()), and finally selects the rectangulation that produces the global smallest value for that target. A test that calculates the optimum for $N=5$, $c=0.05$, $k=1.33$, $w=320$ and $h=180$ can be run with:

$ ./rect_test.py TestRectangles.test_best_5rect

Visually, we get the rectangulation shown in figure 5.

Figure 5: optimal rectangulation for N=5, k=1.33

How things change as we start to resize the window? One of the tests shows what happens when the ratio $w/h$ changes (remember that solutions simply scale if we scale width and height with the same constant, so this ratio interests us more than real resolutions sizes in a monitor). For $c=0.05$, $k=1.33$ and area $wh=320 \cdot 200 = 57600$ pixels, this test finds the optima for different ratios:

$ ./rect_test.py TestRectangles.test_best_rect_for_w_h_ratio

We can see the optimal rectangulations for different ratios calculated by this test in figure 6. It is also interesting to see how things change if we change the balance between the members of the objective function. In figure 7 we show the solutions for same test but with $c=0.5$.

Figure 6 – optimal rectangulations for N=5, c=0.05 and different w/h ratios

Figure 7 – optimal rectangulations for N=5, c=0.5 and different w/h ratios

Outcome is in general lines as you would expect. Having a prime number of callers forces rectangles to not be the same all the time, but the algorithm follows the optimization criteria as far as possible and makes a reasonable job. With a bigger $c$ we see that having rectangles with a width/height ratio near to $k$ starts to predominate and difference between areas starts to grow.

Conclusion

We have seen in this post the ways in which a rectangle can be split in another rectangles and how we can optimize the layout with different criteria. This of course has more applications than splitting the window of a video-conference application in different camera views, being the most well known one optimizing the placing of components on an integrated circuit. In any case, I had not expected to go this far when I started to work on this problem, but it was a fun ride that hopefully you have enjoyed as well, especially if you have gone this far in the reading!

Tag: python

How to share your video-conference window among attendees – or, the many ways of splitting a rectangle in many