# A not so brief remark about time complexity

## Introduction

Occasionally I find myself in a situation where what I'm trying to accomplish is directly at odds with other goals I have at the same time. An interface that doesn't reveal too much is harder to test, efficient algorithms are less legible than their also-ran counterparts, an obvious mental model is harder to implement than other designs, and so on.

In this article I'll present a case where an inefficient algorithm breathes brevity and is a maintenance dream, yet it is an order of magnitude less efficient for the average use case and disjunct from the saner mental model, the one that fits nicely with how one would expect the behaviour to unfold in a more "real" scenario.

I've chosen to focus on the stark contrast in time complexity between the two presented solutions, while still highlighting the trade-offs. Ultimately I'll disclose which one I decided to use and why.

## Theory

In this section of the article I'll explain the concept of time complexity, though not to any great depth. Those who have encountered the concept before should find themselves readily reminded of what it is all about. If the concept is entirely new to you it might prove a bit difficult to grasp in its entirety, but I aim for the text to be explanatory rather than review material.

..., when we express by the sign $O(n)$ a quantity whose order with respect to $n$ does not exceed the order of $n$; whether it actually contains terms of order $n$ is left undecided by the reasoning so far.

— Paul Bachmann, *Die analytische Zahlentheorie* ^{1}

In the year 1894 Paul Bachmann introduced a convenient notation for approximation in his work *Die analytische Zahlentheorie*. That notation was the $O$-notation (Big O notation), and it allows us to readily replace "$\approx$" with "$=$" while quantifying the degree of accuracy. In the following years the notation was popularized by Edmund Landau and others. ^{3}

Generally the notation $O(g(n))$ is applicable when $g(n)$ is a function of the positive integer $n$. The notation denotes a quantity which we do not know explicitly; instead it provides us with the magnitude of that quantity. ^{3}

##### Definition 1:

If $f(n) = O(g(n))$ then that means that there are positive constants $n_0$ and $C$ such that

\begin{equation} |f(n)| \le C |g(n)|\ \textrm{for $n \ge n_0$} \end{equation}

and we say that $f(n)$ *is* $O(g(n))$. ^{2}

### Notation Example

We'll now study the consequences of this in an example that highlights the traits of the notation I find most important. The more subtle aspects have been left out, as they are beyond the scope of this article.

^{This example occurs in both Concrete Mathematics and The Art of Computer Programming, vol. 1.}

We know the sum of the first $n$ squares to be

\begin{equation} \sum\limits_{i=1}^{n} i^2 = \frac{1}{3}n \left( n + \frac{1}{2} \right) (n + 1) = \frac{1}{3}\ n^3 + \frac{1}{2}\ n^2 + \frac{1}{6}\ n \textrm{;} \end{equation}

then it follows that

\begin{equation} \sum\limits_{i=1}^{n} i^2 = O(n^3) \end{equation}

because $\left| \frac{1}{3}n^3 + \frac{1}{2}n^2 + \frac{1}{6}n \right| \le \frac{1}{3}|n^3| + \frac{1}{2}|n^3| + \frac{1}{6}|n^3| = |n^3|$ for all $n$.
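As a quick numeric sanity check of that bound, the sketch below compares the running sum of squares against both the closed form from the text and $n^3$ for the first thousand values of $n$ (the class name is mine, not from the assignment):

```
// Sanity check: the sum 1^2 + ... + n^2 matches the closed form
// and never exceeds n^3 for n >= 1.
public class BoundCheck {
    public static void main(String[] args) {
        for (long n = 1; n <= 1000; n++) {
            long sum = 0;
            for (long i = 1; i <= n; i++) {
                sum += i * i;
            }
            // The closed form n(n + 1/2)(n + 1)/3, rewritten as
            // (2n^3 + 3n^2 + n)/6 to stay in integer arithmetic.
            long closedForm = (2 * n * n * n + 3 * n * n + n) / 6;
            if (sum != closedForm || sum > n * n * n) {
                throw new AssertionError("bound fails at n = " + n);
            }
        }
        System.out.println("bound holds for n = 1..1000");
    }
}
```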

We can also be sloppy and write

\begin{equation} \sum\limits_{i=1}^{n} i^3 = O(n^{10}) \end{equation}

which, while formally correct, is regarded as poor practice. ^{1}

## Context

Recently I've been working on an assignment for a course I'm taking.

The problem demands that I first present the different parts at play, so we'll do a cursory dive into those before adding detail on an as-needed basis.

Luckily the problem doesn't demand an introduction to a vast and cumbersome class hierarchy, but we will need to analyze the interplay of two classes: *Sensor* and *Network*.

The class that I call a *Sensor* has a *communication distance*, and for reasons outside of the class a *Sensor* **must** keep track of the *Sensors* that are within its *communication distance*. Furthermore a *Sensor* also has a *Coordinate* which indicates where in (2-dimensional) space it is.

A *Sensor*, let's call her *Alice*, can communicate with another *Sensor*, let's call him *Bob*, provided that *Bob* is within the communication distance of *Alice*. (That *Alice* can communicate with *Bob* does not imply that *Bob* can communicate with *Alice*.)
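That asymmetry falls out naturally once each *Sensor* carries its own *communication distance*. A minimal sketch, where the class shape and names are my assumptions rather than the assignment's actual API:

```
// Sketch: communication is asymmetric because each Sensor has its
// own communication distance.
public class CommunicationSketch {
    static class Sensor {
        final int x, y;
        final int comDistance;

        Sensor(int x, int y, int comDistance) {
            this.x = x;
            this.y = y;
            this.comDistance = comDistance;
        }

        // Pythagorean check; comparing squared distances avoids
        // the square root.
        boolean canCommunicateWith(Sensor other) {
            int dx = x - other.x;
            int dy = y - other.y;
            return dx * dx + dy * dy <= comDistance * comDistance;
        }
    }

    public static void main(String[] args) {
        Sensor alice = new Sensor(0, 0, 5); // reaches 5 units out
        Sensor bob = new Sensor(4, 0, 2);   // reaches only 2 units out
        System.out.println(alice.canCommunicateWith(bob)); // true
        System.out.println(bob.canCommunicateWith(alice)); // false
    }
}
```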

A *Network* is a class which keeps track of one or several *Sensors* and can provide meaningful data by letting these sensors work together.

When working with these *Sensors* inside of a *Network*, it has proven prudent that, once all of the *Sensors* are created and put into place, we keep track of any *Sensors* within the *communication distance* of each *Sensor* (as mentioned above) by providing a method which, when used (in a prudent manner) on a *Sensor*, updates its list of neighbours.

Particularly, while initially setting up a *Network*, any *Network*, I'd like to iterate over all the *Sensors* constituting the network and update their neighbour lists before anything else.

### Legible solution

So suppose I can tell a *Sensor* *Alice* to add a neighbour *Bob*, i.e.

```
alice.addNeighbour(bob);
```

which may be implemented thusly

```
public boolean addNeighbour(Sensor s) {
    if (!s.equals(this) && this.canCommunicateWith(s)) {
        neighbours.add(s);
        return true;
    }
    return false;
}
```

so we get a `boolean` value indicating whether or not adding the neighbour was successful, and we also can't make *Alice* her own neighbour. Great!

**Remark:** While I haven't written enough Lisp to have grown properly accustomed to prefix notation, there is one thing I often find myself missing: having a question mark in method names. In this case

`.canCommunicateWith?(s)`

might detract from the legibility of the code, but more often than not I find myself deprived when I cannot express myself in a natural way (which would demand question marks).

Now all the *Network* (let's call it *Skynet*) has to do is maintain all of its *Sensors* somehow, and then we could go ahead and do this:

```
for (Sensor alice : sensors) {
    for (Sensor bob : sensors) {
        alice.addNeighbour(bob);
    }
}
```

But oh noes! Look at that time complexity. For each and every *Sensor* in *Skynet* we go through *all* the other sensors in *Skynet* before moving onto the next one. That is, assuming $n$ is the number of *Sensors* in *Skynet*, we get the time complexity

\begin{equation} O(n^2) \textrm{.} \end{equation}

But, before presenting the other solution, we remark that while the code isn't the most efficient (I cringe when I imagine *Skynet* being made up of a mere $2500$ *Sensors*; the above code would then perform $6,250,000$ operations) it is readily legible.
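As a sanity check on that arithmetic, a throwaway count of how many times the inner loop body of the nested pass would run for $2500$ sensors:

```
// Count the inner-loop executions of the nested neighbour pass
// for n = 2500 sensors: n * n = 6,250,000.
public class QuadraticCount {
    public static void main(String[] args) {
        int n = 2500;
        long operations = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                operations++;
            }
        }
        System.out.println(operations); // 6250000
    }
}
```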

### Flood-fill

Let's consider another alternative which is implementing a *flood-fill* algorithm.

Let's discard our previous notion that to add a neighbour of *Alice* we need to tell *Alice* to add *Bob* explicitly. *Alice* is aware of how far she can communicate, and knows her own *Coordinate* in the plane. So postulate that we provide the method *updateNeighbours()* which makes *Alice* ping every adjacent *Sensor* within her communication distance.

We can imagine it looks a little something like this (assuming a *Coordinate* is made up of an x- and a y-value, both of which are integers; furthermore *Alice* needs to be able to get at the set of sensors belonging to *Skynet*, which is a *Network*):

```
public void updateNeighbours() {
    for (int y = -comDistance; y <= comDistance; y++) {
        for (int x = -comDistance; x <= comDistance; x++) {
            int relX = this.x + x;
            int relY = this.y + y;
            if (skynet.hasSensorAt(relX, relY) &&
                isWithinCommunicationDistance(relX, relY)) {
                // A sensor isn't its own neighbour, so skip the
                // offset (0, 0).
                if (x != 0 || y != 0) {
                    neighbours.add(skynet.getSensorAt(relX, relY));
                }
            }
        }
    }
}
```

^{Here isWithinCommunicationDistance(...) uses the Pythagorean theorem to determine whether or not the sensor at the given coordinate is within the communication distance of the calling sensor.}

Okay, so let's analyze this. We go over each *Coordinate* within ± the *communication distance* of the *Sensor* which calls the method, under an assumption about how *Coordinate* is implemented (integers, which may be subject to change).

Applying this to all the *Sensors* in *Skynet* we'd get

```
for (Sensor alice : sensors) {
alice.updateNeighbours();
}
```

which, assuming that $n$ represents the number of *Sensors* in *Skynet* and that $m$ is the *communication distance* (which we'll assume to be consistent across *Sensors* for this calculation), nets us a time complexity of

\begin{equation} O(n \cdot m^2) \textrm{.} \end{equation}

Since we know the *communication distance* to denote a radius, we can claim that while $2m^2 < n$ the *flood-fill* algorithm is more efficient time-complexity wise. This relationship is illustrated as a plot in fig. 1, using $2500$ *Sensors* for $n$, so that the function depends only on the *communication distance*.
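A back-of-the-envelope comparison of the two operation counts, under the same assumption of a uniform *communication distance* $m$ and $n = 2500$, locates the crossover (the class name and the $2nm^2$ cost model are my assumptions):

```
// Compare the quadratic pass (n^2 operations) against the
// flood-fill pass (roughly n * 2m^2 operations) for n = 2500.
// Flood-fill wins exactly while 2m^2 < n, i.e. up to m = 35 here.
public class CrossoverSketch {
    public static void main(String[] args) {
        long n = 2500;
        for (long m = 33; m <= 38; m++) {
            long quadratic = n * n;         // pairwise pass
            long floodFill = n * 2 * m * m; // per-sensor neighbourhood scan
            System.out.println("m = " + m + ": flood-fill "
                    + (floodFill < quadratic ? "wins" : "loses"));
        }
    }
}
```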

Figure 1: Time complexity

**Remark:** The graph in fig. 1 was plotted using **Octave**. Curiously, to properly plot the entirely horizontal line in fig. 1 I had to include a magic one, i.e. $x/x$, for Octave not to riddle the green line with colour artifacts.

### Conclusion

In the end I opted for the legible version, as it makes no assumptions about how *Coordinates* are implemented (imagine the headache if it is later discovered that they are better represented as floats) and because it is more readily understood by the reader (me at different times of the day, eventually my group, and lastly the grader). Lastly, while the *flood-fill* algorithm is better for the average use case, there is no guarantee that the communication distance won't exceed $35$.

Finally, while it is possible to tell the program to use different methods depending on the *communication distance*, I'm averse to having to maintain that when there is no noticeable difference between the two besides the theoretical one.

#### References

^{[1] Paul Bachmann, Die analytische Zahlentheorie. Teubner, Leipzig, 1894. ↩}

^{[2] Ronald L. Graham, Donald E. Knuth and Oren Patashnik, Concrete Mathematics: A Foundation for Computer Science. Addison-Wesley, 1989; second edition, 1994. ↩}

^{[3] Donald E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms. Addison-Wesley, 1968; third edition, 1997. ↩}

^{[4] Lars-Erik Janlert and Torbjörn Wiberg, Datatyper och algoritmer, Studentlitteratur, 1990; second edition, 2000. ↩}