Silence.
Chained Unit Tests – CUT
29. March, 2023
The CUT approach lets you test logically related parts or gradually replace integration tests with pure unit tests.
Let’s start with the usual app: There is a backend server with data and a frontend application. Logically speaking, those are connected, but the backend uses Java and the frontend uses TypeScript. At first glance, the only way to test this is to
- Set up a database with test data.
- Start a backend server.
- Configure the backend to talk to the database.
- Start the frontend.
- Configure the frontend to talk to the test backend.
- Write some code which executes an operation in the frontend to test the whole.
There are several problems with this:
- If the operation changes the database, you sometimes have to undo this before you can run the next test. The usual example is a test which checks the rendering of a table of users and another test which creates a new user.
- The test executes millions of lines of code. That means a lot of causes for failures which are totally unrelated to the test. The tests are flaky.
- If something goes wrong, you need to analyze what happened. Unlike with unit tests, the problem can be in many places. This takes much more time than just checking the ~ 20 lines executed by a standard unit test.
- It’s quite a lot of effort to make sure you can render the table of users.
- It’s very slow.
- Some unrelated changes can break these tests since they need the whole application.
- Plus several more but we have enough for the moment.
CUT is an approach that can help here.
Step 1: Rendering in the Frontend
Locate the code which renders the table. Ideally, it should look like this:
- Fetch list of elements from backend using REST
- Render each element
Change this code in such a way that the fetching is done independent of the rendering. So if you have:
    renderUsers() {
        const items = fetchUsers();
        return items.map((it) => renderUser(it));
    }
replace that with this:
    renderUsers() {
        const items = fetchUsers();
        return renderUserItems(items);
    }

    renderUserItems(items) {
        return items.map((it) => renderUser(it));
    }
At first glance, this doesn’t look like an improvement. We have one more method. The key here is that you can now call the render method without fetching data via REST. Next:
- Start the test system.
- Use your browser to connect to the test system.
- Open the network tab.
- Open the users table in your browser.
- Copy the response of fetchUsers() into a JSON file.
- Write a test that loads the JSON and calls renderUserItems().
This now gives you a unit test which works even without a running backend.
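A minimal sketch of such a test, under some assumptions: the User shape and the stand-in renderUser() that returns an HTML string are hypothetical, and the recorded response is inlined as a string here instead of being loaded from the JSON file you saved from the network tab.

```typescript
// Hypothetical shape of one entry in the recorded response; adjust to your API.
interface User {
  id: number;
  name: string;
}

// Stand-in for the real row renderer (assumption: it returns an HTML string).
function renderUser(user: User): string {
  return `<tr><td>${user.id}</td><td>${user.name}</td></tr>`;
}

// The function extracted in step 1: pure rendering, no REST call.
function renderUserItems(items: User[]): string[] {
  return items.map((it) => renderUser(it));
}

// In a real test this would be the content of the JSON file copied
// from the network tab, loaded from disk by the test.
const recordedResponse = `[
  {"id": 1, "name": "Alice"},
  {"id": 2, "name": "Bob"}
]`;

function testRenderUserItems(): void {
  const items = JSON.parse(recordedResponse) as User[];
  const rendered = renderUserItems(items);
  if (rendered.length !== 2) {
    throw new Error(`expected 2 rows, got ${rendered.length}`);
  }
  if (rendered[0] !== "<tr><td>1</td><td>Alice</td></tr>") {
    throw new Error(`unexpected first row: ${rendered[0]}`);
  }
}

testRenderUserItems();
```

With a test framework, the two checks would become ordinary assertions; the point is only that nothing in this test touches the network.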
We have managed to cut the dependency between frontend and backend for this test. But soon, the test will give us a false result: The test database will change and the frontend test will run with outdated input.
Step 2: Keeping the test data up-to-date
We could use the steps above to update the test data every time the test database changes. But a) that would be boring, b) we might forget it, c) we might overlook that a change affects the test data, d) it’s tedious, repetitive manual work. Let’s automate this.
- Find the code which produces the JSON that fetchUsers() asks for.
- Write a unit test that connects to the test database, calls the code and compares the result with the JSON file in the frontend project.
This means we now have a test which fails when the JSON changes. So in theory, we can notice when we have to update the JSON file. There are some things that are not perfect, though:
- If the test fails, you have to replace the content of the JSON file manually.
- It needs a running test database.
- The test needs to be able to find the JSON file which means it must know the path to the frontend project.
Step 2 a: Update the JSON file
There are several solutions to this:
- Use an assertion that your IDE recognizes and which shows a diff when the test fails. That way, you can open the diff, check the changes, copy the new output, open the JSON file, paste the new content. A bit tedious but if you use keyboard shortcuts, it’s just a few key presses and it’s always the same procedure.
- Add a flag (command line argument, System property, environment variable) which tells the test to overwrite the JSON when the test fails (or always, if you don’t care about wear and tear of your hardware). Since all your source code is under version control, you can see the diff there and commit or revert.
- Optional: If the file doesn’t exist, create it. This is a bit dangerous but very valuable when you have a REST endpoint with many parameters and you need lots of JSON files. That way, the first version gets created for you and you can always use the diff/copy/paste pattern.
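The overwrite flag and the create-if-missing option can be sketched roughly like this. The sketch is in TypeScript for consistency with the frontend examples, although the backend test would be written in the backend's language. The function name matchesJsonFile and the environment variable CUT_OVERWRITE are made up for illustration, and failing once after creating a missing file is a design choice of this sketch, not something the approach requires.

```typescript
import * as fs from "node:fs";

// Hypothetical flag; a command line argument or system property works as well.
const OVERWRITE = process.env.CUT_OVERWRITE === "1";

// Compares freshly produced JSON text with the stored file.
// Returns true when both match; optionally updates or creates the file.
function matchesJsonFile(
  path: string,
  fresh: string,
  overwrite: boolean = OVERWRITE,
): boolean {
  if (!fs.existsSync(path)) {
    // Optional convenience: create the first version of the file.
    fs.writeFileSync(path, fresh);
    return false; // fail once so that someone reviews the new file
  }
  const stored = fs.readFileSync(path, "utf8");
  if (stored === fresh) {
    return true;
  }
  if (overwrite) {
    // Review the change afterwards with the diff of your version control.
    fs.writeFileSync(path, fresh);
  }
  return false;
}
```

The test itself then asserts that matchesJsonFile(...) returns true. Run it once with the flag set to update the file, check the diff, and commit or revert.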
You probably have concerns that mistakes could slip through when people mindlessly update the JSON without checking all the changes, especially when there are a lot.
In my experience, this doesn’t matter. For one, it will rarely happen.
If you have code reviews, then it should be caught there.
Next, you have the old version under version control, so you can always go back and fix the issue. Fixing it will be easy because you now have a unit test that shows you exactly what happens when you change the code.
Remember: Perfection is a vision, not a goal.
Step 2 b: Cut away the test database
Approaches to achieve this from cheapest to most expensive:
- Fill the test database from CSV files. Try to load the CSV in your test instead of connecting to a real database.
- Use an in-memory database for the test. Use the same scripts to set up the in-memory database as the real test database. Try to load only the data that you need.
- If the two databases have slightly different syntax, load the production script and then patch the differences in the test to make the same script work for both.
- Have a unit test that can create the whole test database. The test should verify the contents and dump the database in a form which can be loaded by the in-memory database.
- Use a Docker image for the test database. The test can then run the image and destroy the container afterwards.
Step 2 c: Project organization
To make sure the backend tests can find the frontend files, you have many options:
- Use a monorepo.
- Make sure everyone checks out the two projects in the same folder and using the same names. Then, you can just go one up from the project root to find the other project.
- Use an environment variable, System property or config file to specify the path. In the last case, make sure the name of the config file contains the username (Java: System property user.name) so every developer can have their own copy.
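A small sketch that combines two of these options: an environment variable wins, otherwise the code goes one up from the project root. The variable name FRONTEND_DIR and the default folder name "frontend" are assumptions for illustration.

```typescript
import * as path from "node:path";

// Resolves the folder of the other project.
// FRONTEND_DIR is a hypothetical variable name; the fallback assumes both
// projects are checked out side by side under the same parent folder.
function frontendDir(
  env: Record<string, string | undefined>,
  projectRoot: string,
  siblingName: string = "frontend",
): string {
  const configured = env["FRONTEND_DIR"];
  if (configured !== undefined && configured !== "") {
    return configured;
  }
  // Go one up from the project root to find the other project.
  return path.join(projectRoot, "..", siblingName);
}
```

A test would call it as frontendDir(process.env, process.cwd()) and append the relative path of the JSON file.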
What else can you do?
There are several more things that you can add as needed:
- Change fetchUsers() so you can get the URL it will try to fetch from. Put the URL into a JSON file. Load the JSON in the backend and make sure there is a REST endpoint which can handle this URL. That way, you can test the request and make sure the fetching code in the frontend keeps working.
- If you do this for every REST endpoint, you can compare the list from the tests against the list of actual endpoints. That way, you can delete unused endpoints or find out which ones don’t have a test, yet.
- You can create several URLs with different parameters to make sure the fetching code works in every case.
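Comparing the two lists is a plain set difference. The sketch below assumes that both lists are available as arrays of path strings and that query parameters can simply be cut off; how you collect the actual endpoints depends on your backend framework and is not shown.

```typescript
// Removes the query string so "/api/users?page=2" matches "/api/users".
function stripQuery(url: string): string {
  const q = url.indexOf("?");
  return q === -1 ? url : url.slice(0, q);
}

// tested: URLs recorded by the frontend tests (hypothetical input).
// actual: paths which the backend really exposes (hypothetical input).
function compareEndpoints(
  tested: string[],
  actual: string[],
): { untested: string[]; unknown: string[] } {
  const testedSet = new Set(tested.map((url) => stripQuery(url)));
  const actualSet = new Set(actual);
  return {
    // Endpoints without a test: write one, or delete the endpoint.
    untested: Array.from(actualSet).filter((e) => !testedSet.has(e)).sort(),
    // URLs the frontend asks for but the backend does not offer.
    unknown: Array.from(testedSet).filter((e) => !actualSet.has(e)).sort(),
  };
}
```

Endpoints with path parameters (for example a user id inside the path) would need a smarter match than string equality.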
Conclusion
The CUT approach allows you to replace complex, slow and flaky integration tests with fast and stable unit tests. At first, it will feel weird to modify files of another project from a unit test or even to connect the two projects.
But there are several advantages which aren’t obvious:
- You now have test data for the default case. You can create more test cases by copying parts of the JSON, for example. This means you no longer have to keep all edge cases in your test database.
- This approach works without understanding what the code does and how it works. It’s purely mechanical. So it’s a great way to start writing tests for an unknown project.
- This can be added to existing projects with only small code changes. This is especially important when the code base has few or no tests since every change might break something.
- This is a cheap way to create test data for complex cases, for example by loading the JSON and then duplicating the rows to trigger paging in the UI rendering. Or you can duplicate the rows and then randomize some fields to get more reasonable test data. Or you can replace some values to test cases like very long user names.
- It gives you a basis for real unit tests in the frontend. Just identify the different cases in the JSON and pick one example for each case. For example, if you have normal and admin users and they look different, then you need two tests. If there is special handling when the name is missing, add one more test for that. Either get the backend to create the fragments of the JSON for you or load the original JSON and then filter it. Make sure you fail the test when an expected item is no longer in the list.
- The first test will be somewhat expensive to set up. But after that, it will be cheap to add more tests, for example for validation and error handling, empty results, etc.
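Deriving more test data from the recorded JSON could look like the following sketch. The User shape and the helper names are hypothetical; the three helpers cover duplicating rows to trigger paging, very long user names, and picking one example per case while failing when it has disappeared.

```typescript
// Hypothetical shape of one recorded row.
interface User {
  id: number;
  name: string;
}

// Repeats the recorded rows until `count` rows exist; every copy gets a
// fresh id and a numbered name so the rows stay distinguishable.
function repeatRows(rows: User[], count: number): User[] {
  if (rows.length === 0) {
    throw new Error("need at least one recorded row");
  }
  const result: User[] = [];
  for (let i = 0; i < count; i++) {
    const template = rows[i % rows.length];
    result.push({ ...template, id: i + 1, name: `${template.name} ${i + 1}` });
  }
  return result;
}

// Replaces every name with one of the given length to test long user names.
function withLongNames(rows: User[], length: number): User[] {
  return rows.map((row) => ({ ...row, name: "x".repeat(length) }));
}

// Picks the example for one case and fails loudly when it has vanished.
function pickUser(rows: User[], name: string): User {
  const found = rows.filter((row) => row.name === name)[0];
  if (found === undefined) {
    throw new Error(`expected user "${name}" is no longer in the test data`);
  }
  return found;
}
```

For example, repeatRows(recorded, 250) gives the table renderer enough rows to show a second page if the page size is below 250.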
Why chained unit tests? Because they connect different things in a standard way like the links of a chain.
From a wider perspective, they let you verify that two things will work together. We use the same approach routinely when we expect the compiler to verify that methods which we call exist and that the parameters are correct. CUT lets you do the same for other things:
- Code and end user documentation.
- Code and formulas in Excel files.
- Validation code which should work exactly the same in frontend and backend.
How Quantum Entanglement Works – For Dummies
5. December, 2022
So you’ve heard about this “quantum entanglement” stuff and how Einstein was apparently worried it might break the speed of light.
It doesn’t and he wasn’t.
Here is the simple version: Take a piece of paper. Rip it apart once. Check the pieces that you got. Maybe scribble something on it. Mix the two pieces behind your back. Give one of them to a friend without looking. Send the friend to the end of the universe. If he refuses, find a real friend. Wait a few billion years. Open your hand. In that instant, no matter how far away your friend is, you will know what piece he has in his hands.
The complicated version: Quantum is weird but some stuff is actually easy to understand. There are just a few ways to create entangled particles. All of them have something in common: The pieces must add up exactly to what you put in. This isn’t magic or some badly written Star *beep* episode. Imagine you put in a piece of paper (or a photon – a blip of light). You can’t have more paper (or more light) after splitting it. In the case of the photon: If you add energy, you get more light but no entanglement.
So the entangled photon pairs are always half of the original in terms of energy (which roughly translates to “half as bright”). And they always go in exactly opposite directions (see conservation of momentum). Things like that. Which means you know everything about the two particles except one thing: You don’t know which is which unless you look.
If you keep one of them around and send the other away, and then at some later point look at what you kept, you know exactly and instantly what the other must look like, no matter how far away it is now.
But you can’t change the far particle anymore. Same as you can’t add text to the paper which your friend at the edge of the universe is holding. So you can’t use this to beam information.
Sorry.
Be grateful for having friends like that. A pity that you sent him away.
When to put generated code under version control
29. June, 2022
Many people think that when a computer generates code, there is no point in putting it under version control. In a nutshell: If you generate the code once with a tool that you’re confident with, there is no point in putting it under version control. If you need to tweak a lot, version control will make your life so much easier.
Decision tree:
- Do you need to tweak the options of the code generator until everything works? If so, then yes.
- How confident are you with using the code generator? If not very, then yes.
- Is the code generator mature? Then no.
Some background: Let’s compare a home-grown code generator which is still evolving with, say, the Java Compiler (which generates byte code). The latter is developed by experienced people, tested by big companies and used by thousands of people every day. If there is a problem, it was already fixed. The output is stable, well understood and on the low end of the “surprise” scale. It has only a few options that you can tweak and most of them, you’ll never even need to know about. No need to put that under version control.
The home-grown thing is much messier. New, big features are added all the time. Stuff that worked yesterday breaks today. No one has time for writing proper tests. In this kind of situation, you will often need to compare today’s output with a “known good state”. There are a dozen roughly understood config options for many things that might make sense if you were insane. Putting the generated code under version control in this situation is a must-have since it will make your life easier.
The next level is that the code generator itself is mature but it offers a ton of config options. Hypothetically, you will know the correct ones to use before you use the generator for the first and only time. Stop laughing. In practice, your understanding of config options will evolve. As you encounter bugs and solutions, you will need to know what else a config change breaks. Make your life easy and use version control: Config change, regenerate, look at diff, try again.
In a similar fashion, learning to use the code generator in an efficient and useful way will take time. You will make mistakes and learn from them. That won’t stop a co-worker from making the same mistakes or other ones. Everyone in the team has to learn to use the tool. Version control will prevent one person from breaking things for others.
How
Write a parameterized unit test which generates the code in a temporary folder. In the end, each file should be a single test which compares the freshly generated version with the one in the source tree.
Add one test at the end which checks that the list of files in both folders is the same (to catch newly generated files and files which have to be deleted).
Add a command line option which overwrites the source files with the ones created by the test. That way, you can both catch unexpected changes in your CI builds and efficiently update thousands of files when you want.
The logic in the test should be:
    expected = content of the freshly generated file
    actual = content of the file in the source tree
        or just the file name if the file doesn't exist (makes it
        easier to find the file when the test itself is broken).
    if expected != actual, then
        if (overwrite) then copy expected to actual
    assert expected == actual
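The comparison of the two folders, including the check of the file lists, might be sketched like this. To keep the sketch short and independent of any test framework, it assumes that both folders have already been read into maps from relative file name to content; the names FileSet and compareGenerated are made up for illustration, and the overwrite step is left out.

```typescript
// Relative file name -> file content. Filling the maps (reading the
// temporary folder and the source tree) is not part of this sketch.
type FileSet = Record<string, string>;

function has(files: FileSet, name: string): boolean {
  return Object.prototype.hasOwnProperty.call(files, name);
}

// Returns one message per difference; an empty list means "all good".
function compareGenerated(generated: FileSet, sourceTree: FileSet): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(generated).sort()) {
    if (!has(sourceTree, name)) {
      problems.push(`new file, not in source tree: ${name}`);
    } else if (sourceTree[name] !== generated[name]) {
      problems.push(`content differs: ${name}`);
    }
  }
  for (const name of Object.keys(sourceTree).sort()) {
    if (!has(generated, name)) {
      problems.push(`no longer generated, delete: ${name}`);
    }
  }
  return problems;
}
```

In a parameterized test, each entry of the first loop would be its own test case, and the file list check would be the one extra test at the end.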
Use a version of the assert that shows a diff in your IDE. That way, you can open the file in your IDE and use copy&paste out of the diff window to fix small changes to get a feeling how they work.
Or you can edit the sources until they look the way they should and then tweak config options until the tests confirm that the code generator now produces the exact desired result.
Bonus: You can tweak the generated code in your unit test. It’s as simple as applying patches in the “read content of the freshly generated file” step. One way you can use this is to fix all the IDE warnings in the generated code to get a clean workplace. But you can also patch any bugs that the code generator guys don’t want to fix.
Workaround
If you don’t want to put all generated code under version control, you can create a spike project to explore all the important features. In this spike, you create an example for every feature you need and put the output under version control. That way, you don’t have to put millions of lines under version control.
The drawback is that you need a team of disciplined individuals who stick to the plan. In most teams, this kind of discipline is shot in the back by the daily business. If you find yourself in a mess after a few weeks: Put everything under version control. It’s a bit of wasted disk space. Say, $10 per month. If you have to discuss this with the team for more than five minutes, the discussion was already much more expensive.
Children can become anything they want
26. June, 2022
The difference is that some people think “they” means “the children” while other people think it means themselves.
Dark Forest is a Fairy Tale
16. December, 2021
The Dark Forest, an idea developed by Liu Cixin for his Remembrance of Earth’s Past series (also known for the first book “The Three-Body Problem”), is just a fairy tale: Interesting to think about, there is a moral but it’s not based on reality.
Proof: We are still here.
The Dark Forest fairy tale is a solution to the Fermi paradox: If there are billions of planets like Earth out there, where is everyone? The Dark Forest claims that every civilization that is foolish enough to expose itself gets wiped out.
Fact: We have exposed ourselves for millions of years now. Our planet has sent signals “lots of biological life here” for about 500 million years to anyone who cares.
Assuming that the Milky Way has a size of 100’000 light years, this means every civilization out there has known about Earth for at least 499.9 million years. If they were out to kill us, we would be long gone by now. Why wait until we can send rockets to space if they are so afraid of any competition?
How would they know about us? We can already detect planets in other star systems (the current count at the writing of this article is 4604, see http://www.openexoplanetcatalogue.com/). In a few years, we’ll be able to identify all the planets close to us which can carry life, for values of close around 100 light years. A decade later, I expect that to work for any star system 1’000 light years around us. In 100 years, I expect scientists to come up with a trick to scan every star in our galaxy. An easy (if slow) way would be to send probes up and down out of the disk to get a better overview. Conclusion: We know a way to see every star system in the galaxy today. It’s only going to get better.
Some people worry that the technical signals we send could trigger an attack but those signals get lost in the background noise fairly quickly (much less than 100 light years). This is not the case for the most prominent signal: The amount of oxygen in Earth’s atmosphere. If you’re close to the plane of the ecliptic (i.e. when you look at the sun, the Earth will pass between you and the sun), you can see the oxygen line in the star’s spectrum for thousands of light years. Everyone else has to wait until Earth moves in front of some background object.
There is no useful way to hide this signal. We could burn the oxygen, making Earth inhospitable. Or we could cover the planet with a rock layer; also not great unless you can live from a rock and salt water diet.
For an economic argument: When Rwanda invaded the Democratic Republic of Congo to get control of Coltan mining, they made roughly $240 million/yr from selling the ore. China makes that much money by selling smart phones and electronics to other states every day (source: Homo Deus by Yuval Harari). My take: killing other civilizations is a form of economic suicide.
Conclusion: The Dark Forest is an interesting thought experiment. As a solution for the Fermi paradox, I find it implausible.
What are Software Developers Doing All Day?
6. November, 2021
Translate.
Mathematics? Nope. I use the trigonometric functions like sin(x) to draw nice graphics in my spare time but I never use them at work. I used a logarithm last year to round a number but that’s about it. Add, multiply, subtract and divide is all the math that I ever do and most of that is “x = x + 1”. If I have to do statistics, I use a library. No need to find the maximum of a list of values myself.
So what do we do? Really?
We translate mumble into Programming Language of the Year(TM).
Or more diplomatically: We try to translate the raw and unpolished rambling of clients into the strict and unforgiving rules of a programming language.
We’re translators. Like those people who translate between human languages. We know all the little tricks how to express ourselves and what you can and can’t easily express. After a while, we can distinguish between badly written code and the other kind, just like an experienced journalist.
#TeamSeas
2. November, 2021
Upset by all the plastic in the oceans?

Check https://teamseas.org/ for details.
Another Reason to Avoid Comments
16. April, 2022
There is this old saying that if you feel you have to write a comment to explain what your code does, you should rather improve your code.
In the talk above, I heard another one: