When to put generated code under version control

29. June, 2022

Many people think that when a computer generates code, there is no point to put it under version control. In a nutshell: If you generate the code once with a tool that you’re confident with, there is no point to put under version control. If you need to tweak a lot, version control will make your life so much easier.

Decision tree:

  • Do you need to tweak the options the code generator until everything works? If so, then yes.
  • How confident are you with using the code generator? If not very, then yes.
  • Is the code generator mature? Then not.

Some background: Let’s compare a home-grown code generator which is still evolving with, say, the Java Compiler (which generates byte code). The latter is developed by experienced people, tested by big companies and used by thousands of people every day. If there is a problem, it was already fixed. The output is stable, well understood and on the low end of the “surprise” scale. It has only a few options that you can tweak and most of them, you’ll never even need to know about. No need to put that under version control.

The home grown thing is much more messy. New, big features are added all the time. Stuff that worked yesterday breaks today. No one has time for writing proper tests. In this kind of situation, you will often need to compare today’s output with a “known good state”. There is a dozen of roughly understood config options for many things that might make sense if you were insane. Putting the generated code under version control in this situation is a must have since it will make your life easier.

The next level is that the code generator itself is mature bit it offers a ton of config options. Hypothetically, you will know the correct ones to use before you use the generator for the first and only time. Stop laughing. In practice, your understanding of config options will evolve. As you encounter bugs and solutions, you will need to know what else a config change breaks. Make your life easy and use version control: Config change, regenerate, look at diff, try again.

In a similar fashion, learning to use the code generator in an efficient and useful way will take time. You will make mistakes and learn from them. That won’t stop a co-worker from making the same mistakes or other ones. Everyone in the team has to learn to use the tool. Version control will prevent from one person breaking things for others.

How

Write a parameterized unit test which generates the code in a temporary folder. In the end, each file should be a single test which compares the freshly generated version with the one in the source tree.

Add one test at the end which checks that the list of files in both folders is the same (to catch newly generated files and files which have to be deleted).

Add a command line option which overwrites the source files with the ones created by the test. That way, you can both catch unexpected changes in your CI builds and efficiently update thousands of files when you want.

The logic in the test should be:

expected = content freshly generated file
actual = content  of the file in the source tree 
      or just the file name if the file doesn't exist (makes it
      easier to find the file when the test itself is broken).

if expected != actual, then
    if (overwrite) then copy expected to actual
    assert expected == actual

Use a version of the assert that shows a diff in your IDE. That way, you can open the file in your IDE and use copy&paste out of the diff window to fix small changes to get a feeling how they work.

Or you can edit the sources until they look the way they should and then tweak config options until the tests confirm that the code generator now produces the exact desired result.

Bonus: You can tweak the generated code in your unit test. It’s as simple as applying patches in the “read content of the freshly generated file” step. One way you can use this is to fix all the IDE warnings in the generated code to get a clean workplace. But you can also patch any bugs that the code generator guys don’t want to fix.

Workaround

If you don’t want to put all generated code under version control, you can create a spike project to explore all the important features. In this spike, you create an example for every feature you need and put the output under version control. That way, you don’t have to put millions of lines under version control.

The drawback is that you need a team of disciplined individuals who stick to the plan. In most teams, this kind of discipline is shot in the back by the daily business. If you find yourself in a mess after a few weeks: Put everything under version control. It’s a bit of wasted disk space. Say, $10. If you have talk about it with the team for more than five minutes, that’s already more expensive.


Another Reason to Avoid Comments

16. April, 2022

There is this old saying that if you feel you have to write a comment to explain what your code does, you should rather improve your code.

In the talk above, I heard another one:

A common fallacy is to assume authors of incomprehensible code will somehow be able to express themselves lucidly and clearly in comments.
– Kevlin Henney


What are Software Developers Doing All Day?

6. November, 2021

Translate.

Mathematics? Nope. I use the trigonometric functions like sin(x) to draw nice graphics in my spare time but I never use them at work. I used logarithmic last year to round a number but that’s about it. Add, multiply, subtract and divide is all the math that I ever do and most of that is “x = x + 1”. If I have to do statistics, I use a library. No need to find the maximum of a list of values myself.

So what do we do? Really?

We translate mumble into Programming Language of the Year(TM).

Or more diplomatic: We try to translate the raw and unpolished rambling of clients into the strict and unforgiving rules of a programming language.

We’re translators. Like those people who translate between human languages. We know all the little tricks how to express ourselves and what you can and can’t easily express. After a while, we can distinguish between badly written code and the other kind, just like an experienced journalist.


When stupidity meets cleverness

31. October, 2021

we get a potency of impossibilities.


Good and Bad Tests

16. January, 2017

How do you distinguish good from bad tests in your code?

Check these criteria. Good tests

  • Nail down expectations
  • Monitor assumptions
  • Help to locate the cause of a failure
  • Document usage patterns
  • Allow to change code
  • Allow to verify changes
  • Are short (LOC + time)

Bad tests

  • Waste development time
  • Execute many, many lines of code
  • Prevent code changes
  • Need more time to write than the code they test
  • Need a lot of code to set up
  • Take ages to execute
  • Are hard to run

Expectations

There are a lot of checks in your compiler. Those help to catch mistakes you make. Do the same with your tests. There are a lot of things that compilers don’t check: File encodings, existence of files, existence of config options, types of config options.

Use tests to nail down your expectations. Read config files and validate the odd option.

Create a test which collects the whole configuration of your program and checks it against a known state. Check that each config option is set only once (or at least that it has the same value in all places).

When you need to translate your texts, add tests which make sure that you have all the texts that you need, that texts are unique, etc.

Assumptions

Convention over configuration only works when everyone agrees what the convention is. Conventions are assumptions. Your brain has to know them since they are no longer in the code. If this approach fails for you, write a test that validates your assumptions.

Check that code throws the exceptions that you expect.

If you have found a bug in a framework and added a workaround, add a test which fails when the bug is fixed. Add a comment “If this test fails, you can remove the workaround.”

Speed

The world speeds up. No one can afford slow tests, tests that are hard to understand, hard to maintain, tests that get in the way of Get-Things-Done™. Make sure you can run your tests at the touch of a button. Make sure they never fail unless they should. Make sure they fail when they should. Make sure they are small (= execute fast, few lines to understand, little code to write, easy to change, cheap to delete). Ten tests, each asserting a single fact, are better than one test that asserts ten facts. If your tests run for more than ten seconds, you lose.

Documentation

There is code rot. But long before that, there is documentation rot. Who has time to update the comments and documentation after a code change?

Why not document code usage in tests? Tests tell you the Truth™. Why give someone 100 pages of words they can’t trust when you can give them 100 unit tests they can execute?

Conclusion

Make your life easier. Stop wasting time in your debugger, begging for production log files, running code in your head. Write a good test today, it will watch your back for as long as the project lives. Write a thousand good tests and they will be like an army of angels, warding you from suffering, every time you press that button.


Testing Fonts for Software Developers

11. September, 2015

Characters that you need to be able to distinguish clearly:

0O – Zero and upper case o
l1I – Lower case L, one, upper case i
Z2 – Upper case z and two
S5 – Upper case s and five
G6 – Upper case g and six
B8 – Upper case b and eight
71 – seven and one
lI – Lower case L, upper case i
vy – lower case v and lower case y

Just copy and paste into your favorite code editor.

Fonts you should try:


TNBT: Proactive IDEs

13. February, 2015

Imagine this situation: You’re working on some code and you get an exception when you run the unit tests. Next to the output is a link with the text: “User Joe had the same exception two months ago and fixed it with the commit b8cfda02.”

How would that work? We’re using big data for all kinds of things, tracking customer happiness, searching the Internet and discovering terrorist threats (or not).

Standard development teams have about 10 people. That means you have a super computer with 40-80 cores, 160 GB of RAM and 20 TB of disk space connected with a fast LAN in your office already. That beast is usually idling while it waits for the developers to press keys. It would be pretty simple to install a clustered log analyzer on this hardware which simply reads all the log files and reports which Maven and running JUnit test creates. It would be as simple to connect the same database to your version control. That means this system could track all the errors and exceptions that you get when you run unit tests or the whole application.

This information could then be used to detect when someone in the team gets a new exception plus the change sets which fixes them. If the system detects an exception which it has seen before, it can tell you which developer has fixed it or who is currently working on it – instead of wasting your time, you could see the code which contains the solution or ask someone who has already solved the problem.

With proper filtering, the data could be split into internal and framework code. That way, the system could report to library projects where consumers struggle most.

On the large scale of things, this system can tell you which parts of the system are most brittle.

As usual with big data, there are some downsides. The same system would tell you which developer breaks the code most often. Who writes the worst code. If your manager isn’t able to see the human value in his charges, this might not be your best bet.

Related Articles:

  • The Next Best Thing – Series in my blog where I dream about the future of software development

Jazoon 2013 – 33 things you want to do better

25. October, 2013

Jazoon 2013 badgeTom Bujok listed a lot of methods, technologies and frameworks that you should be aware about in his talk “33 things you want to do better” (slides on speakerdeck)

At the beginning he reminded us how quickly a well designed system goes bad due to hurried changes. We need to be aware of our technical debt and we need to allocate time to spend on reducing it (slides 3-12).

As an example, car batteries are easy to find. They are a replacement part, designers and engineers make it easy to find. Compare this to the configuration of your project. If you need to change it, how easy is it to find the file that needs to be changed and then the place in the file?

Another important point is skills. In most other professions, you have some mastery of a skill before you use it. You train hundreds of hours before you play your first football game. In Software, we show you a computer, we show you the programming language of the year (not necessarily this year’s). There is no time to master the tools you have to use from day one (slides 13-15).

“We are what we repeatedly do; excellence then, is not an act but a habit.” – Aristotle

Or as Wikipedia defines it:

“Habits […] are routines of behavior that are repeated regularly and tend to occur subconsciously.”

Stop wondering why you always make the same mistakes – they’re habits. Eliminate them ASAP (slide 19):

Bad Habits – Katherine Murdock “The Psychology of Habit”:

  • Recognize bad habits and eliminate them ASAP
  • The older you get the more difficult it is to remove a bad habit
  • Each repetition leaves its mark!

Turning bad habits into good ones – Dr. Michael Roussell, PhD.:

  • You can’t erase a habit, you can only overwrite one.
  • Insert the new habits into the current habit loops

Bad Habits

Configure your IDE properly and remove bad defaults. Replace “ex.printStackTrace();” with “throw new RuntimeException(ex.getMessage(), ex);” (slides 43-45).

One bad habit is empty catch blocks with “can never happen” comments. If you see one during a code review, replace it with “System.exit(-1);”. It can never happen, right? Right? (slides 46-47).

Note: I have create a “ShouldNotHappenException” for this case 🙂

Another one is to make every method in a static helper class public. Maybe some of them can be package private? (slide 48)

Learn about other good habits. Read books like “Effective Java” (Joshua Bloch) and “Clean Code” (Robert C. Martin) (slide 49)

Use code reviews to notice bad habits and to spread knowledge in your team but prevent blame games. (slide 73)

Learn the keyboard shortcuts of your IDE (slide 78)

Remember (slide 79):

“Any jackass can kick down a barn, but it takes a good carpenter to build one.” – Sam Ryburn

Projects You Should Know

Project lombok and lombok-pg – In a nutshell, these hook into the Java compiler and generate additional bytecode when certain annotations are present. Bored with getters, setters, hashCode() and equals() plus a nice toString()? Use @Data (slides 21-28).

Guava (slides 29-33)) is a great library with many tools that you have been missing in Java for years. You might also want to look at commons-lang.

Want to use lambda expressions but can’t upgrade to Java 8? Then lambdaj is for you (slides 34-39).

Logging? slf4j (40-42). Especially nice when combined with the @Slf4j annotation from lombok.

Bored to write all that boiler plate code to create all those services and managers that form your app? Look at Guice or Spring.

Use Spock to make tests more compact and easier to understand. (slides 50-52)

Unitils contains all the helper functions that we always missed in JUnit and Hamcrest (53-56).

JUnitParams will help you run tests with different parameters (57-59).

Need to wait for something during a test? Awaitility will help. (60-61)

When mocking isn’t enough and you need to inject code during a test, Byteman is the tool you want to look at (62-63)

Getting bored writing boiler plate code in Java to make a compiler happy? Have a look at Groovy. (64-67)

How about adding dependencies to your scripts? Try Grape. (68-69)

Is your build a mess? Do you feel Maven is too verbose or too limiting? Gradle might be for you. (70-72)

Version control is slowing you down? Have a look at Mercurial or Git (75)

Use bash or Python to automate man-/menial work. If you’re on Windows, look at Babun or Cygwin. (76-77).


Quick: How Does Quicksort Work?

24. June, 2013

Every now and then, we need to remember how some complex algorithm works. For some, code works well. Or puzzles. Others prefer an explanation with examples. A third group likes to see what is going on.

The AlgoViz.org portal contains visualizations of all kinds of algorithms for the visually inclined.

 


Things Users Don’t Care About

8. February, 2013

Things users don’t care about” is something every software developer needs to know about.

Kudos go to Thomas E. Deutsch for finding and telling me about it.


%d bloggers like this: