The 0 Code Solution

The short story

Last week I had a meeting with a client who was using Anki to teach English to his students.

He wanted to extend it to support a new use case: students would talk to the app and get a real-time transcription in the answer field of the English question he had given them.

This was so that students could rely on their actual spoken grammar and vocabulary, rather than their written skills, while doing exercises.

In these times of Whisper and artificial intelligence tools, I thought this could be somewhat plausible, and told him I would research this.

Adding a new feature to Anki

I took a look at the desktop app and researched how it was implemented. It turned out to use QtWebEnginePage for the cards, which runs an instance of the Chromium web engine to build hybrid apps.

I first tried implementing a widget in HTML and JavaScript that would ask for microphone permission, make a recording, and then send it to a FastAPI server running Whisper.

The idea was to inject this HTML and JavaScript code into the HTML of the cards for seamless integration.

It did not work because of the security model of QtWebEnginePage: upon reaching the instruction that requested the audio permission needed for the recording, the JavaScript line was simply dropped and execution stopped.

I tried other approaches, such as making an Anki plugin to access the microphone using the native interface that Anki provides, but I could not find a clear way of accessing the microphone using Anki’s existing add-on API.

I moved on to the AnkiDroid app and ran some tests in my environment to check whether the first approach would work there. It didn’t.

Then I looked at the keyboard of the Pixel 7 Android emulator I was using in Android Studio and noticed the microphone button for Google’s transcription services.

You could simply set up Google’s keyboard on your device and use it to cover this client’s use case by pressing the voice transcription button, speaking to the phone, and letting it write the transcription in the text box.

All without a single line of custom code.

After 3 days of intense research and years of ego investment as a software developer, this realization hit me like a bag of bricks and sent me spiralling into existential dread.

It made me wonder what the real value of my work was in a world where everything had already been invented, whether I was going to get paid at all, and what my value as an engineer would be in the years to come.

I got paid, but I realized that I had to improve my thinking and analysis skills in future projects.

The long story

This is not the first time something like this has happened. There is a legendary story about Doug McIlroy and Donald Knuth, in which McIlroy critiqued a 10-page program written in Pascal by Knuth.

You should check this article; it has interesting points about what a good engineering solution is.

McIlroy was able to provide the same correct solution as Knuth using the following Unix script.

tr -cs A-Za-z '\n' |
tr A-Z a-z |
sort |
uniq -c |
sort -rn |
sed ${1}q
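
To see what each stage of the script above contributes, here is a minimal sketch that wraps it in a shell function and runs it on a short sentence. The function name `wordfreq` and the sample sentence are assumptions made for illustration, and `${1}` is quoted as a small defensive tweak; the pipeline itself is unchanged.

```shell
# Hypothetical wrapper name; the pipeline body is the one quoted above.
wordfreq() {
    tr -cs A-Za-z '\n' |   # replace every run of non-letters with a single newline
    tr A-Z a-z |           # lowercase everything so "The" and "the" match
    sort |                 # bring identical words next to each other
    uniq -c |              # collapse duplicates and prefix each word with its count
    sort -rn |             # order by count, highest first
    sed "${1}q"            # print the first N lines, then quit
}

# Invented sample input: "the" appears 3 times and "and" appears twice.
printf 'the cat and the hat and the bat\n' | wordfreq 2
```

With this input the two lines printed are the counts for "the" (3) and "and" (2); the amount of padding in front of each count depends on the `uniq` implementation.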

In Knuth and McIlroy’s era they had Unix pipes; today we have ChatGPT, GitHub Copilot, or Google’s keyboard for Android.

The computer programming industry has been reinventing the wheel since its inception.

Sometimes this happens out of ignorance, such as not knowing that Google’s keyboard had transcription services, and sometimes out of actual need due to changes in computer architecture, mistakes in separating the concerns of the code itself, or hitting the limits of a system.

Other times it happens because trying to replicate existing solutions is an excellent learning opportunity. Sometimes it might even be fun.

However, with such an explosion and availability of tools in the modern technological ecosystems of the web, mobile and desktop, we shouldn’t have to do this as often.

The answer that McIlroy gave was that most computational tasks are really transformations of data that can be carried out by a finite set of well-designed utilities joined by pipes.

Those utilities seem useless by themselves; it is only after understanding pipes that the user can really leverage their power.

The insight here is that through the composition of off-the-shelf tools and utilities, we can reach a much larger space of solutions than with monolithic ones.

But only if we know they exist and if we know how to use them.
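
As a rough illustration of that composition, the counting half of the word-frequency pipeline can be reused unchanged for a different problem. In the sketch below, the helper name `top_values` and the three log lines are invented for the example; only standard utilities (`cut`, `sort`, `uniq`, `sed`) are used.

```shell
# Hypothetical helper: counts identical input lines and prints the N most frequent.
top_values() {
    sort |
    uniq -c |
    sort -rn |
    sed "${1}q"
}

# Invented sample log lines: method, path, status code.
# cut extracts the third space-separated field, then top_values counts it.
printf 'GET /a 200\nGET /b 404\nGET /c 200\n' |
    cut -d' ' -f3 |
    top_values 1
```

Swapping `cut` for another small filter changes the question being asked without touching the counting logic, which is the point of joining small utilities with pipes.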

What is a 0 code solution?

I would define it as a solution that uses existing tools, adequately configured by a user who does not need the esoteric knowledge of programming languages that we as computer engineers hold.

However, you could argue that all technological knowledge is, at some level, esoteric.

The difference between computer technology and all others is that you just cannot see the complexities involved or leverage your intuitions about physical reality to use it.

Where do you draw the line? If you achieve your solution using an Excel spreadsheet, is that a 0 code solution?

If you use a system call on your operating system without doing complex algorithmic procedures, is that a 0 code solution?

Speaking about code versus codeless solutions is probably the wrong frame to operate in.

When we step outside the ego investment that we as technologists have in a particular set of technologies, the kind that casts us in a role such as “I am a C++ programmer”, “I am a web dev” or “I use arch BTW”, what we find is that these are just tools.

Tools are the main concept of the human experience that separates us from other beings on this planet.

“For the man who has only a hammer, everything looks like a nail.”

The lesson is about detachment

Be detached from the tools you use.

If you want to achieve excellence, you need to use the best thing available to reach your desired outcome or vision.

Miyamoto Musashi believed that a true warrior should not have a favorite weapon, nor likes and dislikes.

He believed that becoming over-familiar with one weapon was as much a fault as not knowing it sufficiently well.

This philosophy emphasized the importance of versatility and adaptability in combat.

How can we adapt this to software engineering?

One option is to work towards knowing as much as possible about the platform we will be working with.

If that is the web, you should know about the existence and capabilities of the HTML5 standard and the APIs that modern web browsers need to provide.

If that is the mobile, desktop or server world, you need to know how to use the API of your operating system.

Recurrent solutions are grouped into frameworks that provide an additional layer of (hopefully) clean abstractions over these APIs to avoid reinventing the wheel.

You should learn at least one of them per group so that you can catch yourself when facing a problem that has already been solved.

It is not about ReactJs or Angular, but about reactive front-end frameworks.

Some of these abstractions, such as in the case of Docker and container technologies, are about being able to achieve composable and replicable systems.

And the final product of all this should be reusable applications that you need to know how to use in creative and new ways to achieve a certain outcome.

It is the outcome, the expectation or the desired experience that should be the focus, not the tool itself.

But if we think more generally in terms of problem-solving, you need to go even further as an analyst or researcher.

You need to precisely understand the problem at hand, all its variables, which might involve multiple fields and causes such as economics, politics, physics, mathematics…

You need to understand the environment of the problem you are trying to solve, which in some cases might require years of work, a team of extraordinary people with different talents and areas of expertise, or being a specialized generalist in the right areas.

For example, when talking about privacy on the internet or the problem with piracy and intellectual property, one of the common mistakes of technology people is to think in terms of technological solutions when the problem has political, social and economic factors alongside its technological component.

The same could be said about some political or legal problems, which could be solved by rethinking current systems and the dialectic relationships present in the environment in completely new ways by leveraging technology.

Regarding piracy, the availability of digital assets tends to push their price towards zero. The solution that Netflix proposed was to provide additional value in the form of convenience, something that people with money will always pay for. This solution factored in technology, economics and human psychology; ignoring any of them would have made it effectively unreachable.

That is the promise and the danger that digitalization holds, about which we will talk another day.

Conclusion

In the future I will try to think in terms of layers of user-level interaction with these systems to understand where the modification or extension should be made.

First, start from the point of view of the user and what they need to achieve.

Follow up with the desired user experience, then adapt the existing technologies to it, not the other way around.

The story of this article is just more proof that we should spend more time thinking about the problem before jumping to a solution.

If I had started by thinking about this problem slowly through different lenses instead of going directly to a web widget, I might have caught the Google keyboard approach sooner.

This also shows the importance of being able to detach from a frame or solution to be able to navigate the whole range of possible approaches in order to get the optimal one.

This is a skill in itself that requires detachment on our part as developers and engineers, as one of the problems of seeing a path that could work is that we cannot unsee it and can become blinded by what seems the obvious way of doing things.

Regarding pricing, what the client pays for is the process of finding a solution when one is not readily available.

In the end, even if this solution was 0 code, it was not the optimal one.

The client wants an integrated solution for his use case and we will find one.

See you in the next one.

whoami

Jaime Romero is a software developer and cybersecurity expert operating in Western Europe.