Choosing the right digital tools

Choosing a programming language

Back in 2014 I participated in Fosscomm with the talk "What programming language should I choose next". The concept of the talk was to introduce the steps we have to take in order to choose the programming language for our next project.

Now I am much more experienced and can offer more insight into this subject. I have wanted to write about it ever since. Let’s hope this will be a good starting point for me to write on this blog more frequently again :-)

The main points that I will go through are:

Programming language characteristics
Occasions for choosing a programming language
Conclusion

The following content is mostly based on my personal thoughts and experience and it is intended to be considered as food for thought rather than anything else.

Programming language characteristics

There are countless programming languages. Some are created to solve problems in a specific domain (domain-specific languages, such as AWK for text processing), while others try to be more generic and solve problems across many domains (general-purpose languages, such as C and Java).

Domain-specific languages may be the best or easiest way to solve particular problems, and comparing them with general-purpose languages may be unfair, depending on the context.

So from now on, when I refer to programming languages, I have general-purpose languages in mind.

Every programming language has some main characteristics that distinguish it from some languages and unite it with others. These characteristics implement particular concepts that may be more suitable for some circumstances than for others.

Mastering these concepts is essential for choosing the best-fitting language for a particular task and, furthermore, for quickly picking up a new language that implements concepts you already know.

Let’s see the basic concepts that I think are worth mentioning.

Compiled vs Interpreted

A main characteristic of a language is whether it is compiled or interpreted.

By compiled we mean that the default implementation of the language provides a compiler that translates the source program to machine code.

An example of such a language is C or C++.

By interpreted we mean that the default implementation of the language provides an interpreter that executes the source code on the fly.

An example would be Perl or PHP.

I refer to the default implementation because the truth is that any language may be compiled or interpreted. However, the default implementation is the one that is widely adopted and best supported. Some languages, such as Haskell, provide both a compiler and an interpreter.

The advantage of a compiled language is that the program will generally run faster. However, a binary executable has to be built for each target architecture. Changes to the source code require recompilation before the updated program can run. This can be time-consuming in large programs such as a video game and disrupts the development process more.

The advantage of an interpreted language is that we can change the source code and test it at once. This makes development much easier, as we can test our changes much faster. Also, the source code is distributed as it is and runs wherever an appropriate interpreter is installed. Of course, the program typically runs slower than a compiled one, as the translation happens at run time.

Some languages, such as Java, combine the two techniques. Java compiles the source code to bytecode, an intermediate representation of the program, and the Java Virtual Machine (JVM) executes the bytecode. The JVM also applies optimizations such as JIT (just-in-time) compilation, which translates frequently executed bytecode to machine code at run time and improves performance considerably.

Programming paradigms

The compiler or interpreter provided by the default implementation isn’t strictly a language characteristic, but in my opinion it is essential to have at least a basic understanding of these things.

The next thing we are going to mention is definitely a language characteristic: the programming paradigms a language supports.

I won’t list the programming paradigms here. You can check Wikipedia for that. I’ll just mention that a programming paradigm dictates the way we approach a problem and how we use the programming language at hand to model a solution. It strongly affects the development process. Some paradigms are better suited to specific tasks and may result in more elegant solutions and more readable code (at least for those familiar with the paradigm).

A programming language may implement at least partially more than one paradigm. Having a basic understanding of some paradigms may be a great asset in one’s programming arsenal.
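To make this concrete, here is a minimal sketch of one small task, summing the squares of the even numbers in a list, written in two styles that Java supports. The class and method names are my own and purely illustrative, and the sketch assumes Java 9 or newer because of List.of.

```java
import java.util.List;

// Hypothetical demo class: the same task solved in two paradigms.
class ParadigmDemo {

    // Imperative style: spell out how to loop and accumulate.
    static int sumEvenSquaresImperative(List<Integer> numbers) {
        int sum = 0;
        for (int n : numbers) {
            if (n % 2 == 0) {
                sum += n * n;
            }
        }
        return sum;
    }

    // Functional style: describe what to compute as a pipeline.
    static int sumEvenSquaresFunctional(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .mapToInt(n -> n * n)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4);
        System.out.println(sumEvenSquaresImperative(numbers)); // prints 20
        System.out.println(sumEvenSquaresFunctional(numbers)); // prints 20
    }
}
```

Both methods return the same result. Which one reads better depends mostly on which paradigm the reader is used to.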

Also, some other important things, such as design patterns, may be tied to particular programming paradigms. Most of the well-known design patterns, for example, are related to the object-oriented paradigm.

Type checking system

The next thing that we should have in mind is the type checking system.

The type checking system of a language is the mechanism that checks, according to specified rules, whether values are used in ways that their types permit. In other words, a type-safe language is one that shields the programmer from using types in an undesired way.

Java is such a language. Here is an example.

//this fails with a compiler error
//error: incompatible types: int cannot be converted to String
String one = 1;

Beyond that, the extent to which type safety is enforced, and the way it is enforced, may vary. For example, some languages check for type errors at compile time while others check at run time.
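Even a mostly statically checked language like Java leaves some checks for run time. The following is a small sketch of that, with a hypothetical class and method name of my own: the explicit cast compiles, and the type error only appears when the line executes.

```java
// Hypothetical demo class: a type error that only shows up at run time.
class RuntimeTypeCheckDemo {

    // Compiles fine: the compiler accepts the explicit cast.
    // The JVM verifies the cast when this line actually executes.
    static Integer castToInteger(Object value) {
        return (Integer) value;
    }

    public static void main(String[] args) {
        System.out.println(castToInteger(42)); // prints 42
        try {
            castToInteger("forty-two");
        } catch (ClassCastException e) {
            // Reached because a String is not an Integer.
            System.out.println("Caught at run time: ClassCastException");
        }
    }
}
```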

Of course, there are languages that have limited or no type safety. This doesn’t mean that such a language is inferior to others. There are some merits: the development process may be faster and the learning curve gentler, as the syntax tends to be simpler.

Such a language is JavaScript. Here is an example:

var h = "hello, ";
var w = 5;
alert(h+w);   // prints hello, 5
alert(w+4);   // prints 9
alert(h+w+4); // prints hello, 54

As you can see there aren’t many restrictions on how the types may be used.

Type checking may be more important than it looks. Facebook created the Hack language a while back, aiming to produce better-quality code faster.

As mentioned on Hack’s website, "Hack reconciles the fast development cycle of a dynamically typed language with the discipline provided by static typing".

If big companies consider type safety such an important subject and invest time and money on this, then we should definitely have at least a basic understanding of type checking systems.

Memory management

Another important characteristic of a programming language is the memory management system that it uses.

Some languages require the programmer to allocate and free memory by hand, which of course requires more work. This provides better performance and control, at a higher risk of human error. Such a language is traditionally C.

On the other hand, there are languages that provide a garbage collection mechanism. This is much safer, and new programs are easier and faster to write, but of course it costs performance. Such a language is Java.

As always the best approach depends on the application that we desire to write and the problem we aim to solve.

In the case of garbage-collected languages, it is essential to have a basic understanding of the available collection strategies and the lifecycle of objects. In the case of manually managed memory, one should be very careful not to cause memory leaks or memory errors, such as using memory after it has been freed, that can turn into security holes.
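As a rough illustration of object lifecycle in a garbage-collected language, the sketch below uses Java's WeakReference to observe an object without keeping it alive. The class and helper names are hypothetical. Note that System.gc() is only a request, so the last line may print either message.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class: reachability decides when an object may be collected.
class LifecycleDemo {

    // True while the referenced object has not been collected.
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object data = new Object();
        // A weak reference does not keep its target alive by itself.
        WeakReference<Object> weak = new WeakReference<>(data);

        // Still strongly reachable through 'data', so it cannot be collected.
        System.out.println(weak.get() == data); // prints true

        // Drop the only strong reference: the object is now eligible for collection.
        data = null;
        // Only a hint: the JVM decides if and when to actually collect.
        System.gc();
        System.out.println(isAlive(weak) ? "not collected yet" : "collected");
    }
}
```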

Syntactic sugar

Another factor of a programming language that I find worth considering is syntactic sugar.

By syntactic sugar I refer to all the ways a programming language can express a solution to a particular problem elegantly. Often the elegant solutions are the more compact and more readable ones.

This is not a decisive factor, but languages that provide these facilities may be somewhat faster to use when developing a new application and may help the developer to be more expressive.

An example of syntactic sugar would be incrementing an integer in Lua and in C:

Lua

i = i + 1

C

i++;

These "shortcuts" may be harder to read for a newbie but they feel pretty natural if you get familiar with them.
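Java offers a similar case. In the sketch below (class and method names are made up for illustration), the for-each loop is sugar over an explicit Iterator, and the second method shows roughly what the first one expands to.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: the for-each loop is sugar over an explicit Iterator.
class SugarDemo {

    // Sugared form: compact and readable.
    static String joinWithForEach(List<String> words) {
        StringBuilder out = new StringBuilder();
        for (String word : words) {
            out.append(word);
        }
        return out.toString();
    }

    // Roughly what the loop above expands to when the sugar is removed.
    static String joinWithIterator(List<String> words) {
        StringBuilder out = new StringBuilder();
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            String word = it.next();
            out.append(word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "c");
        System.out.println(joinWithForEach(words));  // prints abc
        System.out.println(joinWithIterator(words)); // prints abc
    }
}
```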

Libraries, frameworks and community

Last but not least an important factor is the toolset that is available for a specific language and the whole ecosystem around it.

A vibrant community means that learning material is broadly available, the development of the language is more active, and mature libraries, frameworks and other tools probably already exist.

All these small details are actually very important. For example, a powerful framework that lets you build your next web app in a fraction of the development time compared to other frameworks would be a pretty good reason to choose a particular language.

Furthermore, a larger community means a larger user base. This means more projects in the particular language, which leads to more jobs around that language and the relevant technologies.

Occasions for choosing a programming language

Now that we have a basic understanding of the characteristics of a programming language and can recognize them across various languages, let’s see the occasions we may face when we have to choose a programming language for our next big project.

There are two occasions: the project may be a personal one for fun or a professional one for profit.

This is the most important factor that we should have in mind for making the particular choice.

If the project is personal, then one should aim to gain experience through it. So, choosing new languages, frameworks, libraries and technologies is the way to go. Mind that learning a new language is really meaningful only if it introduces new concepts. Learning just another language that is similar to one you already know doesn’t add much value. In the end, it’s just syntax.

By new concepts I refer to the characteristics that we analyzed earlier. Someone who is familiar with these characteristics and has hands-on experience with them can easily jump to a new language that embodies similar concepts. Learning syntax is easy; learning more advanced things such as a paradigm is the real challenge.

On the other hand, in a professional environment we have to choose the easiest solution. This means the solution that will deliver the desired result fastest with the available workforce. If the workforce is insufficient, we have to take into account the availability of new hires with the necessary skills, or the time required for an inexperienced new hire to master the required technologies.

Any existing code base has to be taken into account in this decision. We have to be familiar with the legacy technologies used, or take into account the time required to move the project to newer technologies. In that case, we should consider whether there are reasons that will force us to move to newer technologies in the future anyway.

During my Fosscomm talk I presented a formula that I created for deciding between two languages, La and Lb. First we calculate the cost of each language Lx:

COST_FUNCTION(Lx, TIME) = (LTLx + ECTLx + FTECLx + LIBECLx + TECHECLx + OTHERECLx) + (TLx + FTLx + LIBLx + TECHLx + OTHERLx) * TIME

Where:

LTLx: The time required to master the language Lx. LTLx = 0 if Lx is already mastered

TLx: The time required to maintain and further develop the application if we plan to use the language Lx for a period of length TIME

ECTLx: The time required to rewrite the existing code base to the target language

FTECLx: The time required to master the used frameworks of the existing code-base

LIBECLx: The time required to master the libraries of the existing code-base

TECHECLx: The time required to get familiar with the other technologies already used

OTHERECLx: Other factors related to the existing code base

FTLx, LIBLx etc. are the time required for each factor (frameworks, libraries and so on) if we are going to use them for a period of length TIME.

So, in order to choose which language is best to use, we have to compare their costs and pick the one with the smaller cost. For example, if the following holds, Lb is the better choice:

COST_FUNCTION(La, TIME) > COST_FUNCTION(Lb, TIME)

If the costs are equal, then we may use further criteria, such as the personal preferences of the development team, the average cost of hiring programmers for the particular language, or the availability of programmers for that language in the market.
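The comparison can be sketched in a few lines of Java. This is only an illustration under my own assumptions: the one-off terms are collapsed into a single number, the recurring terms are treated as a cost per unit of TIME, and all names and figures (in person-months) are made up.

```java
// Hypothetical sketch of the cost formula; names and numbers are illustrative.
class LanguageCost {

    // One-off costs: LT + ECT + FTEC + LIBEC + TECHEC + OTHEREC.
    final double oneOffCost;
    // Recurring costs per unit of TIME (an assumption): T + FT + LIB + TECH + OTHER.
    final double recurringCostPerPeriod;

    LanguageCost(double oneOffCost, double recurringCostPerPeriod) {
        this.oneOffCost = oneOffCost;
        this.recurringCostPerPeriod = recurringCostPerPeriod;
    }

    // COST_FUNCTION(Lx, TIME) = one-off costs + recurring costs * TIME
    double cost(double time) {
        return oneOffCost + recurringCostPerPeriod * time;
    }

    // Returns the name of the cheaper option over the given period, or "tie" if equal.
    static String cheaper(String nameA, LanguageCost a, String nameB, LanguageCost b, double time) {
        int cmp = Double.compare(a.cost(time), b.cost(time));
        if (cmp == 0) {
            return "tie";
        }
        return cmp < 0 ? nameA : nameB;
    }

    public static void main(String[] args) {
        // Made-up figures: La is already mastered, Lb needs a rewrite but is cheaper to maintain.
        LanguageCost la = new LanguageCost(0, 3);
        LanguageCost lb = new LanguageCost(12, 1);
        System.out.println(cheaper("La", la, "Lb", lb, 2));  // prints La
        System.out.println(cheaper("La", la, "Lb", lb, 10)); // prints Lb
    }
}
```

With these invented figures the two options break even at TIME = 6, which shows why the planned lifetime of the project matters as much as the up-front learning or rewriting cost.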

Conclusion

The point that I would like to make is that when we have the opportunity, we should experiment and learn new things. In the context of programming languages, these new things are all the characteristics that I have mentioned above. By getting familiar with them, we will be able to understand new languages better and learn them more quickly when required.

All this knowledge will also be a great asset when we have to make choices in our work environment, especially if we are encouraged to use new technologies in order to enrich our company’s know-how. This is more common in smaller companies.

On the other hand, if we work at a larger company, it is not easy to experiment. Business is about profit, and the tech-related choices should be the ones that will benefit the company the most. Our knowledge will again be useful for making the best choices.

So study new concepts in your spare time, not syntax! Don’t waste your time! In your work environment, stick to what you have mastered, unless there are opportunities for experimentation.

Please feel free to share in the comments other language characteristics that you think are important. I would love to read your thoughts on the subject.

Have fun and keep coding!