Programming Languages

Because code is written at the earliest step of the development cycle, how it is written and what it is written in can help us improve the overall quality and security of the system.

Programming languages exist to make writing, and thus also maintaining, software an easier, faster, safer and more pleasant experience. It should be trivial to see how assembly was a step up from machine code, and how languages such as Fortran, and later C, were a step up from both assembly and each other. Most modern code today is not written in C: it has manual memory management, undefined behaviour, no modules and no interface-like construct. Programming languages continue to evolve, providing new abstractions, features and tooling. In this section we turn our attention to some older and newer ways we can improve our programs with regard to robustness and readability.

Mutability

Refinement, Neat

A bad default.

// JavaScript / TypeScript
const a = 32;
// vs
let b = 32;

rust

// Rust
let mut a = 32;
// vs
let b = 32;

Very Relevant on Topic

https://research.google/pubs/mutable-value-semantics/

https://www.swift.org/documentation/articles/value-and-reference-types.html

Something about aliasing?

Mutability in reference types

rust

fn foo(t: &Thing) { ... }
// vs
fn foo(t: &mut Thing) { ... }

As with code, most data is read more than it is written. To conform to the principle of least privilege, a variable should only be declared as mutable if it is required.

As with Nullability, it is common for mutability to be the default.

Most languages support declaring immutable variables with keywords such as const, readonly or final. However, the exact meaning of these keywords can be tricky: some denote constants (known at compile time), which is different from a variable that is assigned once at runtime during initialization and cannot be reassigned afterwards.

Immutability can apply both to the binding (the left-hand side) and to the value (the right-hand side) of a variable declaration. Modern languages also allow compound types, such as records, to have immutable fields.
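The distinction between an immutable binding and an immutable value can be sketched in Python. This is a minimal illustration, not a recommendation for any particular codebase: Point, ORIGIN, NAMES and try_move are hypothetical names, and Final is only checked by static type checkers, not enforced at runtime.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Final, List


@dataclass(frozen=True)
class Point:
    """A record whose fields cannot be reassigned after construction."""
    x: float
    y: float


# Final marks the binding as not reassignable. Static type checkers
# report a reassignment; the Python runtime itself does not.
ORIGIN: Final = Point(0.0, 0.0)

# The binding is Final, but the list it refers to is still mutable.
NAMES: Final[List[str]] = ["ada"]
NAMES.append("grace")


def try_move(point: Point) -> bool:
    """Return True if the field assignment succeeded, False if rejected."""
    try:
        point.x = 1.0  # type: ignore[misc]  # rejected for frozen dataclasses
        return True
    except FrozenInstanceError:
        return False
```

Here the frozen dataclass protects the value, while Final only protects the name, which is why the append on NAMES still goes through.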

Static Type Checking

Refinement, Readability, Safety

Discoverability
Type errors at compile time

Programming languages are split into statically and dynamically typed languages. The difference is that dynamically typed languages resolve the type of a value at runtime, while statically typed languages resolve and check types at compile time. The first advantage of static type checking is that it ensures type safety before the program is run, whereas dynamically typed languages must perform this check at runtime. The second advantage is that it documents the capabilities and invariants of an object.

Compared with dynamic typing, static typing by definition provides the compiler (and the future reader) with the types of the program.

This also provides static type safety, as type discrepancies can be found before the program is actually run or tested. We can thus remove a whole class of logic errors from our program.

'Support' for static typing has even appeared in previously dynamically typed languages, such as JavaScript with either TypeScript or JSDoc, Python with type hints, and Ruby with RBS or Sorbet (Crystal, while Ruby-like in syntax, is a separate statically typed language). Consider the following Python code:

python

# Untyped
def surface_area_of_cube(edge_length):
    return f"The surface area of the cube is {6 * edge_length ** 2}."

# With type hints
def surface_area_of_cube(edge_length: float) -> str:
    return f"The surface area of the cube is {6 * edge_length ** 2}."

The typed version tells us that edge_length must be a float, and not a string or other type, and that the function returns a string, which is not something the function name hints at. While Python does support multiplying a string by an integer, it does not support the exponentiation operator on strings, so the untyped version fails with a dynamic type error (a TypeError) at runtime. In contrast, JavaScript will silently produce NaN when multiplying a non-numeric string and a number, and that value may then propagate further through the program. With a static type checker, the typed version simply fails the type check.
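The runtime behaviour described above can be reproduced with a short sketch. The helper call_with_string is a hypothetical name added for illustration; the type: ignore comment only silences a static checker such as mypy, which would otherwise reject the call before the program runs.

```python
def surface_area_of_cube(edge_length: float) -> str:
    return f"The surface area of the cube is {6 * edge_length ** 2}."


def call_with_string() -> str:
    """Call the function with a string, as an unchecked caller might."""
    try:
        # A static type checker flags this argument; without one, the
        # mistake only surfaces when this line actually executes.
        return surface_area_of_cube("3")  # type: ignore[arg-type]
    except TypeError as err:
        return f"runtime TypeError: {err}"
```

Python evaluates the exponentiation first, so the string never reaches the multiplication and the error is raised at the ** operator.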

Nullability

Refinement, Neat, Readability

null, the billion dollar mistake.

Since the inception of the null reference, NullPointerExceptions in Java (and their equivalents in other languages) have been a plague: every reference type is allowed to hold the null reference, which makes calling methods or accessing fields a fallible operation. But it does not have to be like this.

The modern approach is to separate nullability from reference types so that it is clear when an object can be null, as null is of course useful. The two main ways are nullable types (C#, Kotlin, TypeScript), usually written as T? (or T | null in TypeScript), and option types (Rust, Swift, Haskell, Scala), which use a generic wrapper type such as Option<T> (called Maybe in Haskell and Optional in Swift). Some languages (C++, Java) have option types, but using them is not enforced, as regular references or pointers can still be null.

Besides the obvious benefit of knowing when your values can be null, we also obtain some syntax and methods to handle the null cases, be it null coalescing, unwrapping or optional chaining. Examples:

??=, ??, ?., ! (C#, TypeScript, Swift)
?., ?:, !! (Kotlin)
?, unwrap(), unwrap_or() (Rust)

Support for null safety is very language dependent, and as such should be considered when choosing a language. In languages where it is an optional feature (such as C# or TypeScript), you may have to enable it explicitly. Nullable types of course require static typing to be useful; otherwise we cannot distinguish between nullable and non-nullable types at compile time.
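As a rough illustration of explicit nullability, Python's type hints can mark a value as possibly absent with Optional. The USERS table, find_user and display_name are hypothetical names; the guarantee comes from a static type checker such as mypy, not from the Python runtime.

```python
from typing import Dict, Optional

# Hypothetical data used only for this sketch.
USERS: Dict[int, str] = {1: "ada", 2: "grace"}


def find_user(user_id: int) -> Optional[str]:
    """Return the user name, or None when the id is unknown."""
    return USERS.get(user_id)


def display_name(user_id: int) -> str:
    name = find_user(user_id)
    # A type checker rejects name.capitalize() before this None check,
    # because the declared type is Optional[str] rather than str.
    if name is None:
        return "anonymous"
    return name.capitalize()
```

The None check plays the role that null coalescing or unwrapping operators play in the languages listed above.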

Assertions and Panics

Safety, Readability

Handling logic errors.

A check you would rather have at compile time, but cannot.

Enforce invariants
When the programmer does something illegal
Example: Null-checking, bounds-checking, divide-by-zero

Crash when invariants or preconditions are broken

Error, RuntimeException <> Exception (Java)
panic <> Error (Rust, Go)
logic_error <> runtime_error (C++)

Assertions are best known from test code, where they assert a given behaviour of the code under test. However, while somewhat unorthodox, they can also be used to enforce behaviour in regular code.

However, they should only enforce that the programmer is using a function correctly (bounds checking, null checking, etc.), where a violation would be a logic error. They should not be used for regular exceptions and errors, such as parsing, validation and file handling.

Where possible, it is much preferred to use types to uphold invariants, since this encodes the invariant into the parameter directly instead of forcing the caller of the function to perform the validation check.

Use assertions where broken preconditions would otherwise cause a failure deep in the call stack, or unsafe or unsound behaviour.
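The split between logic errors and regular errors can be sketched as follows. Both functions are hypothetical examples: average uses an assertion for a precondition the caller must uphold, while parse_port raises an ordinary exception because bad input is expected at runtime. Note that Python removes assert statements when run with the -O flag, so this is an illustration of the idea rather than a hardening technique.

```python
from typing import Sequence


def average(values: Sequence[float]) -> float:
    # Precondition: an empty sequence is a bug in the caller, so we
    # crash early with a clear message instead of failing deeper down.
    assert len(values) > 0, "average() requires at least one value"
    return sum(values) / len(values)


def parse_port(text: str) -> int:
    # Malformed input is an expected runtime condition, so it is
    # reported as a regular error the caller is meant to handle.
    port = int(text)  # raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```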

https://docs.oracle.com/javase/tutorial/essential/exceptions/runtime.html

Errors as Values

Readability, Robustness

Exceptions

Hide control flow
Every function becomes fallible
Hard to reason with async programming

Difference between errors that should panic or abort vs errors as exceptions that should be handled.

python

def fib(n: int) -> int:
    if n < 0:
        raise ValueError("n must be non-negative")
    ...

try:
    fib(n)
except ValueError:
    print("oops")

python

def fib(n: int) -> int | ValueError:
    if n < 0:
        # Return an exception instance as a value instead of raising it
        return ValueError("n must be non-negative")
    ...


res = fib(n)
if isinstance(res, ValueError):
    print("oops")

Exceptions provide an escape mechanism through which errors flow separately from regular return values.

In most languages, with Java as a notable exception, exceptions are unchecked and not part of a function's type, so you cannot statically see whether, or which, exceptions you should handle.

Furthermore, exceptions can obscure control flow, as they propagate along a different path than regular return values. This can lead to complex cases involving lambda functions, futures and asynchronous programming.

Finally, the typical try-catch is verbose and leads to boilerplate, and thus to disuse of checked/typed exceptions in languages such as Java.
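One way to make failure visible in a function's signature is a small result type. The sketch below is an assumption-laden illustration: Ok, Err, Result and describe are hypothetical names and not part of the Python standard library, and fib is filled in only so the example can run.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    reason: str


# The possible failure is now part of the return type.
Result = Union[Ok, Err]


def fib(n: int) -> Result:
    if n < 0:
        return Err("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return Ok(a)


def describe(n: int) -> str:
    res = fib(n)
    # The error travels along the normal return path, so the caller
    # handles it with ordinary control flow.
    if isinstance(res, Err):
        return f"oops: {res.reason}"
    return f"fib({n}) = {res.value}"
```

With a static type checker, forgetting the isinstance check is reported, because Err has no value attribute.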

Parse, Don't Validate

Refinement

A lot of programming involves validating that some value upholds some invariant. This can be that it is a valid email, non-negative number or valid UTF-8. But instead of validating the input, parse it into a type.

rust

let email = "john@example.com";

if !validate_email(email) {
    return Err(Error::BadMail);
}

do_something(email);

rust

let email: Result<Email, _> = Email::try_from("john@example.com");

do_something(email?);

Instead of validating that a value is correct and then using it as-is, parse it into a type. This encodes the semantic meaning that a value of the type is always well-formed, and allows consumers of the type to skip the check. Furthermore, it couples the validation check to the type directly, enforcing the invariant.

With the former method, one can forget to validate the value before usage, and as such potentially accept malformed or malicious input.
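The same idea can be sketched in Python by validating inside the constructor, so that no Email value can exist without having passed the check. The Email class and send_welcome are hypothetical, and the validation rule is deliberately simplistic (it is not a real email grammar).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    """A string that has been checked to look like an email address."""
    address: str

    def __post_init__(self) -> None:
        # Simplified rule for illustration: non-empty local part, an @,
        # and a dot somewhere in the domain.
        local, sep, domain = self.address.partition("@")
        if not sep or not local or "." not in domain:
            raise ValueError(f"not a valid email address: {self.address!r}")


def send_welcome(to: Email) -> str:
    # No validation needed here: the parameter type carries the invariant.
    return f"Welcome sent to {to.address}"
```

Consumers such as send_welcome accept an Email rather than a str, so the check cannot be skipped by accident.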

Structured Concurrency

Readability, Safety, Robustness

Probably the easiest way to explain structured concurrency is by analogy with (un)structured programming.

Structured programming is/was the shift away from goto or jump statements to more "modern" control flow constructs: function calls, if/else blocks, for-loops and while-loops. A goto statement makes no promise that the program flow will return to the point after it, whereas the structured constructs do, and thus improve our ability to reason about the program.

In the same vein, structured concurrency is the disuse of go or spawn statements. While a go statement differs from goto, since the original flow continues past it, the newly spawned task is detached from the parent scope.

// Plain goroutines with sync.WaitGroup
func main() {
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            // crashes on panic!
            doSomething()
        }()
    }
    wg.Wait()
}

// Using the conc library (github.com/sourcegraph/conc)
func main() {
    var wg conc.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Go(doSomething)
    }
    wg.Wait()
}

In contrast, structured concurrency has all child tasks awaited in the parent scope, so when a function returns it is complete and no background tasks are left running. This provides advantages when reasoning about cancellation, error handling and resource clean-up.

Few languages support structured concurrency directly as a language feature, but support can often be found in the standard library or in third-party libraries.

The scope of a concurrent task is the same as the lexical scope of the function creating it.

This forms a tree-like structure of concurrent tasks.
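The idea that a parent scope waits for all of its child tasks can be sketched in Python with the standard library's ThreadPoolExecutor used as a context manager: leaving the with block waits for every submitted task. do_something and run_all are hypothetical stand-ins, and this is an illustration of scoping only, not a full structured concurrency framework (for example, sibling tasks are not cancelled automatically when one fails).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def do_something(i: int) -> int:
    """Hypothetical unit of work for the sketch."""
    return i * i


def run_all(count: int) -> List[int]:
    # Every task started inside the with block has finished by the time
    # the block exits, so nothing outlives this function.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(do_something, i) for i in range(count)]
    # result() re-raises any exception from the task in the parent.
    return [future.result() for future in futures]
```

Python 3.11 also added asyncio.TaskGroup, which applies the same scoping to coroutines.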

Cancellation

Language Support

Rust: futures-concurrency, moro
Go: conc
Kotlin: coroutines (built-in)
Swift: built-in
JS: effection
Java: JEP 453 (preview)
C#: ???
