Introduction for Gophers

Caution! Have syntax has changed since this post was written, making it obsolete. More information here.

This is an introduction to Have, crafted specially for Go developers. Have is very closely related to Go, and its users (if there happen to be any, besides me) will probably be Go users too, so it is the first and (at the time of writing) the only introduction.

Bear in mind that this is an early stage project, and many things don’t work yet. Also, lots of stuff is likely to change after people point out all the blunders I made. So don’t rely on anything yet.

Hopefully, eventually there will be a proper spec too, but right now there’s just this loose and informal document.

And one more thing: if you notice language (English, not Have) errors, please submit them too; I will appreciate it. If something doesn’t sound natural, that counts too. I did my best, but I’m not a native speaker, so I never know.

What Have is

Have is a new programming language that transpiles to Go. It shares a lot with Go: the Go type system is a subset of the Have one, and libraries written in Go should feel natural to use in Have. Code written in Have should be accessible from Go too, in a way that is as elegant as possible (though not always idiomatic).

Code generation is already popular in the Gophersphere. It is often advised by the Go authors themselves, and Go comes with tools that make it convenient. The Have compiler can be seen as just another code generator.

Have syntax is inspired by Go and Python, but its semantics are usually the same as in Go.

Goals

The main goal of Have is that of any other hobby project: to fill spare time with interesting activity.

As for more technical ones, assuming that Have ever becomes popular, I see it as a companion to Go that is useful in some situations (e.g. writing generic, zero-cost data structures). Besides that, I wanted to test some ideas that could help write correct code. I use Go daily (and love it), writing code that ranges from video streaming servers to cluster orchestration tools. This means that faulty Go code wakes me up at night during my on-call duty. Guys from the Kubernetes project made a list of Go landmines - and I personally have a beef with each one of them. In Have, I tried to either fix some of them or make them easier to steer clear of (which resulted in some more or less controversial changes).

In the long run, it would be great to grow an ecosystem of tools, just like Go, and I hope that the compiler implementation will be modular enough to allow it. And Have will need those tools, even more than Go: source mappers (so that stack traces are readable), code generation automation and caching tools, and so on.

Go has a cute mascot (too), but that’s just a disguise - it is a serious project backed by a large corporation. Have is not a serious project; it’s just something I do at night when my wife and kid are asleep.

Syntax

Indentation

The most visible difference from Go is that Have is an indentation-based language. It works similarly to Python: instead of { and }, use consistent indentation before every statement in a block (and put : before the block). Tabs are preferred, and unfortunately we don’t have a tool like gofmt yet.

Instead of this:

if a() {
    b()
} else {
    c()
}

You should do this:

if a():
    b()
else:
    c()

Variable declaration

I wish this section was as simple as the previous one. It’s not sufficient to just describe the differences, because variable declaration in Have is not a finished thing yet. I’m aiming at solving some issues that I have personally experienced, so I’ll explain these first.

In Go, it’s a pretty common issue to inadvertently shadow a variable. Ponder for a while over this snippet:

func SomeFunc(bToo bool) (err error) {
	a, err := A()
	log.Println(a, err)
	if bToo {
		b, err := B()
		log.Println(b, err)
	}
	return
}

Now, will the error returned by SomeFunc(true) come from A() or B()?

From A(), of course, because that’s how := works. It doesn’t declare a new variable if it was already declared, unless it was in a different scope.
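To see it in action, here is a runnable version of the snippet above. A and B are hypothetical stand-ins that always fail with distinguishable errors, and fmt replaces log to keep the output plain; everything else is unchanged.

```go
package main

import (
	"errors"
	"fmt"
)

// A and B are hypothetical helpers that always fail, so their errors can be told apart.
func A() (int, error) { return 1, errors.New("error from A") }
func B() (int, error) { return 2, errors.New("error from B") }

func SomeFunc(bToo bool) (err error) {
	// err already exists in this scope (it is the named result),
	// so := declares only a and assigns to the existing err.
	a, err := A()
	fmt.Println(a, err)
	if bToo {
		// New scope: := declares a fresh b AND a fresh err that shadows the outer one.
		b, err := B()
		fmt.Println(b, err)
	}
	// The bare return hands back the outer err, which still holds A's error.
	return
}

func main() {
	fmt.Println("returned:", SomeFunc(true)) // returned: error from A
}
```

The error from B() is printed inside the if block and then silently dropped when that block ends.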

For me, it was a big source of bugs when I was beginning to use Go, until I built the muscle memory to avoid it. But still, every new developer who joins our team without prior Go experience has problems with shadowing. Static analysis tools help, but if something could be fixed in the language directly, why not?

So the only real difference between := and var is how := treats variables that were already declared (and, to me, it does so too inconsistently). That’s something I don’t want, at least in the current form, so there’s no := operator in Have. I like the convenience of := though, so there will be something similar too, I just don’t know what exactly yet - but variable re-use will need to be explicit, that’s for sure.

Okay, so what does the syntax look like? Right now, you should just use var everywhere, even in places where var isn’t allowed in Go, like this:

if var _, err = a(); err != nil:
    return err

Or this:

for var i = 0; i < n; i++:
    a(i)

There’s another difference. When declaring multiple variables in a single statement, this still works:

var a, b, c = x(), y(), z()

But this is allowed as well:

var a = x(), b = y(), c = z()

Structs

This is a big one. Have takes a more traditional approach, where methods are visually placed within the struct, so instead of this:

type A struct {
    x int
}

func (a A) String() string {
    return fmt.Sprintf("x:%d", a.x)
}

You’d write this:

struct A:
    x int
    func String() string:
        return fmt.Sprintf("x:%d", self.x)

What about pointer receivers? They are there, too. You declare them by putting a * before the function name. The method receiver is always referred to as self.

struct A:
    counter int
    func *Inc():
        self.counter++
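For comparison, this is roughly how the two kinds of receivers look in plain Go. It is only a sketch: the mapping from Have methods to Go receivers is an assumption, and IncCopy is a hypothetical method added to show what happens without the pointer.

```go
package main

import "fmt"

// A mirrors the Have struct above.
type A struct {
	counter int
}

// Inc has a pointer receiver, like `func *Inc()` in Have: it mutates the original value.
// The receiver name a plays the role of Have's self.
func (a *A) Inc() {
	a.counter++
}

// IncCopy (hypothetical) has a value receiver, so it only mutates a copy.
func (a A) IncCopy() {
	a.counter++
}

func main() {
	var a A
	a.IncCopy()
	fmt.Println(a.counter) // 0: the copy was incremented, not a
	a.Inc()
	fmt.Println(a.counter) // 1
}
```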

Go’s way of declaring methods is more flexible when it comes to their placement. It is sometimes useful, for example when you want to group together methods from different structs that implement the same interface. Have will have something similar as well - the plan is to implement something called structure opening, where you’ll be able to open a struct and add methods to it. It’s not ready yet, though.

Interfaces

Interfaces are declared similarly to structs. Just note the func before method names - it’s different from Go, but it’s there for the sake of consistency with struct declarations. Example:

interface Reader:
    func Read(p []byte) (n int, err error)

pass, empty interface and one-line blocks

If you happen to know Python, you know pass. It is a statement that does nothing, and it’s useful in situations when the language requires you to write a statement but you actually don’t want to.

So, what does interface{} look like in Have? Like this:

interface:
    pass

Uh, 2 lines are a bit cumbersome in this case. But in cases where the whole block consists of a single line of code, Have doesn’t require you to put that code on a separate line. So in Have, we have this:

interface: pass

“go”

The go statement is not implemented yet at the time of writing (though it’s close to the top of the TODO list), but it will work similarly, taking either a function call expression (as in Go) or a block of code.

So either:

go someFunc()

Or:

go:
    pass # Do something

If, switch, for and the rest

No revolution here beyond making them compatible with the overall syntax changes. Instead of else if there is elif. And this is what switch looks like:

switch x
case 1:
    pass
case 2, 3, 4:
    pass
default:
    pass

OK, there might be a small revolution in for loops. Probably the most common landmine in Go is the one about loop variable scoping. If you’ve never heard about it before, see this example:

for _, el := range []int{10, 20, 30} {
	go func() {
		fmt.Println(el)
	}()
}

What will this (most likely) print? Usually this:

30
30
30

I have yet to see a person who wasn’t surprised by that. In my opinion, it’s a very fertile source of bugs. The reason why this happens is that Go binds closures to the variable, and the loop variable is scoped outside the loop body (it is reused in every iteration). In Have, it’s different (but just for for loops with range) - the loop variable is created from scratch in every iteration. It means that this:

for var _, el range {10, 20, 30}:
    go: fmt.Println(el)

Will print 10, 20 and 30 (in some order, since the goroutines run concurrently).

If, for some reason, you really want to reuse the loop variable, you can explicitly scope it outside the loop:

var el int
for _, el range {10, 20, 30}:
    go: fmt.Println(el)

That will print 30, 30 and 30, just as in Go.
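For comparison, here is a sketch of how the same per-iteration behaviour is usually obtained in Go itself: copy the loop variable inside the body. The collect helper, the mutex and the sorting are only there to make the result observable; they are assumptions of this example, not part of Have. (In Go 1.22 and later the range variable is already created per iteration, so the copy is redundant there but harmless.)

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts one goroutine per element and gathers the value each goroutine saw.
func collect(values []int) []int {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen []int
	)
	for _, el := range values {
		el := el // fresh variable per iteration, which is what Have does for range loops
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			seen = append(seen, el)
			mu.Unlock()
		}()
	}
	wg.Wait()
	sort.Ints(seen) // goroutine order is not deterministic, so sort before comparing
	return seen
}

func main() {
	fmt.Println(collect([]int{10, 20, 30})) // [10 20 30]
}
```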

In the case of non-range for loops, scoping didn’t change, mostly because that would diverge too much from what everybody expects from a language with origins in C. But, in that case, letting an iteration variable “escape” the loop is not allowed. It means that trying to put it in a closure or defer, or take its address, will result in a compilation error. (Actually, this is not yet implemented, but will be.)

Composite literals

Composite literals are almost the same in Have; they even use { and } instead of indentation. The only difference is that as long as their type can be inferred from assignment, they can be untyped. So you can do this:

var x []string
x = {"a", "b"}

Or this:

var x map[int]string
x = {1: "hare"}

And so on, it works on structs and nested literals too.

When an untyped composite literal is used for variable initialization and the type of the variable is not specified, a default type is applied. For list-like composite literals, a slice type is used by default (assuming that all values have identical types - assignability is not enough in this case!), and for map-like literals, a map is used (again, assuming that all keys and values are of the same type).

Example:

var sliceOfInts = {1, 2, 3}
var mapOfStringsAndInts = {"a": 1, "b": 2}

Generics

That’s a broad topic and this section will most likely be growing & changing after this post goes public. Generics will definitely be changing, too. I don’t recommend relying on anything in Have yet, but generics especially so. I expect that people will point out lots of flaws in my design that will need to be fixed. Please keep it in mind!

In general, generics are a programming language construct that is used to parametrize certain other language constructs with type parameters. Have has generic structs and functions. Binding a list of types to a generic is called instantiation and produces a new struct or function.

Where other languages usually use < and > to mark generic parameters, Have uses [ and ].

Generic structs

This is what a simple generic struct with one generic parameter T looks like:

struct SampleStruct[T]:
    func Method(p T) T:
        return p

As you see, T is used within the struct definition as a type name. But the actual type behind T depends on the instantiation. The example below instantiates two structs from one generic struct:

var a SampleStruct[int], b SampleStruct[string]
var x int = a.Method(1), y string = b.Method("one")

Variables a and b have two different types - a’s is SampleStruct[int] and b’s is SampleStruct[string]. After instantiation, they are completely independent from each other and behave as named types in Go spec speak (named types in Go’s specification are used to define what values are assignable, etc.).
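One way to picture instantiation is as if each set of type arguments produced a separate, hand-written Go type. The sketch below shows that idea only; the names SampleStructInt and SampleStructString are hypothetical, and this is not necessarily what the Have compiler emits.

```go
package main

import "fmt"

// SampleStructInt stands for the SampleStruct[int] instantiation (hypothetical name).
type SampleStructInt struct{}

// Method is the int version of the generic method: T has been replaced by int.
func (s SampleStructInt) Method(p int) int { return p }

// SampleStructString stands for the SampleStruct[string] instantiation (hypothetical name).
type SampleStructString struct{}

// Method is the string version of the generic method: T has been replaced by string.
func (s SampleStructString) Method(p string) string { return p }

func main() {
	var a SampleStructInt
	var b SampleStructString
	x := a.Method(1)
	y := b.Method("one")
	fmt.Println(x, y) // 1 one
	// `a = b` would not compile: the two instantiations are distinct named types.
}
```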

Generic functions

Generic functions are similar:

func SampleFunc[T](p T) T:
    return p

Well, that’s pretty much the same. We can use SampleFunc now:

var a int
a = SampleFunc[int](1)
a = SampleFunc(2)

That’s simple, too, but something new happened in line #3. A function was instantiated implicitly - generic parameters were inferred from the arguments.

OK, now I can lay down the rules (yup, they are somewhat redundant - but it’s not a specification yet):

  • only package-level functions can be generic
  • functions can be instantiated implicitly by being called, generic parameters are inferred from function parameters (and only from them at the moment)
  • when a function is assigned but not called, it has to be instantiated explicitly (no inference from assignment)
  • methods can’t be generic

Again, the rules aren’t final and can change in the future.
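Go’s own type parameters (Go 1.18 and later) behave in much the same way for this simple case, so the rules can be tried out in plain Go. This is a Go sketch for illustration, not Have code and not the compiler’s output; the `any` constraint is a Go requirement that has no counterpart in the Have snippet.

```go
package main

import "fmt"

// SampleFunc is a Go counterpart of the Have function above.
func SampleFunc[T any](p T) T {
	return p
}

func main() {
	a := SampleFunc[int](1) // explicit instantiation
	b := SampleFunc(2)      // T is inferred from the argument
	f := SampleFunc[string] // assigned without being called, so instantiated explicitly
	fmt.Println(a, b, f("three")) // 1 2 three
}
```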

Now we can code something useful: a function that sorts slices using provided comparison function (kids, don’t use bubble sort in real world):

package main

func SortSlice[V](slice []V, cmp func(V, V) bool):
    # Bubble sort!
    for var i = 0; i < len(slice); i += 1:
        for var j = i + 1; j < len(slice); j += 1:
            if cmp(slice[i], slice[j]):
                slice[i], slice[j] = slice[j], slice[i]

func main():
    var numbers = {1, 5, 6, 3, 1, 3, 4, 10, 3}
    SortSlice(numbers, func(a, b int) bool: return a > b)

    for var _, x range numbers:
        print(x)

    var strings = {"dummy", "hare", "I", "don't", "know"}
    # Sort in descending order
    SortSlice(strings, func(a, b string) bool: return a < b)

    for var _, x range strings:
        print(x)

Builtin functions

Go has a package called builtin that contains functions which couldn’t be created in Go. There’s no need for such exceptions in Have, they all can be defined using generics. Usually their generic parameters are inferred, with the exception of make and new, where you need to explicitly specify them.

Example:

var slice = make[[]int](100)
var anonStruct = new[struct: x int]()

Specializations

Specializations are a construct that allows a generic to use different code in instantiations based on its parameters.

You usually use them when just one version of code wouldn’t work for the types you’d like to handle. For example, when writing something that adds and subtracts things, you’d probably use + and - operators for built-in numeric types. But if you wanted to handle custom types too, it wouldn’t work. With specializations, you can handle both in one generic. (BTW. Another popular use of specializations is optimization.)

Specializations in Have are a bit less declarative (or at least appear to be) than in most other languages (that’s the Go spirit!). Basically, they rely on one new statement called when, which can be used inside functions (not outside of them, which means that specializations can’t change members of a generic struct).

The when statement is, in theory, unrelated to generics. It consists of branches with checks and blocks of code. The checks control which branches are activated, and the first active branch is allowed to pass its block of code further down the compiler chain.

Code inside when branches has to parse correctly, and it also can’t refer to unknown identifiers. But unless a branch is activated in some instantiation, it’s not typechecked.

Example:

func SomeFunc():
    when int
    is int:
        print("int is int, who would've thought?")
    is string:
        print("int is string, better report a compiler error!")

That was pretty dumb, because when is checked during compilation, so instead of the code above, I could’ve written this:

func SomeFunc():
    print("int is int, who would've thought?")

That would have resulted in the same code being generated, because when is evaluated completely during compile time. Therefore, as you just saw, when is pretty much useless outside generics.

Now, something that seems more useful:

func CopyBytes[T, K](src T, dst K):
    when T, K
    is []byte, implements io.Writer:
        # copy from a slice to io.Writer
        pass
    is []byte, []byte:
        # as above, but between 2 slices
        pass
    implements io.Reader, is []byte:
        pass
    implements io.Reader, io.Writer:
        pass
    default:
        # what now?
        pass

We have just (almost) written a function that can copy from []byte and io.Reader to []byte and io.Writer. It’s missing the code that does the actual copying, but it’s not really important here. You can learn a few more things from this example:

  • when works for multiple arguments
  • besides is, there is also implements and default (BTW. I’m thinking about adding assignable_to or similar, because is doesn’t work with derived types)
  • if you want to apply the same kind of check to multiple arguments, you can group them together (as in implements io.Reader, io.Writer)

The above example has also exposed another problem. If default is activated, we actually should throw a compilation error. It is possible to do it in a hacky way, by putting a piece of code with a type error in the last branch, but it is something that should be solved in a cleaner way (maybe with a magical compiler_error() function - but other ideas are welcome).

Finally, I’ll probably also implement pattern matching for types in when checks, so that composite types can be caught better.

Code organization

GOPATH, HAVESRCPATH

Have relies on your $GOPATH when it compiles code to .go files. Have code is expected to sit in $HAVESRCPATH, and the resulting code ends up in $GOPATH/src. Directory structure of $HAVESRCPATH mimics that in $GOPATH/src.

Setting $HAVESRCPATH is optional. By default, when the $HAVESRCPATH environment variable is not set, Have internally sets it to $GOPATH/src, so that Have and Go files are intertwined in the same directory structure. If you want to separate them (resulting Go files are generated and hence rather fleeting, so it might make sense), set $HAVESRCPATH to something different.

have run && have trans

have run is the equivalent of go run. You can “run” a Have source file with it.

have trans compiles a Have package (and its dependencies) and writes results to .go files.

go generate

It makes sense to use go generate to run have trans. Just create a placeholder Go file with a //go:generate directive that runs have trans.
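Such a placeholder could look roughly like the file below. The package name mypkg is hypothetical, and so is the bare `have trans` invocation: the arguments it actually needs are not specified here, so treat the command line as an assumption.

```go
// Package mypkg is a hypothetical placeholder; the real code lives in Have sources.
package mypkg

// Running `go generate` in this directory executes the command below.
// The exact arguments to `have trans` are an assumption.
//go:generate have trans
```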

What next?

A lot of functionality available in Go is still missing; implementing it has the highest priority.

The second most important thing is interoperability with Go code. At first, the focus will be on making Go code callable from Have.

New functionality has the lowest priority at the moment, which doesn’t mean that nothing new will be added until more important things are done. It’s a hobby project, so I need to manage my motivation, and adding new stuff is a good way of doing that.