Tuesday, December 11, 2018

Marlowe: financial contracts on blockchain

[From the IOHK blog https://iohk.io/blog/marlowe-financial-contracts-on-blockchain/]

The first computers were programmed in “machine code”. Each different kind of system had a different code, and these codes were low-level and inexpressive: programs were long sequences of very simple instructions, incomprehensible to anyone who had not written them. Nowadays we are able to use higher-level languages like C, Java and Haskell to program systems. The same languages can be used on widely different machines, and the structure of a program reflects what it does; on blockchain, their equivalents are languages like Solidity and Simplicity. These modern higher-level languages are general purpose – they can be used to solve all sorts of different problems – but the solutions they express are still programs, and they still require programming skills to use them effectively.

In contrast, Marlowe is a domain-specific language (DSL) designed to be usable by someone who is expert in a particular field – in the case of Marlowe, financial contracts – rather than requiring programming skills. Using a DSL has many advantages beyond its accessibility to non-programmers:

  • we can ensure that certain sorts of bad programs cannot even be written by designing those possibilities out of the language;
  • we can also more easily check that programs have the properties that we want: for example, in the case of a financial contract we might want to make sure that the contract can never fail to make a payment that it should;
  • because it is a DSL, we can build special-purpose tools to help people write programs in the language. In the case of Marlowe we can simulate how a contract will behave before it is run for real on the system; this helps us to make sure that the contract we have written is doing what it is intended to.

Marlowe is modelled on financial contract DSLs popularised in the last decade or so by academics and enterprises such as LexiFi. In developing Marlowe we have adapted these languages to work on blockchain. Marlowe is implemented on the settlement layer of the Cardano blockchain, but could equally well be implemented on Ethereum/Solidity or other blockchain platforms; in this respect it is “platform agnostic”, just like modern programming languages such as Java and C++. The online Meadow simulation tool allows you to experiment with, develop and interact with Marlowe contracts in your web browser, without having to install any software yourself.

What does a Marlowe contract look like? It is built by combining a small number of building blocks that describe making a payment, making an observation of something in the “real world”, waiting until a certain condition becomes true, and so on. Where we differ from earlier approaches is in how we make sure that the contract is followed. This means not only that the instructions of the contract are not disobeyed, but also that the participants don’t walk away early, leaving money locked up in the contract forever. We do this using two tools, commitments and timeouts: a commitment requires a participant to “put their money on the table”, and through timeouts we make sure that this commitment happens in a timely manner or remedial action is taken. Putting these constructs together we are able to incentivise participants to continue with the contract once they have committed to it.
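To give a flavour of these building blocks, here is a cut-down sketch of how such a contract language might be captured as a Haskell datatype. This is illustrative only – the constructor names and fields are invented for the example and are not actual Marlowe syntax.

    -- An illustrative sketch, not actual Marlowe: a small contract language
    -- built from payments, commitments, observations and timeouts.
    type Person  = Integer
    type Cash    = Integer
    type Timeout = Integer                 -- e.g. a block number

    data Contract
      = Null                               -- nothing (more) to do
      | Commit Person Cash Timeout Contract
                                           -- a participant puts cash "on the table",
                                           -- or the contract times out
      | Pay Person Person Cash Contract    -- pay committed cash from one participant to another
      | Both Contract Contract             -- run two contracts side by side
      | When Observation Timeout Contract Contract
                                           -- wait for an observation to hold, with a
                                           -- timeout alternative
      deriving Show

    data Observation
      = BelowTimeout Timeout               -- the current time is before a given timeout
      | PersonChose Person Integer         -- a participant has made a particular choice
      | AndObs Observation Observation
      | NotObs Observation
      deriving Show

Even in this toy form you can see how commitments and timeouts appear explicitly in the structure of a contract, which is what makes it possible to check that money can never be locked up forever.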

We’re working on a full release of Marlowe for mid 2019, when it will be available on Cardano SL. At the moment you're able to explore Marlowe for yourself using Meadow, and find out much more detail from our online paper. In the next six months we’ll be polishing the language design itself and developing a set of templates for popular financial instruments, as well as using formal logic tools to prove properties of Marlowe contracts, giving users the highest level of assurance that their contracts behave as intended.

Links

Research paper on Marlowe: Financial contracts on blockchain: https://iohk.io/research/papers/#2WHKDRA8

The Marlowe project on github: https://github.com/input-output-hk/marlowe

Meadow: online simulation and visualisation of Marlowe contracts: https://input-output-hk.github.io/marlowe/

Wednesday, August 22, 2018

Review of: CoSMed: a confidentiality-verified social media platform

CoSMed: a confidentiality-verified social media platform. Bauereiß T., Pesenti Gritti A., Popescu A., Raimondi F. Journal of Automated Reasoning.

What do we mean when we say that a system is verified? In theory it means that certain properties of a system have been proved to hold. In practice, the answer is rather more complicated.

The first thing to clarify is the form of proof: computer scientists have developed a variety of logics, both general and special purpose, for proofs of properties of correspondingly various kinds. The proofs themselves might be by pencil-and-paper or, increasingly, formalized in a computer in some way. Trust in the verification relies on trusting the proof assistant – this is not the place to go any further down that particular rabbit hole, but anyone interested should read MacKenzie [1] and Nipkow et al.’s paper [2] on the Isabelle proof assistant, which is a sophisticated and mature system that assists users to construct machine-checked proofs “by hand” using a battery of existing libraries and tactics.

Now that we have a sense of what will be used to perform the verification, we need to identify what will be verified. Typically (and this is the case here) we work with a model of the real system and argue that this represents the essential properties of the real thing. The authors do something subtly different: the model is a state machine, represented by its step function in the internal language of Isabelle, which is a relatively straightforward functional language. They then use the extraction facility of Isabelle to export this model to Scala, so that it is rendered directly into code that can be executed in a (real) web framework, which could in principle be verified itself, although that is not the focus of this work.

The paper, an extension of a paper published in an established conference series on interactive theorem proving, gives clear explanations of the application domain and the proof, in particular the approach to information flow that the verification uses. Essentially, the authors prove that Alice can only see those posts from Bob that were made while the two were friends – a status that can change dynamically. So, even if they are currently friends, posts made when they were not friends are not visible. Technically, this expands the authors’ earlier work on a similar “conference” system, in which roles are statically rather than dynamically defined.
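The shape of this property can be suggested in a few lines of ordinary Haskell. This is a toy illustration of the idea, not the authors’ Isabelle model, and every name in it is invented for the purpose:

    import qualified Data.Set as Set

    -- a post records who the author's friends were at the moment of posting
    data Post  = Post  { author :: String, friendsWhenPosted :: Set.Set String }
    data State = State { friends :: Set.Set (String, String), posts :: [Post] }

    -- one step of the system: making a post snapshots the current friendships
    makePost :: String -> State -> State
    makePost who st = st { posts = Post who friendsNow : posts st }
      where friendsNow = Set.fromList [ b | (a, b) <- Set.toList (friends st), a == who ]

    -- the confidentiality property is about this check: a reader may see a
    -- post only if they were a friend of the author when it was made
    visibleTo :: String -> Post -> Bool
    visibleTo reader p = reader `Set.member` friendsWhenPosted p

In the paper, of course, the corresponding step function is written in Isabelle’s internal language, and the confidentiality property is stated over whole system runs rather than being a simple field lookup.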

The paper gives a clear exposition of the model, the properties proved, and the proof techniques, and I would expect that the work could be reproduced from the descriptions given. What I missed, and what would have added greatly to the value of the paper for others, is an account of how the verification and the modeling interacted. Is it the case that the modeling entirely preceded the verification, so that the model was built correctly from the start, or was it the case that the very process of attempting to verify the system required it to be modified itself? Insights like this would be very valuable to someone embarking upon a similar exercise, and would potentially help readers overcome the considerable learning curve that there is in going from novice to experienced user of systems like this.

Nevertheless, the paper makes a definite contribution to the growing literature on verified systems, and the work in general is particularly noteworthy because it was done in conjunction with a third-sector organization to support their online social media presence.

[1] MacKenzie, D. Mechanizing proof: computing, risk, and trust. MIT Press, Cambridge, MA, 2001.

[2] Nipkow, T.; Paulson, L. C.; Wenzel, M. Isabelle/HOL: a proof assistant for higher-order logic (LNCS 2283). Springer, Berlin, Germany, 2002.

Monday, October 16, 2017

Is Haskell the right language for teaching functional programming principles?

The “park bench” panel at the Haskell eXchange last week talked about a lot of things, but some of them can be summarised by the question “Is Haskell the right language for teaching functional programming principles?”.

The point about this is that no one is in disagreement about how good a language Haskell is for doing real work – and that’s underlined by a meeting like the Haskell eXchange, which is primarily for people using Haskell in practice, or wanting to use it. Rather, the question is about whether the principles that Haskell is built on get obscured by the fact it’s a large and complicated language. You can see an example of this at the foot of this post: what you can see there is a perfectly plausible beginner interaction with ghci.

Some alternatives came up: there were advocates for Elm, http://elm-lang.org, and for PureScript, http://www.purescript.org, both of which are substantially younger than Haskell. It’s easier to see the attraction of Elm: it’s a lot simpler than Haskell, and so presents fewer impediments to beginners; it also has a nice online environment. What are the downsides? While a number of books are in development (just google …), there aren’t the mature texts that Haskell has. Also, it is arguably too tied to the “web development” paradigm, and so will skew learning towards that.

PureScript also looks interesting, but perhaps more for developers than learners? While Elm is definitely simpler than Haskell, and will remain so, PureScript seems to be on a path to replicating as much of Haskell as possible in the JavaScript world, so there is nothing in particular to be gained from making a switch. But to be clear, I’d be interested to hear arguments for PureScript that I might have missed …

There’s a trend of fitting new syntax to old languages. Reason, https://reasonml.github.io, has done this for OCaml, and Elixir, https://elixir-lang.org, for the Erlang VM. The languages are intended to be more approachable for those with experience of JavaScript and Ruby, but also benefit from their youth: libraries are less sprawling, and contributions more convergent; on the flip side, they can re-use all the previous developments and libraries for their host languages too. Still, neither seems to be an obvious answer for learners, and, of course, they are both impure and strict!

A final point: why not go for subsets of Haskell, like the DrRacket approach, https://racket-lang.org? It would be interesting to see development of different preludes: most of the existing initiatives (just google …) seem to be designed to enhance the standard prelude, but working in the opposite direction might allow us to replicate the Helium environment, for example. Another suggestion (due to Dominic Orchard) is to be able to close certain classes, so that there’s no way that an instance of Num can be declared for Int->Int, thus avoiding a set of error messages unhelpful to beginners.
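To illustrate the kind of message meant here, consider a beginner who writes 2 x where they meant 2 * x. With the declared type below, GHC’s complaint (abridged, and quoted from memory rather than verbatim) is about a missing Num instance for a function type rather than about the real mistake:

    -- a typical beginner slip: 2 x instead of 2 * x
    f :: Int -> Int
    f x = 2 x + 1

    -- GHC reports, roughly:
    --   No instance for (Num (Int -> Int)) arising from the literal ‘2’

Closing the Num class would allow the compiler, at least in principle, to say definitively that functions can never be numbers, pointing more directly at the real mistake.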

What’s the best way forward? It’s hard to know. Maybe we should stick with the system we have, and for which we have some idea about teaching …  but it would be interesting to see how moving to Elm worked in practice in delivering effective functional programming teaching.

Full disclosure: at Kent, after teaching Haskell for many years, we currently teach a compulsory second-year course on “functional and concurrent programming” using Erlang as the main vehicle, but introducing a little Haskell. An optional final year course on “programming language implementation” builds on a small but complete compiler written in OCaml. It’s possible to get started with Erlang quickly because it’s syntactically very lean, both for functional and for concurrent programming; the major disadvantage is that it’s substantially more difficult to do type-driven development.

There is some more material on the web comparing Elm and PureScript (and others):

Exploring the length function …
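The post points here to a beginner interaction with GHCi; the following is a reconstruction of the kind of session meant, not the original content.

    Prelude> :type length
    length :: Foldable t => t a -> Int
    Prelude> length [1,2,3]
    3
    Prelude> length (4,5)
    1

A beginner who asks for the type of length is met with the Foldable class straight away, and the answer for the pair – a consequence of the Foldable instance for pairs – is a further surprise.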




Thursday, August 17, 2017

Review of Graham Hutton's Programming in Haskell, 2e

Graham Hutton's Programming in Haskell is a leading textbook on the functional programming language Haskell that first came out in 2007, and it is instructive to see how its second edition reflects what has changed over the last decade. A particular strength of the first edition is its brevity – it comes in at some 150 pages – and the second edition doubles that length. Even taking into account a less compressed layout, this probably reflects some 50% more material, which comes in the form of more examples, more exercises, some sample solutions and coverage of brand new topics.

The book is now structured into two parts: “Basic Concepts” and “Going Further”. With a little rearrangement the first part is made up of chapters from the first edition (augmented as noted above) and covers the basics of functional programming in general: function definitions, recursion, types and higher-order functions; as well as more Haskell-specific aspects: list comprehensions, type classes and, of course, the GHC system. The section is rounded off with a capstone example: a “Countdown” program that shows how Haskell is used to solve a particular concrete problem. Hutton writes very clearly throughout, and illustrates his exposition with a well-chosen set of examples.

Knowledgeable readers may well have spotted that two of the most distinctive aspects of Haskell – monads and lazy evaluation – are not included in the introductory half of the book. Hutton is not alone in doing this: other authors of Haskell texts [2], myself included [1] (I should declare an interest), have taken the same position in concentrating first on the essential aspects of functional programming, only looking at the particulars of how it works in Haskell after that.

While some parts of the second half – lazy evaluation and reasoning about functional programs – are treated much as before, and there is an entirely new chapter on “calculating compilers” that gives readers a glimpse of Hutton's research interests, the major changes from the first edition – both pedagogical and technical – come in the second half of the book.

The first edition hardly treated monads beyond introducing the type class, and using the do notation in describing interactive programs, whereas the second edition foregrounds monads much more, and presents parsers as an extended example of monads. A change like this would have been entirely foreseeable even at the point when the first edition was written, but other updates reflect much more recent changes in the Haskell ethos. Haskell has taken on a much more explicitly “algebraic” flavour, reflected by a range of new type classes that embody patterns of computation inspired by category theory.
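To give a flavour of that extended example, here is a minimal parser type of the general kind the book develops – my own sketch, not Hutton's code:

    -- a parser consumes some input and returns the possible parses,
    -- each paired with the remaining input
    newtype Parser a = P { runParser :: String -> [(a, String)] }

    item :: Parser Char                      -- consume a single character
    item = P (\inp -> case inp of
                        []     -> []
                        (c:cs) -> [(c, cs)])

    instance Functor Parser where
      fmap f p = P (\inp -> [ (f x, rest) | (x, rest) <- runParser p inp ])

    instance Applicative Parser where
      pure x    = P (\inp -> [(x, inp)])
      pf <*> px = P (\inp -> [ (f x, out) | (f, mid) <- runParser pf inp
                                          , (x, out) <- runParser px mid ])

    instance Monad Parser where
      p >>= f   = P (\inp -> [ (y, out) | (x, mid) <- runParser p inp
                                        , (y, out) <- runParser (f x) mid ])

    -- with the Monad instance in place, do notation reads like a grammar
    two :: Parser (Char, Char)
    two = do x <- item
             y <- item
             return (x, y)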

Just as monads can be seen to embody the idea of “effectful” (potentially side-effecting) computation, “applicatives” capture the notion of applying a pure function to effectful arguments. Similarly, “monoids” are abstract structures in which elements can be combined by a binary operation with a unit (think multiplication and 1), and these can then be “folded” over collections of values, just as a set of numbers can be multiplied together. This new material is also very clearly and concisely presented, and gives an approachable introduction to more abstract ideas.
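A couple of GHCi one-liners (my examples, not taken from the book) illustrate both ideas:

    ghci> (+) <$> Just 2 <*> Just 3     -- a pure function applied to effectful arguments
    Just 5
    ghci> (+) <$> Just 2 <*> Nothing    -- if any argument fails, the whole application fails
    Nothing
    ghci> import Data.Monoid
    ghci> getProduct (foldMap Product [1,2,3,4])    -- folding a monoid over a collection
    24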

Haskell is “growing old disgracefully”: it was first defined in 1990, and for twenty years or so was seen primarily as an academic, research-oriented language. The last decade has seen it grow in popularity as a practical language for program development, from startups to financial multinationals. The language has also moved on, and taken up the categorical/algebraic approach more fully, and it is good to see this reflected in Hutton's new edition and HaskellBook [2]; other authors will need to do the same.

To conclude, Hutton's revisions of Programming in Haskell have taken the best aspects of the original – conciseness, clarity and approachability – and built on this foundation to accommodate how Haskell has evolved. I recommend it highly.


[1] Simon Thompson, Haskell, The Craft of Functional Programming (3ed), Addison-Wesley, 2011. http://www.haskellcraft.com/craft3e/Home.html

[2] Christopher Allen, Julie Moronuki, Haskell Programming from First Principles, Gumroad, 2017. http://haskellbook.com/

[To appear in Computing Reviews.]

Wednesday, September 7, 2016

Monitoring Distributed Systems – Lessons from the RELEASE project

The observer effect tells us that we cannot observe a system without disturbing it; on the other hand, without observation we are unable to correct, evolve or optimise systems. In this post we first discuss the general question of observing distributed systems, and then we look at some of the ideas and prototypes for more effective monitoring that come out of the RELEASE project, which was focussed on building scalable distributed systems in Erlang.

Monitoring


We begin by exploring what it means to monitor a distributed system, how it is done, and the uses to which monitoring data can be put. Somewhat sloppily, we will use the term “monitoring” to cover the activities of monitoring, observation, tracing and visualisation: what they have in common is that they extract information from a system beyond what it is required to produce in order to function, giving information about how it is performing that function.

After setting out the various dimensions of monitoring, we conclude by discussing the ways in which monitoring can load a system. This sets the scene for a discussion of how to mitigate this loading, in the context of Erlang distributed systems.

Why monitor?


We monitor to make sure that a system is delivering what it should. This might be a matter of ensuring that it can maintain performance over time, as unread messages accumulate in mailboxes, perhaps. It could mean observing the way in which (eventual) consistency is managed, so that the system is able to respond appropriately despite there being disruptions to infrastructure. It could also have a more active purpose: to allow the system to adapt to changing circumstances, either through real-time reactive behaviour, or through longer term code evolution.

What are we monitoring?


In a distributed system two kinds of activity can be observed: first there are data which come from a single node, such as CPU usage, or trace messages to particular function calls; there are also data arising from the traffic between different nodes.

As the examples just given show, these data can be event-based, such as log messages about function calls or other system events; or they can be observations, such as CPU usage. While it is in the user’s control how often to observe a property of the system, trace messages will be generated according to how the system is behaving, and so the rate and scale of their production is not under user control.

Different levels of monitoring are possible, too. Simple data, such as CPU or memory usage, can be gathered for all systems, whereas in many cases the most useful data is application-aware, more directly reflecting the semantics of the system that is being observed.

When do we monitor?


Monitoring can be online: information about a system is available in real-time, so that problems can be identified and rectified as a system runs. A drawback of this approach is that data analysis is limited by the fact that it must produce results on the basis of what has happened up to a point during execution. It must also do this in a timely way, using a limited amount of data.

On the other hand, more complex and complete analyses can be produced post hoc using data generated during the system execution. The two are combined in a hybrid approach, in which  offline analyses are repeatedly generated for portions of an execution.

How do we react?


It may be that the results of monitoring are presented to an operator, and it is up to her to decide how to react to what she sees, reconfiguring or rebooting parts of a system for instance. It is also possible to design automated systems that use monitoring data: these range from a simple threshold trigger fired, for example, by CPU loading reaching a particular level, to a fully autonomic system.

The burden of monitoring


Monitoring requires data to be produced by the system, and for these data to be shipped around the system to a central monitoring point where storage, analysis and presentation can take place. We’ll look at these in more detail once we have discussed a particular concrete case: Erlang/OTP.

Erlang


Erlang is a concurrent, distributed, fault-tolerant programming language that comes with the Open Telecom Platform (OTP) middleware library. In distributed Erlang each node runs a copy of the BEAM abstract machine and runtime: separate nodes can be run on the same or different hosts. The “out of the box” Erlang distribution model provides full connectivity between all nodes; but this can be modified by manual deployment and connection of “hidden” nodes, which are unconnected by default. Erlang nodes can also be partitioned into global groups, with each node in at most one group. (Note: there is current work on improving the scalability of distributed Erlang: link).

SD-Erlang is an adaptation of distributed Erlang designed to be more scalable by limiting all-to-all connectivity to groups (called “s_groups”). These groups may overlap, thus providing indirect connectivity between nodes in different s_groups. SD-Erlang also allows non-local spawning of processes on nodes determined by various attributes, including measures of proximity.

Improving monitoring 


In this section we  look at a number of ways of improving monitoring of distributed systems; these are illustrated by means of tools built to monitor Erlang and SD Erlang systems, but the lessons are more generally applicable.

SD-Mon is a tool designed specifically to monitor SD-Erlang systems; this purpose is accomplished by means of a “shadow” network of agents, which collect and transmit monitoring data gathered from a running system. The agents run on separate nodes, and so interfere as little as possible with computations on system nodes; their interconnections are also separate from system interconnects, and so transmission of the data gathered does not impose an additional load on the system network.

SD-Mon is configured automatically from a system description file to match the initial system configuration. It also monitors changes to the s_group structure of a system and adapts automatically to the changes, as illustrated here:



Filter data at source: BEAM changes


The Erlang BEAM virtual machine is equipped with a sophisticated tracing mechanism that allows trace messages to be generated by a wide range of program events. Tracing in this way typically generates a substantial amount of data, which can be filtered out downstream. We were able instead to modify the tracing system itself to filter some messages “at source”, and so avoid the generation and subsequent filtering of irrelevant data. These changes make the data gathered for Percept2 more compact.

Enhanced analytics: Percept2


Percept2 provides post hoc, process-level, visualisation of the working of the BEAM virtual machine on a single Erlang node. When a node runs on a multicore device, different processes can run on the different cores, and Percept2 enhances the analysis of the original Percept in a number of ways (Percept is a part of the Erlang distribution). For instance, it can distinguish between processes that are potentially runnable and those that are actually running, as well as collecting runtime information about the dynamic calling structure of the program, thus supporting application-level observation of the system. Multiple instances of Percept2 can be run to monitor multiple nodes.

Improved infrastructure: Percept2


The post hoc analysis of Erlang trace files for the visualisation in Percept2 is computationally intensive, but is substantially speeded up through parallel processing of the logs using  concurrent processes on a multicore machine. Logging of some trace information is also performed more efficiently using OS-level tracing in DTrace.

Network information: Devo / SD-Mon


Existing monitoring applications for Erlang concentrate on single nodes. In building Devo, which visualises inter-node messaging intensity in real time, it was necessary to analyse Erlang trace information to find inter-node messages; Devo and SD-Mon also show information about messaging through nodes that lie in the intersection of two s_groups, thus providing the route for inter-group messages.

Monitoring and deployment: WombatOAM


WombatOAM provides support for Erlang and other distributed systems that run on the BEAM virtual machine. As well as providing a platform for deploying systems, it also provides a dashboard for observing metrics, alerts and alarms across a system, and can be integrated with other monitoring and deployment infrastructure. WombatOAM is a commercial product from Erlang Solutions Ltd, developed on the RELEASE project.

Conclusions


The aim of this post is to raise the general issue of monitoring of advanced distributed systems. After doing this we have been able, inter alia, to illustrate some of the challenges in the context of monitoring Erlang distributed systems.

We are very grateful to the European Commission for its support of this work through the RELEASE project: EU-ICT Specific targeted research project (STREP) ICT-2012-287510. Thanks also to Maurizio Di Stefano for his contributions to this post.

Tuesday, May 10, 2016

Trustworthy Refactoring project

Research Associate Positions in Refactoring Functional Programs and Formal Verification (CakeML)


The Trustworthy Refactoring project at the University of Kent is seeking to recruit postdoc research associates for two 3.5 year positions, to start in September this year.

The overall goal of this project is to make a step change in the practice of refactoring by designing and constructing trustworthy refactoring tools. By this we mean that when refactorings are performed, the tools will provide strong evidence that the refactoring has not changed the behaviour of the code, built on a solid theoretical understanding of the semantics of the language. Our approach will provide different levels of assurance, from the (strongest) case of a fully formal proof that a refactoring can be trusted to work on all programs, given some pre-conditions, to other, more generally applicable guarantees that a refactoring applied to a particular program does not change the behaviour of that program.

The project will make both theoretical and practical advances. At one end, we will build a fully verified refactoring tool for a relatively simple but full-featured programming language (CakeML, https://cakeml.org), and at the other we will build an industrial-strength refactoring tool for a related, industrially relevant language (OCaml). This OCaml tool will allow us to explore a range of verification techniques, both fully and partially automated, and will set a new benchmark for building refactoring tools for programming languages in general.

The project, which is coordinated by Prof Simon Thompson and Dr Scott Owens, will support two research associates, and the four will work as a team. One post will focus on pushing the boundaries of trustworthy refactoring via mechanised proof for refactorings in CakeML, and the other post will concentrate on building an industrial strength refactoring tool for OCaml. The project has industrial support from Jane Street Capital, who will contribute not only ideas to the project but also host the second RA for a period working in their London office, understanding the OCaml infrastructure and their refactoring requirements.

You are encouraged to contact either of the project investigators by email (s.a.owens@kent.ac.uk, s.j.thompson@kent.ac.uk) if you have any further questions about the post, or if you would like a copy of the full research application for the project. We expect that applicants will have a PhD degree in computer science (or a related discipline) or be close to completing one. For both posts we expect that applicants will have experience of writing functional programs, and for the verification post we also expect experience of developing (informal) proofs in a mathematical or programming context.

To apply, please go to the following web pages:

Research Associate in Formal Verification for CakeML (STM0682): Link

Research Associate in Refactoring Functional Programs (STM0683): Link

Simon and Scott

Thursday, July 9, 2015

What feedback should a programming MOOC provide?



What sort of feedback models can be used in a MOOC and other online learning activities? In this note we look at the different styles of giving feedback before looking at automated feedback in a little more detail, and come to a conclusion about what is essential in the next iteration of the Erlang MOOC.

Different feedback styles



One mechanism is to assign a Teaching Assistant (TA) to assess the work and give feedback. The advantage of this is that the TA can tune their feedback to the particular piece of work, and the individual submission for that work. Feedback can be given on partial or incorrect programs, and can cover all aspects of the program’s behaviour. It can also be given at any stage during the assessment, and not only when a completed solution is available.

One limitation is that it can be impossible to give human feedback alone on a large assessment, for which some testing data and other automated analyses may be necessary to provide the TA with suitable diagnostic information. A more fundamental difficulty is that this is difficult to scale, or to provide in a timely way to suit participants’ different speeds of working. So, while this can work for a small, focussed group, it isn’t feasible for the sort of numbers expected in a MOOC (or even in a university class of 150).

Participant self-assessment provides one route to scaling. The model for this is to give the student a comprehensive mark scheme and marking instructions, and to ask them to apply that scheme to their own work. This can then be moderated by a TA or other teacher. This approach has the pedagogical advantage that it gives the student a chance to gain insight into their own work through applying the marking scheme, but it can only be used on final or complete solutions to assessments, and certainly can’t be used to help students towards a final solution.

A variant of this is peer assessment, under which pairs of students are asked to give feedback to each other on their work. Again, this has the pedagogical advantage of allowing students to gain further insights into how work is assessed, and of giving them a better perspective on their work through seeing another’s. It can also be used to give students support during the development of the solution, and can indeed become a form of pair programming.

There are two variants of the peer assessment approach: the pairs can be assigned “statically” in advance of the work being done, or the process can be more dynamic than that, assigning the pairing only when the two have each made a submission. It is obviously a scalable solution, though without the consistency of having a single marker; it can be supported with appropriate materials, including marking schemes etc., just as for self-assessment.

The final option, and the one that we look at in more detail in the next section, is providing some form of automated feedback to students. Of course, it is possible to combine these modes – as hinted earlier – supplementing some form of person-to-person feedback with data from a variety of automated tests and analyses.

Automated feedback



On the whole, a pre-requisite for giving feedback is that the program unit – call it a module here – needs to be compilable. Without that it is difficult to give any meaningful automated feedback. Though difficult, it is not impossible, and some sorts of lexical-based analysis are possible even for programs that are incomplete or contain errors. Nevertheless, the rest of this section will assume that the program can be compiled successfully.


What kind of feedback can be provided using some sort of automated analysis?


  • Static checks
    • type checks
    • compiler warnings and compiler errors
    • abstract interpretations and other approximate analyses, e.g. dead code analysis
  • Style analysis
    • decidable, e.g. lengths of identifiers
    • undecidable (requires approximation), e.g. the module inclusion graph
    • intensional properties: e.g. this is / is not tail recursive
  • Tests
    • hand-written unit tests, performed by the user
    • unit tests within a test framework; can be performed by the user, or the system
    • integration tests of a set of components, or of a module within context
    • these may be tested against mocked components
    • user interface testing
  • Properties
    • logical properties for functions
    • property-based testing for stateful APIs
    • both of these can be accompanied by “shrinking” of counter-examples to minimal failing cases
  • Non-functional properties
    • efficiency
    • scalability
    • fault-tolerance (e.g. through fuzz testing)


What role does testing play in online learning?


  • Confirming that a solution is (likely to be) correct or incorrect.
  • Pointing out how a solution is incorrect.
  • Pointing out which parts of a solution are correct / incorrect.
  • Pointing out how a solution is incomplete: e.g. a case overlooked.
  • Assessing non-functional properties: efficiency, scalability, style etc


Rationale for using automated feedback


Reasons in favour of using automated feedback, either on its own or as one of a number of mechanisms.
  • Timeliness: feedback can be (almost) immediate, and certainly substantially quicker than from a peer reviewer.
  • Comprehensiveness: can do a breadth of assessment / feedback which would be infeasible for an individual to complete.
  • Scalability: in principle it scales indefinitely, or at least to the limit of available resources; if the evaluation is done client-side then there is no resource cost at all.
  • Consistency of the feedback between different participants.
  • Complements peer interactions: can concentrate on “why did this fail this test?”


There are, however, drawbacks to using automated feedback.
  • Can miss the high-level feedback – covering “semantic” or “pragmatic” aspects – in both positive and negative cases.
  • Restricted to programs that can be compiled or compiled and run. 
  • Difficult to tie errors to faults and therefore to the appropriate feedback.

There is a cost to automating testing, as it needs to be implemented, either on the client side – typically within a browser – or on a server. There is a potential security risk if the testing is performed server-side, and so some sort of virtualisation or container will generally be used to test programs that could potentially interact with the operating system. On the server side, any technology can be used, whereas on the client it will typically use whatever facilities are provided by the browser.


Recommendations



At the base level we should provide two modes of feedback when the Erlang MOOC is run next.

Dynamic peer feedback, where pairs are matched dynamically as final submissions are made. This allows solutions that are pretty much complete – and hopefully correct – to receive feedback not only on correctness but also on style, efficiency, generality etc. This gives qualitative as well as quantitative feedback, but requires at least one other participant to be at the same stage at roughly the same time.

Unit tests and properties should be provided for participants to execute themselves. While this is less seamless than testing on the server side, it gives participants freedom to use the tests as they wish, e.g. in the course of solving the problem, rather than just on submission.
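To make this concrete, here is the kind of property file that could accompany an exercise. It is a sketch in Haskell’s QuickCheck; the Erlang course itself would use a QuickCheck-style Erlang tool (such as PropEr or Triq), but the shape is the same: participants run the properties against their own solution, and failing cases are shrunk to small counterexamples.

    import Test.QuickCheck

    -- the participant's own solution to the exercise
    myReverse :: [a] -> [a]
    myReverse = foldl (flip (:)) []

    -- properties supplied with the exercise
    prop_involution :: [Int] -> Bool
    prop_involution xs = myReverse (myReverse xs) == xs

    prop_append :: [Int] -> [Int] -> Bool
    prop_append xs ys = myReverse (xs ++ ys) == myReverse ys ++ myReverse xs

    main :: IO ()
    main = do quickCheck prop_involution
              quickCheck prop_append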

Understanding tools. We should include information about interpreting output from the compiler and dialyzer; some information is there already, but we should make it clearer.

Other feedback is desirable; in particular, integrating as much of this feedback as possible into the browser would make for an attractive user experience, especially if it is integrated with the teaching materials themselves, à la Jupyter. As an alternative, the feedback support could be integrated with the server, but this will require effort to craft a bespoke solution for Moodle if we’re not yet using a mainstream MOOC platform.