Blazor WebAssembly, Monaco and Antlr – Building the AutoStep Editor as a Blazor App

I’m writing this post to show people the possibilities of WebAssembly and Blazor, using an open-source project I’m working on right now.

In this post we’ll cover:

  • Integrating the Monaco Code Editor with Blazor (and Razor Component Libraries in general)
  • Blazor to TypeScript Interop Tips
  • Manual Tokenisation of Code in Monaco (by a .NET Assembly!), including a quick look at performance.
  • Feeding Compilation Results from .NET to Monaco.

With the tools available to me, I can do real-time syntax highlighting and compilation of AutoStep tests in-browser, using WebAssembly to run my .NET library that does a lot of the heavy lifting, and the Monaco editor to provide the actual text editor behaviour. You can do some really cool stuff when you combine the power of .NET with a web-based user interface.

You can find all the code for the AutoStep Editor I’m going to be referencing in the GitHub repository, https://github.com/autostep/AutoStep.Editor.

Before we dive in, there’s a bit of background to cover.

Background

To give a little context, I’m currently building the AutoStep Toolkit.

AutoStep is a new compiler, linker and runner for BDD (Behaviour Driven Development) tests, based on Gherkin syntax, but with some extra language features.

You can find the core library that provides this functionality at https://github.com/autostep/AutoStep.

I need to build a User Interface for writing AutoStep tests that is targeted at non-developer users, so using Visual Studio or VS Code as an editor doesn’t give the user experience I want.

I’ve chosen Blazor because:

  • I can load my netstandard AutoStep package directly into WebAssembly, so I don’t need a server component to run compilation.
  • I prefer to keep the amount of Javascript I have to write to a minimum.
  • I can share types between the front-end and the AutoStep project system.

Right now, I’m just building the basic editor control, before we build the rest of the user interface around it.

Below you can see a little demo GIF of how the editor control looks right now. You can see real-time syntax highlighting and test compilation as you type, with syntax errors being presented by the editor.

The rest of this post is basically going to go over how it works, and some of the WebAssembly magic that gives us this behaviour.

Integrating Monaco

Monaco is the VS Code editor, released as a standalone package that anyone can use; it’s really powerful, and gives us loads of basic text editor behaviour out of the box, even before we add the syntax highlighting and IDE-type functionality.

The first task was to get Monaco working as a Blazor component. I knew that I would need at least some Javascript code to function as the Interop layer, so rather than put that code in my main Blazor Client project (AutoStep.Editor.Client), I decided to put all the Monaco behaviour in a new Razor Component Library (AutoStep.Monaco), which I can use from my main project.

That way, I can keep the node_modules out of my main application project, nice and self-contained in its own folder.

I feel like it’s a pretty pleasing pattern to keep any JS interop out of the main Blazor app, in separated components. That way, the main application stays clean and only has to worry about components and general app state.

It also makes each component easier to test on its own.

I’m going to use TypeScript for my interop code, partly because I just like being in a typed world, but also because I can then consume the type definitions exposed by Monaco.

The only actual npm package I need to install and redistribute is monaco-editor, but I also need Webpack to compile my TypeScript and bundle everything together, plus the various Webpack plugins.

You can check the full package.json file for the set of required packages. There are only 10 packages listed, but even these dependencies result in 5483 installed packages!

To configure Webpack correctly, I used the Monaco Webpack Plugin, which just simplifies getting Monaco building under Webpack. If you follow the instructions in their README, you can’t really go wrong.

Static Files in Razor Component Libraries

One nice feature of Blazor is that if you put your static files in the wwwroot folder of a Razor Component project, when you reference your Component project from your main Blazor App project, you can reference those static resources in your HTML, just by using the special _content path:

<!-- Use the name of the referenced project (or package) -->
<script src="_content/AutoStep.Monaco/app.bundle.js"></script>

For this to work with the Webpack build, I had to do two things:

  • Configure Webpack to output my bundles to the wwwroot folder
  • Configure Monaco to load its dependencies from the _content/AutoStep.Monaco path

The first part was straightforward; you just have to change the Webpack output path:

//...
output: {
  globalObject: "self",
  filename: "[name].bundle.js",
  path: path.resolve(__dirname, 'wwwroot')
},
//...

For the Monaco configuration, the _content path has to be configured in three different locations. The first two are in the Webpack configuration file:

module: {
    rules: [
        // Other rules here...
        {
            test: /\.ttf$/,
            loader: 'file-loader',
            options:
            {
                publicPath: "/_content/AutoStep.Monaco"
            }
        }]
},
plugins: [
    new MonacoWebpackPlugin({publicPath: '/_content/AutoStep.Monaco/', languages: []})
]

I’ve also told the MonacoWebpackPlugin to not include any built-in languages in the output, because I’m not going to need them.

Finally, in the ‘entry point’ of your Javascript/Typescript (my MonacoInterop.ts), you need to tell Monaco where to load its web workers from:

// @ts-ignore
self.MonacoEnvironment = {
    getWorkerUrl: function (moduleId, label) {
        return "./_content/AutoStep.Monaco/editor.worker.bundle.js";
    }
};

Once all the above is done, I can just include the app bundle in my Blazor Client index.html file, and it will load in all the Monaco dependencies:

<body>
    <app class="d-flex">Loading...</app>

    <div id="blazor-error-ui">
        An unhandled error has occurred.
        <a href="" class="reload">Reload</a>
        <a class="dismiss">🗙</a>
    </div>
    <script src="_content/AutoStep.Monaco/app.bundle.js"></script>
    <script src="_content/Blazor.Fluxor/index.js"></script>
    <script src="_framework/blazor.webassembly.js"></script>
</body>

Blazor JS Interop & TypeScript

Once I’ve got the Monaco code loading in, I now need to use it. I’ll just go over a few tips for using TypeScript for doing Blazor JS Interop.

Interop Classes

The first tip is to define a sensible boundary between your .NET code and your TypeScript. First up, let’s define an entry-point TypeScript class attached to ‘window’:

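// Entry-point class for all interop calls made from .NET.
class MyInterop
{
    // Called from C# via InvokeVoidAsync("MyInterop.doSomething").
    doSomething(): void
    {
    }

    // Called from C# via InvokeAsync<string>("MyInterop.getSomething").
    getSomething(): string
    {
        return "something";
    }
}

// Attach a single instance to 'window' so Blazor can resolve "MyInterop.*" identifiers.
window['MyInterop'] = new MyInterop();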

In your C# code, create an internal class of the same name, and encapsulate those methods (I’ve also defined wrappers for the IJSRuntime methods that automatically prefix the name of my TypeScript class):

internal class MyInterop
{
    private readonly IJSRuntime jsRuntime;
    private readonly ILogger logger;

    private const string InteropPrefix = "MyInterop.";

    public MyInterop(IJSRuntime jsRuntime, ILogger<MyInterop> logger)
    {
        this.jsRuntime = jsRuntime;
        this.logger = logger;
    }

    public async ValueTask DoSomething()
    {
        await InvokeVoidAsync("doSomething");
    }

    public async ValueTask<string> GetSomething()
    {
        return await InvokeAsync<string>("getSomething");
    }

    private ValueTask<TResult> InvokeAsync<TResult>(string methodName, params object[] args)
    {
        var fullname = InteropPrefix + methodName;
        logger.LogTrace("InvokeAsync: {0}", fullname);
        return jsRuntime.InvokeAsync<TResult>(fullname, args);
    }

    private ValueTask InvokeVoidAsync(string methodName, params object[] args)
    {
        var fullname = InteropPrefix + methodName;
        logger.LogTrace("InvokeVoidAsync: {0}", fullname);
        return jsRuntime.InvokeVoidAsync(fullname, args);
    }
}

Log your JS Interop calls! This will help a lot with debugging later.
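
To make the prefixing a little less magic, here is a rough sketch of how a dotted identifier such as "MyInterop.getSomething" can be resolved against a root object and bound to its owner. It is only an illustration of the idea, not Blazor's actual code; resolveInteropFunction and fakeWindow are hypothetical names, and fakeWindow stands in for the browser's window so the sketch runs anywhere.

```typescript
// Hypothetical helper, for illustration only: walks a dotted identifier such as
// "MyInterop.getSomething" starting from a root object (in the browser, `window`).
type AnyFunction = (...args: unknown[]) => unknown;

function resolveInteropFunction(root: { [key: string]: unknown }, identifier: string): AnyFunction {
    let parent: unknown = undefined;
    let current: unknown = root;

    for (const segment of identifier.split(".")) {
        if (typeof current !== "object" || current === null || !(segment in current)) {
            throw new Error(`Could not find '${segment}' in '${identifier}'.`);
        }
        parent = current;
        current = (current as { [key: string]: unknown })[segment];
    }

    if (typeof current !== "function") {
        throw new Error(`'${identifier}' is not a function.`);
    }

    // Bind to the owning object so that `this` works inside class methods.
    return (current as AnyFunction).bind(parent);
}

class MyInterop {
    private greeting = "hello";

    getSomething(): string {
        return this.greeting;
    }
}

// Stand-in for `window`, so the sketch runs outside a browser.
const fakeWindow: { [key: string]: unknown } = { MyInterop: new MyInterop() };

const getSomething = resolveInteropFunction(fakeWindow, "MyInterop.getSomething");
const resolved = getSomething();
```

The bind step is the reason the TypeScript methods can safely use this, even though .NET only ever supplies a string identifier.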

In my AutoStep.Monaco library, I’ve got precisely this set-up (albeit with more methods), with the TypeScript in MonacoInterop.ts, and the C# in MonacoInterop.cs.

I added an extension method to my Razor Component Library that adds my MonacoInterop class to the Service Collection; I can call this during startup in my Blazor App.

public static class ServiceCollectionExtensions
{
    /// <summary>
    /// Add services for the Monaco component.
    /// </summary>
    public static IServiceCollection AddMonaco(this IServiceCollection services)
    {
        services.AddSingleton<MonacoInterop>();
        return services;
    }
}

Then I can inject the MonacoInterop class into any of my Razor Components inside my AutoStep.Monaco project, and invoke my TypeScript methods that way.

Calling Back into .NET Code from TypeScript

When an ‘event’ of some form happens inside the Monaco Editor, I need to invoke a method in my .NET Code.

So far, I’ve found the following pattern to be pretty useful.

First up, add a method to your Interop class to register an event handler.

public async ValueTask RegisterLanguageTokenizer(string languageId, string extension, ILanguageTokenizer tokenizer)
{
    // Wrap the 'tokenizer' in a DotNetObjectReference.
    await InvokeVoidAsync("registerLanguageTokenizer", languageId, extension, DotNetObjectReference.Create(tokenizer));
}

DotNetObjectReference wraps the object so that JS receives a handle to it, and any calls made on that handle are routed back to the original .NET instance.

In the implementation of ILanguageTokenizer, I have a couple of methods, all marked as [JSInvokable], which indicates they can be called from Javascript.

In your TypeScript Interop class, add the registerLanguageTokenizer method:

registerLanguageTokenizer(languageId: string, extension: string, blazorCallback: IBlazorInteropObject)
{
  // Store the blazorCallback object somewhere to call it in an event handler.
}

The IBlazorInteropObject is something I’ve added; it’s a simple TypeScript interface that defines the useful methods available on the object wrapper Blazor actually passes as that parameter.

/**
 * Interface that defines the useful methods on the .NET object reference passed by Blazor.
 */
export interface IBlazorInteropObject {
    invokeMethodAsync<T>(methodName: string, ...args: any[]): Promise<T>;
    invokeMethod<T>(methodName: string, ...args: any[]): T;

}

I can then use this IBlazorInteropObject to invoke my .NET code.

export class AutoStepTokenProvider implements languages.TokensProvider {
    private callback: IBlazorInteropObject;

    constructor(blazorCallback: IBlazorInteropObject) {
        this.callback = blazorCallback;
    }

    getInitialState(): languages.IState {
        return new AutoStepTokenState(this.callback.invokeMethod<number>("GetInitialState"));
    }

    tokenize(line: string, state: languages.IState): languages.ILineTokens {

        if (state instanceof AutoStepTokenState)
        {
            var result: any = this.callback.invokeMethod("Tokenize", line, state.tokenState);

            return { tokens: result.tokens, endState: new AutoStepTokenState(result.endState) };
        }

        throw "Invalid start state";
    }
}
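
The AutoStepTokenState class used above isn't shown in this post. As a minimal sketch of what such a wrapper could look like: Monaco's languages.IState asks for clone and equals methods, and the wrapper only needs to carry the numeric state handed back by the .NET tokeniser. The IState interface below is a local stand-in for the Monaco type, and the state numbers are made up for the demo.

```typescript
// Local stand-in for monaco-editor's languages.IState, so this runs on its own.
interface IState {
    clone(): IState;
    equals(other: IState): boolean;
}

// Hypothetical sketch: wraps the numeric state returned by the .NET tokeniser.
class AutoStepTokenState implements IState {
    constructor(public readonly tokenState: number) {}

    // Monaco may copy states, so hand back a fresh instance.
    clone(): IState {
        return new AutoStepTokenState(this.tokenState);
    }

    // Two states are equal when they wrap the same tokeniser state value.
    equals(other: IState): boolean {
        return other instanceof AutoStepTokenState && other.tokenState === this.tokenState;
    }
}

// Made-up state values, purely for demonstration.
const initialState = new AutoStepTokenState(0);
const copiedState = initialState.clone();
const otherState = new AutoStepTokenState(2);
```

Getting equals right matters, because the editor compares states between lines; a value comparison on the wrapped number keeps that cheap.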

Line Tokenisation & Syntax Highlighting

For people unfamiliar with it, syntax highlighting code usually involves tokenising a given line of code, which uses a lexer to go through a block of text and produce a set of tokens that give the position of named language constructs, like keywords, variables, strings, etc. The editor then knows which colours to apply to different parts of a line of text.

Monaco allows you to define a ‘grammar’ for a language you want to apply syntax highlighting to, using their Monarch system for describing languages using JSON. Monaco then does the tokenising for you, based on that configuration.

The problem with using Monarch in my situation is that the tokenisation would not be context-sensitive. By that, I mean that the tokenisation can only work off the content of the file it is highlighting, and cannot base the set of returned tokens on anything else.

In my situation, I want to highlight the Given/When/Then lines of a test a different colour if there is no backing step to call; in addition, I only know which part of a step is an argument (in red) based on which step it binds against.

This contextual information cannot be obtained just through using a declarative grammar; I need a more manual approach.

Luckily, Monaco lets you define a manual token provider, using the setTokensProvider method. By implementing the Monaco-defined interface languages.TokensProvider, we can run our own custom code when Monaco needs to re-tokenise a line.
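
As a rough sketch of that registration step: the two Monaco calls involved are languages.register and languages.setTokensProvider. To keep the example runnable without monaco-editor, the interfaces below are trimmed local copies of the Monaco shapes, fakeLanguages stands in for monaco.languages, and registerManualTokenizer, demoProvider and the ".as" extension are hypothetical.

```typescript
// Trimmed local copies of the Monaco shapes used here (the real ones live in monaco-editor).
interface IState { clone(): IState; equals(other: IState): boolean; }
interface IToken { startIndex: number; scopes: string; }
interface ILineTokens { tokens: IToken[]; endState: IState; }
interface TokensProvider {
    getInitialState(): IState;
    tokenize(line: string, state: IState): ILineTokens;
}

// The two members of monaco.languages that this sketch relies on.
interface LanguagesApi {
    register(language: { id: string; extensions?: string[] }): void;
    setTokensProvider(languageId: string, provider: TokensProvider): unknown;
}

// Hypothetical helper: registers a language, then attaches a manual token provider.
function registerManualTokenizer(api: LanguagesApi, languageId: string, extension: string, provider: TokensProvider): void {
    api.register({ id: languageId, extensions: [extension] });
    api.setTokensProvider(languageId, provider);
}

// A single shared state object is enough for a stateless demo provider.
const noState: IState = { clone: () => noState, equals: (other) => other === noState };

// Demo provider: lines starting with '#' are comments, everything else is plain text.
const demoProvider: TokensProvider = {
    getInitialState: () => noState,
    tokenize: (line) => ({
        tokens: [{
            startIndex: 0,
            scopes: line.trim().charAt(0) === "#" ? "comment.line.number-sign" : "text",
        }],
        endState: noState,
    }),
};

// Recording fake standing in for monaco.languages.
const registeredIds: string[] = [];
const attachedProviders: { [languageId: string]: TokensProvider } = {};
const fakeLanguages: LanguagesApi = {
    register: (language) => { registeredIds.push(language.id); },
    setTokensProvider: (languageId, provider) => { attachedProviders[languageId] = provider; return undefined; },
};

registerManualTokenizer(fakeLanguages, "autostep", ".as", demoProvider);
const commentTokens = attachedProviders["autostep"].tokenize("# a comment", noState).tokens;
```

In a real page you would pass monaco.languages as the first argument; taking it as a parameter just keeps the helper easy to exercise on its own.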

I showed you the TypeScript implementation of that interface earlier, when we were looking at how to call a .NET object from Javascript. All that the AutoStepTokenProvider TypeScript class does is call into an object in our Blazor .NET code, the AutoStepTokenizer, to handle the actual tokenisation.

The JS call for tokenisation must be a synchronous call because the Monaco tokenisation methods don’t allow me to return a promise (although it does execute in a background web worker).

Typically you’d want to make asynchronous calls into your .NET code where possible, but we can’t do that here.

To achieve the required tokenisation performance, I added Line Tokenisation support in the core AutoStep library, which is effectively a special-cased fast path through the normal compilation and linking process.

[JSInvokable]
public TokenizeResult Tokenize(string line, int state)
{
    try
    {
        var castState = (LineTokeniserState)state;

        logger.LogTrace("Tokenise Start in State {0}: {1}", castState, line);

        // Use the project compiler (in the core library) to tokenise.
        var tokenised = projectCompiler.TokeniseLine(line, castState);
        
        // Create the set of models that Monaco expects
        var tokenArray = tokenised.Tokens.Select(x => 
            new LanguageToken(x.StartPosition, TokenScopes.GetScopeText(x.Category, x.SubCategory)));

        return new TokenizeResult((int)tokenised.EndState, tokenArray);
    }
    catch (Exception ex)
    {
        logger.LogError(ex, "Tokenisation Error");
    }

    return new TokenizeResult(0, Array.Empty<LanguageToken>());
}

Once the AutoStep Core library returns the set of tokens for a line, I need to convert those tokens into TextMate scopes. Scopes are effectively names for the different tokens you can get, and Monaco can style each scope differently.

I put the scope mapping configuration in a static array in a TokenScopes class:

static TokenScopes()
{
    // Set up our scopes.
    InitScope("comment.line.number-sign", LineTokenCategory.Comment);
    InitScope("keyword", LineTokenCategory.StepTypeKeyword);
    InitScope("keyword", LineTokenCategory.EntryMarker);
    InitScope("entity.name", LineTokenCategory.EntityName);
    InitScope("entity.name.section", LineTokenCategory.EntityName, LineTokenSubCategory.Scenario);
    InitScope("entity.name.section", LineTokenCategory.EntityName, LineTokenSubCategory.ScenarioOutline);
    InitScope("entity.name.type", LineTokenCategory.EntityName, LineTokenSubCategory.Feature);
    InitScope("entity.annotation", LineTokenCategory.Annotation);
    InitScope("entity.annotation.opt", LineTokenCategory.Annotation, LineTokenSubCategory.Option);
    InitScope("entity.annotation.tag", LineTokenCategory.Annotation, LineTokenSubCategory.Tag);
    InitScope("string", LineTokenCategory.BoundArgument);
    InitScope("string.variable", LineTokenCategory.BoundArgument, LineTokenSubCategory.ArgumentVariable);
    InitScope("variable", LineTokenCategory.Variable);
    InitScope("markup.italic", LineTokenCategory.Text, LineTokenSubCategory.Description);
    InitScope("text", LineTokenCategory.Text);
    InitScope("entity.step.text", LineTokenCategory.StepText);
    InitScope("entity.step.text.bound", LineTokenCategory.StepText, LineTokenSubCategory.Bound);
    InitScope("entity.step.text.unbound", LineTokenCategory.StepText, LineTokenSubCategory.Unbound);
    InitScope("table.separator", LineTokenCategory.TableBorder);
}
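
The lookup behind a table like this can be sketched as a map keyed on category and sub-category. The version below is a TypeScript illustration rather than the C# from the repository: the enums are trimmed down, and falling back from an unmapped sub-category to the category-level scope (and finally to "text") is an assumption about the behaviour you would probably want.

```typescript
// Hypothetical, trimmed-down versions of the categories in the C# snippet above.
enum LineTokenCategory { Comment, EntityName, StepText, Text }
enum LineTokenSubCategory { None, Scenario, Feature, Bound, Unbound }

const scopeTable: { [key: string]: string | undefined } = {};

function scopeKey(category: LineTokenCategory, subCategory: LineTokenSubCategory): string {
    return `${category}:${subCategory}`;
}

function initScope(scope: string, category: LineTokenCategory, subCategory: LineTokenSubCategory = LineTokenSubCategory.None): void {
    scopeTable[scopeKey(category, subCategory)] = scope;
}

// Assumed behaviour: try the exact pair first, then the category alone, then plain text.
function getScopeText(category: LineTokenCategory, subCategory: LineTokenSubCategory = LineTokenSubCategory.None): string {
    return scopeTable[scopeKey(category, subCategory)]
        ?? scopeTable[scopeKey(category, LineTokenSubCategory.None)]
        ?? "text";
}

initScope("comment.line.number-sign", LineTokenCategory.Comment);
initScope("entity.name", LineTokenCategory.EntityName);
initScope("entity.name.section", LineTokenCategory.EntityName, LineTokenSubCategory.Scenario);
initScope("entity.step.text.unbound", LineTokenCategory.StepText, LineTokenSubCategory.Unbound);

const scenarioScope = getScopeText(LineTokenCategory.EntityName, LineTokenSubCategory.Scenario);
const fallbackScope = getScopeText(LineTokenCategory.EntityName, LineTokenSubCategory.Feature);
const unmappedScope = getScopeText(LineTokenCategory.Text);
```

Building the table once up front keeps the per-token work down to a key lookup, which matters when it runs for every token on every re-tokenised line.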

Finally, I define my own theme for Monaco so I can style the scopes:

editor.defineTheme('autostep', {
    base: 'vs',
    inherit: true,
    rules: [
        { token: "markup.italic", fontStyle: 'italic' },
        { token: "string.variable", fontStyle: 'italic' },
        { token: "variable", fontStyle: 'italic' },
        { token: "entity.step.text.unbound", foreground: '#969696' },
        { token: "entity.annotation.opt", foreground: '#fbad38' },
        { token: "entity.annotation.tag", foreground: '#fbad38' }
    ],
    colors: {} 
});

editor.setTheme('autostep');

Performance

It’s important to measure performance of code like this, especially because it needs to update the display in real-time as the user types.

If you run the profiler in Chrome DevTools, you can see the activity happening on the background thread that calls into the WebAssembly system, and get an idea of how long your code is spending in the .NET world.

I’ve highlighted which bits are doing what in the call stack, along with some timings.

It’s pretty quick! Even considering the hops into the WebAssembly space and back, tokenisation generally ranges between 3 and 6ms.
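
If you want numbers without opening the profiler every time, one option is to wrap the synchronous call in a small timing helper. This is a minimal sketch, not something taken from the AutoStep code: timed and TimingStats are hypothetical names, and the clock is passed in so that a high-resolution source (such as performance.now in the browser) can be swapped in; the default Date.now only has millisecond resolution.

```typescript
// Running totals for a wrapped function.
interface TimingStats {
    count: number;
    totalMs: number;
    maxMs: number;
}

// Hypothetical helper: wraps a synchronous function and records how long each call takes.
function timed<TArgs extends unknown[], TResult>(
    fn: (...args: TArgs) => TResult,
    stats: TimingStats,
    now: () => number = () => Date.now()
): (...args: TArgs) => TResult {
    return (...args: TArgs): TResult => {
        const start = now();
        try {
            return fn(...args);
        } finally {
            // Recorded even if the wrapped function throws.
            const elapsed = now() - start;
            stats.count += 1;
            stats.totalMs += elapsed;
            stats.maxMs = Math.max(stats.maxMs, elapsed);
        }
    };
}

// Demo with a scripted clock, so the recorded durations are predictable.
const ticks = [0, 4, 10, 13];
let tickIndex = 0;
const scriptedClock = (): number => ticks[tickIndex++] ?? 0;

const tokenizeStats: TimingStats = { count: 0, totalMs: 0, maxMs: 0 };
const timedTokenize = timed((line: string) => line.split(" ").length, tokenizeStats, scriptedClock);

const firstResult = timedTokenize("Given I have done something");
const secondResult = timedTokenize("Then it works");
```

Wrapping the invokeMethod call in the token provider this way would give a running average of the full round trip into WebAssembly and back.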

A lot of that performance, though, is down to the awesome parser engine we use in the AutoStep Core library, Antlr.

Antlr Overview

Antlr is a parser generator. It can take a grammar describing your language, and output a parser that will turn a block of text into a structured parse tree.

Considering the complexity of the task it has to perform, it produces really efficient parsers.

The Antlr generator is written in Java, but there are runtimes for the parser for a number of platforms, including .NET.

I’m not going to go into loads of depth on how Antlr works, because it is a really broad topic, but I can strongly recommend the excellent book by Terence Parr, which is a great intro and reference for Antlr.

The full lexer grammar and parser grammar for the AutoStep language can be found in the AutoStep repo.

Line Tokenising Parser

The full parse tree for AutoStep works over the entire file, validating positions and order in a detailed way. That won’t work for tokenising a single line at a time (and risks being too slow), so I added a simpler line-by-line entry-point into the parser (AutoStepLineTokeniser) that helps me tokenise just for this syntax highlighting purpose.

You might ask, why do I even need a parser for this? Surely a lexer is all I need to generate the tokens?

I generate a parse tree for each line because:

  • I want the parser to tell me what ‘alternative’ of the possible line structures I’m looking at.
  • The set of tokens that I report for syntax highlighting are based on similar structures to the full compile, which expect at least a partial parse tree.

Once I have the Antlr parse tree for the single line I can build a set of line tokens with the appropriate categorisations for each token.

If the line is a Step Reference (Given/When/Then), I ask the AutoStep linker if the Step Reference can be bound to an existing step.

The high-level pseudo-code for this whole process looks a little like this:

var parseTree = GetAntlrParseTree(lineText);

if(parseTree is StepReference stepRef)
{
    if(linker.TryBindStep(stepRef))
    {
        return GetTokensForBoundStep(stepRef);
    }

    return GetTokensForUnboundStep(stepRef);
}

return GetRegularTokens(parseTree);

Once the tokens are handed back to the Blazor App, they get turned into scopes and handed off to Monaco for rendering.

Compilation, Linking, and Message Markers

Ok, so we’ve got line tokenisation, and syntax highlighting. Now I want to show underline markers when something is wrong.

Monaco makes this an absolute breeze with a concept called ‘markers’, which exist for precisely this purpose. Let’s take a look at how this is arranged, starting with the line in the Razor file that renders our custom MonacoEditor component:

<MonacoEditor Uri="@currentFile.FileUri.ToString()" 
              Value="@currentFile.Source.OriginalBody" 
              ModelMarkers="currentMarkers" 
              OnModelChanged="m => CodeChangedHandler(m.CurrentValue)" 
              LanguageId="autostep" />

When the content of the Monaco Editor changes, after a short delay (so we don’t recompile after every keystroke), our CodeChangedHandler will be invoked, with the new content of the editor as an argument.
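
The ‘short delay’ is a classic debounce. Here is a minimal sketch of the idea, under a few assumptions: the post doesn't say where the delay is implemented, the 500ms figure is arbitrary, and debounce, Scheduler and ManualScheduler are hypothetical names. The timer is injected so the demo can run synchronously; in the browser you would pass something like { schedule: (cb, ms) => window.setTimeout(cb, ms), cancel: (h) => window.clearTimeout(h) }.

```typescript
// Abstraction over setTimeout/clearTimeout, so the timer can be replaced in a demo or test.
interface Scheduler {
    schedule(callback: () => void, delayMs: number): number;
    cancel(handle: number): void;
}

// Hypothetical helper: only the last call within the delay window reaches the handler.
function debounce<TArgs extends unknown[]>(
    handler: (...args: TArgs) => void,
    delayMs: number,
    scheduler: Scheduler
): (...args: TArgs) => void {
    let pending: number | undefined;

    return (...args: TArgs): void => {
        if (pending !== undefined) {
            scheduler.cancel(pending);
        }
        pending = scheduler.schedule(() => {
            pending = undefined;
            handler(...args);
        }, delayMs);
    };
}

// Demo scheduler that runs callbacks only when flush() is called.
class ManualScheduler implements Scheduler {
    private nextHandle = 1;
    private callbacks: { [handle: number]: (() => void) | undefined } = {};

    schedule(callback: () => void, _delayMs: number): number {
        const handle = this.nextHandle++;
        this.callbacks[handle] = callback;
        return handle;
    }

    cancel(handle: number): void {
        delete this.callbacks[handle];
    }

    // Runs everything still pending, as if the delay had elapsed.
    flush(): void {
        const due = this.callbacks;
        this.callbacks = {};
        for (const key of Object.keys(due)) {
            const callback = due[Number(key)];
            if (callback) {
                callback();
            }
        }
    }
}

const manualScheduler = new ManualScheduler();
const compiledSources: string[] = [];
const onCodeChanged = debounce((source: string) => { compiledSources.push(source); }, 500, manualScheduler);

// Three quick edits, then the delay elapses once.
onCodeChanged("Feature: A");
onCodeChanged("Feature: Ab");
onCodeChanged("Feature: Abc");
manualScheduler.flush();
```

Only the final edit reaches the handler, which is exactly what you want before kicking off a compile and link of the project.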

When currentMarkers changes, the MonacoEditor component will pass those new markers down to the Monaco Javascript.

When the code for the file is changed, we ask the Project Compiler to compile and link the entire project. Only those files that have changed actually get compiled.

When that has completed, we have a set of Compilation & Linker Messages for the file, for example:

(8,17,8,25): Error ASC00011: Not expecting an Examples block here; did you mean to define 'My Scenario' as a Scenario Outline rather than a Scenario?
(3,1): Error ASC20002: There are multiple matching step definitions that match this step.

To use those in Monaco, we just need to convert them into MarkerData structures, i.e. the format Monaco understands. I’ve defined a MarkerData class in C# that serialises directly to the equivalent Javascript structure.

private static MarkerData GetMarkerDataFromMessage(CompilerMessage msg)
{
    var severity = msg.Level switch
    {
        CompilerMessageLevel.Error => MarkerSeverity.Error,
        CompilerMessageLevel.Warning => MarkerSeverity.Warning,
        _ => MarkerSeverity.Info
    };

    var endPosition = msg.EndColumn;

    if(endPosition is null)
    {
        endPosition = msg.StartColumn;
    }
    else
    {
        // Expand message end to the location after the token
        endPosition++;
    }

    return new MarkerData($"ASC{(int)msg.Code:D5}", msg.Message, severity, msg.StartColumn, msg.StartLineNo, endPosition.Value, msg.EndLineNo ?? msg.StartLineNo);
}
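
For reference, here is a sketch of the shape this ends up as on the JavaScript side, written as the same conversion in TypeScript. The MarkerData fields and the MarkerSeverity values mirror Monaco's editor.IMarkerData and MarkerSeverity; the CompilerMessage interface is a hypothetical stand-in for the .NET type, and toMarkerData is not part of the AutoStep code.

```typescript
// Numeric values match Monaco's MarkerSeverity enum.
enum MarkerSeverity { Hint = 1, Info = 2, Warning = 4, Error = 8 }

// Mirrors the fields of Monaco's editor.IMarkerData that are used here.
interface MarkerData {
    code: string;
    message: string;
    severity: MarkerSeverity;
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// Hypothetical stand-in for the .NET CompilerMessage type.
interface CompilerMessage {
    level: "error" | "warning" | "info";
    code: number;
    message: string;
    startLineNo: number;
    startColumn: number;
    endLineNo?: number;
    endColumn?: number;
}

function toMarkerData(msg: CompilerMessage): MarkerData {
    const severity =
        msg.level === "error" ? MarkerSeverity.Error :
        msg.level === "warning" ? MarkerSeverity.Warning :
        MarkerSeverity.Info;

    // With no end position, collapse the marker to the start column;
    // otherwise move the end one column past the last character of the token.
    const endColumn = msg.endColumn === undefined ? msg.startColumn : msg.endColumn + 1;

    return {
        code: "ASC" + ("00000" + msg.code).slice(-5), // zero-pad to five digits
        message: msg.message,
        severity,
        startLineNumber: msg.startLineNo,
        startColumn: msg.startColumn,
        endLineNumber: msg.endLineNo === undefined ? msg.startLineNo : msg.endLineNo,
        endColumn,
    };
}

// The two example messages from above, expressed as data.
const rangeMarker = toMarkerData({
    level: "error", code: 11, message: "Not expecting an Examples block here",
    startLineNo: 8, startColumn: 17, endLineNo: 8, endColumn: 25,
});
const pointMarker = toMarkerData({
    level: "error", code: 20002, message: "There are multiple matching step definitions that match this step.",
    startLineNo: 3, startColumn: 1,
});
```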

Once I have the correct data structures, I can just pass those over to my TypeScript class using regular JS Interop, and call editor.setModelMarkers to update the set.

/**
    * Set the model markers for a text model.
    * @param textModelUri The URI of the text model.
    * @param owner The owner of the markers.
    * @param markers The full set of new markers for the model.
    */
setModelMarkers(textModelUri: string, owner: string, markers: editor.IMarkerData[])
{
    var modelCtxt = this.models[textModelUri];

    if (!modelCtxt) {
        throw "Specified model not created.";
    }

    editor.setModelMarkers(modelCtxt.textModel, owner, markers);
}

Hey, presto! Compilation errors, syntax highlighting, all in the browser with no server work beyond static file hosting!

What’s Next

Features going into the AutoStep Editor over the next few months include:

  • An actual User Interface, rather than just an Editor!
  • Intellisense, and automatic step suggestions as you type.
  • Hover documentation, showing step documentation if you hover over one.
  • Go-To-Reference for steps, which navigates to the Step Definition when the step was defined in an AutoStep file.

Keep an eye on the repository if you want to see how it goes; there may well be another couple of follow-up posts as we make progress.

2 replies on “Blazor WebAssembly, Monaco and Antlr – Building the AutoStep Editor as a Blazor App”

I saw on Github that the project is abandoned, but this blog post was still very useful for me, especially “line tokenization”


Thanks! I switched over to using VS Code for the IDE rather than building my own, but it’s a similar tokenisation process.
