EDDYMENS

Updated 2 months ago

Creating A Browser-based Interactive Terminal (Using XtermJS And NodeJS)

Why a terminal in the browser?

A browser-based terminal has many practical use cases.

Instead of sharing server access through SSH keys, you can provide access via a web-based terminal that is secured behind a login screen, which you control. For instance, DigitalOcean offers browser-based terminal access to its droplets, and Play with Docker [↗] allows terminal access to Docker containers directly in the browser.

If you build developer tools, you may eventually need to include an embedded terminal in your app.

Another useful scenario is offering controlled terminal access, where you can restrict certain commands from being run for security or management purposes.

Get the source code

The source code for the interactive terminal is on GitHub [↗]. The README.md file lists all the steps you need to get it up and running.

The purpose of this post is to explain how the different parts of the code work.

How it works

XtermJS structure [→]

The frontend is a terminal emulator built with good old HTML, CSS, and JavaScript.

Whenever the user types a command in the emulator, it's sent to a NodeJS backend over a WebSocket connection.

The reason for using WebSocket instead of plain HTTP is that real-time, two-way communication lets us better mimic how a real terminal works.

When the backend receives a command from the terminal emulator, it runs the command in an actual terminal. For simple commands like ls, the terminal returns the list of files and folders once, which is easy to send back to the emulator. A single request and response would be enough for that.

But for commands like netstat, the terminal updates the screen continuously, and it would be hard to handle those updates using HTTP. With WebSocket, the real terminal can send updates whenever they happen, and the frontend emulator can also send input in real-time. This way, even things like progress bars work smoothly.

Frontend setup

As mentioned earlier, what you see in the browser is an emulator: it's meant to reproduce the experience you get with a real terminal. For this demo, I used XtermJS [↗].

It provides the look of a terminal as well as a bunch of hooks. XtermJS does not interpret terminal commands; it only provides the interface, captures the user's input, and makes it available to you through keystroke hooks. It also deals with character encoding and all the little things that deliver that terminal experience.

You can pull in XtermJS as a Node module or use a CDN. I went with the latter for this demo.

First, there is the stylesheet that styles the emulator:

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/xterm@4.19.0/css/xterm.css" />

Then there is the JavaScript library that initializes XtermJS and exposes its functionality, such as the hooks:

<script src="https://cdn.jsdelivr.net/npm/xterm@4.19.0/lib/xterm.js"></script>

You will also need to tell XtermJS which HTML element it should embed the terminal's DOM elements in.

In the demo, that is this element:

<div id="terminal"></div>

OK, so now let's look at some of the frontend JavaScript code. The following code instantiates XtermJS.

var term = new window.Terminal({
  cursorBlink: true
});
term.open(document.getElementById('terminal'));

View on GitHub [↗]

Once XtermJS is loaded from the CDN, it adds a global `Terminal` constructor. This contains everything we need to work with XtermJS; we just need to instantiate it with the options we want. In the demo, we only set the cursor to blink.

The last line binds the terminal to an HTML element, a DIV in this case.

There is also some WebSocket code we should look at:

The line below instantiates the WebSocket and establishes a connection to the backend.

const socket = new WebSocket("ws://localhost:6060");

View on GitHub [↗]

When a user types a command, it is sent over the WebSocket to the backend. A newline character is added at the end of the command to simulate pressing the "Enter" key, so when the backend runs the command in the actual terminal, it behaves as if the user hit "Enter" after typing.

socket.send(command + '\n');

View on GitHub [↗]
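How the keystrokes turn into a command string is not shown above. Here is a minimal sketch of one way to do it, assuming you listen to XtermJS's onData hook and buffer characters until the user presses Enter. The createLineBuffer helper and its send and echo callbacks are hypothetical names made up for illustration; the demo may do this differently.

```javascript
// Hypothetical helper (not from the demo): collects typed characters into a
// line and hands the finished line to `send` when Enter is pressed.
function createLineBuffer(send, echo) {
  let line = '';
  return function handleData(data) {
    if (data === '\r') {
      // XtermJS reports Enter as a carriage return.
      echo('\r\n');
      send(line + '\n');
      line = '';
    } else if (data === '\x7f') {
      // Backspace: drop the last character and erase it on screen.
      if (line.length > 0) {
        line = line.slice(0, -1);
        echo('\b \b');
      }
    } else {
      line += data;
      echo(data);
    }
  };
}

// Browser wiring, assuming the `term` and `socket` objects created earlier:
// term.onData(createLineBuffer(
//   (command) => socket.send(command),
//   (text) => term.write(text)
// ));
```

Whether you echo characters locally depends on your backend: a pseudo-terminal usually echoes input itself, so you may need to drop the echo callback or filter the duplicate on the way back.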

When a response arrives from the backend, the onmessage handler fires and writes the data directly into the emulator with term.write(event.data).

socket.onmessage = (event) => {
  term.write(event.data);
};

View on GitHub [↗]

Backend implementation

Now, let's discuss the backend implementation. It serves as a bridge between the terminal emulator and the real terminal, forwarding commands and their responses between the two.

This is done through a WebSocket connection, using the WS [↗] Node.js package.

When the frontend sends a command to the backend, we need to pass it to the actual terminal. While we could use Node.js's exec method to run the command, there is a significant drawback: exec runs each command in a fresh, non-interactive shell and buffers the output until the command exits. This becomes a problem if the command expects further input from the user, because there is no persistent terminal session in which the user could respond to the prompt.
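One quick way to see why exec is a poor fit for an interactive session is to check whether state survives between calls. The sketch below uses execSync, the synchronous sibling of exec, and assumes a POSIX shell; XTERM_DEMO_VAR is an arbitrary variable name chosen for the example.

```javascript
const { execSync } = require('child_process');

// Each call starts its own short-lived shell, so the variable exported here
// no longer exists when the second command runs.
execSync('export XTERM_DEMO_VAR=hello');
const seen = execSync('echo "${XTERM_DEMO_VAR:-unset}"').toString().trim();

console.log(seen); // prints "unset" on a POSIX shell
```

The same applies to cd, shell aliases, and anything else a user expects to carry over from one command to the next.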

To work around this, we could manipulate the standard buffer [↗] as part of the command so that it stays open for further input. A better solution, though, is to use a pseudo-terminal. The node-pty [↗] package does exactly this: it spawns a shell instance and keeps the terminal session active until we explicitly close it.

// `pty` is the node-pty module; `shell` is the shell to launch (e.g. 'bash').
const spawnShell = () => {
  return pty.spawn(shell, [], {
    name: 'xterm-color',
    env: process.env,
  });
};

View on GitHub [↗]

The code below sends commands to the actual terminal whenever the frontend sends them over. ptyProcess.write writes its argument to the shell's standard input, as if the user had typed it there.

ws.on('message', command => {
  const processedCommand = commandProcessor(command);
  ptyProcess.write(processedCommand);
});

View on GitHub [↗]
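The body of commandProcessor is not shown here. As an illustration of the controlled-access idea from the introduction, a hypothetical version could refuse a short list of commands before they reach the shell. The BLOCKED list and the replacement echo command are assumptions for this sketch, not the demo's actual code. Recent versions of the ws package deliver messages as a Buffer, hence the toString() call.

```javascript
// Hypothetical denylist; adjust to whatever you want to restrict.
const BLOCKED = new Set(['rm', 'shutdown', 'reboot']);

// Sketch of a command processor: returns the text to write to the shell.
function commandProcessor(command) {
  const text = command.toString(); // accepts a string or a Buffer
  const firstWord = text.trim().split(/\s+/)[0];
  if (BLOCKED.has(firstWord)) {
    // Replace the command with a harmless message instead of running it.
    return `echo "${firstWord}: command not allowed"\n`;
  }
  return text;
}

console.log(JSON.stringify(commandProcessor('ls -la\n')));
console.log(JSON.stringify(commandProcessor('rm -rf /tmp/x\n')));
```

A first-word check like this is easy to bypass (sudo rm, command chaining with ;, shell aliases), so treat it as a convenience filter only. Real isolation needs OS-level measures such as a restricted user or a container.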

We listen for the terminal's output using the code below, then send it back to the frontend over the WebSocket connection.

ptyProcess.on('data', (rawOutput) => {
  const processedOutput = outputProcessor(rawOutput);
  ws.send(processedOutput);
});

View on GitHub [↗]

Shared and individualized sessions

If you spawn the shell instance when the backend server starts, all frontend users will be limited to a single shared shell session. This means no matter how many browser instances are opened, they will all connect to the same terminal session.

This setup works well for scenarios where the app only needs to support a single user.

However, if you need to support multiple independent shell sessions, you should spawn a new shell instance whenever a WebSocket connection is established. Alternatively, you could add a button in the frontend that prompts the backend to create a new shell instance.

The demo app implements the first of these approaches (a new shell per connection); the button-driven variant is easy to add.

In /src/server.js, calling setSharedTerminalMode(false) [↗] makes the backend spawn a new shell instance whenever a WebSocket connection is established [↗].
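To make the difference between the two modes concrete, here is a rough sketch of how the choice could be structured. createShellProvider is a hypothetical helper, and spawnShell is passed in as a parameter so the example runs without node-pty installed; the demo's own implementation may look different.

```javascript
// Hypothetical helper: returns a function that yields the shell to use for a
// newly connected client.
function createShellProvider(spawnShell, sharedMode) {
  // In shared mode, one shell is created up front and reused by everyone.
  const sharedShell = sharedMode ? spawnShell() : null;
  return function getShell() {
    return sharedMode ? sharedShell : spawnShell();
  };
}

// Stand-in for a real spawnShell, just to show the behaviour.
let spawned = 0;
const fakeSpawnShell = () => ({ id: ++spawned });

const shared = createShellProvider(fakeSpawnShell, true);
console.log(shared() === shared()); // true: every connection gets the same shell

const individual = createShellProvider(fakeSpawnShell, false);
console.log(individual() === individual()); // false: a new shell each time
```

In a real server you would call getShell() inside the WebSocket server's connection handler and wire the returned shell to that socket.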

Cleaning Up

For individualized sessions, the spawned shell is destroyed [↗] when the WebSocket connection is closed, preventing unused shell instances from lingering on your server. However, shared sessions persist, meaning there will always be at least one active shell instance running.
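A minimal sketch of that cleanup step could look like the following. attachCleanup is a hypothetical name, and the socket and shell are replaced by stand-ins so the snippet runs on its own. With the real packages, ws sockets emit a close event and node-pty processes expose kill().

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: kill a per-connection shell once its socket closes.
// In shared mode the shell is left running for the other clients.
function attachCleanup(ws, ptyProcess, sharedMode) {
  ws.on('close', () => {
    if (!sharedMode) {
      ptyProcess.kill();
    }
  });
}

// Stand-ins for a WebSocket and a node-pty process.
const fakeSocket = new EventEmitter();
const fakePty = { killed: false, kill() { this.killed = true; } };

attachCleanup(fakeSocket, fakePty, false);
fakeSocket.emit('close');
console.log(fakePty.killed); // true
```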

And that's pretty much it. 😊