
I have a <canvas> that I'm updating every 100 ms with bitmap image data coming from an HTTP request:

var ctx = canvas.getContext("2d");

setInterval(() => {
    fetch('/get_image_data').then(r => r.arrayBuffer()).then(arr => {
        var byteArray = new Uint8ClampedArray(arr);
        var imgData = new ImageData(byteArray, 500, 500);
        ctx.putImageData(imgData, 0, 0);
    });
}, 100);

This works when /get_image_data gives RGBA data. In my case, since alpha is always 100%, I don't send the A channel over the network. Questions:

  • how to efficiently do this when the request delivers RGB binary data?
  • and also when the request delivers grayscale binary data?

(Can we avoid a for loop, which might be slow in JavaScript for megabytes of data 10 times per second?)

Example in the grayscale => RGBA case: each input value ..., a, ... should be replaced by ..., a, a, a, 255, ... in the output array.

Here is a pure JS solution: ~10 ms for a 1000x1000px grayscale => RGBA array conversion.

Here is an attempt at a WASM solution.

Basj
  • [wasm](https://developer.mozilla.org/en-US/docs/WebAssembly) will be the fastest. You can also do the transform in a [worker](https://developer.mozilla.org/en-US/docs/Web/API/Worker) to keep the main thread free. – jsejcksn Aug 01 '22 at 09:38
  • Yes, wasm is bytecode, so it'll have to be precompiled. I haven't already written code for this case, but I'm sure it's already a solved problem and you'll find a solution if you do a bit of research: all that needs to be done is to add a `255` for every fourth array element. – jsejcksn Aug 01 '22 at 09:42
  • Re: compiling: The link I shared in my initial comment has guides for C-family languages and Rust. – jsejcksn Aug 01 '22 at 09:47
  • Re: the byte order: It sounds like you understand. See the code sample [here](https://developer.mozilla.org/en-US/docs/Web/API/ImageData/ImageData#javascript). – jsejcksn Aug 01 '22 at 09:49
  • "_Isn't there a more efficient way than to have to put full RGBA to a canvas?_": No, canvas data is always drawn in rectangles, and `ctx.putImageData` is very efficient. Do note that if you don't need transparency, then you should set [`alpha`](https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/getContext#alpha) to `false` when getting the canvas context. – jsejcksn Aug 01 '22 at 09:56
  • "_we still have to provide 4 byte per pixel data?_": Correct. – jsejcksn Aug 01 '22 at 10:09
  • I wouldn't bet on wasm to outperform this case by a lot. At the end of the day that's something the browser's optimizer should be good enough with, and I guess you'll get compiled code out of it anyway. If someone has the time and energy to write it I'd be curious to see the difference in perfs but I'm not sure it's worth it honestly. – Kaiido Aug 04 '22 at 03:28
  • @Kaiido I'm less familiar with JavaScript, but in the case of Python (interpreted as well), a `for x in range(1024): for y in range(768): do_something_on_bitmap(x, y)` is often 10x or 100x slower than the equivalent in Cython or C or Numpy (which finally calls compiled code). For this kind of task, I think compiled code can be much faster, thus the expectation about WASM :) – Basj Aug 04 '22 at 05:36
  • Sure compiled code is faster, but JS engines have JIT (Just In Time) compilers, so after a few rounds the engine will see that this path is hot enough for it to be worth being compiled and you'll actually get machine code running. Once again I'd really like to see the outcome of this particular case, but I believe wasm comes with its own tradeoffs and the (probably little) time spent on compiling while running should not be that noticeable. But I may be wrong, that's something one would have to test, on different environments. – Kaiido Aug 04 '22 at 06:08
  • @Kaiido On my machine, 1000x1000px grayscale => RGBA takes [~ 10 milliseconds](https://stackoverflow.com/a/73231895). We can probably beat this with more clever techniques or WASM? – Basj Aug 04 '22 at 07:28
  • @Basj A single, isolated render like in your answer doesn't give the JS engine a chance to [compile optimized machine code from a hot code path (JIT)](https://en.wikipedia.org/wiki/Just-in-time_compilation). Rather, you should test with hundreds or thousands of render iterations. – jsejcksn Aug 04 '22 at 08:32
  • @jsejcksn I tried repeating 1000 times just the array conversion function `function grayscale_to_rgba(array, rgba) { for (var i = 0; i < width*height; i++) { rgba[4*i] = array[i]; rgba[4*i+1] = array[i]; rgba[4*i+2] = array[i]; rgba[4*i+3] = 255; } }`, and then it decreases to ~ 4 ms per call on average indeed. Still, on the road for a WASM test :) https://stackoverflow.com/questions/73232586/how-to-pass-a-buffer-pointer-to-a-webassembly-function – Basj Aug 04 '22 at 08:36
  • Your image is only 1000 x 1000px? I don't know where I got that from but I was under the impression you were working with 5000x5000px images, the difference is huge. If it's really just 1000x1000px you shouldn't have much trouble to do this in 100ms using almost any naive implementation on most modern systems. – Kaiido Aug 04 '22 at 08:56
  • @Kaiido Yes 1000x1000 or max 2000x2000. However, 100ms is too much since I need to do this maybe 10 or 15 or 30 times per second :) If I can do it in 1 - 5 ms per call it's good. – Basj Aug 10 '22 at 19:27
  • @Basj Sorry for the suspense. Let me know if the _"using GPU"_ version helps. Also if it's closer to a solution for your specific problem, then we can try working to improve it. – VC.One Aug 11 '22 at 15:03

4 Answers


Converting an ArrayBuffer from RGB to RGBA is conceptually straightforward: just splice in an opaque alpha channel byte (255) after every RGB triplet. (Grayscale to RGBA is just as simple: copy each gray byte three times, then insert a 255.)
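
As a quick main-thread sketch of that idea in plain JS (just to illustrate the transform; the worker-based modules below are the actual answer, and this helper name is only for illustration):

// Naive RGB -> RGBA expansion: copy each triplet, then append an opaque alpha byte.
function rgbToRgba (rgb /* Uint8ClampedArray with length % 3 === 0 */) {
  const pixelCount = rgb.length / 3;
  const rgba = new Uint8ClampedArray(pixelCount * 4);
  for (let i = 0; i < pixelCount; i += 1) {
    rgba[i * 4]     = rgb[i * 3];     // R
    rgba[i * 4 + 1] = rgb[i * 3 + 1]; // G
    rgba[i * 4 + 2] = rgb[i * 3 + 2]; // B
    rgba[i * 4 + 3] = 255;            // A: fully opaque
  }
  return rgba;
}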

The (slightly) more challenging part of this problem is offloading the work to another thread with wasm or a worker.

Because you expressed familiarity with JavaScript, I'll provide an example of how it can be done in a worker using a couple of utility modules, and the code I'll show will use TypeScript syntax.

On the types used in the example: they are very weak (lots of `any`s) — they're present just to provide structural clarity about the data structures involved in the example. In strongly-typed worker application code, the types would need to be rewritten for the specifics of the application in each environment (worker and host), because all types involved in message passing are only contractual anyway.

Task-oriented worker code

The problem in your question is task-oriented (for each specific sequence of binary RGB data, you want its RGBA counterpart). Inconveniently in this case, the Worker API is message-oriented rather than task-oriented — meaning that we are only provided with an interface for listening for and reacting to every single message regardless of its cause or context — there's no built-in way to associate a specific pair of messages to-and-from a worker. So, the first step is to create a task-oriented abstraction on top of that API:

task-worker.ts:

export type Task<Type extends string = string, Value = any> = {
  type: Type;
  value: Value;
};

export type TaskMessageData<T extends Task = Task> = T & { id: string };

export type TaskMessageEvent<T extends Task = Task> =
  MessageEvent<TaskMessageData<T>>;

export type TransferOptions = Pick<StructuredSerializeOptions, 'transfer'>;

export class TaskWorker {
  worker: Worker;

  constructor (moduleSpecifier: string, options?: Omit<WorkerOptions, 'type'>) {
    this.worker = new Worker(moduleSpecifier, {...options ?? {}, type: 'module'});

    this.worker.addEventListener('message', (
      {data: {id, value}}: TaskMessageEvent,
    ) => void this.worker.dispatchEvent(new CustomEvent(id, {detail: value})));
  }

  process <Result = any, T extends Task = Task>(
    {transfer, type, value}: T & TransferOptions,
  ): Promise<Result> {
    return new Promise<Result>(resolve => {
      const id = globalThis.crypto.randomUUID();

      this.worker.addEventListener(
        id,
        (ev) => resolve((ev as unknown as CustomEvent<Result>).detail),
        {once: true},
      );

      this.worker.postMessage(
        {id, type, value},
        transfer ? {transfer} : undefined,
      );
    });
  }
}

export type OrPromise<T> = T | Promise<T>;

export type TaskFnResult<T = any> = { value: T } & TransferOptions;

export type TaskFn<Value = any, Result = any> =
  (value: Value) => OrPromise<TaskFnResult<Result>>;

const taskFnMap: Partial<Record<string, TaskFn>> = {};

export function registerTask (type: string, fn: TaskFn): void {
  taskFnMap[type] = fn;
}

export async function handleTaskMessage (
  {data: {id, type, value: taskValue}}: TaskMessageEvent,
): Promise<void> {
  const fn = taskFnMap[type];

  if (typeof fn !== 'function') {
    throw new Error(`No task registered for the type "${type}"`);
  }

  const {transfer, value} = await fn(taskValue);

  globalThis.postMessage(
    {id, value},
    transfer ? {transfer} : undefined,
  );
}

I won't over-explain this code: it's mostly just about picking and moving properties between objects so you can avoid all that boilerplate in your application code. Notably: it also abstracts the necessity of creating unique IDs for every task instance. I will talk about the three exports:

  • a class TaskWorker: For use in the host — it is an abstraction over instantiating a worker module and exposes the worker on its worker property. It also has a process method which accepts task information as an object argument and returns a promise of the result of processing the task. The task object argument has three properties:

    • type: the type of task to be performed (more on this below). This is simply a key that points to a task processing function in the worker.
    • value: the payload value that will be acted on by the associated task function
    • transfer: an optional array of transferable objects (I'll bring this up again later)
  • a function registerTask: For use in the worker — sets a task function to its associated type name in a dictionary so that the worker can use the function to process a payload when a task of that type is received.

  • a function handleTaskMessage: For use in the worker — this is simple, but important: it must be assigned to self.onmessage in your worker module script.

Efficient conversion of RGB (or grayscale) to RGBA

The second utility module has the logic for splicing the alpha bytes into the RGB data, and there's also a function for conversion from grayscale to RGBA:

rgba-conversion.ts:

/**
 * The bytes in the input array buffer must conform to the following pattern:
 *
 * ```
 * [
 *   r, g, b,
 *   r, g, b,
 *   // ...
 * ]
 * ```
 *
 * Note that the byte length of the buffer **MUST** be a multiple of 3
 * (`arrayBuffer.byteLength % 3 === 0`)
 *
 * @param buffer A buffer representing a byte sequence of RGB data elements
 * @returns RGBA buffer
 */
export function rgbaFromRgb (buffer: ArrayBuffer): ArrayBuffer {
  const rgb = new Uint8ClampedArray(buffer);
  const pixelCount = Math.floor(rgb.length / 3);
  const rgba = new Uint8ClampedArray(pixelCount * 4);

  for (let iPixel = 0; iPixel < pixelCount; iPixel += 1) {
    const iRgb = iPixel * 3;
    const iRgba = iPixel * 4;
    // @ts-expect-error
    for (let i = 0; i < 3; i += 1) rgba[iRgba + i] = rgb[iRgb + i];
    rgba[iRgba + 3] = 255;
  }

  return rgba.buffer;
}

/**
 * @param buffer A buffer representing a byte sequence of grayscale elements
 * @returns RGBA buffer
 */
export function rgbaFromGrayscale (buffer: ArrayBuffer): ArrayBuffer {
  const gray = new Uint8ClampedArray(buffer);
  const pixelCount = gray.length;
  const rgba = new Uint8ClampedArray(pixelCount * 4);

  for (let iPixel = 0; iPixel < pixelCount; iPixel += 1) {
    const iRgba = iPixel * 4;
    // @ts-expect-error
    for (let i = 0; i < 3; i += 1) rgba[iRgba + i] = gray[iPixel];
    rgba[iRgba + 3] = 255;
  }

  return rgba.buffer;
}

I think the iterative math code is self-explanatory here (however — if any of the APIs used here or in other parts of the answer are unfamiliar — MDN has explanatory documentation). I think it's noteworthy to point out that both the input and output values (ArrayBuffer) are transferable objects, which means that they can essentially be moved instead of copied between the host and worker contexts for improved memory and speed efficiency.
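
For example, transferring (rather than copying) a buffer looks like this from the host side; after the call, the buffer is detached in the sending context (a sketch only, assuming `worker` is a Worker instance and using the same `{id, type, value}` message shape as above):

const buffer = new ArrayBuffer(1000 * 1000 * 3); // e.g. ~3 MB of raw RGB data
console.log(buffer.byteLength); // 3000000

// Listing the buffer in the transfer array moves its memory to the worker instead of copying it.
worker.postMessage({id: 'example-id', type: 'rgb-rgba', value: buffer}, {transfer: [buffer]});

console.log(buffer.byteLength); // 0 (the buffer is now detached in this context)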

Also, thanks @Kaiido for providing information that was used to improve the efficiency of this approach over a technique used in an earlier revision of this answer.

Creating the worker

The actual worker code is pretty minimal because of the abstractions above:

worker.ts:

import {
  rgbaFromGrayscale,
  rgbaFromRgb,
} from './rgba-conversion.js';
import {handleTaskMessage, registerTask} from './task-worker.js';

registerTask('rgb-rgba', (rgbBuffer: ArrayBuffer) => {
  const rgbaBuffer = rgbaFromRgb(rgbBuffer);
  return {value: rgbaBuffer, transfer: [rgbaBuffer]};
});

registerTask('grayscale-rgba', (grayscaleBuffer: ArrayBuffer) => {
  const rgbaBuffer = rgbaFromGrayscale(grayscaleBuffer);
  return {value: rgbaBuffer, transfer: [rgbaBuffer]};
});

self.onmessage = handleTaskMessage;

All that's needed in each task function is to move the buffer result to the value property in the return object and to signal that its underlying memory can be transferred to the host context.

Example application code

I don't think anything will surprise you here: the only boilerplate is mocking fetch to return an example RGB buffer since the referenced server in your question isn't available to this code:

main.ts:

import {TaskWorker} from './task-worker.js';

const tw = new TaskWorker('./worker.js');

const buf = new Uint8ClampedArray([
  /* red */255, 0, 0, /* green */0, 255, 0, /* blue */0, 0, 255,
  /* cyan */0, 255, 255, /* magenta */255, 0, 255, /* yellow */255, 255, 0,
  /* white */255, 255, 255, /* grey */128, 128, 128, /* black */0, 0, 0,
]).buffer;

const fetch = async () => ({arrayBuffer: async () => buf});

async function main () {
  const canvas = document.createElement('canvas');
  canvas.setAttribute('height', '3');
  canvas.setAttribute('width', '3');

  // This is just to sharply upscale the 3x3 px demo data so that
  // it's easier to see the squares:
  canvas.style.setProperty('image-rendering', 'pixelated');
  canvas.style.setProperty('height', '300px');
  canvas.style.setProperty('width', '300px');

  document.body
    .appendChild(document.createElement('div'))
    .appendChild(canvas);

  const context = canvas.getContext('2d', {alpha: false})!;

  const width = 3;

  // This is the part that would happen in your interval-delayed loop:
  const response = await fetch();
  const rgbBuffer = await response.arrayBuffer();

  const rgbaBuffer = await tw.process<ArrayBuffer>({
    type: 'rgb-rgba',
    value: rgbBuffer,
    transfer: [rgbBuffer],
  });

  // And if the fetched resource were grayscale data, the syntax would be
  // essentially the same, except that you'd use the type name associated with
  // the grayscale task that was registered in the worker:

  // const grayscaleBuffer = await response.arrayBuffer();

  // const rgbaBuffer = await tw.process<ArrayBuffer>({
  //   type: 'grayscale-rgba',
  //   value: grayscaleBuffer,
  //   transfer: [grayscaleBuffer],
  // });

  const imageData = new ImageData(new Uint8ClampedArray(rgbaBuffer), width);
  context.putImageData(imageData, 0, 0);
}

main();

Those TypeScript modules just need to be compiled and the main script run as a module script in your HTML.

I can't make performance claims without access to your server data, so I'll leave that to you. If there's anything that I overlooked in explanation (or anything that's still not clear), feel free to ask in a comment.

jsejcksn
  • Thanks a lot for this great answer! I've never used TypeScript before: am I correct that, at the end, after TS->JS compilation/transpilation, the result will be interpreted JavaScript? Then the performance of this TS solution will be the same as a pure JS solution, is this correct? If so it would be great to include a wasm version if you have time (I can add a bounty for this!) to have compiled-code speed. – Basj Aug 02 '22 at 17:04
  • @Basj That’s correct: compiling (or just type-stripping) will result in plain JavaScript. – jsejcksn Aug 02 '22 at 17:21
  • Updated with the algorithm for and detail about conversion of grayscale inputs – jsejcksn Aug 04 '22 at 00:57
  • Not sure what's the point of the `createImageBitmap` call here. For a one shot where you already have the ImageData you won't win anything over `putImageData` directly, putImageData is 2 to 3 times faster than createImageBitmap + drawImage. Once again, for a one shot, if you had to paint that image a lot of times, then yes that'd make sense, but not here. – Kaiido Aug 04 '22 at 03:26
  • @Kaiido It's a detail trade-off: It depends on the size of the binary data. `ImageData` is not [transferable](https://developer.mozilla.org/en-US/docs/Glossary/Transferable_objects) while `ImageBitmap` is, so — for large images — the move is potentially substantially faster than the copy. And — at the interval described in the question detail (100ms) — the "2–3x" difference in the op time is negligible. Also re: "`putImageData` vs `drawImage`": note that I mentioned the `ctx.transferFromImageBitmap` method in the answer. – jsejcksn Aug 04 '22 at 03:30
  • The underlying buffer of the ImageData is transferable, the ImageData object itself is super small, as small as the ImageBitmap's one that needs to be cloned too. Only the underlying bitmap is actually transferable, the wrapper JS object isn't either. And `transferFromImageBitmap` is almost as fast as drawImage, the slow path is in createImageBitmap, and yes, 2x does matter. If they really have big frames (like 5K * 5K), they already don't have much time left after doing the conversion. – Kaiido Aug 04 '22 at 03:44
  • @Kaiido I'm curious about your 2–3x slowdown claim: Can you provide some empirical data to support that? If it's as impactful as you say, I should _definitely_ revise the answer details. – jsejcksn Aug 04 '22 at 03:47
  • Oh, I did forget to add my benchmark tests!? Sorry, here it is: https://jsfiddle.net/ue1ymxdf/ ~2x on Safari & Chrome, ~3x on Firefox – Kaiido Aug 04 '22 at 04:20
  • @Kaiido Thanks! I really appreciate the information you've provided here. I'll revise that part of my answer to reflect it. – jsejcksn Aug 04 '22 at 04:33
  • @jsejcksn I'm loving this pattern, but my worker tasks need access to some shared state. I'm playing around with maybe passing in the worker module itself, or some global object, as the first argument to `TaskFn`, but I thought I'd see if you have any thoughts on the best way tasks could share state. – Sasgorilla Aug 15 '22 at 19:13
  • @Sasgorilla Please [ask a new question](https://stackoverflow.com/questions/ask). – jsejcksn Aug 15 '22 at 20:14
  • @jsejcksn It turns out I was accidentally re-creating my `TaskWorker` on every render, which explains why my simple attempt at shared state was not working. It's easy enough to put a `const state = {}` in the outer scope on `worker.ts` and read and write values into it from each task. – Sasgorilla Aug 16 '22 at 22:34

Typed array views.

You can use typed arrays to create a view of the pixel data.

So, for example, if you have a byte array const foo = new Uint8Array(size), you can create a view of it as a 32-bit word array using const foo32 = new Uint32Array(foo.buffer).

foo32 is the same data, but JS sees it as 32-bit words rather than bytes; creating it is a zero-copy operation with almost no overhead.

Thus you can move 4 bytes in one operation.
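
A quick sketch of the zero-copy relationship between the two views:

const bytes = new Uint8Array(8);              // 8 bytes, all zero
const words = new Uint32Array(bytes.buffer);  // same memory, viewed as two 32-bit words

words[0] = 0xFF000000;                        // a single 32-bit write...
console.log(bytes);                           // [0, 0, 0, 255, 0, 0, 0, 0]
                                              // (on a little-endian platform, which is practically every current device)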

Unfortunately you still need to index and format the byte data from one of the arrays (as grayscale or RGB).

However, there are still worthwhile performance gains in using typed array views.

Moving gray scale pixels

Example moving gray scale bytes

// src array as Uint8Array one byte per pixel
// dest is Uint8Array 4 bytes RGBA per pixel
function moveGray(src, dest, width, height) {
    var i;
    const destW = new Uint32Array(dest.buffer);
    const alpha = 0xFF000000;  // alpha is the high byte. Bits 24-31
    for (i = 0; i < width * height; i++) {
        const g = src[i];
        destW[i] = alpha + (g << 16) + (g << 8) + g;
    }    
}

is about 40% faster than

function moveBytes(src, dest, width, height) {
    var i,j = 0;
    for (i = 0; i < width * height * 4; ) {
        dest[i++] = src[j];
        dest[i++] = src[j];
        dest[i++] = src[j++];
        dest[i++] = 255;
    }    
}

Where src and dest are Uint8Array pointing to the source gray bytes, and destination RGBA bytes.

Moving RGB pixels

To move RGB to RGBA you can use

// src array as Uint8Array 3 bytes per pixel as red, green, blue
// dest is Uint8Array 4 bytes RGBA per pixel
function moveRGB(src, dest, width, height) {
    var i, j = 0;
    const destW = new Uint32Array(dest.buffer);
    const alpha = 0xFF000000;  // alpha is the high byte. Bits 24-31
    for (i = 0; i < width * height; i++) {
        destW[i] = alpha + src[j++] + (src[j++] << 8) + (src[j++] << 16);
    }    
}

Which is about 30% faster than moving bytes as follows

// src array as Uint8Array 3 bytes per pixel as red, green, blue
function moveBytes(src, dest, width, height) {
    var i, j = 0;
    for (i = 0; i < width * height * 4; ) {
        dest[i++] = src[j++];
        dest[i++] = src[j++];
        dest[i++] = src[j++];
        dest[i++] = 255;
    }    
}
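
A usage sketch tying either function to the canvas (assuming ctx is a 2D context and srcBytes holds the fetched bytes, e.g. new Uint8Array(await (await fetch('/get_image_data')).arrayBuffer())):

const width = 500, height = 500;
const rgba = new Uint8ClampedArray(width * height * 4); // can be reused between frames

moveRGB(srcBytes, rgba, width, height);  // or moveGray(srcBytes, rgba, width, height) for grayscale input

ctx.putImageData(new ImageData(rgba, width, height), 0, 0);
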
Blindman67
  • Beware, things aren't that simple. Different browsers will have very different results, based on the size of the input. For instance, in Chrome I have the Uint32Array roughly 30% faster on a 100x100 (image size, so src is 100x100x3 Uint8 and dest is 100x100 Uint32), ~20% faster on a 1000x1000, and it falls down to 11% on a 5000x5000. Then on Firefox I've got somehow inverse results, with ~30% on a 5000x5000, only ~6% on a 1000x1000 and -20% on a 100x100. Yep, in my Firefox Uint8Array is actually faster on small inputs. And that is only the results from a "benchmark": https://jsfiddle.net/1gupqt6s/ – Kaiido Aug 06 '22 at 02:14

Regarding your main concerns:

  • "How to avoid using a For loop...?"
  • "Can we do better with WASM or other techniques?"
  • "I need to do this maybe 10 or 15 or 30 times per second"

I would suggest you try using the GPU for processing your pixels in this task.

You can go from the CPU-based canvas.getContext("2d") to the GPU by using canvas.getContext("webgl").

Switching your <canvas> into WebGL (GPU) mode means it can now accept pixel data in more formats, including RGB and even LUMINANCE (where a single grey input value is automatically written across the R, G and B channels of the GPU canvas).
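
The key call is texImage2D. A minimal sketch of uploading the raw fetched bytes straight to a bound texture (assumes gl is a WebGL 1 context, bytes is the fetched Uint8Array and mode is "grey" or "rgb"; the full, runnable page is further below):

function uploadPixels(gl, bytes, width, height, mode)
{
    //# LUMINANCE = 1 byte per pixel (expanded by the GPU to L,L,L,1), RGB = 3 bytes per pixel (alpha is implicitly 1.0)
    var format = (mode == "grey") ? gl.LUMINANCE : gl.RGB;

    gl.pixelStorei(gl.UNPACK_ALIGNMENT, 1); //# rows are tightly packed (1 or 3 bytes per pixel, no padding)
    gl.texImage2D(gl.TEXTURE_2D, 0, format, width, height, 0, format, gl.UNSIGNED_BYTE, bytes);
}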


You can read more information here: WebGL introduction to "Data Textures"

WebGL is not fun to set up... It's a lot of code, but worth it for the "almost-at-light" speed that it gives back.

Below is example code modified from my other answer (itself modified from this JSFiddle, which I learned from back when I was a beginner at GPU programming).

Example code: creates a 1000x1000 texture, re-fills it with RGB/Grey at a rate of "N" FPS.

variables:

  • pix_FPS : set FPS rate (will be used as 1000/FPS).
  • pix_Mode : set input pixel's type as "grey" or set as "rgb"

Test it out...

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>WebGL</title>

    <style> body {background-color: white; } </style>  
</head>

<body>

    <br>

    <button id="btn_draw" onclick="draw_Pixels()"> Draw Pixels </button>
    <br><br>

   <canvas id="myCanvas" width="1000" height="1000"></canvas>

<!-- ########## Shader code ###### -->
<!-- ### Shader code here -->


<!-- Fragment shader program -->
<script id="shader-fs" type="x-shader/x-fragment">

//<!-- //## code for pixel effects goes here if needed -->

//# these two vars will access 
varying mediump vec2 vDirection;
uniform sampler2D uSampler;

void main(void) 
{
    //# reading thru current image's pixel colors (no FOR-loops needed like in JS 2D Canvas)
    gl_FragColor = texture2D(uSampler, vec2(vDirection.x * 0.5 + 0.5, vDirection.y * 0.5 + 0.5));

    ///////////////////////////////////////////////////////
    //# Example of basic colour effect on INPUT pixels ///////

    /*
    gl_FragColor.r = ( gl_FragColor.r * 0.95 );
    gl_FragColor.g = ( gl_FragColor.g * 0.3333 );
    gl_FragColor.b = ( gl_FragColor.b * 0.92 );
    */
}

</script>

<!-- Vertex shader program -->
<script id="shader-vs" type="x-shader/x-vertex">

    attribute mediump vec2 aVertexPosition;
    varying mediump vec2 vDirection;

    void main( void ) 
    {
        gl_Position = vec4(aVertexPosition, 1.0, 1.0) * 2.0;
        vDirection = aVertexPosition;
    }

</script>

<!-- ### END Shader code... -->


<script>

//# WebGL setup

//# Pixel setup for transferring to GPU
//# pixel mode and the corresponding GPU formats...

//# set image width and height (also changes Canvas width/height)
var pix_Width = 1000; 
var pix_Height = 1000;

var pix_data = new Uint8Array( pix_Width * pix_Height );

var pix_FPS = 30; //# MAX is 60-FPS (or 60-Hertz)

var pix_Mode = "grey" //# can be "grey" or "rgb"
var pix_Format;
var pix_internalFormat;
const pix_border = 0;

const glcanvas = document.getElementById('myCanvas');
const gl = ( ( glcanvas.getContext("webgl") ) || ( glcanvas.getContext("experimental-webgl") ) );

//# check if WebGL is available..
if (gl && gl instanceof WebGLRenderingContext) { console.log( "WebGL is available"); }

//# use regular 2D Canvas functions if this happens...
else { console.log( "WebGL is NOT available" ); alert( "WebGL is NOT available" ); } 

//# change Canvas width/height to match input image size
//glcanvas.style.width = pix_Width+"px"; glcanvas.style.height = pix_Height+"px";
glcanvas.width = pix_Width; glcanvas.height = pix_Height;

//# create and attach the shader program to the webGL context
var attributes, uniforms, program;

function attachShader( params ) 
{
    fragmentShader = getShaderByName(params.fragmentShaderName);
    vertexShader = getShaderByName(params.vertexShaderName);

    program = gl.createProgram();
    gl.attachShader(program, vertexShader);
    gl.attachShader(program, fragmentShader);
    gl.linkProgram(program);

    if (!gl.getProgramParameter(program, gl.LINK_STATUS)) 
    { alert("Unable to initialize the shader program: " + gl.getProgramInfoLog(program)); }

    gl.useProgram(program);

    // get the location of attributes and uniforms
    attributes = {};

    for (var i = 0; i < params.attributes.length; i++) 
    {
        var attributeName = params.attributes[i];
        attributes[attributeName] = gl.getAttribLocation(program, attributeName);
        gl.enableVertexAttribArray(attributes[attributeName]);
    }

    uniforms = {};

    for (i = 0; i < params.uniforms.length; i++) 
    {
        var uniformName = params.uniforms[i];
        uniforms[uniformName] = gl.getUniformLocation(program, uniformName);

    }

}

function getShaderByName( id ) 
{
    var shaderScript = document.getElementById(id);

    var theSource = "";
    var currentChild = shaderScript.firstChild;

    while(currentChild) 
    {
        if (currentChild.nodeType === 3) { theSource += currentChild.textContent; }
        currentChild = currentChild.nextSibling;
    }

    var result;

    if (shaderScript.type === "x-shader/x-fragment") 
    { result = gl.createShader(gl.FRAGMENT_SHADER); } 
    else { result = gl.createShader(gl.VERTEX_SHADER); }

    gl.shaderSource(result, theSource);
    gl.compileShader(result);

    if (!gl.getShaderParameter(result, gl.COMPILE_STATUS)) 
    {
        alert("An error occurred compiling the shaders: " + gl.getShaderInfoLog(result));
        return null;
    }
    return result;
}

//# attach shader
attachShader({
fragmentShaderName: 'shader-fs',
vertexShaderName: 'shader-vs',
attributes: ['aVertexPosition'],
uniforms: ['someVal', 'uSampler'],
});

// some webGL initialization
gl.clearColor(0.0, 0.0, 0.0, 0.0);
gl.clearDepth(1.0);
gl.disable(gl.DEPTH_TEST);

positionsBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, positionsBuffer);
var positions = [
  -1.0, -1.0,
   1.0, -1.0,
   1.0,  1.0,
  -1.0,  1.0,
];
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(positions), gl.STATIC_DRAW);

var vertexColors = [0xff00ff88,0xffffffff];

var cBuffer = gl.createBuffer();

verticesIndexBuffer = gl.createBuffer();
gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, verticesIndexBuffer);

var vertexIndices = [ 0,  1,  2,      0,  2,  3, ];

gl.bufferData(  
                gl.ELEMENT_ARRAY_BUFFER,
                new Uint16Array(vertexIndices), gl.STATIC_DRAW
            );

texture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, texture);

//# set FILTERING (where needed, used when resizing input data to fit canvas)
//# must be LINEAR to avoid subtle pixelation (double-check this... test other options like NEAREST)


//# for bi-linear filtering
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);


/*
// for non-filtered pixels
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
*/

gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.bindTexture(gl.TEXTURE_2D, null);

// update the texture with freshly generated pixel data
function updateTexture() 
{
    gl.bindTexture(gl.TEXTURE_2D, texture);
    gl.pixelStorei(gl.UNPACK_FLIP_Y_WEBGL, true);
    gl.pixelStorei(gl.UNPACK_ALIGNMENT, 1); //1 == read one byte or 4 == read integers, etc

    //# for RGB vs LUMINANCE

    pix_Mode = "grey"; //pix_Mode = "rgb";

    if ( pix_Mode == "grey") { pix_Format = gl.LUMINANCE; pix_internalFormat = gl.LUMINANCE; }
    if ( pix_Mode == "rgb") { pix_Format = gl.RGB; pix_internalFormat = gl.RGB; }

    //# update pixel Array with custom data
    pix_data = new Uint8Array(pix_Width*pix_Height).fill().map(() => Math.round(Math.random() * 255));

    //# upload the raw pixel data to the texture (note: for video/image sources, Safari requires the data to come from the same domain/server as this html code)
    gl.texImage2D(gl.TEXTURE_2D, 0, pix_internalFormat,  pix_Width, pix_Height, pix_border, pix_Format, gl.UNSIGNED_BYTE, pix_data);
    gl.bindTexture(gl.TEXTURE_2D, null);
};

</script>

<script>


//# Vars for video frame grabbing when system/browser provides a new frame
var requestAnimationFrame = (window.requestAnimationFrame || window.mozRequestAnimationFrame ||
                            window.webkitRequestAnimationFrame || window.msRequestAnimationFrame);

var cancelAnimationFrame = (window.cancelAnimationFrame || window.mozCancelAnimationFrame);

///////////////////////////////////////////////


function draw_Pixels( ) 
{
    //# initialise GPU variables for usage
    //# begin updating pixel data as texture

    let testing = "true";

    if( testing == "true" )
    {
        updateTexture(); //# refill the texture with the current frame's pixel data...

        gl.useProgram(program); //# apply our program

        gl.bindBuffer(gl.ARRAY_BUFFER, positionsBuffer);
        gl.vertexAttribPointer(attributes['aVertexPosition'], 2, gl.FLOAT, false, 0, 0);

        //# Specify the texture to map onto the faces.
        gl.activeTexture(gl.TEXTURE0);
        gl.bindTexture(gl.TEXTURE_2D, texture);
        //gl.uniform1i(uniforms['uSampler'], 0);

        //# Draw GPU
        gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, verticesIndexBuffer);
        gl.drawElements(gl.TRIANGLES, 6, gl.UNSIGNED_SHORT, 0);
    }

    //# re-capture the next frame... basically make the function loop itself
    //requestAnimationFrame( draw_Pixels ); 
    setTimeout( () => requestAnimationFrame( draw_Pixels ), (1000 / pix_FPS) );

}

// ...the end. ////////////////////////////////////

    </script>
</body>
</html>
VC.One

For completeness, here is a pure JS version.

1000 x 1000 px grayscale array → RGBA array

~ 9 or 10 milliseconds on my machine.

Can we do better with WASM or other techniques?

var width = 1000, height = 1000;
var array = new Uint8Array(width*height).fill().map(() => Math.round(Math.random() * 255))
var ctx = document.getElementById("canvas").getContext("2d");
grayscale_array_to_canvas(array, width, height, ctx);

function grayscale_array_to_canvas(array, width, height, ctx) {
    var startTime = performance.now();
    var rgba = new Uint8ClampedArray(4*width*height);
    for (var i = 0; i < width*height; i++) {
        rgba[4*i] = array[i];
        rgba[4*i+1] = array[i];
        rgba[4*i+2] = array[i];
        rgba[4*i+3] = 255;
    }    
    console.log(`${performance.now() - startTime} ms`);    
    var imgData = new ImageData(rgba, width, height);
    ctx.putImageData(imgData, 0, 0);
}
<canvas id="canvas"></canvas>
Basj
  • This _looks_ like you compiled the code in my answer, copying parts but ignoring the worker relationship. The reason it's important to do this off the main thread is that [the worker thread can perform tasks without interfering with the user interface](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers). As the resolution of the input data increases (e.g. `5_000` px ², `10_000` px ², etc.), the time required by the loop increases as well — potentially hundreds of milliseconds. If this runs on the same main thread, the UI fails to respond until the loop completes. – jsejcksn Aug 04 '22 at 08:12
  • @jsejcksn Yes I'm in the process of doing benchmarks, so for completeness, I wanted to post a 100% JS version without workers. I don't have TS toolset installed (I never used TS), but as soon as I will have it, I will do benchmark on your solution as well. – Basj Aug 04 '22 at 08:17
  • If you can't / don't want to install a TS compiler, you can copy + paste each module from my answer into the code editor in the [TypeScript Playground](https://www.typescriptlang.org/play?noUncheckedIndexedAccess=true&target=99&useUnknownInCatchVariables=true&exactOptionalPropertyTypes=true#code/Q) and see the JS output in the panel to the right of the editor. (That link URL includes some configuration settings, but you can adjust those as well if you prefer.) – jsejcksn Aug 04 '22 at 08:27
  • For the grayscale version you should be able to win a few very little µs by using an Uint32Array instead of an Uint8ClampedArray, that would do a single write instead of 4, but that's not that noticeable. https://jsfiddle.net/0zustpqw/ (And as said previously, doing a single execution time measure like that on such small time ought to be misleading, the best is to test on your real code). – Kaiido Aug 04 '22 at 09:11
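
For reference, a sketch of the Uint32Array variant Kaiido describes: the same function as in the answer above, but packing each pixel with a single 32-bit write (this assumes a little-endian platform, which is what the RGBA byte order of ImageData maps to on virtually all current devices):

function grayscale_array_to_canvas_32(array, width, height, ctx) {
    var rgba = new Uint8ClampedArray(4*width*height);
    var rgba32 = new Uint32Array(rgba.buffer); // same memory, one 32-bit word per pixel
    for (var i = 0; i < width*height; i++) {
        var g = array[i];
        rgba32[i] = 0xFF000000 | (g << 16) | (g << 8) | g; // bytes come out as g, g, g, 255
    }
    ctx.putImageData(new ImageData(rgba, width, height), 0, 0);
}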