
Is there a way to more strictly type the following two functions toCsv() and toArray() such that typeof csv is

[["key", "life", "goodbye"], ...[string, number, boolean][]]

instead of

[("key" | "life" | "goodbye")[], ...(string | number | boolean)[][]]

and typeof original is the same as typeof values, that is

{ key: string, life: number, goodbye: boolean }[]

instead of

{ key: any, life: any, goodbye: any }[]

I realize that iteration order of { key: 'value', life: 42, goodbye: false } using for...in is not guaranteed and I'm fine with that. Any consistent order that aligns keys with respective values for each row is acceptable, even if the TypeScript compiler doesn't produce the same order as runtime, since the usage doesn't rely on any particular ordering.

type Key<T> = Extract<keyof T, string>;
type Column<T> = [Key<T>, ...T[Key<T>][]];
type Columns<T> = [Key<T>[], ...T[Key<T>][][]];

function toCsv<T> (array: T[]): Columns<T> {
    const columns: Column<T>[] = [];

    for (const key in array[0]) {
        columns.push([key, ...array.map(value => value[key])]);
    }

    const keys: Key<T>[] = [];
    const rows: T[Key<T>][][] = array.map(() => []);

    for (const [key, ...values] of columns) {
        keys.push(key);

        for (const [index, row] of rows.entries()) {
            row.push(values[index]);
        }
    }

    return [keys, ...rows];
}

function toArray<T> (csv: Columns<T>): T[] {
    const [keys, ...rows] = csv;

    return rows.map(
        row => keys.reduce(
            (o, key, index) => Object.assign(o, { [key]: row[index] }),
            {} as Partial<T>
        ) as T
    );
}

const values = [{ key: 'value', life: 42, goodbye: false }];
const csv = toCsv(values);
const original = toArray(csv);
Patrick Roberts
  • Just want to get clarification, your main purpose is so that when you call `result = toArray(toCsv(xs))` you expect the type of `result` to be the same as `xs` right? – Wong Jia Hau Oct 02 '19 at 03:19
  • @WongJiaHau yes that is correct, but I am also interested in having the intermediate type of `toCsv(xs)` being the same as I requested. – Patrick Roberts Oct 02 '19 at 03:39
  • I still have some understanding issue here: Why do you want to type something (`rows`) that is `...(string | number | boolean)[][]` (I think you forget that additional `[]` layer ?) to be of type `[string, number, boolean]` instead? Wouldn't that work against your implementation? – ford04 Oct 02 '19 at 08:37
  • @PatrickRoberts Thanks for the update. The thing I did was just comparing your mentioned type for `typeof csv`, which you said is `[("key" | "life" | "goodbye")[], ...(string | number | boolean)[]]` with the one the playground IDE hints by hovering over `const csv`. The latter reveals, that it has the type `[("key" | "life" | "goodbye")[], ...(string | number | boolean)[][]]` - so different from your mentioned type with one additional array layer - , which puzzled me a bit. – ford04 Oct 02 '19 at 19:04
  • PS: Here is the [playground](https://www.typescriptlang.org/play/#code/C4TwDgpgBA0hIB4AqA+KBeKBRAHsATgIYDGwCA1vAPYBmUSANFAM4ECWAdgOYoDcAUKEhQAwlQA2AVwC2HZGkwBtOIlRMAdJqTL48gLqK9egUOhips5vIxQdqlIY1a7+w4eP9+AEwjFxhfGgaSQ5SNioOKGAqEWYAN2sACgCiEAAuekMASgzzGQ4rVChvX39AqGDQ4HDI6IBBfFSk4njciXzClBzMvU9iCNYoOMIpCGYbRQBvKEp0qAByYdH5pnE2GggMgBYAJiYuKiovACMQTYqR5mgAXw9+foLgKBa4m2jYuMSlyTGsgQfBlR8GwuJwRm8qA1UokXn9+EA) (too long URL for one comment) – ford04 Oct 02 '19 at 19:05
  • @ford04 oh, good catch! Thank _you_ for following up on your request for clarification. I'll update my question, but thankfully I already have a good solution to work with. – Patrick Roberts Oct 02 '19 at 19:08

2 Answers

I wouldn't try to go the route of outputting a particular tuple ordering. As you already noted, the actual result might not be in that order, so it would be misleading to present it as such a type. Lying to the compiler is sometimes necessary or useful, but in this case I don't see a major benefit.

Furthermore, even if I wanted to do this, it's actually not easy to get the compiler to turn a union like keyof T into an ordered tuple. The type "a"|"b" is the same exact type as "b"|"a"; the compiler may very well use one or the other or both without letting you know, and so anything you do that produces ["a", "b"] vs ["b", "a"] from that is likely to switch around when you don't expect it. You can abuse the type system to make this happen, but it's really messy and fragile and I recommend against it.
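To make the ordering point concrete, here is a minimal sketch. The `Equal` helper is a commonly used type-equality idiom that I'm adding purely for illustration (it is not built into TypeScript), and the names `AB`, `BA`, `unionsAreIdentical` and `tuplesDiffer` are hypothetical:

```typescript
// Two spellings of the same union.
type AB = "a" | "b";
type BA = "b" | "a";

// Illustrative helper (an assumption, not a built-in): resolves to `true`
// only when the compiler treats X and Y as identical types.
type Equal<X, Y> =
  (<T>() => T extends X ? 1 : 2) extends (<T>() => T extends Y ? 1 : 2)
    ? true
    : false;

// The unions are indistinguishable, so there is no "first" member to pick.
const unionsAreIdentical: Equal<AB, BA> = true;

// Tuples, by contrast, do encode order, so these two are different types.
const tuplesDiffer: Equal<["a", "b"], ["b", "a"]> = false;
```

Since the union carries no order, any union-to-tuple conversion has to invent one, which is why the result can change without warning.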


If you really want to use tuples, you could avoid the ordering issue by turning a union like "a"|"b" into a union of all possible tuples like ["a", "b"] | ["b", "a"]. That is actually a bit easier to represent in the type system because it's symmetric over the union members, but is still messy because once you have a decent number of properties the number of elements in the union becomes unmanageable (yay factorial). The upside here is that you are really and truly as honest as possible about the output type. Here's one way to implement it:

type UnionToAllPossibleTuples<T, U = T> = [T] extends [never]
    ? []
    : T extends unknown ? [T, ...UnionToAllPossibleTuples<Exclude<U, T>>] : never;

type MergedColumns<T> = UnionToAllPossibleTuples<
  { [K in keyof T]: { key: K; val: T[K] } }[keyof T]
>;

type Lookup<T, K> = K extends keyof T ? T[K] : never;

type UnmergeColumns<T> = T extends any
  ? [
      { [K in keyof T]: Lookup<T[K], "key"> },
      ...{ [K in keyof T]: Lookup<T[K], "val"> }[]
    ]
  : never;

type Columns<T> = UnmergeColumns<MergedColumns<T>>;

And you can verify this works:

interface TestType {
  key: string;
  life: number;
  goodbye: boolean;
}

type ColumnsTestType = Columns<TestType>;
// type ColumnsTestType =
// | [["key", "life", "goodbye"], ...[string, number, boolean][]]
// | [["key", "goodbye", "life"], ...[string, boolean, number][]]
// | [["life", "key", "goodbye"], ...[number, string, boolean][]]
// | [["life", "goodbye", "key"], ...[number, boolean, string][]]
// | [["goodbye", "key", "life"], ...[boolean, string, number][]]
// | [["goodbye", "life", "key"], ...[boolean, number, string][]]

That's fun, but probably still too fragile and messy to be something I'd recommend.
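As a rough sketch of what the union buys you in practice, the snippet below repeats the types above (so that it is self-contained) and assigns rows in two different key orders. It assumes a TypeScript version that supports recursive conditional types with tuple spreads; the variable names `declaredOrder` and `otherOrder` are hypothetical:

```typescript
type UnionToAllPossibleTuples<T, U = T> = [T] extends [never]
  ? []
  : T extends unknown ? [T, ...UnionToAllPossibleTuples<Exclude<U, T>>] : never;

type MergedColumns<T> = UnionToAllPossibleTuples<
  { [K in keyof T]: { key: K; val: T[K] } }[keyof T]
>;

type Lookup<T, K> = K extends keyof T ? T[K] : never;

type UnmergeColumns<T> = T extends any
  ? [
      { [K in keyof T]: Lookup<T[K], "key"> },
      ...{ [K in keyof T]: Lookup<T[K], "val"> }[]
    ]
  : never;

type Columns<T> = UnmergeColumns<MergedColumns<T>>;

interface TestType {
  key: string;
  life: number;
  goodbye: boolean;
}

// Any consistent ordering is accepted, as long as every row lines up
// with the header row.
const declaredOrder: Columns<TestType> = [
  ["key", "life", "goodbye"],
  ["value", 42, false],
];
const otherOrder: Columns<TestType> = [
  ["goodbye", "key", "life"],
  [false, "value", 42],
];

// A row whose values do not line up with the header, for example
// [["life", "key", "goodbye"], ["value", 42, false]], matches none of the
// six union members and is rejected by the compiler.
```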


Backing up, it seems like the thing you really care about is preserving the type T across toCsv() and toArray(), and that the original array type, while accurate, was lossy. In that case, how about this minor change to your original code?

type Columns<T> = [Key<T>[], ...T[Key<T>][][]] & { __original?: T };

Here, Columns<T> is essentially the same type as before but has an optional extra property named __original with the type T. This property will never actually be present or used at runtime. Yes, you are possibly deceiving the compiler here, but not actually lying: the values coming out of toCsv() will have no __original property, which does match {__original?: T}. The deception is useful, though, since it gives the compiler enough information to understand what happens on the round trip. Observe:

const values = [{ key: "value", life: 42, goodbye: false }];
const csv = toCsv(values);
// const csv: Columns<{ key: string; life: number; goodbye: boolean; }>
const original = toArray(csv); 
// const original: { key: string; life: number; goodbye: boolean; }[]

That looks good to me and what I'd recommend.
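For reference, here is a self-contained sketch of the whole round trip with the phantom property. The function bodies are a simplified stand-in of my own rather than the code from the question, and the `as unknown as` casts are an assumption made to keep the sketch independent of compiler settings:

```typescript
type Key<T> = Extract<keyof T, string>;

// Same runtime shape as before; `__original` exists only at the type level
// so that T can be recovered by toArray().
type Columns<T> = [Key<T>[], ...T[Key<T>][][]] & { __original?: T };

function toCsv<T extends object>(array: T[]): Columns<T> {
  const first = array[0];
  const keys = (first === undefined ? [] : Object.keys(first)) as Key<T>[];
  const rows = array.map(item => keys.map(key => item[key]));
  // The "small lie": nothing named __original is ever attached to the result.
  return [keys, ...rows] as unknown as Columns<T>;
}

function toArray<T>(csv: Columns<T>): T[] {
  const keys = csv[0];
  const rows = csv.slice(1) as T[Key<T>][][];
  return rows.map(row => {
    const item: Record<string, unknown> = {};
    keys.forEach((key: string, index) => {
      item[key] = row[index];
    });
    return item as unknown as T;
  });
}

const values = [{ key: "value", life: 42, goodbye: false }];
const csv = toCsv(values);     // Columns<{ key: string; life: number; goodbye: boolean }>
const original = toArray(csv); // { key: string; life: number; goodbye: boolean }[]
```

The serialized form is unchanged by this: `JSON.stringify(csv)` still yields only the header row followed by the value rows, because the phantom property is never set.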


RECAP: If you want to lie to the compiler, don't lie about tuple order. Telling the truth about tuple order is too messy. Instead, tell a small lie about an optional property.

Okay, hope that helps. Good luck!


jcalz
  • Incredible TypeScript answers as always! I was anticipating a solution akin to the "honest but messy" approach, based on my reading of [a particular github issue](https://github.com/microsoft/TypeScript/issues/13298), but I think your final suggestion is most pragmatic, and the best part is it doesn't require any major refactoring. – Patrick Roberts Oct 02 '19 at 19:06
  • @PatrickRoberts isn't this the same answer that I provided to you? – Wong Jia Hau Oct 03 '19 at 06:49
  • @WongJiaHau Your answer is great but you changed the runtime behavior of the program, which I don't think works for OP. – jcalz Oct 03 '19 at 13:16

My solution is a little hacky, but it works. The trick is that the type T is carried by the original property, so it can be recovered exactly instead of being re-derived from the types of the keys and values.

type CSV<T> = {values: ((keyof T)[] | (T[keyof T])[])[], original?: T}

const toCsv = <T extends object>(values: T[]): CSV<T> => {
  if(values.length === 0) {
    throw new Error('Values must contain at least one element')
  }
  else {
    return {
      values: [
        Object.keys(values[0]) as (keyof T)[],
        ...values.map(Object.values) as T[keyof T][][],
      ] as ((keyof T)[] | (T[keyof T])[])[],
      original: undefined
    }  
  }
}

const toArray = <T extends object>(csv: CSV<T>): T[] => {
  const keys = csv.values[0] as (keyof T)[]
  const valuess = csv.values.slice(1) as ((T[keyof T])[])[]
  return valuess.map(values => values.reduce<T>((result, value, index) => ({...result as any, [keys[index]]: value}), {} as T))
}

const values = [{ key: 'value', life: 42, goodbye: false }];
const csv = toCsv(values);
const original = toArray(csv);

console.log(csv.values) // this will be in the required intermediate format
console.log(original)

type Result = typeof original extends typeof values ? true : never

The type of original will be the same as that of values. You can check this yourself by hovering over Result.

Besides type checking, the implementation also works at runtime.

Note that the original property is not used anywhere in the toArray function; its only purpose is to pass type information around.

Wong Jia Hau
  • This doesn't use the same structure as I indicated in my question. The intermediate format is intended to be serialized, and must not be changed. Also your `values.map(value => Object.keys(value).map(key => value[key]))` might as well be `values.map(Object.values)` – Patrick Roberts Oct 02 '19 at 03:45
  • Can you provide an example of that intermediate format? It's not mentioned anywhere in the question. – Wong Jia Hau Oct 02 '19 at 03:48
  • It's literally at the top of my question: `[["key", "life", "goodbye"], [string, number, boolean]]` – Patrick Roberts Oct 02 '19 at 03:49
  • That is the type of the intermediate format, I mean I want a value example. – Wong Jia Hau Oct 02 '19 at 03:50
  • `[["key", "life", "goodbye"], ["value", 42, false]]`... I thought you'd be able to infer that from the type. – Patrick Roberts Oct 02 '19 at 03:51
  • I appreciate the effort, but this still doesn't meet the requirement because your `toArray()` needs an instance of the original type in the data structure, and given the use-case I mentioned in my initial response to your answer, that's not possible. – Patrick Roberts Oct 02 '19 at 04:07
  • @PatrickRoberts `toArray` function does not need the `original` value, if the `original` value is `undefined` it will still work, if you're trying to convert CSV that has the intermediate format, you can just provide the type for the `toArray` function, for example: `toArray<{ key: string, life: number, goodbye: boolean }>([])` – Wong Jia Hau Oct 02 '19 at 04:12