
I'm trying to analyze data given as an array of nested objects. I want to use the fp-ts ecosystem, and I'm trying to figure out how to combine a group-by calculation with any predefined function (for example, one that calculates the average, median, mode, sum, or standard deviation).

Example

I have an array of objects, where each object holds data about a different student. Here we have 3 students.

const studentsGrades = [
  {
    name: 'john',
    age: 21,
    classes: {
      history: {
        grade: 89,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 95,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 81,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 77,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },

  {
    name: 'amanda',
    age: 20,
    classes: {
      history: {
        grade: 95,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 99,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 89,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 65,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },

  {
    name: 'rachel',
    age: 19,
    classes: {
      history: {
        grade: 80,
        semester: 'spring',
        category: 'humanities',
      },
      math: {
        grade: 90,
        semester: 'all_year',
        category: 'quantitative',
      },
      physics: {
        grade: 100,
        semester: 'fall',
        category: 'quantitative',
      },
      literature: {
        grade: 88,
        semester: 'spring',
        category: 'humanities',
      },
    },
  },
];

I want to perform different calculations. For example, what is the average grade for physics? What is the median grade for literature? What is the standard deviation in grades of humanities classes?


One way for me to reason about it is to define independent functions that each perform one of those calculations on an array. For example:
average

const calcMean = (arr: number[]): number => {
    return arr.reduce((acc, v, i, a) => acc + v / a.length, 0); // https://stackoverflow.com/a/62372003/6105259
};

median

const calcMedian = (arr: number[]): number | undefined => {
  if (!arr.length) return undefined; // no median for an empty array
  const s = [...arr].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 0 ? ((s[mid - 1] + s[mid]) / 2) : s[mid];
}; // https://stackoverflow.com/a/70806192/6105259

standard deviation

const calcStandardDeviation = (arr: number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
}; // https://stackoverflow.com/a/68258974/6105259

Alright but now what? How can I apply any function of interest (i.e., either calcMean(), calcMedian(), or calcStandardDeviation()) on studentsGrades to answer my analysis questions by grouping by the relevant key?

Emman
    In addition to your calculation functions, you'll need some functions to grab the relevant data from your data structure. I'm imagining something like `getClassResults("physics", studentsGrades)` and `getCategoryResults("humanities", studentsGrades)`. Then you can combine these "accessor" type functions with your calculation functions to get the final results you want – cdimitroulas Mar 30 '22 at 08:59

1 Answer


If you're using fp-ts, calcMedian should return an Option instead of undefined. It's also good practice to type array parameters as readonly when the function doesn't modify them:

import * as O from 'fp-ts/Option';

const calcMean = (arr: readonly number[]): number => {
  return arr.reduce((acc, v) => acc + v, 0) / arr.length;
};

const calcMedian = (arr: readonly number[]): O.Option<number> => {
  if (!arr.length) return O.none;
  const sorted = [...arr].sort((a, b) => a - b);
  const mid = Math.trunc(sorted.length / 2);
  return O.some(
    sorted.length % 2 === 0
      ? (sorted[mid - 1]! + sorted[mid]!) / 2
      : sorted[mid]!
  );
};

const calcStandardDeviation = (arr: readonly number[]): number => {
  const mean = calcMean(arr);
  const variance = arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return Math.sqrt(variance);
};
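
Note that calcMean and calcStandardDeviation above still return a plain number, so an empty array gives NaN (0 / 0), and so does a single-element array for the standard deviation because of the arr.length - 1 divisor. If that matters for your analysis, the same Option guard could be applied to them. A minimal sketch, assuming hypothetical names safeMean and safeStandardDeviation; to keep it runnable on its own it declares a small local stand-in (Opt, none, some) that mirrors the shape of fp-ts's Option, which in a real project you would replace with `import * as O from 'fp-ts/Option'`:

```typescript
// Local stand-in mirroring the shape of fp-ts's Option (an assumption made so
// the snippet has no dependencies; use fp-ts/Option in real code).
type Opt<A> =
  | { readonly _tag: 'None' }
  | { readonly _tag: 'Some'; readonly value: A };
const none: Opt<never> = { _tag: 'None' };
const some = <A>(value: A): Opt<A> => ({ _tag: 'Some', value });

// Mean is undefined for an empty array, so return none instead of NaN.
const safeMean = (arr: readonly number[]): Opt<number> =>
  arr.length === 0
    ? none
    : some(arr.reduce((acc, v) => acc + v, 0) / arr.length);

// Sample standard deviation (n - 1 divisor) needs at least two values.
const safeStandardDeviation = (arr: readonly number[]): Opt<number> => {
  if (arr.length < 2) return none;
  const mean = arr.reduce((acc, v) => acc + v, 0) / arr.length;
  const variance =
    arr.reduce((s, n) => s + (n - mean) ** 2, 0) / (arr.length - 1);
  return some(Math.sqrt(variance));
};
```

With this shape, callers have to handle the "not enough data" case explicitly rather than discovering a NaN later in the pipeline.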

To extract the grades for a given class or category:

import * as RA from 'fp-ts/ReadonlyArray';
import * as O from 'fp-ts/Option';
import {pipe} from 'fp-ts/function';

const gradesByClass = (className: string): readonly number[] =>
  pipe(
    studentsGrades,
    RA.filterMap(({classes}) => O.fromNullable(classes[className]?.grade))
  );

const gradesByCategory = (categoryName: string): readonly number[] =>
  pipe(
    studentsGrades,
    RA.chain(({classes}) => Object.values(classes)),
    RA.filterMap(({category, grade}) => category === categoryName ? O.some(grade) : O.none)
  );

Then you can use these functions like this:

calcMean(gradesByClass('physics'))
calcMedian(gradesByClass('literature'))
calcStandardDeviation(gradesByCategory('humanities'))
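
The calls above answer one question at a time. To get the "grouped-by" shape from the question in a single pass, you can first bucket the grades by a key and then map any calculation over the buckets. Below is a minimal, dependency-free sketch of that idea. The GradeRecord type and the groupGrades and summarize helpers are hypothetical names introduced for illustration, and the data is the question's grades flattened to one record per class; in fp-ts, NonEmptyArray.groupBy and Record.map cover the same two steps.

```typescript
// One record per (student, class) pair, flattened from the question's data.
type GradeRecord = {
  readonly className: string;
  readonly category: string;
  readonly grade: number;
};

const records: readonly GradeRecord[] = [
  { className: 'history', category: 'humanities', grade: 89 },
  { className: 'math', category: 'quantitative', grade: 95 },
  { className: 'physics', category: 'quantitative', grade: 81 },
  { className: 'literature', category: 'humanities', grade: 77 },
  { className: 'history', category: 'humanities', grade: 95 },
  { className: 'math', category: 'quantitative', grade: 99 },
  { className: 'physics', category: 'quantitative', grade: 89 },
  { className: 'literature', category: 'humanities', grade: 65 },
  { className: 'history', category: 'humanities', grade: 80 },
  { className: 'math', category: 'quantitative', grade: 90 },
  { className: 'physics', category: 'quantitative', grade: 100 },
  { className: 'literature', category: 'humanities', grade: 88 },
];

// Hypothetical helper: bucket grades by an arbitrary string key.
const groupGrades = (
  keyOf: (r: GradeRecord) => string,
  rs: readonly GradeRecord[]
): Record<string, number[]> => {
  const groups: Record<string, number[]> = {};
  for (const r of rs) {
    const key = keyOf(r);
    const bucket = groups[key];
    if (bucket === undefined) groups[key] = [r.grade];
    else bucket.push(r.grade);
  }
  return groups;
};

// Hypothetical helper: apply one calculation to every bucket.
const summarize = <B>(
  calc: (grades: readonly number[]) => B,
  groups: Record<string, number[]>
): Record<string, B> => {
  const out: Record<string, B> = {};
  for (const key of Object.keys(groups)) {
    out[key] = calc(groups[key]!);
  }
  return out;
};

// Same arithmetic as calcMean, repeated so the sketch is self-contained.
const mean = (arr: readonly number[]): number =>
  arr.reduce((acc, v) => acc + v, 0) / arr.length;

const meanByClass = summarize(mean, groupGrades((r) => r.className, records));
const meanByCategory = summarize(mean, groupGrades((r) => r.category, records));

console.log(meanByClass['physics']); // 90
console.log(meanByClass['history']); // 88
```

Because summarize accepts any function from an array of grades to a result, calcMedian or calcStandardDeviation can be passed in place of mean without changing the grouping code.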
Lauren Yim