4

I was recently asked this in an interview (Software Engineer) and didn't really know how to go about answering the question.

The question was focused on both the algorithm of the spreadsheet and how it would interact with the browser. I was a bit confused on what data structure would be optimal to handle the cells and their values. I guess any form of hash table would work with cells being the unique key and the value being the object in the cell? And then when something gets updated, you'd just update that entry in your table. The interviewer hinted at a graph but I was unsure of how a graph would be useful for a spreadsheet.

Other things I considered were:

  • Spreadsheet in a browser = auto-save. At any update, send all the data back to the server
  • Cells that are related to each other, i.e. C1 = C2+C3, C5 = C1-C4. If the value of C2 changes, both C1 and C5 change.
  • Usage of design patterns? Does one stand out over another for this particular situation?

Any tips on how to tackle this problem? Aside from the algorithm of the spreadsheet itself, what else could the interviewer have wanted? Does the fact that its in a browser as compared to a separate application add any difficulties?

Thanks!

Rubén
  • 34,714
  • 9
  • 70
  • 166
user1530318
  • 25,507
  • 15
  • 37
  • 48
  • I wonder what kind of job you were interviewing for. Assuming some sort of engineering support for web design, I would expect your answer to show knowledge of HTML, javascript, and the document object model. The term `graph` is a little ambiguous. There are visual graphs, the kind with x and y axes and a plot of data. But there are also logical graphs, made up of vertices and edges. I have a feeling that the interviewer was hinting at the latter, to keep track of cells (vertices), and cell interactions (edges). – user3386109 May 31 '14 at 01:17

3 Answers3

5

For an interview this is a good question. If this was asked as an actual task in your job, then there would be a simple answer of use a third party component, there are a few good commercial ones.

While we can't say for sure what your interviewer wanted, for me this is a good question precisely because it is so open ended and has so many correct possible answers.

You can talk about the UI and how to implement the kind of dynamic grid you need for a spreadsheet and all the functionality of the cells and rows and columns and selection of cells and ranges and editing of values and formulas. You probably could talk for a while on the UI implications alone.

Alternatively you can go the data route, talk about data structures to hold a spreadsheet, talk exactly about links between cells for formulas, talk about how to detect and deal with circular references, talk about how in a browser you have less control over memory and for very large spreadsheets you could run into problems earlier. You can talk about what is available in JavaScript vs a native language and how this impacts the data structures and calculations. Also along with data, a big important issue with spreadsheets is numerical accuracy and floating point number calculations. Floating point numbers are made to be fast but are not necessarily accurate in extreme levels of precision and this leads to a lot of confusing questions. I believe very recently Excel switched to their own representation of a fixed decimal number as it's now viable to due spreadsheet level calculations without using the built-in floating point calculations. You can also talk about data structures and calculation and how they affect performance. In a browser you don't have threads (yet) so you can't run all the calculations in the background. If you have 100,000 rows with complex calculations and change one value that cascades across everything, you can get a warning about a slow script. You need to break up the calculation.

Finally you can run form the user experience angle. How is the experience in a browser different from a native application? What are the advantages and what cool things can you do in a browser that may be difficult in a desktop application? What things are far more complicated or even totally impossible (example, associate your spreadsheet app with a file type so a user can double-click a file and open it in your online spreadsheet app, although I may be wrong about that still being unsupported).

Good question, lots of right answers, very open ended.

On the other hand, you could also have had a bad interviewer that is specifically looking for the answer they want and in that case you're pretty much out of luck unless you're telepathic.

Samuel Neff
  • 73,278
  • 17
  • 138
  • 182
1

You can say hopelessly too much about this. I'd probably start with:

  • If most of the cells are filled, use a simply 2D array to store it.

  • Otherwise use a hash table of location to cell

  • Or perhaps something like a kd-tree, which should allow for more efficient "get everything in the displayed area" queries.

By graph, your interviewer probably meant have each cell be a vertex and each reference to another cell be a directed edge. This would allow you to do checks for circular references fairly easily, and allow for efficiently updating of all cells that need to change.

"In a browser" (presumably meaning "over a network" - actually "in a browser" doesn't mean all that much by itself - one can write a program that runs in a browser but only runs locally) is significant - you probably need to consider:

  • What are you storing locally (everything or just the subset of cells that are current visible)

  • How are you sending updates to the server (are you sending every change or keeping a collection of changed cells and only sending updates on save, or are you not storing changes separately and just sending the whole grid across during save)

  • Auto-save should probably be considered as well

  • Will you have an "undo", will this only be local, if not, how will you handle this on the server and how will you send through the updates

  • Is only this one user allowed to work with it at a time (or do you have to cater for multi-user, which brings dealing with conflicts, among other things, to the table)

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
  • It's been a long time, any chance you can comment on the following query? I can make a graph of all the cells in a sheet and able to find the circular dependency. How do I tackle when a cell is referenced from another sheet? – SachinGutte Jan 09 '19 at 10:17
  • @SachinGutte The easiest way to check for dependencies in multiple sheets would just be to have a single graph for all sheets, instead of 1 graph per sheet. Then you can just look for cycles there. – Bernhard Barker Jan 09 '19 at 16:15
-1
  1. Looking at the CSS cursor property just begs for one to create a spreadsheet web application.
  2. HTML table or CSS grid? HTML tables are purpose built for tabular data.
  3. Resizing cell height and width is achievable with offsetX and offsetY.
  4. Storing the data is trivial. It can be Mongo, mySQL, Firebase, ...whatever. On blur, send update.
  5. Javascrip/ECMA is more than capable of delivering all the Excel built-in functions. Did I mention web workers?
  6. Need to increment letters as in column ID's? I got you covered.
  7. Most importantly, don't do it. Why? Because it's already been done. Find a need and work that project.
Community
  • 1
  • 1
Ronnie Royston
  • 16,778
  • 6
  • 77
  • 91