
I download the end-of-day stock prices for over 20,000 global securities across 20 different markets. I then run my 20,000 proprietary trading setups over these securities to find profitable trades. The process is simple, but it needs the power of cloud computing to automate because it's impossible to run on a desktop.

I'm coming at this as a complete beginner, so please excuse my lack of technical understanding.

  1. I download the prices from a single source onto my computer as Microsoft Excel files.
  2. Do I use Apache Arrow to convert the Excel files to Apache Parquet? I'm considering Parquet because it's a columnar storage format, which is well suited to historical stock price data.
  3. To run my 20,000 proprietary trading setups, I would use Apache Spark to read the Parquet files in my chosen cloud environment.
  4. This would produce the high-probability trade results every day, which would then be uploaded to my web-based platform.

This is a very simplified setup based on my current research. Thank you in advance for your assistance.

Kind regards, Levi

Levi

1 Answer


I'm sorry, but you don't have a big data setup.

What you are doing is using just one computer to convert Excel files into Parquet. If you are able to read the data and write it back to disk in a reasonable time, it seems you don't have "big data".
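As a rough check on that claim, a back-of-envelope estimate of the raw data size can help. The sketch below is only illustrative: the 20,000 securities come from the question, while the trading days per year, years of history, fields per row, and bytes per field are all assumptions, and the helper name `estimate_bytes` is hypothetical.

```python
# Back-of-envelope estimate of the uncompressed price history size.
# Only SECURITIES comes from the question; everything else is an assumption.
SECURITIES = 20_000            # from the question
TRADING_DAYS_PER_YEAR = 252    # assumed typical number of trading days
YEARS_OF_HISTORY = 10          # assumed depth of history
FIELDS_PER_ROW = 6             # assumed: open, high, low, close, adjusted close, volume
BYTES_PER_FIELD = 8            # assumed 64-bit numeric values


def estimate_bytes(securities, days_per_year, years, fields, bytes_per_field):
    """Return the uncompressed size in bytes of one row per security per day."""
    rows = securities * days_per_year * years
    return rows * fields * bytes_per_field


rows = SECURITIES * TRADING_DAYS_PER_YEAR * YEARS_OF_HISTORY
total = estimate_bytes(
    SECURITIES, TRADING_DAYS_PER_YEAR, YEARS_OF_HISTORY, FIELDS_PER_ROW, BYTES_PER_FIELD
)
print(f"{rows:,} rows, about {total / 1e9:.1f} GB uncompressed")
# prints: 50,400,000 rows, about 2.4 GB uncompressed
```

Under these assumptions the whole history is a few gigabytes before compression, which is within reach of a single machine. This says nothing about the compute cost of running the trading setups, which depends on details the question does not give.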

What you should do is:

  1. Get the data into your data lake using something like Apache NiFi.
  2. Use Spark to read the data from the data lake. For Excel files, see "How to construct Dataframe from a Excel (xls,xlsx) file in Scala Spark?"
Daniel Argüelles
  • That's ok. Thank you for your advice. – Levi Oct 13 '19 at 21:40
  • At what particular quantum of data is it considered big data? – Levi Oct 13 '19 at 21:40
  • There is no magic number. Big data tools let you process data in a distributed way, but they increase the complexity of the algorithms. Often, if you do things correctly, you will be able to work with just a single machine. – Daniel Argüelles Oct 14 '19 at 06:48