Anand Prakash


Big Data Engineer with 5+ years of experience developing software.

Experience:

Oct 2019 - Present Software Engineer 3

Walmart, Bangalore, India

  • Working on the ODS data platform team, which develops B2B enterprise applications such as Data Quality, the Lineage Tracking system, and the Metadata Harvester.
  • Building data pipelines with Databricks Spark and Azure Data Factory to load on-premises data into Azure and Google Cloud.

Jan 2019 - Oct 2019 Data Engineer

CoffeeBeans Consulting, Bangalore, India

ThoughtWorks

  • Worked for Falabella, a South American e-commerce company, building a set of cloud-native applications to replace a legacy monolith based on Oracle ATG.
  • Developed Kafka producer and consumer services for streaming data, which was consumed downstream by Google Dataflow jobs for transformation.
  • Implemented transformations of entities such as price, catalog, and availability using the Dataflow API.

Omio/GoEuro

  • Developed an application to calculate return on investment (ROI) for marketing campaigns/ads for GoEuro, a Germany-based travel booking portal.
  • Deployed periodic Spark jobs with the Apache Airflow scheduler, reading and writing data from sources such as AWS Redshift and Google BigQuery tables.

Wru.ai

  • Worked on wru, an article recommendation system currently used by Quint and Bloomberg.
  • Built Spark jobs that read streaming data from Kafka, perform complex transformations, and write the results to Redis and MongoDB.

Feb 2016 - Feb 2019 Senior Software Developer

Cerner Corporation, Bangalore, India

  • Performed ETL (Extract, Transform, Load) of data sources, especially in Avro format, writing out to multiple sinks such as HP Vertica and HDFS by implementing Apache Crunch transformations, which spawn MapReduce pipelines to process the data.

  • Configured and deployed Apache Oozie workflows using Chef Client to automate scheduled runs of Crunch/Hadoop jobs, and debugged cluster issues.

  • The PopHealth Health-e-Analytics application integrates both SAP BusinessObjects and Tableau, allowing users to perform analytics without leaving Cerner applications. Authenticated and securely exposed the SAP BO and Tableau APIs as REST APIs for consumption by Cerner applications.

Skills

Languages: Java, Scala, Python

Big Data: Hadoop (ETL, MapReduce, HDFS, Hive, YARN), Avro, Parquet, Apache Crunch, Apache Kafka, Apache Spark, Presto

Cloud: Google Cloud Platform (GCP): Pub/Sub, BigQuery, Cloud Storage, Dataflow, Dataproc; Microsoft Azure: Databricks, Cosmos DB, App Service, Data Factory, Event Hubs, Azure DevOps, ADLS

Web Services: Java REST (JAX-RS), Spring Boot

DevOps: Docker, Mesosphere DC/OS, GitHub, Bitbucket, GitLab, Jenkins, Splunk

Analytical Tools: Tableau, SAP BO

Databases: MySQL, Vertica, Redis, MongoDB, SQL Server Data Warehouse

Distribution: Cloudera