I have 8+ years of experience in Big Data development, coding, optimization, implementation, and deployment. I have worked on various projects, including "Building a serverless data pipeline to create a data lake", "Building a scalable and efficient ETL data pipeline", "Building a real-time pipeline to capture call records", "Migrating an existing on-premises ETL workflow from Informatica PowerCenter to Big Data", and "Creating a central data repository from various data sources".
My experience includes:
- Proficient working knowledge of AWS cloud services.
- Experience in writing AWS Glue ETL scripts using PySpark (see the Glue job sketch after this list).
- Experience in writing AWS Lambda functions (see the Lambda handler sketch below).
- Experience in creating and optimizing tables in Athena (see the Athena table sketch below).
- Exposure to AWS services such as S3, Redshift, and EMR.
- Experience working with Databricks.
- Hands-on experience with Hadoop ecosystem tools such as Hive, Spark, Sqoop, and Kafka for scalable, distributed, high-performance computing.
- Experience in processing large structured and semi-structured datasets.
- Experience in integrating various data sources such as RDBMSs, spreadsheets, and text files.
- Experience in processing real-time streaming data (see the Spark Structured Streaming sketch below).
- Experience with SQL and NoSQL databases such as MySQL, Oracle, and MongoDB.
- Experience with IDEs such as VS Code and PyCharm.
- Experience working in an Agile Scrum model.
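
To illustrate the Glue work above, here is a minimal sketch of a catalog-to-S3 ETL job. The `sales_db` database, `raw_orders` table, and `example-datalake` bucket are hypothetical placeholders; this is an outline of the pattern, not any specific production job.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap: resolve job arguments and build contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the source table registered in the Glue Data Catalog
# (database and table names are hypothetical placeholders).
source = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db",
    table_name="raw_orders",
)

# Rename and retype columns before landing the data in the lake.
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("order_id", "string", "order_id", "string"),
        ("amount", "string", "amount", "double"),
    ],
)

# Write Parquet to S3 so downstream Athena queries stay cheap to scan.
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://example-datalake/orders/"},
    format="parquet",
)
job.commit()
```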
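
Similarly, a minimal sketch of an S3-triggered Lambda handler, assuming a landing bucket wired to the function via an S3 event notification; the logging-only body is a placeholder for real downstream processing.

```python
import json
import urllib.parse

import boto3

s3 = boto3.client("s3")


def lambda_handler(event, context):
    # Each record describes one object that landed in the bucket.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Fetch object metadata without downloading the payload.
        head = s3.head_object(Bucket=bucket, Key=key)
        print(json.dumps({"bucket": bucket, "key": key,
                          "size": head["ContentLength"]}))
    return {"statusCode": 200}
```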
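
For the Athena work, a minimal sketch of creating a partitioned Parquet table through boto3; the database, table, and S3 locations are hypothetical. Partitioning plus Parquet is the optimization named in the bullet above: Athena scans (and bills for) far less data per query.

```python
import boto3

athena = boto3.client("athena")

# Partitioned Parquet keeps Athena scans small; locations are hypothetical.
DDL = """
CREATE EXTERNAL TABLE IF NOT EXISTS orders (
    order_id string,
    amount   double
)
PARTITIONED BY (order_date string)
STORED AS PARQUET
LOCATION 's3://example-datalake/orders/'
"""

athena.start_query_execution(
    QueryString=DDL,
    QueryExecutionContext={"Database": "sales_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```

Note that new partitions still have to be registered (for example with `MSCK REPAIR TABLE` or `ALTER TABLE ... ADD PARTITION`) before Athena can see them.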
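
Finally, a minimal sketch of the real-time ingestion pattern behind the call-record pipeline, using Spark Structured Streaming; the broker address, topic, and S3 paths are hypothetical, and the job assumes the spark-sql-kafka connector package is available.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("call-record-stream").getOrCreate()

# Read call records from Kafka as an unbounded streaming DataFrame.
calls = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "call-records")               # hypothetical topic
    .load()
    .select(col("value").cast("string").alias("record"))
)

# Land the raw records as Parquet; the checkpoint makes the query
# restartable with exactly-once file output.
query = (
    calls.writeStream
    .format("parquet")
    .option("path", "s3://example-datalake/call-records/")
    .option("checkpointLocation", "s3://example-datalake/checkpoints/calls/")
    .start()
)
query.awaitTermination()
```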