Hadoop Developer Resume
Chicago, Illinois
PROFESSIONAL SUMMARY:
- 2+ years of experience in the IT industry, including Big Data (Hadoop) and Data Warehousing technologies.
- Hands-on experience in developing, debugging, and performance-tuning Hive scripts and Spark Scala/Python jobs.
- Strong understanding and practical experience in developing Spark applications with Scala.
- Excellent knowledge of Hadoop architecture and its components, such as HDFS, YARN, JobTracker, TaskTracker, NameNode, and DataNode.
- Developed Hive UDFs to provide required functionality.
- Developed Scala scripts in Spark, using both DataFrames/SQL and RDDs, for data aggregation.
- Developed Oozie workflows to automate loading data into HDFS for data pre-processing.
- Hands-on experience with window functions such as RANK, DENSE_RANK, NTILE, ROW_NUMBER, LAG, and LEAD.
- Worked closely with customers to provide solutions to a variety of problems.
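
The window functions listed above can be sketched in HiveQL roughly as follows; the `sales` table and its columns are hypothetical, for illustration only:

```sql
-- Hypothetical table: sales (region STRING, order_id BIGINT,
--                            order_date STRING, amount DOUBLE)
SELECT
  region,
  order_id,
  amount,
  RANK()       OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank,
  DENSE_RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS amount_dense_rank,
  ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS row_num,
  NTILE(4)     OVER (PARTITION BY region ORDER BY amount)      AS quartile,
  LAG(amount)  OVER (PARTITION BY region ORDER BY order_date)  AS prev_amount,
  LEAD(amount) OVER (PARTITION BY region ORDER BY order_date)  AS next_amount
FROM sales;
```

RANK leaves gaps in the sequence after ties while DENSE_RANK does not; ROW_NUMBER is always unique within its partition.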
TECHNICAL SKILLS:
Skills: Quick learner and self-starter, able to ramp up quickly on new technologies.
Languages: Scala, Python, SQL, HiveQL
Hadoop: HDFS, Spark, Spark SQL, PySpark, Oozie, Hive, HiveQL, Sqoop, Hortonworks, Ambari, YARN, cron jobs, Hive performance tuning, spark-submit, window functions, Apache NiFi, Kafka
AWS: S3, EC2, Auto Scaling Groups, IAM, EMR
Operating Systems: Linux
Web Technologies: HTML, DHTML, XML, JavaScript, JSON
Databases: Oracle, MySQL, HBase
IDE: Eclipse, IntelliJ
Version Control: Git
WORK HISTORY:
Hadoop Developer
Confidential, Chicago, Illinois
Responsibilities:
- Wrote Unix shell scripts to stage data into HDFS by continuously watching a directory for a specified duration.
- Used Hive with Apache Spark: ingested data into HDFS files and created Hive tables for easy access during data analysis. Retrieved Hive data, with its schema, into a DataFrame through Spark for further processing.
- Used partitioning and bucketing for better performance.
- Loaded data into Hive partitioned and bucketed tables.
- Wrote HiveQL scripts.
- Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames and Datasets.
- Worked with the Spark ecosystem, using Spark SQL with Scala and Python queries on different formats such as text, CSV, and ORC files.
- Loaded data to and from HDFS using Sqoop.
- Worked with several statistical packages.
- As a Programmer Analyst, managed resources and managed data for statistical analysis.
- Worked with the Business Analysts in analyzing business requirements.
- Worked on a Spark job in Scala to format data as required by the BI reporting tool used by Mede Analytics.
- Accessed data from HDFS and processed it using the Spark framework to make the process efficient.
- Validated, formatted, and applied business logic to the data using Spark RDDs and UDFs. Updated the data in HDFS using the Sqoop import and export command-line utility.
- Installed LANs and WANs and established Intranet and Internet access.
- Assisted in examining network server equipment and maintaining networking systems.
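
The partitioned and bucketed Hive loading described above can be sketched roughly as follows; table and column names are hypothetical, for illustration only:

```sql
-- Hypothetical ORC table, partitioned by date and bucketed by order id
CREATE TABLE IF NOT EXISTS sales_part (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (order_date STRING)
CLUSTERED BY (order_id) INTO 8 BUCKETS
STORED AS ORC;

-- Enable dynamic partitioning, then load from a (hypothetical) staging
-- table; the partition column must come last in the SELECT list
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE sales_part PARTITION (order_date)
SELECT order_id, amount, order_date
FROM sales_staging;
```

Queries filtering on `order_date` then read only the matching partition directories (partition pruning), and bucketing by `order_id` can speed up joins and sampling on that key.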