Hadoop/Big Data Developer Resume
CA
SUMMARY
- 7+ years of IT experience in developing and delivering software using a wide variety of technologies across all phases of the development life cycle. Expertise in Java technologies as a consultant, with proven ability in project-based leadership, teamwork, and communication.
- Strong object-oriented concepts with complete software development life cycle experience: requirements gathering, conceptual design, analysis, detailed design, development, mentoring, and system and user acceptance testing.
- Strong knowledge of Java, with experience ranging from the introduction of version 1.3 through the adoption of version 1.6, and use of Swing for the component architecture of product interfaces.
- Hands-on development and implementation experience in a Big Data Management Platform (BMP) using Hadoop, MapReduce, Hive, and other Hadoop-related ecosystem components as data storage and retrieval systems.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH4 and CDH5 clusters.
- Highly knowledgeable in the WritableComparable and Writable interfaces, the Mapper and Reducer abstract classes, and Hadoop data objects such as IntWritable, ByteWritable, and Text.
- Experience in building infrastructure utilizing DHCP, PXE, DNS, Kickstart, and NFS.
- Experience in designing and building a complete Hadoop ecosystem comprising Pig, Hive, Sqoop, Oozie, Flume, and ZooKeeper.
- Experience in upgrading existing Hadoop infrastructure to the latest releases.
- Experience in understanding Big Data business requirements and providing Hadoop-based solutions for them.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Experience in analyzing existing Hadoop clusters, identifying performance bottlenecks, and providing tuning solutions accordingly.
- Expertise in designing and implementing disaster recovery plans for Hadoop clusters.
- Experience in transferring data from structured data stores to HDFS using Sqoop.
- Experience in writing MapReduce programs for data processing and analysis (a minimal sketch follows this summary).
- Experience in analyzing data with Hive and Pig.
- Good experience in writing Pig and Hive UDFs that serve the purpose of utility classes.
- Experience in using Oozie for managing Hadoop jobs.
- Experience in loading logs from multiple sources directly into HDFS using Flume.
- Experience in setting up cluster monitoring tools like Nagios and Ganglia for Hadoop.
- Working experience on Amazon Web Services (EMR, S3, EC2).
- Hands on experience in Agile and Scrum methodologies.
- Extensive development experience in IDEs such as Eclipse and NetBeans.
- Expertise in relational databases like Oracle and MySQL.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Worked on multiple stages of the Software Development Life Cycle, including development, component integration, performance testing, deployment, and support/maintenance.
- Extensive experience working with customers to gather the information needed to analyze, debug, and provide data or code fixes for technical problems; building service patches for each version release; performing unit, integration, user acceptance, and system testing; and providing technical solution documents for users.
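The MapReduce bullets above refer to Hadoop's Writable types and the Mapper and Reducer abstract classes. Below is a minimal sketch of that pattern in the classic token-count shape; the class and variable names are illustrative only and not drawn from any project described here.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class TokenCount {

    // Emits (token, 1) for each whitespace-separated token in the input split.
    public static class TokenCountMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text token = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String t : value.toString().split("\\s+")) {
                if (!t.isEmpty()) {
                    token.set(t);
                    context.write(token, ONE);
                }
            }
        }
    }

    // Sums the per-token counts produced by the mappers.
    public static class TokenCountReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
```

A driver class would wire these into a Job, set the input/output formats and paths, and submit it to the cluster.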
TECHNICAL SKILLS
Languages: C, C++, Java, SQL, and PL/SQL
Big Data Framework and Eco Systems: Hadoop, MapReduce, Hive, Pig, HDFS, ZooKeeper, Sqoop, Apache Crunch, Oozie, and Flume
NoSQL: Cassandra, HBase, and Membase
Web Technologies: JavaScript, CSS, HTML, XHTML, AJAX, XML, XSLT
Databases: Oracle 8i/9i/10g/11g, MySQL, PostgreSQL, and MS-Access
Operating Systems: Windows XP/2000/NT, Linux, UNIX
Tools: Ant, Maven, TOAD, ArgoUML, WinSCP, PuTTY, Lucene
IDE Tools: Eclipse 4.x, Eclipse RCP, NetBeans 6, EditPlus
Version Control Tools: CVS, SVN
PROFESSIONAL EXPERIENCE
Confidential, CA
Hadoop/Big Data Developer
Responsibilities:
- Ingested data from different sources into HDFS using Sqoop for analysis.
- Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources, making it suitable for ingestion into the Hive schema for analysis.
- Developed transformation logic using Hive for data sources.
- Performed Hive transformations to structure the data and identify the IOS software versions and features used.
- Implemented partitioning, dynamic partitions, and buckets in Hive to analyze and process the data (a sketch follows this project's environment line).
- Designed Pentaho Data Integration (PDI) jobs to integrate Sqoop, Hive, and Hadoop file system operations.
- Streamlined Hadoop jobs and workflow operations using PDI jobs.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation by our BI team.
- Migrated code across environments using Bash scripts.
- Installed and configured various components of Hadoop ecosystem and maintained their integrity.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Supported MapReduce jobs running on the cluster.
- Managed and reviewed Hadoop Log files.
- Worked on Hadoop cluster maintenance activities such as metadata backups, file system checks.
- Involved in the product life cycle, which was developed using the Scrum methodology.
- Mentored the team in technical discussions and technical reviews.
- Involved in code reviews and verifying bug analysis reports.
Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Sqoop, Oozie, Netezza, Teradata, DB2, Pentaho.
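As referenced in the partitioning bullet above, here is a minimal sketch of creating and loading a partitioned, bucketed Hive table from Java over HiveServer2 JDBC. The table, columns, host, and bucket count are hypothetical, not taken from the actual engagement.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitionSketch {
    public static void main(String[] args) throws Exception {
        // HiveServer2 endpoint; host, port, and credentials are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
             Statement stmt = conn.createStatement()) {

            // Partition by date and bucket by device_id for sampling and joins.
            stmt.execute("CREATE TABLE IF NOT EXISTS device_events ("
                    + " device_id STRING, sw_version STRING, feature STRING)"
                    + " PARTITIONED BY (event_date STRING)"
                    + " CLUSTERED BY (device_id) INTO 32 BUCKETS"
                    + " STORED AS ORC");

            // Let Hive derive the event_date partition per row.
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");

            stmt.execute("INSERT INTO TABLE device_events PARTITION (event_date)"
                    + " SELECT device_id, sw_version, feature, event_date"
                    + " FROM raw_events");
        }
    }
}
```

With dynamic partitioning enabled, a single INSERT populates every date partition found in the source rows instead of requiring one statement per partition.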
Confidential, MD
Hadoop/Big Data Developer
Responsibilities:
- Worked with the Hadoop FileSystem Java API to compute disk usage statistics (a sketch follows this project's environment line).
- Developed Hive queries for transformations, aggregations, and mappings on the customer data.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Worked on analyzing/transforming the data with Hive and Pig.
- Developed MapReduce programs to apply business rules to the data.
- Developed and executed Hive Queries for de-normalizing the data.
- Automated workflow using Shell Scripts.
- Performed performance tuning on Hive queries.
- Involved in migrating data from one Hadoop cluster to another.
- Configured multiple MapReduce pipelines for the new Hadoop cluster.
- Worked on configuring the new Hadoop cluster.
- Configured Hive metastore, which stores the metadata for Hive tables and partitions in a relational database.
- Worked on Oozie workflow engine for job scheduling.
- Developed custom implementations of Partitioner, input/output formats, and record readers and writers.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
- Loaded the aggregated data onto DB2 for reporting on the dashboard.
- Monitored and debugged Hadoop jobs/applications running in production.
- Configured Flume for efficiently collecting, aggregating, and moving large amounts of log data.
- Imported/exported data from RDBMS to HDFS using Sqoop.
- Developed Pig UDFs to pre-process the data for analysis.
- Developed a workflow in Oozie to automate the tasks of loading data into HDFS and pre-processing it.
Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Teradata, Oracle.
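A minimal sketch of the disk-usage computation mentioned in the first bullet of this project, using the Hadoop FileSystem API; the default path is a placeholder, and the cluster location is assumed to come from the standard configuration files.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DiskUsageSketch {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path dir = new Path(args.length > 0 ? args[0] : "/user/data");
        ContentSummary summary = fs.getContentSummary(dir);

        // Logical size, raw size including replication, and file count.
        System.out.printf("%s: %d bytes logical, %d bytes with replication, %d files%n",
                dir, summary.getLength(), summary.getSpaceConsumed(),
                summary.getFileCount());
    }
}
```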
Confidential, CA
Hadoop Developer
Responsibilities:
- Developed a MapReduce program for parsing streaming data about messaging objects and loading it into HDFS.
- Developed Hive queries to pre-process the data for analysis by imposing a read-only structure on the stream data.
- Set up data pipelines from scratch, including design and implementation.
- Developed workflows using Oozie for running MapReduce jobs and Hive queries.
- Used Sqoop for exporting data into MySQL.
- Configured job flows and job management using the Fair Scheduler.
- Developed cluster coordination services using ZooKeeper (a client sketch follows this project's environment line).
- Developed workflow to push data directly into HDFS using Flume.
- Worked with Agile Methodologies and actively participated in Scrum.
- Tuned infrastructure and Hadoop settings for optimal performance of jobs and their throughput.
- Designed and developed Ant scripts to build the application.
- Managed version control for the deliverables by streamlining and rebasing the SVN development streams.
Environment: Java 1.6, Hadoop, MapReduce, Hive, MySQL, Eclipse, JUnit, Log4j, Windows and Linux, Scrum.
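A minimal sketch of a coordination primitive built with the ZooKeeper Java client, as referenced in the cluster-coordination bullet: each worker registers an ephemeral node so peers can observe live membership. The connection string and znode paths are placeholders.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class WorkerRegistrySketch {
    public static void main(String[] args) throws Exception {
        // Block until the session is actually connected before issuing requests.
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 3000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        // The parent node is persistent; create it once if it is missing.
        if (zk.exists("/workers", false) == null) {
            zk.create("/workers", new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }

        // An ephemeral sequential node vanishes automatically if this worker
        // dies, so the children of /workers always reflect live membership.
        String me = zk.create("/workers/worker-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        List<String> live = zk.getChildren("/workers", false);
        System.out.println("Registered as " + me + "; live workers: " + live);
        zk.close();
    }
}
```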
Confidential
Java /J2EE Developer
Responsibilities:
- Developed the user interface screens using Swing for accepting various system inputs such as contractual terms, monthly data pertaining to production, inventory and transportation.
- Involved in designing Database Connections using JDBC.
- Involved in the design and development of the UI using HTML, JavaScript, and CSS.
- Created tables and stored procedures in SQL for data manipulation and retrieval using SQL Server 2000, and performed database modifications using SQL, PL/SQL, stored procedures, triggers, and views in Oracle.
- Developed the business components (in core Java) used for the calculation module (calculating various entitlement attributes).
- Involved in the logical and physical database design and implemented it by creating suitable tables, views and triggers.
- Created the related procedures and functions used by JDBC calls in the above components (a JDBC sketch follows this project's environment line).
- Involved in fixing bugs and minor enhancements for the front-end modules.
Environment: Eclipse, IBM WebSphere, Java/JDK, JSP, Servlets, JDBC, PL/SQL, XML, XSLT, Struts Framework, Rational Suite, Oracle, HTML/DHTML
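A minimal sketch of the JDBC-to-stored-procedure pattern referenced above; the connection URL, credentials, and procedure name/signature are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class EntitlementCallSketch {
    public static void main(String[] args) throws Exception {
        // Oracle thin-driver URL; host, SID, and credentials are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "secret");
             CallableStatement call =
                     conn.prepareCall("{call calc_entitlement(?, ?, ?)}")) {

            call.setLong(1, 42L);          // contract id (illustrative)
            call.setString(2, "2024-01");  // billing period (illustrative)
            call.registerOutParameter(3, Types.NUMERIC); // computed entitlement

            call.execute();
            System.out.println("Entitlement: " + call.getBigDecimal(3));
        }
    }
}
```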