Sr. Hadoop/Big Data Analyst/Developer Resume
Minnetonka, MN
SUMMARY
- 11+ years of IT consulting experience across the full software development life cycle (SDLC), including requirements gathering, analysis, design, testing, interface development and implementation of Client/Server, Distributed, Internet and E-Commerce applications using Object Oriented Methodologies and Structured Development Methodologies.
- Complete multi-tiered application development lifecycle experience using Java/J2EE Technologies.
- Three years of Big Data ecosystem experience, including Big Data processing, design, development, analysis and administration in the Telecom, Insurance and Financial domains.
TECHNICAL SKILLS
- Java
- Struts
- JSP
- JSTL
- JSON
- JavaScript
- JSF
- POJOs
- Hibernate
- Hadoop
- Spring
- Teradata
- PL/SQL
- CSS
- Log4j
- JUnit
- Subversion
- Informatica
- Eclipse
- Netezza
- Jenkins
- Git
- Oracle 11g
- LoadRunner
- ANT
PROFESSIONAL EXPERIENCE
Confidential, Minnetonka MN
Sr. Hadoop/Big Data Analyst/Developer
Responsibilities:
- Analyzed and researched data from different source systems.
- Experience in Enterprise Data Warehousing and design.
- Experience in data architecting and cloud designing.
- In-depth understanding of Data Structures and Algorithms.
- Experience in writing Shell Scripts (bash, SSH, Perl).
- Designed and developed Talend incremental and delta loads into Hive and HBase tables.
- Ran a POC on Impala, MapR Drill and Jethro alongside BI tools to evaluate performance for our use cases.
- Supported various downstream teams with data mapping and data analysis.
- Supported incremental loads and history loads/refreshes of data to the data lake.
- Extracted, parsed and processed raw JSON/XML files using Pig and Talend jobs.
- Performed unit testing, data validation, reconciliation and data manipulation.
- Designed and developed a streaming framework for logs and streaming data using Spark Streaming.
- Worked in Spark to read the data from Hive and write it to Cassandra using Java.
- Involved in designing Enterprise Data Warehouse Hive metastores.
- Created different flavors of Hive tables, snapshot tables, historical tables and incremental tables.
- Designed and implemented MapReduce jobs, Hive/HBase table schemas and queries.
- Expertise in writing Shell scripts to monitor Hadoop jobs.
- Worked on a Tableau/Qlikview data visualization POC over the reporting tables in Hive, using SQL engines such as Jethro, Drill and Spark.
- Sqoop and Hive query optimization and enhancement.
- Experience with cloud platform (AWS) security management.
- Coordinated with Hadoop admins to increase cluster size, and configured a single-node pseudo-distributed cluster myself.
- Exposure to Spark batch and real-time processing.
- Designed, Monitored and managed scalable and fault tolerant deployments and supported them in real-world scenarios.
- Worked on ETL packages and DQM (Data Quality Management) framework.
- Monitor System health and logs and respond accordingly to any warning or failure conditions.
- Created the workflows on Oozie to coordinate the Hadoop Jobs.
- Provided production support for numerous Hadoop jobs in a 24x7 environment.
Environment: MapR, Talend, Hive, HBase, Flume, Java, Scala, Spark, Pig, Oozie, Oracle, Impala, MapR Drill, Sqoop, AWS, Kafka, Qlikview, Python, YARN, SQL, Platfora, Unix, MongoDB, Tableau
Confidential, Kansas City, MO
Hadoop/Big Data Developer
Responsibilities:
- Set up and configured Hadoop daemons, clusters and ecosystem components.
- Designed, Monitored and managed scalable and fault tolerant deployments and supported them in real-world scenarios.
- Worked with different data formats such as Avro, JSON, Parquet and ORC.
- Built Data pipeline processing to support Data Warehousing structure.
- Extracted, parsed and processed raw JSON files using Pig.
- Involved in writing pipelines, MapReduce jobs and different aggregation functions in Java.
- Performed unit testing with JUnit.
- Developed/Maintained ETL process to move data between Oracle and Cloudera HDFS/Hive
- Used Hive or R to manipulate data in Cloudera big data platform.
- Worked closely with HDFS and MapReduce during data pipeline processing.
- Experience with Splunk for creating searches and indexing data.
- Worked with open source frameworks like Puppet/Chef for deployment and configuration.
- Experience with data ingestion and forwarding data to Splunk using a Flume forwarder.
- Involved in building and managing NoSQL databases such as HBase or Cassandra.
- Worked in Web Services such as REST and SOAP.
- Worked in Spark to read the data from Hive and write it to Cassandra using Java.
- Involved in data integration, migration on ETL Informatica environment.
- Involved in designing an Enterprise Data Warehouse in Hive.
- Experienced with related/complementary Big Data open source software platforms and languages.
- Implemented Spark in the DEV environment for processing streaming data and RDBMS data.
- Designed and implemented MapReduce jobs, Hive/HBase table schemas and queries.
- Expertise in writing Shell scripts to monitor Hadoop jobs.
- Worked on Tableau for data visualization of the reporting tables in Hive.
- Experience in performance tuning of Hive queries and MapReduce jobs.
- Significant working experience with UNIX and RHEL (Linux) commands and architecture.
- Created and maintained Solr indexes and searches.
- Designed and implemented Logical Data Models and data service layer in Hadoop.
- Good understanding of OLAP/OLTP systems and ETL architecture.
- Hardened Hadoop clusters for deploying into production and staging environments.
- Linux system administration, including storage, filesystems, disks, mounts and NFS.
Environment: HDFS, Hive, Hbase, Flume, Java, Scala, Pig, Oozie, Oracle, Tez, Storm, Sqoop, AWS, Cassandra, Splunk, Kafka, YARN, Hortonworks, SQL, Platfora, Unix, Spark, MongoDB, Tableau
Confidential, Newark NJ
Hadoop Developer
Responsibilities:
- Worked on the proof-of-concept for Apache Hadoop 1.20.2 framework initiation.
- Installed and configured Hadoop clusters and eco-system.
- Developed automated scripts to install Hadoop clusters.
- Monitored Hadoop cluster job performance and capacity planning.
- Hands-on experience with Hadoop technology stack (HDFS, MapReduce, Hive, Hbase, Flume)
- Involved in designing and developing data-centric solutions for clients.
- Experience with high-scale or distributed RDBMS.
- Created and implemented a highly scalable and reliable distributed data design using NoSQL/Cassandra technology.
- Experience with the Hadoop framework, HDFS and MapReduce processing implementation.
- Good understanding of Big Data products in the market.
- Tuned Hadoop performance with high availability and took part in the recovery of Hadoop clusters.
- Learned how to add and remove nodes from the Cassandra cluster.
- Experienced in managing NoSQL databases on large Hadoop distributions such as Hortonworks HDP, MapR M series and Cloudera.
- Provided UNIX support and administration.
- Automated all jobs, from pulling data from sources such as MySQL to pushing the result sets to the Hadoop Distributed File System using Sqoop.
- Leveraged ETL software to analyze, assemble and transform client data files into a format consumable by the Hadoop processing system.
- Experience developing Hadoop integrations for data ingestion, data mapping and data process capabilities.
- Deep JVM knowledge and heavy experience with functional programming languages such as Scala.
- Worked with the production environment on AWS, applying high-availability practices and deploying backup/restore infrastructure.
- Refactored Cassandra-access code to allow either Hector or Thrift access, replacing the original Thrift code interspersed throughout the application.
- Designed Hadoop jobs to verify chain-of-custody and look for fraud indications.
- Worked in the ETL environment to push complex data into Hadoop for analysis.
- Application performance optimization for a Cassandra cluster.
- Knowledge of real-time message processing systems (Storm).
Environment: Hadoop, HDFS, MapReduce, Unix, REST, Python, Pig, Hive, HBase, Storm, NoSQL, Flume, Zookeeper, Kibana, Cloudera, SAS, Vertica, Kafka, Cassandra, Informatica, Teradata, Spark
Confidential, Tampa, FL
Hadoop/Java Developer with ETL
Responsibilities:
- Worked as ETL Architect to make sure all the applications are migrated (along with server) smoothly.
- Migrated data to HDFS from a traditional DBMS.
- Deep understanding and related experience with Hadoop stack - internals, Hbase, Hive, Pig and Map/Reduce
- A deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment
- Managed mission-critical Hadoop clusters at scale, especially on Hortonworks.
- Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms
- Involved in developing and debugging Java/J2EE
- Wrote Hive queries and UDFs.
- Worked closely with the Enterprise Data Warehouse.
- Experience with the AWS cloud computing platform, its many services and dimensions of scalability.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Upgraded the Hadoop cluster to CDH4 and set up a high-availability cluster. Integrated Hive with external applications using a JDBC/ODBC bridge.
- Familiar with many use cases of Storm, such as real-time analytics, online machine learning, continuous learning and ETL.
- Provided UNIX support and administration.
- Converted unstructured data to structured data using Pig scripting for testing and validation.
- Experienced with MapReduce or stream processing using Storm.
- Automated all jobs, from pulling data from sources such as MySQL to pushing the result sets to the Hadoop Distributed File System using Sqoop.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Experience in designing, implementing and maintaining high-performing Hadoop clusters and integrating them with existing infrastructure.
- Performed complex Linux administrative activities, and created, maintained and updated Linux shell scripts.
- Designed and supported highly available and scalable Linux infrastructure in a 24x7 environment.
- Specified the cluster size, resource pool allocation and Hadoop distribution by writing specification files in JSON format.
- Strong experience on Apache server configuration.
- Exported the result set from Hive to MySQL using shell scripts.
- Developed Hive queries for the analysts.
- Good knowledge of technologies such as Sqoop, Flume and Kafka.
- Built a data fabric with Flume, Kafka and Sqoop.
- Maintained system integrity of all sub-components (primarily HDFS, MR, HBase and Flume).
Environment: Hadoop, HDFS, MapReduce, Storm, Hive, Pig, Sqoop, Oracle, SQL, MySQL, UNIX Shell Scripting, PL/SQL, Lucene, Vertica, Teradata, Linux, IBM BigInsights, MongoDB, Java, Servlets, C++
Confidential, Jacksonville, FL
Java/J2EE Developer
Responsibilities:
- Responsible for gathering all required information and requirements for the project.
- Experience in Agile Programming and accomplishing tasks to meet deadlines.
- Used Ajax and JavaScript to handle asynchronous requests, and CSS to handle the look and feel of the application.
- Involved in the design of Class Diagrams, Sequence Diagrams and Event Diagrams as part of documentation.
- Developed the presentation layer using CSS and HTML taken from Bootstrap to support multiple browsers, including mobiles and tablets.
- Extended standard action classes provided by the Struts framework for appropriately handling client requests.
- Monitored and scheduled the UNIX scripting jobs.
- Designed, developed and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis.
- Wrote the MapReduce jobs using Java.
- Experienced working on ETL/Data Warehousing environment (DataStage or Informatica)
- Configured Struts Tiles for reusing view components as an application of the J2EE composite pattern.
- Involved in the integration of Struts and Spring 2.0 for implementing Dependency Injection (DI/IoC). Developed code for obtaining bean references in the Spring IoC framework.
- Designed DTO, Business Delegate, Factory and Singleton design patterns.
- Developed the application on Eclipse.
- Involved in the implementation of beans in Application.
- Migrated ETL Informatica code by using team based versioning.
- Hands on experience in web services, distributed computing, multi-threading, JMS etc.
- Implemented cross-cutting concerns as aspects at the service layer using Spring AOP.
- Involved in the implementation of DAO objects using Spring ORM.
- Involved in creating the Hibernate POJOs and developed Hibernate mapping files.
- Used Hibernate, an object/relational mapping (ORM) solution, to map the data representation from the MVC model to the Oracle relational data model with a SQL-based schema.
- Developed SQL queries and stored procedures using PL/SQL to retrieve from and insert into multiple database schemas.
- Developed Ant Scripts for the build process.
- Version control was mandated through Subversion.
- Performed unit testing using JUnit and load testing using LoadRunner.
- Implemented Log4J to trace logs and to track information.
Environment: Java, Struts, JSP, JSTL, JSON, JavaScript, JSF, POJOs, Hibernate, Hadoop, Spring, Teradata, PL/SQL, CSS, Log4j, JUnit, Subversion, Informatica, Eclipse, Netezza, Jenkins, Git, Oracle 11g, LoadRunner, ANT
Confidential, San Jose, CA
Java/J2EE Developer
Responsibilities:
- Created design documents and reviewed them with the team, and assisted the business analyst / project manager in explanations to the line of business.
- Responsible for understanding the scope of the project and requirement gathering.
- Involved in analysis, design, construction and testing of the online banking application
- Developed the web tier using JSP, Struts MVC to show account details and summary.
- Used Struts Tiles Framework in the presentation tier.
- Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
- Used AJAX for asynchronous communication with server
- Utilized Hibernate for Object/Relational Mapping purposes for transparent persistence onto the SQL Server database.
- Used Spring Core for dependency injection/Inversion of control (IOC), and integrated frameworks like Struts and Hibernate.
- Developed ETL mapping testing, correction and enhancement and resolved data integrity issues.
- Involved in writing Spring configuration XML files that contain declarations and other dependent object declarations.
- Used Tomcat web server for development purpose.
- Involved in creating and running test cases for JUnit testing.
- Used Oracle as the database and Toad for query execution; also wrote SQL scripts and PL/SQL code for procedures and functions.
- Used CVS for version control.
- Developed the application using Eclipse and used Maven as the build and deploy tool.
- Used Log4J to print logging, debugging, warning and info messages on the server console.
Environment: Java, J2EE Servlet, JSP, JUnit, AJAX, XML, JSON, CSS, JavaScript, Spring, Struts, Hibernate, Log4j, CVS, Maven, Eclipse, Apache Tomcat, and Oracle.
Confidential, Minneapolis MN
J2EE Developer
Responsibilities:
- Created UML class diagrams that depict the code’s design and its compliance with the functional requirements.
- Used J2EE design Patterns for the Middle Tier development.
- Developed EJBs in WebLogic for handling business processes, database access and asynchronous messaging.
- Used the JavaMail notification mechanism to send confirmation emails to customers about scheduled payments.
- Heavy experience in UI development.
- Developed Message-Driven beans in collaboration with Java Messaging Service (JMS) to communicate with the merchant systems.
- Also involved in writing JSPs, JavaScript and Servlets to generate dynamic web pages and web content.
- Wrote Stored Procedures and Triggers using PL/SQL.
- Involved in building and parsing XML documents using SAX parser after retrieving payment history data from the database.
- Deployed the application on JBOSS Application Server.
- Used ClearCase for version control and configuration management.
- Very strong knowledge of J2EE-based application servers such as JBoss, WebSphere and WebLogic, and web servers such as Apache Tomcat.
- Experience in implementing Web Services using SOAP, REST and XML/HTTP technologies.
Environment: Java, JSP, JSTL, EJB, JMS, JavaScript, JSF, XML, JBOSS, WebSphere, WebLogic, Hibernate, Spring, SQL, PL/SQL, CSS, Log4j, JUnit, Subversion, Eclipse, Oracle 11g, LoadRunner, ANT.
