Cloudera Developer Training for Apache Hadoop

Course Outline:

  1. Motivation for Hadoop
    Problems with Traditional Large-Scale Systems
    Requirements for a New Approach
  2. Hadoop: Basic Concepts
    Hadoop Distributed File System (HDFS)
    MapReduce
    Anatomy of a Hadoop Cluster
    Other Hadoop Ecosystem Components
  3. Writing a MapReduce Program
    MapReduce Flow
    Examining a Sample MapReduce Program
    Basic MapReduce API Concepts
    Driver Code
    Mapper
    Reducer
    Streaming API
    Using Eclipse for Rapid Development
    New MapReduce API
  4. Integrating Hadoop into the Workflow
    Relational Database Management Systems
    Storage Systems
    Importing Data from a Relational Database Management System with Sqoop
    Importing Real-Time Data with Flume
    Accessing HDFS Using FuseDFS and Hoop
  5. Delving Deeper into the Hadoop API
    ToolRunner
    Testing with MRUnit
    Reducing Intermediate Data with Combiners
    The configure and close Methods for Map/Reduce Setup and Teardown
    Writing Partitioners for Better Load Balancing
    Directly Accessing HDFS
    Using the Distributed Cache
  6. Common MapReduce Algorithms
    Sorting and Searching
    Indexing
    Machine Learning with Mahout
    Term Frequency
    Inverse Document Frequency
    Word Co-Occurrence
  7. Using Hive and Pig
    Hive Basics
    Pig Basics
  8. Practical Development Tips and Techniques
    Debugging MapReduce Code
    Using LocalJobRunner Mode for Easier Debugging
    Retrieving Job Information with Counters
    Logging
    Splittable File Formats
    Determining the Optimal Number of Reducers
    Map-Only MapReduce Jobs
  9. Advanced MapReduce Programming
    Custom Writables and WritableComparables
    Saving Binary Data Using SequenceFiles and Avro Files
    Creating InputFormats and OutputFormats
  10. Joining Data Sets in MapReduce
    Map-Side Joins
    Secondary Sort
    Reduce-Side Joins
  11. Graph Manipulation in Hadoop
    Graph Techniques
    Representing Graphs in Hadoop
    Implementing a Sample Algorithm: Single Source Shortest Path
  12. Creating Workflows with Oozie
    Motivation for Oozie
    Workflow Definition Format