【注】 本文的兩個(gè)主要圖片可能不夠清晰,可以從這里下載。

1. 涉及到的狀態(tài)機
(1)RMApp:每個(gè)application對應一個(gè)RMApp對象,保存該application的各種信息。
(2)RMAppAttempt:每個(gè)RMApp可能會(huì )對應多個(gè)RMAppAttempt對象,這取決于前面的RMAppAttempt是否執行成功,如果不成功,會(huì )啟動(dòng)另外一個(gè),直到運行成功。RMAppAttempt對象稱(chēng)為“application執行嘗試”,這RMApp與RMAppAttempt關(guān)系類(lèi)似于MapReduce中的task與taskAttempt的關(guān)系。
(3)RMNode:保存各個(gè)節點(diǎn)的信息。
(4)RMContainer:保存各個(gè)container的信息。
2. 事件調度器
(1)AsyncDispatcher
中央事件調度器,各個(gè)狀態(tài)機的事件調度器會(huì )在中央事件調度器中注冊,注冊方式信息包括:<事件,事件調度器>。該調度器維護了一個(gè)事件隊列,它會(huì )不斷掃描整個(gè)隊列,取出一個(gè)事件,檢查事件類(lèi)型,并交給相應的事件調度器處理。
(2)各個(gè)子事件調度器
| 事件類(lèi)型 | 狀態(tài)機 | 事件處理器 |
| RMAppEvent | RMApp | ApplicationEventDispatcher |
| RMAppAttemptEvent | RMAppAttempt | ApplicationAttemptEventDispatcher |
| RMNodeEvent | RMNode | NodeEventDispatcher |
| SchedulerEvent | — | SchedulerEventDispatcher |
| AMLauncherEvent | — | ApplicationMasterLauncher |
3. ResourceManager中事件處理流
(1)Client通過(guò)RMClientProtocol協(xié)議向ResourceManager提交application。
<1> 代碼所在目錄:
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java
<2> jar包:org.apache.hadoop.mapred
<3>關(guān)鍵類(lèi)與關(guān)鍵函數:YARNRunner.submitJob()
(2) ResourceManager端的ClientRMService服務(wù)接收到application,使得RMAppManager調用handle函數處理RMAppManagerSubmitEvent事件,處理邏輯如下:為該application創(chuàng )建RMAppImpl對象,保存其信息,接著(zhù)產(chǎn)生RMAppEventType.START事件.
<1> 代碼所在目錄:
hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-resourcemanager\src\main\java\org\apache\hadoop\yarn\server\resourcemanager
<2> jar包:org.apache.hadoop.yarn.server.resourcemanager
<3>關(guān)鍵類(lèi)與關(guān)鍵函數:ClientRMService.submitApplication(),RMAppManager.submitApplication()
(3) RMAppEventType.START事件傳遞給AsyncDispatcher,AsyncDispatcher查看相關(guān)數據結構,確定該事件由ApplicationEventDispatcher處理,該dispatcher將RMApp從RMAppState.NEW狀態(tài)變?yōu)镽MAppState.SUBMITTED狀態(tài),同時(shí)創(chuàng )建RMAppAttemptImpl對象,并觸發(fā)RMAppAttemptEventType.START事件。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmapp
<2> 關(guān)鍵類(lèi)與關(guān)鍵函數:RMAppImpl.StartAppAttemptTransition
(4)RMAppAttemptEventType.START事件傳遞給AsyncDispatcher,AsyncDispatcher查看相關(guān)數據結構,確定該事件由ApplicationAttemptEventDispatcher處理,該dispatcher將RMAppAttempt從RMAppAttemptState.NEW變?yōu)镽MAppAttemptState.SUBMITTED狀態(tài)。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt
<2> 關(guān)鍵類(lèi)與關(guān)鍵函數:RMAppAttemptImpl.StateMachineFactory
(5) RMAppAttempt向ApplicationMasterService注冊,它將之保存在responseMap中。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt
<2> 關(guān)鍵類(lèi)與關(guān)鍵函數:RMAppAttemptImpl.AttemptStartedTransition
(6)RMAppAttempt觸發(fā)AppAddedSchedulerEvent
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt
<2> 關(guān)鍵類(lèi)與關(guān)鍵函數:RMAppAttemptImpl.AttemptStartedTransition
(7)ResourceScheduler(如FifoScheduler)捕獲AppAddedSchedulerEvent事件,并創(chuàng )建SchedulerApp對象,使RMAppAttempt對像從RMAppAttemptState.SUBMITTED轉化為RMAppAttemptState.SCHEDULED狀態(tài),同時(shí)產(chǎn)生RMAppAttemptEventType.APP_ACCEPTED事件。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo
<2> 關(guān)鍵類(lèi)與關(guān)鍵函數:FifoScheduler.addApplication
(8)RMAppAttemptEventType.APP_ACCEPTED事件由ApplicationAttemptEventDispatcher捕獲,并將RMAppAttempt從RMAppAttemptState.SUBMITTED轉化為 RMAppAttemptState.SCHEDULED狀態(tài),并產(chǎn)生RMAppEventType.APP_ACCEPTED事件。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt
<2> 關(guān)鍵類(lèi):RMAppAttemptImpl.ScheduleTransition
(9)調用ResourceScheduler的allocate函數,為ApplicationMaster申請一個(gè)container。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt
<2> 關(guān)鍵類(lèi):RMAppAttemptImpl.ScheduleTransition
(10)此刻,某個(gè)node(稱(chēng)為“AM-NODE”)正好通過(guò)heartbeat向ResourceManager.ResourceTrackerService匯報自己所在節點(diǎn)的資源使用情況。
(11) ResourceTrackerService.nodeHeartbeat收到heartbeat信息后,觸發(fā)RMNodeStatusEvent(RMNodeEventType.STATUS_UPDATE)事件。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager
<2> 關(guān)鍵類(lèi):ResourceTrackerService.nodeHeartbeat
(12) RMNodeStatusEvent被ResourceScheduler捕獲,調用assginContainers為該application分配一個(gè)container(用對象RMContainer表示),分配之后,會(huì )觸發(fā)一個(gè)RMContainerEventType.START事件。
(13) RMContainerEventType.START事件被NodeEventDispatcher捕獲,使得RMContainer對象從RMContainerState.NEW狀態(tài)轉變?yōu)镽MContainerState.ALLOCATED狀態(tài),同時(shí)觸發(fā)RMAppAttemptContainerAllocatedEvent(RMAppAttemptEventType.CONTAINER_ALLOCATED)事件.
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.rmcontainer
<2> 關(guān)鍵類(lèi):RMContainerImpl.ContainerStartedTransition
(14) RMAppAttemptContainerAllocatedEvent事件被 ApplicationAttemptEventDispatcher捕獲,并將RMAppAttempt對象從RMAppAttemptState.SCHEDULED狀態(tài)轉變?yōu)镽MAppAttemptState.ALLOCATED狀態(tài),同時(shí)調用Scheduler的allocate函數申請一個(gè)container,并觸發(fā)AMLauncherEventType.LAUNCH事件
(15)AMLauncherEventType.LAUNCH事件被ApplicationMasterLauncher捕獲,主要處理邏輯如下:創(chuàng )建一個(gè)AMLauncher對象,并添加到隊列masterEvents中,等待處理;一旦被處理,會(huì )調用AMLauncher.launch()函數,該函數會(huì )調用ContainerManager.startContainer()函數創(chuàng )建container,同時(shí)觸發(fā)RMAppAttemptEventType.LAUNCHED事件。
<1> jar包:org.apache.hadoop.yarn.server.resourcemanager.amlauncher
<2> 關(guān)鍵類(lèi):ApplicationMasterLauncher
(16) RMAppAttemptEventType.LAUNCHED事件被ApplicationAttemptEventDispatcher捕獲,并將RMAppAttempt對象從RMAppAttemptState.ALLOCATED狀態(tài)轉變?yōu)镽MAppAttemptState.LAUNCHED狀態(tài)。
(17)將該application的RMAppAttempt對象注冊到AMLivenessMonitor中,以便實(shí)時(shí)監控該application的存活狀態(tài)。
(18)AM-NODE節點(diǎn)為該Application創(chuàng )建ApplicationMaster,接下來(lái)ApplicationMaster會(huì )與ResourceManager協(xié)商資源并通知NodeManager創(chuàng )建Container。ApplicationMaster首先會(huì )向ApplicationMasterService注冊。
(19)ApplicationMasterService收到新的ApplicationMaster注冊請求后,會(huì )觸發(fā)RMAppAttemptRegistrationEvent(RMAppAttemptEventType.REGISTERED)事件。
(20)RMAppAttemptRegistrationEvent事件被 ApplicationAttemptEventDispatcher捕獲,并將RMAppAttempt對象從RMAppAttemptState.LAUNCHED狀態(tài)轉化為RMAppAttemptState.RUNNING狀態(tài),同時(shí)觸發(fā)RMAppEventType.ATTEMPT_REGISTERED事件。
(21)至此,該application的ApplicationMaster創(chuàng )建與注冊完畢,接下來(lái)ApplicationMaster會(huì )根據Application的資源需求向ResourceManager請求資源,同時(shí)監控各個(gè)子任務(wù)的執行情況。
4. ResourceManager中事件處理流直觀(guān)圖
下圖是從另一個(gè)方面對上圖的重新繪制:

聯(lián)系客服