Before coming to Mesosphere as an intern, I had read the Mesos paper and had a general understanding of its architecture, but none of that understanding was based on actual code. After 1.5 months at Mesosphere, having contributed some code to the project, I feel it’s necessary to connect the general flow with the code logic.
There is already a pretty good explanation of the roles of, and the relationships between, the important components in Mesos.
In brief, we have several important concepts: Master, Agent/Slave, and Framework (consisting of a Scheduler and an Executor). Master and Agent nodes are physical or virtual machines in the datacenter. The Master node is responsible for scheduling tasks (here “task” has its everyday meaning, “a specific piece of work to do, such as running a Spark job”, which is different from the definition of “task” in the Mesos project), and the Agent node is responsible for running them. In order to schedule and execute tasks in this distributed system, it’s necessary to have corresponding parts on the Master side and on the Agent side doing the scheduling and the executing work respectively. We call the two parts together a Framework. The part that works on the Master side is called the scheduler (it registers with the Master, though it does not have to run on the Master machine itself); the other part, which works on the Agent node, is called the executor. From this perspective, it’s pretty easy to understand where these two names come from.
Since there might be multiple kinds of tasks to run, we sometimes need to run multiple Frameworks (and thus multiple scheduler and executor components) on Mesos. As a result, there might be multiple schedulers on the Master side and multiple executors on each Agent node.
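To make these relationships concrete, here is a small toy model in Python. It is only a sketch of the roles described above, not the Mesos API: the class names, the `register` and `dispatch` methods, and the round-robin placement are all my own assumptions for illustration. Real Mesos places work through resource offers, which this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Framework part that lives on an Agent and carries out the work."""
    framework: str

    def run(self, task: str) -> str:
        return f"[{self.framework}] ran {task}"


@dataclass
class Scheduler:
    """Framework part that registers with the Master and holds pending work."""
    framework: str
    pending: list = field(default_factory=list)

    def next_task(self):
        # Return the next pending task, or None when nothing is left.
        return self.pending.pop(0) if self.pending else None


@dataclass
class Agent:
    name: str
    executors: dict = field(default_factory=dict)  # framework name -> Executor

    def launch(self, framework: str, task: str) -> str:
        # One executor per framework on this agent, created on first use.
        if framework not in self.executors:
            self.executors[framework] = Executor(framework)
        return self.executors[framework].run(task)


@dataclass
class Master:
    agents: list
    schedulers: dict = field(default_factory=dict)  # framework name -> Scheduler

    def register(self, scheduler: Scheduler) -> None:
        self.schedulers[scheduler.framework] = scheduler

    def dispatch(self) -> list:
        # Hypothetical placement policy: hand tasks to agents round-robin.
        log = []
        i = 0
        for name, scheduler in self.schedulers.items():
            while (task := scheduler.next_task()) is not None:
                agent = self.agents[i % len(self.agents)]
                log.append(f"{agent.name}: {agent.launch(name, task)}")
                i += 1
        return log


# Two frameworks share the same master and agents.
master = Master(agents=[Agent("agent-1"), Agent("agent-2")])
master.register(Scheduler("spark", ["job-a", "job-b"]))
master.register(Scheduler("hadoop", ["job-c"]))
log = master.dispatch()
for line in log:
    print(line)
```

After `dispatch()`, the master knows about two schedulers, and `agent-1` ends up holding two executors (one per framework), which is the "multiple schedulers, multiple executors" situation from the last paragraph.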