Token | Description | Example Model Output |
---|---|---|
[RES] | Signals a voice-only response. Generated as the first token for conversational replies. | [RES] I see an apple on the table. |
[ACT] | Signals that the response includes a physical action. Generated as the first token to enter action mode. | [ACT] Okay, I will put the toy in the box. [INST] Pick up toy and place in box. |
[INST] | Delimits the spoken part of an action response from the internal action instruction that follows. | Used after [ACT] to separate speech from the action instruction. |
[HALT] | Commands an immediate stop of the current action. Generated as the first token for emergency stops. | [HALT] Stopping immediately. |
[END] | Signals that a multi-step action sequence has been successfully completed. | [END] The action is finished. |
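
The token layout in the table can be illustrated with a small parser. This is a minimal sketch, not the system's actual decoding code: it assumes the model output arrives as a plain string with the control tokens spelled literally, and the names `ParsedOutput` and `parse_output` are hypothetical. It also assumes that an `[ACT]` output without `[INST]` simply carries no instruction, which the table does not specify.

```python
from dataclasses import dataclass
from typing import Optional

# Control tokens that may open a response, as listed in the table.
LEADING_TOKENS = ("[RES]", "[ACT]", "[HALT]", "[END]")
# Separator between speech and the action instruction in [ACT] responses.
INST_TOKEN = "[INST]"


@dataclass
class ParsedOutput:
    token: str                         # leading control token, e.g. "[ACT]"
    speech: str                        # text intended to be spoken
    instruction: Optional[str] = None  # action instruction after [INST], if any


def parse_output(text: str) -> ParsedOutput:
    """Split a raw model output into control token, speech, and instruction."""
    text = text.strip()
    for token in LEADING_TOKENS:
        if text.startswith(token):
            body = text[len(token):].strip()
            break
    else:
        raise ValueError(f"output does not start with a known token: {text!r}")

    if token == "[ACT]":
        # Assumption: a missing [INST] leaves the instruction unset
        # rather than raising an error.
        speech, sep, instruction = body.partition(INST_TOKEN)
        return ParsedOutput(token, speech.strip(),
                            instruction.strip() if sep else None)

    # [RES], [HALT], and [END] carry speech only.
    return ParsedOutput(token, body)


if __name__ == "__main__":
    sample = ("[ACT] Okay, I will put the toy in the box. "
              "[INST] Pick up toy and place in box.")
    print(parse_output(sample))
```

A downstream controller could dispatch on `token`: send `speech` to text-to-speech in every case, forward `instruction` to the action planner only for `[ACT]`, and treat `[HALT]` as a priority stop.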