In a blog post dated Sep 23, 2024, the author distinguishes two types of responses from language model systems: informational and instructional. Informational responses provide factual or conceptual knowledge, whereas instructional responses offer step-by-step guidance. The author points out that most datasets used for evaluation, such as MMLU and GSM8K, are limited in size and context, which may be insufficient given the increasingly large context windows in use today. The author argues that instructional responses should be evaluated as a separate category, especially in domain-specific RAG (Retrieval-Augmented Generation) systems that enterprises use to handle HR policies, vacation applications, and other procedures.