如何在 PHP 中解析响应头-之路教程

curl

使用 cURL 获取响应头更加困难，因为它们不会立即可用，这与我们使用内置文件函数时不同。
相反，我们必须手动从请求中提取标头。
这可以在执行请求后使用 curl_getinfo 函数完成：

$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
$headers = substr($response, 0, $header_size);
$body = substr($response, $header_size);

然后将标头作为字符串存储在 $headers 变量中。

为了从字符串创建关联数组，我们可以遍历字符串中的每一行，并在执行时将其内容保存到数组中：

//Define the $response_headers array for later use
$response_headers = [];
//Get the first line (The Status Code)
$line = strtok($headers, "\r\n");
$status_code = trim($line);
//Parse the string, saving it into an array instead
while (($line = strtok("\r\n")) !== false) {
    if(false !== ($matches = explode(':', $line, 2))) {
      $response_headers["{$matches[0]}"] = trim($matches[1]);
    }  
}

如何在 PHP 中解析响应头

当我们使用内置 PHP 文件函数执行 HTTP 请求时，响应标头将自动在特殊变量 $http_response_header 中可用；这在使用内置文件函数时非常有用，但在使用 cURL 库时不起作用。
在本教程中，我们将学习如何解析请求标头，无论我们使用的是 cURL 还是 file_get_contents 等文件函数：

有点奇怪的是，标头存储为索引数组，而不是更用户友好的关联数组；但这只是一个小小的不便，因为我们可以轻松地自行转换为关联数组。

为了解析响应头并创建一个关联数组，我们可以将这个解决方案用于内置文件函数：

$response_headers = [];
$status_message = array_shift($http_response_header);
foreach ($http_response_header as $value) {
  if(false !== ($matches = explode(':', $value, 2))) {
    $response_headers["{$matches[0]}"] = trim($matches[1]);
  }                
}

使用 cURL 库时的这个：

//Define the $response_headers array for later use
$response_headers = [];
//Get the first line (The Status Code)
$line = strtok($headers, "\r\n");
$status_code = trim($line);
//Parse the string, saving it into an array instead
while (($line = strtok("\r\n")) !== false) {
    if(false !== ($matches = explode(':', $line, 2))) {
      $response_headers["{$matches[0]}"] = trim($matches[1]);
    }  
}

这样做可以轻松检查给定的标头是否存在，只需在数组键上使用 isset 即可：

if (isset($response_headers["content-type"])) {
  echo '<p>The "content-type" header was found, and the content is:</p>';
  echo $response_headers["content-type"];
  exit();
}

文件函数

如前所述，要使用 PHP 的内置文件函数获取响应标头，我们可以遍历 $http_response_header 变量；在我们执行 HTTP 请求后，PHP 将自动将响应标头作为此变量中的索引数组提供给我们。

用于使用文件函数发出 HTTP 请求的函数通常包括 file_get_contents、stream_context_create 和 stream_get_contents——如何使用它们在其他教程中介绍。

$http_response_header 数组中的第一个元素始终是 HTTP 状态代码——即使在读取原始标头时，状态代码也始终排在最前面。
将状态代码存储在单独的变量中可能很有用：

$status_message = array_shift($http_response_header);

array_shift 函数在这里有两个目的：

它返回数组中的第一个元素。

它从数组中删除第一个元素。

所有数字数组键也将相应更新，因此不会有任何丢失的键。

HTTP 标头由 key: value 对组成，但我们不能仅通过冒号 (:) 将它们转换为数组，因为标头值也可能包含冒号。
所以，我们可以做的，而不是求助于正则表达式，就是用字符串中的第一个冒号分割字符串，因为它总是在键名之后。

虽然使用正则表达式更容易，但使用 stripos 和 substr 的组合或者爆炸要快 10% 到 68%——这在实践中并不重要——
但无论如何，我认为我们应该坚持最快的方式。

以下所有方法实际上都非常易于使用，因此应该使用哪种方法最有效。

解决方案1：

使用explode函数比使用preg_match快68%：

$headers = array();
$status_message = array_shift($http_response_header);
foreach ($http_response_header as $value) {
  if(false !== ($matches = explode(':', $value, 2))) {
    $headers["{$matches[0]}"] = trim($matches[1]);
  }                
}

解决方案2：

这是如何使用 stripos 和 substr ，它比使用正则表达式快 10% 左右：

$headers = array();
$status_message = array_shift($http_response_header);
foreach ($http_response_header as $value) {
  $pos = stripos($value, ':');
  $key = substr($value, 0, $pos);
  $value = substr($value, $pos+1);
  $headers["$key"] = trim($value);
}
print_r($headers);

解决方案3：

如果我们出于某种原因更喜欢使用正则表达式，请随意使用；对于大多数网站来说，速度差异是微不足道的——但请记住，我们获得的并发用户越多，即使是很小的优化，我们也会受益更多。
下面是如何使用 preg_match 来做同样的事情：

$headers = array();
$status_message = array_shift($http_response_header);
foreach ($http_response_header as $value) {
  if (preg_match('/^([^:]+):([^\n]+)/', $value, $matches)) {
    $headers["{$matches[1]}"] = trim($matches[2]);
  }  
}
print_r($headers);

当然我们也可以在使用正则表达式的时候去掉trim函数；我只是把它留在那里以获得公平的基准结果。

如何执行基准测试

每个测试都是通过解析标头 100 万次来执行的，如下所示：

$start_time = microtime(true);
$repeat = 0;
$status_message = array_shift($http_response_header);
while ($repeat < 1000000) {
  $headers = array();
  $status_message = array_shift($http_response_header);
  foreach ($http_response_header as $value) {
    if(false !== ($matches = explode(':', $value, 2))) {
      $headers["{$matches[0]}"] = trim($matches[1]);
    }
  }
  ++$repeat;
}
$end_time = microtime(true);
echo $end_time - $start_time . "\n\n";
var_dump($headers);exit();

日期：2020-06-02 22:17:30 来源：oir作者：oir

←PHP：Parse_url 函数

分区方案MBR与GPT的区别→

curl

文件函数

如何执行基准测试

目录