Request: Network Operations Extension


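The Request extension can issue arbitrarily complex network requests, for example sending cookies, spoofing the referrer, or spoofing the browser (User-Agent).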

Installation:

  composer require jaeger/querylist-ext-request

Git repository:

  https://github.com/jae-jae/QueryList-Ext-Request.git

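Dependencies (skip this section if you installed via Composer)

The Request extension depends on the Http class, whose Git repository is https://github.com/jae-jae/Http.git

Manual plugin installation guide: http://doc.querylist.cc/site/index/doc/7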

Usage 1

  $ql = QueryList::run('Request', [
      'http' => [
          'target' => 'target page to scrape',
          'referrer' => 'referrer URL',
          'method' => 'request method: GET, POST, etc.',
          'params' => ['parameter name' => 'parameter value', 'key' => 'value'],
          // plus any other HTTP-related options; see the Http class source for details
      ],
      'callback' => function ($html, $args) {
          // callback that processes the fetched HTML
          return $html;
      },
      'args' => 'arguments passed to the callback'
  ]);
  $data = $ql->setQuery(...)->data;
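
As an illustration of what the `callback` option might do with the fetched HTML, here is a minimal sketch in plain PHP that strips `<script>` blocks before any scraping rules run. It does not depend on QueryList; the variable name `$stripScripts` and the sample markup are hypothetical, and the second parameter is accepted only to mirror the `($html, $args)` shape shown above.

```php
<?php
// Hypothetical callback (not part of QueryList): remove <script> blocks from HTML.
$stripScripts = function (string $html, $args = null): string {
    // Non-greedy, case-insensitive match; the "s" flag lets "." span newlines.
    // preg_replace returns null on error, in which case the input is returned unchanged.
    return preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html) ?? $html;
};

// Sample markup, made up for this sketch.
$sample = '<h2><a href="/news/1.html">Title</a></h2><script>alert(1);</script>';
echo $stripScripts($sample), PHP_EOL;
```

A closure of this shape could presumably be passed as the `'callback'` value; whether the extension forwards `args` exactly this way should be confirmed against its source.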

Usage 2

  $ql = QueryList::run('Request', [
      'target' => 'target page to scrape',
      'referrer' => 'referrer URL',
      'method' => 'request method: GET, POST, etc.',
      'params' => ['parameter name' => 'parameter value', 'key' => 'value'],
      // plus any other HTTP-related options; see the Http class source for details
  ]);
  $data = $ql->setQuery(...)->data;

The return value is a QueryList object whose html property has already been set. Call QueryList's setQuery method on it next to define the scraping rules.

  // HTTP operations extension
  $urls = QueryList::run('Request', [
      'target' => 'http://cms.querylist.cc/news/list_2.html',
      'referrer' => 'http://cms.querylist.cc',
      'method' => 'GET',
      'params' => ['var1' => 'testvalue', 'var2' => 'somevalue'],
      'user_agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0',
      'cookiePath' => './cookie.txt',
      'timeout' => '30'
  ])->setQuery(['link' => ['h2>a', 'href', '', function ($content) {
      // use a callback to turn relative links into absolute ones
      $baseUrl = 'http://cms.querylist.cc';
      return $baseUrl . $content;
  }]], '.cate_list li')->getData(function ($item) {
      return $item['link'];
  });
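
The link callback in the example simply prepends the base URL to every href. A slightly more defensive variant is sketched below in plain PHP, with no QueryList dependency: it leaves links that are already absolute untouched and avoids doubled slashes. The function name `completeLink` is hypothetical and not part of QueryList or the Request extension, and the sketch assumes hrefs are relative to the site root.

```php
<?php
// Hypothetical helper (not part of QueryList): make a site-root-relative href absolute.
function completeLink(string $baseUrl, string $href): string
{
    // Links that already carry an http/https scheme are returned unchanged.
    if (preg_match('#^https?://#i', $href) === 1) {
        return $href;
    }
    // Join base and path with exactly one slash between them.
    return rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
}

echo completeLink('http://cms.querylist.cc', '/news/1.html'), PHP_EOL;
// prints "http://cms.querylist.cc/news/1.html"
```

Inside the setQuery callback, the body could then be reduced to `return completeLink($baseUrl, $content);`. Protocol-relative links (`//host/path`) and paths relative to the current directory are not handled by this sketch.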