详解php+nginx 服务发生500 502错误排查思路

概述

当线上的服务中访问中出现500或者502错误时,需要紧急处理,排查问题,该怎么做?可以通过分析一些错误日志或者跟踪php-fpm进程来进行问题定位。

nginx error_log

nginx的error_log在nginx的配置文件中定义的

server {
 listen  80;
 server_name localhost;
 root   /var/www;

 access_log /Users/jiao/logs/default.access.log;
 error_log /Users/jiao/logs/default.error.log;
 location / {
  index index.html index.htm index.php;
  autoindex on;
 }
 location = /info {
  allow 127.0.0.1;
  deny all;
  rewrite (.*) /.info.php;
 }
 location ~ \.php$ {
  root /var/www;
  fastcgi_pass 127.0.0.1:9000;
  fastcgi_index index.php;
  fastcgi_param SCRIPT_FILENAME /var/www$fastcgi_script_name;
  include /usr/local/etc/nginx/fastcgi_params;
 }
}

查看error_log

➜  tail /Users/jiao/logs/default.error.log
2019/07/17 11:08:18 [error] 77416#0: *76 kevent() reported about an closed connection (54: Connection reset by peer) while reading response header from upstream, client: 127.0.0.1, server: localhost, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "localhost"

发现出现了Connection reset by peer,连接被重置了,此时可以再查看php-fpm的error_log进一步分析问题

php-fpm error_log

php-fpm的error_log在php-fpm.conf文件中配置中定义的

; Error log file
; If it's set to "syslog", log is sent to syslogd instead of being written
; in a local file.
; Note: the default prefix is /usr/local/var
; Default Value: log/php-fpm.log
error_log = log/php-fpm.log

error_log里面的内容是这样的

➜ tail /usr/local/var/log/php-fpm.log
[17-Jul-2019 10:49:54] NOTICE: [pool www] child 81948 started
[17-Jul-2019 11:08:18] WARNING: [pool www] child 77537, script '/var/www/index.php' (request: "GET /index.php") execution timed out (3.801267 sec), terminating
[17-Jul-2019 11:08:18] WARNING: [pool www] child 77537 exited on signal 15 (SIGTERM) after 1503.113967 seconds from start
[17-Jul-2019 11:08:18] NOTICE: [pool www] child 94339 started

可以看到是请求/var/www/index.php文件出现了超时

dtruss

dtruss是动态跟踪命令,可以根据PID,name跟踪进程

mac环境下使用dtruss,linux环境可以使用strace,pstack

➜ dtruss
USAGE: dtruss [-acdefholLs] [-t syscall] { -p PID | -n name | command | -W name }
  -p PID   # examine this PID
  -n name   # examine this process name
  -t syscall  # examine this syscall only
  -W name   # wait for a process matching this name
  -a    # print all details
  -c    # print syscall counts
  -d    # print relative times (us)
  -e    # print elapsed times (us)
  -f    # follow children
  -l    # force printing pid/lwpid
  -o    # print on cpu times
  -s    # print stack backtraces
  -L    # don't print pid/lwpid
  -b bufsize  # dynamic variable buf size

eg,

 dtruss df -h  # run and examine "df -h"
 dtruss -p 1871  # examine PID 1871
 dtruss -n tar  # examine all processes called "tar"
 dtruss -f test.sh # run test.sh and follow children

跟踪php-fpm:sudo dtruss -a -n php-fpm

此时访问web页面,就可以看到跟踪内容

21416/0x3479b6:  1559  63  3 getrusage(0x0, 0x7FFEE1EC0760, 0x0)   = 0 0
21416/0x3479b6:  1561  4  0 getrusage(0xFFFFFFFFFFFFFFFF, 0x7FFEE1EC0760, 0x0)   = 0 0
21416/0x3479b6:  1627  77  17 poll(0x7FFEE1EC08C0, 0x1, 0x1388)   = 1 0
dtrace: error on enabled probe ID 2174 (ID 159: syscall::read:return): invalid kernel access in action #13 at DIF offset 68
dtrace: error on enabled probe ID 2174 (ID 159: syscall::read:return): invalid kernel access in action #13 at DIF offset 68
dtrace: error on enabled probe ID 2174 (ID 159: syscall::read:return): invalid kernel access in action #13 at DIF offset 68
dtrace: error on enabled probe ID 2174 (ID 159: syscall::read:return): invalid kernel access in action #13 at DIF offset 68
dtrace: error on enabled probe ID 2174 (ID 159: syscall::read:return): invalid kernel access in action #13 at DIF offset 68
21416/0x3479b6:  1872  29  24 lstat64("/var/www/index.php\0", 0x7FFEE1ECFF38, 0x0)   = 0 0
21416/0x3479b6:  1884  9  6 lstat64("/var/www\0", 0x7FFEE1ECFDF8, 0x0)   = 0 0
21416/0x3479b6:  1889  6  3 lstat64("/var\0", 0x7FFEE1ECFCB8, 0x0)   = 0 0
21416/0x3479b6:  1899  12  8 readlink("/var\0", 0x7FFEE1ED0090, 0x400)   = 11 0
21416/0x3479b6:  1905  6  4 lstat64("/private/var\0", 0x7FFEE1ECFB78, 0x0)   = 0 0
21416/0x3479b6:  1917  6  3 lstat64("/private\0", 0x7FFEE1ECFA38, 0x0)   = 0 0
21416/0x3479b6:  2178  18  14 stat64("/var/www/.user.ini\0", 0x7FFEE1ED0240, 0x0)   = -1 Err#2
21416/0x3479b6:  2217  5  1 setitimer(0x2, 0x7FFEE1ED07E0, 0x0)   = 0 0
21416/0x3479b6:  2225  4  0 sigaction(0x1B, 0x7FFEE1ED0788, 0x7FFEE1ED07B0)   = 0 0
21416/0x3479b6:  2237  5  1 sigprocmask(0x2, 0x7FFEE1ED0804, 0x0)   = 0x0 0
21416/0x3479b6:  3643  48  40 open_nocancel(".\0", 0x0, 0x1)   = 5 0
21416/0x3479b6:  3648  7  3 fstat64(0x5, 0x7FFEE1ED0110, 0x0)   = 0 0
21416/0x3479b6:  3653  7  2 fcntl_nocancel(0x5, 0x32, 0x10F252158)   = 0 0
21416/0x3479b6:  3661  12  7 close_nocancel(0x5)   = 0 0
21416/0x3479b6:  3670  10  7 stat64("/usr/local/var\0", 0x7FFEE1ED0080, 0x0)   = 0 0
21416/0x3479b6:  3681  11  8 chdir("/var/www\0", 0x0, 0x0)   = 0 0
21416/0x3479b6:  3698  4  0 setitimer(0x2, 0x7FFEE1ED02D0, 0x0)   = 0 0
21416/0x3479b6:  3710  6  3 fcntl(0x3, 0x8, 0x10F3FD858)   = 0 0
21416/0x3479b6:  3733  9  6 stat64("/private/var/www/index.php\0", 0x7FFEE1ECFF10, 0x0)   = 0 0
74904/0x332630: 723125 1073381  19 kevent(0x9, 0x0, 0x0)   = 0 0
74902/0x332629: 770666 1073387  17 kevent(0x8, 0x0, 0x0)   = 0 0
74904/0x332630: 723165 1061954  20 kevent(0x9, 0x0, 0x0)   = 0 0
74902/0x332629: 770709 1061954  20 kevent(0x8, 0x0, 0x0)   = 0 0
74904/0x332630: 723201 1074786  16 kevent(0x9, 0x0, 0x0)   = 0 0
74902/0x332629: 770747 1074783  16 kevent(0x8, 0x0, 0x0)   = 0 0
74904/0x332630: 723229 1069141  13 kevent(0x9, 0x0, 0x0)   = 0 0
74902/0x332629: 770777 1069145  11 kevent(0x8, 0x0, 0x0)   = 0 0
21416/0x3479b6:  3942 3902233  7 __semwait_signal(0x703, 0x0, 0x1)   = -1 Err#4
74902/0x332629: 770814  103  25 kill(21416, 15)   = 0 0
dtrace: error on enabled probe ID 2172 (ID 161: syscall::write:return): invalid kernel access in action #13 at DIF offset 68
dtrace: error on enabled probe ID 2172 (ID 161: syscall::write:return): invalid kernel access in action #13 at DIF offset 68
74902/0x332629: 771325  7  2 sigreturn(0x7FFEE1ECFC40, 0x1E, 0xC1A4B78E0404663A)   = 0 Err#-2
74902/0x332629: 771336  7  3 kevent(0x8, 0x0, 0x0)   = 1 0
dtrace: error on enabled probe ID 2174 (ID 159: syscall::read:return): invalid kernel access in action #13 at DIF offset 68
74902/0x332629: 771352  11  7 wait4(0xFFFFFFFFFFFFFFFF, 0x7FFEE1ED0748, 0x3)   = 21416 0
dtrace: error on enabled probe ID 2172 (ID 161: syscall::write:return): invalid kernel access in action #13 at DIF offset 68
74902/0x332629: 773511 1957 1899 fork()   = 28060 0
28060/0x3754c5:  125:  0:  0 fork()   = 0 0
28060/0x3754c5:  128  9  2 bsdthread_register(0x7FFF6774C418, 0x7FFF6774C408, 0x2000)   = -1 Err#22
dtrace: error on enabled probe ID 2172 (ID 161: syscall::write:return): invalid kernel access in action #13 at DIF offset 68
74902/0x332629: 773737  4  1 wait4(0xFFFFFFFFFFFFFFFF, 0x7FFEE1ED0748, 0x3)   = 0 0
74902/0x332629: 773742  6  3 read(0x5, "\0", 0x1)   = -1 Err#35
28060/0x3754c5:  320  4  0 getpid(0x0, 0x0, 0x0)   = 28060 0
28060/0x3754c5:  328  7  2 __mac_syscall(0x7FFF67758A17, 0x4, 0x7FFEE1ED0208)   = -1 Err#45
28060/0x3754c5:  332  5  2 csops(0x6D9C, 0xB, 0x7FFEE1ED0248)   = -1 Err#22
28060/0x3754c5:  755  14  11 dup2(0x1, 0x2, 0x0)   = 2 0
28060/0x3754c5:  797  89  22 close(0x4)   = 0 0
28060/0x3754c5:  806  11  6 dup2(0x7, 0x0, 0x0)   = 0 0
28060/0x3754c5:  817  4  0 geteuid(0x0, 0x0, 0x0)   = 501 0
28060/0x3754c5:  820  3  0 close(0x5)   = 0 0
28060/0x3754c5:  821  3  0 close(0x6)   = 0 0
28060/0x3754c5:  824  5  1 sigaction(0xF, 0x7FFEE1ED0688, 0x0)   = 0 0
28060/0x3754c5:  825  3  0 sigaction(0x2, 0x7FFEE1ED0688, 0x0)   = 0 0
28060/0x3754c5:  827  3  0 sigaction(0x1E, 0x7FFEE1ED0688, 0x0)   = 0 0
28060/0x3754c5:  828  3  0 sigaction(0x1F, 0x7FFEE1ED0688, 0x0)   = 0 0
28060/0x3754c5:  829  3  0 sigaction(0x14, 0x7FFEE1ED0688, 0x0)   = 0 0
28060/0x3754c5:  830  3  0 sigaction(0x3, 0x7FFEE1ED0688, 0x0)   = 0 0
28060/0x3754c5:  1043  3  0 close(0x7)   = 0 0

可以看到系统底层执行的函数,如lstat64获取文件内容信息,kill(21416, 15)kill掉php-fpm进程,fork()出新的php-fpm进程,有兴趣可以深入研究每个指令的作用

参考

https://www.jb51.net/article/165773.htm

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持我们。

(0)

相关推荐

  • nginx 502 Bad Gateway 错误解决办法

    一些运行在Nginx上的网站有时候会出现"502 Bad Gateway"错误,有些时候甚至频繁的出现.以下是小编搜集整理的一些Nginx 502错误的排查方法,供参考: Nginx 502错误的原因比较多,是因为在代理模式下后端服务器出现问题引起的.这些错误一般都不是nginx本身的问题,一定要从后端找原因!但nginx把这些出错都揽在自己身上了,着实让nginx的推广者备受置疑,毕竟从字眼上理解,bad gateway?不就是bad nginx吗?让不了解的人看到,会直接把责任推在

  • 深入探讨:Nginx 502 Bad Gateway错误的解决方法

    max_children=40 , 每个children平均占用20M-30M内存,children越多,可以同时接受的并发数量越多,一般children的值是网站最高并发数+浮动值,这值再×内存占用,就是你需要用到的内存.max_requests = N 是指当每个children接受了N次请求以后,就会把自己杀死,然后重新建立一个children.PV / max_children = 每一个children接受的request次数[ 默认预设浏览一个只调用一次PHP程序,或许异步调用呢?接

  • Nginx worker_connections配置太低导致500错误案例

    最近一次安全培训,需要用到安全攻防平台,结果30几个人登录上去直接爆出500错误.不知道什么原因,后来找来SSH登录用户,密码,逐步排查,发现了Nginx worker_connections配置问题. 原来是Nginx配置文件中的worker_connections配置太低,只有50,导致与php-fpm交互过程中超出了connections限制,出现了500错误.直接将此参数的值改成10240就解决了此问题.

  • 为Nginx自定义404,502错误页面的方法

    首先打开nginx.conf文件,在fastcgi_temp_file_write_size 128k; 下面添加 fastcgi_intercept_errors on;注意,包括;号 然后在需要定义的站点的里面添加 error_page 404 = /404.htm;例如: 复制代码 代码如下: server { listen 80; server_name www.deepvps.com; index index.html index.htm index.php; error_page 4

  • nginx提示502 页面的解决方法

    例如: 复制代码 代码如下: http      {      ......      fastcgi_connect_timeout 300;      fastcgi_send_timeout 300;      fastcgi_read_timeout 300;      ......      } 也有可能是PHP代码不正确,比如SVN冲突等原因.

  • nginx上传文件大小报错500的解决办法

    nginx上传文件大小报错500的解决办法 采用nginx作反向代理,出现了一个诡异的问题,小文件可以提交,大文件会报500内部错误.这个是什么原因导致的呢? 查wiki可知,上传文件大小相关的有三个配置 client_body_buffer_size 配置请求体缓存区大小, 不配的话, client_body_temp_path 设置临时文件存放路径.只有当上传的请求体超出缓存区大小时,才会写到临时文件中 client_max_body_size 设置上传文件的最大值 所以查出来,问题出现的原

  • PHP脚本监控Nginx 502错误并自动重启php-fpm

    最近服务器时不时出现Nginx 502 Bad Gateway,如果在电脑旁边还好,要是半夜或者出去了,怎么办? 没关系,写个脚本检测服务状态,发现异常,自动重启. 自动重启脚本: 复制代码 代码如下: <?php $url = 'http://blog.rebill.info'; $cmd = '/usr/local/php/sbin/php-fpm restart';   for($i = 0; $i < 5; $i ++){         $exec = "curl  con

  • 详解Nginx 502错误解决办法

    详解Nginx 502错误解决办法 1.配置错误 因为nginx找不到php-fpm了,所以报错,一般是fastcgi_pass后面的路径配置错误了,后面可以是socket或者是ip:port 2.资源耗尽 lnmp架构在处理php时,nginx直接调取后端的php-fpm服务,如果nginx的请求量偏高,我们又没有给php-fpm配置足够的子进程,那么php-fpm就会资源耗尽,一旦资源耗尽nginx找不到php-fpm就会出现502错误, 解决方案 去调整php-fpm.conf中的pm.m

  • nginx提示:500 Internal Server Error错误的解决方法

    现在越来越多的站点开始用 Nginx ,("engine x") 是一个高性能的 HTTP 和反向代理服务器,也是一个 IMAP/POP3/SMTP 代理服务器. Nginx 是由 Igor Sysoev 为俄罗斯访问量第二的 Rambler.ru 站点开发的,它已经在该站点运行超过两年半了.Igor 将源代码以类BSD许可证的形式发布. 在高并发连接的情况下,Nginx是Apache服务器不错的替代品.Nginx同时也可以作为7层负载均衡服务器来使用.根据测试结果,Nginx 0.6

  • nginx提示:500 Internal Server Error错误解决办法

     nginx提示:500 Internal Server Error错误解决办法 前言: 今天发现网站无法上传资源和发布文章,通过浏览器查看服务器的返回信息 500 Internal Server Error 刚开始我以为是应用服务器或者是程序异常,检查了下各自的日志,连请求记录都无,程序代码里也没抛出过该类型的异常,所以可以从服务器或者Nginx中找原因了. 1.是否磁盘空间不足? 使用 df -k 查看硬盘空间是否满了.清理硬盘空间就可以解决500错误.nginx如果开启了access lo

  • 详解linux中 Nginx 常见502错误问题解决办法

    常见的Nginx 502 Bad Gateway解决办法如下: Nginx 502错误情况1: 网站的访问量大,而php-cgi的进程数偏少. 针对这种情况的502错误,只需增加php-cgi的进程数.具体就是修改/usr/local/php/etc/php-fpm.conf 文件,将其中的max_children值适当增加.这个数据要依据你的VPS或独立服务器的配置进行设置.一般一个php-cgi进程占20M内存,你可以自己计算下,适量增多. /usr/local/php/sbin/php-f

  • Nginx 遇到502 Bad Gateway 自动重启的脚本代码

    原理就是用curl获取HTTP头,发现502状态码就执行重启php-fpm的命令. 复制代码 代码如下: #!/usr/bin/php <!--$url = 'http://www.jb51.net'; $cmd = '/usr/local/php/sbin/php-fpm restart'; for($i = 0; $i < 5; $i ++){ $exec = "curl connect-timeout 3 -I $url 2>/dev/null"; $res =

随机推荐