While installing Oracle RAC 12c on Solaris, I ran into a few issues. In this article, I will walk through the approach I followed to diagnose the incident.
The setup is a bit unusual: I use two VirtualBox VMs running Solaris 11.3 as the Oracle servers, plus the VirtualBox ZFS Appliance image to provide shared storage to these two VMs over the iSCSI protocol. The analysis method is far more interesting than the setup itself, so do not focus on it (for now).
The incident occurs as follows: the first server of the Oracle RAC cluster panics while the root.sh script is running. As a reminder, this script performs the configuration of the Oracle RAC cluster.
The analysis starts below. Note that I am using Solaris CAT (the Solaris Crash Analysis Tool).
CAT(/var/crash/2/vmcore.2/11X)> panic
panic on CPU 0
panic string: forced crash dump initiated at user request
==== panic user (LWP_SYS) thread: 0xffffa10016511500 PID: 11187 on CPU: 0 ====
cmd: /u01/app/12.1.0/grid/bin/cssdagent
fmri: svc:/network/ssh:default
t_procp: 0xffffa1001837e300 p_as: 0xffffa10019170900 size: 126853120 RSS: 28393472
a_hat: 0xffffa10018c3a048
cpuset: 0
p_zone: 0xfffffffffc285c70 (global)
t_stk: 0xfffffffc8340cf00 sp: 0xfffffffc8340cd30 t_stkbase: 0xfffffffc83408000
t_pri: 110 (RT) pctcpu: 0.001949
t_transience: 0 t_wkld_flags: 0
t_lwp: 0xffffa10019560100 t_tid: 15
lwp_regs: 0xfffffffc8340cf00
lwp_ap: 0xfffffffc8340cf10
t_mstate: LMS_SYSTEM ms_prev: LMS_USER
ms_state_start: 0.000009355 seconds earlier
ms_start: 1 minutes 10.879644595 seconds earlier
t_cpupart: 0xfffffffffc0e25d0(0) last CPU: 0
idle: 800170590 hrticks (0.800170590s)
start: Mon Nov 7 12:16:10 2016
age: 71 seconds (1 minutes 11 seconds)
t_state: TS_ONPROC
t_flag: 0x1800 (T_PANIC|T_LWPREUSE)
t_proc_flag: 0x100 (TP_MSACCT)
t_schedflag: 0x8003 (TS_LOAD|TS_DONT_SWAP) (missed bits 0x8000)
t_acflag: 3 (TA_NO_PROCESS_LOCK|TA_BATCH_TICKS)
p_flag: 0x4a004000 (SEXECED|SMSACCT|SAUTOLPG|SMSFORK)
pc: unix:vpanic_common+0x13a: addq $0xf0,%rsp
unix:vpanic_common+0x13a()
unix:0xfffffffffb8a4810()
int genunix:kadmin+0x2b6((int)5, (int)1, (void *)0, (cred_t *)0xffffa10010f69558)
int genunix:uadmin+0x179((int)5, (int)1, (uintptr_t)0)
unix:_syscall_invoke+0x30()
-- switch to user thread's user stack --
The panic was triggered by the "cssdagent" process via the Unix "uadmin" command.
According to the documentation, the "cssdagent" daemon is in charge of checking the health of the cluster (see the cssd daemons) and specifically provides the I/O fencing service. If the "ocssd.bin" daemon (or its agents cssdmonitor and cssdagent) fails, the server panics.
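In the panic stack above, the frames genunix:kadmin+0x2b6((int)5, (int)1, ...) and genunix:uadmin+0x179((int)5, (int)1, ...) correspond to uadmin(A_DUMP, AD_BOOT, ...): in <sys/uadmin.h>, A_DUMP is 5 and AD_BOOT is 1, which also matches the panic string "forced crash dump initiated at user request". As a minimal illustration (a sketch of the mechanism, not Oracle's code), a privileged watchdog process can force such a crash dump like this:

#include <sys/uadmin.h>
#include <stdio.h>

int
main(void)
{
    /* Requires appropriate privileges (typically root). Panics the node
     * with "forced crash dump initiated at user request", writes a crash
     * dump, then reboots (AD_BOOT). */
    if (uadmin(A_DUMP, AD_BOOT, 0) == -1)
        perror("uadmin");
    return (1); /* only reached if the call failed */
}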
It is interesting to look at the second-to-last thread that ran on the server before this panic.
CAT(/var/crash/2/vmcore.2/11X)> sdump 0xffffa10016511500 kthread_t t_cpu
struct cpu *t_cpu = 0xfffffffffc0849e0 (unix(data):cpus+0)

CAT(/var/crash/2/vmcore.2/11X)> sdump 0xfffffffffc0849e0 cpu_t cpu_intr_thread
kthread_t *cpu_intr_thread = 0xfffffffc800b9b40

CAT(/var/crash/2/vmcore.2/11X)> thread 0xfffffffc800b9b40
==== free thread: 0xfffffffc800b9b40 PID: 0 affinity CPU: 0 PIL: 2 ====
cmd: sched(unix:thread_create_intr)
kname: thread_create_intr
t_procp: 0xfffffffffc063270 (proc_sched) p_as: 0xfffffffffc064750 (kas)
p_zone: 0xfffffffffc285c70 (global)
t_stk: 0xfffffffc800b9b30 sp: 0xfffffffc800b9a60 t_stkbase: 0xfffffffc800b3000
t_pri: 161 (SYS) pctcpu: 15.889231
t_transience: 0 t_wkld_flags: 0
t_lwp: 0xffffa1001863a0c0
lwp_regs: 0xfffffffc81729f00
lwp_ap: 0xfffffffc81729f10
t_cpupart: 0xfffffffffc0e25d0(0) last CPU: 0
idle: 30938061725 hrticks (30.938061725s)
interrupted (pinned) thread: 0xffffa10010eaf280
t_state: TS_FREE
t_flag: 0x10009 (T_INTR_THREAD|T_TALLOCSTK|T_PUSHPAGE)
t_proc_flag: 0 (none set)
t_schedflag: 0x10003 (TS_LOAD|TS_DONT_SWAP) (missed bits 0x10000)
t_acflag: 4 (missed bits 0x4)
p_flag: 1 (SSYS)
pc: unix:_resume_from_idle+0x130 resume_return: addq $0x8,%rsp
unix:_resume_from_idle+0x130 resume_return()
genunix(data):intr_queue+0()
-- end of free thread's stack --
Walking from the panic thread to its CPU (t_cpu), then to that CPU's interrupt thread (cpu_intr_thread), the "interrupted (pinned)" field points to thread 0xffffa10010eaf280, i.e. the thread that was running when the interrupt came in.
CAT(/var/crash/2/vmcore.2/11X)> thread 0xffffa10010eaf280
==== user (LWP_SYS) thread: 0xffffa10010eaf280 PID: 11189 ====
cmd: /u01/app/12.1.0/grid/bin/ocssd.bin
fmri: svc:/network/ssh:default
t_procp: 0xffffa1001837d2a0 p_as: 0xffffa10019170040 size: 211288064 RSS: 63078400
a_hat: 0xffffa10019d394a0
p_zone: 0xfffffffffc285c70 (global)
t_stk: 0xfffffffc81729f00 sp: 0xfffffffc81729d40 t_stkbase: 0xfffffffc81725000
t_pri: 110 (RT) pctcpu: 75.408707
t_transience: 0 t_wkld_flags: 0
t_lwp: 0xffffa1001863a0c0 t_tid: 13
lwp_regs: 0xfffffffc81729f00
lwp_ap: 0xfffffffc81729f10
t_mstate: LMS_TRAP ms_prev: LMS_USER
ms_state_start: 0.000405357 seconds earlier
ms_start: 1 minutes 11.314553986 seconds earlier
t_cpupart: 0xfffffffffc0e25d0(0) last CPU: 0
idle: 405315 hrticks (0.000405315s)
start: Mon Nov 7 12:16:10 2016
age: 71 seconds (1 minutes 11 seconds)
t_waitrq: 0.000405315 seconds
t_state: TS_RUN
t_flag: 0x1000 (T_LWPREUSE)
t_proc_flag: 0x100 (TP_MSACCT)
t_schedflag: 0x8013 (TS_LOAD|TS_DONT_SWAP|TS_SIGNALLED) (missed bits 0x8000)
t_acflag: 3 (TA_NO_PROCESS_LOCK|TA_BATCH_TICKS)
p_flag: 0x5a004000 (SEXECED|SMSACCT|SAUTOLPG|SNOCD|SMSFORK)
pc: unix:_resume_from_idle+0x130 resume_return: addq $0x8,%rsp
unix:_resume_from_idle+0x130 resume_return()
unix:swtch - frame recycled
void unix:preempt+0xcb()
boolean_t unix:kpreempt+0x8e((int)0xffffffff)
int genunix:new_mstate+0x178((kthread_t *)0xffffa10010eaf280, (int)2)
void unix:trap+0x1ae((struct regs *)0xfffffffc81729f00, (caddr_t)0, (processorid_t)0)
-- trap data rp: 0xfffffffc81729ed0
trapno 0 (Divide error) %rfl 0 ()
savfp 0xffff80ffbdb9a3a0 savpc 0xffff80ffbdb9a388 (invalid text addr)
%rbp 2 %rsp 0 %rip 0x4b (invalid text addr)
0%rdi 0xffff80ffbdb9a380 1%rsi 0xfffffffc81729f00 2%rdx 0xfffffffc81729f00
3%rcx 0xfffffffffb800a47 4%r8 0 5%r9 0xffff80ffadd4ea20
%rax 0xffff80ffbdb9a820 %rbx 0x1a17090 %r10 0xda94f8
%r11 0xffff80ffae1218c0 %r12 0x8c9 %r13 0
%r14 0xffff80ffbdb9a380 %r15 0xffff80ffbdb9a8c0
%cs 0x4b (UCS_SEL) %ds 0xffff80ffbdb9a388 (GDT unknown,KPL)
%es 0xffff80ffbdb9a3a0 (GDT unknown,KPL) %ss 0x400 (GDT unknown,KPL)
%fs 0 (KFS_SEL) %gs 0xdbb2b0 (GDT unknown,KPL)
fsbase 0xff80ffbdb9a3a0ff gsbase 0xff80ffbdb9a3a0ff
-- error reading next frame @ 0xffff80ffbdb9a3a0 --
What is interesting is that, just before the server panicked, the last process running was the "ocssd.bin" daemon.
CAT(/var/crash/2/vmcore.2/11X)> proc 0xffffa1001837d2a0
addr               PID    PPID   RUID/UID   size       RSS      swresv   lwpcnt command
================== ====== ====== ========== ========== ======== ======== ====== =====
0xffffa1001837d2a0 11189  1      54322      211288064  63078400 59850752 28     /u01/app/12.1.0/grid/bin/ocssd.bin
thread: 0xffffa1001111fb00 state: slp wchan: 0xffffa1001111fcee sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa1000683d940 state: run wchan: NULL sobj: none
thread: 0xffffa10011015680 state: run wchan: NULL sobj: none
thread: 0xffffa100164dde00 state: slp wchan: 0xffffa1001827770c sobj: condition var (from portfs:port_getn+0x240)
thread: 0xffffa100164f7dc0 state: slp wchan: 0xffffa100182746ec sobj: condition var (from portfs:port_getn+0x240)
thread: 0xffffa100121e7480 state: slp wchan: 0xffffa100121e766e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100121e3980 state: slp wchan: 0xffffa100189656a4 sobj: condition var (from portfs:port_getn+0x240)
thread: 0xffffa10010eb4480 state: slp wchan: 0xffffa10018960dac sobj: condition var (from portfs:port_getn+0x240)
thread: 0xffffa100121e54c0 state: slp wchan: 0xffffa100121e56ae sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100164f5500 state: slp wchan: 0xffffa100164f56ee sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100164f8240 state: slp wchan: 0xffffa100183886bc sobj: condition var (from portfs:port_getn+0x240)
thread: 0xffffa100164dd080 state: slp wchan: 0xffffa100164dd26e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa10010eaf280 state: run wchan: NULL sobj: none
thread: 0xffffa1001055cb40 state: slp wchan: 0xffffa1001055cd2e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100121dab00 state: slp wchan: 0xffffa100121dacee sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100121e2240 state: slp wchan: 0xffffa100121e242e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100121e7000 state: slp wchan: 0xffffa1001917004a sobj: condition var (from kaio:aio_cleanup_thread+0x1c2)
thread: 0xffffa1001055bdc0 state: run wchan: NULL sobj: none
thread: 0xffffa1001621a200 state: run wchan: NULL sobj: none
thread: 0xffffa100164e0240 state: slp wchan: 0xffffa10017d0fe58 sobj: semaphore (from genunix:biowait+0x7a)
thread: 0xffffa100164ed000 state: run wchan: NULL sobj: none
thread: 0xffffa100164dd500 state: slp wchan: 0xffffa10017d0ff50 sobj: semaphore (from genunix:biowait+0x7a)
thread: 0xffffa100164f0280 state: run wchan: NULL sobj: none
thread: 0xffffa100164eab80 state: slp wchan: 0xffffa100164ead6e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100164ec240 state: slp wchan: 0xffffa100164ec42e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100164e6b40 state: slp wchan: 0xffffa100164e6d2e sobj: condition var (from genunix:lwp_park+0x157)
thread: 0xffffa100164ecb40 state: slp wchan: 0xffffa10018947af4 sobj: condition var (from portfs:port_getn+0x240)
thread: 0xffffa100164f5080 state: run wchan: NULL sobj: none
An important piece of information shows up when listing the threads of the "ocssd.bin" process: two threads are waiting on I/O (biowait). The analysis of these I/O waits is as follows:
CAT(/var/crash/2/vmcore.2/11X)> tlist -b 0xffffa100164e0240 biowait
thread: 0xffffa100164e0240 state: slp PID: 11189 cmd: /u01/app/12.1.0/grid/bin/ocssd.bin
idle: 28810211744 hrticks (28.810211744s)
buf @ 0xffffa10017d0fd98
b_edev: 212,320(sd5,0) b_blkno: 0x80011
b_flags: 0x2200121 (BUSY|PHYS|WRITE|SHADOW|STARTED|PAGE_NOWAIT)
b_addr: 0x1fb0000 b_bcount: 512 b_bufsize: 0
b_dip: 0xffffa1000761f6c8 sd#5 //P/XZ%@g600144f0908f3f600000581cac750001
b_shadow: 0xffffa10019e47f80 (struct page **)
thread: 0xffffa100164dd500 state: slp PID: 11189 cmd: /u01/app/12.1.0/grid/bin/ocssd.bin
idle: 28810195310 hrticks (28.810195310s)
buf @ 0xffffa10017d0fe90
b_edev: 212,320(sd5,0) b_blkno: 0x80004
b_flags: 0x2200061 (BUSY|PHYS|READ|SHADOW|STARTED|PAGE_NOWAIT)
b_addr: 0x1fafa00 b_bcount: 512 b_bufsize: 0
b_dip: 0xffffa1000761f6c8 sd#5 //P/XZ%@g600144f0908f3f600000581cac750001
b_shadow: 0xffffa10018c4ad70 (struct page **)
2 matching threads found in biowait()
threads in biowait() by device:
count device (thread: max idle time)
2 212,320(sd5,0) (0xffffa100164e0240: 1 hours 28 minutes 58.329491622 seconds)
The analysis of the threads waiting on I/O reveals the following:
- The threads have been blocked for more than 28 seconds, on a semaphore (see the biowait sketch after this list)
- Both threads are blocked on the same device, sd#5
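For context, here is a minimal sketch of what "sobj: semaphore (from genunix:biowait)" means, based on the OpenSolaris/illumos implementation of bio (so an assumption for Solaris 11.3, not a verbatim excerpt): every buf(9S) carries an I/O semaphore (b_io) that the issuing thread waits on in biowait(9F) until the driver's completion path calls biodone(9F).

#include <sys/buf.h>
#include <sys/ksynch.h>

/* Simplified view of biowait(9F): the caller sleeps on the buf's I/O
 * semaphore until biodone(9F) posts it from the driver's completion
 * path. If sd#5 never completes the command, the thread simply stays
 * here, which is exactly what the two ocssd.bin threads above have
 * been doing for more than 28 seconds. */
static void
wait_for_buf(struct buf *bp)
{
    sema_p(&bp->b_io);  /* reported as "sobj: semaphore" in the dump */
}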
Here is the information about device sd#5.
sd5 @ 0xffffa10003566080(sd_lun)
//P/XZ%@g600144f0908f3f600000581cac750001 0xffffa1000761f6c8
name: scsiclass,00@g600144f0908f3f600000581cac750001
instance #: 5
scsi device @ 0xffffa10010257298
hba_tran: 0xffffa10000088640 target: 29336 lun: 37 sub_lun: 16
scsi inquiry data @ 0xffffa10010257658 :
dtype: 0 qualifier: 0 removable: 0
ANSI version: 5 ECMA version: 0 ISO version: 0
length: 69 response format: 2
TERM I/O Proc msg: 0 async event notification: 0
scsi support: soft rst: 0 cmdque: 1 linked cmds: 0 sync xfer: 0
16 bit xfers: 0 32 bit xfers: 0 relative addr: 0
vendor id: SUN product id: Sun Storage 7000 revision: 1.0 serial #:
sd_lun @ 0xffffa10003566080, un_sd: 0xffffa10010257298
throttle: 256, saved_throttle: 256, busy_throttle: 0
un_rqs_bp: 0x0, un_rqs_pktp: 0x0
un_sense_isbusy: 0
Last Request Sense Packet (using un_rqs_pktp):
un_ncmds_in_driver: 2, un_ncmds_in_transport: 2
open counts: layered (none) regular(char): 1
Geometry is NOT valid
Packet Flags for Tagged Queueing: FLAG_STAG - Run command as SIMPLE QUEUE tagged cmd
Last pkt reason: CMD_CMPLT - no transport errors- normal completion
State: SD_STATE_NORMAL Last state: SD_STATE_NORMAL
SCSI State Change Translation: No state change
Reservation status: SD_RELEASE
[...]
A quick recap of the information gathered so far:
- The panic was triggered by the cssdagent daemon (via the uadmin command)
- The last process active before the panic was the ocssd.bin daemon
- The ocssd.bin daemon had two threads blocked (for more than 28 seconds) on device sd#5
- The analysis of device sd#5 shows two commands still outstanding (un_ncmds_in_driver: 2, un_ncmds_in_transport: 2)
Oracle RAC works as follows: if one of the cluster nodes can no longer access the voting disk for more than 27 seconds (the misscount value, 30 seconds by default, minus 3 seconds reserved for the reboot), the cluster mechanism evicts that node in order to preserve the integrity of the system as a whole.
In our case, the ocssd.bin daemon had been waiting for access to the voting disk (sd#5) for more than 28 seconds, which explains the server panic.
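To tie everything together, here is a deliberately simplified and entirely hypothetical sketch of the kind of check cssdagent performs (the real implementation is Oracle's and is not public; the names and structure below are mine):

#include <sys/uadmin.h>
#include <time.h>

#define MISSCOUNT   30  /* CSS misscount default, in seconds */
#define REBOOTTIME   3  /* time reserved for the reboot, in seconds */

/* Hypothetical watchdog check: if the last successful voting-disk I/O
 * is older than misscount minus reboottime (30 - 3 = 27 s), the node
 * evicts itself by forcing a crash dump, as seen in the panic stack. */
static void
check_voting_disk(time_t last_voting_io_ok)
{
    if (time(NULL) - last_voting_io_ok > MISSCOUNT - REBOOTTIME)
        (void) uadmin(A_DUMP, AD_BOOT, 0);
}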
I will come back in a future article to describe the rest of the analysis.