All posts by hanafi

Failed to start package crsp_s1, rollback steps

Symptoms

Node# tail /var/adm/cmcluster/log/crsp_s1.log
Sep 25 01:16:50 – Node “” *** /opt/cmcluster/SGeRAC/toolkit/crsp/toolkit_oc.sh called with start argument. ***
Sep 25 01:16:50 – Node “” : Starting Oracle Clusterware at Tue Sep 25 01:16:50 UTC 2018
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
Sep 25 01:16:50 – Node “” ERROR: Function oc_start_cmd: Failed to start Oracle Clusterware
Sep 25 01:16:50 [email protected] master_control_script.sh[5486]: ##### Failed to start package crsp_s1, rollback steps #####
Sep 25 01:16:50 – Node “” *** /opt/cmcluster/SGeRAC/toolkit/crsp/toolkit_oc.sh called with stop argument. ***
Sep 25 01:16:50 – Node “” : Stopping Oracle Clusterware at Tue Sep 25 01:16:50 UTC 2018
Sep 25 01:16:50 – Node “” Oracle Clusterware is already stopped
Sep 25 01:16:50 [email protected] master_control_script.sh[5486]: ###### Failed to start package for crsp_s1 ######

Node:home/ # cmviewcl

CLUSTER STATUS
<clustername> up

SITE_NAME Node_pri

NODE STATUS STATE
Node1 up running
Node2 up running

PACKAGE STATUS STATE AUTO_RUN NODE
prismp_sc up running enabled Node2

NODE STATUS STATE
Node3 up running

SITE_NAME Node_sec

NODE STATUS STATE
Node4 up running
Node5 up running
Node6 up running

MULTI_NODE_PACKAGES

PACKAGE STATUS STATE AUTO_RUN SYSTEM
SG-CFS-pkg up running enabled yes
SG-CFS-crsp_s1 up running enabled no
SG-CFS-crsp_s2 up running enabled no
crsp_s1 up (2/3) running enabled no
crsp_s2 up running enabled no
SG-CFS-prismp_s1 up running enabled no
SG-CFS-prismp_s2 down halted enabled no
prismp_s1 up (2/3) running enabled no
prismp_s2 down halted enabled no
Node:home/ #
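On a cluster with many packages it is easy to miss a line like "up (2/3)". As a rough sketch only, the MULTI_NODE_PACKAGES part of the cmviewcl output above could be scanned like this. The function name and the regular expression are my own assumptions, written against the single-space layout shown in this post; the real column layout may differ on your release.

```python
import re

# One row of the MULTI_NODE_PACKAGES table as it appears in this post,
# e.g. "crsp_s1 up (2/3) running enabled no". The "(2/3)" part is optional.
ROW = re.compile(
    r"^(?P<pkg>\S+)\s+(?P<status>up|down)"
    r"(?:\s+\((?P<running>\d+)/(?P<total>\d+)\))?"
    r"\s+(?P<state>\S+)\s+(?P<auto_run>\S+)\s+(?P<system>\S+)$"
)


def degraded_packages(cmviewcl_text):
    """Return (package, reason) pairs for multi-node packages that are
    down or running on only some of the nodes (hypothetical helper)."""
    in_multi_node = False
    problems = []
    for raw in cmviewcl_text.splitlines():
        line = raw.strip()
        if line == "MULTI_NODE_PACKAGES":
            in_multi_node = True
            continue
        if not in_multi_node:
            continue
        m = ROW.match(line)
        if not m:
            continue  # blank lines and the header row do not match
        if m.group("status") == "down":
            problems.append((m.group("pkg"), "down"))
        elif m.group("running") and m.group("running") != m.group("total"):
            reason = "up on %s of %s nodes" % (m.group("running"), m.group("total"))
            problems.append((m.group("pkg"), reason))
    return problems


# A few rows copied from the output above.
SAMPLE = """MULTI_NODE_PACKAGES

PACKAGE STATUS STATE AUTO_RUN SYSTEM
SG-CFS-pkg up running enabled yes
crsp_s1 up (2/3) running enabled no
crsp_s2 up running enabled no
prismp_s2 down halted enabled no
"""

for pkg, reason in degraded_packages(SAMPLE):
    print(pkg, "->", reason)
```

Here crsp_s1 is flagged because it is up on only two of its three nodes, which matches the failed start on this node.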

Causes

It looks like a network connectivity issue: in the ocssd.log below, Node1 still sees the disk heartbeat (HB) of Node2 and Node3 but no network heartbeat:

Node1:/ $ tail /u01/app/grid/11203/log/Node1/cssd/ocssd.log
2018-09-21 10:47:08.187: [ CSSD][27]clssnmvDHBValidateNcopy: node 2, Node2, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 225000299, LATS 275224262, lastSeqNo 225000296, uniqueness 1519012441, timestamp 1537526827/1334746757
2018-09-21 10:47:08.187: [ CSSD][27]clssnmvDHBValidateNcopy: node 3, Node3, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 224639603, LATS 275224262, lastSeqNo 224639600, uniqueness 1519018579, timestamp 1537526827/1328775359
2018-09-21 10:47:08.190: [ CSSD][30]clssnmvDHBValidateNcopy: node 3, Node3, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 224639604, LATS 275224264, lastSeqNo 224639601, uniqueness 1519018579, timestamp 1537526827/1328775836
2018-09-21 10:47:08.197: [ CSSD][36]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2018-09-21 10:47:08.200: [ CSSD][33]clssnmvDHBValidateNcopy: node 3, Node3, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 224639605, LATS 275224274, lastSeqNo 224639602, uniqueness 1519018579, timestamp 1537526828/1328775956
2018-09-21 10:47:09.196: [ CSSD][30]clssnmvDHBValidateNcopy: node 2, Node2, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 225000300, LATS 275225270, lastSeqNo 225000021, uniqueness 1519012441, timestamp 1537526828/1334747680
2018-09-21 10:47:09.196: [ CSSD][30]clssnmvDHBValidateNcopy: node 3, Node3, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 224639607, LATS 275225270, lastSeqNo 224639604, uniqueness 1519018579, timestamp 1537526828/1328776846
2018-09-21 10:47:09.197: [ CSSD][27]clssnmvDHBValidateNcopy: node 2, Node2, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 225000302, LATS 275225272, lastSeqNo 225000299, uniqueness 1519012441, timestamp 1537526828/1334747769
2018-09-21 10:47:09.207: [ CSSD][36]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2018-09-21 10:47:09.210: [ CSSD][33]clssnmvDHBValidateNcopy: node 3, Node3, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 224639608, LATS 275225284, lastSeqNo 224639605, uniqueness 1519018579, timestamp 1537526829/1328776966
Node1:/ $
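The log above repeats the same message many times, so it helps to summarise which nodes are affected. A minimal sketch, assuming the message text stays exactly as shown in this ocssd.log (the function name is made up for illustration):

```python
import re
from collections import Counter

# Matches the CSSD message shown above and captures the node name.
NO_NET_HB = re.compile(r"node (\d+), (\S+), has a disk HB, but no network HB")


def nodes_missing_network_hb(log_text):
    """Count 'disk HB, but no network HB' messages per node name."""
    counts = Counter()
    for match in NO_NET_HB.finditer(log_text):
        counts[match.group(2)] += 1
    return counts


# Three lines copied from the log above.
SAMPLE = """2018-09-21 10:47:08.187: [ CSSD][27]clssnmvDHBValidateNcopy: node 2, Node2, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 225000299, LATS 275224262, lastSeqNo 225000296, uniqueness 1519012441, timestamp 1537526827/1334746757
2018-09-21 10:47:08.187: [ CSSD][27]clssnmvDHBValidateNcopy: node 3, Node3, has a disk HB, but no network HB, DHB has rcfg 414478488, wrtcnt, 224639603, LATS 275224262, lastSeqNo 224639600, uniqueness 1519018579, timestamp 1537526827/1328775359
2018-09-21 10:47:08.197: [ CSSD][36]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
"""

for node, count in sorted(nodes_missing_network_hb(SAMPLE).items()):
    print(node, count)
```

If every other node shows up in the count, the problem is most likely on the local node's interconnect rather than on the remote nodes.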

 

When I tried to ping the CI gateway, there was no reply:

Node1:11203/bin # ping CI-GW
PING CI-GW: 64 byte packets

 

Resolutions

The CI network is currently configured on lan1, so it needs to be moved to another working LAN interface whose link state is UP.

After switching to a working LAN interface, the ping succeeds:

Node1:11203/bin # ping CI-GW
PING CI-GW: 64 byte packets
64 bytes from CI-GW: icmp_seq=0. time=0. ms
64 bytes from CI-GW: icmp_seq=1. time=0. ms


The crsp toolkit can then be started:

Node1:11203/bin # /opt/cmcluster/SGeRAC/toolkit/crsp/toolkit_oc.sh start
Sep 25 02:46:46 – Node “Node1” *** /opt/cmcluster/SGeRAC/toolkit/crsp/toolkit_oc.sh called with start argument. ***
Sep 25 02:46:46 – Node “Node1” : Starting Oracle Clusterware at Tue Sep 25 02:46:46 UTC 2018
Sep 25 02:46:46 – Node “Node1” Oracle Clusterware is already started
Node1:11203/bin #

After that, node switching needs to be enabled for the crsp package:

Node:11203/bin # cmmodpkg -e -v -n Node1 crsp_s1
Enabling node Node1 for switching of package crsp_s1
Successfully enabled package crsp_s1 to run on node Node1
cmmodpkg: Completed successfully on all packages specified
Node1:11203/bin # cmrunpkg crsp_s1
Package crsp_s1 is already running on all active nodes
cmrunpkg: All specified packages are running
Node1:11203/bin #

Verify the running packages with the cmviewcl command:

Node1:11203/bin # cmviewcl

CLUSTER STATUS
<clustername> up

SITE_NAME Site_pri

NODE STATUS STATE
Node1 up running
Node2 up running

PACKAGE STATUS STATE AUTO_RUN NODE
prismp_sc up running enabled Node3

NODE STATUS STATE
Node3 up running

SITE_NAME Site_sec

NODE STATUS STATE
Node4 up running
Node5 up running
Node6 up running

MULTI_NODE_PACKAGES

PACKAGE STATUS STATE AUTO_RUN SYSTEM
SG-CFS-pkg up running enabled yes
SG-CFS-crsp_s1 up running enabled no
SG-CFS-crsp_s2 up running enabled no
crsp_s1 up running enabled no
crsp_s2 up running enabled no

#################################################

Unable to run package on node

Symptoms

When you try to bring up the package in Serviceguard, it fails to start with the errors below:

[[email protected] ~]# cmrunpkg <packagename>
Running package <packagename> on node node2
The package script for <packagename> failed with no restart. <packagename> should not be restarted
Unable to run package <packagename> on node node2
Check the syslog and pkg log files for more detailed information
cmrunpkg: Unable to start some package or package instances.

The same happens when we try to bring up the package on the other node.

Cause

The package log in /usr/local/cmcluster/run/log/<packagename>.log shows the errors below:

Sep 20 00:09:03 – Node “node2”: Exporting filesystem on /opt/apps
exportfs: internal: no supported addresses in nfs_client
exportfs: <ip_address>:/opt/apps: No such file or directory

exportfs: internal: no supported addresses in nfs_client
exportfs: <ip_address>:/opt/apps: No such file or directory

exportfs: internal: no supported addresses in nfs_client
exportfs: <ip_address>:/opt/apps: No such file or directory

exportfs: internal: no supported addresses in nfs_client
exportfs: <ip_address>:/opt/apps: No such file or directory

exportfs: internal: no supported addresses in nfs_client
exportfs: <ip_address>:/opt/apps: No such file or directory
ERROR: Function export_fs
ERROR: Failed to export -o rw @nfs1:/opt/apps
Sep 20 00:09:04 – Node “node2”: Unexporting filesystem on @nfs1:/opt/apps

## Failed to start package <packagename>, rollback steps #####
Sep 19 23:44:20 [email protected] tkit_module.sh[32107]: Install directory operation mode selected.
WARNING: Stoping rmtab synchronization proces: /usr/local/cmcluster/conf/<packagename>/sync_rmtab.PID does not exist
Sep 19 23:44:20 – Node “node2”: Unexporting filesystem on @nfs1:/opt/apps
exportfs: Could not find ‘@nfs1:/opt/apps’ to unexport.
ERROR: Function un_export_fs
ERROR: Failed to unexport @nfs1:/opt/apps

Sep 20 00:09:05 [email protected] master_control_script.sh[31933]: ###### Failed to start package for <packagename> ######

Check the status of the NFS services:

[[email protected] ]# /etc/init.d/nfs status
rpc.svcgssd is stopped
rpc.mountd is stopped
nfsd is stopped
rpc.rquotad is stopped
[[email protected]]#

The cluster package will not start because the NFS services are stopped; they need to be running before the package can export the filesystem.
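The same check can be scripted before running cmrunpkg. A small sketch, assuming the status output keeps the "<service> is stopped" wording shown above; the function name is only for illustration:

```python
def stopped_services(status_text):
    """Return the names of services reported as stopped in the output of
    '/etc/init.d/nfs status' (format assumed from this post)."""
    suffix = "is stopped"
    stopped = []
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.endswith(suffix):
            stopped.append(line[: -len(suffix)].strip())
    return stopped


# Status output copied from above, before the services were started.
BEFORE = """rpc.svcgssd is stopped
rpc.mountd is stopped
nfsd is stopped
rpc.rquotad is stopped
"""

print(stopped_services(BEFORE))
```

In this case rpc.mountd and nfsd are both in the list, which fits the exportfs failures in the package log.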

 

Resolutions

Start the NFS services:

[[email protected]]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: rpc.mountd: svc_tli_create: could not open connection for udp6
rpc.mountd: svc_tli_create: could not open connection for tcp6
rpc.mountd: svc_tli_create: could not open connection for udp6
rpc.mountd: svc_tli_create: could not open connection for tcp6
rpc.mountd: svc_tli_create: could not open connection for udp6
rpc.mountd: svc_tli_create: could not open connection for tcp6
[ OK ]
Starting NFS daemon: rpc.nfsd: address family inet6 not supported by protocol TCP
[ OK ]
Starting RPC idmapd: [ OK ]

Verify the NFS services:
[[email protected]]# /etc/init.d/nfs status
rpc.svcgssd is stopped
rpc.mountd (pid 17790) is running…
nfsd (pid 17810 17809 17808 17807 17806 17805 17804 17803) is running…
rpc.rquotad (pid 17773) is running…

The package can now be started:
[[email protected]]# cmrunpkg <packagename>
Running package <packagename> on node node2
Successfully started package <packagename> on node node2
cmrunpkg: All specified packages are running
[[email protected]]#

Lastly, verify the status of the packages in the cluster:

[[email protected] ~]# cmviewcl

CLUSTER STATUS
<clustername> up

SITE_NAME Site1_pri

NODE STATUS STATE
node1 up running

SITE_NAME Site2_sec

NODE STATUS STATE
node2 up running

PACKAGE STATUS STATE AUTO_RUN NODE
<packagename> up running disabled node2

##################################################

Unable to Change Directory to the Mount Point as Root – Permission Denied on HP-UX

Hello… I will show you how to solve a “permission denied” error when trying to change to a specific directory. Below is an example, run as root:

# cd /usr/local/sap/tools/
ksh: /usr/local/sap/tools/: permission denied
# ll /usr/local/sap/tools/
/usr/local/sap/tools/ not found
#

When I displayed all the mount points, the only one that could contain the directory I wanted was /usr, but I believe the directory above does not live on /usr and must be coming from the network. On top of that, changing the mode of the directory does not work either, as shown below:

# pwd
/usr/local
# chmod 755 sap/
chmod: can’t change sap/: Permission denied

When I look at the mounted filesystems on a working server, I can see that the mount point is NFS, imported from an NFS server:

tools-x.xx.xxx.net:/usr/local/sap
4145152 2287273 1741912 57% /usr/local/sap

To confirm, I checked the exported mount points on the NFS server:

# showmount -e <nfs_server>
export list for <nfs_server>:
/usr/local/sap (everyone)

From the above result, I know that the mount point is exported to everyone, so there should be no issue mounting it from the client side.

Cause

The issue is that when I try to mount the NFS share on the client side, it fails with “Device busy”:

# mount <nfs_server>:/usr/local/sap /usr/local/sap
nfs mount: /usr/local/sap: Device busy

And I can see that the mount point is already mounted, but by the automounter direct map (/etc/auto_direct) rather than by the NFS server:

# mount | grep -i 'local/sap'
/usr/local/sap on /etc/auto_direct ignore,direct,dev=4000044 on Fri Aug 31 15:49:08 2018
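To make the reasoning explicit: the source of this mount is a map file, not a "server:/path" export. The sketch below splits such a line into its parts. It assumes the "<mount point> on <source> <options> on <date>" layout shown above, and the rule "direct option plus no colon in the source means automounter placeholder" is my own heuristic, not something taken from HP-UX documentation.

```python
def parse_mount_line(line):
    """Split one line of HP-UX style `mount` output into its parts
    (layout assumed from the example in this post)."""
    mount_point, rest = line.split(" on ", 1)
    source, rest = rest.split(" ", 1)
    options, _, mounted_at = rest.partition(" on ")
    return {
        "mount_point": mount_point,
        "source": source,
        "options": options.split(","),
        "mounted_at": mounted_at,
    }


def is_automount_placeholder(info):
    """Heuristic (an assumption): a 'direct' mount whose source is not in
    server:/path form is held by the automounter, not by the NFS server."""
    return "direct" in info["options"] and ":" not in info["source"]


# Line copied from the output above.
LINE = "/usr/local/sap on /etc/auto_direct ignore,direct,dev=4000044 on Fri Aug 31 15:49:08 2018"

info = parse_mount_line(LINE)
print(info["source"], is_automount_placeholder(info))
```

That would explain both symptoms: the directory is "busy" for a manual mount, and it cannot be entered because the real NFS filesystem behind it never got mounted.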

Resolution

This can be resolved by unmounting the mount point first and then mounting it again. You can verify the mount point with the ‘bdf’ command, as in the example below:

# umount /usr/local/sap
# mount tools-<nfs_server>:/usr/local/sap /usr/local/sap
# bdf

tools-ent.<nfs_server>:/usr/local/sap
4145152 2287273 1741912 57% /usr/local/sap

Lastly, you can now change directory to the mount point above and list its files without any problem.

Cloud Computing Awareness and Adoption in SME

A brief survey questionnaire that explains why cloud computing should or should not be adopted helps participants better understand the purpose of the cloud, and can motivate us to share our thoughts.

This is one of the platforms where the bigger picture of this online survey can be explained, so take the time to get briefed on what the data collected here means for your organization. Please go ahead to the survey form at the link below, and thanks for your time:

https://goo.gl/forms/ATwDZI6R7xmxIanS2

Brief of State of Network Security

  1. What is network security?

Network security is the process of taking measures to protect an organisation’s network infrastructure from unauthorized access by creating a secure platform for servers, including mitigating risk to critical devices.

2. How are risk, threat and vulnerability related to each other?

Risk can be expressed as Risk = Threat x Vulnerability. A threat is a potential harm that can exploit a vulnerability and/or intrude into the computer system, while a vulnerability is a weakness that may allow a threat to run in the system.
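As a small worked example of the formula, here is a sketch that scores a few assets. The 1 to 5 rating scale, the asset names and the ratings are all made up for illustration; the formula itself is the one given above.

```python
def risk_score(threat, vulnerability):
    """Risk = Threat x Vulnerability, both rated on the same 1-5 scale
    (the scale is an assumption for this example)."""
    return threat * vulnerability


# Hypothetical assets: (threat rating, vulnerability rating).
assets = {
    "public web server": (5, 3),
    "internal file server": (2, 4),
    "patched workstation": (3, 1),
}

# Highest risk first.
ranked = sorted(assets, key=lambda name: risk_score(*assets[name]), reverse=True)
for name in ranked:
    print(name, risk_score(*assets[name]))
```

Note that a high threat with almost no vulnerability still gives a low score, because the two factors multiply.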

3. List the key characteristics of attacks.

Attacks are growing dramatically: Cyber attack activity has increased exponentially, as have instances of malware.

Threats are more sophisticated: Threats have become more sophisticated and are usually unexpected, because attackers stay one step ahead or take advantage of loopholes.

Known outnumbered by unknowns: Focus on what is known, and always be ready for both known and unknown attacks.

Current approach is ineffective: The current approach is insufficient to address the level and type of attacks that are presently occurring, due to the ever-changing nature of attacks.

Current approach in handling security?

Define the goals of the confidentiality, integrity and availability principles in network security?

Confidentiality: Prevent the unauthorized disclosure of sensitive information.

Integrity: Prevent fabrication of information by unauthorized users, prevent unauthorized fabrication of information by authorized users, and preserve internal and external consistency.

Availability: Provide authorized users with timely and uninterrupted access to the information in the network system.

  1. What are the main reasons for unreported security breaches?

– To protect the company’s reputation

– The company does not know that a breach has been committed.

2. Briefly describe the two main types of attacks.

– Passive attacks: sniffing and information gathering.

– Active attacks: denial of service, breaking into a site.

3. What are the aspects of approaching good cyber security in dealing with attacks?

The aspects of a good cyber security approach are:

– Management buy-in

– Policy development with regular updates and revisions

– Policy reviews

– Knowledgeable network staff

– Training

– Tested process

– Third party assessment

What is Kerberos?

Kerberos is an authentication protocol that involves three parties: the client, the application server, and a Key Distribution Center (KDC), which runs the Authentication Server (AS) and the Ticket Granting Server (TGS). The client goes through both before establishing a connection to the application.

The client first connects to the AS to obtain a TGS session key and a ticket. With these, the client asks the TGS for an Application Session Key (ASK) and a service ticket encrypted with the application server’s secret key.

The client then sends the service ticket, which carries the ASK and is protected by the server’s secret key, to the application server to initiate a connection between the client and the application server.
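The three exchanges can be walked through with a toy model. This is only a sketch to show who can open what: `seal` and `unseal` stand in for encryption, all key values and function names are made up, and a real Kerberos exchange carries more than this (timestamps, authenticators on every request, lifetimes).

```python
def seal(key, payload):
    """Stand-in for encryption: only someone holding `key` can open it."""
    return {"locked_with": key, "payload": payload}


def unseal(key, box):
    if box["locked_with"] != key:
        raise PermissionError("wrong key")
    return box["payload"]


# Long-term secrets known to the KDC (hypothetical values).
CLIENT_KEY = "client-secret"
TGS_KEY = "tgs-secret"
SERVER_KEY = "app-server-secret"


def as_exchange(client):
    """AS: issue a TGS session key (for the client) and a ticket (for the TGS)."""
    tgs_session_key = "tgs-session-key"
    ticket = seal(TGS_KEY, {"client": client, "session_key": tgs_session_key})
    return seal(CLIENT_KEY, tgs_session_key), ticket


def tgs_exchange(client, ticket):
    """TGS: check the ticket, then issue the ASK and a service ticket."""
    claims = unseal(TGS_KEY, ticket)
    if claims["client"] != client:
        raise PermissionError("ticket belongs to another client")
    ask = "application-session-key"
    service_ticket = seal(SERVER_KEY, {"client": client, "session_key": ask})
    return seal(claims["session_key"], ask), service_ticket


def app_server_accept(service_ticket, authenticator):
    """Application server: open the ticket with its own secret key, then
    check that the client really holds the ASK inside it."""
    claims = unseal(SERVER_KEY, service_ticket)
    return unseal(claims["session_key"], authenticator) == claims["client"]


def login(client):
    sealed_key, ticket = as_exchange(client)
    tgs_session_key = unseal(CLIENT_KEY, sealed_key)
    sealed_ask, service_ticket = tgs_exchange(client, ticket)
    ask = unseal(tgs_session_key, sealed_ask)
    return app_server_accept(service_ticket, seal(ask, client))


print(login("alice"))
```

The point of the design is that the client never sees the application server's secret key: it only forwards a service ticket it cannot open itself.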

 

 

Setup a SVM mirror in Solaris 10

Part  Tag         Flag  Cylinders    Size      Blocks
0     root        wm    70 - 1143    8.23GB    (1074/0/0) 17253810
1     swap        wu    3 - 69       525.56MB  (67/0/0)   1076355
2     backup      wm    0 - 1170     8.97GB    (1171/0/0) 18812115
3     unassigned  wu    0            0         (0/0/0)    0
4     unassigned  wu    0            0         (0/0/0)    0
5     unassigned  wu    0            0         (0/0/0)    0
6     unassigned  wu    0            0         (0/0/0)    0
7     home        wm    1144 - 1170  211.79MB  (27/0/0)   433755
8     boot        wu    0 - 0        7.84MB    (1/0/0)    16065
9     alternates  wu    1 - 2        15.69MB   (2/0/0)    32130

Partition 0 is /
Partition 1 is swap
Partition 8 is /boot
Partition 9 is where the metadevice state database replicas are stored

# metadb -a -f -c3 /dev/dsk/c0d0s9

# metainit -f d12 1 1 c0d0s0

# metainit -f d22 1 1 c0d0s1

# metainit -f d32 1 1 c0d0s8

# metastat -p

# metainit d10 -m d12

# metaroot d10

# metainit d20 -m d22

# metainit d30 -m d32

# shutdown -y -g0 -i6

After the reboot, create the metadevices for the other side of the mirror and attach them:

metainit -f d11 1 1 c0d1s0
metainit -f d21 1 1 c0d1s1
metainit -f d31 1 1 c0d1s8

metattach d10 d11
metattach d20 d21
metattach d30 d31

metadb -a -f -c3 /dev/dsk/c0d1s9
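The naming pattern in the commands above is easy to get wrong by hand: mirror d10 gets submirror d12 on the first disk and d11 on the second, d20 gets d22 and d21, and so on. As a sketch only, the metainit and metattach lines could be generated like this. The function is hypothetical and deliberately leaves out metadb, metaroot and the reboot, which still have to be run as shown above.

```python
def mirror_commands(slices, first_disk="c0d0", second_disk="c0d1"):
    """Build the metainit/metattach lines for a set of mirrors.

    `slices` maps a mirror prefix (1, 2, 3 ...) to a slice name ("s0", ...).
    Naming follows this post: mirror dN0, first-disk submirror dN2,
    second-disk submirror dN1.
    """
    first_side, mirrors, second_side, attaches = [], [], [], []
    for n, slice_name in sorted(slices.items()):
        mirror, first, second = "d%d0" % n, "d%d2" % n, "d%d1" % n
        first_side.append("metainit -f %s 1 1 %s%s" % (first, first_disk, slice_name))
        mirrors.append("metainit %s -m %s" % (mirror, first))
        second_side.append("metainit -f %s 1 1 %s%s" % (second, second_disk, slice_name))
        attaches.append("metattach %s %s" % (mirror, second))
    # Everything in the first list runs before the reboot, the rest after it.
    return first_side + mirrors, second_side + attaches


before_reboot, after_reboot = mirror_commands({1: "s0", 2: "s1", 3: "s8"})
print("\n".join(before_reboot))
print("\n".join(after_reboot))
```

Printing the commands first and reviewing them is safer than typing each metadevice name from memory.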

Solaris Volume Manager (SVM) x86: How to Replace a Failed SCSI Disk Mirrored with SVM

Verify the failed disk (in this example, c1t0d0 is the failed disk):

# metastat -c

# format

# tail /var/adm/messages

# metastat -c (We can see that the disk is no longer an active member of the mirror.)

 

Remove the failed disk from the existing mirror:

# metadetach <mirror> <submirror>

# iostat -iEn c1t0d0

# cfgadm -al

# cfgadm -c unconfigure c1::dsk/c1t0d0

You may need to delete the state database replicas on the disk with ‘metadb -d c1t0d0s7’ before ‘cfgadm -c unconfigure …’ can complete.

The unconfigure command removes the block and character (raw) device nodes that the symbolic links in /dev/[r]dsk point to.

Physically replace the disk. Configure the new disk back into Solaris.

# cfgadm -c configure c1::dsk/c1t0d0

# ls -lL /dev/dsk/c1t0d0s* <— check the device nodes
# ls -lL /dev/rdsk/c1t0d0s*

# format

# iostat -iEn c1t0d0

If it is a boot disk, run:
# fdisk -b /usr/lib/fs/ufs/mboot /dev/rdsk/c1t0d0p0

If not, run:
# fdisk /dev/rdsk/c1t0d0p0

Then copy the partition table from the surviving disk (c1t1d0 in this example), install GRUB on the new disk, and recreate the state database replicas:
# prtvtoc /dev/rdsk/c1t1d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2
# /sbin/installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0
# metadb
# metadb -d /dev/dsk/c1t0d0s7 <—-remove old metadb replicas
# metadb -a -c3 /dev/dsk/c1t0d0s7 <—re-add new metadb replicas
# metadb
# metadevadm -u c1t0d0

# metainit -f d11 1 1 c1t0d0s0
# metainit -f d21 1 1 c1t0d0s1
# metainit -f d31 1 1 c1t0d0s3

# metattach d10 d11
# metattach d20 d21
# metattach d30 d31

# metastat -c     (below is the sample output)

d20         m  525MB d22 d21 (resync-19%)
    d22     s  525MB c0d0s1
    d21     s  525MB c0d1s1
d30         m  211MB d32 d31 (resync-33%)
    d32     s  211MB c0d0s7
    d31     s  211MB c0d1s7
d10         m  8.2GB d12 d11 (resync-0%)
    d12     s  8.2GB c0d0s0
    d11     s  8.2GB c0d1s0
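While the mirrors resync, the percentage in the `metastat -c` output is the only thing that changes. A minimal sketch for pulling it out, assuming the "(resync-NN%)" suffix and the "m" type column look exactly as in the sample above (the function name is made up):

```python
import re

# A mirror line, e.g. "d20 m 525MB d22 d21 (resync-19%)".
RESYNC = re.compile(r"^\s*(?P<mirror>\S+)\s+m\s+\S+\s+.*\(resync-(?P<pct>\d+)%\)")


def resync_progress(metastat_text):
    """Map each mirror that is still resyncing to its percentage done."""
    progress = {}
    for line in metastat_text.splitlines():
        match = RESYNC.match(line)
        if match:
            progress[match.group("mirror")] = int(match.group("pct"))
    return progress


# Lines copied from the sample output above.
SAMPLE = """d20        m 525MB d22 d21 (resync-19%)
d22 s 525MB c0d0s1
d21 s 525MB c0d1s1
d30        m 211MB d32 d31 (resync-33%)
d10       m 8.2GB d12 d11 (resync-0%)
"""

print(resync_progress(SAMPLE))
```

A mirror that no longer appears in the result has either finished resyncing or is in some other state, so it is still worth reading the full metastat output at the end.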

Brief of WLAN

1.0)     What Is the Meaning of WLAN?

Wireless Local Area Networks (WLANs) have been growing rapidly and attracting interest from many people, whether they notice it or not. Basically, WLAN grew out of cellular spectrum technology that evolved into user-friendly network connections. It helps us minimize the physical wiring when designing networks and indirectly reduces the cost of development. Even so, there are always pros and cons in terms of criteria such as performance, data rates, and so forth, which need to be elaborated to make things clearer. Therefore, the architecture of WLAN, along with the challenges of using it, will be discussed briefly in the next paragraph.

1.1)      When It Was Started?

Officially, IEEE created a standard for wireless technology for enterprise, home and public use in 1997. However, some claim that research into wireless LANs started earlier.

Kevin J. Negus and Al Petrick, in “History of Wireless Local Area Networks (WLANs) in the Unlicensed Bands” (George Mason University Law School Conference, Information Economy Project, Arlington, 2008), mention that the first WLAN product was the Telesystems “ARLAN-SST” (circa 1988). [8]

1.2)      How Did the Term Wi-Fi Come Into Use?

There is no solid evidence that the term “wifi” is owned by any organization. The closest thing to an owner is the WECA, which chose “WI-FI” for 802.11b Direct Sequence in 1999 and registered it as “WI-FI” [1], covering computer hardware, namely wireless local area networking products, in class A. However, Cory Doctorow [2], on his blog boingboing.net, has stated that Phil Belanger, a founding member of the Wi-Fi Alliance who presided over the selection of the name “Wi-Fi”, writes:

“Wi-Fi doesn’t stand for anything. It is not an acronym. There is no meaning.

Wi-Fi and the ying yang style logo were invented by Interbrand. We (the founding members of the Wireless Ethernet Compatibility Alliance, now called the Wi-Fi Alliance) hired Interbrand to come up with the name and logo that we could use for our interoperability seal and marketing efforts. We needed something that was a little catchier than “IEEE 802.11b Direct Sequence”.

Extending Zpool to Increase Size of Partition in Solaris 10

Hi, the need for storage grows rapidly over time, especially in an enterprise environment. In my experience, there are always requests to increase the size of a partition, but I do not recall ever getting a request to reduce one. Let's say I want to increase the size of my non-global zone partition. The most important thing to remember is whether the non-global zone is attached to its own dedicated pool or shares a pool with the global zone. Is there any available space that I can extend into without adding a new LUN? Or is there no available space, so that I cannot extend unless I add a new LUN to the pool? I want to share here how to increase the size of a zone's partition by adding a new disk to the existing pool.

  1. Is there any available space that I can extend into without adding a new LUN?

Growing a ZFS pool

http://blog.ociru.net/2013/09/25/let-your-zfs-extend

http://www.c0t0d0s0.org/archives/6224-You-dont-need-zfs-resize-…-and-a-workaround-when-you-need-one-;.html